30 Advanced PostgreSQL SELECT Questions for Senior Developers [Part 2]

Question 1: Aggregating Data Across Tables

How do you aggregate data across multiple tables using `JOIN` and compute summary statistics such as averages and sums?

Answer

To aggregate data across multiple tables, you can use the `JOIN` clause to combine the tables based on a related column and then use aggregate functions like `AVG()` and `SUM()` along with the `GROUP BY` clause to compute summary statistics.

Example


SELECT o.customer_id, AVG(o.amount) AS average_order_amount, SUM(o.amount) AS total_spent

FROM orders o

JOIN customers c ON o.customer_id = c.id

GROUP BY o.customer_id;

Explanation

This query calculates the average and total amount spent per customer by joining the `orders` and `customers` tables on the customer ID.

Question 2: Filtering Aggregated Data

How can you filter aggregated data using the `HAVING` clause?

Answer

The `HAVING` clause is used to filter the results of an aggregation based on a specified condition. Unlike the `WHERE` clause, which filters rows before the aggregation, `HAVING` filters the rows after the aggregation.

Example


SELECT employee_id, COUNT(*) AS total_sales

FROM sales

GROUP BY employee_id

HAVING COUNT(*) > 10;

Explanation

This query selects employees who have made more than 10 sales, demonstrating how `HAVING` filters the groups created by `GROUP BY` based on the condition that the count of sales per employee is greater than 10.

Question 3: Using Window Functions for Ranking

How do you rank items within a group without changing the result set size using window functions?

Answer

You can use the `RANK()` or `ROW_NUMBER()` window functions along with the `OVER()` clause to assign ranks to rows within a partition of the result set.

Example


SELECT

  name,

  department,

  sales,

  RANK() OVER(PARTITION BY department ORDER BY sales DESC) as rank

FROM employees;

Explanation

This query ranks employees within each department based on their sales in descending order. Unlike GROUP BY, window functions do not reduce the number of rows returned, allowing you to see the rank alongside each row's original data.

Question 4: Recursive Queries with CTEs

How do you implement a recursive query in PostgreSQL to retrieve hierarchical data, such as organizational charts?

Answer

Use a Common Table Expression (CTE) with recursion. Recursive CTEs consist of a base case and a recursive step to process hierarchical or tree-structured data.

Example


WITH RECURSIVE org_chart AS (

  SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL

  UNION ALL

  SELECT e.id, e.name, e.manager_id FROM employees e

  INNER JOIN org_chart oc ON oc.id = e.manager_id

)

SELECT * FROM org_chart;

Explanation

This query constructs an organizational chart starting from employees who do not have a manager (manager_id IS NULL) and recursively adds their subordinates. The result is a hierarchical view of the organization.

Question 5: Optimizing Queries with Indexes

What strategy would you use to optimize a slow-running `SELECT` query that filters on a non-indexed column?

Answer

First, analyze the query execution plan using `EXPLAIN` to confirm the performance bottleneck. If filtering on a non-indexed column is the cause, consider creating an appropriate index on that column.

Example


CREATE INDEX idx_column_name ON table_name(column_name);

Explanation

Adding an index on the column used in the filter condition can significantly improve query performance by reducing the need for a full table scan.

Continuing with more advanced PostgreSQL questions tailored for senior developers, focusing on performance, optimization, and advanced querying techniques:

Question 6: Partitioning Large Tables

How can table partitioning improve query performance in PostgreSQL?

Answer

Partitioning a large table into smaller, more manageable pieces can significantly improve query performance, especially for operations that only access a subset of the data. Partitioning can also make maintenance tasks like backups and deletes more efficient.

Example


CREATE TABLE measurement (

    city_id         int not null,

    logdate         date not null,

    peaktemp        int,

    unitsales       int

) PARTITION BY RANGE (logdate);

Explanation

This example creates a partitioned table based on the range of logdate. You would then create individual partitions for specific ranges of logdate.

Question 7: Efficiently Querying JSON Data

What are the benefits of using the `jsonb` data type, and how do you query a JSON object stored in a `jsonb` column?

Answer

The `jsonb` data type stores JSON data in a binary format, allowing for more efficient querying, indexing, and manipulation of JSON objects. Unlike `json, jsonb` supports indexing, which can significantly speed up search operations.

Example


SELECT info ->> 'name' AS name

FROM orders

WHERE info -> 'items' @> '[{"product": "SKU123"}]';

Explanation

This query retrieves the name field from a `jsonb` column named `info` where the `items` array contains an object with a `product` key of SKU123.

Question 8: Optimizing Full-Text Search

How do you implement and optimize full-text search in PostgreSQL?

Answer

Use the `tsvector` and `tsquery` data types for full-text search. For optimization, create a GIN (Generalized Inverted Index) index on the `tsvector` column to speed up text search queries.

Example


CREATE INDEX document_text_idx ON documents USING gin (to_tsvector('english', document_text));

Explanation

This creates a GIN index on the `document_text` column converted to `tsvector, optimizing full-text search queries on the documents` table.

Question 9: Using Listen/Notify for Real-Time Updates

How can you use PostgreSQL's LISTEN and NOTIFY commands to implement real-time notifications?

Answer

The `LISTEN` command allows a session to listen on a specified notification channel, while the `NOTIFY` command sends a notification to the channel. This mechanism can be used to implement real-time notifications within an application.

Example

Session 1:


LISTEN channel_name;

Session 2:


NOTIFY channel_name, 'My message';

, and cost estimates.

Example


EXPLAIN ANALYZE SELECT * FROM table_name WHERE column_name = 'value';

Explanation

This shows the execution plan and runtime statistics for the query, helping identify potential performance bottlenecks.

Question 14: Managing Large Objects (LOBs)

How does PostgreSQL handle large objects (LOBs), and what are the best practices for storing them?

Answer

PostgreSQL provides a large object facility to store LOBs. Best practices include using the `bytea` data type for smaller LOBs or the Large Object feature for larger ones. For very large data, consider storing outside the database with only a reference stored in the database.

Example

For bytea:


CREATE TABLE lob_table (id serial, data bytea);

For Large Object:


SELECT lo_create(0);

Explanation

bytea is suitable for smaller LOBs directly in the table. The Large Object feature allows storing up to 2GB per object but requires using special functions to access the data.

Question 15: Tuning PostgreSQL for Read Performance

What are some key PostgreSQL configurations for optimizing read-heavy workloads?

Answer

Key configurations include adjusting `shared_buffers, work_mem, maintenance_work_mem, and effective_cache_size` to better utilize system memory. Also, consider increasing `max_parallel_workers_per_gather` for better use of parallelism in read queries.

Explanation

These parameters help PostgreSQL better manage its memory and computational resources, improving the performance of read-heavy operations.

Feel free to checkout latest database jobs in kerala