30 Advanced PostgreSQL SELECT Questions for Senior Developers [Part 2]

By admin | 9 months ago

interviewjobsdatabasepsqlmysql senior developerquerydb

Question 1: Aggregating Data Across Tables

How do you aggregate data across multiple tables using `JOIN` and compute summary statistics such as averages and sums?

Answer

To aggregate data across multiple tables, you can use the `JOIN` clause to combine the tables based on a related column and then use aggregate functions like `AVG()` and `SUM()` along with the `GROUP BY` clause to compute summary statistics.

Example

SELECT o.customer_id, AVG(o.amount) AS average_order_amount, SUM(o.amount) AS total_spent FROM orders o JOIN customers c ON o.customer_id = c.id GROUP BY o.customer_id;

Explanation

This query calculates the average and total amount spent per customer by joining the `orders` and `customers` tables on the customer ID.


Question 2: Filtering Aggregated Data

How can you filter aggregated data using the `HAVING` clause?

Answer

The `HAVING` clause is used to filter the results of an aggregation based on a specified condition. Unlike the `WHERE` clause, which filters rows before the aggregation, `HAVING` filters the rows after the aggregation.

Example

SELECT employee_id, COUNT(*) AS total_sales FROM sales GROUP BY employee_id HAVING COUNT(*) > 10;

Explanation

This query selects employees who have made more than 10 sales, demonstrating how `HAVING` filters the groups created by `GROUP BY` based on the condition that the count of sales per employee is greater than 10.


Question 3: Using Window Functions for Ranking

How do you rank items within a group without changing the result set size using window functions?

Answer

You can use the `RANK()` or `ROW_NUMBER()` window functions along with the `OVER()` clause to assign ranks to rows within a partition of the result set.

Example

SELECT name, department, sales, RANK() OVER(PARTITION BY department ORDER BY sales DESC) as rank FROM employees;

Explanation

This query ranks employees within each department based on their sales in descending order. Unlike GROUP BY, window functions do not reduce the number of rows returned, allowing you to see the rank alongside each row's original data.


Question 4: Recursive Queries with CTEs

How do you implement a recursive query in PostgreSQL to retrieve hierarchical data, such as organizational charts?

Answer

Use a Common Table Expression (CTE) with recursion. Recursive CTEs consist of a base case and a recursive step to process hierarchical or tree-structured data.

Example

WITH RECURSIVE org_chart AS ( SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id FROM employees e INNER JOIN org_chart oc ON oc.id = e.manager_id ) SELECT * FROM org_chart;

Explanation

This query constructs an organizational chart starting from employees who do not have a manager (manager_id IS NULL) and recursively adds their subordinates. The result is a hierarchical view of the organization.


Question 5: Optimizing Queries with Indexes

What strategy would you use to optimize a slow-running `SELECT` query that filters on a non-indexed column?

Answer

First, analyze the query execution plan using `EXPLAIN` to confirm the performance bottleneck. If filtering on a non-indexed column is the cause, consider creating an appropriate index on that column.

Example

CREATE INDEX idx_column_name ON table_name(column_name);

Explanation

Adding an index on the column used in the filter condition can significantly improve query performance by reducing the need for a full table scan.

Continuing with more advanced PostgreSQL questions tailored for senior developers, focusing on performance, optimization, and advanced querying techniques:

Question 6: Partitioning Large Tables

How can table partitioning improve query performance in PostgreSQL?

Answer

Partitioning a large table into smaller, more manageable pieces can significantly improve query performance, especially for operations that only access a subset of the data. Partitioning can also make maintenance tasks like backups and deletes more efficient.

Example

CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int ) PARTITION BY RANGE (logdate);

Explanation

This example creates a partitioned table based on the range of logdate. You would then create individual partitions for specific ranges of logdate.


Question 7: Efficiently Querying JSON Data

What are the benefits of using the `jsonb` data type, and how do you query a JSON object stored in a `jsonb` column?

Answer

The `jsonb` data type stores JSON data in a binary format, allowing for more efficient querying, indexing, and manipulation of JSON objects. Unlike `json, jsonb` supports indexing, which can significantly speed up search operations.

Example

SELECT info ->> 'name' AS name FROM orders WHERE info -> 'items' @> '[{"product": "SKU123"}]';

Explanation

This query retrieves the name field from a `jsonb` column named `info` where the `items` array contains an object with a `product` key of SKU123.


Question 8: Optimizing Full-Text Search

How do you implement and optimize full-text search in PostgreSQL?

Answer

Use the `tsvector` and `tsquery` data types for full-text search. For optimization, create a GIN (Generalized Inverted Index) index on the `tsvector` column to speed up text search queries.

Example

CREATE INDEX document_text_idx ON documents USING gin (to_tsvector('english', document_text));

Explanation

This creates a GIN index on the `document_text` column converted to `tsvector, optimizing full-text search queries on the documents` table.


Question 9: Using Listen/Notify for Real-Time Updates

How can you use PostgreSQL's LISTEN and NOTIFY commands to implement real-time notifications?

Answer

The `LISTEN` command allows a session to listen on a specified notification channel, while the `NOTIFY` command sends a notification to the channel. This mechanism can be used to implement real-time notifications within an application.

Example

Session 1:

LISTEN channel_name;

Session 2:

NOTIFY channel_name, 'My message';

Explanation

Session 1 listens for notifications on channel_name. When Session 2 sends a NOTIFY on channel_name, Session 1 receives the message 'My message'.


Question 10: Working with Arrays

How do you efficiently search for a value within an array column in PostgreSQL?

Answer

Use the `ANY` or `CONTAINS` operators to search within array columns. Creating a GIN index on the array column can also improve search performance.

Example

SELECT * FROM table_name WHERE 'value' = ANY(array_column);

Explanation

This query selects rows from `table_name` where the `array_column` contains 'value'.


Question 11: Conditional Aggregation

How do you perform conditional aggregation in a `SELECT` query?

Answer

Use the `FILTER` clause with aggregation functions to perform conditional aggregation.

Example

SELECT COUNT(*) FILTER (WHERE condition) AS conditional_count FROM table_name;

Explanation

This counts only the rows that meet a specific condition, allowing for more complex aggregations in a single query pass.


Question 12: Upserting Data

What is the best way to perform an upsert (update or insert) operation in PostgreSQL?

Answer

Use the `ON CONFLICT` clause with the `INSERT` statement to specify an upsert operation.

Example

INSERT INTO table_name (id, column1) VALUES (1, 'A') ON CONFLICT (id) DO UPDATE SET column1 = EXCLUDED.column1;

Explanation

This attempts to insert a row and, if a conflict on the `id` column occurs, updates the existing row instead.


Question 13: Analyzing Query Performance

How do you analyze the performance of a PostgreSQL query?

Answer

Use the `EXPLAIN` or `EXPLAIN ANALYZE` command to see the execution plan of a query, including details about joins, indexes used

, and cost estimates.

Example

EXPLAIN ANALYZE SELECT * FROM table_name WHERE column_name = 'value';

Explanation

This shows the execution plan and runtime statistics for the query, helping identify potential performance bottlenecks.


Question 14: Managing Large Objects (LOBs)

How does PostgreSQL handle large objects (LOBs), and what are the best practices for storing them?

Answer

PostgreSQL provides a large object facility to store LOBs. Best practices include using the `bytea` data type for smaller LOBs or the Large Object feature for larger ones. For very large data, consider storing outside the database with only a reference stored in the database.

Example

For bytea:

CREATE TABLE lob_table (id serial, data bytea);

For Large Object:

SELECT lo_create(0);

Explanation

bytea is suitable for smaller LOBs directly in the table. The Large Object feature allows storing up to 2GB per object but requires using special functions to access the data.


Question 15: Tuning PostgreSQL for Read Performance

What are some key PostgreSQL configurations for optimizing read-heavy workloads?

Answer

Key configurations include adjusting `shared_buffers, work_mem, maintenance_work_mem, and effective_cache_size` to better utilize system memory. Also, consider increasing `max_parallel_workers_per_gather` for better use of parallelism in read queries.

Explanation

These parameters help PostgreSQL better manage its memory and computational resources, improving the performance of read-heavy operations.

Feel free to checkout latest database jobs in kerala