Sample Exam Questions

From the objective of OSS-DB Exam Gold
- Monitoring Performance - Execution plans

Sample Question

1.01

"Select three correct statements regarding the following EXPLAIN output.

postgres=# EXPLAIN SELECT relname,nspname FROM pg_class left join pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);

                               QUERY PLAN

-------------------------------------------------------------------------

 Hash Left Join (cost=1.14..15.97 rows=288 width=128)

   Hash Cond: (pg_class.relnamespace = pg_namespace.oid)

   -> Seq Scan on pg_class (cost=0.00..10.88 rows=288 width=68)

   -> Hash (cost=1.06..1.06 rows=6 width=68)

         -> Seq Scan on pg_namespace (cost=0.00..1.06 rows=6 width=68)

(5 rows)

  1. Executing this EXPLAIN command actually executes the SQL statement specified in the argument.

  2. The descriptions "Hash Left Join", "Seq Scan on pg_class", and "Seq Scan on pg_namespace" in the plan nodes represent the optimal data processing algorithms chosen automatically by the query planner.

  3. The "cost=..." portion represents the actual time required for processing.

  4. The "rows=..." portion represents the estimated number of rows that each plan node will process.

  5. The "width=..." portion represents the estimated average width (in bytes) of a row based on statistics.

※This sample exam is different from those that appear in the actual OSS-DB Exam.
2025/05/29

Answer and Explanation

The correct answers are B, D, and E.

When a user or application executes SQL statements against a database, the planner automatically selects the optimal algorithm to execute the SQL statement. The executor then actually executes the SQL statement.

For developers of database-based applications, there are cases where they want to tune the selection of a more optimal algorithm or evaluate the performance of different SQL queries that achieve the same result. In such cases, the EXPLAIN command can be used to check and evaluate the planner's choice of algorithm without actually executing the SQL statements.

The syntax for executing the EXPLAIN command is as follows:
EXPLAIN [ANALYZE] <SQL statement>

When executed without the ANALYZE option, as in the example, <SQL statement> is not actually executed. On the other hand, when executed with the ANALYZE option, <SQL statement> is actually executed.

The output of the EXPLAIN command is referred to as an execution plan. The execution plan indicates the table scans referenced by the SQL statement, the algorithm to perform table joins, and so on. The output of the execution plan has the following syntax:

QUERY PLAN
-------------------------------------------------------------------------
 <plan node> (cost=<estimated start-up cost>..<estimated total cost> rows=<number of rows> width=<average size of rows>)

The description of each item is as follows:
<plan node>
This is where the specific algorithm for the actual data processing by the executor is described. There are various algorithms such as table scans, index scans, table joins, sorts, etc.
<estimated start-up cost>
This is the estimated cost (in time units) to retrieve the first row using the algorithm described in the <plan node>.
<estimated total cost>
This is the estimated total cost (in time units) to complete the execution of the algorithm described in the <plan node>.
<number of rows>
The number of rows processed as estimated based on the statistics.
<average size of rows>
The size in bytes per row of output from the <plan node>, estimated based on the statistics.

Let's look at the options one by one.

[A. Executing this EXPLAIN command actually executes the SQL statement specified in the argument.]
Incorrect. The EXPLAIN command in the example does not have an ANALYZE option for actually executing the SQL statement, so the SQL statement is not actually executed.

[B. The descriptions "Hash Left Join", "Seq Scan on pg_class", and "Seq Scan on pg_namespace" in the plan nodes represent the optimal data processing algorithms chosen automatically by the query planner.]
Correct.

[C. The "cost=..." portion represents the actual time required for processing].
Incorrect. The "cost" portion represents the estimated start-up and total cost (in time units) of the operation, and does not represent the actual time required for processing.

[D. The "rows=..." portion represents the estimated number of rows that each plan node will process.]
Correct.

[E. The "width=..." portion represents the estimated average width (in bytes) of a row based on statistics.]
Correct.