Sample Exam Questions
※This sample exam is different from those that appear in the actual OSS-DB Exam.
2024/05/01
Partitioning is one of the methods to improve database performance. It aims to enhance data access efficiency by dividing a large table into multiple smaller tables. PostgreSQL has long supported partitioning using inheritance. However, starting from PostgreSQL version 10, partitioned tables are supported through declarative partitioning.
In declarative partitioning, the parent table is created with a PARTITION BY clause, like CREATE TABLE table_name(column_definition) PARTITION BY....
Each partition is created with CREATE TABLE partition_name PARTITION OF table_name FOR VALUES..., and the FOR VALUES clause specifies what data to put into that partition.
For instance, consider a large table of sales data spanning many years. If you want to analyze the data for the most recent month, you would probably perform an index-based search according to the date specified in the WHERE clause. Even when the period is limited, the data itself is usually scattered throughout the large table, necessitating the reading of a large number of data pages.
For instance, you can create a partition for each month of sales and store the data for each month in that partition. Then, when analyzing sales data for the most recent month, you only need to read from specific partitions. This can significantly reduce the number of data pages that need to be read, and in some cases, a full scan of a partition without using an index can greatly improve performance.
Since you can specify the tablespace for each partition, it's also possible to place a set of frequently accessed rows (partition) and a less frequently accessed partition in different disk spaces to optimize cost and performance. Note that it's not possible to place each columns in a different tablespace. In such cases, the table itself should be separated, and view functions are used as needed.
For example, if you want to bulk delete data from 10 years ago, you can do so by dropping the partition containing the data with the DROP TABLE command. When bulk deleting large amounts of data from a table that doesn't use partitioning, there is a significant performance difference because each data page containing the target row needs to be updated internally, and VACUUM operations also occur.
Note that each partition contains the same data as a regular table, so the total data size of the partitions is the same as if all the data were stored in a single large table.
Therefore, the correct answers are B, C, and E.
Here, for example, you can create a partition for each month of sales and store the data for each month in that partition. Then, if you want to analyze sales data for the last month, you can read only certain partitions, which greatly reduces the number of data pages that need to be read, and in some cases, a full scan of a partition without an index can greatly improve performance.
Because you can specify the tablespaces to be placed on a per-partition basis, you can also place a set of frequently accessed rows (partitions) and non-frequently accessed partitions on separate disk space to optimize cost and performance. Note that it is not possible to place each column in a separate tablespace, so in such cases the table itself can be placed separately and the view functions can be used as needed.
For example, if you want to bulk drop data that is 10 years old, you can replace it by dropping the partition that contains the data with DROP TABLE. When bulk deleting large amounts of data in a table that does not use partition functionality, there is a significant performance difference because internally each data page containing the target row needs to be updated, and VACUUM operations also occur.
Note that each partition contains the same data as a regular table, so the data size of the entire partition is the same as storing all the data in one large table.
Therefore, the correct answers are B, C, and E.
© EDUCO All Rights Reserved.