Sample Exam Questions

From the objective of OSS-DB Exam Gold
- Operation and Management
- General O&M commands - PITR

Sample Question

1.01

Choose the two most appropriate statements for the
explanation of point-in-time recovery (PITR).

  1. Base backup is a backup of the whole database cluster that can be used "as is" if it is copied to an environment where the same version of PostgreSQL is installed.5, 5, 2
  2. The base backup only needs to be taken once after starting operation of the system.
  3. If the pg_xlog directory is deleted, there may be instances, at recovery, where some of the updated information is lost and recovery to the most recent state is not possible.
  4. If the base backup is taken again, older archive logs can be deleted.
  5. In order to ensure safe operation, it is preferable that the archive_timeout parameter be set to as high a value as possible within a range that does not impact performance.

※This sample exam is different from those that appear in the actual OSS-DB Exam.
2012/03/14

Answer and Explanation

First, let us confirm the operating principles of PITR.
A backup is taken of the whole database cluster. This is referred to as the base backup.
The procedure is as follows. First, connect to the database and then execute pg_start_backup(), take a copy of the whole database cluster using OS commands, such as tar, reconnect to the database and execute pg_stop_backup().
PostgreSQL 9.1 and subsequent versions allow these steps to be executed together using the pg_basebackup command.
Copy all subsequent transaction logs and store (archive) them in a safe location after taking the base backup.
In PostgreSQL, this transaction log is referred to as a WAL (Write Ahead Logging) file.
If recovery is required because the database has become damaged, there is a way of returning the database content to its original state by applying the WAL file and subsequent transactions to the base backup.
In addition to its use with PITR, the base backup can also be used as the starting point for streaming replication; however, in contrast to a cold backup, a streaming replication does not include a backup of the complete database, it cannot be copied to a different environment and used as a backup.
If a failure occurs in the database, the base backup and all subsequently archived WAL files will be required for PITR. Furthermore, all WAL files under the pg_xlog directory that have not been archived will also be required.
Logically, as long as the base backup is taken all at once and all subsequent WAL files are archived, recovery is possible using PITR. However, a huge amount of capacity will be required for the archives and recovery will take a long time. Therefore, it is recommended that the base backup be retaken periodically.
If the base backup is retaken, older archive logs will not be used for recovery and can be discarded.
For actual operation, it is necessary to configure several parameters in the postgresql.conf file.
When the size of the WAL file reaches 16 MB or the time configured in the archive_timeout parameter has passed, the file will be archived by calling the commands specified in the archive_command parameter.
WAL files that have not been archived will remain under the pg_xlog directory. If the pg_xlog directory is deleted, transactions that have not been archived will not be recoverable and will be lost (referred to as an incomplete recovery).
To prevent this (or make operation safer), the archive_timeout parameter should be reduced. However, if the archive_command is run too frequently, this will impact system performance, operation, and required disk capacity. Therefore, the archive_timeout parameter should be set to a value as low as possible to avoid problems in these areas.
In the PostgreSQL 9.1 manual, it is stated that an archive_timeout of approximately one minute is reasonable. However, if it is necessary to copy data at faster speeds, the user should consider streaming replication rather than archiving.
Even if the archiving occurs at the time set by the archive_timeout has elapsed, the size of the copied WAL file size will still be 16 MB. Reducing the archive_timeout value will require a large disk capacity for archive space.
For example, if the archive_timeout is set to five minutes, the WAL file will be archived at least 12 times per hour, resulting in 300 WAL files per day. This would require at least 5 GB of storage space per day.

Therefore, the correct answers are C and D.

Refer to the following pages in the manual for details on PITR configuration, operation, and recovery.
http://www.postgresql.jp/document/current/html/continuous-archiving.html