Sample Exam Questions

From the objective of OSS-DB Exam Silver
S1.2 General knowledge of relational databases (Database design and normalization)

Sample Question

1.03

"Below is a part of the sales table that we want to normalize. Choose the two most important aspects to consider.

+---+------+--------------+-------+-----+------+-----+------+----+
| Date | Customer Name | Customer Location   | Item 1  | Qty 1| Item 2| Qty 2| Item 3| Qty 3|
+-----+--------------------+------------------------+----------+-----+-------+-----+--------+----+
|2/1 | Yodobashi      | Ichibancho, Chiyoda-ku | TV        |     5| PC   |    3| Camera |    5|
|2/1 | Yamada         | Minato-ku Azabudai     | PC      |     5| Radio |    2|         |      |
|2/1 | Kojima          | Uchisaiwaicho, Chiyoda-ku | Refrigerator |    1| Washing machine |    4| Battery   | 100|
|2/2 | Nojima      | Ichibancho, Chiyoda-ku | TV       |   10|      |     |         |     |
|2/5 | Yamada        | Minato-ku Azabudai  | Mouse     |   10|DVD  |  20 |CD      |   20|
|2/8 | Kojima          | Uchisaiwaicho, Chiyoda-ku | Camera |   10|   |    |         |     |
+-----+--------------------+------------------------+----------+-----+-------+-----+--------+----+

  1. Since the dates have duplicates, such as 2/1 appearing three times, they are separated into a different table.
  2. Separate the customers into a different table because they appear repeatedly.
  3. If we know the customer name, we know the customer location, so we separate this combination into a different table.
  4. Since Yodobashi and Nojima have duplicated locations, the locations are separated into a different table.
  5. Separate the combination of items and quantities into a different table.

※This sample exam is different from those that appear in the actual OSS-DB Exam.
2024/05/25

Answer and Explanation

Table normalization eliminates data duplication and is very important for reducing capacity and improving maintainability. The key points of normalization are to separate repeated columns with the same purpose within a table and to separate parts with dependencies within the columns of a table.

Columns with the same purpose, for example, are the combinations of items and quantities in the example table, which are repeated in a format such as Item 1, Qty 1, Item 2, Qty 2, and so on. This results in a table structure that extends horizontally, and it can be organized in the first normal form. Specifically, you create a sales detail table and store items and quantities in it. In the data of Yodobashi for 2/1, one row is created in the sales table and three rows of data are created in the sales detail table. You can link these data by creating a common key column, such as a sales number, in both tables.

Column dependency means that a value in one column uniquely determines a value in another column. It is classified into the second normal form, the third normal form, etc., depending on its dependency. In the sales table of this example, if the customer name is determined, the customer location is also determined uniquely, so you create a customer table that stores the customer name and the customer location, and the sales table stores only the customer name, then the customer table is referenced for the location. This is the second normal form.

When it comes to eliminating duplication, it might seem like it is about avoiding repetition of the same values in fields such as date, customer name, customer location, item, etc.  However, the key factor is whether there is a dependency between the items. Dates and items have no dependencies on other items. There is a dependency between the customer name and the customer location, so it is subject to normalization, but the customer name is not dependent on any other item, and it is just a coincidence that multiple customers have the same location, so there is no dependency.

So the correct answers are C and E.

In practice, it is common to store the customer ID in the sales table instead of the customer name, and to refer to the customer table for the customer name. The point is that the sales table is not normalized if it has both the customer ID and the customer name, and it is normalized if it only has the customer name, although there is a question of what is the primary key of the customer table.