|dc.description.abstract||When a relational database is chosen, normalization theory provides a set of guidelines that can lead to efficient database designs. Normalization of tables is therefore a common process in the analysis of relational databases. It aims to decompose existing relational tables in order to minimize database redundancy while preserving the dependencies between attributes. It also facilitates correct insertion, deletion, and modification of data in the database.
Given an un-normalized relational database, a redesign with no data redundancy that is guaranteed to preserve dependencies is not always achievable. Previous studies have shown that it is not always possible to decompose a database into relations complying with Boyce-Codd Normal Form (which eliminates many simple redundancies) while guaranteeing the preservation of functional dependencies.
For this reason, the immediately weaker normalization concept that does guarantee the preservation of functional dependencies, the Third Normal Form (3NF), is generally regarded as an industry standard for many types of database applications. Normalization does not always reduce the size of a given database, and it frequently increases the modification and retrieval time of a given transaction. This decrease in performance stems from
the fact that decomposition causes some queries to reconstruct original relations by
joining multiple tables. The complexity of the joins between tables depends on the
types of the attributes involved and the integrity constraints imposed on them.
The effect of normalization on the size of the database depends on the data stored
in it. Since there is no guarantee on the effect on database size, database designers always need to predict the impact each normalization step may have on the overall database. Thus, they must select good trade-offs between the memory consumed by the database, the presence of challenging database maintenance anomalies, and the speed at which data can be queried.
For the above reason, performance-minded database designers tend to de-normalize relations that are accessed frequently. The table decomposition process sometimes involves reducing the set of functional dependencies to a minimal set known as the canonical cover.
The main focus of this thesis is to examine the query performance of distinct versions of the same database, where all versions are in the Third Normal Form since they are all derived from the same canonical cover. The databases represent the
same relations with distinct schemas but are populated with the same data. This
allows us to evaluate the impact of normalization approaches on the database space
requirements. To evaluate the performance of each version of the database, we run the same queries on each of them. Here we define the performance of a database as the time taken by the database to execute a query.||en_US