Before we go any deeper into our design for today, let's discuss the concept of normalization. Normalization, in its simplest form, describes steps taken during the design of a relational database to reduce or eliminate redundancy when storing data in the tables. More specifically, a normalized data set will help reduce or eliminate problems associated with redundancy, inaccuracy, and inconsistency within the data set.
A database that is not normalized causes a host of problems, the worst of which is inefficiency. Inefficient databases cause database engines to use more processing power for each request, causing the database to respond more slowly. The data stored in such a database is generally also error-prone since the same data needs to be entered or updated in multiple places. Error-prone data may cause queries to return inaccurate results. Inaccurate data must be corrected and with poor normalization, it's difficult to determine which data is accurate. Error-prone data is also bloated, requiring extra storage space for the non-normalized data.
Designers of good relational databases will avoid all of these problems. Consider tblCourses in our database. Specifically, we will focus on the Department Name, Lead Instructor, and Assistant Instructor fields for the first few records:
In this data, there are repeated department names and spelling errors. This table would not be considered normalized since there is redundant data in the table. This redundant data would cause update problems, since changes would have to be made in multiple places. For example, if the Anthropology department changed its name to "Human Studies," we would have to change it in multiple records in this table as well as in tblDepartments.
When normalizing data in Access, keep the following principles in mind:
- Do not use repeated fields or fields that contain multiple values. For example, in the previous data, you see two fields for instructor, Lead Instructor and Assistant Instructor. Instead, create a junction table that stores instructor information for particular course sections.
- Do not include fields that describe data in other tables. In the previous example, we would not include the department name as it describes a department, not an individual course.
- If using a composite key, each field must describe all parts of the key. We will explore this principle later in this workshop. In tblCourses, however, we can see that the Department Name field has nothing to do with a course, rather it is associated with the department. This means it doesn't describe the table's primary key and should be stored elsewhere.
Conceptualizing Database Entities
Keeping those principles in mind, let's clean up the structure of tblCourses. First, let's start thinking of our data in a fundamentally different way. The tblCourses data would be fine if it was to be represented in a spreadsheet, but that's not how a relational database works. A database is a series of interrelated entities. An entity is a representation of a component of the data set. For example, a faculty member would be an entity in our data defined in tblFaculty and its fields should only describe a faculty member. Our database contains course, instructor, and department data, so we will need to represent each of those as a standalone entity or table.
For a more detailed and technical description of normalization, view the Database Normalization article on Wikipedia.
Let's start cleaning up the normalization problems with tblCourses.
Step1. Open tblCourses in Design view.
Step2. To remove the Department Name field from tblCourses,
Click, press: Delete key
Step3. To confirm the deletion,
Click
Normalizing Data
We also do not need the instructor information stored in tblCourses. We will be creating a table to store that data in the next section of the workshop. Let's remove the instructor fields from tblCourses too.
This section has less guidance than the previous sections. If you would like a walk-through, watch the video Normalizing Database Tables: Normalizing Data.
Step1. Remove Instructor1 and Instructor2 from tblCourses.
Step2. Save & close tblCourses.