Define Microsoft Access relationships using integer keys
I received a call from one of my co-workers in the UK the other day, looking for some help with a quick, down-and-dirty database to help him track company assets held by employees. Just something to track the details on all of the computer and peripherals, phones, cars, things like that. He already had it started, and just needed a little help getting the report built and formatted. So, being the nice guy I am, I say 'Sure, zip it up and send it to me, then I'll give you a call back'.
A minute later I had it opened up and was looking at the tables. He actually did a pretty good job, and had the different 'objects' in their own tables. He had a user table, a phone table, a car table, and an IT table. He had, however, fallen into a typical trap that novice database developers often fall into. He had set up a 'name' field in the user table, then had used that name field for the linking fields in the other tables. So what's wrong with that, you ask?
Let's take a quick look at the relationships. You can look at (and define) any pre-configured database relationship by clicking on the Relationship icon in the main toolbar:
This will bring up a form showing the database relationships. In this case, you can see the tables and the relationships that had been defined:
At first glance I noticed a couple of issues here, or at least some things that I would do differently. The first is that each table has a primary key field defined simply as 'ID'. Technically speaking, there really is nothing wrong with that. In practical terms, however, this can make things more difficult for you down the road. As a general rule of thumb, database developers will use a unique name to make it easier to define relationships between tables. In other words, if you see the a field called 'employee_id' in one table, then see a field called 'employee_id' in another table, there's a pretty good bet that these two tables are linked, or have a relationship, based on this field. Having a field called simply 'id' in your tables doesn't help you much, because you will need to rename that field when you attempt to define a relationship. You will see an example of a better way to do this in a minute.
The other problem is the 'Name' field. There are actually two separate problems here. One is that a single field has been defined to hold both the first and last name, which can be a real problem (as described here). It is always easier to combine fields then it is to take them apart, so split your name field into at least 'first_name' and 'last_name' (you may need to add a salutation and suffix field , too, depending on your data). The second problem is that the relationship between the tables has been defined using this field, which is not the best practice.
First of all, it's inefficient to use a text field in database relationships. Any database engine is going to be much more efficient using integer values for the primary-foreign key relationships. Now, admittedly for the several hundred records he will have in this database there is no way he would notice any sort of improvement in speed because he changed from creating a relationship using a text field, instead of setting up a primary-foreign key relationship using indexed numeric keys. He would need to get thousands of records, maybe tens of thousands of records in to the database before he would notice any performance improvement.
But that's not really the point. The point is it's not right. Everyone knows that an important part of being an 'IT person' is that you MUST BE anal retentive. It's a pre-requisite, and undeniable law of nature. Therefore, it is important to do it the right way, even if it's harder. In fact, sometimes, you will do it the right way ONLY because it's harder to do, and you get bonus points if doing it the right way makes it harder for someone else to understand.
OK, I'm joking (kind of). The real point here is that a database configured as you see above is harder to build, and harder to maintain in the long run. Remember, the tables in your database are like the foundation of your house. Everything you build on top of it (in the case of the database this would be the forms, queries, stored procedures, reports, code, etc) are dependent on the stability of this foundation. If you take some extra time to create the foundation everything that you build on it will be easier. In addition, just like your house, if you decide to change the foundation later on you will need to take apart everything built on it, make the change, then re-build the whole thing. Believe me, it's in your best interest to attempt to get the tables in your database built the right way the first time around.
The second issue with using the Name as the relationship key is this: What happens if you need to change it? People get married, a data entry mistake is made, you decide to go with first initial last name as a standard, or any of a hundred reasons could cause you to change the name. If you configured your relationship using referential integrity, but don't have cascade update selected:
you will not be able to change the name without first deleting the relationships. Of course, you can get around that by selecting 'Cascade update related fields', but this is not the strongest way to create your database. If your database crashes or hickups in the middle of updating one of these related fields you will end up with orphaned records. In other words, you could end up with a record in vehicle with no matching record in user. It will be a vehicle without an owner, and will float unclaimed in your database. In addition, your user will no longer have that vehicle, and you're going to wonder where it went. And believe me, these orphaned records can be tough to find, and even tougher to figure out who they belong to.
So here's how I'm going to fix it. I will first need to delete all of the relationships that have been defined. This is done by clicking once on the link between the tables, then either press the 'delete' key on the keyboard or right click and select delete from the popup menu:
I'll do that for all of the links, so that none of the tables are connected. Now I'll open up the user table in design mode, and make the following changes:
- Change the name of the ID field to user_id
- Change the Name field to first_name
- Add another field called last_name
When I'm done it will like this:
Now, I need to change the dependent tables. I'll show the changes using the IT table, but all of the tables will be changed in the exact same manner. These are the changes that I will make:
- Change the name of the ID field to [table name]_id. In this case it will be it_id
- Insert a new field, directly under the primary id field, called user_id. This is the foreign key from the user table, and will hold the id number of the user that this record will be linked to. I will need to make this field a number field, of the type long integer
- Delete the name field, as this field will no longer be used
The example here is an empty database, so no data will be lost by deleting the name field. If you have records in your database you would want to make sure that you are pulling in the id of the matching record before you delete this field. If you have questions on how to do this you may want to check out this series on normalizing data for some tips on how to pull in the matching id value.
The finished table will look like this:
I will now make those same changes to the other two tables, and save all of those changes. I will recreate my relationships (details on how to do that can be found here) using the new id fields that I just created. When I'm finished my relationship form it will look like this:
If you look at this for a minute it starts to make sense. You can tell that I will be creating my user first, then I will pull the id from that user into the matching records in the other tables. This way, no matter what I do to the details of that user, the id will remain constant, and the records will stay aligned. I can add an address or other details, or even change the name and none of the related records will be affected.
Not only will it run faster and be easier to maintain, you will feel the zen-like peacefulness in knowing that you've done it the right way.