Chapter 4. Blocking Bad Data

Even the best database designer has spent a sleepless night worrying about the errors that could be lurking in a database. Bad data is a notorious problem: it enters the database, lies dormant for months, and appears only when you discover you've mailed an invoice to customer "Blank Blank" or sold a bag of peanuts for -$4.99.

The best way to prevent these types of problems is to stop bad data from making it into your database in the first place. In other words, you need to set up validation rules that reject suspicious values as soon as someone types them in. Once bad data has entered your database, it's harder to spot than a blueberry in a swimming pool.

This chapter covers the essential set of Access data validation tools:

Duplicates, required fields, and default values are the basics of data integrity.
Input masks format ordinary text into patterns, like postal codes and phone numbers.
Validation rules lay down strict laws for unruly fields.
Lookups limit values to a list of preset choices.

Note

There's one validation technique that this chapter doesn't cover: using data macros. Data macros are specialized routines that spring into action when someone makes a change in your database. They're remarkably powerful, but you can't use them until you learn the basics of macro programming. In the meantime, the validation tools you'll pick up in this chapter are simpler and easier to maintain.

You'll learn how to build macros in Chapter 15. You'll learn how to use macros to perform validation with data events in Chapter 16.

Data Integrity Basics

All of Access's data validation features work via the Design view you learned about in Chapter 2. To put them in place, you choose a field and then tweak its properties. The only trick is knowing which properties are most useful. You've already seen some in Chapter 2, but the following sections fill in a few more details.

Tip

Remember, Access gives you three ways to switch to Design view. You can right-click the table tab title and then choose Design View from the menu, use the Home→View button on the ribbon, or use the tiny view buttons at the Access window's bottom-right corner. And if you're really impatient, then you don't even need to open your table first. Just find it in the navigation pane, right-click it there, and then choose Design View.

Preventing Blank Fields

Every record needs a bare minimum of information to make sense. However, without your help, Access can't distinguish between critical information and optional details. For that reason, every field in a new table is optional, except for the primary-key field (which is usually the ID value). Try this out with the Dolls table from Chapter 1; you'll quickly discover that you can add records that have virtually no information in them.

You can easily remedy this problem. Just select the field that you want to make mandatory in Design view, and then set the Required field property to Yes (Figure 4-1).

Figure 4-1. The Required field property tells Access not to allow empty values (called nulls in tech-speak).

Access checks the Required field property whenever you add a new record or modify a field in an existing record. However, if your table already contains data, there's no guarantee that it follows the rules.

Imagine you've filled the Dolls table with a few bobbleheads before you decide that every record requires a value for the Character field. You switch to Design view, choose the Character field, and then flip the Required field property to Yes. When you save the table (by switching back to Datasheet view or closing the table), Access gives you the option of verifying the bobblehead records that are already in the table (Figure 4-2). If you choose to perform the test and Access finds a problem, it gives you the option of reversing your changes (Figure 4-3).

Figure 4-2. It's a good idea to test the data in your table to make sure it meets the new requirements you put into place. Otherwise, invalid data could still remain. Don't let the message scare you. Unless you have tens of thousands of records, this check doesn't take long.

Figure 4-3. If Access finds an empty value, then it stops the search and asks you what to do about it. You can keep your changes (even though they conflict with at least one record); after all, at least new records won't suffer from the same problem. Your other option is to reset your field to its more lenient previous self. Either way, you can track down the missing data by performing a sort on the field in question (page 97), which brings empty values to the top.
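If you'd like to see the idea behind that check outside of Access, here's a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It isn't how Access runs its test; the table layout and sample rows are invented for illustration, and SQLite's rules differ from Access's in places. It simply lists the records that have no Character value, which are the ones a Required setting of Yes would object to.

```python
import sqlite3

# SQLite stands in for Access here; the table and sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Dolls (ID INTEGER PRIMARY KEY, "Character" TEXT)')
conn.executemany(
    'INSERT INTO Dolls ("Character") VALUES (?)',
    [("Homer Simpson",), (None,), ("Yoda",)],
)

# Records that would fail a Required = Yes check: the field is null.
blank_ids = [row[0] for row in conn.execute(
    'SELECT ID FROM Dolls WHERE "Character" IS NULL ORDER BY ID'
)]

# An ascending sort also surfaces them, because SQLite orders nulls first.
sorted_ids = [row[0] for row in conn.execute(
    'SELECT ID FROM Dolls ORDER BY "Character", ID'
)]

print(blank_ids)   # IDs of records with no Character value
print(sorted_ids)  # the blank record leads the list
```

Running a query like this before you tighten a field tells you how much cleanup is waiting for you.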

Blank values and empty text

Access supports this Required property for every data type. However, with some data types you might want to add extra checks. That's because the Required property prevents only blank fields, meaning fields that don't have any information in them at all. However, Access makes a slightly bizarre distinction between blank values and something called empty text.

Word To The Wise: Don't Require Too Much

You'll need to think very carefully about what set of values you need, at a minimum, to create a record.

For example, a company selling Elvis costumes might not want to accept a new outfit into their Products table unless they have every detail in place. The Required field property is a great help here, because it prevents half-baked products from showing up in the catalog.

On the other hand, the same strictness is out of place in the same company's Customers table. The sales staff needs the flexibility to add a new prospect with only partial information. A potential customer may phone and leave only a mailing address (with no billing address, phone number, email information, and so on). Even though you don't have all the information about this customer, you'll still need to place that customer in the Customers table so that he or she can receive the monthly newsletter.

As a general rule, make a field optional if the information for it isn't necessary or might not be available at the time the record is entered.

A blank (null) value indicates that no information was supplied. Empty text indicates that a field value was supplied, but it just happens to be empty. Confused yet? The distinction exists because databases like Access need to recognize when information is missing. A blank value could indicate an oversight; someone might just have forgotten to enter the value. On the other hand, empty text indicates a conscious decision to leave that information out.

Tip

To try this out in your datasheet, create a text field that has Required set to Yes. Try inserting a new record and leaving the field blank. (Access stops you cold.) Now, try adding a new record, but place a single space in the field. Here's the strange part: Access automatically trims out spaces, and by doing so, it converts your single space to empty text. However, you don't receive an error message because empty text isn't the same as a blank value.

The good news is that if you find this whole distinction confusing, then you can prevent both blank values and empty text. Just set Required to Yes to stop the blank values, and set Allow Zero Length to No to prevent empty text.
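To make the two-part rule concrete, here's a small sketch using Python's sqlite3 module rather than Access. It rests on an assumption: a NOT NULL constraint is treated as a rough equivalent of Required = Yes, and a CHECK on the text length as a rough equivalent of Allow Zero Length = No. The Customers table and LastName field are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL plays the role of Required = Yes; the CHECK plays the role of
# Allow Zero Length = No. Both are SQLite analogues, not Access settings.
conn.execute(
    "CREATE TABLE Customers ("
    " ID INTEGER PRIMARY KEY,"
    " LastName TEXT NOT NULL CHECK (length(LastName) > 0))"
)

def try_insert(value):
    """Attempt an insert and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on failure
            conn.execute("INSERT INTO Customers (LastName) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "blank (null)": try_insert(None),
    "empty text": try_insert(""),
    "real value": try_insert("Presley"),
}
print(results)
```

With both rules in place, only the real value gets through. Drop the CHECK and the empty text slips in, which mirrors what happens in Access when you leave Allow Zero Length set to Yes.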

Note

A similar distinction exists for numeric data types. Even if you set Required to Yes, you can still supply a number of 0. If you want to prevent that action, then you'll need to use the validation rules described later in this chapter (Validation Rules).

Setting Default Values

So far, the fields in your tables are either filled in explicitly by the person who adds the record or are left blank. But there's another option: you can supply a default value. Now, if someone inserts a record and leaves the field blank, Access applies the default value instead.

You set a default value using the Default Value field property. For a numeric AddedCost field, you could set this to be the number 0. For a text Country field, you could use the text "U.S.A." as a default value. (All text values must be wrapped in quotation marks when you use them for a default value.)

Access shows all your default values in the new-row slot at the bottom of the datasheet (Figure 4-4). It also automatically inserts default values into any hidden columns (Hiding Columns). But default value settings don't affect any of your existing records; they keep whatever value they had when you last edited them.

Figure 4-4. This dating service uses four default values: a default height (5.9), a default city (New York), a default state (also New York), and a default country (U.S.A.). This system makes sense, because most of their new entries have this information. On the other hand, there's no point in supplying a default value for the name fields.

Access inserts the default value when you create a new record. (You're then free to change that value.) You can also switch a field back to its default value by using the Ctrl+Alt+Space shortcut while you're editing it.

Tip

One nice feature is that you can use the default value as a starting point for a new record. For example, when you create a new record in the datasheet, you can edit the default value, rather than replacing it with a completely new value.

You can also create more intelligent dynamic default values. Access evaluates dynamic default values whenever you insert a new record, which means that the default value can vary based on other information. Dynamic default values use expressions (specialized database formulas) that can perform calculations or retrieve other details. One useful expression, Date(), grabs the current date that's set on your computer. If you use Date() as the Default Value for a date field (as shown in Figure 4-5), then Access automatically inserts the current date whenever you add a new record.

Figure 4-5. If you use the Date() function as the default value for the DateAcquired field in the bobblehead table, then every time you add a new bobblehead record, Access fills in the current date. You decide whether you want to keep that date or replace it with a different value.
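The same two flavors of default, fixed and dynamic, exist in most databases. The sketch below uses Python's sqlite3 module, not Access, so the syntax is different: SQLite's CURRENT_DATE keyword is used here as an assumed counterpart to the Access Date() expression, and it reports the date in UTC rather than your computer's local date. The cut-down Dolls table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Dolls ("
    " ID INTEGER PRIMARY KEY,"
    " Country TEXT DEFAULT 'U.S.A.',"           # fixed text default
    " AddedCost REAL DEFAULT 0,"                # fixed numeric default
    " DateAcquired TEXT DEFAULT CURRENT_DATE)"  # evaluated at insert time
)

# Leave every field blank: the defaults fill in all three values.
conn.execute("INSERT INTO Dolls DEFAULT VALUES")

# Supply one value explicitly: it overrides that field's default.
conn.execute("INSERT INTO Dolls (Country) VALUES ('Canada')")

rows = conn.execute(
    "SELECT Country, AddedCost, DateAcquired FROM Dolls ORDER BY ID"
).fetchall()
for row in rows:
    print(row)
```

The first record picks up every default, while the second keeps the country that was typed in and takes defaults only for the fields that were left alone.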

Preventing Duplicate Values with Indexes

In a properly designed table, every record must be unique. To enforce this restriction, you should choose a primary key (The Primary Key), which is one or more fields that won't be duplicated.

Here's the catch. As you learned in Chapter 2, the safest option is to create an ID field for the primary key. So far, all the tables you've seen have included this detail. But what if you need to make sure other fields are unique? Imagine you create an Employees table. You follow good database design principles and identify every record with an automatically generated ID number. However, you also want to make sure that no two employees have the same Social Security number (SSN), to prevent possible errors, like accidentally entering the same employee twice.

Tip

For a quick refresher about why ID fields are such a good idea, refer to "6. Include an ID Field." In the Employees table, you certainly could choose to make the SSN the primary key, but it's not the ideal situation when you start linking tables together (Chapter 5), and it causes problems if you need to change the SSN later on (in the case of an error), or if you enter employee information before you've received the SSN.

You can force a field to require unique values with an index. A database index is analogous to the index in a book: it's a list of values (from a field) with a cross-reference that points to the corresponding section (the full record). If you index the SocialSecurityNumber field, Access creates a list like this and stores it behind the scenes in your database file.

SocialSecurityNumber    Location of Full Record
001-01-3455             …
001-02-0434             …
001-02-9558             …
002-40-3200             …

Using this list, Access can quickly determine whether a new record duplicates an existing SSN. If it does, then Access doesn't let you insert it.

Up To Speed: How Indexes Work

It's important that the list of SSNs is sorted. Sorting means the number 001-01-3455 always occurs before 002-40-3200 in the index, regardless of where the record is physically stored in the database. This sorting is important, because it lets Access quickly check for duplicates. If you enter the number 001-02-4300, then Access needs to read only the first part of the list. Once it finds the next "larger" SSN (one that falls later in the sort, like 001-02-5010), it knows the remainder of the index doesn't contain a duplicate.

In practice, all databases use many more optimizations to make this process blazingly fast. But there's one key principle: without an index, Access would need to check the entire table. Tables aren't stored in sorted order, so there's no way Access can be sure a given SSN isn't in there unless it checks every record.
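Here's a toy version of a sorted index in Python, to show why sorting pays off. It's a sketch, not a description of how Access stores its indexes: it swaps in a binary search (Python's bisect module) over a plain list, whereas real database engines typically rely on more elaborate tree structures. The function names are invented for this example.

```python
from bisect import bisect_left

def is_duplicate(sorted_index, value):
    """Binary-search a sorted list to see whether value is already present."""
    pos = bisect_left(sorted_index, value)
    return pos < len(sorted_index) and sorted_index[pos] == value

def add_unique(sorted_index, value):
    """Insert value at its sorted position, or raise if it already exists."""
    pos = bisect_left(sorted_index, value)
    if pos < len(sorted_index) and sorted_index[pos] == value:
        raise ValueError(f"duplicate value: {value}")
    sorted_index.insert(pos, value)

# The SSNs from the sample index above, already in sorted order.
ssn_index = ["001-01-3455", "001-02-0434", "001-02-9558", "002-40-3200"]

add_unique(ssn_index, "001-02-4300")          # new value, slots into place
print(ssn_index)
print(is_duplicate(ssn_index, "001-02-4300"))  # now it counts as a duplicate
```

Because the list stays sorted, each check looks at only a handful of entries instead of walking through every value, and a second attempt to add 001-02-4300 is rejected.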

So how do you apply an index to a field? The trick is the Indexed field property, which is available for every data type except Attachment and OLE Object. When you add a field, the Indexed property is set to No, which means Access doesn't create an index. To add an index and prevent duplicates, you can change the Indexed property in Design view to Yes [No Duplicates]. The third option, Yes [Duplicates OK], creates an index but lets more than one record have the same value. This option doesn't help you catch repeated records, but you can use it to speed up searches (see the box on Indexes and Performance for more).

Note

As you know from Chapter 2, primary keys also disallow duplicates, using the same technique. When you define a primary key, Access automatically creates an index on that field.

When you close Design view after changing the Indexed field property, Access prompts you to save your changes. At this point, it creates any new indexes it needs. You can't create a no-duplicates index if you already have duplicate information in your table. In this situation, Access gives you an error message when you close the Design window and it attempts to add the index.
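You can watch the same sequence play out in a short sketch with Python's sqlite3 module. This is an analogy under stated assumptions, not Access itself: SQLite's CREATE UNIQUE INDEX is used as a stand-in for setting Indexed to Yes [No Duplicates], and the index name idx_ssn and the sample rows are made up.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, which keeps
# the example free of transaction bookkeeping.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Employees (ID INTEGER PRIMARY KEY, SocialSecurityNumber TEXT)"
)
conn.executemany(
    "INSERT INTO Employees (SocialSecurityNumber) VALUES (?)",
    [("001-01-3455",), ("001-02-0434",), ("001-01-3455",)],  # third row repeats the first
)

def create_unique_index():
    """Try to add a no-duplicates index; report whether it worked."""
    try:
        conn.execute(
            "CREATE UNIQUE INDEX idx_ssn ON Employees (SocialSecurityNumber)"
        )
        return True
    except sqlite3.IntegrityError:
        return False

first_attempt = create_unique_index()   # refused: duplicates already exist
conn.execute("DELETE FROM Employees WHERE ID = 3")
second_attempt = create_unique_index()  # accepted once the data is clean

try:
    conn.execute(
        "INSERT INTO Employees (SocialSecurityNumber) VALUES ('001-02-0434')"
    )
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(first_attempt, second_attempt, duplicate_accepted)
```

The order of events is the point: the index can't be created while a repeated SSN is in the table, and once it exists, it turns away any new record that repeats one.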

Frequently Asked Question: Indexes and Performance

Are indexes a tool for preventing bad data or a technique for boosting performance?

Indexes aren't just for preventing duplicate values. They also shine when you need to boost the speed of common searches. Access can use the index to look up the record it wants, much like you can use the index at the back of this book to find a specific topic.

If you perform a search that scours the Employees table looking for the person with a specific SSN, then Access can use the index. That way, it locates the matching entry much more quickly, and it simply follows the pointer to the full record.

For more information about how indexes can speed up searches, refer to Getting the top records. However, it's important to realize that indexes enhance performance only for extremely large, complex tables. If you're storing a few hundred records, each of which has a handful of fields, you really don't need an index, because Access already performs searches with blinding speed.

Multifield indexes

You can also use indexes to prevent a combination of values from being repeated. Imagine you create a People table to track your friends and their contact information. You're likely to have entries with the same first or last name. However, you may want to prevent two records from having the same first and last name. This limitation prevents you from inadvertently adding the same person twice.

Note

This example could cause endless headaches if you honestly do have two friends who share the same first and last names. In that case, you'll need to remove the index before you're allowed to add the name. You should think carefully about legitimate reasons for duplication before you create any indexes.

To ensure that a combination of fields is unique, you need to create a compound index, which combines the information from more than one field. Here's how to do it:

In Design view, choose Table Tools | Design→Show/Hide→Indexes.
The Indexes window appears (Figure 4-6). Using the Indexes window, you can see your current indexes and add new ones.
Choose a name for your index. Type this name into the first blank row in the Index Name column.
The index name has no real importance. Access uses it to store the index in the database, but you don't see the index name when you work with the table. Usually, you'll use the name of one or both of the fields you're indexing (like LastName+FirstName).
Figure 4-6. The Indexes window shows all the indexes that are defined for a table. Here, there's a single index for the ID field (which Access created automatically) and a compound index that's in the process of being created.
Choose the first field in the Field Name column in the same row (like LastName).
It doesn't matter which field name you use first. Either way, the index can prevent duplicate values. However, the order does affect how searches use the index to boost performance. You'll learn more in Getting the top records.
In the area at the bottom of the window, set the Unique box to Yes.
This creates an index that prevents duplicates (as opposed to one that's used only for boosting search speeds).
You can also set the Ignore Nulls box to Yes, if you want Access to allow duplicate blank values. Imagine you want to make the SSN field optional. In this case, you should set Ignore Nulls to Yes. If you set Ignore Nulls to No, then Access lets only one record have a blank SSN field, which probably isn't the behavior you want.
Tip
You can also disallow blank values altogether using the Required property, as described in Data Integrity Basics.
Ignore the Primary box (which identifies the index used for the primary key).
Move down one row. Leave the Index Name column blank (which tells Access it's still part of the previous index), but choose another field in the Field Name column (like FirstName).
If you want to create a compound index with more than two fields, then just repeat this step until you've added all the fields you need. Figure 4-7 shows what a finished index looks like.
You can now close the Indexes window.

Figure 4-7. Here's a compound index that prevents two people from sharing the same first and last names.
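For comparison, here's what a compound no-duplicates index looks like when it's written out as a command rather than built in the Indexes window. The sketch uses Python's sqlite3 module, not Access, so treat it as an illustration of the concept only. The index name and the sample people are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE People (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)"
)
# One index, two fields: only the combination has to be unique.
conn.execute(
    "CREATE UNIQUE INDEX idx_LastName_FirstName ON People (LastName, FirstName)"
)

def add_person(first, last):
    """Insert a person and report whether the database accepted the record."""
    try:
        conn.execute(
            "INSERT INTO People (FirstName, LastName) VALUES (?, ?)", (first, last)
        )
        return True
    except sqlite3.IntegrityError:
        return False

outcomes = [
    add_person("Maya", "Lopez"),  # first entry
    add_person("Maya", "Chen"),   # same first name, different last name
    add_person("Ravi", "Lopez"),  # same last name, different first name
    add_person("Maya", "Lopez"),  # same first and last name as the first entry
]
print(outcomes)
```

Sharing just a first name or just a last name is fine. Only the record that repeats both is turned away.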

Access 2010: The Missing Manual by Matthew MacDonald
