Conducting business in Japan is pretty easy—once you know the language. The same is true of SQL Server: it's pretty easy to manipulate and access data, once you know the language. In this case, the language is SQL, or Structured Query Language. Originally developed by IBM, SQL has since become a standard across the database industry. Well, kind of. As with America and England, Microsoft's SQL Server and Oracle's Oracle are two relational databases that are divided by a common language. The core parts of the SQL language are pretty consistent between vendors, but each supplier adds a lot of extra features and syntax variations designed by Edgar Codd wannabes.
This section describes those DDL and DML statements that will be most useful in our development of the Library program. You'll be glad to know that SQL isn't too picky about the formatting of the various statements. Upper- and lowercase distinctions are ignored;
SELECT is the same as
select is the same as
SeLeCt. (Traditional SQL code is mostly uppercase. I use uppercase for all keywords, and mixed case for tables, fields, and other custom items. Whatever you choose, consistency is important.) Also, employ whitespace as you see fit. You can put statements on one gigantic line, or put every word on a separate line. The only time whitespace and case matter is in the actual data text strings; whatever you type, that's how it stays.
SQL statements normally end with a semicolon, but some tools do not require you to include the semicolon, and other tools require that you exclude it. When using the SQL Server visual client tools (Management Studio and Management Studio Express), semicolons are optional, but it's a good idea to include them when you are using multiple statements together, one after another. SQL statements used in Visual Basic code never include semicolons.
Later, when you look at a SQL script I wrote, you will see the word
GO from time to time. In SQL Server, this command says, "For all of the other statements that appeared so far, go ahead and process them now."
This may come as a shock to you, but before you can store any data in a table, you have to create that table. SQL has just the tool to do this: the
CREATE TABLE statement. It's one of the many DDL statements. The basic syntax is pretty straightforward:
CREATE TABLE tableName ( fieldName1 dataType options, fieldName2 dataType options, and so on... )
Just fill in the parts and you're ready to populate (data, that is). Table and field names are built from letters and digits; you can include spaces and some other special characters, but it makes for difficult coding later on. Each vendor has its own collection of data types; I'll stick with the SQL Server versions here. The options let you specify things such as whether the field requires data, whether it represents the table's primary key, and other similar constraints. Extensions to the syntax let you set up constraints that apply to the entire table, indexes (which let you sort or search a specific column more quickly), and data storage specifics.
Here's a sample
CREATE TABLE statement that could be used for the table of order line items (refer to Table 4-4):
CREATE TABLE LineItems ( RecordID bigint IDENTITY PRIMARY KEY, OrderID bigint NOT NULL, CustomerID varchar(20) NOT NULL REFERENCES Customers (CustomerID), ProductID varchar(20) NOT NULL, Quantity smallint NOT NULL )
IDENTITY keyword lets SQL Server take charge of filling the
RecordID field with data; it will use a sequential counter to supply a unique
RecordID value with each new record. The
PRIMARY KEY clause identifies the
RecordID field as the unique identifying value for each record in the table. The
smallint data types indicate appropriately sized integer fields, and the
varchar type provides space for text, up to the maximum length specified in the parentheses (20 characters). The
REFERENCES option clause identifies a relationship between this
LineItems table and another table named
Customers; values in the
LineItems.CustomerID field match the key values from the
Customers.CustomerID field. (Note the "dot" syntax to separate table and field names. It shows up everywhere in SQL.) References between tables are also known as foreign references.
If you need to make structure or option changes to a table or its fields after it is created, SQL includes an
ALTER TABLE statement that can change almost everything in the table. Additionally, there is a related
DROP TABLE statement used to get rid of a table and all of its data. You might want to avoid this statement on live production data, as users tend to get a bit irritable when their data suddenly disappears off the surface of the earth.
Table 4-5 summarizes the available data types used in SQL Server.
Table 4-5. SQL Server data types
An 8-byte (64-bit) integer field for values ranging from −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
Fixed-length binary data, up to 8,000 bytes in length. You specify the length through a parameter, as in
Supports three possible values:
Fixed-length standard (
This data type is used within stored procedures, and cannot be used to create a column.
A general date and time field for dates ranging from January 1, 1753 AD to December 31, 9999 AD. Time accuracy for any given value is within 3.33 milliseconds. SQL Server 2008 adds several new date-related data types:
A fixed-precision and scale decimal field. You specify the maximum number of digits to appear on both sides of the decimal point (the precision) and the maximum number of those digits that can appear on the right side of the decimal point (the scale). For instance, a setting of
A floating-point decimal field with variable storage. You can specify the number of bits used to store the value, up to 53. By default, all 53 bits are used, so a setting of
This data type, new with SQL Server 2008, supports querying of hierarchical and tree-shaped data. It is not available in SQL Server 2005.
Don't use these data types, as they will eventually be removed from SQL Server.
A 4-byte (32-bit) integer field for values ranging from −2,147,483,648 to 2,147,483,647.
An 8-byte (64-bit) high-accuracy field for storing currency values, with up to four digits after the decimal point. Stored data values range from −922,337,203,685,477.5808 to 922,337,203,685,477.5807.
This data type is used to record modification events on records. There are restrictions on its use, and it is not guaranteed to be unique within a table.
A general date and time field for dates ranging from January 1, 1900 AD to June 6, 2079 AD. Time accuracy for any given value is within one minute.
A 2-byte (16-bit) integer field for values ranging from −32,768 to 32,767.
A 4-byte (32-bit) high-accuracy field for storing currency values, with up to four digits after the decimal point. Stored data values range from −214,748.3648 to 214,748.3647.
A generic type that stores values from many other type-specific fields.
A special field that temporarily stores the results of a query in a compacted table format. Defining a table field is somewhat complex, and its use naturally carries with it certain restrictions.
A 1-byte (8-bit) unsigned integer field for values ranging from 0 to 255.
A 16-byte globally unique identifier (GUID). The related
Variable-length binary data, up to 8,000 bytes in length. You specify the length through a parameter, as in
Variable-length standard (
Provides storage for typed and untyped XML data documents, up to 2 GB.
Although DDL statements are powerful, they aren't used that much. Once you create your database objects, there's not much call for tinkering. The DML statements are more useful for everyday data surfing.
INSERT statement adds data records to a table. Data is added to a table one record at a time. (A variation of
INSERT lets you insert multiple records, but those records must come from another existing table source.) To use the
INSERT statement, specify the destination table and fields, and then the individual values to put into each field. One data value corresponds to each specified data column name.
INSERT INTO LineItems (OrderID, CustomerID, ProductID, Quantity) VALUES (10002, 'BW3', 'BEV01COF', 1)
Assuming this statement goes with the
CREATE TABLE statement written earlier, this insert action will add a new record to the
LineItems table with five new fields—four specified fields, plus the primary key automatically added to the
RecordID field (since it was marked as
IDENTITY). SQL Server also does a variety of data integrity checks on your behalf. Each data field you add must be of the right data type, but you already expected that. Since we designed the
CustomerID field to be a reference to the
Customer table, the insert will fail if customer BW3 does not already exist in the
Numeric literals can be included in your SQL statements as needed without any additional qualification. String literals are always surrounded by single quotes, as is done for the customer and product IDs in this
INSERT statement. If you need to include single quotes in the literal, enter them twice:
Surround literal date and time values with single quotes:
Such date and time values accept any recognized format, although you should use a format that is not easy for SQL Server to misinterpret.
Many field types support an "unassigned" value, a value that indicates that the field contains no data at all. Such a value is known as the "null" value, and is specified in SQL Server using the
NULL keyword. You cannot assign
NULL to primary key fields, or to any field marked with the
NOT NULL option.
To remove a previously added record, use the
DELETE FROM LineItems WHERE RecordID = 92231
DELETE statement includes a
WHERE clause (the
WHERE RecordID = 92231 part).
WHERE clauses let you indicate one or more records in a table by making comparisons with data fields. Your
WHERE clauses can include
OR keywords to join multiple conditions, and parentheses for grouping.
DELETE FROM LineItems WHERE OrderID = 10001 AND ProductID = 'BRD05RYE'
DELETE statement may delete zero, one, or 1,000 records, so precision in the
WHERE clause is important. To delete all records in the table, exclude the
WHERE clause altogether.
DELETE FROM LineItems
UPDATE statement also uses a
WHERE clause to modify values in existing table records.
UPDATE LineItems SET Quantity = 4 WHERE RecordID = 92231
Assignments are made to fields with the
SET clause; put the field name (
Quantity) on the left side of the equals sign, and the new value on the right (
4). To assign multiple values at once, separate each assignment with a comma. You can also include formulas and calculations.
UPDATE LineItems SET Quantity = Quantity + 1, ProductID = 'BEV02POP' WHERE RecordID = 92231
As with the
DELETE statement, the
UPDATE statement may update zero, one, or many records based on which records match the
The final DML statement, and the one most often used, is
SELECT ProductID, Quantity FROM LineItems WHERE RecordID = 92231
SELECT scans a table (
LineItems), looking for all records matching a given criterion (
RecordID = 92231), and returns a smaller table that contains just the indicated fields (
Quantity) for the matching records. The most basic query returns all rows and columns.
SELECT * FROM LineItems
This query returns all records from the table in no particular order. The asterisk (
*) means "include all fields."
ORDER BY clause returns the results in a specific order.
SELECT * FROM LineItems WHERE Quantity > 5 ORDER BY ProductID, Quantity DESC
This query returns all records that have a
Quantity field value of more than five, and sorts the results first by the
ProductID column (in ascending order) and then by the numeric quantity (in descending order, specified with
Aggregate functions and grouping features let you summarize results from the larger set of data. The following query documents the total ordered quantity for each product in the table:
SELECT ProductID, SUM(Quantity) FROM LineItems GROUP BY ProductID
You can use joins to link together the data from two or more distinct tables. The following query joins the
Customer tables on their matching
CustomerID columns. This
SELECT statement also demonstrates the use of table abbreviations (the "LI" and "CU" prefixes) added through the
AS clauses; they aren't usually necessary, but they can help make a complex query more readable.
SELECT LI.OrderID, CU.CustomerName, LI.ProductID FROM LineItems AS LI INNER JOIN Customer AS CU ON LI.CustomerID = CU.CustomerID ORDER BY LI.OrderID, CU.CustomerName
This table uses an "inner join," one of the five main types of joins, each of which returns different sets of records based on the relationship between the first (left) and second (right) tables in the join:
Returns only those records where there is a match in the linked fields. This type of join uses the
INNER JOIN keywords.
Returns every record from the left table and only those records from the right table where there is a match in the linked fields. If a left table record doesn't have a match, it acts as though all the fields in the right table for that record contain
NULL values. This type of join uses the
LEFT JOIN keywords. One use might be to join the
LineItems tables. You could return a list of the full product name for all available products, plus the total quantity ordered for each one. By putting the
Product table on the left of a left outer join, the query would return all product names, even if that product had never been ordered (and didn't appear in the
Returns all records from the left and right tables, whether they have matches or not. When there is a match, it is reflected in the results. This type of join uses the
FULL JOIN keywords.
Joining focuses on the relationship that two tables have. (This use of "relationship," by the way, is not the basis for the term relational database.) Some tables exist in a "parent-child" relationship; one "parent" record has one or more dependent "child" records in another table. This is often true of orders; a single "order header" has multiple "line items." This type of relationship is known as one-to-many, since one record is tied to many records in the other table. And the relationship is unidirectional; a given child record does not tie to multiple parent records.
A one-to-one relationship ties a single record in one table to a single record in another table. It's pretty straightforward, and is often used to enhance the values found in the original record through a supplementary record in a second table.
In a many-to-many relationship, a single record in one table is associated with multiple records in a second table, and a single record in that second table is also associated with multiple records in the first table. A real-world example would be the relationship between teachers and students in a college setting. One teacher has multiple students in the classroom, but each student also has multiple teachers each semester. Practical implementations of many-to-many relationships actually require three tables: the two related tables, and a "go-between" table that links them together. I will show you a sample of such a table in the upcoming "Project" section of this chapter.
The sample statements I listed here only scratch the surface of the data manipulation possibilities available through SQL. But by now you should have noticed that SQL is remarkably English-like in syntax, much more than even Visual Basic. In fact, the original name for the language—SEQUEL—was an acronym for "Structured English Query Language." As the SQL statements get more complex, they will look less and less like an eighth-grade essay and more like random collections of English words.
The goal here is to introduce you to the basic structure of SQL statements. Most of the statements we will encounter in the Library Project will be no more complex than the samples included here. If you're hungry for more, the "Books Online" component installed with SQL Server (a separate download for the Express Edition) has some pretty good usage documentation. Several good books on the ins and outs of SQL, including vendor-specific dialects, are also available.