Chapter 4. Table Management

Introduction

This chapter covers topics that relate to creating and populating tables:

  • Cloning a table

  • Copying from one table to another

  • Using temporary tables

  • Generating unique table names

  • Determining what storage engine a table uses or converting it from one storage engine to another

Many of the examples in this chapter use a table named mail containing rows that track mail message traffic between users on a set of hosts (see Introduction). To create and load this table, change location into the tables directory of the recipes distribution and run this command:

% mysql cookbook < mail.sql

Cloning a Table

Problem

You want to create a table that has exactly the same structure as an existing table.

Solution

Use CREATE TABLELIKE to clone the table structure. To also copy some or all of the rows from the original table to the new one, use INSERT INTOSELECT.

Discussion

To create a new table that is just like an existing table, use this statement:

CREATE TABLE new_table LIKE original_table;

The structure of the new table is the same as that of the original table, with a few exceptions: CREATE TABLELIKE does not copy foreign key definitions, and it doesn’t copy any DATA DIRECTORY or INDEX DIRECTORY table options that the table might use.

The new table is empty. If you also want the contents to be the same as the original table, copy the rows using an INSERT INTOSELECT statement:

INSERT INTO new_table SELECT * FROM original_table;

To copy only part of the table, add an appropriate WHERE clause that identifies which rows to copy. For example, these statements create a copy of the mail table named mail2, populated only with the rows for mail sent by barb:

CREATE TABLE mail2 LIKE mail;
INSERT INTO mail2 SELECT * FROM mail WHERE srcuser = 'barb';

For more information about INSERTSELECT, see Saving a Query Result in a Table.

Saving a Query Result in a Table

Problem

You want to save the result from a SELECT statement to a table rather than display it.

Solution

If the table exists, retrieve rows into it using INSERT INTOSELECT. If the table does not exist, create it on the fly using CREATE TABLESELECT.

Discussion

The MySQL server normally returns the result of a SELECT statement to the client that executed the statement. For example, when you execute a statement from within the mysql program, the server returns the result to mysql, which in turn displays it on the screen. It’s possible to save the results of a SELECT statement in a table instead, which is useful in several ways:

  • You can easily create a complete or partial copy of a table. If you’re developing an algorithm that modifies a table, it’s safer to work with a copy of a table so that you need not worry about the consequences of mistakes. If the original table is large, creating a partial copy can speed the development process because queries run against it take less time.

  • For a data-loading operation based on information that might be malformed, load new rows into a temporary table, perform some preliminary checks, and correct the rows as necessary. When you’re satisfied that the new rows are okay, copy them from the temporary table to your main table.

  • Some applications maintain a large repository table and a smaller working table into which rows are inserted on a regular basis, copying the working table rows to the repository periodically and clearing the working table.

  • To perform summary operations on a large table more efficiently, avoid running expensive summary operations repeatedly on it. Instead, select summary information once into a second table and use that for further analysis.

This section shows how to retrieve a result set into a table. The table names src_tbl and dst_tbl in the examples refer to the source table from which rows are selected and the destination table into which they are stored, respectively.

If the destination table already exists, use INSERTSELECT to copy the result set into it. For example, if dst_tbl contains an integer column i and a string column s, the following statement copies rows from src_tbl into dst_tbl, assigning column val to i and column name to s:

INSERT INTO dst_tbl (i, s) SELECT val, name FROM src_tbl;

The number of columns to be inserted must match the number of selected columns, with the correspondence between columns based on position rather than name. To copy all columns, you can shorten the statement to this form:

INSERT INTO dst_tbl SELECT * FROM src_tbl;

To copy only certain rows, add a WHERE clause that selects those rows:

INSERT INTO dst_tbl SELECT * FROM src_tbl
WHERE val > 100 AND name LIKE 'A%';

The SELECT statement can produce values from expressions, too. For example, the following statement counts the number of times each name occurs in src_tbl and stores both the counts and the names in dst_tbl:

INSERT INTO dst_tbl (i, s) SELECT COUNT(*), name
FROM src_tbl GROUP BY name;

If the destination table does not exist, create it first with a CREATE TABLE statement, then copy rows into it with INSERTSELECT. Alternatively, use CREATE TABLESELECT to create the destination table directly from the result of the SELECT. For example, to create dst_tbl and copy the entire contents of src_tbl into it, do this:

CREATE TABLE dst_tbl SELECT * FROM src_tbl;

MySQL creates the columns in dst_tbl based on the name, number, and type of the columns in src_tbl. To copy only certain rows, add an appropriate WHERE clause. To create an empty table, use a WHERE clause that selects no rows:

CREATE TABLE dst_tbl SELECT * FROM src_tbl WHERE FALSE;

To copy only some of the columns, name the ones you want in the SELECT part of the statement. For example, if src_tbl contains columns a, b, c, and d, copy just b and d like this:

CREATE TABLE dst_tbl SELECT b, d FROM src_tbl;

To create columns in an order different from that in which they appear in the source table, name them in the desired order. If the source table contains columns a, b, and c that should appear in the destination table in the order c, a, b, do this:

CREATE TABLE dst_tbl SELECT c, a, b FROM src_tbl;

To create columns in the destination table in addition to those selected from the source table, provide appropriate column definitions in the CREATE TABLE part of the statement. The following statement creates id as an AUTO_INCREMENT column in dst_tbl and adds columns a, b, and c from src_tbl:

CREATE TABLE dst_tbl
(
  id INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
)
SELECT a, b, c FROM src_tbl;

The resulting table contains four columns in the order id, a, b, c. Defined columns are assigned their default values. This means that id, being an AUTO_INCREMENT column, is assigned successive sequence numbers starting from 1 (see Creating a Sequence Column and Generating Sequence Values).

If you derive a column’s values from an expression, its default name is the expression itself, which can be difficult to work with later. In this case, it’s prudent to give the column a better name by providing an alias (see Naming Query Result Columns). Suppose that src_tbl contains invoice information that lists items in each invoice. The following statement generates a summary that lists each invoice named in the table and the total cost of its items, using an alias for the expression:

CREATE TABLE dst_tbl
SELECT inv_no, SUM(unit_cost*quantity) AS total_cost
FROM src_tbl GROUP BY inv_no;

CREATE TABLESELECT is extremely convenient, but has some limitations that arise from the fact that the information available from a result set is not as extensive as what you can specify in a CREATE TABLE statement. For example, MySQL has no idea whether a result set column should be indexed or what its default value is. If it’s important to include this information in the destination table, use the following techniques:

  • To make the destination table an exact copy of the source table, use the cloning technique described in Cloning a Table.

  • To include indexes in the destination table, specify them explicitly. For example, if src_tbl has a PRIMARY KEY on the id column, and a multiple-column index on state and city, specify them for dst_tbl as well:

    CREATE TABLE dst_tbl (PRIMARY KEY (id), INDEX(state,city))
    SELECT * FROM src_tbl;
  • Column attributes such as AUTO_INCREMENT and a column’s default value are not copied to the destination table. To preserve these attributes, create the table, then use ALTER TABLE to apply the appropriate modifications to the column definition. For example, if src_tbl has an id column that is not only a PRIMARY KEY but also an AUTO_INCREMENT column, copy the table and modify the copy:

    CREATE TABLE dst_tbl (PRIMARY KEY (id)) SELECT * FROM src_tbl;
    ALTER TABLE dst_tbl MODIFY id INT UNSIGNED NOT NULL AUTO_INCREMENT;

Creating Temporary Tables

Problem

You need a table only for a short time, after which you want it to disappear automatically.

Solution

Create a table using the TEMPORARY keyword, and let MySQL take care of removing it.

Discussion

Some operations require a table that exists only temporarily and that should disappear when it’s no longer needed. You can, of course, execute a DROP TABLE statement explicitly to remove a table when you’re done with it. Another option is to use CREATE TEMPORARY TABLE. This statement is like CREATE TABLE but creates a transient table that disappears when your session with the server ends, if you haven’t already removed it yourself. This is extremely useful behavior because MySQL drops the table for you automatically; you need not remember to do it. TEMPORARY can be used with the usual table-creation methods:

  • Create the table from explicit column definitions:

    CREATE TEMPORARY TABLE tbl_name (...column definitions...);
  • Create the table from an existing table:

    CREATE TEMPORARY TABLE new_table LIKE original_table;
  • Create the table on the fly from a result set:

    CREATE TEMPORARY TABLE tbl_name SELECT ... ;

Temporary tables are session-specific, so multiple clients can each create a temporary table having the same name without interfering with each other. This makes it easier to write applications that use transient tables because you need not ensure that the tables have unique names for each client. (For further discussion of table-naming issues, see Generating Unique Table Names.)

A temporary table can have the same name as a permanent table. In this case, the temporary table hides the permanent table for the duration of its existence, which can be useful for making a copy of a table that you can modify without affecting the original by mistake. The DELETE statement in the following example removes rows from a temporary mail table, leaving the original permanent table unaffected:

mysql> CREATE TEMPORARY TABLE mail SELECT * FROM mail;
mysql> SELECT COUNT(*) FROM mail;
+----------+
| COUNT(*) |
+----------+
|       16 |
+----------+
mysql> DELETE FROM mail;
mysql> SELECT COUNT(*) FROM mail;
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
mysql> DROP TEMPORARY TABLE mail;
mysql> SELECT COUNT(*) FROM mail;
+----------+
| COUNT(*) |
+----------+
|       16 |
+----------+

Although temporary tables created with CREATE TEMPORARY TABLE have the benefits just discussed, keep the following caveats in mind:

  • To reuse a temporary table within a given session, you must still drop it explicitly before re-creating it. Attempting to create a second temporary table with the same name results in an error.

  • If you modify a temporary table that hides a permanent table with the same name, be sure to test for errors resulting from dropped connections if you use a programming interface that has reconnect capability enabled. If a client program automatically reconnects after detecting a dropped connection, modifications affect the permanent table after the reconnect, not the temporary table.

  • Some APIs support persistent connections or connection pools. These prevent temporary tables from being dropped as you expect when your script ends because the connection remains open for reuse by other scripts. Your script has no control over when the connection closes. This means it can be prudent to execute the following statement prior to creating a temporary table, just in case it’s still in existence from a previous execution of the script:

    DROP TEMPORARY TABLE IF EXISTS tbl_name

    The TEMPORARY keyword is useful here if the temporary table has already been dropped, to avoid dropping any permanent table that has the same name.

Generating Unique Table Names

Problem

You need to create a table with a name guaranteed not to exist.

Solution

If you create a TEMPORARY table, it doesn’t matter whether a permanent table with that name exists. Otherwise, try to generate a value that is unique to your client program and incorporate it into the table name.

Discussion

MySQL is a multiple-client database server, so if a given script that creates a transient table might be invoked by several clients simultaneously, take care that multiple invocations of the script do not fight over the same table name. If the script creates tables using CREATE TEMPORARY TABLE, there is no problem because different clients can create temporary tables having the same name without clashing.

If you cannot or do not want to use a TEMPORARY table, make sure that each invocation of the script creates a uniquely named table and drops the table when it is no longer needed. To accomplish this, incorporate into the name some value guaranteed to be unique per invocation. A timestamp won’t work if it’s possible for two instances of a script to be invoked within the timestamp resolution. A random number may be better, but random numbers only reduce the possibility of name clashes, not eliminate it. Process ID (PID) values are a better source of unique values. PIDs are reused over time, but never for two processes at the same time, so a given PID is guaranteed to be unique among the set of currently executing processes. Use this fact to create unique table names as follows.

Perl:

my $tbl_name = "tmp_tbl_$$";

Ruby:

tbl_name = "tmp_tbl_" + Process.pid.to_s

PHP:

$tbl_name = "tmp_tbl_" . posix_getpid ();

Python:

import os
tbl_name = "tmp_tbl_%d" % os.getpid()

The PID approach should not be used in contexts such as scripts run within multithreaded web servers in which all threads share the same process ID.

Connection identifiers are another source of unique values. The MySQL server reuses these numbers over time, but no two simultaneous connections to the server have the same ID. To get your connection ID, execute this statement and retrieve the result:

SELECT CONNECTION_ID();

It’s possible to incorporate a connection ID into a table name within SQL by using prepared statements. The following example illustrates this, referring to the table name in the CREATE TABLE statement and a precautionary DROP TABLE statement:

SET @tbl_name = CONCAT('tmp_tbl_', CONNECTION_ID());
SET @stmt = CONCAT('DROP TABLE IF EXISTS ', @tbl_name);
PREPARE stmt FROM @stmt;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET @stmt = CONCAT('CREATE TABLE ', @tbl_name, ' (i INT)');
PREPARE stmt FROM @stmt;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Why execute the DROP TABLE? Because if you create a table name using an identifier such as a PID or connection ID guaranteed to be unique to a given script invocation, there may still be a chance that the table already exists if an earlier invocation of the script with the same PID created a table with the same name, but crashed before removing the table. On the other hand, any such table cannot still be in use because it will have been created by a process that is no longer running. Under these circumstances, it’s safe to remove the old table if it does exist before creating the new one.

Some MySQL APIs expose the connection ID directly without requiring any statement to be executed. For example, in Perl DBI, use the mysql_thread_id attribute of your database handle:

my $tbl_name = "tmp_tbl_" . $dbh->{mysql_thread_id};

In Ruby DBI, do this:

tbl_name = "tmp_tbl_" + dbh.func(:thread_id).to_s

Checking or Changing a Table Storage Engine

Problem

You want to check which storage engine a table uses so that you can determine what engine capabilities are applicable. Or you need to change a table’s storage engine because you realize that the capabilities of another engine are more suitable for the way you use the table.

Solution

To determine a table’s storage engine, you can use any of several statements. To change the table’s engine, use ALTER TABLE with an ENGINE clause.

Discussion

MySQL supports multiple storage engines, which have differing characteristics. For example, the InnoDB engine supports transactions, whereas MyISAM does not. If you need to know whether a table supports transactions, check which storage engine it uses. If the table’s engine does not support transactions, you can convert the table to use a transaction-capable engine.

To determine the current engine for a table, check INFORMATION_SCHEMA or use the SHOW TABLE STATUS or SHOW CREATE TABLE statement. For the mail table, obtain engine information as follows:

mysql> SELECT ENGINE FROM INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'mail';
+--------+
| ENGINE |
+--------+
| InnoDB |
+--------+

mysql> SHOW TABLE STATUS LIKE 'mail'\G
*************************** 1. row ***************************
           Name: mail
         Engine: InnoDB
…

mysql> SHOW CREATE TABLE mail\G
*************************** 1. row ***************************
       Table: mail
Create Table: CREATE TABLE `mail` (
... column definitions ...
) ENGINE=InnoDB DEFAULT CHARSET=latin1

To change the storage engine for a table, use ALTER TABLE with an ENGINE specifier. For example, to convert the mail table to use the MyISAM storage engine, use this statement:

ALTER TABLE mail ENGINE = MyISAM;

Be aware that converting a large table to a different storage engine might take a long time and be expensive in terms of CPU and I/O activity.

To determine which storage engines your MySQL server supports, check the output from the SHOW ENGINES statement or query the INFORMATION_SCHEMA ENGINES table.

Copying a Table Using mysqldump

Problem

You want to copy a table or tables, either among the databases managed by a MySQL server, or from one server to another.

Solution

Use the mysqldump program.

Discussion

The mysqldump program makes a backup file that can be reloaded to re-create the original table or tables:

% mysqldump cookbook mail > mail.sql

The output file mail.sql consists of a CREATE TABLE statement to create the mail table and a set of INSERT statements to insert its rows. You can reload the file to re-create the table should the original be lost:

% mysql cookbook < mail.sql

This method also makes it easy to deal with any triggers the table has. By default, mysqldump writes the triggers to the dump file, so reloading the file copies the triggers along with the table with no special handling.

In addition to restoring tables, mysqldump can be used to make copies of them, by reloading the output into a different database. (If the destination database does not exist, create it first.) The following examples show some useful table-copying commands.

Copying tables within a single MySQL server

  • Copy a single table to a different database:

    % mysqldump cookbook mail > mail.sql
    % mysql other_db < mail.sql

    To dump multiple tables, name them all following the database name argument.

  • Copy all tables in a database to a different database:

    % mysqldump cookbook > cookbook.sql
    % mysql other_db < cookbook.sql

    When you name no tables after the database name, mysqldump dumps them all. To also include stored routines and events, add the --routines and --events options to the mysqldump command. (There is also a --triggers option, but it’s unneeded because, as mentioned previously, mysqldump dumps triggers with their associated tables by default.)

  • Copy a table, using a different name for the copy:

    1. Dump the table:

      % mysqldump cookbook mail > mail.sql
    2. Reload the table into a different database that does not contain a table with that name:

      % mysql other_db < mail.sql
    3. Rename the table:

      % mysql other_db
      mysql> RENAME mail TO mail2;

      Or, to move the table into another database at the same time, qualify the new name with the database name:

      % mysql other_db
      mysql> RENAME mail TO cookbook.mail2;

To perform a table-copying operation without an intermediary file, use a pipe to connect the mysqldump and mysql commands:

% mysqldump cookbook mail | mysql other_db
% mysqldump cookbook | mysql other_db

Copying tables between MySQL servers

The preceding commands use mysqldump to copy tables among the databases managed by a single MySQL server. Output from mysqldump can also be used to copy tables from one server to another. Suppose that you want to copy the mail table from the cookbook database on the local host to the other_db database on the host other-host.example.com. One way to do this is to dump the output into a file:

% mysqldump cookbook mail > mail.sql

Then copy mail.sql to other-host.example.com, and run the following command there to load the table into that MySQL server’s other_db database:

% mysql other_db < mail.sql

To accomplish this without an intermediary file, use a pipe to send the output of mysqldump directly over the network to the remote MySQL server. If you can connect to both servers from your local host, use this command:

% mysqldump cookbook mail | mysql -h other-host.example.com other_db

The mysqldump half of the command connects to the local server and writes the dump output to the pipe. The mysql half of the command connects to the remote MySQL server on other-host.example.com. It reads the pipe for input and sends each statement to the other-host.example.com server.

If you cannot connect directly to the remote server using mysql from your local host, send the dump output into a pipe that uses ssh to invoke mysql remotely on other-host.example.com:

% mysqldump cookbook mail | ssh other-host.example.com mysql other_db

ssh connects to other-host.example.com and launches mysql there. It then reads the mysqldump output from the pipe and passes it to the remote mysql process. ssh can be useful to send a dump over the network to a machine that has the MySQL port blocked by a firewall but that permits connections on the SSH port.

Regarding which table or tables to copy, similar principles apply as for local copies. To copy multiple tables over the network, name them all following the database argument of the mysqldump command. To copy an entire database, don’t specify any table names after the database name; mysqldump dumps all its tables.

Get MySQL Cookbook, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.