Chapter 4. Administrating Berkeley DB SQL Databases

Table of Contents

Backing Up Berkeley DB SQL Databases
Offline Backups
Hot Backup
Incremental Backups
About Unix Copy Utilities
Recovering from a Backup
Catastrophic Recovery
Syncing with Oracle Databases
Syncing on Unix Platforms
Syncing on Windows Platforms
Syncing on Windows Mobile Platforms
Data Migration
Migration Using the Shells
Supported Data and Schema
Replicating Berkeley DB SQL Databases
Preparing to use Replication with the Berkeley DB SQL API
Using Replication with the Berkeley DB SQL API

This chapter provides administrative procedures that are unique to the Berkeley DB SQL interface.

Backing Up Berkeley DB SQL Databases

You can use the standard SQLite .dump command to back up the data managed by the BDB SQL interface. You can also use the standard Berkeley DB backup mechanisms on the database.

The BDB SQL interface supports the standard SQLite Online Backup API. However, there is one small difference between the two interfaces: in the BDB SQL interface, the value returned by the sqlite3_backup_remaining method and the number of pages passed to the sqlite3_backup_step method are estimates of the number of pages to be copied, not exact values. To be certain that the backup process is complete, check whether the sqlite3_backup_step method has returned SQLITE_DONE. To learn how to use the SQLite Online Backup API, see the official SQLite documentation page.

This section describes the backup procedures that can be performed from the command line.

Offline Backups

To create an offline backup:

  1. Commit or abort all ongoing transactions.

  2. Pause all database writes.

  3. Force a checkpoint. See the db_checkpoint command line utility.

  4. Copy your database file to the backup location. To be able to perform recovery from this backup, do not change the name of the database file.

  5. Copy the last log file to your backup location. Your log files are named log.xxxxxxxxxx, where xxxxxxxxxx is a sequential number. The last log file is the file with the highest number.

    Remember that your log files are placed in the environment directory, which is created on disk next to your database file. It has the same name as your database file, but with a -journal extension. For example, if your database is named mydb.db, then your environment directory is named mydb.db-journal.
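The copy portion of the steps above (steps 4 and 5) can be sketched as a small script. This is a sketch under stated assumptions: the paths and log-file contents are stand-ins, and the checkpoint in step 3 must still be run separately with the db_checkpoint utility from your Berkeley DB installation (it is not invoked here).

```python
import os
import re
import shutil
import tempfile

def offline_backup(db_path, backup_dir):
    """Copy the database file (name unchanged) plus the highest-numbered
    log.xxxxxxxxxx file from the <db>-journal environment directory."""
    os.makedirs(backup_dir, exist_ok=True)
    # Step 4: the file name must not change, or recovery will not find it.
    shutil.copy2(db_path, os.path.join(backup_dir, os.path.basename(db_path)))
    # Step 5: the last log file is the one with the highest number.
    env_dir = db_path + "-journal"
    logs = sorted(n for n in os.listdir(env_dir)
                  if re.fullmatch(r"log\.\d{10}", n))
    if logs:
        shutil.copy2(os.path.join(env_dir, logs[-1]),
                     os.path.join(backup_dir, logs[-1]))
    return logs[-1] if logs else None

# Demonstration with stand-in files (a real database would be
# checkpointed with db_checkpoint before copying).
tmp = tempfile.mkdtemp()
db = os.path.join(tmp, "mydb.db")
with open(db, "wb") as f:
    f.write(b"database contents")
os.makedirs(db + "-journal")
for name in ("log.0000000001", "log.0000000002"):
    with open(os.path.join(db + "-journal", name), "wb") as f:
        f.write(b"log records")

last = offline_backup(db, os.path.join(tmp, "backup"))
print(last)  # → log.0000000002
print(sorted(os.listdir(os.path.join(tmp, "backup"))))
# → ['log.0000000002', 'mydb.db']
```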

Hot Backup

To create a hot backup, you do not have to stop database operations. Transactions may be ongoing, and you can be writing to your database at the time of the backup. However, this means that you do not know the exact state of your database at the time of the backup.

You can use the db_hotbackup command line utility to create a hot backup for you. This utility will (optionally) run a checkpoint and then copy all necessary files to a target directory. To do this when you are using the BDB SQL interface:

  1. Create a DB_CONFIG file in your environment directory.

  2. Add a set_data_dir parameter to the DB_CONFIG file. This parameter indicates what directory contains the actual Berkeley DB database managed by this environment. That directory is one level up from your environment directory, so you want this parameter to be:

     set_data_dir ..
  3. Add a set_lg_dir parameter to the DB_CONFIG file. This parameter identifies the directory that contains the environment's log files. This parameter should be:

     set_lg_dir .
  4. Run the db_hotbackup command:

     db_hotbackup -h [environment directory] -b [target directory] -D

    The -D option tells the utility to read the DB_CONFIG file before running the backup.
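Steps 1 through 3 amount to writing a two-line DB_CONFIG file into the environment directory. The sketch below prepares that file; the environment and target paths are stand-ins, and the db_hotbackup command is only composed, not executed, since the utility comes from your Berkeley DB installation.

```python
import os
import tempfile

env_dir = tempfile.mkdtemp(suffix=".db-journal")  # stand-in for mydb.db-journal
target_dir = tempfile.mkdtemp()                   # backup destination

# Steps 1-3: DB_CONFIG in the environment directory, pointing the data
# directory one level up and the log directory at the environment itself.
with open(os.path.join(env_dir, "DB_CONFIG"), "w") as f:
    f.write("set_data_dir ..\n")
    f.write("set_lg_dir .\n")

# Step 4: the command to run; -D makes db_hotbackup read DB_CONFIG first.
cmd = "db_hotbackup -h %s -b %s -D" % (env_dir, target_dir)
print(cmd.split()[0])  # → db_hotbackup

with open(os.path.join(env_dir, "DB_CONFIG")) as f:
    print(f.read().splitlines())  # → ['set_data_dir ..', 'set_lg_dir .']
```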

Alternatively, you can manually create a hot backup as follows:

  1. Copy your database file to the backup location. To be able to perform recovery from this backup, do not change the name of the database file.

  2. Copy all logs to your backup location.

    Remember that your log files are placed in the environment directory.

Note

It is important to copy your database file and then your logs. In this way, you can complete or roll back any database operations that were only partially completed when you copied the database.

Incremental Backups

Once you have created a full backup (that is, either an offline or a hot backup), you can create incremental backups. To do this, simply copy all of your currently existing log files to your backup location.

Incremental backups do not require you to run a checkpoint or to cease database write operations.

When you are working with incremental backups, remember that the more log files your backup contains, the longer recovery will take. You should run full backups at some interval, and then run incremental backups at a shorter interval. How frequently you need to run a full backup is determined by the rate at which your database changes and by how sensitive your application is to lengthy recoveries (should one be required).

You can also shorten recovery time by running recovery against the backup as you take each incremental backup. Running recovery as you go means that there will be less work for the BDB SQL interface to do if you should ever need to restore your environment from the backup.
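An incremental pass, then, is just a copy of the environment's log files into the backup location. The sketch below uses stand-in paths; it skips log files the backup already holds purely to avoid redundant I/O (copying all of them, as described above, also works). Running recovery against the backup afterward, the "recover as you go" step, is done with the db_recover utility and is not shown here.

```python
import os
import re
import shutil
import tempfile

def incremental_backup(env_dir, backup_dir):
    """Copy every log.xxxxxxxxxx file into backup_dir, skipping files
    the backup already contains."""
    have = set(os.listdir(backup_dir))
    copied = []
    for name in sorted(os.listdir(env_dir)):
        if re.fullmatch(r"log\.\d{10}", name) and name not in have:
            shutil.copy2(os.path.join(env_dir, name),
                         os.path.join(backup_dir, name))
            copied.append(name)
    return copied

# Demonstration with stand-in files.
env = tempfile.mkdtemp()
backup = tempfile.mkdtemp()
for name in ("log.0000000001", "log.0000000002", "log.0000000003"):
    with open(os.path.join(env, name), "wb") as f:
        f.write(b"log records")
# Pretend the full backup already captured the first log file.
shutil.copy2(os.path.join(env, "log.0000000001"), backup)

new_logs = incremental_backup(env, backup)
print(new_logs)  # → ['log.0000000002', 'log.0000000003']
```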

About Unix Copy Utilities

If you are copying database files you must copy databases atomically, in multiples of the database page size. In other words, the reads made by the copy program must not be interleaved with writes by other threads of control, and the copy program must read the databases in multiples of the underlying database page size. Generally, this is not a problem because operating systems already make this guarantee and system utilities normally read in power-of-2 sized chunks, which are larger than the largest possible Berkeley DB database page size.

On some platforms (most notably, some releases of Solaris), the copy utility (cp) was implemented using the mmap() system call rather than the read() system call. Because mmap() did not make the same guarantee of read atomicity as did read(), the cp utility could create corrupted copies of the databases.

Also, some platforms have implementations of the tar utility that perform 10KB block reads by default. Even when an output block size is specified, the utility still does not read the underlying database in multiples of the specified block size. Again, the result can be a corrupted backup.

To avoid these problems, use the dd utility instead of cp or tar. When you use dd, make sure you specify a block size that is equal to, or an even multiple of, your database page size. Finally, if you plan to use a system utility to copy database files, you may want to use a system call trace utility (for example, ktrace or truss) to make sure you are not using an I/O size that is smaller than your database page size. You can also use these utilities to make sure the system utility is not using a system call other than read().
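As a sketch of the page-size rule (the 4096-byte page size and the file paths are assumptions), the copy below always requests reads of exactly one page, which is the guarantee you get from, for example, dd if=mydb.db of=backup/mydb.db bs=4096. Note that Python's buffered I/O may coalesce reads under the covers; for a real backup, use dd with an appropriate bs value.

```python
import os
import tempfile

PAGE_SIZE = 4096  # the copy block size must equal, or be a multiple of, this

def page_aligned_copy(src, dst, page_size=PAGE_SIZE):
    """Copy src to dst requesting exactly one page per read, so no read
    spans a partially written page (conceptually what dd bs=<page size>
    guarantees at the system-call level)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(page_size)  # one page per read request
            if not chunk:
                break
            fout.write(chunk)

# Demonstration with a stand-in "database" of exactly three pages.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "mydb.db")
dst = os.path.join(tmp, "mydb.db.copy")
with open(src, "wb") as f:
    f.write(b"x" * (PAGE_SIZE * 3))
page_aligned_copy(src, dst)
print(os.path.getsize(dst) == os.path.getsize(src))  # → True
```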