Backup and Recovery of Related Databases
If you have two or more databases that must be logically consistent, you may need to implement special procedures to ensure the recoverability of these databases.
It is important to consider the recovery goals for the entire set of databases. In the worst case you need to consider how long it will take to recover all of the databases. To avoid excessive recovery with a large number of databases, you need to avoid sharing media at backup time, and you need sufficient hardware to restore the databases in parallel.
Three potential related database scenarios are:
- You experience media failure that affects one or more of the databases, but the transaction log(s) are not damaged. You want to recover to current time.
- One or more transaction logs are destroyed. You need to restore the set of databases to a consistent state at the time of your last log backup.
- You need to restore the entire set of databases to a mutually consistent state at some earlier point in time.
In all three of these cases, you must be using the Full Recovery model for these databases. For more information, see Full Recovery.
The first scenario does not require you to implement any special recovery procedures. To recover the damaged databases, back up the tail of the log, restore the damaged files or the database, and then roll forward using transaction log backups. The undamaged databases require no action.
The other two scenarios require you to use a special procedure to ensure recoverability: marking transactions in the databases.
Marked Transaction Basics
You can mark transactions across related databases and use these marked transactions to recover related databases to the same transaction-consistent point in time. Accomplish this by placing distributed marks across all databases before backing up the log in any database. This will ensure that all log backups have a mark that will appear in all databases. Synchronized backups are not necessary. Instead, placing marks in the transaction log allows synchronization during restore. Use the Full Recovery model to ensure that all the marks will be valid.
Important Related database recovery does not allow recovery to a specific point in time. Recovery of related databases can only be accomplished by recovering to a marked transaction.
An example of related database recovery is a bank that has a database containing checking account data and another database containing savings account data. The two databases are located on different servers, and there are transactions that transfer funds back and forth between checking accounts and savings accounts. When the databases are backed up while fund transfer transactions are active in the system, even if the databases are backed up at the same time, there is a good chance that some transfer transactions will have committed in one database but not the other. Marked transactions can be used to backup and later restore these databases to a point where the outcome of all transactions is the same in both of the restored databases.
For this example, the backup strategy would be:
- Set the recovery model to Full for both databases.
- Back up each database.
Databases can be backed up in series or in parallel.
- Prior to backing up the transaction log, run a marked transaction that spans each database.
- Back up the transaction log on each database.
To restore the backup:
- Restore each database backup.
- Restore each log backup, stopping at the marked transaction.
- Recover each database.
In the event of a media failure, if you want to recover all the databases to a marked transaction, you must determine the most recent marked transaction that is available in all of the transaction logs. This information is stored in the logmarkhistory table, which is in the msdb database, on all of the servers.
When you have determined the marked transaction to which you want to restore:
- Identify the log backups for all related databases containing this mark.
- Create transaction log backups on the undamaged databases as required.
- Resolve hardware problems.
- Restore and recover all related databases to the target mark.
Creating Marked Transactions
The statement BEGIN TRAN new_name WITH MARK can be nested within an already existing transaction. Upon doing so, new_name becomes the mark name for the transaction, despite the name that the transaction may already have been given. Issuing a second, nested BEGIN TRAN...WITH MARK will result in a warning (not error) message:
Server: Msg 3920, Level 16, State 1, Line 2
WITH MARK option only applies to the first BEGIN TRAN WITH MARK.
The option is ignored.
The transaction mark is only placed in the logs of databases that are updated by the marked transaction. In addition, the only databases that will contain the mark are those on the server where the BEGIN TRAN...WITH MARK statement was executed. The following example shows how to put a mark in multiple databases:
BEGIN TRAN T1
UPDATE db1.dbo.table1 set column1 = 2
BEGIN TRAN M2 WITH MARK
UPDATE db2.dbo.table1 set column1 = 2
UPDATE server2.db21.dbo.table1 set column1 = 2
SELECT * from db3.dbo.table1
COMMIT TRAN M2
UPDATE db4.dbo.table1 set column1 = 2
COMMIT TRAN T1
In this example the name of the mark is M2, and it will be placed in the logs of databases db1, db2, and db4. The mark is placed in the logs when the transaction commit log record is generated for the COMMIT TRAN T1 statement. db1 is marked even though the update was executed before the transaction was actually marked. db3 is not marked, despite having been accessed, because no update was made in db3. Also, even though db21 on another server was updated within the transaction, it will not be marked because no BEGIN TRAN...WITH MARK was actually executed by server2.
As indicated in the example, a transaction mark name is not automatically distributed to another server as the transaction spreads to the other server. In order to force the mark's spread to the other servers, a stored procedure must be written which contains a BEGIN TRAN name WITH MARK. That stored procedure must then be executed on the remote server under the scope of the transaction in the originating server. For example, consider a partitioned database that exists on multiple instances of Microsoft® SQL Server™. On each instance is a database named coyote. First, create stored procedure sp_SetMark in every database:
CREATE PROCEDURE sp_SetMark
@name nvarchar (128)
AS
BEGIN TRANSACTION @name WITH MARK
UPDATE coyote.dbo.Marks SET on = 1
COMMIT TRANSACTION
GO
Next, create stored procedure sp_MarkAll containing a transaction that will place a mark in every database. sp_MarkAll can be run from any of the instances:
CREATE PROCEDURE sp_MarkAll
@name nvarchar (128)
AS
BEGIN TRANSACTION
EXEC instance0.coyote.dbo.sp_SetMark @name
EXEC instance1.coyote.dbo.sp_SetMark @name
EXEC instance2.coyote.dbo.sp_SetMark @name
COMMIT TRANSACTION
GO
When a marked transaction is committed, the commit log record for each database in the marked transaction is placed in the log at a point where there are no in-doubt transactions in any of the logs. At this point, it is guaranteed that there are no transactions that appear as committed in one log, but not committed in another log. The following steps accomplish this during the commit of a marked transaction:
Note The commit of a distributed transaction is done in two phases: prepare and commit.
- Prepare phase of a marking transaction will stall all new prepares and commits.
- Only commits of already prepared transactions are allowed to continue.
- Marking transaction then waits for all prepared transactions to drain (with time out).
- Marked transaction is prepared and committed.
- The stall of new prepares and commits is removed.
The stalls generated by marked transactions that span multiple databases can reduce the transaction processing performance of the server.
While rare in practice, it is possible for the commit of a distributed (cross-server) marked transaction to deadlock with other distributed transactions that are committing at the same time. When this happens, the marking transaction will be chosen as the deadlock victim and will be rolled back. When this error occurs, the application can retry the marked transaction. When multiple marked transactions attempt to commit concurrently, there is a higher probability of deadlock. Thus, running concurrent marked transactions is not recommended.
If the database is using log backups, and a log backup chain is active, log marks are traced in the logmarkhistory table:
- In the background after a transaction commits.
- One row per marked database, containing mark name, description, commit LSN, time.
- Time is computed before the commit record is generated.
- All entries for a distributed mark have the same time in a given msdb database.
- All times are before the timestamp in the commit log record.