When RMAN does database backup, restore or recovery, RMAN client (rman command process) itself does not perform the operations. Instead it directs database server sessions to do the tasks. One database session corresponds to one RMAN channel that represents one stream of data to a device, disk or tape. The channel reads data into memory, processes it, and writes it to output device. The work of each channel, disk or SBT tape, is subdivided into three phases: Read, Process and Write.
In this article only disk backup to non-ASM disks is discussed. Some test results of RMAN backups with different input/output I/O buffer settings are shown.
Allocation of Input I/O Buffers
The first phase of RMAN backup is reading data blocks from input files on disks into input I/O buffers. The allocation of input buffers depends on how the files are multiplexed. The rules RMAN uses to allocate input disk buffers are listed below.
|Level of Multiplexing||Input Disk Buffer Size per RMAN Channel|
|Less than or equal to 4||Each channel allocates 16 buffers of size 1 MB so that the total buffer size for all the input files in one channel is 16MB|
|Greater than 4 but less than or equal to 8||Each channel allocates a variable number of buffers of 512 KB so that the total buffer size for all input files is less than 16 MB|
|Greater than 8||Each channel allocates 4 buffers of 128 KB for each file, so that the total buffer size for each input file is 512 KB|
When creating backup sets, RMAN has the ability to read several files simultaneously from disk and then write them to a single backup piece (backup set). For example, RMAN can read from two datafiles simultaneously, and then combine the blocks from these datafiles into a single backup piece. This is called RMAN multiplexing. RMAN image copies are never multiplexed. The number of input files simultaneously read and written to the same backup piece is called Level of Multiplexing, which is the minimum of MAXOPENFILES for each channel and FILESPERSET setting. MAXOPENFILES controls the maximum number of input files that a BACKUP command can have at any given time (the default is 8). FILESPERSET specifies the maximum number of input files to include in each output backup set (default is 64).
Asynchronous Disk I/O
Oracle recommends setting DISK_ASYNC_IO init.ora parameter to TRUE (default) if the database server supports asynchronous I/O to disk. Under asynchronous I/O database server process can begin an I/O operation and then perform other work while waiting for the I/O to complete. RMAN can also begin multiple I/O operations before waiting for the first to complete. When using asynchronous I/O, RMAN allocates I/O buffer memory from the Program Global Area (PGA).
Output I/O Buffers
The main factor affecting the RMAN write phase for disk is the buffer size. For disk backup, by default each channel allocates four output buffers of 1 MB each. Therefore the total memory for each backupset file (output of backup) is 4 MB. This can be changed by setting hidden init.ora parameters as discussed below. When restoring files the rad phase is similar to the write phase of backup process with blocks moving in the opposite direction.
Tuning Backup with Memory Settings
The above mentioned input and output I/O buffer allocations are Oracle defaults. Oracle database has following hidden/unsupported init.ora parameters for RMAN backup/restore memory allocations:
_backup_disk_bufcnt number of buffers used for DISK channels _backup_disk_bufsz size of buffers used for DISK channels _backup_file_bufcnt number of buffers used for file access _backup_file_bufsz size of buffers used for file access _backup_seq_bufcnt number of buffers used for non-DISK channels _backup_seq_bufsz size of buffers used for non-DISK channels
Parameters of _backup_file_xxxx are for allocation of input I/O buffers; parameters of _backup_disk_xxxx are for allocation of output I/O buffers when backing up to disk; parameters of _backup_seg_xxxx are for allocation of output I/O buffers when backing up to tape. All of these parameters can be changed online without requiring database bounce.
Checking and Monitoring Memory Usage of RMAN Process
During RMAN backup or restore, you can monitor the memory usage of each RMAN process by querying V$PROCESS, if PGA is used as in the case of Async I/O. Following is a sample query and its output during a RMAN backup.
SQL> select spid, s.sid, s.program, trunc(pga_used_mem/1024/1024) "USED(MB)", trunc(pga_alloc_mem/1024/1024) "ALLOCATED(MB)" , trunc(pga_max_mem/1024/1024) "MAX(MB)" from v$process p,v$session s where pga_alloc_mem > 1048576 and p.addr = s.paddr order by pga_alloc_mem desc; SPID SID PROGRAM USED(MB) ALLOCATED(MB) MAX(MB) ---------- ------- ------------------------------------------------ -------- ------------- -------- 16713 570 email@example.com (TNS V1-V3) 27 29 29 16719 3970 firstname.lastname@example.org (TNS V1-V3) 27 29 29 16718 3402 email@example.com (TNS V1-V3) 27 29 29 16716 2270 firstname.lastname@example.org (TNS V1-V3) 27 29 29 16714 1137 email@example.com (TNS V1-V3) 27 29 29 16717 2837 firstname.lastname@example.org (TNS V1-V3) 27 29 29 16715 1703 email@example.com (TNS V1-V3) 28 29 29 16720 7 firstname.lastname@example.org (TNS V1-V3) 26 28 28 14802 1700 email@example.com (LGWR) 10 18 18 14826 2833 firstname.lastname@example.org (ARC1) 16 17 17 …
After backup/restore, you can check V$BACKUP_ASYNC_IO (V$BACKUP_SYNC_IO for non-async cases) to find out how RMAN allocated input/output I/O buffers (buffer size and buffer count) for each file (data files and/or backup set files).
View V$BACKUP_ASYNC_IO contains information about how RMAN allocate input and output I/O buffers. This view displays performance information about ongoing and recently completed RMAN backups and restores. For each backup, it contains one row for each input datafile, one row for the aggregate total performance of all datafiles, and one row for the output backup piece. This data is not stored persistently, and is not preserved when the instance is re-started. Column BUFFER_SIZE and BUFFER_COUNT indicate I/O buffer (input or output) allocations for each file.
Test Backups/Restore with Different Memory Settings
To see how the hidden init.ora parameters mentioned above influence RMAN backup performance, a test of RMAN backups using backupset method and with different settings of the init.ora parameters was conducted. The result (backup time) along with details of each backup are listed in the following table. The database PGA was configured at 5 GB. Channel allocation parameter MAXOPENFILES was default at 8 throughout all test runs. The level of multiplexing, which is the minimum of MAXOPENFILES and FILESPERSET, varied from 8 to a smaller value when FILESPERSET changed to below 8. Memory Used per RMAN Channel (MB) was obtained from V$PROCESS using the SQL query shown above when each backup process was run. The database has 82 datafiles with size ranging from 1 GB to 64 GB. The database has block size of 8 KB. But there are several tablespaces whose block size is 16 KB. Each run of RMAN backup allocated eight disk channels. It was noticed RMAN does not mix data files with different block size into the same channel.
|Run||Memory Used per RMAN Channel (MB)||FILESPERSET||Level of Multuiplexing (MB)||_backup_disk_bufsz and _backup_file_bufsz (Byte)||_backup_disk_bufcnt and _backup_file_bufcnt||Backup Time (Minutes)|
|1||28||Default (64)||8||Not Set||Not Set||268|
|5||27||4||4||Not Set||Not Set||185|
|7||27||6||6||Not Set||Not Set||249|
From the results we can see:
- The hidden init.ora parameters for RMAN input/output IO buffer allocations do have performance impact (comparing run #1 – #4).
- With the same total memory for each channel (buffer size X No. buffers), bigger buffer size brings more performance gain (comparing run #2 – #4).
- Without setting the hidden init.ora parameters, the level of multiplexing also has some impact to the backup performance (comparing run #1, #7, and #5). The smaller the level of multiplexing the faster the backup process. It is not clear that is observation is general or specific to this this particular target database. It seems against intuition and hard to explain.
- With the hidden init.ora parameters set, test runs of multiplexing level of 4 and 8 gave about the same performance (comparing run #6 and #2).
Setting hidden init.ora parameter for RMAN I/O buffers was also tested with a real world case of database refresh/clone. A test database is refreshed from the corresponding production database monthly. The database is about 2.7 TB in size. It normally took more than 3 hours to finish the refresh process. With _BACKUP_DISK_BUFSZ /_BACKUP_FILE_BUFSZ set to 4 MB and _BACKUP_DISK_BUFCNT /_BACKUP_FILE_BUFCNT set to 16, the refresh process time was reduced to 115 minutes.
When running a database cloning process with RMAN DUPLICATE command, with the hidden init.ora parameters set to 2 MB for buffer size and 32 for buffer count, respectively, we encountered following error message:
Errors in memory script RMAN-03015: error occurred in stored script Memory Script RMAN-06136: ORACLE error from auxiliary database: ORA-01507: database not mounted ORA-06512: at "SYS.X$DBMS_RCVMAN", line 13466 ORA-06512: at line 1 RMAN-03015: error occurred in stored script Memory Script RMAN-10035: exception raised in RPC: ORA-19583: conversation terminated due to error ORA-19870: error while restoring backup piece /b01/rman_backups/ORKTNP/BACKUPSET_LEVEL_0_ORKTNP_20140902_6bphhhep_1_1 ORA-19863: device block size 2088960 is larger than max allowed: 1048576 ORA-06512: at "SYS.X$DBMS_BACKUP_RESTORE", line 2338 RMAN-10031: RPC Error: ORA-19583 occurred during call to DBMS_BACKUP_RESTORE.RESTOREBACKUPPIECE RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of Duplicate Db command at 09/04/2014 10:12:58 RMAN-05501: aborting duplication of target database ==============================================================
My Oracle Support document 1326100.1 reported this was a bug in 11g and fixed in 126.96.36.199 (the target database was 188.8.131.52). The solution provided in the document is to set _BACKUP_KSFQ_BUFSZ. However in this case, I reduced the size to 1 MB for both _BACKUP_DISK_BUFSZ and _BACKUP_FILE_BUFSZ. The problem was resolved.