Configuring Oracle GoldenGate for the Best Performance with
- More CPUs to enable parallelism
- Fast Disk I/Os to speed up the trail files read/write
- The latest GoldenGate software for the latest performance enhancements.
- Integrated capture and delivery for the replication optimization in the Oracle Database
- Less procession overheads such as data transformations and CDRs
- The minimum required data
- A high-throughput network with the optimized TCP configurations
- Compressed trail files if the network is constrained
The performance requirements of real-time data replications typically refer to how much data can be delivered and how quickly the data can reach the target. How much data can be delivered is measured by the throughput, which is the amount of replicated data over time. How quickly the data can reach the target is represented by the lag time, which is the amount of time taken for a data change to be propagated to the targets. Real-Time data replication frequently asks for low latency and high throughput. You might hear your initial requirement like this:
"Can you synchronize my databases within 1 second latency?"
- What is the source database redo log generation rate?
DBA can use SQL queries to generate reports for this. The answer should be a data throughput in MB/hr or GB/hr. For Oracle GoldenGate, the following range are generally used defined:- <10GB/hr for small volume
- 50-100GB/hr for medium volume
- 100GB-200GB/hr for large volume
- >200GB/hr for extremely large volume.
- How much redo log need to be replicated? ( e.g. 20% or 50% or 100%, etc)
DBA can perform SQL query to find out the information. The result means the redo log data which GoldenGate need to process. - What's the ratio of redo log to trail file?
You can estimate the trial file size to be 50% (in general) of the redo log file. However, the best way is to perform a test to find out the actual ratio for your workload. - What's the expected latency between all the replication steps? For the peak time, how much lag time can be acceptable?
The answer should come from the business users with the lag time for all the major replication steps and the end-to-end lag. What's the average lag time, the peak time lag time and how quickly should Oracle GoldenGate return to average lag time post peak time.
I need to replicate 200GB hr redo log with 100GB/hr trail files in 1 second average latency.
2. Designing Oracle GoldenGate Replications for High Performance
When designing a replication solution, You would take the following steps:
- Prepare the Server to run Oracle GoldenGate
- Prepare the network connection
- Configure the Databases
- Configure the Oracle GoldenGate processes
For Oracle Database replications, please refer to the following Oracle MAA Technical White paper for the details [4]:
- Oracle MAA Technical White Paper, Oracle GoldenGate Performance Best Practices, May, 2017
2.1 Prepare the Server to Run Oracle GoldenGate
The server running the Oracle GoldenGate instance should have enough CPU and storage resources. The disk I/O speed is also crucial because Oracle GoldenGate replication needs a lot of for disk I/Os such as read/write trail files.
After setting up a server, you can perform a baseline test with the typical workload of your applications:
- For extract, you need to stop the database transactions and export the log files for the Oracle GoldenGate capture. You also need to write the trail files to the local disk.
- For replicat, you need to stop the database transactions, turn off indexing or DB checking when GoldenGate applies the transactions.
The result indicates the maximum replication throughput in the given server configuration. If the result can't meet the perforamnce requirement, you then need a better server.
2.2 Prepare the Network
For the network, you need to delivery data for Oracle GoldenGate from pre-generated trail files and find out the throughput and latency. When performing network estimation, you need to know answers to the following questions:
- Where are the source and target database?
The location of the database impacts the network latency. It's important to be aware of the geographic location first. - What're the network bandwidth, latency, and data loss?
You can check the TCP performance to find out how the data is delivered over your network. Because Oracle GoldenGate pump is a single thread process, long network latency will limit the volume of data sent over the network by Oracle GoldenGate. In this case, you need to increase the TCP window size to improve the throughput. However, you have to pay attention to the data loss because the larger TCP window size is, the higher cost is for resending the data.
2.3 Configure the Databases
You also need to avoid unnecessary database index/triggers on the target.
2.4 Configure the Oracle GoldenGate processes
For data replication processes, you need to consider the following configurations:
- Replication Features: you need to optimized in database processing for integrated capture and delivery for Oracle databases.
- Data Filtering: You would only replicat data volume to replicate minimum needed data from the source. If you do have LOB data, considering that will slow down the replication throughput.
3. Tuning the Performance
Tuning the performance should start from to find out the performance bottlenecks. Then, you can modify the configuration of your extracts, pumps and replicats. The following are the common performance problems and the tuning tips:
Resources
- Oracle GoldenGate Performance Best Practices, Oracle MAA Paper, November, 2014
- SQL Server Documentation - Performance (Replication)
- Jinyu Blog, Questions to Ask to Evaluate Oracle GoldenGate Performance, September, 2017
- Oracle MAA Technical White Paper, Oracle GoldenGate Performance Best Practices, May, 2017