- Where are the source and target databases?
The locations of the databases determine the network latency between them, so it's important to establish the geographic locations first.
- What is the source database redo log creation rate?
Database Administrators (DBAs) can use SQL queries to generate reports for this. The answer should be the data throughput in MB/hr or GB/hr. For Oracle GoldenGate, the following ranges are generally defined:
- <10GB/hr for small volume,
- 50-100GB/hr for medium volume,
- 100GB-200GB/hr for large volume,
- >200GB/hr for extremely large volume.
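As a rough illustration, the ranges above can be turned into a small lookup. This is only a sketch: the function name `classify_redo_volume` is hypothetical, the handling of the exact boundary values (100 and 200 GB/hr) is an assumption, and rates between 10 and 50 GB/hr are reported as unclassified because the ranges above do not cover them.

```python
def classify_redo_volume(gb_per_hour: float) -> str:
    """Map a redo generation rate in GB/hr to the volume ranges listed above."""
    if gb_per_hour < 0:
        raise ValueError("rate must be non-negative")
    if gb_per_hour < 10:
        return "small"
    if gb_per_hour < 50:
        # The listed ranges do not define 10-50 GB/hr; flag it rather than guess.
        return "unclassified"
    if gb_per_hour < 100:
        return "medium"
    if gb_per_hour <= 200:
        # Assumption: exactly 100 GB/hr counts as large, exactly 200 as large.
        return "large"
    return "extremely large"


if __name__ == "__main__":
    for rate in (5, 30, 75, 150, 250):
        print(f"{rate} GB/hr: {classify_redo_volume(rate)}")
```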
- How much of the redo log needs to be replicated (e.g. 20%, 50%, or 100%)?
DBAs can also run SQL queries to find this information. The result is the amount of redo log data that GoldenGate needs to process.
- How much data to deliver over the network?
This refers to the Oracle GoldenGate trail file size. You can run a test with an Extract to find the ratio of trail file size to redo log size, or you can estimate the trail file size as 50% of the redo log size.
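The estimate described above is simple arithmetic, sketched below. The function names are hypothetical, and the unit conversion assumes 1 GB = 10^9 bytes and 1 Mbit = 10^6 bits. Whether the ratio should be applied to the total redo volume or only to the portion being replicated is left to the caller; pass in whichever redo figure you measured.

```python
def measured_trail_ratio(trail_bytes: float, redo_bytes: float) -> float:
    """Ratio of trail file size to redo log size, taken from a test run."""
    if redo_bytes <= 0:
        raise ValueError("redo_bytes must be positive")
    return trail_bytes / redo_bytes


def estimate_trail_gb_per_hour(redo_gb_per_hour: float,
                               trail_ratio: float = 0.5) -> float:
    """Estimate trail volume; the default 0.5 is the 50% rule of thumb."""
    return redo_gb_per_hour * trail_ratio


def required_mbit_per_sec(trail_gb_per_hour: float) -> float:
    """Sustained network rate needed to ship the trail data.

    Assumes 1 GB = 10^9 bytes and 1 Mbit = 10^6 bits.
    """
    return trail_gb_per_hour * 8000 / 3600


if __name__ == "__main__":
    trail = estimate_trail_gb_per_hour(90)  # 90 GB/hr of redo
    print(f"trail volume: {trail} GB/hr")
    print(f"network rate: {required_mbit_per_sec(trail)} Mbit/s")
```

A ratio measured with a real Extract is preferable to the 50% default, since the actual ratio depends on the workload.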
- What's the expected latency between the replication steps during peak transaction time?
The answer should come from the business users, expressed as the acceptable lag for each major replication step, the end-to-end lag, or both.
- What are the network bandwidth, latency, and data loss?
Your network team should answer this question, or you can run some tests yourself (refer to How to Check the TCP Performance for details). The answer affects data delivery over the network.
- Because the Oracle GoldenGate pump is a single-threaded process, high network latency limits the volume of data that Oracle GoldenGate can send over the network. In this case, find out whether you can increase the TCP window size at the operating system level to improve throughput.
- Also pay attention to data loss: the larger the TCP window size, the higher the cost of resending lost data.
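The effect of latency on a single connection can be approximated with the standard bandwidth-delay relationship: one TCP connection cannot send more than one window of data per round trip. The sketch below is general TCP arithmetic, not a GoldenGate API; the function names are hypothetical, and it ignores data loss, congestion control, and protocol overhead, so real throughput will be lower.

```python
def max_tcp_throughput_mbit(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound in Mbit/s for one TCP connection: window size / round-trip time."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_second = window_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000


def required_window_bytes(target_mbit: float, rtt_ms: float) -> float:
    """Window size (bandwidth-delay product) needed to sustain target_mbit."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return target_mbit * 1_000_000 * rtt_ms / 1000 / 8


if __name__ == "__main__":
    # A 64 KiB window over a 100 ms round trip
    print(f"{max_tcp_throughput_mbit(65536, 100):.2f} Mbit/s ceiling")
    # Window needed to sustain 100 Mbit/s over the same link
    print(f"{required_window_bytes(100, 100):.0f} bytes needed")
```

Comparing the ceiling with the rate required to ship the trail data indicates whether a larger TCP window is worth investigating.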
- What types of data are in the replication?
Some data types, such as LOB data, require fetching directly from the source database. Replicating these data types affects overall performance.
Before starting a data replication project, it's critical to evaluate your performance requirements. The evaluation allows you to design a solution with the right architecture, hardware, and software to achieve your goals. Work through these questions when evaluating performance.
Created: 3/21/2017, Last Updated: 7/19/2017