This blog explains the differences and new features in globalization support for Oracle GoldenGate 11.2.1, 12.1.2, 12.1.2.1.0 and 12.1.2.1.1. The followings are main takeaways:
- In 11.2.1, extract will use source DB character set to write trail files. No needs to setup NLS_LANG for Oracle DB.
- In 12.1.2, replicat will use target DB character set to setup the session to apply the data. No need to setup the NLS_LANG for Oracle DB. MySQL and Sybase requires setting up the SESSIONCHARSET.
- In 12.1.2.1.0, GoldenGate for Oracle support CHARMAP.
- In 12.1.2.1.1, Oracle GoldenGate for Non-Oracle supports CHARMAP.
Oracle GoldenGate 11.2.1
Oracle GoldenGate 11.2.1 release introduced the automatic character set conversion in Oracle GoldenGate while the prior versions rely on database to perform the conversion. The following new globalization features are added in this release:
1. Always Using Source Database Character Set for Extract
Oracle GoldenGate extract will always use source database character set to write data to trail file.
- For Oracle Database, NLS_LANG is no longer needed for Oracle databases. . However, if users want to control the report file and error log file message encodings, they can still set up the NLS_LANG but it has to be the same as the source database encoding. To check the source database encoding in Oracle database, you can use the following SQL query:
SQL> column parameter format a25 SQL> column value format a15 SQL> select parameter, value from nls_database_parameters 2 where parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET', 'NLS_LANG UAGE', 'NLS_TERRITORY'); PARAMETER VALUE ------------------------- --------------- NLS_LANGUAGE AMERICAN NLS_TERRITORY AMERICA NLS_CHARACTERSET AL32UTF8 NLS_NCHAR_CHARACTERSET AL16UTF16
- For DB2 z/OS, Unicode tables are written to trail files in Unicode but single byte characters are written to trail file based on user's choice TRAILCHARSETASCII and TRAILCHARSETEBCDIC. For single character set, the tail file will keep the encoding as is. For mixed of ASCII and EBCDIC character set, the extract convert them into a single character set, either ASCII or EBCDIC (default) based on the TRAILCHARSETASCII/TRAILCHARSETEBCDIC options. Any Unicode table will be kept as it (UTF-8) . To void overhead of character sets conversion, you can set up different extract-pump-replicat for table with different encoding. All double bytes CCSIDs are converted to UTF-16 (CCSID 1200). All the rest are converted to UTF-8.
- For DB2 iSeries, all double bytes CCSIDs are converted to UTF-16 (CCSID 1200). All the rest are converted to UTF-8.
2. CHARSETCONVERSION and NOCHARSETCONVERSION (default) Option for Replicat
- For Oracle databases, the session character set is controlled by NLS_LANG. It's is required to setup NLS_LANG environment variable to match the source database character set prior to starting the the replicat. GoldenGate uses Oracle OCI interface to perform the character set conversion.
- For Sybase, MySQL and Teradata session character set is controlled by the Oracle GoldenGate SESSIONCHARSET parameter. For Sybase, if SESSIONCHARSET option is not specified, entry in locales.dat would be honored by OGG session charset. User has to set the correct session character set to be the same as the source database and connect to the target database.
For Oracle GoldenGate 12.1.2 release, the following changes are introduced:
1. Always use target database character set to connect to the target database.
Starting from Oracle GoldenGate 12.1.2, for Oracle Database, GoldenGate replicat no longer needs NLS_LANG setting since it uses target database character set to connect to target database.
2. SESSIONCHARSET Deprecated in Teradata
Starting from 12.1.2, SESSIONCHARSET option is deprecated in Teradata.
For Oracle GoldenGate 12.1.2.1.1 release, both Oracle and Non-Oracle database support the CHARMAP function to allow character set conversions.
References