Preparing for Instance Data Replication
Summarize
Summary of Preparing for Instance Data Replication
This guide helps ServiceNow customers prepare their instances for Instance Data Replication (IDR) to ensure successful and efficient data synchronization between producer and consumer instances. It covers important considerations on geographic placement, table and column selection, data integrity, and configuration best practices to optimize replication performance and avoid common issues.
Show less
Preparing Your Instances
- Geographic proximity: To reduce replication latency, keep producer and consumer instances in the same geographic region and under the same customer. Cross-region replication is possible but may cause delays.
- Data source control: Treat the producer instance as the exclusive source of truth. Avoid concurrent data imports or updates on the consumer instance that can cause duplicate records or mismatches.
Deciding Which Tables and Columns to Replicate
- Tables: Avoid replicating system tables (e.g., those prefixed with sys) to prevent insert/update failures. Avoid continuous replication of CMDB tables due to performance risks; if necessary, replicate on a schedule or with filtered conditions.
- Table hierarchies: Replicate all related parent and child tables to maintain data integrity and preserve table hierarchies.
- Table relationships: Include referenced tables in replication sets to ensure reference fields are populated correctly on the consumer instance.
- Columns: Replicate only necessary columns. Exclude frequently updated system columns to avoid excessive replication activity and potential capacity impacts.
Organizing Replication Sets
- Single table per replication set: Allows granular control, easier error isolation, and parallel replication jobs.
- Grouped tables per replication set: Easier setup and management, but replication issues affect all tables in the group. Limit groups to five or fewer tables for manageability.
Preparing Target Tables on the Consumer Instance
- Ensure target tables are empty or aligned with producer sysid values before replication starts to prevent duplicates and mismatches.
- If pre-existing data with differing sysids cannot be avoided, either clean the consumer tables or use a custom Coalesce field to uniquely identify records.
Reviewing Business Rules, ACLs, and Data Policies
- Business rules: Verify that rules on the consumer instance do not block the
idr.systemuser from inserting, updating, or deleting records. Errors appear in the Replication Payload Error table. - Access Control Lists (ACLs): Confirm
idr.systemhas necessary roles and permissions on target tables to allow replication. - Data policies: Ensure replicated data complies with any data policies on the consumer instance to prevent replication failures.
Practical Outcomes
- Reduced replication latency and improved reliability by geographic and instance preparation.
- Maintained data integrity through careful table, column, and hierarchy selection.
- Flexible replication management via organized replication sets.
- Minimized replication errors by preparing consumer tables and reviewing business logic and security settings.
Before you set up Instance Data Replication (IDR), analyze the tables and columns in the producer and consumer instances to replicate data successfully.
Preparing your instances
Reduce latency by ensuring that the producer and consumer instances are in the same geographic region and belong to the same customer. You can replicate data between instances across different geographic regions. However, replication latency can occur when instances are located in data centers in separate regions.
Deciding which tables to replicate
Before you create a producer replication set, decide which tables to replicate. Avoid replicating system tables (tables with a sys_ prefix). If you replicate system tables (like sys_user, sys_user_group, or sys_user_grmember) that have existing data in the consumer instance, insert and update failures can occur during replication. If you decide to replicate these tables, you may have additional work afterwards to clean these tables.
Avoid continuous replication of CMDB tables. Replicating CMDB data as changes occur can create performance issues or unforeseen consequences with replication due to the number of records involved. If you must replicate CMDB tables, consider scheduling replication or use conditions to constrain the count of replicated records and ensure all required columns are included in the replication set.
For a list of tables to avoid replicating, see Excluded tables in Instance Data Replication.
Analyzing table hierarchies
See Preserving table hierarchy in Instance Data Replication.
Analyzing table relationships
For each table you want to replicate, determine if the table has reference fields that point to other tables. If you are replicating a table with a reference field that points to another table, but you don’t include that table in your replication set, the reference field will be empty in the record on the consumer instance. Replicating table relationships maintains data integrity and ensures that reference fields are populated on the consumer as you'd expect.
Organizing replication sets
As part of the planning phase, determine how you want to organize your replication sets and the tables they contain.
- Create a single replication set for each table. This option requires more setup, configuration, and time management, but allows you to pause replication for a single table without impacting other tables.
- Replication is performed in parallel using multiple jobs.
- Throughput metrics are easier to track with separate replication sets.
- When an error occurs, replication for a single table is affected instead of all multiple tables.
- Create a replication set for a group of related tables. This option is easier to set up and manage, but oversight and potential issues impact all tables in the group.
- If you need find information about replication status, the logical name you use for this replication set might be easier to find and understand compared to using distinct replication sets for each table.
- If you want to pause replication, you can do so for the entire group of tables by pausing replication from one replication set.
- If a replication issue occurs on one table and replication stops, this stops replication for all tables in the replication set.
If you decide to group tables in a single replication set, try to limit it to five or fewer tables. If you have more than five tables to replicate, use multiple replication sets with five or fewer tables in each set.
Deciding which columns to replicate
For each table you want to replicate, decide which columns to include. Avoid including every column in your replication set by default. Instead, decide which columns are necessary, and exclude any sys_ columns or other columns that are updated by scripts automatically.
For example, if you include a column updated by the system automatically on a frequent basis, IDR may replicate data more often than necessary. When the system replicates this data, it can negatively impact the capacity subscription. Periodically check the IDR License and Usage Details dashboard to monitor the message count that is generated from the producer.
Preparing target tables on the consumer instance
By default, IDR uses a record's sys_id field as the lookup value to keep data synchronized between the producer and consumer instance. If the target table contains existing data or records from prior data imports, the sys_id values in the consumer don't match the sys_ids from the producer instance.
Always consider the producer instance as the source of truth. For optimal replication results, follow these guidelines:
- Make the producer instance the exclusive data source for the consumer instance.
- Prior to replication, ensure that the target table in the consumer instance is empty. Ideally, the initial records in the target instance are created from data sent exclusively from the producer via a replication set or a clone.
- Ensure that there are no other ongoing data imports or updates to target tables on the consumer instance after replication starts.
For example, if users from LDAP are imported to the consumer instance using a different data source, the sys_ids for those records don't match the sys_id values in the producer instance. In this scenario, existing user records in the target table on the consumer aren't updated and duplicate records are created.
In situations where this cannot be avoided, clean the tables on the consumer instance prior to replication. You must delete the records in the target table or ensure the sys_id values in producer and consumer are the same.
Alternatively, you can use a custom Coalesce field to identify unique records (instead of the default sys_id column) for replication. Use a custom Coalesce column when records on the consumer instance have a different sys_id for the same records on the producer instance. For more information on using a custom Coalesce, see Custom coalescing.
Reviewing business rules
You can use business rules to trigger workflows after replication, such as sending a notification or validating the replicated data. Review any business rules on the consumer instance that might prevent the idr.system user from inserting, updating, or deleting records in the target table.
If failures related to business rules occur, they appear in the table on the consumer instance. View the Error Message field for details on the script that is causing the failure.
With bidirectional replication, records created on the producer instance are replicated to a consumer instance and vice versa. When the record is inserted on the consumer instance and it triggers a business rule that updates the record, that update is not replicated back to the producer instance.
Reviewing ACLs
Review the ACLs in place on the target table in the consumer instance to ensure that IDR can replicate data successfully. Confirm that the idr.system user has the appropriate roles for the target table.
If failures related to ACLs occur, they appear in the table on the consumer instance. View the Error Message field for details on the ACL that is causing the failure.
Reviewing data policies
Review any data policies in place on the target table in the consumer instance to ensure that IDR can replicate data successfully. Confirm that the data you are replicating satisfies the data policy.
If failures related to data policies occur, they appear in the table on the consumer instance. View the Error Message field for details on the data policy that is causing the failure.