Qlik Replicate supports several databases as source endpoints using System Change Number (SCN), Log Sequence Number (LSN) or both. Here, you can see the list of those endpoints among the information to restart from a Source Change Position.
One of the endpoints that use SCN and LSN is Oracle. This vendor defines SCN and LSN as follows:
- Log Sequence Number (LSN): A number that uniquely identifies a redo record set in a redo log file. When the database fills one online redo log file and switches to a different one, the database automatically assigns the new file a log sequence number.
- System Change Number (SCN): The value of an SCN is the logical point in time. It is incremented every time a user makes a commit.
Qlik Replicate allows restarting a Change Data Capture task (CDC) from a specific SCN or an LSN for those databases that support them.
When recovering from an error, you may restart from an SCN or an LSN. However, restarting a task from an SCN or an LSN should be cautiously used as it may impact data consistency.
So to be on the same wavelength, consistency ensures that any transaction will bring the target database from one valid state to another. It implies that any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination.
However, consistency does not guarantee that the transaction is correct in all possible ways the application programmer might have wanted. It is the responsibility of the developer to create the tasks in a way that ensures correctness.
Programming errors may not result in the violation of any defined business rules or requirements.
To restart a task from an SCN or an LSN, we must do it in the same SCN or LSN we last committed to when we stopped the task. So we are sure there are no changes in the source, and we guarantee consistency. To make it easier, it is better if you stop and restart the task at a moment when the source system has low activity.
If we restart a task from an SCN or an LSN after we committed when we stopped the task, we lose the changes in data that occurred in the source. I.e. we don’t replicate them to the target. We lose consistency and create a data quality issue.
If we restart a task from an SCN or an LSN before we commit when we stop the task, we duplicate the changes in data that occurred in the source. I.e., we copy them twice. Again, we lose consistency and create a data quality issue.
Check the task’s log or the source database to know when you stopped it (SCN1). For example, the line below comes from a Qlik Replicate’s log that uses an Oracle endpoint. The number in red is the SCN in hexadecimal format.
'1042;637630161117035590;20210429133623949719B|00000927.59091f7c.00000001.0001.01.0000:811.3885280.16' [1022701] (oracle_endpoint_apply_util.c:1244)
We convert the SCN from hexadecimal to decimal format. We can do it manually or use an online converter. In our example, Decimal(0000092759091f7c) = 10064602144636.
Now, we can resume our CDC task from SCN. We click on Run [icon name=”arrow-right” prefix=”fas”] Advanced run options [icon name=”arrow-right” prefix=”fas”] Source change position. Here we put the SCN in decimal format.
On a separate note, Oracle provides the function SCN_TO_TIMESTAMP to transform an SCN into a timestamp. You may need to check how to do the transformation for other databases. In the Replication Status table, you could check this timestamp with the field SOURCE_TIMESTAMP_APPLIED.
Regarding LSNs, they don’t always map to timestamps.
In summary, if you believe:
- There won’t be issues with error handling and PK enforcement in any of the tables of the task,
- You know the latest SCN or LSN that Qlik Replicate committed to the target before the task stopped, and
- You are positive there weren’t more changes in any table,
THEN you might start loading from an SCN or an LSN on the task.
Additionally, if you decide to load from an SCN or an LSN, you should do it when the data volume is low, and the ecosystem is quiet.
Note: depending on the situation, you could restart from a timestamp.
Leave a Reply