SSIS Change Data Capture. CDC Package Needs Executing Twice.
I have been doing a lot of work with Change Data Capture in SQL Server 2012 recently and this post is about one of the things I noticed. I spent a good deal of time playing with settings as I initially thought I had done something wrong or through playing had managed to corrupt the metadata around my solution. I then mailed Matt Masson and asked him if he had seen this before. With some relief he told me that yes he had noticed it and had been told it was by design. I personally think it is a bad design if indeed it is by design.
So what’s the problem?
With CDC you can synchronise the source and target through performing a full load (take everything in the source and moving it to the destination). From then on you perform CDC loads using the CDC source in the data flow. What I found was that after performing the initial full load, the CDC load had to be initially executed twice in order to get any changes to come through. The flow looks like this
FULL LOAD (FL) –> Make Changes (C1) –> CDC Load (CDC1) –> Make Changes (C2) –> CDC Load (CDC2)
The changes made in C1 will not come through until you perform CDC2. The changes you make in C2 will also come through at this time.
This surprised me. I did also think that perhaps the log reader had not picked up my changes in C1 and deposited them in the change capture table meaning the CDC source component thought there was nothing to move. After checking the change table however I found the rows from my change.
Have a read of this article to understand a bit more about why we need to execute the CDC package twice