Solved

Advanced Job Options not working for CData Sync

  • 26 May 2024
  • 3 replies
  • 48 views


Hi,

I have set Additional Options to TransactionSize=100 and Batch Size to 50 in the Job’s advanced section. My understanding is that setting a Transaction Size commits the cached rows once that number is reached, so in my case, when 100 records have been cached, they are committed to the database.

Unfortunately, for one of the source tables the job is not saving any records to the Microsoft SQL Server destination table. When I turn on verbose logging, I can see that 10,000+ records have been cached, because the data from those records appears in the log file.

Other tables are being populated successfully; it is just this one that is not. When the job errors out at around the 20,000th row, it ends and there is no data in my destination table.

How do I make the job save the data it is caching to the table before the error is thrown?


Best answer by Ankit Singh 27 May 2024, 10:07


3 replies


Hi @SalD 

There are a few things I’d like to clarify first:

  • TransactionSize controls the number of records retained in memory before they are committed to the destination table. It is possible that your rows never persisted and were only held in memory. Is there a reason you want to reduce TransactionSize from the default of 50,000 down to 100?
  • Similarly, BatchSize controls the number of records sent to the destination table at one time. A higher batch size generally means better overall performance, although it may require a higher time-out value. The default value is 1000.
  • Since you are working with a table of more than 20K rows (per your statement, that is roughly where it failed), you can increase BatchSize to a higher value for optimal performance (see the rough numbers below). Normally, I’d split high-volume tables into a separate job and tune the job configuration for them individually, or override the job settings with table-level properties.
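
For a rough sense of scale (illustrative numbers of mine, not taken from your job): with BatchSize=50, loading 20,000 rows means sending 400 separate batches to SQL Server, versus 20 batches at the default BatchSize=1000. And with TransactionSize=100, those 20,000 rows would be committed across 200 transactions, versus a single commit at the default of 50,000.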

Let’s also understand how CData Sync loads the data into SQL Server:

  1. CData Sync creates a temp table in the destination.

  2. Bulk-insert data from the source table into the temp table using SqlBulkCopy.

  3. Merge from the temp table into the target table on the primary key.

 

If the job run fails for some reason, Sync rolls back the transaction and drops this temp table. Hence you don’t see any records in the destination.
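
To make that concrete, here is a simplified T-SQL sketch of steps 1-3 above. The table name (Orders), key column (Id), and the other columns are placeholders I have chosen for illustration; the actual staging-table name, column list, and SQL are generated by Sync itself.

    BEGIN TRANSACTION;

    -- Step 1: staging (temp) table with the same shape as the target
    CREATE TABLE #Orders_staging (
        Id INT NOT NULL PRIMARY KEY,
        CustomerName NVARCHAR(100),
        Amount DECIMAL(18, 2)
    );

    -- Step 2: rows are bulk-inserted into the staging table from the
    -- application side via SqlBulkCopy (BatchSize = rows sent per bulk batch)

    -- Step 3: merge the staged rows into the target on the primary key
    MERGE dbo.Orders AS target
    USING #Orders_staging AS source
        ON target.Id = source.Id
    WHEN MATCHED THEN
        UPDATE SET target.CustomerName = source.CustomerName,
                   target.Amount = source.Amount
    WHEN NOT MATCHED THEN
        INSERT (Id, CustomerName, Amount)
        VALUES (source.Id, source.CustomerName, source.Amount);

    COMMIT TRANSACTION;

Because the load goes through a staging table and is rolled back on failure, a job that dies part-way through leaves the target table untouched, which matches what you are seeing.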


Hi Ankit, this explanation seems helpful. I will adjust the settings per your advice and see if it helps.


Hey Ankit,

I’d like to understand this process better. Just to clarify: does increasing the BatchSize only affect the performance of getting the data from the source to the temp table, i.e. the SqlBulkCopy portion? And does increasing the TransactionSize only affect the performance of the MERGE statement? Does the TransactionSize correlate to the use of a TOP clause in the MERGE statement?

Thanks!

 
