Transformations - Pre-load transformation

Related products: CData Sync

I would like to be able to do two things with Transformations:

  • Be able to execute a transformation at the Start of a job, rather than just at the end.  So a Trigger of “Before job”.
  • Be able to execute a transformation against a Source, not just a Target.

 

I know I could do this by rigging up something with Job Events, but using the Transformation feature is so much cleaner.

Hi Doug,

It appears we missed this one, so apologies for the delay in responding to this request.

To address the points mentioned here:

Be able to execute a transformation at the Start of a job, rather than just at the end.  So a Trigger of “Before job”.

This is definitely something we can take a look at. However, we first need to understand your specific use case so that our developers and product team can investigate whether the scenario is possible to support.

Can you please give us more information on what you are trying to achieve? What kind of transformations would you like to run, and for what purpose (e.g. data validation, preliminary transformations, etc.)?

An example and a more complete description would be appreciated.

Be able to execute a transformation against a Source, not just a Target.

For this one, I'm afraid the request does not align with the intended usage of Transformations in Sync, so it's not a capability we can support.

We appreciate your input and feedback.


Thank you for the follow up.

  • Start-of-job trigger on Target - this could be for any number of reasons.  Maybe you need to mass-update some existing rows in the target table before new data gets inserted.  Or maybe you need to call a stored procedure in the target which does some other transformation on some table in the target.  Or even maybe you have a generic logging or status table in the target db, and you want to add a row into that table at the start and end of the job.
     
  • Job trigger on Source - the idea here is that maybe some complex source-side transformation is required before the data is ready for Sync to copy it.  So to orchestrate the process with minimal delays, Sync could call a procedure in the source before running the sync.  This functionality could be limited to connections which are of a Database type, for example.
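To make the first bullet concrete, here is a minimal sketch of a "Before job" step that mass-updates existing target rows before new data is inserted. It is only an illustration, not CData Sync functionality: an in-memory SQLite database stands in for the target, and the table `customers`, the column `is_current`, and the functions `pre_load_step` and `load_step` are all hypothetical names.

```python
import sqlite3


def pre_load_step(target: sqlite3.Connection) -> None:
    """Hypothetical "Before job" step: mark every existing row as stale."""
    target.execute("UPDATE customers SET is_current = 0")


def load_step(target: sqlite3.Connection, rows: list[tuple[int, str]]) -> None:
    """Stand-in for the replication itself: insert the new rows as current."""
    target.executemany(
        "INSERT INTO customers (id, name, is_current) VALUES (?, ?, 1)", rows
    )


# In-memory SQLite database used as a stand-in for the real target.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER, name TEXT, is_current INTEGER)")
target.execute("INSERT INTO customers VALUES (1, 'old row', 1)")

pre_load_step(target)                 # runs before any new data arrives
load_step(target, [(2, "new row")])   # then the load proceeds as usual
target.commit()

result = target.execute(
    "SELECT id, is_current FROM customers ORDER BY id"
).fetchall()
print(result)
```

The same shape would apply to calling a stored procedure or writing a row to a logging table; only the statement inside the pre-load step changes.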

 


Thanks Doug! I will discuss this with the product team and keep you updated on whether we decide to implement these features and on any progress.


Hi Doug,

I've had the chance to discuss your requests with the product team. The reasons you mentioned are valid, but could you please give us a more detailed description of the scenarios that prompted these requests?

We have some ideas on our plate regarding Sync and I want to make sure that any enhancements we make also align with your requirements.

In particular, for the "Job trigger on Source" request, we're interested in understanding the specific transformations or operations you envision executing on the source. What can be executed on the source that cannot be executed on the destination, or even as part of the Replicate query?

Any examples, use cases, or references you can share would greatly assist us in refining our understanding and exploring potential solutions.


Sure - thank you for the follow-up.

In my use case, I had a data set to bring from SQL Server to Snowflake.  But before that data set in SQL Server was ready to be consumed, a stored procedure needed to be run on the source system itself, to generate that data set.  It would then get populated into some tables in SQL Server, and then Sync could bring it over into Snowflake.

So the current process would be:

  • Schedule the source stored procedure to run in SQL Agent
  • Schedule the CData Sync job to run at some point after the stored procedure, adding some buffer time because the procedure's run time varies.

If Sync could talk to the Source and not just the Target, then the orchestration becomes quite simple.  Just have a start-of-job task that runs some SQL in the Source and waits synchronously for the result, and then, if there is no error, runs the rest of the Sync flow.

This is admittedly an edge case -- there’s an acceptable workaround, and it doesn’t happen that often.  
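As a rough sketch of that flow (plain Python, not anything Sync offers today), the idea is simply "run the source step, block until it finishes, and only continue if it succeeded". The names `run_job`, `build_data_set`, and `replicate` are hypothetical placeholders for the source stored procedure and the Sync flow.

```python
from typing import Callable


def run_job(source_step: Callable[[], None], sync_step: Callable[[], None]) -> str:
    """Run the source-side step synchronously, then sync only if it succeeded."""
    try:
        source_step()  # blocks until the source-side work has finished
    except Exception as exc:
        # The source step failed, so the sync is never started.
        return f"aborted: {exc}"
    sync_step()
    return "completed"


calls: list[str] = []


def build_data_set() -> None:
    """Placeholder for the stored procedure that prepares the source tables."""
    calls.append("source procedure")


def replicate() -> None:
    """Placeholder for the rest of the Sync flow."""
    calls.append("sync")


status = run_job(build_data_set, replicate)
print(status, calls)
```

Because the source step is awaited inside the job, no buffer time between two separate schedules is needed.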

The other requirement (for a start-of-job trigger on the Target) is much more important.

thank you,
Doug


Thanks Doug! 

I have relayed your use case description to the Product team and will keep you updated on how things go. 


I’ve run into a few scenarios where I need to do a task before a job starts. For my edge case, I need to set a flag before and after a job, to tell the application not to reload the data on the target system until the refresh from source and target is done. I also have use cases like Doug has. I would like to see this feature as well.
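The flag case might look something like the sketch below. It is only an illustration under assumptions: a plain dictionary stands in for wherever the flag really lives (for example a status table on the target), and `refresh_flag` and `refresh_in_progress` are hypothetical names. The point is that the flag is set before the job and cleared afterwards, even if the job fails.

```python
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def refresh_flag(state: dict) -> Iterator[None]:
    """Set a flag before the job and always clear it afterwards."""
    state["refresh_in_progress"] = True       # "before job" step
    try:
        yield
    finally:
        state["refresh_in_progress"] = False  # "after job" step, runs on failure too


# A dict stands in for the real flag storage.
state = {"refresh_in_progress": False}
seen_during_job = []

with refresh_flag(state):
    # The job would run here; the application checks the flag and skips reloading.
    seen_during_job.append(state["refresh_in_progress"])

print(seen_during_job, state)
```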