Transformations - Pre-load transformation

Related products: CData Sync

I would like to be able to do two things with Transformations:

  • Be able to execute a transformation at the Start of a job, rather than just at the end.  So a Trigger of “Before job”.
  • Be able to execute a transformation against a Source, not just a Target.

 

I know I could do this by rigging up something with Job Events, but using the Transformation feature is so much cleaner.

Hi Doug,

It appears we missed this one, so apologies for the delay in responding to this request.

To address the points mentioned here:

Be able to execute a transformation at the Start of a job, rather than just at the end.  So a Trigger of “Before job”.

This is definitely something we can take a look at. However, we first need to understand your specific use case so that our developers and product team can investigate whether the scenario is possible to support.

Can you please give us more information on what you are trying to achieve? What kind of transformations would you like to run, and for what purpose (e.g., data validation, preliminary transformations, etc.)?

An example and a more complete description would be appreciated.

Be able to execute a transformation against a Source, not just a Target.

For this one, I'm afraid the request does not align with the intended usage of Transformations in Sync, so it is not a capability we can support.

We appreciate your input and feedback.


Thank you for the follow up.

  • Start-of-job trigger on Target - this could be for any number of reasons.  Maybe you need to mass-update some existing rows in the target table before new data gets inserted.  Or maybe you need to call a stored procedure in the target which does some other transformation on some table in the target.  Or even maybe you have a generic logging or status table in the target db, and you want to add a row into that table at the start and end of the job.
     
  • Job trigger on Source - the idea here is that maybe some complex source-side transformation is required before the data is ready for Sync to copy it.  So to orchestrate the process with minimal delays, Sync could call a procedure in the source before running the sync.  This functionality could be limited to connections which are of a Database type, for example.
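As an illustration of the logging case in the first bullet, here is a minimal Python sketch of writing a row to a status table at the start and end of a job. It uses an in-memory SQLite database as a stand-in for the target; the `job_log` table, its columns, and the `run_with_logging` helper are hypothetical names made up for this example and are not part of CData Sync.

```python
import sqlite3
from datetime import datetime, timezone


def log_job_event(conn, job_name, event):
    """Insert one row into the (hypothetical) job_log table."""
    conn.execute(
        "INSERT INTO job_log (job_name, event, logged_at) VALUES (?, ?, ?)",
        (job_name, event, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def run_with_logging(conn, job_name, job):
    """Log a 'start' row, run the job callable, then log an 'end' row."""
    log_job_event(conn, job_name, "start")
    job()
    log_job_event(conn, job_name, "end")


# Stand-in target database with an assumed logging table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_name TEXT, event TEXT, logged_at TEXT)")

# The lambda stands in for the actual replication work.
run_with_logging(conn, "orders_sync", lambda: None)

events = [row[0] for row in conn.execute("SELECT event FROM job_log ORDER BY rowid")]
```

A "Before job" trigger would cover the first `log_job_event` call; the existing end-of-job trigger already covers the second.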

 


Thanks Doug! I will discuss this with the product team and keep you updated on whether we decide to implement these features and on any progress.


Hi Doug,

I've had the chance to discuss your requests with the product team. While the reasons you mentioned are absolutely valid, could you please give us a more detailed description of the scenario(s) that prompted these requests?

We have some ideas on our plate regarding Sync and I want to make sure that any enhancements we make also align with your requirements.

In particular, we're interested in understanding the specific transformations or operations you envision executing on the source that cannot be achieved on the destination [Job trigger on Source request]. What transformations can be executed on the source that cannot be executed on the destination, or even as part of the Replicate query?

Any examples, use cases, or references you can share would greatly assist us in refining our understanding and exploring potential solutions.


Sure - thank you for the follow-up.

In my use case, I had a data set to bring from SQL Server to Snowflake.  But before that data set in SQL Server was ready to be consumed, a stored procedure needed to be run on the source system itself, to generate that data set.  It would then get populated into some tables in SQL Server, and then Sync could bring it over into Snowflake.

So the current process would be:

  • Schedule the source stored procedure to run in SQL Agent
  • Schedule the CData Sync job to run at some point after the stored procedure, adding in some buffer time due to the variability in how long the procedure takes to run.

If Sync could talk to the Source and not just the Target, then the orchestration becomes quite simple.  Just have a start-of-job task that can run some SQL in the Source (and waits synchronously for the result), and then if no error, runs the rest of the Sync flow.
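The orchestration described here can be sketched in a few lines of Python. This is only an outline of the control flow (run the source step synchronously, then run the sync only if the step did not fail); `prepare_source` and `run_sync` are hypothetical callables standing in for the stored procedure call and the Sync job, and nothing here reflects an actual Sync feature.

```python
from typing import Callable


def run_sync_after_source_step(
    prepare_source: Callable[[], None],
    run_sync: Callable[[], None],
) -> bool:
    """Run the source-side step first; run the sync only if that step succeeds."""
    try:
        prepare_source()  # blocks until the source step has finished
    except Exception as exc:
        print(f"source step failed, sync skipped: {exc}")
        return False
    run_sync()
    return True


calls = []


def fake_procedure():
    # Stand-in for executing the stored procedure on the source.
    calls.append("procedure")


def fake_sync():
    # Stand-in for the replication job itself.
    calls.append("sync")


def failing_procedure():
    raise RuntimeError("procedure error")


ok = run_sync_after_source_step(fake_procedure, fake_sync)
skipped = run_sync_after_source_step(failing_procedure, fake_sync)
```

Because the source step is awaited, no buffer time between the two schedules is needed.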

This is admittedly an edge case -- there’s an acceptable workaround, and it doesn’t happen that often.

The other requirement (for a start-of-job trigger on the Target) is much more important.

thank you,
Doug


Thanks Doug! 

I have relayed your use case description to the Product team and will keep you updated on how things go. 


I’ve run into a few scenarios where I need to do a task before a job starts. For my edge case, I need to set a flag before and after a job to tell the application not to reload the data on the target system until the refresh from source and target is done. I also have use cases like the ones Doug has. I would like to see this feature as well.
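A set-then-clear flag like this is a natural fit for a context manager. The Python sketch below is only an illustration of the pattern; `refresh_flag` and `set_flag` are hypothetical names, and the dictionary stands in for wherever the application actually reads the flag (for example, a row in the target database).

```python
from contextlib import contextmanager


@contextmanager
def refresh_flag(set_flag):
    """Set the flag before the job and clear it afterwards, even if the job fails."""
    set_flag(True)
    try:
        yield
    finally:
        set_flag(False)


# Stand-in for the flag the application checks before reloading data.
state = {"refreshing": False}
history = []


def set_flag(value):
    state["refreshing"] = value
    history.append(value)


with refresh_flag(set_flag):
    # The job would run here; the application sees the flag as set.
    during = state["refreshing"]
```

The `finally` clause matters: without it, a failed job would leave the flag set and the application would never reload.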


This would be very useful to us too. We are running in Azure and frequently get database unavailable errors, and we would like a pre-job to “wake” our database up. While the pre-job event can be used, it requires us to store the user/password in the event, which we cannot do.
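A "wake-up" step of this kind usually amounts to retrying a cheap connection attempt with a growing delay. The Python sketch below shows that pattern only; `wake_database` and `ping` are hypothetical names, the attempt count and delays are arbitrary, and a real version would open a connection to the database instead of calling a fake.

```python
import time


def wake_database(ping, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call ping() until it succeeds, doubling the wait after each failure.

    Returns the number of attempts used; re-raises after the last attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            ping()
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(delay_seconds * 2 ** (attempt - 1))


# Simulated database that is unavailable twice, then responds.
responses = iter([ConnectionError("paused"), ConnectionError("resuming"), None])


def fake_ping():
    outcome = next(responses)
    if outcome is not None:
        raise outcome


waits = []  # record the delays instead of actually sleeping
attempts_used = wake_database(fake_ping, sleep=waits.append)
```

This does not address the credential concern above; it only shows what the pre-job step would need to do.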


Hi @DougN Nouria, @ninken, @uknotfromuk,

Thank you all for your feedback. 

We have noted this requirement, and it is under consideration by our product team.

With that being said, I would be curious to see the implementations you’ve tried on your end, considering that for some of your scenarios the two steps below would most probably work:

  1. Create a Transformation: Set up the transformation you need within CData Sync
  2. Call from the Pre-Job Event: Use the ExecuteJob API call to trigger the transformation from the pre-job event, similar to how it's done in post-job scenarios.

Please refer to the example screenshot below:

Is this a solution that would work for your cases? We’d love to hear your thoughts.
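For readers who have not used the Sync API, the call in step 2 is an HTTP request. The Python sketch below only builds such a request and does not send it. The endpoint path `/api.rsc/executeJob`, the `x-cdata-authtoken` header, the `JobName` body field, the port, and the job name are all assumptions for illustration; please check them against the Sync API documentation for your version. Inside Sync itself, the pre-job event would be written in the product's own event scripting rather than Python.

```python
import json
import urllib.request


def build_execute_job_request(base_url, auth_token, job_name):
    """Build (but do not send) a POST request asking Sync to run a job.

    The path, header name, and body field are assumptions, not confirmed API.
    """
    payload = json.dumps({"JobName": job_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api.rsc/executeJob",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-cdata-authtoken": auth_token,  # assumed auth header name
        },
    )


# Placeholder values; replace with your own server, token, and job name.
request = build_execute_job_request(
    "http://localhost:8181", "PLACEHOLDER_TOKEN", "PreLoadTransformation"
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and checking the response for errors before letting the main job continue.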


I think the key missing piece is still that a Transformation only runs on a Target.  Several of the use cases above would require the transformation to be called against any connection, not just a destination connection.

The other thing we’re up against, perhaps, is that one of CData’s selling points is that it’s more or less a no-code product.  It’s mostly point-and-click.  Once you start talking about using the Sync API, it’s a different level of complexity.  Add in the, ahem, limited documentation of before-and-after events, and you may not be getting the usage of this feature that you would hope for.

Maybe an enhancement that could address this latter point is a “pre-post-Event Wizard”: something that would give you some choices of things you’d want to do, walk you through some options, let you fill in the blanks in some dialog boxes, and generate the shell of the event script.  That might make this feature more usable for non-developers.


While this would work for one of my use cases (waking up the target), it would mean “coding”, which I am not an expert at.  However, my other use case, running a pre-event on my source, would still need two connections: one in the pre-event and one in the task itself.  Thanks.



Hi DougN,


The suggestion in my comment was intended to help in the scenarios where it can be applied, which, from the descriptions given by uknotfromuk and ninken, seemed to be the case for some of them.


This does not take away from any of the points for which this feature request was raised, or from our general approach as a product and team. However, it is part of our efforts to let you know whether a workaround exists and to help you with it.


While I am aware of your experience and expertise with Sync, having had the pleasure of working with you on support tickets and community discussions, I am trying to take the ‘larger audience’ into consideration: for another user of ours, what we mention here could be of help.

 

As always, thanks for the constructive feedback on our product.


Hi uknotfromuk,

Understandable!

As mentioned in my previous comment, we're continuously working to make our product more accessible and user-friendly. I encourage you to reach out to [email protected] if you need any assistance with the implementation or have any questions.

Our team is here to help and ensure you get the most out of Sync.

Thanks!