Work flows determine the order in which processing is executed. The main purpose of a work flow is to prepare for executing data flows and to set the state of the system once data flow execution is complete.
Batch jobs in ETL projects are similar to work flows; the only difference is that a job does not have parameters.
You can add the following objects to a work flow:
- work flows
- data flows
- scripts
- loops
- conditions
- try/catch blocks
A work flow can also call other work flows, or call itself.
Note: In a work flow, steps are executed in sequence from left to right.
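The nesting and ordering rules above can be illustrated with a small Python model. This is only a sketch, not how the ETL tool is implemented: the `WorkFlow` class, `make_step`, and all step names are hypothetical, and each step simply records its label in a log so that the execution order is visible.

```python
from typing import Callable, List, Union

# A step is either a nested work flow or a plain callable (standing in
# for a script or a data flow). All names here are hypothetical.
Step = Union["WorkFlow", Callable[[List[str]], None]]


class WorkFlow:
    """Hypothetical stand-in for a work flow: an ordered list of steps."""

    def __init__(self, name: str, steps: List[Step]) -> None:
        self.name = name
        self.steps = steps

    def run(self, log: List[str]) -> None:
        log.append(f"start {self.name}")
        # Steps run strictly in list order, mirroring left-to-right execution.
        for step in self.steps:
            if isinstance(step, WorkFlow):
                step.run(log)  # a work flow calling another work flow
            else:
                step(log)
        log.append(f"end {self.name}")


def make_step(label: str) -> Callable[[List[str]], None]:
    """Build a dummy step that only records that it ran."""
    def step(log: List[str]) -> None:
        log.append(label)
    return step


def demo() -> List[str]:
    inner = WorkFlow("wf_inner", [make_step("data_flow_a")])
    outer = WorkFlow(
        "wf_outer",
        [make_step("script_init"), inner, make_step("script_cleanup")],
    )
    log: List[str] = []
    outer.run(log)
    return log
```

Calling `demo()` returns the steps in the order they ran: the initial script, then the nested work flow with its data flow, then the cleanup script.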
Example of a work flow
Suppose there is a fact table that you want to update, and you have created a data flow with the required transformation. Before moving data from the source system, you have to check when the fact table was last modified, so that you extract only the rows that were added after the last update.
To achieve this, you create a script that determines the last update date and passes it as an input parameter to the data flow.
You also have to check whether the data connection to the fact table is active. If it is not, you need to set up a catch block that automatically sends an email to the administrator to report the problem.
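The example can be sketched in Python as three pieces: a script step, a data flow, and a try/catch wrapper. This is a minimal illustration under stated assumptions, not the tool's own scripting language: it uses `sqlite3` in place of the real source and target systems, the table names `src_sales` and `fact_sales` and the column `updated_at` are invented for the sketch, and `notify` is a hypothetical callback that stands in for sending an email to the administrator.

```python
import sqlite3
from typing import Callable, Optional


def get_last_update(target: sqlite3.Connection) -> str:
    """Script step: read the latest update date from the fact table."""
    row = target.execute("SELECT MAX(updated_at) FROM fact_sales").fetchone()
    # An empty fact table yields NULL, so fall back to an early date.
    return row[0] or "1900-01-01"


def run_data_flow(source: sqlite3.Connection,
                  target: sqlite3.Connection,
                  last_update: str) -> int:
    """Data flow: copy only rows added after last_update (the parameter)."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM src_sales WHERE updated_at > ?",
        (last_update,),
    ).fetchall()
    target.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    target.commit()
    return len(rows)


def run_work_flow(source: sqlite3.Connection,
                  target: sqlite3.Connection,
                  notify: Callable[[str], None]) -> Optional[int]:
    """Work flow: script, then data flow, wrapped in a try/catch block."""
    try:
        last_update = get_last_update(target)
        return run_data_flow(source, target, last_update)
    except sqlite3.Error as exc:
        # Catch block: report the failure instead of crashing the job.
        notify(f"Fact table load failed: {exc}")
        return None


def make_demo_databases():
    """Create in-memory source and target databases with sample rows."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE src_sales (id INTEGER, amount REAL, updated_at TEXT)")
    source.executemany(
        "INSERT INTO src_sales VALUES (?, ?, ?)",
        [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02"),
         (3, 30.0, "2024-01-03")],
    )
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE fact_sales (id INTEGER, amount REAL, updated_at TEXT)")
    target.execute("INSERT INTO fact_sales VALUES (1, 10.0, '2024-01-01')")
    return source, target
```

With the sample data, the first run loads only the two rows dated after the last update already present in the fact table. If the target connection is unavailable, the catch block calls `notify` with the error message and the work flow returns `None` rather than raising.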