When you first edit a Transformer stage, it is likely that you will have already defined what data is input to the stage on the input link. You will use the Transformer Editor to define the data that will be output by the stage and how it will be transformed. (You can define input data using the Transformer Editor if required.)

This section explains some of the basic concepts of using a Transformer stage.

Input Link

The input data source is joined to the Transformer stage via the input link.

Output Links

You can have any number of output links from your Transformer stage.
You may want to pass some data straight through the Transformer stage unaltered, but it’s likely that you’ll want to transform data from some input columns before outputting it from the Transformer stage.

You can specify such an operation by entering a transform expression. The source of an output link column is defined in that column’s Derivation cell within the Transformer Editor. You can use the Expression Editor to enter expressions in this cell. You can also simply drag an input column to an output column’s Derivation cell, to pass the data straight through the Transformer stage.
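The idea of a derivation can be modelled outside the tool. The following is a minimal Python sketch, not Transformer Editor syntax: each output column is paired with an expression evaluated against the input row. The column names (`name`, `name_clean`, `total_price`) and the helper `derive_output_row` are hypothetical and chosen only for illustration.

```python
# Hypothetical model of output-column derivations; this is NOT DataStage syntax.
# Each output column maps to an expression that reads the input row.
input_row = {"name": "  green gloss ", "litres": 5, "price_per_litre": 4.0}

derivations = {
    # Pass-through: the input column is copied unaltered.
    "name": lambda row: row["name"],
    # Transform expressions: the output value is computed from input columns.
    "name_clean": lambda row: row["name"].strip().upper(),
    "total_price": lambda row: row["litres"] * row["price_per_litre"],
}


def derive_output_row(row, derivations):
    """Build one output row by evaluating every column's derivation."""
    return {column: expression(row) for column, expression in derivations.items()}


output_row = derive_output_row(input_row, derivations)
```

Here the `name` entry plays the role of dragging an input column straight into a Derivation cell, while the other two entries stand in for expressions entered through the Expression Editor.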

In addition to specifying derivation details for individual output columns, you can also specify constraints that operate on entire output links. A constraint is an expression that specifies criteria that data must meet before it can be passed to the output link. You can also specify an otherwise link, which is an output link that carries all the data not output on other links, that is, rows that have not met the criteria of any other link.

Each output link is processed in turn. If the constraint expression evaluates to TRUE for an input row, the data row is output on that link. Conversely, if a constraint expression evaluates to FALSE for an input row, the data row is not output on that link.

Constraint expressions on different links are independent. If you have more than one output link, an input row may result in a data row being output from some, none, or all of the output links.

For example, consider data that comes from a paint shop; it could include information about any number of different colors. If you want to separate the colors into different files, you would set up different constraints. You could output the information about green and blue paint on LinkA, red and yellow paint on LinkB, and black paint on LinkC.

When an input row contains information about yellow paint, the LinkA constraint expression evaluates to FALSE and the row is not output on LinkA. However, the input row does satisfy the constraint criterion for LinkB, so the row is output on LinkB.

If the input data contains information about white paint, this does not satisfy any constraint and the data row is not output on Links A, B or C, but will be output on the otherwise link. The otherwise link is used to route data to a table or file that is a “catch-all” for rows that are not output on any other link. The table or file containing these rows is represented by another stage in the job design.
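The paint shop routing can be sketched as a small Python model. This is an illustration of the behaviour described above, not DataStage constraint syntax; the `color` column, the link name `Otherwise`, and the helper `route_row` are assumptions made for the example.

```python
# Hypothetical model of link constraints; this is NOT DataStage syntax.
# Each output link has its own independent constraint expression.
constraints = {
    "LinkA": lambda row: row["color"] in ("green", "blue"),
    "LinkB": lambda row: row["color"] in ("red", "yellow"),
    "LinkC": lambda row: row["color"] == "black",
}


def route_row(row, constraints):
    """Return the names of the links on which the row is output.

    Every constraint is evaluated in turn. A row that satisfies no
    constraint goes to the catch-all otherwise link.
    """
    matched = [link for link, constraint in constraints.items() if constraint(row)]
    return matched if matched else ["Otherwise"]


rows = [{"color": "yellow"}, {"color": "black"}, {"color": "white"}]
routed = {row["color"]: route_row(row, constraints) for row in rows}
```

In this model the yellow row lands on LinkB only, the black row on LinkC, and the white row on the otherwise link. Because each constraint is checked independently, a row that satisfied two overlapping constraints would be returned for both links.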

You can also specify another output link which takes rows that have not been written to any other links because of a write failure or an expression evaluation failure. This is specified outside the stage by adding a link and converting it to a reject link using the shortcut menu. This link is not shown in the Transformer meta data grid, and derives its meta data from the input link. Its column values are those in the input row that failed to be written. If you have enabled Runtime Column Propagation for an output link
