The Transformer stage is a processing stage. It appears under the processing category in the tool palette. Transformer stages allow you to create transformations to apply to your data. These transformations can be simple or complex and can be applied to individual columns in your data. Transformations are specified using a set of functions.

A Transformer stage can have a single input and any number of outputs. It can also have a reject link, which takes any rows that have not been written to any of the output links because of a write failure or an expression evaluation failure.

Transformer Stage

Unlike most of the other stages in a Parallel job, the Transformer stage has its own user interface.

When you edit a Transformer stage, the Transformer Editor appears. An example Transformer stage is shown below. The left pane represents input data and the right pane represents output data. In this example, the Transformer stage has a single input link and a single output link, and metadata has been defined for both.

Transformer Editor

This section specifies the minimum steps needed to get a Transformer stage functioning. DataStage provides a versatile user interface, and there are many shortcuts to achieving a particular end. This section describes the basic method; you will learn where the shortcuts are as you become familiar with the product.

Steps:

  • In the left pane:
    • Ensure that you have column metadata defined.
  • In the right pane:
    • Ensure that you have column metadata defined for each of the output links. The easiest way to do this is to drag columns across from the input link.
    • Define the derivation for each of your output columns. You can leave this as a straight mapping from an input column, or explicitly define an expression to transform the data before it is output.
    • Optionally specify a constraint for each output link. This is an expression which input rows must satisfy before they are output on a link. Rows that are not output on any of the links can be output on the otherwise link.
    • Optionally specify one or more stage variables. These provide a way of defining expressions that can be reused in your output column derivations (stage variables are only visible within the stage).
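
The way derivations, constraints, stage variables, and the otherwise link fit together can be sketched outside DataStage. The following Python is only an illustrative model, not DataStage code or its expression syntax: OutputLink, run_transformer, the svTotal stage variable, and the sample rows are all hypothetical names invented for this example. It assumes that stage variables are evaluated once per row before the links are processed, and that a row may be written to every link whose constraint it satisfies. Reject handling is not modeled.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]
# An expression receives the input row and the stage variables computed so far.
Expr = Callable[[Row, Dict[str, Any]], Any]


class OutputLink:
    """Hypothetical stand-in for one output link of a Transformer stage."""

    def __init__(self, name: str, derivations: Dict[str, Expr],
                 constraint: Optional[Expr] = None, otherwise: bool = False):
        self.name = name
        self.derivations = derivations  # output column name -> derivation
        self.constraint = constraint    # None means every row is accepted
        self.otherwise = otherwise      # takes rows no other link accepted
        self.rows: List[Row] = []

    def write(self, row: Row, svars: Dict[str, Any]) -> None:
        # Evaluate each output column's derivation for this row.
        self.rows.append({col: expr(row, svars)
                          for col, expr in self.derivations.items()})


def run_transformer(rows: List[Row], stage_vars: Dict[str, Expr],
                    links: List[OutputLink]) -> None:
    for row in rows:
        # Assumption: stage variables are evaluated in order, once per row.
        svars: Dict[str, Any] = {}
        for name, expr in stage_vars.items():
            svars[name] = expr(row, svars)

        written = False
        for link in links:
            if link.otherwise:
                continue
            if link.constraint is None or link.constraint(row, svars):
                link.write(row, svars)
                written = True

        # Rows not output on any constrained link go to the otherwise link.
        if not written:
            for link in links:
                if link.otherwise:
                    link.write(row, svars)


# Sample data and links (all invented for illustration).
input_rows = [
    {"item": "pen", "qty": 10, "price": 3},
    {"item": "ink", "qty": 1, "price": 4},
    {"item": "cap", "qty": 0, "price": 2},
]

# A stage variable reused by both constraints and a derivation.
stage_vars = {"svTotal": lambda r, sv: r["qty"] * r["price"]}

large = OutputLink(
    "large",
    derivations={"ITEM": lambda r, sv: r["item"].upper(),  # transforming expression
                 "TOTAL": lambda r, sv: sv["svTotal"]},
    constraint=lambda r, sv: sv["svTotal"] >= 20,
)
small = OutputLink(
    "small",
    derivations={"ITEM": lambda r, sv: r["item"].upper(),
                 "TOTAL": lambda r, sv: sv["svTotal"]},
    constraint=lambda r, sv: 0 < sv["svTotal"] < 20,
)
other = OutputLink(
    "other",
    derivations={"ITEM": lambda r, sv: r["item"]},  # straight mapping
    otherwise=True,
)

run_transformer(input_rows, stage_vars, [large, small, other])
for link in (large, small, other):
    print(link.name, link.rows)
```

In this sketch the "pen" row satisfies the constraint on the large link, the "ink" row satisfies the constraint on the small link, and the "cap" row satisfies neither, so it is written to the otherwise link.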
