LINK COLLECTOR Stage:

The Link Collector stage is an active stage that takes up to 64 inputs and allows you to collect data from these links and route it along a single output link. The stage expects the output link to use the same meta data as the input links.

The Link Collector stage can be used in conjunction with a Link Partitioner stage to take advantage of a multi-processor system and have data processed in parallel. The Link Partitioner stage partitions the data, the partitions are processed in parallel, and the Link Collector stage then collects the data together again before writing it to a single target. To understand the benefits, you need to know how DataStage jobs are run as processes; see “DataStage Jobs and Processes”.

For a job that uses these stages to compile and run as intended on a multi-processor system, you must have inter-process buffering turned on, either at project level using the DataStage Administrator or at job level from the Job Properties dialog box.

The Properties tab allows you to specify two properties for the Link Collector stage:

  • Collection Algorithm. Use this property to specify the method the stage uses to collect data. Choose from:
    • Round-Robin. This is the default method. Using the round-robin method, the stage reads a row from each input link in turn.
    • Sort/Merge. Using the sort/merge method, the stage reads multiple sorted inputs and writes one sorted output.
  • Sort Key. This property is only significant where you have chosen a collection algorithm of Sort/Merge. It defines how each of the partitioned data sets is known to be sorted and how the merged output will be sorted. The key has the following format:

Columnname [sortorder] [,Columnname [sortorder]]…
Columnname specifies one (or more) columns to sort on.
sortorder defines the sort order as follows:

In an NLS environment, the collate convention of the locale may affect the sort order. The default collate convention is set in the DataStage Administrator, but can be set for individual jobs in the Job Properties dialog box.

For example:
FIRSTNAME d, SURNAME D

Specifies that rows are sorted on the FIRSTNAME column and then on the SURNAME column, both in descending order.
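To make the Sort/Merge behaviour concrete, here is a minimal Python sketch of how a sort key such as the one above could drive a merge of inputs that are already sorted. It is an illustration only, not DataStage code. The function names (parse_sort_key, make_comparator, sort_merge) and the sample rows are hypothetical, and the rule for reading the sortorder token (a token starting with "d" or "D" means descending, anything else or nothing means ascending) is an assumption made for this example.

```python
import heapq
from functools import cmp_to_key


def parse_sort_key(spec):
    """Parse 'COL [order], COL [order]' into [(column, descending), ...]."""
    parsed = []
    for part in spec.split(","):
        tokens = part.split()
        column = tokens[0]
        # Assumption: an order token starting with 'd' or 'D' means descending;
        # a missing token, or any other token, is treated as ascending.
        descending = len(tokens) > 1 and tokens[1].lower().startswith("d")
        parsed.append((column, descending))
    return parsed


def make_comparator(sort_key):
    """Build a row comparator that honours each column's sort direction."""
    def compare(row_a, row_b):
        for column, descending in sort_key:
            a, b = row_a[column], row_b[column]
            if a != b:
                result = -1 if a < b else 1
                return -result if descending else result
        return 0
    return compare


def sort_merge(inputs, spec):
    """Merge inputs that are each already sorted on spec into one sorted list."""
    key = cmp_to_key(make_comparator(parse_sort_key(spec)))
    return list(heapq.merge(*inputs, key=key))


# Hypothetical rows on two input links, each already sorted on the key.
link_1 = [{"FIRSTNAME": "Zoe", "SURNAME": "Young"},
          {"FIRSTNAME": "Ann", "SURNAME": "Bell"}]
link_2 = [{"FIRSTNAME": "Tom", "SURNAME": "Reed"},
          {"FIRSTNAME": "Ann", "SURNAME": "Adams"}]

merged = sort_merge([link_1, link_2], "FIRSTNAME d, SURNAME D")
```

The merge only produces sorted output if every input is already sorted on the same key, which mirrors the requirement that the key describes how the partitioned data sets are known to be sorted.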

The Link Collector stage can have up to 64 input links. This is where the data to be collected arrives. The Input Name drop-down list on the Inputs page allows you to select which of the 64 links you are looking at.

The Link Collector stage can have a single output link.
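The round-robin method, with several input links feeding one output link, can be sketched in Python as follows. This is a simplified illustration rather than DataStage's implementation. The function name round_robin_collect and the sample data are hypothetical, and the choice to skip an input link once it runs out of rows is an assumption.

```python
MAX_INPUT_LINKS = 64  # limit stated for the Link Collector stage


def round_robin_collect(input_links):
    """Read one row from each input link in turn and return a single output list."""
    if len(input_links) > MAX_INPUT_LINKS:
        raise ValueError("a Link Collector stage takes at most 64 input links")

    iterators = [iter(link) for link in input_links]
    output = []
    while iterators:
        still_open = []
        for rows in iterators:
            try:
                output.append(next(rows))
                still_open.append(rows)
            except StopIteration:
                # Assumption: an exhausted link is dropped from the rotation.
                pass
        iterators = still_open
    return output


# Hypothetical rows arriving on three input links of different lengths.
links = [["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]
collected = round_robin_collect(links)
```

With these sample links, the output interleaves the rows as a1, b1, c1, a2, c2, a3, so row order within each link is preserved but no overall sort order is guaranteed.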

 
