Architecture and Information Tiers

This new architecture does not affect the way DataStage jobs are developed. The DataStage parallel framework remains the same, with a few minimal changes to internal mechanisms that do not impact job designs in any way.

From a job design perspective, the product has interesting new features:

  • New stages, such as Database Connectors, Slowly Changing Dimensions, and Distributed Transaction.
  • Job Parameter Sets
  • Balanced optimization, a capability that automatically or semi-automatically rewrites jobs to make use of RDBMS capabilities for transformations.

Information Server also provides new features for job developers and administrators, such as a more powerful import/export facility, a job comparison tool, and an impact analysis tool.

The information tiers work together to provide services, job execution, and storage for metadata and other data, as shown in the figure below.

The following list describes the information tiers shown in the figure:

  • Client: Product module clients that are not Web-based and that are used for development and administration in IBM InfoSphere Information Server. The client tier includes the IBM InfoSphere Information Server console, IBM InfoSphere DataStage and QualityStage clients, and other clients.
  • Engine: Runtime engine that runs jobs and other tasks for product modules that require the engine.
  • Metadata repository: Database that stores the shared metadata, data, and configuration for IBM InfoSphere Information Server and the product modules.
  • Services: Common and product-specific services for IBM InfoSphere Information Server, along with IBM WebSphere Application Server (application server).

This document focuses on parallel job development, which is directly related to the Client and Engine tiers.
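
As a rough illustration only, the four tiers and the focus of this document can be modeled as a small Python data structure. The names Tier, TierInfo, TIERS, focus_tiers, and the parallel_job_focus flag are hypothetical and are not part of any IBM API; the sketch only records the roles described in the list and marks the two tiers that parallel job development touches.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """The four information tiers described in the list above."""
    CLIENT = "Client"
    ENGINE = "Engine"
    METADATA_REPOSITORY = "Metadata repository"
    SERVICES = "Services"


@dataclass(frozen=True)
class TierInfo:
    tier: Tier
    role: str
    # Hypothetical flag: True for the tiers this document focuses on.
    parallel_job_focus: bool


# Roles are condensed from the tier descriptions; the flags are an
# illustrative assumption based on the stated Client/Engine focus.
TIERS = [
    TierInfo(Tier.CLIENT, "Non-Web clients for development and administration", True),
    TierInfo(Tier.ENGINE, "Runs jobs and other tasks for product modules", True),
    TierInfo(Tier.METADATA_REPOSITORY, "Stores shared metadata, data, and configuration", False),
    TierInfo(Tier.SERVICES, "Common and product-specific services on the application server", False),
]


def focus_tiers(tiers):
    """Return the names of the tiers flagged as relevant to parallel job development."""
    return [info.tier.value for info in tiers if info.parallel_job_focus]


if __name__ == "__main__":
    print(focus_tiers(TIERS))
```

Calling focus_tiers(TIERS) returns the Client and Engine tiers, which are the two tiers a parallel job developer works with directly.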
