Data Stage Transformer Usage Guidelines
Choosing Appropriate Stages
The parallel Transformer stage always generates “C” code which is then compiled to a parallel component. For this reason, it is important to minimize the number of transformers, and to use other stages (Copy, Filter, Switch, etc) when derivations are not needed.
- The Copy stage should be used instead of a Transformer for simple operations including:
- Job Design placeholder between stages (unless the Force option =true, EE will optimize this out at runtime)
- Renaming Columns
- Dropping Columns
- Default Type Conversions
Note that rename, drop (if runtime column propagation is disabled), and default type conversion can also be performed by the output mapping tab of any stage
NEVERuse the “BASIC Transformer” stage in large-volume job flows. Instead, user-defined functions and routines can expand parallel Transformer capabilities.
Consider, if possible, implementing complex derivation expressions using regular patterns by Lookup tables instead of using a Transformer with nested derivations.
For example, the derivation expression:
If A=0,1,2,3 Then B=”X” If A=4,5,6,7 Then B=”C”
Could be implemented with a lookup table containing values for column A and corresponding values of column B.
Optimize the overall job flow design to combine derivations from multiple Transformers into a single Transformer stage when possible.
In v7 and later, the Filter and/or Switch stages can be used to separate rows into multiple output links based on SQL-like link constraint expressions.
In v7 and later, the Modify stage can be used for non-default type conversions, null handling, and character string trimming. See section 7.5 for more information.
Buildops should be used instead of Transformers in the handful of scenarios where complex reusable logic is required, or where existing Transformer-based job flows do not meet performance requirements.
Transformer NULL Handling and Reject Link
When evaluating expressions for output derivations or link constraints, the Transformer will reject (through the reject link indicated by a dashed line) any row that has a NULL value used in the expression. To create a Transformer reject link in DataStage Designer, right-click on an output link and choose “Convert to Reject”.
The Transformer rejects NULL derivation results because the rules for arithmetic and string handling of NULL values are by definition undefined. For this reason, always test for null values before using a column in an expression, for example:
If ISNULL(link.col) Then… Else…
Note that if an incoming column is only used in a pass-through derivation, the Transformer will allow this row to be output. DataStage release 7 enhances this behavior by placing warnings in the log file when discards occur.
Transformer Derivation Evaluation
Output derivations are evaluated BEFORE any type conversions on the assignment. For example, the PadString function uses the length of the source type, not the target. Therefore, it is important to make sure the type conversion is done before a row reaches the Transformer.
For example, TrimLeadingTrailing(string) works only if string is a VarChar field. Thus, the incoming column must be type VarChar before it is evaluated in the Transformer.
Conditionally Aborting Jobs
The Transformer can be used to conditionally abort a job when incoming data matches a specific rule. Create a new output link that will handle rows that match the abort rule. Within the link constraints dialog box, apply the abort rule to this output link, and set the “Abort After Rows” count to the number of rows allowed before the job should be aborted .
Since the Transformer will abort the entire job flow immediately, it is possible that valid rows will not have been flushed from Sequential File (export) buffers, or committed to database tables. It is important to set the Sequential File buffer flush or database commit parameters.
Related Articles:
Browse Categories
Navigation
User login
Readers who viewed this page, also viewed:
- Tracing Siebel server task with SQL Trace and TKPROF in Oracle (55.6%)
- How do I use the DELETE EXACT parameter to delete data from base tables other than the target base table? (55.6%)
- How are addresses for account stored in the vertical Siebel 7 – Industry Application (55.6%)
- How to use MISC SQL in Siebel to set the primary foreign keys in the EIM Step 11? (55.6%)
- How are index hints used in an EIM process (55.6%)
- Siebel EIM Delete Process Parameter Reference (55.6%)
- Reference to Siebel EIM Import Parameters (55.6%)
- How to Troubleshoot Database Errors Caused by Deleting Repositories using Siebel Tools (44.4%)
- Database Performance Considerations - Where indexes can improve performance and where not (44.4%)
- Siebel Database Layout Guidelines (44.4%)
