EIM Tuning - Reference

What is EIM

EIM is a data loading tool that reads data from EIM tables (staging table) and Inserts/Updates/Deletes data in/from Siebel base tables.
 

Why EIM tuning is important

  • Data conversion process is often underestimated and becomes bottleneck reaching go-live date.
  • The slow initial data conversion could result into large amount of delta updates to catch-up.
  • The incremental data loads – if not tuned and tested properly could overrun the nightly batch processing window. Running large batch process with large number of on-line users can hurt on-line user response time.

Myths 

  • The database tools (Sql*Loader, BCD etc) can be used to load base tables.
  • EIM is very slow and we should write our own programs to load Siebel base tables.
  • Since EIM can be used to load data in Siebel database, no tuning efforts are required.
  • No Database tuning and close DBA support is required since EIM is being used to load data.
  • Siebel Server with number of CPUs and large amount of memory is required to run EIM jobs.

Facts 

  • EIM can load millions of rows per hour if tuned properly.
  • EIM satisfies referential integrity requirements for Siebel base tables.
  • The data quality greatly effects EIM performance.
  • EIM is more Database server intensive than it is Application Server intensive.

What can be tuned
 

  • The initial data loading process (IDL)
  • The incremental data loading process (delta)
  • Database tuning
  • Disk storage tuning
  • The EIM process

Tuning Process

  • Ensure that legacy/feeder systems are providing the required data.
  • Ensure data quality – Most common issue is duplicate data.
  • Ensure correct EIM mapping. Engage Siebel ES for EIM mapping review.
  • Tune Pre-EIM processing. i.e. getting data from feeder system, data cleansing, ftp/copy and loading data in EIM tables.
  • If tuned properly – EIM processing represents not more then 25 – 50% of total data conversion time.
  • Don’t forget reconciliation for drop outs.

Critical Resources

  • Business Analyst – To verify EIM mappings.
  • Database Administrator.
  • EIM resource.
  • Testing Team.
  • Programs from legacy/feeder systems to generate input files for EIM.
  • Technical Architect.
  • If the amount of data is too high – Siebel Expert Services.

Tuning EIM = Tuning I/O

  • By reducing I/O
  • By balancing I/O
  • By fast read and write
  • By reducing operation time

Process Design

 

  • Keep number of rows in EIM tables to a reasonable level (i.e. only those rows that are being currently processed).
  • Number of rows in EIM table will depend upon total number of rows that needs to be processed.
  • For large EIM jobs (more then 10 million for single EIM table), consider using EIM shadow tables.
  • Review and thoroughly test pre-processing.

Reducing I/O in pre-processing 

  • Use direct insert operation for pre-processing and to load data in EIM tables.
  • For initial data loads; consider turning off archive logging (Oracle: Archive logs, DB2: turn off table logging).
  • Consider using a different database for pre-processing (Other then loading EIM tables).
  • Never do record by record processing between two databases.

Tuning IFB

 

  • Configuration file is read by EIM to determine following
  • Process to perform
  • Import, Export, Delete, Merge, Update.
  • Interface and Base tables used and affected
  • Other parameters used for EIM processing
  • Default file ‘Default.IFB” contains useful example
  • Default file is available in Siebel_server\admin directory
  • Only Base Tables
  • Only Base Columns
  • Update Rows = False (When the job is insert only)
  • Insert Rows = False (When the job is update only)
  • Separate Insert and Update operations
  • Use SQLPROFILE parameter to list most expensive SQL statements.

Profiling


The SQLPROFILE lists most expensive SQL statement along with run time and summary.  SQLPROFILE is set by specifying output file name in header section of IFB file.

 TO Set SQLPROFILE
 SQLPROFILE = “C:\temp\EIM_SQL.txt”

 SQL Trace
 Set SQL trace level to 8 while running EIM Tasks during the tuning phase.

I/O Balancing

  • EIM will generate tremendous amount of I/O. The EIM throughput is directly related to I/O capabilities of your database server and storage array.
  • Database layout should distribute I/O on multiple disks/spindles to reduce I/O bottleneck.
  • Install EIM tables and EIM table’s indexes on a separate I/O channel from Siebel base tables and Siebel Base table indexes.
  • For initial data load on turn off Archive logs to reduce I/O.
  • For initial data load make sure that Docking Transaction for Mobile users is set to FALSE.

Basic Tuning

  • Find out the most optimum batch size.
  • Generally speaking – for Insert operation start with 10,000 rows per batch. For update, start with 5000 rows per batch and for delete operations, start with 2000 rows per batch.
  • The most optimum batch size will differ from table to table.
  • Try large batch size for smaller tables (tables with fewer columns) and smaller batch size for larger tables (tables with large number of columns).

Run EIM tasks in parallel.

  • Try to load all independent tables in parallel. These are independent tables and should not cause any locking issues.
  • Run multiple streams of EIM against same EIM table and same base tables. The number of streams will depends upon RDBMS, database server and disk storage array.
  • Some databases may run into locking issues with multiple streams on same EIM table depending upon their locking mechanism.

Database Tuning

  • Involving DBA from early stage of data conversion process.
  • Share your EIM know how with DBA.
  • Plan for several iteration of testing and tuning.
  • Have at least one dry run in your project plan.
  • Start thinking about automating data loads from the get go.
  • Prepare a thorough check list for loading sequence and pre-processing required for EIM data loads.
  • For IDL the database needs to be tuned in a different manner then OLTP operations.
  • For IDL – Tune the database for large I/O operations.
  • Create Siebel base tables and indexes with sufficient PCT_FREE/Fill Factors.

General database tuning

  • Involve an experienced database administrator
  • Allocate large memory/buffers to the database
  • Distribute I/O as much as possible
  • Tune OS parameters for Large I/O operations

Tuning Indexes

  • Drop all indexes on target Siebel base table except for P1 and U% indexes.
  • Run sample EIM job with SQL trace 8 and SQLPROFILE.
  • Remember 80 – 20 rule.
  • Check execution plan of most expensive SQL statements.
  • Create indexes to tune most expensive SQL statement.

Oracle

  • Large redo log files
  • For IDL – No Archive logs. Take periodic backups.
  • Create redo log files on fast devices.
  • Rollback segment, Redo logs, Archive redo logs and temp tablespace are going to be very busy during IDL.
  • Avoid frequent redo log switches
  • Focus on top 5 waiting events
  • Set ini_trans to large value (10 – 20) for parallel EIM jobs.
  • Set Free Lists to large value for dictionary managed tablespace.
  • Row level locking – run EIM tasks in parallel. Depending upon amount of data.
  • Use SQL*Loader direct path to load data in EIM tables.
  • Cache frequently used tables and indexes (S_LST_OF_ VAL)

DB2

  • Use Update Statistics parameter in IFB file till you get optimal set of statistics then turn it off.
  • Use larger page size for EIM and large base tables.
  • Ensure proper distribution of tablespace containers.
  • Use separate large buffer pools for EIM and base tables for IDL.

SQL Server

  • Make sure that P1 index (IF_ROW_BATCH_NUM, ROW_ID) on EIM table is a clustered index.
  • Test EIM jobs with and without index hints. This can be done by adding “Index Hints = FALSE” parameter in IFB file.
  • Avoid fragmentation by allocating appropriate fill factors.
  • Periodically de-fragment tables and indexes.
  • Use large batch size for insert operations.
  • For initial EIM data loads, benchmark your run time with degree of parallelism set to 1.
  • If possible – sort your data on base table user keys before loading data in EIM tables.
  • Benchmark EIM run time with Index Hints and without index hints. Index hints can be turned off by using “Index Hints = False” in IFB file.
  • Create separate file groups for TEMPDB, Data and Log and put them on separate stripe/controller.
  • Put P1 index on separate stripe/controller.
  • Parallel data load for EIM tables using bcp.
Tags