Informatica – PowerCenter Performance Tuning:
To tune session performance, first identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until you are satisfied with the session performance. You can use the test load option to run sessions when you tune session performance.
If you tune all the bottlenecks, you can further optimize session performance by increasing the number of pipeline partitions in the session.
Because determining the best way to improve performance can be complex, change one variable at a time, and time the session both before and after the change.
To improve session performance:
1. Optimize the target. Enables the Integration Service to write to the targets efficiently.
2. Optimize the source. Enables the Integration Service to read source data efficiently.
3. Optimize the mapping. Enables the Integration Service to transform and move data efficiently.
4. Optimize the transformation. Enables the Integration Service to process transformations in a mapping efficiently.
5. Optimize the session. Enables the Integration Service to run the session more quickly.
6. Optimize the grid deployments. Enables the Integration Service to run on a grid with optimal performance.
7. Optimize the PowerCenter components. Enables the Integration Service and Repository Service to function optimally.
8. Optimize the system. Enables PowerCenter service processes to run more quickly. Performance bottlenecks can occur in the source and target databases, the mapping, the session, and the system. The strategy is to identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until you are satisfied with the performance.
Look for performance bottlenecks in the following order:
1. Target 2. Source 3. Mapping 4. Session 5. System
Use the following methods to identify performance bottlenecks:
Run test sessions. You can configure a test session to read from a flat file source or to write to a flat file target to identify source and target bottlenecks. Analyze performance details. Analyze performance details, such as performance counters, to determine where session performance decreases. Analyze thread statistics. Analyze thread statistics to determine the optimal number of partition points. Monitor system performance. You can use system monitoring tools to view the percentage of CPU use, I/O waits, and paging to identify system bottlenecks. You can also use the Workflow Monitor to view system resource usage. Using Thread Statistics such as following thread statistics:
Run time. Amount of time the thread runs. Idle time. Amount of time the thread is idle. It includes the time the thread waits for other thread processing within the application. Idle time includes the time the thread is blocked by the Integration Service, but not the time the thread is blocked by the operating system. Busy time. Percentage of the run time the thread is by according to the following formula:
(run time – idle time) / run time X 100
You can ignore high busy percentages when the total run time is short, such as under 60 seconds. This does not necessarily indicate a bottleneck. Thread work time. The percentage of time the Integration Service takes to process each transformation in a thread. The session log shows the following information for the transformation thread work time:
Thread work time breakdown:
<transformation name>: <number> percent
<transformation name>: <number> percent
<transformation name>: <number> percent
If a transformation takes a small amount of time, the session log does not include it.
1. Target Bottlenecks
2. Identifying Target Bottlenecks
3. Eliminating Target Bottlenecks
Mapping: You can create a read test mapping to identify source bottlenecks. A read test mapping isolates the read query by removing the transformation in the mapping.
To create a read test mapping, complete the following steps:
1. Make a copy of the original mapping.
2. In the copied mapping, keep only the sources, source qualifiers, and any custom joins or queries.
3. Remove all transformations.
4. Connect the source qualifiers to a file target.
Run a session against the read test mapping. If the session performance is similar to the original session, you have a source bottleneck.
Using a Database Query:
To identify source bottlenecks, execute the read query directly against the source database. Execute the query against the source database with a query tool such as isql. On Windows, you can load the result of the query in a file. On UNIX, you can load the result of the query in /dev/null. Measure the query execution time and the time it takes for the query to return the first row.
Eliminating Source Bottlenecks: Complete the following tasks to eliminate source bottlenecks:
Set the number of bytes the Integration Service reads per line if the Integration Service reads from a flat file source. Have the database administrator optimize database performance by optimizing the query. Increase the database network packet size. Configure index and key constraints. If there is a long delay between the two time measurements in a database query, you can use an optimizer hint. Mapping Bottlenecks: If you determine that you do not have a source or target bottleneck, you may have a mapping bottleneck.
Identifying Mapping Bottlenecks:
Read the thread statistics and work time statistics in the session log.
Analyze performance counters. High errorrows and rowsinlookupcache counters indicate a mapping bottleneck. Add a Filter transformation before each target definition.
Identifying Session Bottlenecks: To identify a session bottleneck, analyze the performance details. Performance details display information about each transformation, such as the number of input rows, output rows, and error rows.
Identifying System Bottlenecks: You can view system resource usage in the Workflow Monitor. You can use system tools to monitor Windows and UNIX systems.
Using the Workflow Monitor to Identify System Bottlenecks: You can view the Integration Service properties in the Workflow Monitor to see CPU, memory, and swap usage of the system when you are running task processes on the Integration Service. Use the following Integration Service properties to identify performance issues:
A. CPU%. The percentage of CPU usage includes other external tasks running on the system.
B. Memory usage. The percentage of memory usage includes other external tasks running on the system. If the memory usage is close to 95%, check if the tasks running on the system are using the amount indicated in the Workflow Monitor or if there is a memory leak. To troubleshoot, use system tools to check the memory usage before and after running the session and then compare the results to the memory usage while running the session.
C. Swap usage. Swap usage is a result of paging due to possible memory leaks or a high number of concurrent tasks.
Identifying System Bottlenecks on Windows:
You can view the Performance and Processes tab in the Task Manager for system information. The Performance tab in the Task Manager provides an overview of CPU usage and total memory used. Use the Performance Monitor to view more detailed information.
Use the Windows Performance Monitor to create a chart that provides the following information:
1. Percent processor time. If you have more than one CPU, monitor each CPU for percent processor time.
2. Pages/second. If pages/second is greater than five, you may have excessive memory pressure (thrashing).
3. Physical disks percent time. The percent of time that the physical disk is busy performing read or write requests.
4. Physical disks queue length. The number of users waiting for access to the same disk device.
5. Server total bytes per second. The server has sent to and received from the network.
Identifying System Bottlenecks on UNIX
Use the following tools to identify system bottlenecks on UNIX:
1. top. View overall system performance. This tool displays CPU usage, memory usage, and swap usage for the system and for individual processes running on the system.
2. iostat. Monitor the loading operation for every disk attached to the database server. Iostat displays the percentage of time that the disk is physically active. If you use disk arrays, use utilities provided with the disk arrays instead of iostat.
3. vmstat. Monitor disk swapping actions. Swapping should not occur during the session.
4. sar. View detailed system activity reports of CPU, memory, and disk usage. You can use this tool to monitor CPU loading. It provides percent usage on user, system, idle time, and waiting time. You can also use this tool to monitor disk swapping actions. eliminate system bottlenecks:
* If the CPU usage is more than 80%, check the number of concurrent running tasks. Consider changing the load or using a grid to distribute tasks to different nodes. If you cannot reduce the load, consider adding more processors.
* If swapping occurs, increase the physical memory or reduce the number of memory-intensive applications on the disk.
* If you have excessive memory pressure (thrashing), consider adding more physical memory.
* If the percent of time is high, tune the cache for PowerCenter to use in-memory cache instead of writing to disk. If you tune the cache, requests are still in queue, and the disk busy percentage is at least 50%, add another disk device or upgrade to a faster disk device. You can also use a separate disk for each partition in the session.
* If physical disk queue length is greater than two, consider adding another disk device or upgrading the disk device. You also can use separate disks for the reader, writer, and transformation threads.
* Consider improving network bandwidth. * When you tune UNIX systems, tune the server for a major database system.
* If the percent time spent waiting on I/O (%wio) is high, consider using other under-utilized disks. For example, if the source data, target data, lookup, rank, and aggregate cache files are all on the same disk, consider putting them on different disks.
You are done.
+ or – comments will be welcome here.
MCTIP & MCTS