18
Tuning CPU Resources

This chapter describes how to solve CPU resource problems.

This chapter contains the following sections:

Understanding CPU Problems

To address CPU problems, first establish appropriate expectations for the amount of CPU resources your system should be using. Then, determine whether sufficient CPU resources are available, and recognize when your system is consuming too many resources. Begin by determining the amount of CPU resources the Oracle instance utilizes with your system in the following three cases:

System is idle (when little Oracle and non-Oracle activity exists)
System at average workloads
System at peak workloads

You can capture various workload snapshots using the UTLBSTAT/UTLESTAT utility, found in the ORACLE_HOME/rdbms/admin/ directory on UNIX and in the ORACLE_HOME/rdbms81/admin directory on NT. Operating system tools, such as vmstat, sar, and iostat on UNIX and Performance Monitor on NT, should be run during the same time interval as UTLBSTAT/UTLESTAT to provide a complimentary view of the overall statistics.

Note:
Release 8.1.6 also contains a new package called STATSPACK that improves on the UTLBSTAT/UTLESTAT process. For more information, see "Supported Scripts".

Workload is an important factor when evaluating your system's level of CPU utilization. During peak workload hours, 90% CPU utilization with 10% idle and waiting time may be acceptable. Even 30% utilization at a time of low workload may be understandable. However, if your system shows high utilization at normal workload, then there is no room for a peak workload. For example, Figure 18-1 illustrates workload over time for an application having peak periods at 10:00 AM and 2:00 PM.

Figure 18-1 Average Workload and Peak Workload

This example application has 100 users working 8 hours a day, for a total of 800 hours per day. Each user entering one transaction every 5 minutes translates into 9,600 transactions daily. Over an 8-hour period, the system must support 1,200 transactions per hour, which is an average of 20 transactions per minute. If the demand rate were constant, then you could build a system to meet this average workload.

However, usage patterns are not constant--and in this context, 20 transactions per minute can be understood as merely a minimum requirement. If the peak rate you need to achieve is 120 transactions per minute, then you must configure a system that can support this peak workload.

For this example, assume that at peak workload, Oracle uses 90% of the CPU resource. For a period of average workload, then, Oracle uses no more than about 15% of the available CPU resource, as illustrated in the following equation:

20 tpm/120 tpm * 90% = 15%

Where tpm is transactions per minute.

If the system requires 50% of the CPU resource to achieve 20 tpm, then a problem exists: the system cannot achieve 120 transactions per minute using 90% of the CPU. However, if you tuned this system so that it achieves 20 tpm using only 15% of the CPU, then, assuming linear scalability, the system might achieve 120 transactions per minute using 90% of the CPU resources.

As users are added to an application, the workload can rise to what had previously been peak levels. No further CPU capacity is then available for the new peak rate, which is actually higher than the previous.

CPU capacity issues can be addressed with the following:

Tuning; that is, detecting and solving CPU problems from excessive:
- System CPU Utilization
- Oracle CPU Utilization
Increasing hardware capacity, including changing the system architecture.

See Also:
For more information about improving your system architecture, see Chapter 2, "Performance Tuning Methods".
Reducing the impact of peak load use patterns by prioritizing CPU resource allocation. Oracle's Database Resource Manager does this by allocating and managing CPU resources among database users and applications.

See Also:
For more information about Oracle's Database Resource Manager, see Oracle8i Concepts and Oracle8i Administrator's Guide.

Detecting and Solving CPU Problems

If you suspect a problem with CPU usage, check two areas:

System CPU Utilization

Oracle statistics report CPU use by Oracle sessions only, whereas every process running on your system affects the available CPU resources. Therefore, tuning non-Oracle factors can also improve Oracle performance.

Use operating system monitoring tools to determine what processes are running on the system as a whole. If the system is too heavily loaded, check the memory, I/O, and process management areas described later in this section.

Tools such as sar -u on many UNIX-based systems let you examine the level of CPU utilization on your entire system. CPU utilization in UNIX is described in statistics that show user time, system time, idle time, and time waiting for I/O. A CPU problem exists if idle time and time waiting for I/O are both close to zero (less than 5%) at a normal or low workload.

On NT, use Performance Monitor to examine CPU utilization. Performance Manager provides statistics on processor time, user time, privileged time, interrupt time, and DPC time. (NT Performance Monitor is not the same as Performance Manager, which is an Oracle Enterprise Manager tool.)

Note:
This section describes how to check system CPU utilization on most UNIX-based and NT systems. For other platforms, see your operating system documentation.

Memory Management

Check the following memory management areas:

Paging and Swapping

Use tools such as sar or vmstat on UNIX or Performance Monitor on NT to investigate the cause of paging and swapping.

Oversize Page Tables

On UNIX, if the processing space becomes too large, then it may result in the page tables becoming too large. This is not an issue on NT.

I/O Management

Check the following I/O management issues:

Thrashing

Ensure that your workload fits into memory, so the machine is not thrashing (swapping and paging processes in and out of memory). The operating system allocates fixed portions of time during which CPU resources are available to your process. If the process wastes a large portion of each time period checking to be sure that it can run and ensuring that all necessary components are in the machine, then the process may be using only 50% of the time allotted to actually perform work.

Client/Server Round Trips

The latency of sending a message may result in CPU overload. An application often generates messages that need to be sent through the network over and over again, resulting in significant overhead before the message is actually sent. To alleviate this problem, batch the messages and perform the overhead only once, or reduce the amount of work. For example, you can use array inserts, array fetches, and so on.

See Also:
For more details on tuning I/O, see Chapter 20, "Tuning I/O".

Process Management

Check the following process management issues:

Scheduling and Switching

The operating system may spend excessive time scheduling and switching processes. Examine the way in which you are using the operating system, because you could be using too many processes. On NT systems, do not overload your server with too many non-Oracle processes.

Context Switching

Due to operating system specific characteristics, your system could be spending a lot of time in context switches. Context switching can be expensive, especially with a large SGA. Context switching is not an issue on NT, which has only one process per instance. All threads share the same page table.

Programmers often create single-purpose processes, exit the process, and create a new one. Doing this re-creates and destroys the process each time. Such logic uses excessive amounts of CPU, especially with applications that have large SGAs. This is because you need to build the page tables each time. The problem is aggravated when you pin or lock shared memory, because you have to access every page.

For example, if you have a 1 gigabyte SGA, then you may have page table entries for every 4K, and a page table entry may be 8 bytes. You could end up with (1G/4K) * 8B entries. This becomes expensive, because you need to continually make sure that the page table is loaded.

Parallel execution and the multi-threaded server become areas of concern if MINSERVICE has been set too low (set to 10, for example, when you need 20). For an application that is performing small lookups, this may not be wise. In this situation, it becomes inefficient for both the application and the system.

Oracle CPU Utilization

This section explains how to examine the processes running in Oracle. Three dynamic performance views provide information on Oracle processes:

V$SYSSTAT shows Oracle CPU usage for all sessions. The statistic "CPU used by this session" shows the aggregate CPU used by all sessions.
V$SESSTAT shows Oracle CPU usage per session. You can use this view to determine which particular session is using the most CPU.
V$RSRC_CONSUMER_GROUP shows CPU utilization statistics on a per consumer group basis, if you are running the Oracle Database Resource Manager.

For example, if you have 8 CPUs, then for any given minute in real time, you have 8 minutes of CPU time available. On NT and UNIX, this can be either user time or time in system mode (privileged mode on NT). If your process is not running, then it is waiting. Thus, CPU time utilized by all systems may be greater than one minute per interval.

At any given moment, you know how much time Oracle has used on the system. So, if 8 minutes are available and Oracle uses 4 minutes of that time, then you know that 50% of all CPU time is used by Oracle. If your process is not consuming that time, then some other process is. You then need to identify the processes that are using CPU time. If you can, determine why the processes use so much CPU time and attempt to tune them. Possible areas to research include, but are not limited to, the following:

Reparsing SQL Statements

When Oracle executes a SQL statement, it parses it to determine whether the syntax and its contents are correct. This process can consume significant overhead. Once parsed, Oracle does not parse the statement again unless the parsing information is aged from the memory cache and is no longer available. Ineffective memory sharing among SQL statements can result in reparsing. Use the following procedure to determine whether reparsing is occurring:

Get the parse time CPU and CPU figures used by this session from the "Statistics" section of the estat report or from V$SYSTATS. For example:
```
SELECT * FROM V$SYSSTAT
WHERE NAME IN('parse time cpu', 'parse time elapsed', 'parse count (hard)');
```
Now you can detect the general response time on parsing. The more your application is parsing, the more contention exists, and the more time your system spends waiting. If parse time CPU represents a large percentage of the CPU time, then time is being spent parsing instead of executing statements. If this is the case, then it is likely that the application is using literal SQL and not sharing it, or the shared pool is poorly configured.

Query V$SQLAREA to find frequently reparsed statements. For example:

SELECT SQL_TEXT, PARSE_CALLS, EXECUTIONS 
FROM V$SQLAREA
ORDER BY PARSE_CALLS;

Tune the statements with the higher numbers of parse calls.

If the parse time CPU is only a small percentage of the total CPU used, then you should determine where the CPU resources are going. There are several things you can do to help with this.

Find statements with large numbers of buffer gets, because these are typically heavy on CPU.

The following statement finds SQL statements which frequently access database buffers. Such statements are probably looking at many rows of data.

SELECT ADDRESS, HASH_VALUE, BUFFER_GETS, EXECUTIONS, 
 BUFFER GETS/EXECUTIONS "GETS/EXEC", SQL_TEXT
FROM V$SQLAREA 
WHERE BUFFER_GETS > 50000

AND EXECUTIONS > 0

ORDER BY 3;

This example shows which SQL statements have the most buffer_gets and use the most CPU. The statements of interest are those with a large number of gets per execution, especially if execution is high. It is very beneficial to have an understanding of the application components to know which statements are expected to be expensive.

Note:
The 50000 cut-off value is an arbitrary starting point, and it should be increased or decreased gradually until the top 10 to 20 statements are listed. This statement does not highlight CPU-intensive PL/SQL blocks.

After candidate statements have been isolated, the full statement text can be obtained using the following query, substituting relevant values for ADDRESS and HASH_VALUE pairs. For example:
```
SELECT SQL_TEXT 
FROM V$SQLTEXT
WHERE ADDRESS='&ADDRESS_WANTED'
```
```
AND HASH_VALUe=&HASH_VALUE
```
```
ORDER BY piece;
```
The statement can then be explained (using EXPLAIN PLAN) or isolated for further testing to see how CPU-intensive it really is. If the statement uses bind variables and if your data is highly skewed, then the statement may only be CPU-intensive for certain bind values.

Find which sessions are responsible for most CPU usage. The following statement helps locate sessions which have used the most CPU:

SELECT v.SID, SUBSTR(s.NAME,1,30) "Statistic", v.VALUE
FROM V$STATNAME s, V$SESSTAT v
WHERE s.NAME = 'CPU used by this session'

AND v.STATISTIC# = s.STATISTIC#
AND v.VALUE > 0

ORDER BY 3;

Note:
CPU time is cumulative; therefore, a session that has been connected for several days may appear to be heavier on CPU than one that has only been connected for a short period of time. Thus, it is better to write a script to sample the difference in the statistic between two known points in time, letting you see how much CPU was used by each session in a known time frame.

After any CPU-intensive sessions have been identified, the V$SESSION view can be used to get more information. At this stage, it is generally best to revert to user session tracing (SQL_TRACE) to determine where the CPU is being used.

Trace typical user sessions using the SQL_TRACE option to see how CPU is apportioned amongst the main application statements.

After these statements have been identified, you have the following three options for tuning them:
- Rewrite the application so that statements do not continually reparse.
- Reduce parsing by using the initialization parameter SESSION_CACHED_CURSORS.
- If the parse count is small, the execute count is small, and the SQL statements are very similar except for the WHERE clause, then you may find that hard coded values are being used instead of bind variables. Use bind variables to reduce parsing.

Read Consistency

Your system may spend excessive time rolling back changes to blocks in order to maintain a consistent view. Consider the following scenarios:

If there are many small transactions and an active long-running query is running in the background on the same table where the inserts are happening, then the query may need to roll back many changes.
If the number of rollback segments is too small, then your system could also be spending a lot of time rolling back the transaction table. Your query may have started long ago; because the number of rollback segments and transaction tables is very small, your system frequently needs to reuse transaction slots.

Note:
The average wait time should be close to zero. (V$SYSSTAT also shows the average wait time per parse.)
See Also:
For information on approaches to SQL statement tuning, see Chapter 9, "Optimizing SQL Statements".

A solution is to make more rollback segments, or to increase the commit rate. For example, if you batch ten transactions and commit them once, then you reduce the number of transactions by a factor of ten.
If your system must scan too many buffers in the foreground to find a free buffer, then it wastes CPU resources. To alleviate this problem, tune the DBWn process(es) to write more frequently.

You can also increase the size of the buffer cache to enable the database writer process(es) to keep up. To find the average number of buffers the system scans at the end of the least recently used list (LRU) to find a free buffer, use the following formula:

On average, you would expect to see 1 or 2 buffers scanned. If more than this number are being scanned, then increase the size of the buffer cache or tune the DBWn process(es).

Use the following formula to find the number of buffers that were dirty at the end of the LRU:

If many dirty buffers exist, then possibly the DBWn process(es) cannot keep up. Again, increase the buffer cache size or tune the DBWn process.

Note:
Query the V$SYSSTAT view to find the values of "free buffers inspected" and "dirty buffers inspected".

Scalability Limitations Within the Application

In most of this CPU tuning discussion, we assume you can achieve linear scalability, but this is never actually the case. How flat or nonlinear the scalability is indicates how far away from optimal performance your system is. Problems in your application might be adversely affecting scalability. Examples of this include too many indexes, right-hand index problems, too much data in the blocks, or not properly partitioning the data. These types of contention problems waste CPU cycles and prevent the application from attaining linear scalability.

Wait Detection

Whenever an Oracle process waits for something, it records it as a wait using one of a set of predefined wait events. (See V$EVENT_NAME for a list of all wait events). Some of these events can be considered idle events; i.e., the process is waiting for work. Other events indicate time spent waiting for a resource or action to complete. By comparing the relative time spent waiting on each wait event and the "CPU used by this session" (from above), you can see where the Oracle instance is spending most of its time. To get an indication of where time is spent, follow these steps:

Review either the V$SYSTATS view or the wait events section of the UTLBSTAT/UTLESTAT report.
Ignore any idle wait events. Common idle wait events include:
- Client message
- SQL*Net message from client
- SQL*Net more data from client
- RDBMS IPC message
- Pipe get
- Null event
- PMON timer
- SMON timer
- Parallel query dequeue
Ignore any wait events that represent a very small percentage of the total time waited.
Add the remaining wait event times, and calculate each one as a percentage of total time waited.
Compare the total time waited with the CPU used by this session figure.
Find the event with the largest wait event time. This may be the first item you want to tune.

Latch Contention

Latch contention is a symptom of CPU problems; it is not usually a cause. To resolve it, you must locate the latch contention within your application, identify its cause, and determine which part of your application is poorly written.

In some cases, the spin count may be set too high. It's also possible that one process may be holding a latch that another process is attempting to secure. The process attempting to secure the latch may be endlessly spinning. After a while, this process may go to sleep and later resume processing and repeat its ineffectual spinning. To resolve this:

Check the Oracle latch statistics. The "latch free" event in V$SYSTEM_EVENT shows how long processes have been waiting for latches. If there is no latch contention, then this statistic does not appear. If there is a lot of contention, then it may be better for a process to go to sleep at once when it cannot obtain a latch, rather than use CPU time by spinning.
Look for the ratio of CPUs to processes. If there are large numbers of both, then many processes can run. But, if a single process is holding a latch on a system with ten CPUs, then reschedule that process so it is not running. But, ten other processes may run ineffectively trying to secure the same latch. This situation wastes, in parallel, some CPU resource.

Check V$LATCH_MISSES, which indicates where in the Oracle code most contention occurs.

Note:
Tuning the SPIN_COUNT actually tunes the symptom and not the real problem. Furthermore, setting SPIN_COUNT may actually increase CPU waits. If a post-wait driver is available on the system, then evaluate its implementation.

Solving CPU Problems by Changing System Architectures

If you have maximized the CPU power on your system and have exhausted all means of tuning your system's CPU use, then consider redesigning your system on another architecture. Moving to a different architecture might improve CPU use. This section describes architectures you could consider using. This section contains the following possibilities:

Single Tier to Two-Tier
Multi-Tier: Using Smaller Client Machines
Two-Tier to Three-Tier
Three-Tier

Oracle Parallel Server

Note:
If you are running a multi-tier system, then check all levels for CPU utilization. For example, on a three-tier system, your server might be mostly idle while your second tier is completely busy. To resolve this, tune the second tier rather than the server or the third tier. In a multi-tier system, it is usually not the server that has a performance problem. It is usually the clients and the middle tier.

Single Tier to Two-Tier

Consider whether changing from several clients with one server, all running on a single machine (single tier), to a two-tier client/server configuration would relieve CPU problems.

Figure 18-2 Single Tier to Two-Tier

Multi-Tier: Using Smaller Client Machines

Consider whether using smaller clients improves CPU usage rather than using multiple clients on larger machines. This strategy may be helpful with either two-tier or three-tier configurations.

Figure 18-3 Multi-Tier Using Smaller Clients

Two-Tier to Three-Tier

If your system runs with multiple layers, then consider whether moving from a two-tier to three-tier configuration and introducing an application server or a transaction processing monitor might be a good solution.

Figure 18-4 Two-Tier to Three-Tier

Three-Tier

Consider using one or more application servers or multiple transaction processing monitors.

Figure 18-5 Three-Tier with Multiple Application Servers

Oracle Parallel Server

Consider whether incorporating Oracle Parallel Server would solve your CPU problems.

18Tuning CPU Resources

Understanding CPU Problems

Figure 18-1 Average Workload and Peak Workload

Detecting and Solving CPU Problems

System CPU Utilization

Memory Management

Paging and Swapping

Oversize Page Tables

I/O Management

Thrashing

Client/Server Round Trips

Process Management

Scheduling and Switching

Context Switching

Oracle CPU Utilization

Reparsing SQL Statements

Read Consistency

Scalability Limitations Within the Application

Wait Detection

Latch Contention

Solving CPU Problems by Changing System Architectures

Single Tier to Two-Tier

Figure 18-2 Single Tier to Two-Tier

Multi-Tier: Using Smaller Client Machines

Figure 18-3 Multi-Tier Using Smaller Clients

Two-Tier to Three-Tier

Figure 18-4 Two-Tier to Three-Tier

Three-Tier

Figure 18-5 Three-Tier with Multiple Application Servers

Oracle Parallel Server

Figure 18-6 Oracle Parallel Server

18
Tuning CPU Resources