Tuesday, June 8, 2021

Express Diagnosis of Oracle E-Business Suite Release 12.2.n Upgrade Performance Issues (Doc ID 1583752.1)

 


The latest version of this document can be found in My Oracle Support Document "Express Diagnosis of Oracle E-Business Suite Release 12.2.n Upgrade Performance Issues (Document 1583752.1)"

 


Executive Overview

This document describes the diagnostic strategies and methods that can be used during an Oracle E-Business Suite Release 12.2 upgrade to minimize downtime and expedite resolution of any issues.

The content applies to upgrades from 11.5.10, R12.0.n or R12.1.n to R12.2.n.

The same principles can also be applied to other Oracle E-Business Suite upgrades.

For more detailed information see the My Oracle Support document “Best Practices for Minimizing Oracle E-Business Suite Release 12.2.n Upgrade Downtime (Document 1581549.1)”

Clarification: In all cases, it is only necessary to produce diagnostics during test runs (not during the final production run).

1. Introduction

When analyzing Release 12.2.n Upgrade performance issues, the goal is to:

  • Prevent wasted test iterations. Aim to provide solutions that solve the issue the first time.

  • Maximize the number of performance issues investigated on each iteration.

  • Maximize the number of performance issues resolved.

Upgrade jobs cannot be tested in isolation: a fix can only be validated on the next test iteration, so a fix that does not work is a potential wasted test iteration.

To do this the following are needed:

  • Actual statistics: so it is possible to see exactly which execution plan steps are inefficient, rather than those that might be inefficient. The likely impact of a performance fix can also be estimated. There will then be a higher probability of providing a fix that solves the performance issue the first time. It will also be possible to identify marginal fixes (i.e. fixes that reduce elapsed times by 10-50%, for example by using more specific index access); these fixes often reduce contention between workers.

  • Diagnostics that are quick and easy to run, so that diagnostics can be obtained faster and on more jobs/SQL.

  • Diagnostics that have very little impact on the performance of the Release 12.2 upgrade: if they can be run during the upgrade then the results are obtained sooner and the issue resolved more quickly.

 

2. R12.2, Online Patching and Diagnostics

The new diagnostics available for ADOP are logs. These give timestamps for particular steps in the upgrade, or list errors that have occurred. However, they do not identify the SQLs or underlying events that caused the performance issue, so it is essential to use AWR reports, AD Job Timing Reports, Display Cursor Reports and SQLT Extracts etc.

Online Patching (ADOP) and the underlying Edition-Based Redefinition are only used after "Online Patching Enablement" for the "R12.AD.C.Delta.n", "R12.TXK.C.Delta.n", "12.2.n RUP" and subsequent patches.

Note that although the Online Patching method is used for the latter stages in the upgrade to R12.2.n, these are run during downtime (the application is not running).

Since R12.AD.C.Delta.5 the 12.2.n RUP patches can be applied using "downtime" mode.

So the performance advice and challenges for an upgrade to R12.2.0 are largely the same as for an upgrade to R12.1.n. The same method is used to apply many of the patches (AutoPatch).

However, the Online Patching Enablement patch, Online Patching and the Release Update Packs for 12.2.n are quite different to the main R12.2.0 upgrade, so they may introduce new performance challenges, particularly with internal SQL and ADOP SQL (i.e. SQL run from or on AD_ZD objects).

It is important to stress that although Online Patching (ADOP) provides new logs to track performance, the key diagnostics are still the same ones used for previous Oracle E-Business Suite upgrades (e.g. Display Cursor reports, SQL Monitor reports, AWR reports, AD Job Timing Reports, SQLHC reports, SQLT Extracts etc.)

 

3. Before Running the Oracle E-Business Suite Upgrade

 

3.1 Set Statistics_Level = ALL (or _rowsource_execution_statistics = TRUE)

This can be set using the command:

SQL> ALTER SYSTEM SET statistics_level = 'ALL';

This is the simplest way to see actual row source statistics (including elapsed time, physical reads, buffer gets etc) for each execution plan line (on SQLT and Display Cursor report). The alternative of SQL Trace and TKPROF requires editing standard code.

Note that the internal views v$sql_plan_statistics and v$sql_plan_statistics_all will not contain any actual row source statistics for execution plan lines if statistics_level = TYPICAL, even if timed_statistics = TRUE.

When the STATISTICS_LEVEL parameter is set to ALL, additional statistics are added to the set of statistics collected with the TYPICAL setting. The additional statistics are timed OS statistics and plan execution statistics, which include row source statistics.

Using this strategy will typically speed up the resolution of issues significantly and may also allow the correct solution to be identified first time.

Alternatively, the same actual execution statistics can be collected by setting the initialization parameter _rowsource_execution_statistics=TRUE (with statistics_level = 'TYPICAL'). This gives a lower overhead than statistics_level=ALL.
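As a sketch, the two alternatives could be set as follows. Note that the double quotes are required because _rowsource_execution_statistics is a hidden (underscore) parameter, and such parameters should normally only be changed on the advice of Oracle Support:

```sql
-- Option 1: collect row source statistics via statistics_level
ALTER SYSTEM SET statistics_level = 'ALL';

-- Option 2: lower-overhead alternative, keeping statistics_level = TYPICAL
-- (hidden parameter, hence the double quotes)
ALTER SYSTEM SET "_rowsource_execution_statistics" = TRUE;

-- Verify the current setting
SELECT name, value FROM v$parameter WHERE name = 'statistics_level';
```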

Some technical architects and DBAs (at customers or implementing partners) may be resistant to setting statistics_level = ALL (or _rowsource_execution_statistics = TRUE), believing that it can slow down performance significantly.

Two points are relevant here:

  • Although setting statistics_level = ALL / _rowsource_execution_statistics = TRUE will have some performance impact, it is likely to be small and not significant. The Release 12.2 upgrade is made up of batch processes, so the overhead of collecting row source statistics is a much lower proportion of the total workload.

  • Even if the performance impact is significant, the goal is to reduce the elapsed times for the latter dry runs and go live (when it will be feasible to revert statistics_level / _rowsource_execution_statistics to their previous values). So suffering an increase in elapsed time during an early stage of testing is not an issue.

So there may be a small impact on elapsed time and the work that needs to be done initially, but it will help to subsequently reduce the elapsed time and amount of re-work that needs to be done.

Note that setting statistics_level to ALL while AWR is enabled could significantly increase the number of rows inserted to the WRH$_LATCH_CHILDREN table. So monitor the SYSAUX tablespace to ensure that it does not run out of space.
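As a sketch, the remaining free space in SYSAUX can be checked with a query such as:

```sql
-- Check remaining free space in the SYSAUX tablespace
SELECT tablespace_name, ROUND(SUM(bytes)/1024/1024) free_mb
FROM dba_free_space
WHERE tablespace_name = 'SYSAUX'
GROUP BY tablespace_name;
```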

 

3.2 AWR Snapshot Interval and Retention Period

Automatic Workload Repository (AWR) should be enabled with a snapshot interval of 30 minutes (the default is 60 minutes). For short upgrades, a shorter interval may be more suitable.

The AWR retention period should be long enough to cover the duration of the upgrade run and a significant period afterwards (to gather diagnostics and analyze). The suggestion is N+7 days, where N is the estimated upgrade time, but a longer period will provide more time to gather subsequent diagnostics and statistics.
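Both settings can be changed with DBMS_WORKLOAD_REPOSITORY. For example (a sketch using a 30-minute interval and a 14-day retention; both values are in minutes):

```sql
BEGIN
   -- interval and retention are both specified in minutes
   dbms_workload_repository.modify_snapshot_settings(
      interval  => 30,
      retention => 14*24*60);
END;
/

-- Verify the current settings
SELECT snap_interval, retention FROM dba_hist_wr_control;
```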

 

4. During the Oracle E-Business Suite Upgrade

 

4.1 Obtain Top SQL in Cursor Cache or AWR

This could be internal or application SQL. These scripts should be run regularly during the upgrade, particularly when there are long-running jobs.

If the SQL is still in memory (cursor cache), the following can be used to identify long-running SQLs that may not yet have been written to the AWR (i.e. since the last snapshot):

SELECT * FROM 
   (SELECT 
      ss.sql_id, 
      ROUND(SUM(ss.elapsed_time/1000000),0) elapsed_time_secs, 
      ROUND(SUM(ss.cpu_time/1000000),0) cpu_time_secs, 
      SUM(ss.disk_reads) disk_reads, 
      SUM(ss.direct_writes) direct_writes, 
      SUM(ss.buffer_gets) buffer_gets, 
      SUM(ss.px_servers_executions) px_server_execs, 
      SUM(ss.rows_processed) rows_processed, 
      SUM(ss.executions) executions, 
      ROUND(SUM(ss.application_wait_time/1000000),0) apwait_secs, 
      SUM(ss.sharable_mem) sharable_mem, 
      SUM(ss.total_sharable_mem) total_sharable_mem
   FROM v$sqlstats ss 
   GROUP BY ss.sql_id 
   ORDER BY 2 DESC) 
WHERE ROWNUM <= 100;

The following SQL script will report the longest running SQLs between two AWR snapshots.

SELECT * FROM 
   (SELECT 
      dhs.sql_id, 
      ROUND(SUM(dhs.elapsed_time_delta/1000000),0) elapsed_time_secs, 
      ROUND(SUM(dhs.cpu_time_delta/1000000),0) cpu_time_secs, 
      SUM(dhs.disk_reads_delta) disk_reads, 
      SUM(dhs.buffer_gets_delta) buffer_gets, 
      SUM(dhs.px_servers_execs_delta) px_server_execs, 
      SUM(dhs.rows_processed_delta) rows_processed, 
      SUM(dhs.executions_delta) executions, 
      ROUND(SUM(dhs.iowait_delta/1000000),0) iowait_secs, 
      ROUND(SUM(dhs.clwait_delta/1000000),0) clwait_secs, 
      ROUND(SUM(dhs.ccwait_delta/1000000),0) ccwait_secs, 
      ROUND(SUM(dhs.apwait_delta/1000000),0) apwait_secs 
   FROM dba_hist_sqlstat dhs ,
        v$database d 
   WHERE dhs.dbid = d.dbid 
   AND snap_id > <begin snap> 
   AND snap_id <= <end snap> 
   GROUP BY dhs.sql_id 
   ORDER BY 2 DESC) 
WHERE ROWNUM <= 100;

Where <begin snap> and <end snap> are the start and end snapshot IDs.

The output of this statement will look similar to the following:

SQL_ID        ELAPSED_TIME_SECS CPU_TIME_SECS   DISK_READS BUFFER_GETS …. 
------------- ----------------- --------------- ---------- ----------- ….
xxxxxxxxxxxxx            367440           42999   34838244  3795838289 …. 
xxxxxxxxxxxxx            264369          170788     441127   562033013 …. 
xxxxxxxxxxxxx             70370            6448    3599284   469639133 …. 
xxxxxxxxxxxxx             68298           38896    7125573  1327384554 …. 
xxxxxxxxxxxxx             63600           27402   20043712   587615960 ….

Note that the elapsed time is summed across all executions and all workers of a job, so it can be greater than the wall-clock duration.

Enterprise Manager can also be used to identify expensive SQL as it occurs.
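To determine the <begin snap> and <end snap> values for the AWR query above, recent snapshots can be listed with a query such as (a sketch):

```sql
-- List the 20 most recent AWR snapshots with their time ranges
SELECT * FROM
   (SELECT snap_id, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    ORDER BY snap_id DESC)
WHERE ROWNUM <= 20;
```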

 

4.2 Obtain Display Cursor Report for Long-Running SQL

For long-running SQL reported by the above script, run a display cursor report (with ALL +ALLSTATS option).

This displays the actual execution plan of any cursor loaded in the cursor cache. At the basic level it shows the runtime execution plan. However, the format ALL also includes extra information such as pruning, parallel execution, predicate, projection, alias and remote SQL information.

This should be run as soon as possible, while the jobs are running or very shortly afterwards. If the Display Cursor report is delayed, the cursor may already have been flushed from memory or invalidated, in which case no actual statistics will be available and the report will contain no data.

The +ALLSTATS option (which includes IOSTATS and MEMSTATS) will include actual statistics for each execution plan step. These include:

  • Elapsed time

  • Physical reads

  • Buffer gets

  • Memory used (in PGA) for memory intensive operations (such as hash-joins, sorts, bitmap operators etc).

However, this additional information is only provided if statistics_level = ALL or _rowsource_execution_statistics = TRUE.

Note that SQLT with XTRACT will also report actual row source statistics in the same circumstances. However, Display Cursor provides a simpler view of the information. It can also be run during the upgrade, while the long running SQL is in progress, without much of an overhead.

The report can be produced by running the following SQL script:

SET pages 0
SET lines 300
SET LONG 10000
SET LONGCHUNKSIZE 10000
SPOOL <report_name>.txt
SELECT * FROM TABLE(dbms_xplan.display_cursor('<sql_id>', NULL, 'ALL +ALLSTATS'));
SPOOL OFF;

For more information see the "Display Cursor" section in My Oracle Support document "Oracle E-Business Suite Performance Guide (Document 1672174.1)"

If the SQL is no longer in memory, but is in the AWR, use the Display AWR report (DBMS_XPLAN.DISPLAY_AWR) instead.

SET pages 0 
SET lines 300 
SET LONG 10000 
SET LONGCHUNKSIZE 10000 
SPOOL <report_name>.txt
SELECT * FROM TABLE(dbms_xplan.display_awr('<sql_id>', NULL, NULL, 'ALL')); 
SPOOL OFF; 

However, this does not report on actuals: it does not have a +ALLSTATS option, and there are no actual statistics for execution plan steps stored in AWR.

Note that SQLT with XTRACT method will not report on actual statistics in this case either.

Note that the Display Cursor and Display AWR reports only show sql_text (the first 1000 characters) and not sql_fulltext.

So, if necessary, run the following SQL script to obtain the full SQL text.

SET pages 0
SET lines 300
SET LONG 10000
SET LONGCHUNKSIZE 10000
SPOOL <report_name>.txt 
SELECT sql_id, sql_text, sql_fulltext FROM v$SQL 
WHERE sql_id = '<sql_id>';
SPOOL OFF;

 

4.2.1 Automate execution of Display Cursor reports for top SQLs

The following script can be used (on Linux) to automatically produce Display Cursor reports for the top SQLs (by elapsed time) during each upgrade patch. It also produces Display AWR reports, which only provide the execution plan and no actual statistics, but which may be useful as a fallback if statistics_level or _rowsource_execution_statistics has not been set, or a particular SQL is no longer in the cursor cache.

Although this automated process will produce many reports that are never looked at, it means that if a long-running SQL (as part of a long-running job) is identified, the Display Cursor report will already be available for analysis. It will not be necessary to wait until the next test run to obtain one.

There are 4 parameters that should be passed to the script:

  • Apps password

  • Number of iterations. This should be large enough that the shell script keeps producing Display Cursor reports throughout the upgrade patch. It is recommended that this is much larger than needed; the shell script can always be killed when the patch has completed.

  • Iteration gap in minutes. This should be short enough that all long-running SQLs of interest are reported, but long enough not to cause much overhead. 10 minutes is recommended.

  • Number of top SQLs to monitor (by elapsed time). This should be large enough that all long-running SQLs of interest are reported, but small enough not to cause much overhead. 20 is recommended.

So, if the apps password is represented by <password>, the number of iterations is 72, the gap is 10 minutes and the number of top SQLs is 20, the script will be called as follows:

xxauto_dc.sh <password> 72 10 20

The shell script loops for the specified number of iterations.

In each iteration of the loop it:

  • Calls the SQL script xxautodc.sql, passing the number of top SQLs parameter. This builds a temporary SQL script dispcurbatch.sql to spool and run the Display Cursor reports for the top SQL and then executes that script.

  • Removes the temporary SQL script dispcurbatch.sql (after it has been run).

  • Sleeps for the specified number of minutes (before starting the next iteration of the loop).

There are 3 sets of reports:

  • Disp_Curs_<sql_id>_<YYMMDDHH24MI>.txt. This is the Display Cursor report for a top SQL in the AWR

  • Disp_AWR_<sql_id>_<YYMMDDHH24MI>.txt. This is the Display AWR report for a top SQL in the AWR

  • CC_Disp_Curs_<sql_id>_<YYMMDDHH24MI>.txt. This is the Display Cursor report for a top SQL in the Cursor Cache

Where

  • <sql_id> is the SQL ID for the particular SQL statement.

  • <YYMMDDHH24MI> indicates the approximate time that the report was produced (i.e. from which iteration) in format YYMMDDHH24MI

Ensure that there is plenty of space on the file system (volume) available to accommodate the output files.

# Script xxauto_dc.sh

# Get arguments
# 1. apps password
# 2. number of iterations
# 3. iteration gap - minutes
# 4. No of Top SQLs 

export appspass="$1"
export numit="$2"
export gap="$3"
export topsql="$4"

iteration=1

until [ ${iteration} -gt ${numit} ]
do
   echo "Iteration ${iteration}"

   sqlplus apps/"${appspass}" @xxautodc.sql "${topsql}"

   # remove batch file (if it exists) 

   rm -f dispcurbatch.sql

   iteration=`expr $iteration + 1` 

   # Now sleep for requested number of minutes 

   if [ ${iteration} -gt ${numit} ] 
   then
      echo "Do not sleep"
   else
      sleep ${gap}m
   fi

done

The SQL script xxautodc.sql:

  • Reads the "Number of Top SQLs" parameter.

  • Runs SQL on the AWR and cursor cache to spool output to temporary SQL script file dispcurbatch.sql. The SQL script dispcurbatch.sql will contain commands to spool and run Display Cursor reports for the top SQLs (in AWR and cursor cache).

  • Executes the SQL script dispcurbatch.sql.

-- Script xxautodc.sql
-- example script for building script to run Display Cursor for top sql ids
-- It is absolutely essential that either statistics_level = ALL or _rowsource_execution_statistics = TRUE
-- otherwise the Display Cursor report will not contain the actual rowsource statistics (for each plan line)
-- that are essential in quickly identifying if and why a SQL has an inefficient execution plan.

WHENEVER SQLERROR CONTINUE
WHENEVER OSERROR CONTINUE

VARIABLE l_topsql NUMBER

-- pick up l_topsql and set bind var
DEFINE topsql = '&&1'

exec :l_topsql := TO_NUMBER('&topsql')

set echo off heading off feedback off verify off

set pages 0 termout off
set linesize 100

spool dispcurbatch.sql

SELECT 'WHENEVER SQLERROR CONTINUE' FROM DUAL;
SELECT 'WHENEVER OSERROR CONTINUE' FROM DUAL;
SELECT 'SET pages 0' FROM DUAL;
SELECT 'SET lines 300' FROM DUAL;
SELECT 'SET LONG 10000' FROM DUAL;
SELECT 'SET LONGCHUNKSIZE 10000' FROM DUAL;

SELECT
   '                                                                                                 ',
   'SPOOL '
   ||'Disp_Curs_'
   ||sql_id
   ||'_'
   ||TO_CHAR(SYSDATE,'YYMMDDHH24MI')
   ||'.txt                                                         ',
   'SELECT * FROM TABLE(dbms_xplan.display_cursor('''
   ||sql_id
   ||''', NULL, ''ALL +ALLSTATS''));    ',
   'SPOOL OFF',
   '                                                                                                 ',
   'SPOOL '
   ||'Disp_AWR_'
   ||sql_id
   ||'_'
   ||TO_CHAR(SYSDATE,'YYMMDDHH24MI') ||'.txt                                                          ',
   'SELECT * FROM TABLE(dbms_xplan.display_awr('''
   ||sql_id
   ||''', format => ''ALL''));                 ',
   'SPOOL OFF'
FROM
   (SELECT  /*+ leading(d,i,dhss,dhst) */
       dhss.sql_id,
       ROUND(SUM(dhss.elapsed_time_delta/1000000),0) elapsed_time_secs -- needs sum because of different plan (hash)s for same SQL
    FROM v$database d,
         v$instance i,
         dba_hist_sqlstat dhss,
         dba_hist_sqltext dhst
    WHERE dhss.dbid = d.dbid
    AND dhss.instance_number = i.instance_number
    AND dhst.dbid = dhss.dbid
    AND dhst.sql_id = dhss.sql_id
    AND dhst.command_type != 47 -- exclude anonymous blocks
    AND dhss.snap_id = (SELECT /*+ push_subq */ MAX(dhsn.snap_id)
                        FROM dba_hist_snapshot dhsn,
                             v$database d,
                             v$instance i
                        WHERE dhsn.dbid = d.dbid
                        AND dhsn.instance_number = i.instance_number) -- Just report on latest completed snapshot
    GROUP BY dhss.sql_id
    ORDER BY 2 DESC)
WHERE rownum <= :l_topsql -- limit to the top SQLs (ordered by elapsed time).
; 

-- Also get top SQLs from cursor cache - to fill any gaps

SELECT
   '                                                                                                 ',
   'SPOOL '
   ||'CC_Disp_Curs_'
   ||sql_id
   ||'_'
   ||TO_CHAR(SYSDATE,'YYMMDDHH24MI')
   ||'.txt                                                      ',
   'SELECT * FROM TABLE(dbms_xplan.display_cursor('''
   ||sql_id
   ||''', NULL, ''ALL +ALLSTATS''));    ',
   'SPOOL OFF'
FROM
   (SELECT
       ss.sql_id,
       ROUND(SUM(ss.delta_elapsed_time/1000000),0) elapsed_time_secs
    FROM v$sqlstats ss,
         v$sql s
    WHERE s.sql_id = ss.sql_id
    AND s.child_address = ss.last_active_child_address
    AND s.command_type != 47 -- exclude anonymous blocks
    GROUP BY ss.sql_id
    ORDER BY 2 DESC)
WHERE rownum <= :l_topsql -- limit to the top SQLs (ordered by elapsed time).
; 

spool off

set termout on echo on verify on heading on feedback on

set termout off

-- Now run the script that has been built above
@@dispcurbatch

set termout on

exit

 

4.3 Obtain SQL Monitor Report for SQL Using Parallel Query/DML


The main advantage of this is that it gives a good view of how parallel SQL/DML performs across stages of the plan and parallel child (secondary) processes.

It can also give a good idea of the actual executions and row counts for each execution plan line, even if the "statistics_level" initialization parameter is not set to ALL (or "_rowsource_execution_statistics" is not set to TRUE) at the time the SQL is executed.

It can be run during the upgrade, while the long running SQL is in progress (or shortly afterwards), without much of an overhead.

It can be produced by running the following SQL script:

set trimspool on
set trim on
set pages 0
set long 10000000
set longchunksize 10000000
set linesize 200
set termout off
spool sql_monitor_for_<sql_id>.htm
variable my_rept CLOB;
BEGIN 
   :my_rept := dbms_sqltune.report_sql_monitor(sql_id => '<sql_id>', report_level => 'ALL', type => 'HTML');
END;
/

print :my_rept

spool off;

set termout on

For more information see the "SQL Monitor Report" section in My Oracle Support document "Oracle E-Business Suite Performance Guide (Document 1672174.1)"

 

4.4 Identify When SQL Ran

The following SQL will show when a particular piece of SQL ran (i.e. between which snapshots). This is useful in matching SQLs to jobs.

SELECT  
   dhs.sql_id, 
   dsn.snap_id, 
   dsn.begin_interval_time, 
   dsn.end_interval_time, 
   ROUND(SUM(dhs.elapsed_time_delta/1000000),0) elapsed_time_secs 
FROM dba_hist_sqlstat dhs ,
     v$database d ,
     dba_hist_snapshot dsn 
WHERE dhs.dbid = d.dbid 
AND dsn.snap_id = dhs.snap_id 
AND dsn.dbid = dhs.dbid 
AND dsn.instance_number = dhs.instance_number 
AND dhs.sql_id = '<sql_id>' 
AND dsn.snap_id > <begin_snap> 
AND dsn.snap_id <= <end_snap>  
GROUP BY dhs.sql_id, dsn.snap_id, dsn.begin_interval_time, dsn.end_interval_time 
ORDER BY dsn.snap_id;

Where <begin snap> and <end snap> are the start and end snapshot IDs.

The output of this statement will look similar to the following:

SQL_ID        SNAP_ID BEGIN_INTERVAL_TIME     END_INTERVAL_TIME       ELAPSED_TIME_SECS 
------------- ------- ----------------------- ----------------------- ----------------- 
xxxxxxxxxxxxx xxxx    04-JAN-13 23.00.25.5560 05-JAN-13 00.00.21.1620             23123 
xxxxxxxxxxxxx xxxx    05-JAN-13 00.00.21.1620 05-JAN-13 01.00.38.2680             37145 

 

4.5 Match Long-Running SQL to Jobs

The following SQL will give the jobs running at any point between two times, with the longest-running jobs first. This is useful in matching SQLs to jobs.

Where:

<start_time> = start of period to report in format YYYYMMDDHH24MISS

<end_time> = end of period to report in format YYYYMMDDHH24MISS

Note that the job must have completed for it to be reported by this SQL script.

SELECT 
   phase, 
   phase_name, 
   product, 
   job_name, 
   max_elapsed_time, 
   min_start_time, 
   max_end_time, 
   workers 
FROM 
   (SELECT 
       phase,
       phase_name, 
       product, 
       job_name, 
       MAX(elapsed_time) elapsed_time_unconv, 
       LPAD(FLOOR(MAX(elapsed_time)*24), 4)||':'|| 
          LPAD(FLOOR((MAX(elapsed_time)*24-floor(MAX(elapsed_time)*24))*60), 2, '0')||':'|| 
          LPAD(MOD(ROUND(MAX(elapsed_time)*86400), 60), 2, '0') max_elapsed_time,
       INITCAP(TO_CHAR(MIN(start_time),'MON DD HH24:MI', 'NLS_DATE_LANGUAGE = american')) min_start_time,
       INITCAP(TO_CHAR(MAX(end_time),'MON DD HH24:MI', 'NLS_DATE_LANGUAGE = american')) max_end_time, 
       count(worker_id) workers 
    FROM ad_task_timing
    WHERE session_id = <session_id> 
    AND ( start_time BETWEEN TO_DATE('<start_time>','YYYYMMDDHH24MISS') AND TO_DATE('<end_time>','YYYYMMDDHH24MISS') 
        OR 
          NVL(end_time, start_time+elapsed_time) BETWEEN TO_DATE('<start_time>','YYYYMMDDHH24MISS') AND TO_DATE('<end_time>','YYYYMMDDHH24MISS') )
    GROUP BY phase, phase_name, product, job_name)
ORDER BY elapsed_time_unconv DESC;

The output of this statement will look similar to the following:

      Phase                            max elapsed 
phase name   product job_name          time        min_start_time max_end_time workers 
----- ------ ------- ----------------- ----------- -------------- ------------ ------- 
255   upg+80 zx      zxaptrxmigupd.sql     2:43:47 Mar 13 04:22   Mar 13 07:06      64 
255   upg+80 ap      apxlaupg.sql          1:38:57 Mar 13 04:03   Mar 13 05:42       1 

To find upgrade AD jobs that are in progress, use adctrl option 1 (Show worker status).

When they started can be determined by looking at the patch log file, e.g.:

$ cat u_merged.log|grep -A2 cstpostimportaad.sql 

Assigned: file cstpostimportaad.sql on worker 48 for product bom username BOM. Time is: Fri Mar 22 2013 22:48:54

 

4.6 Report on CBO Statistics for All Oracle E-Business Suite Tables

Report on the CBO statistics for Oracle E-Business Suite tables during the upgrade, before adsstats.sql is run (in 12.2.0 CUP).

The script adsstats.sql will populate the statistics correctly before the end of the upgrade, so the fact that tables had incorrect statistics during the upgrade will no longer be visible afterwards. It may therefore not be possible to see that a table had null, zero or inaccurate CBO statistics, and that this was the reason for an expensive execution plan.

The following script will report on the CBO statistics for all Oracle E-Business Suite tables:

SELECT 
   owner, 
   table_name, 
   num_rows, 
   TO_CHAR(last_analyzed,'DD-MON-YYYY HH24:MI:SS') last_analyzed 
FROM all_tables 
WHERE owner IN (SELECT upper(oracle_username) sname 
                FROM fnd_oracle_userid 
                WHERE oracle_id BETWEEN 900 AND 999 
                AND read_only_flag = 'U' 
                UNION ALL 
                SELECT DISTINCT upper(oracle_username) sname 
                FROM fnd_oracle_userid a,
                     fnd_product_installations b 
                WHERE a.oracle_id = b.oracle_id )
ORDER BY owner, table_name;

The output of this statement will look similar to the following:

OWNER                  TABLE_NAME                     NUM_ROWS LAST_ANALYZED                                 
---------------------- ---------------------------- ---------- ------------------------ 
ABM                    ABM_ACC_MAP_SUM_REP                   0 06-JAN-2013 08:46:33 
ABM                    ABM_ACT_ACC_RU_DAT                    0 06-JAN-2013 08:46:35 
ABM                    ABM_ACT_STA_RU_DAT                    0 06-JAN-2013 08:46:36 
ABM                    ABM_ACT_TAGS                          0 06-JAN-2013 08:46:37 
ABM                    ABM_API_TEMPLATES                    38 06-JAN-2013 08:44:53 
ABM                    ABM_API_TEMPLATES_TL                722 06-JAN-2013 08:41:16 
ABM                    ABM_API_TEMPLATE_ATTRIBUTES         248 06-JAN-2013 08:44:34 

5. After the Oracle E-Business Suite Upgrade

These are diagnostics to be obtained directly after each of the main patches/RUPs have completed.

This will be directly after each of 12.2.0 CUP, Online Patching Enablement, R12.AD.C.Delta.n, R12.TXK.C.Delta.n and 12.2.n RUP.

 

5.1 AD Job Timing Report

This reports:

  • Number of successful, failed, deferred, re-started or skipped jobs.

  • The top 100 time consuming jobs.

  • The failed, deferred, re-started and skipped jobs.

  • The timing of each upgrade phase, with the total number of jobs, and the number deferred, re-started and skipped.

However, it only reports the Top 100 Time Consuming Jobs, and for AD Parallel jobs it considers each worker to be a different job. This means it may only report a handful of jobs. An alternative is to query the AD_TASK_TIMING table for long running jobs directly.

When ADOP, AutoPatch or AD Administration is run, it automatically generates an AD Job Timing report (adt<session_id>.lst). The contents of this report can be accessed from Oracle Application Manager, or the report can be obtained for completed upgrade sessions from the APPL_TOP/admin/<SID>/out directory.

The AD Job Timing Report can also be run for AD Administration jobs from the command line.

$ cd $APPL_TOP/admin/<SID>/out 
$ sqlplus <APPS username>/<APPS password> @$AD_TOP/admin/sql/adtimrpt.sql <session_id> <output_file> 

Where <session_id> is the session of the timing statistics required, and <output file> is the name of the file where the statistics will be written.

$AD_TOP/admin/sql/adtimdet.sql can also be run in a similar way. This gives details on all jobs ordered by phase or elapsed time. This is useful for finding out how long any job took to run, and also where the "Top 100 Time Consuming Jobs" is dominated by multiple workers of a few jobs.

Note that the SQL scripts may be in $AD_TOP/sql (not admin/sql).

See "Oracle E-Business Suite Maintenance Guide Release 12.2" for more information.

 

5.2 Identify Long-Running Upgrade Jobs

Note that the "Top 100 Time Consuming Jobs" section of the standard adtimrpt.sql report lists all workers for AD Parallel jobs separately. So the top 100 can be dominated by a handful of jobs.

The following SQL can be used to list all jobs in order of maximum elapsed time (descending), but reporting all workers of an AD Parallel job in one line. It only reports completed jobs.

Note that <session_id> is the ID for the upgrade "session" and not a user session.

Be aware of jobs that are called multiple times in the same phase (e.g. akload.class, LoadMap.class, XDOLoader.class).

SELECT 
   phase, 
   phase_name, 
   product, 
   job_name, 
   max_elapsed_time, 
   min_start_time, 
   max_end_time, 
   workers 
FROM 
   (SELECT 
       phase, 
       phase_name, 
       product, 
       job_name, 
       MAX(elapsed_time) elapsed_time_unconv, 
       LPAD(FLOOR(MAX(elapsed_time)*24), 4)||':'|| 
          LPAD(FLOOR((MAX(elapsed_time)*24-floor(MAX(elapsed_time)*24))*60), 2, '0')||':'|| 
          LPAD(MOD(ROUND(MAX(elapsed_time)*86400), 60), 2, '0') max_elapsed_time, 
       INITCAP(TO_CHAR(MIN(start_time),'MON DD HH24:MI', 'NLS_DATE_LANGUAGE = american')) min_start_time, 
       INITCAP(TO_CHAR(MAX(end_time),'MON DD HH24:MI', 'NLS_DATE_LANGUAGE = american')) max_end_time, 
       count(worker_id) workers 
    FROM ad_task_timing 
    WHERE session_id = <session_id> 
    GROUP BY phase, phase_name, product, job_name)
ORDER BY elapsed_time_unconv DESC; 

The output of this statement will look similar to the following:

Phs                                        MaxElpsd   Min           Max             Num 
Num   Phase        Prod  Job               Time       Start Time    End Time       Wrkr 
----- ------------ ----- ----------------- ---------- ------------- ------------- ----- 
255   upg+80       zx    zxaptrxmigupd.sql    2:43:47 Mar 13 04:22  Mar 13 07:06     64 
351   last+63      ad    adsstats.sql         2:09:42 Mar 13 11:14  Mar 13 13:23      1 
255   upg+80       ap    apxlaupg.sql         1:38:57 Mar 13 04:03  Mar 13 05:42      1 
251   upg+76       ap    apilnupg.sql         0:55:45 Mar 13 01:24  Mar 13 02:19     64 
241   upg+57       ont   ontup251.sql         0:25:07 Mar 13 00:11  Mar 13 00:36      1 
….

 

5.3 Obtain File versions for Long-Running Jobs

When an SR or Bug is raised, Oracle Support and Development will ask for the version of the job (file) that has the issue, or the version of code (file) used by the job.

It may be necessary to check what is done in a particular job, or the exact metadata/object definition in a file. When doing this, be sure to identify the version of the files used in the upgrade.

For example, for an R12.2.n upgrade, the version present in the file system or database after all the upgrade steps – including R12.2.n RUP and post upgrade steps – may be different from the one used during the R12.2.0 upgrade.

To find out the version of a file used during the R12.2.0 upgrade, check the unified driver used for that part of the upgrade: the same principle applies to the 12.2.n RUP. For example:

$ cat u_merged.drv | grep -A5 cstpostimportaad.sql

Note : Do not rely on the version of the file or object in the file system or database, as it could be a later version applied by a subsequent patch.

 

5.4 Obtain AWR Reports

Obtain AWR reports for:

  • The whole period that the upgrade is running.

  • For the duration of long-running jobs (i.e. between the snapshots taken just before the job starts and just after it finishes).

  • Each individual snapshot.

Initially, the following can be identified:

  • Long running SQL (SQL Statistics/SQL Ordered By)

  • Contention and bottlenecks (Top 5 Timed Foreground Events/Foreground Wait Events)

  • CPU utilization

However, many other useful statistics can be obtained from AWR reports, depending on the situation, e.g.:

  • Advisory Statistics on SGA/PGA sizing etc.

  • Segment Statistics to identify the objects with the highest physical/logical reads or waits (Row Lock, ITL, Buffer Busy).

  • Wait Event Histogram to identify the distribution of wait times and how serious contention is.

If there are high levels of a particular wait event, first check whether it occurs only with particular long-running SQL statements and jobs, before assuming that it is a system configuration or resource issue.

awrrpt.sql will typically be used to generate the AWR reports. Always choose the HTML report type. On an Oracle RAC instance, awrrpti.sql can be used to report on a specific instance; however, awrrpt.sql will usually suffice, as the upgrade is run on one Oracle RAC node only.

AWR reports can be automated. This is useful if producing a large number of AWR reports, particularly for successive snapshots. See the "Automating AWR Reports" section in My Oracle Support document "Performance Diagnosis with Automatic Workload Repository (Document 1674086.1)".
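As one possible sketch of such automation (not the method in Document 1674086.1), a shell loop can generate a SQL*Plus driver script that spools one HTML report per successive snapshot pair via the documented DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML function. The DBID, instance number and snapshot range below are made-up placeholders; substitute real values (e.g. from DBA_HIST_SNAPSHOT) before running the generated script.

```shell
DBID=4231870123   # placeholder: use SELECT dbid FROM v$database
INST=1            # instance number the upgrade ran on
FIRST=100         # first snapshot id of the range (placeholder)
LAST=103          # last snapshot id of the range (placeholder)
OUT=awr_batch.sql

# SQL*Plus settings so the spooled HTML is clean
echo "SET PAGESIZE 0 LINESIZE 8000 TRIMSPOOL ON HEADING OFF FEEDBACK OFF" > "$OUT"

# one SPOOL/SELECT/SPOOL OFF stanza per successive snapshot pair
s=$FIRST
while [ "$s" -lt "$LAST" ]; do
  e=$((s + 1))
  {
    echo "SPOOL awr_${s}_${e}.html"
    echo "SELECT output FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML($DBID, $INST, $s, $e));"
    echo "SPOOL OFF"
  } >> "$OUT"
  s=$e
done
```

The generated file can then be run in one pass, e.g. sqlplus / as sysdba @awr_batch.sql.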

Note that some fixed objects and dictionary objects (particularly WRH$_LATCH_CHILDREN) will have grown significantly during the upgrade. This is especially the case if statistics_level = ALL, or there is a high retention period or a short snapshot interval.

Fixed object and dictionary statistics may therefore need to be gathered before generating AWR reports.

See My Oracle Support document "Performance Diagnosis with Automatic Workload Repository (Document 1674086.1)" for more information.

 

5.5 SQLHC or SQLT

Run SQLHC or SQLT for the long running SQLs.

There are occasions when SQLHC should be used rather than SQLT:

  • For example, a customer may object to installing SQLT on their production environment, or may be concerned about its performance impact on a production system.

  • It may be necessary to obtain more information on a SQL during an upgrade patch/pack.

  • Or the customer may not have installed SQLT yet.

  • SQLT is more complicated and may require more explanation, so it may be quicker and simpler to ask for SQLHC.

There are situations where SQLT is preferred over SQLHC:

  • The SQL ID (and runtime execution plan) may not be in memory or AWR, in which case SQLHC will not provide any output, and SQLT with the XPLAIN or XECUTE method is required.

  • More information is required.

 

5.6 SQLHC

Once long-running SQLs with possibly sub-optimal execution plans have been identified, use SQLHC (sqlhc.sql).

This provides HTML reports, which contain most of the data required to identify the reasons why the CBO chose a sub-optimal execution plan: e.g. Health Check (Observations), CBO Statistics, Metadata for tables and indexes, Execution Plans, Runtime Statistics, System/Instance (Initialization) Parameters, SQL Profiles/Baselines associated with the SQL, and a SQL Activity Report.

It saves a huge amount of time by providing all information in one action (and in one zip file sqlhc_YYYYMMDD_HHMISS_SQLID.zip).

It does not require any installation on the environment (other than the script file itself) and has a negligible performance impact. If necessary it can be run whilst the upgrade is in progress.

It is particularly useful if engineers/analysts do not have access to the instance.

Note that SQLHC will only provide actual row source statistics if the "statistics_level" initialization parameter is set to ALL (or "_rowsource_execution_statistics" is set to TRUE) at the time the SQL is executed.

See the following My Oracle Support documents for more information:

 

5.7 SQLT with XTRACT method

This uses a significant amount of system resources, so should not be run during the upgrade. Instead, Display Cursor report and SQLHC can be used to obtain much of the information that is included in the SQLT output.

The SQLT provides execution plans, CBO statistics, database parameters, CBO parameters, performance statistics, and metadata (schema object definitions and so on) associated with the SQL.

Depending on the SQLT parameters it can also contain supporting information such as AWR reports, ASH Reports, ADDM Reports, CBO (10053) trace, SQL Monitor report, EBS Initialization Parameters healthcheck report (bde_chk_cbo), Test Case Builder and SQL Tuning Advisor.

It is particularly useful for passing information to analysts/engineers, if access to the instance is not possible. Even if access is possible, it gives all the information in one place (one zip file).

SQLT should be run on the same environment where the performance issue was observed, and as soon after the relevant program/process as possible.

Be aware of any actions that may alter the data that SQLT is capturing (that is, actions that take place after the observed performance issue, but before SQLT is run). For example, statistics being gathered, removed, locked, imported, or data being deleted from temporary or interface tables.

To run SQLT with XTRACT, the sql_id or hash_value will be needed.

In Oracle Database 11g, the TKPROF output gives both the sql_id and the hash value.

For Oracle 11g, the raw trace gives the sql_id (sqlid=); in 10g and before, it gives the hash value (hv=).

Other methods of obtaining the sql_id include using AWR, Oracle Enterprise Manager, and directly querying V$SQL.

The XTRACT method is needed because:

Note that SQLT runs AWR and ASH reports. Some dictionary objects (particularly WRH$_LATCH_CHILDREN, especially if statistics_level is set to ALL) will have grown significantly during the upgrade. So, it may be necessary to gather fixed object and dictionary statistics before running SQLT.

SQLT can take quite a while to run.

See the following My Oracle Support documents for more information:

 

5.7.1 Reducing workload / time taken producing SQLT

To reduce the workload, it is recommended that the following are run (from SQL*Plus) before running sqltxtract.sql:

To disable Test Case Builder TCB and/or SQL Tuning Advisor

EXEC sqltxplain.sqlt$a.set_param('test_case_builder', 'N'); 
EXEC sqltxplain.sqlt$a.set_param('sta_time_limit_secs', '30');

To disable the automatic export of a test case repository

EXEC sqltxplain.sqlt$a.set_param('export_repository', 'N'); 

If SQLT still takes a long time, and the schema objects used by the SQL contain a large number of sub-partitions, the granularity of the data collected can be reduced as follows:

EXEC sqltxplain.sqlt$a.set_param('c_gran_segm', 'PARTITION'); 
EXEC sqltxplain.sqlt$a.set_param('c_gran_cols', 'PARTITION'); 
EXEC sqltxplain.sqlt$a.set_param('c_gran_hgrm', 'PARTITION'); 

Note that these commands can all be run as APPS. They do not need to be run as user SQLTXPLAIN.

These values are stored in a table called SQLTXPLAIN.SQLI$_PARAMETER. Once they are set, they do not need to be re-set for each execution of SQLT. The current values can be checked by querying this table.

To reduce the time further, the counting of rows on tables can be disabled by running the following. However, information on the actual number of rows in each table will be lost.

 EXEC sqltxplain.sqlt$a.set_param('count_star_threshold', '0');
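Purely as a convenience sketch (the file name and location are arbitrary), the settings above can be staged in a single script and run once as APPS from SQL*Plus:

```shell
# Stage the recommended SQLT settings in one script.
# Note: the count_star_threshold line also disables row counting (see above);
# omit it if actual row counts are wanted.
cat > /tmp/sqlt_params.sql <<'EOF'
EXEC sqltxplain.sqlt$a.set_param('test_case_builder', 'N');
EXEC sqltxplain.sqlt$a.set_param('sta_time_limit_secs', '30');
EXEC sqltxplain.sqlt$a.set_param('export_repository', 'N');
EXEC sqltxplain.sqlt$a.set_param('count_star_threshold', '0');
EOF
```

The script can then be run as, e.g., $ sqlplus apps @/tmp/sqlt_params.sql. Since the values persist in SQLTXPLAIN.SQLI$_PARAMETER, this only needs to be done once per environment.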

6. Online Patching Diagnostics

 

6.1 Online Patching Enablement - Specific Diagnostics

The Online Patching Enablement patch is applied using AutoPatch (adpatch). In addition to the general diagnostics above the output from the following script will be useful during Online Patching Enablement:

$ sqlplus apps @$AD_TOP/sql/ADZDSHOWDDLS.sql 

 

6.2 Online Patching (ADOP) Logs and Diagnostics

All the ADOP logs are located on the non-editioned file system (fs_ne) in the <INSTALL BASE>/fs_ne/EBSapps/log/adop directory, e.g.

/u01/PROD/fs_ne/EBSapps/log/adop

Each cycle of ADOP creates a subdirectory corresponding to the patch session ID, e.g.

/u01/PROD/fs_ne/EBSapps/log/adop/n

Where n is the session ID.

It is easiest and quickest to produce a zip of the entire directory.

The main files of interest are the ADOP logs (e.g. adop_YYYYMMDD_HHMISS.log).

The adzdshowlog.out, adworker*.log, u*.log, u*.lgi and admrgpch*.log files, under the same path, are also useful.

When running ADOP, the on-screen terminal output will mention which ADOP session ID is in use, e.g.

/u01/PROD/fs_ne/EBSapps/log/adop/9/apply_20121011_024437

The session ID directory will contain a trace file for each phase (e.g. adop_20130316_091340.log) and a corresponding log directory for each phase containing other logs (e.g. apply_20130316_091340).

The timestamp of the trace file and the corresponding log directory will match.

 

6.3 Non ADOP Logs

The same log directory for each phase (e.g. apply_20130316_091340) also contains some AD Utility and worker logs.

These include adrelink.log, adlibin.log, adlibout.log, adworknnn.log. The most useful are the adworknnn.log files that show the jobs run on each AD Parallel worker along with timestamps.

 

6.4 Online Patching Log Analyzer Utility

This is delivered in R12.AD.C.Delta.n (since R12.AD.C.Delta.4).

This utility analyzes adop log directories for errors and warnings, and displays messages to help the user quickly identify any problems that may have occurred during an adop run. It thereby offers an alternative to reviewing log files manually.

The Log Analyzer utility can be run without options:

To scan all log directories of the latest adop session for errors:

$ adopscanlog  

The utility can also be run with various options. Examples include:

  • To scan log directories relating to the latest run of adop in the latest session:
    $ adopscanlog -latest=yes  
  • To scan log directories relating to the latest run of the specified phase, in the latest session:
    $ adopscanlog -latest=yes -phase=<phase_name>  
  • To scan all log directories of a given session (represented by a session_id) for errors:
    $ adopscanlog -session_id=<number>  
  • To see a complete list of supported parameters:
    $ adopscanlog -help
    
    

6.5 adzdshowlog.out

This reports the contents of the AD_ZD_LOGS table, which contains messages on the progress of online patching, with timestamps. The contents of this table are truncated every time the cleanup/prepare phase is run.

This can also be obtained for previous phases by running the following script:

$ sqlplus apps @$AD_TOP/sql/ADZDSHOWLOG.sql 

Or running the SQL

SELECT * FROM ad_zd_logs ORDER BY log_sequence desc;

6.6 Check the current status of the adop cycle

To check the status of the current ADOP cycle, source the run file system environment file and run the command:

adop -status  

Usage:

adop -status generates a summary report

adop -status <sessionID> generates a summary report for that session ID

adop -status -detail generates a detailed report

 

6.7 SQL to determine status of ADOP phases

The following SQL statement will show the status for each adop phase along with its corresponding session id.

SELECT adop_session_id, prepare_status, apply_status, finalize_status, 
       cutover_status, cleanup_status, abort_status, status, abandon_flag, node_name 
FROM ad_adop_sessions 
ORDER BY adop_session_id;

This is effectively a history of online patching in an environment.

The following statuses apply to all phases:

Y : the phase is done

N : the phase has not been completed

X : the phase is not applicable

R : the phase is running (in progress)

F : the phase has failed

P : (applicable only to the APPLY phase) at least one patch has already been applied for the session ID

C : the ADOP session has completed

Note: Numerical statuses are only relevant for the cutover phase. These status values are updated when a step has completed, and are as follows:

0 : cutover/force_shutdown has started

1 : "force_shutdown" step has successfully executed

3 : "db_cutover" step has successfully executed

4 : "fs_cutover" step has successfully executed

6 : "force_startup" step has successfully executed

Cutover statuses

cutover_status='Y' 'COMPLETED'

cutover_status not in ('N','Y','X') and status='F' 'FAILED'

cutover_status='0' 'CUTOVER STARTED'

cutover_status='1' 'SERVICES SHUTDOWN COMPLETED'

cutover_status='3' 'DB CUTOVER COMPLETED'

cutover_status='D' 'FLIP SNAPSHOTS COMPLETED'

cutover_status='4' 'FS CUTOVER COMPLETED'

cutover_status='5' 'ADMIN STARTUP COMPLETED'

cutover_status='6' 'SERVICES STARTUP COMPLETED'

cutover_status='N' 'NOT STARTED'

cutover_status='X' 'NOT APPLICABLE'
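For quick reference when reading AD_ADOP_SESSIONS output by eye, the legend above can be expressed as a small lookup, sketched here in shell (the function name is illustrative):

```shell
# Decode a single cutover_status code from AD_ADOP_SESSIONS.
# FAILED is not a single code: it is cutover_status NOT IN ('N','Y','X')
# combined with status='F', so it is not decoded here.
decode_cutover_status() {
  case "$1" in
    Y) echo "COMPLETED" ;;
    0) echo "CUTOVER STARTED" ;;
    1) echo "SERVICES SHUTDOWN COMPLETED" ;;
    3) echo "DB CUTOVER COMPLETED" ;;
    D) echo "FLIP SNAPSHOTS COMPLETED" ;;
    4) echo "FS CUTOVER COMPLETED" ;;
    5) echo "ADMIN STARTUP COMPLETED" ;;
    6) echo "SERVICES STARTUP COMPLETED" ;;
    N) echo "NOT STARTED" ;;
    X) echo "NOT APPLICABLE" ;;
    *) echo "UNKNOWN ($1)" ;;
  esac
}

decode_cutover_status 3
```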

 

7. Fixed Object and Dictionary Statistics

There can sometimes be issues with Fixed Object and Dictionary Statistics in Online Patching Enablement, R12.2.n RUPs (online patching) or when producing AWR reports or SQLT.

A fixed object (X$ table) resides in memory only, and typically records information about the instance or memory structures. The V$ dynamic performance views are defined on top of X$ tables, e.g. V$SQL and V$SQL_PLAN.

Data dictionary tables (e.g. SYS.USER$, SYS.TS$, SYS.SEG$, SYS.OBJ$, SYS.TAB$, SYS.FILE$) are stored in data files, like normal application tables.

Some fixed objects and dictionary objects will have grown significantly during the upgrade.

There are also changes to fixed objects due to Online Patching Enablement (and the underlying Edition-Based Redefinition). As a result, internal SQL in Online Patching Enablement, R12.2.n RUPs and other online patches can sometimes be long running. Gathering fixed object statistics can help in these circumstances, particularly for editioning objects.

There may be additional specific circumstances during the upgrade where fixed object or dictionary statistics need to be gathered (such as before importing schema statistics or running SQLT or AWR reports when AWR has grown significantly).

If there is internal SQL (on V$ views or on SYS/SYSTEM objects) appearing high in AWR and TKPROF reports, it is likely that dictionary and fixed object statistics need to be gathered.

Note that the FND_STATS API does not gather statistics for dictionary or fixed objects. The DBMS_STATS APIs need to be used.

The commands are:

execute DBMS_STATS.GATHER_FIXED_OBJECTS_STATS(no_invalidate=>FALSE);
execute DBMS_STATS.GATHER_DICTIONARY_STATS(estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE, options=>'GATHER AUTO', no_invalidate=>FALSE);

Usually the "no_invalidate=>FALSE" argument will not be needed. However, the procedures DBMS_STATS.SET_DATABASE_PREFS, SET_GLOBAL_PREFS, SET_SCHEMA_PREFS or SET_TABLE_PREFS could have been used to set the default value of NO_INVALIDATE to TRUE.

If there are only a handful of internal SQLs with inefficient execution plans and only a few objects then specific objects could be targeted rather than gathering all dictionary or fixed object statistics.

e.g.

exec dbms_stats.gather_table_stats(ownname=>'SYS', tabname=>'OBJ$', no_invalidate=>false); 
exec dbms_stats.gather_table_stats(ownname=>'SYS', tabname=>'X$KQFP', no_invalidate=>false);

See the following My Oracle Support documents for more information:

  • Best Practices for Gathering Statistics with Oracle E-Business Suite (Document 1586374.1)

  • Fixed Objects Statistics (GATHER_FIXED_OBJECTS_STATS) Considerations (Document 798257.1)
