如何使用 Oracle 19C 的 EBS 自动配置中的慢查询
我们最近完成了一些 eBS R12.2 数据库升级安装,并注意到作为 AutoConfig 的一部分,运行特定 SQL 存在问题。
查询 sql_id 是 5da8tjzfafw6j
,它看起来像以下这样(格式化):
select
uo.object_name,
aa.object_name procedure_name,
aa.ARGUMENT_NAME
from
user_objects uo,
user_arguments aa
where
uo.object_name = 'FND_GSM_UTIL' and
aa.object_id = uo.object_id
and aa.object_name = 'UPLOAD_CONTEXT_FILE' and
object_type = 'PACKAGE'
order by 1,2,3
该查询返回FND_GSM_UTIL.UPLOAD_CONTEXT_FILE
过程的参数。
这显然是一个次优代码:AutoConfig 应该只运行它需要运行的过程,并在缺少过程的情况下处理异常。
查询执行时间长达 20 分钟,这比数据库升级前的正常 AutoConfig 运行时间要长得多。
SQL Monitor 报告如下所示:
Global Stats
=================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
=================================================================
| 999 | 24 | 976 | 1 | 402K | 95653 | 1GB |
=================================================================
Parallel Execution Details (DOP=2 , Servers Allocated=4)
============================================================================================================================
| Name | Type | Server# | Elapsed | Cpu | IO | Buffer | Read | Read | Wait Events |
| | | | Time(s) | Time(s) | Waits(s) | Gets | Reqs | Bytes | (sample #) |
============================================================================================================================
| PX Coordinator | QC | | 0.15 | 0.06 | 0.09 | 24 | 8 | 65536 | |
| p000 | Set 1 | 1 | 0.06 | 0.06 | | | | . | |
| p001 | Set 1 | 2 | 0.06 | 0.06 | | | | . | |
| p002 | Set 2 | 1 | 0.06 | 0.06 | | | | . | |
| p003 | Set 2 | 2 | 999 | 23 | 976 | 402K | 95645 | 1GB | log buffer space (1) |
| | | | | | | | | | db file parallel read (534) |
| | | | | | | | | | db file scattered read (2) |
| | | | | | | | | | db file sequential read (408) |
============================================================================================================================
============================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Mem | Activity
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (Max) | (%)
============================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 1 | +993 | 5 | 5 | | | . |
| 1 | PX COORDINATOR | | | | 1 | +993 | 5 | 5 | | | . |
| 2 | PX SEND QC (ORDER) | :TQ10003 | 2 | 22 | 1 | +993 | 2 | 5 | | | . |
| 3 | SORT ORDER BY | | 2 | 22 | 1 | +993 | 2 | 5 | | | 4096 |
| 4 | PX RECEIVE | | 2 | 21 | 1 | +993 | 2 | 5 | | | . |
| 5 | PX SEND RANGE | :TQ10002 | 2 | 21 | 1 | +993 | 2 | 5 | | | . |
| 6 | HASH JOIN BUFFERED | | 2 | 21 | 992 | +2 | 2 | 5 | | | 2MB |
| 7 | BUFFER SORT | | | | 992 | +2 | 2 | 2 | | | 4096 |
| 8 | PX RECEIVE | | 2 | 18 | 992 | +2 | 2 | 2 | | | . |
| 9 | PX SEND HYBRID HASH | :TQ10000 | 2 | 18 | 1 | +993 | 1 | 2 | | | . |
| 10 | STATISTICS COLLECTOR | | | | 1 | +993 | 1 | 1 | | | . |
| 11 | VIEW | USER_OBJECTS | 2 | 18 | 1 | +993 | 1 | 1 | | | . |
...
| 44 | PX RECEIVE | | 2 | 3 | | | 2 | | | | . |
| 45 | PX SEND HYBRID HASH | :TQ10001 | 2 | 3 | 1 | +993 | 2 | 0 | | | . |
| 46 | PX PARTITION LIST ALL | | 2 | 3 | 1 | +993 | 2 | 5 | | | . |
| 47 | EXTENDED DATA LINK FULL | INT$DBA_ARGUMENTS | 2 | 3 | 992 | +2 | 2 | 5 | | | . |
===========================================================================================================================================================================
为了可读性,我特意从计划中删除了几行。
我们可以看到查询是并行运行的,并行度等于 2。查询分配了四个从服务器来完成 user_objects 和 user_arguments 之间的哈希连接,大部分时间都花在访问 INT$DBA_ARGUMENTS 上。
这个 INT$ 扩展数据链接视图是我以前从未见过的。据我了解,这个视图就像一个 GV$ 视图,除了它对所有容器运行一些东西。
检查 PX 从站是否正在运行的 SQL:
SQL> select p.spid, p.pname, sql_id from v$session s, v$process p where s.status='ACTIVE' and s.username='APPS' and s.paddr=p.addr;
SPID PNAME SQL_ID
------------------------ ----- -------------
47661 P000 5da8tjzfafw6j
47663 P001 5da8tjzfafw6j
47673 P002 5da8tjzfafw6j
47678 P003 5da8tjzfafw6j
56976 5da8tjzfafw6j
但是如果我们通过 oradebug 去检查 P003 的运行 SQL,它看起来会有所不同:
SQL> oradebug setospid 47678
Oracle pid: 81, Unix process pid: 47678, image: oracle@dbhost (P003)
SQL> oradebug dump errorstack 3
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_p003_47678.trc
这是 P003 运行的查询 sql_id=ctvujtx8491t7
(格式化):
SELECT /*+ NO_STATEMENT_QUEUING RESULT_CACHE (SYSOBJ=TRUE) OPT_PARAM('_ENABLE_VIEW_PDB', 'FALSE') */
"OWNER",
"OBJECT_NAME",
"OBJECT_ID",
"ARGUMENT_NAME"
FROM
NO_CROSS_CONTAINER("SYS"."INT$DBA_ARGUMENTS") "INT$DBA_ARGUMENTS"
WHERE
"INT$DBA_ARGUMENTS"."OBJECT_NAME"='UPLOAD_CONTEXT_FILE' AND
"INT$DBA_ARGUMENTS"."OWNER"=q'"APPS"'
查询由扩展数据链接视图生成,并在 PDB 中运行。
事实上,这个查询运行时间很长,是导致 INT$DBA_ARGUMENTS 视图访问缓慢的原因。
这是我在启用运行时统计信息收集的情况下手动运行 sql_id=ctvujtx8491t7 的计划:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 |00:19:52.76 | 392K| 33448 | | | |
| 1 | RESULT CACHE | 6z4f6jmj63bxh4w12xbkc9yzyb | 1 | | 5 |00:19:52.76 | 392K| 33448 | 65536 | 1024 | |
|* 2 | FILTER | | 1 | | 5 |00:19:52.76 | 392K| 33448 | | | |
| 3 | NESTED LOOPS | | 1 | 285 | 5 |00:19:52.76 | 392K| 33448 | | | |
| 4 | NESTED LOOPS | | 1 | 28518 | 6488K|00:06:57.84 | 170K| 11027 | | | |
|* 5 | HASH JOIN | | 1 | 194 | 85018 |00:00:00.91 | 3284 | 0 | 9307K| 2811K| 9598K (0)|
| 6 | NESTED LOOPS | | 1 | 194 | 85018 |00:00:00.45 | 3279 | 0 | | | |
| 7 | TABLE ACCESS BY INDEX ROWID | USER$ | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | | | |
|* 8 | INDEX UNIQUE SCAN | I_USER1 | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | |
| 9 | TABLE ACCESS BY INDEX ROWID BATCHED| OBJ$ | 1 | 194 | 85018 |00:00:00.30 | 3276 | 0 | | | |
|* 10 | INDEX RANGE SCAN | I_OBJ5 | 1 | 194 | 85018 |00:00:00.12 | 1016 | 0 | | | |
| 11 | INDEX FULL SCAN | I_USER2 | 1 | 577 | 577 |00:00:00.01 | 5 | 0 | | | |
|* 12 | INDEX RANGE SCAN | I_ARGUMENT2 | 85018 | 147 | 6488K|00:06:43.03 | 167K| 11027 | | | |
|* 13 | TABLE ACCESS BY INDEX ROWID | ARGUMENT$ | 6488K| 1 | 5 |00:12:34.33 | 222K| 22421 | | | |
|* 14 | TABLE ACCESS BY INDEX ROWID BATCHED | USER_EDITIONING$ | 1 | 1 | 1 |00:00:00.01 | 4 | 0 | | | |
|* 15 | INDEX RANGE SCAN | I_USER_EDITIONING | 1 | 12 | 7 |00:00:00.01 | 2 | 0 | | | |
|* 16 | TABLE ACCESS BY INDEX ROWID BATCHED | USER_EDITIONING$ | 1 | 1 | 1 |00:00:00.01 | 4 | 0 | | | |
|* 17 | INDEX RANGE SCAN | I_USER_EDITIONING | 1 | 12 | 7 |00:00:00.01 | 2 | 0 | | | |
| 18 | NESTED LOOPS SEMI | | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 19 | INDEX RANGE SCAN | I_OBJ4 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 20 | INDEX RANGE SCAN | I_USER2 | 0 | 192 | 0 |00:00:00.01 | 0 | 0 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter((BITAND("O"."FLAGS",1048576)=1048576 OR BITAND("U"."SPARE1",16)=0 OR ("O"."TYPE#"<>88 AND IS NULL) OR ( IS NOT NULL AND (("U"."TYPE#"=2 AND
"U"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id'))) OR ("U"."TYPE#"<>2 AND SYS_CONTEXT('userenv','current_edition_name')='ORA$BASE') OR IS
NOT NULL))))
5 - access("O"."OWNER#"="U"."USER#")
8 - access("U"."NAME"='APPS')
10 - access("O"."SPARE3"="U"."USER#")
filter((INTERNAL_FUNCTION("O"."TYPE#") OR ("O"."TYPE#"=13 AND "O"."SUBNAME" IS NULL)))
12 - access("O"."OBJ#"="A"."OBJ#")
13 - filter(NVL("A"."PROCEDURE
大部分时间都花在通过 I_ARGUMENT2 索引访问 ARGUMENT$。它运行缓慢,因为索引访问返回 650 万行,其中大部分在表访问后被丢弃。这意味着,APPS 用户的所有过程参数都被一一检查,并且只有 UPLOAD_CONTEXT_FILE 的行在过滤后被进一步发送到执行堆栈。
如果 ARGUMENT$ 数据/索引没有被缓存,这样的 SQL 就不能快速运行,这会导致读取多达 392K 块。缓存数据后,查询会在几秒钟内运行——在我的测试中至少需要 3 秒。
顺便说一句,这样的问题一直在发生:一些次优的查询运行时间很大程度上取决于缓存了多少数据,并且可以在几秒钟或几分钟内“偶尔”运行。
有没有办法修复 sql_id=5da8tjzfafw6j?就在这里。从 Oracle 19.10 开始,有一个新参数 CONTAINER_DATA 可以设置为 CURRENT_DICTIONARY。这样的更改使原始查询立即运行,只有 47 个得到相同的结果:
select /*+ opt_param('CONTAINER_DATA' 'CURRENT_DICTIONARY') */
uo.object_name,
aa.object_name procedure_name,
aa.ARGUMENT_NAME
from
user_objects uo,
user_arguments aa
where
uo.object_name = 'FND_GSM_UTIL' and
aa.object_id = uo.object_id
and aa.object_name = 'UPLOAD_CONTEXT_FILE' and
object_type = 'PACKAGE'
order by 1,2,3
Plan hash value: 1041619941
---------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 |00:00:00.01 | 47 | | | |
| 1 | SORT ORDER BY | | 1 | 1 | 5 |00:00:00.01 | 47 | 2048 | 2048 | 2048 (0)|
|* 2 | FILTER | | 1 | | 5 |00:00:00.01 | 47 | | | |
...
我想将此提示添加为 SQL 补丁来修复此 SQL ,并避免对其他查询的潜在影响。但是,SQL Patch 不起作用:它继续使用扩展的数据链接视图。这让我认为 CONTAINER_DATA 只能在会话/实例级别更改。由于不确定eBS是否支持,将参数实例设置为宽有点危险。我有一个关于在 eBS 中使用 CONTAINER_DATA
的 SR 开放,Oracle 支持 - 还没有反馈。
概括
- 如果您已将 eBS DB 升级到 19c,您可能会注意到由于 SQL 欠佳而导致 AutoConfig 前置步骤挂起。
- 如果您的硬件足够快,可以在几秒钟内完成 10 万次随机 IO,您可能不会注意到任何问题。
- 如果大部分数据已经缓存,您可能也不会注意到 SQL。
- 有一个参数
CONTAINER_DATA
可能可以针对 eBS 19.10+ 数据库实例进行更改,以完全避免遇到此问题,但 Oracle 尚未正式支持(可能)。