Objective: Assess the impact of HASH_JOIN on SWAP and TEMP Tablespace
In our previous blog, we noticed how a HASH_JOIN on 2 large tables (choice of optimizer) impacted the Memory, SWAP and Temp tablespaces.
In this blog, we are going to see possible ways to work around the impact and at the same time deliver the business demand (which is a successful query execution).
To know more about the test setup and instrumentations used for this blog, please look @ part1 blog.
Option or Fix 1:
Reduce the rowset to be processed by a filter clause on the join column.
Original Query:
select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;
Fix1 - query:
select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
and t1.id <= 1000000
;
Output:
If we reduce the result set; it worked fine.
20:27:17 SQL> select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
and t1.id <= 1000000
;20:33:35 2 20:33:35 3 20:33:35 4 20:33:35 5 20:33:35 6 20:33:35 7
1000000 rows selected.
Elapsed: 00:00:17.39
Statistics
----------------------------------------------------------
79 recursive calls
150 db block gets
464961 consistent gets
413366 physical reads
0 redo size
92278464 bytes sent via SQL*Net to client
733780 bytes received via SQL*Net from client
66668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000000 rows processed
20:33:53 SQL>
SQL> SELECT dbms_sqltune.Report_sql_monitor(SQL_ID=>'4j5fnm8zg0y3y', TYPE=>'text',report_level=>'ALL')
2 from dual;
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id and t1.id <= 1000000
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 4j5fnm8zg0y3y
SQL Execution ID : 16777217
Execution Started : 05/14/2023 20:33:35
First Refresh Time : 05/14/2023 20:33:35
Last Refresh Time : 05/14/2023 20:33:53
Duration : 18s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 66668
Global Stats
================================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes | Reqs | Bytes |
================================================================================
| 13 | 4.91 | 8.28 | 66668 | 465K | 3461 | 3GB | 271 | 66MB |
================================================================================
SQL Plan Monitoring Details (Plan Hash Value=913581377)
====================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (Max) | (%) | (# samples) |
====================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 17 | +2 | 1 | 1M | | | | | . | . | | |
| 1 | HASH JOIN | | 1M | 117K | 17 | +2 | 1 | 1M | 271 | 66MB | 271 | 66MB | 104MB | 75MB | 20.00 | Cpu (1) |
| | | | | | | | | | | | | | | | | direct path read temp (1) |
| 2 | TABLE ACCESS FULL | JT1 | 1M | 55248 | 5 | +2 | 1 | 1M | 1595 | 2GB | | | . | . | 40.00 | direct path read (4) |
| 3 | TABLE ACCESS FULL | JT2 | 1M | 55248 | 12 | +6 | 1 | 1M | 1595 | 2GB | | | . | . | 40.00 | direct path read (4) |
====================================================================================================================================================================================
SQL>
Result:
The query finishes fine with same plan. /oradata didnt grew (no growth in temp tablespace). There isnt much activity in vmstat or iostat as well.
Query optimized, now we just need to rotate it over next 1M rows in batches to get the job done.
Charts:
Option 2:
Let us try set hash_area_size to 400M (PGA-500M).
alter session set workarea_size_policy='MANUAL';
alter session set hash_area_size=419430400;
Query:
select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;
Output:
20:55:04 SQL> set autotrace trace statistics
20:55:10 SQL> select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;20:58:12 2 20:58:12 3 20:58:12 4 20:58:12 5 20:58:12 6
ERROR:
ORA-27072: File I/O error
Linux-x86_64 Error: 28: No space left on device
Additional information: 4
Additional information: 263687
Additional information: 4294967295
ORA-01114: IO error writing block to file 203 (block # 263559)
...
ORA-01114: IO error writing block to file 203 (block # 262535)
ORA-27072: File I/O error
Linux-x86_64 Error: 28: No space left on device
Additional information: 4
Additional information: 262535
Additional information: 4294967295
ORA-01114: IO error writing block to file 203 (
1508775 rows selected.
Elapsed: 00:01:47.79
Statistics
----------------------------------------------------------
2123 recursive calls
10318 db block gets
399048 consistent gets
299532 physical reads
0 redo size
142075417 bytes sent via SQL*Net to client
1106845 bytes received via SQL*Net from client
100586 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1508775 rows processed
21:00:01 SQL>
21:03:49 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +cost +bytes'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6hz2uy35hb6jy, child number 0
-------------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Plan hash value: 913581377
---------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 226K(100)| 1508K|00:01:36.17 | 409K| 299K| 266K| | | |
|* 1 | HASH JOIN | | 1 | 25M| 2593M| 1573M| 226K (1)| 1508K|00:01:36.17 | 409K| 299K| 266K| 2478M| 41M| 366M (1)|
| 2 | TABLE ACCESS FULL| JT1 | 1 | 25M| 1287M| | 55303 (1)| 25M|00:00:03.42 | 202K| 202K| 0 | | | |
| 3 | TABLE ACCESS FULL| JT2 | 1 | 25M| 1287M| | 55303 (1)| 12M|00:00:02.30 | 196K| 97049 | 0 | | | |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."ID"="T2"."ID")
20 rows selected.
Elapsed: 00:00:00.16
21:04:00 SQL>
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Error: ORA-1114
------------------------------
ORA-01114: IO error writing block to file 203 (block # 262407)
ORA-27072: File I/O error
Linux-x86_64 Error: 28: No space left on device
Additional information: 4
Additional information: 262407
Additional information: 4294967295
Global Information
------------------------------
Status : DONE (ERROR)
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 6hz2uy35hb6jy
SQL Execution ID : 16777218
Execution Started : 05/14/2023 20:58:13
First Refresh Time : 05/14/2023 20:58:13
Last Refresh Time : 05/14/2023 21:00:00
Duration : 107s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 100587
Global Stats
==========================================================================================================
| Elapsed | Cpu | IO | Concurrency | Other | Fetch | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes | Reqs | Bytes |
==========================================================================================================
| 98 | 28 | 55 | 0.04 | 15 | 101K | 409K | 28675 | 3GB | 16282 | 2GB |
==========================================================================================================
SQL Plan Monitoring Details (Plan Hash Value=913581377)
===========================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (Max) | (%) | (# samples) |
===========================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 106 | +2 | 1 | 2M | | | | | . | . | 70.83 | Cpu (13) |
| | | | | | | | | | | | | | | | | enq: CF - contention (1) |
| | | | | | | | | | | | | | | | | control file parallel write (18) |
| | | | | | | | | | | | | | | | | db file single write (3) |
| | | | | | | | | | | | | | | | | local write wait (33) |
| 1 | HASH JOIN | | 25M | 227K | 108 | +0 | 1 | 2M | | | 2116 | 2GB | 367MB | 2GB | 23.96 | Cpu (14) |
| | | | | | | | | | | | | | | | | direct path write temp (9) |
| 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 61 | +2 | 1 | 25M | 1595 | 2GB | | | . | . | 4.17 | Cpu (2) |
| | | | | | | | | | | | | | | | | direct path read (2) |
| 3 | TABLE ACCESS FULL | JT2 | 25M | 55303 | 45 | +63 | 1 | 12M | 769 | 758MB | | | . | . | 1.04 | Cpu (1) |
===========================================================================================================================================================================================
So this option doesnt work as well. Query ran from 20:58:00 to 21:00:01
Observation:
Query did try fetch the records unlike original case where it failed even without fetching any records, but it failed due to insufficient temp space. Ok, anyway I am not allocating more space to /oradata (though I have 40GB more room). I would like to try fixing the plan.No SI or SO (paging activity) still.
So oracle is doing a good job in keeping up with memory demand.
[oracle@vcentos79-oracle-sa1 ~]$ free -k
total used free shared buff/cache available
Mem: 2914064 1126480 21928 1492496 1765656 256444
Swap: 3145724 11008 3134716
[oracle@vcentos79-oracle-sa1 ~]$
The query needs 2.5GB memory for 0 pass (in memory join). I do have swap of 3GB. So I can give it a try. Note, this isnt a solution. I am going to try test if 2.5G is good enough for this query. So I will upsize the hash_area_size to 3000GB (so part of it will come from memory and rest will come from swap).
alter session set hash_area_size = 2147483647; -- that is 1.999GB, which is the max we can set for hash_area_size, any value more than this will fail with ORA-02017 error
21:14:45 SQL> sho parameter hash_area_size
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hash_area_size integer 2147483647
21:14:52 SQL>
Ok, let us give it a try with 2GB, rest to go to anyway temp datafile.
Preset /oradata fs:
SQL> alter tablespace TEMP shrink space keep 100M;
SQL>
[oracle@vcentos79-oracle-sa1 ORA19C]$ df -h /oradata
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/appdata--vg-oradata--lv 10G 8.2G 1.9G 82% /oradata
[oracle@vcentos79-oracle-sa1 ORA19C]$
set autotrace trace statistics
select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;
Output:
21:14:52 SQL> set autotrace trace statistics
21:20:45 SQL> select /*+ monitor */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;21:21:00 2 21:21:00 3 21:21:00 4 21:21:00 5 21:21:00 6
25000000 rows selected.
Elapsed: 00:07:38.45
Statistics
----------------------------------------------------------
411 recursive calls
2375 db block gets
1851624 consistent gets
456625 physical reads
0 redo size
2355076217 bytes sent via SQL*Net to client
18333759 bytes received via SQL*Net from client
1666699 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
25000000 rows processed
21:28:39 SQL> set autotrace off
21:30:49 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +cost +bytes'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6hz2uy35hb6jy, child number 1
-------------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Plan hash value: 913581377
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 110K(100)| 25M|00:04:29.11 | 1853K| 456K| 51660 | | | | |
|* 1 | HASH JOIN | | 1 | 25M| 2593M| 110K (1)| 25M|00:04:29.11 | 1853K| 456K| 51660 | 2478M| 41M| 2060M (1)| 410M|
| 2 | TABLE ACCESS FULL| JT1 | 1 | 25M| 1287M| 55303 (1)| 25M|00:01:28.02 | 202K| 202K| 0 | | | | |
| 3 | TABLE ACCESS FULL| JT2 | 1 | 25M| 1287M| 55303 (1)| 25M|00:00:13.47 | 1649K| 202K| 0 | | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."ID"="T2"."ID")
20 rows selected.
Elapsed: 00:00:00.69
21:30:56 SQL>
live execution sample 1:
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 6hz2uy35hb6jy
SQL Execution ID : 16777219
Execution Started : 05/14/2023 21:21:00
First Refresh Time : 05/14/2023 21:21:00
Last Refresh Time : 05/14/2023 21:23:57
Duration : 176s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Global Stats
=================================================================================================
| Elapsed | Cpu | IO | Concurrency | Other | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Gets | Reqs | Bytes | Reqs | Bytes |
=================================================================================================
| 179 | 16 | 78 | 0.19 | 84 | 191K | 1611 | 1GB | 161 | 104MB |
=================================================================================================
SQL Plan Monitoring Details (Plan Hash Value=913581377)
=====================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | | | (%) | (# samples) | |
=====================================================================================================================================================================================================
| -> 0 | SELECT STATEMENT | | | | 180 | +2 | 1 | 0 | | | | | . | . | 27.19 | Cpu (1) | |
| | | | | | | | | | | | | | | | | control file parallel write (1) | |
| | | | | | | | | | | | | | | | | Disk file operations I/O (1) | |
| | | | | | | | | | | | | | | | | local write wait (28) | |
| -> 1 | HASH JOIN | | 25M | 111K | 180 | +2 | 1 | 0 | | | 116 | 114MB | 1GB | 118MB | 31.58 | Cpu (34) | |
| | | | | | | | | | | | | | | | | direct path write temp (2) | |
| -> 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 181 | +1 | 1 | 24M | 1515 | 1GB | | | . | . | 41.23 | Cpu (33) | 95% |
| | | | | | | | | | | | | | | | | resmgr:cpu quantum (2) | |
| | | | | | | | | | | | | | | | | direct path read (12) | |
| 3 | TABLE ACCESS FULL | JT2 | 25M | 55303 | | | | | | | | | . | . | | | |
=====================================================================================================================================================================================================
SQL> SQL>
The query is using paging disk now!!!
I started seeing my sessions terminated as expected!!!!
Look sessions were getting killed by OS!!!
SQL> SQL> /
SELECT dbms_sqltune.Report_sql_monitor(SQL_ID=>'6hz2uy35hb6jy', TYPE=>'text',report_level=>'ALL')
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 32348
Session ID: 103 Serial number: 11093
SQL>
So this is the effect of using swap!! OS will start randomly killing sessions!
Live sample 2:
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 6hz2uy35hb6jy
SQL Execution ID : 16777219
Execution Started : 05/14/2023 21:21:00
First Refresh Time : 05/14/2023 21:21:00
Last Refresh Time : 05/14/2023 21:27:25
Duration : 384s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 1152782
Global Stats
=========================================================================================================
| Elapsed | Cpu | IO | Concurrency | Other | Fetch | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes | Reqs | Bytes |
=========================================================================================================
| 269 | 38 | 93 | 0.53 | 137 | 1M | 2M | 6339 | 3GB | 2242 | 388MB |
=========================================================================================================
SQL Plan Monitoring Details (Plan Hash Value=913581377)
=====================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | | | (%) | (# samples) | |
=====================================================================================================================================================================================================
| -> 0 | SELECT STATEMENT | | | | 393 | +2 | 1 | 18M | | | | | . | . | 35.59 | Cpu (19) | |
| | | | | | | | | | | | | | | | | SQL*Net message to client (1) | |
| | | | | | | | | | | | | | | | | control file parallel write (5) | |
| | | | | | | | | | | | | | | | | Disk file operations I/O (2) | |
| | | | | | | | | | | | | | | | | local write wait (36) | |
| -> 1 | HASH JOIN | | 25M | 111K | 393 | +2 | 1 | 18M | | | 368 | 362MB | 2GB | 369MB | 34.46 | Cpu (53) | |
| | | | | | | | | | | | | | | | | acknowledge over PGA limit (2) | |
| | | | | | | | | | | | | | | | | direct path write temp (6) | |
| 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 218 | +1 | 1 | 25M | 1595 | 2GB | | | . | . | 27.12 | Cpu (34) | 100% |
| | | | | | | | | | | | | | | | | resmgr:cpu quantum (2) | |
| | | | | | | | | | | | | | | | | direct path read (12) | |
| -> 3 | TABLE ACCESS FULL | JT2 | 25M | 55303 | 177 | +218 | 1 | 20M | 1276 | 1GB | | | . | . | 2.82 | Cpu (1) | 80% |
| | | | | | | | | | | | | | | | | resmgr:cpu quantum (3) | |
| | | | | | | | | | | | | | | | | direct path read (1) | |
=====================================================================================================================================================================================================
Live sample 3 (query finished):
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 6hz2uy35hb6jy
SQL Execution ID : 16777219
Execution Started : 05/14/2023 21:21:00
First Refresh Time : 05/14/2023 21:21:00
Last Refresh Time : 05/14/2023 21:28:39
Duration : 459s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 1666668
Global Stats
=========================================================================================================
| Elapsed | Cpu | IO | Concurrency | Other | Fetch | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes | Reqs | Bytes |
=========================================================================================================
| 292 | 45 | 101 | 0.55 | 145 | 2M | 2M | 7643 | 4GB | 2587 | 435MB |
=========================================================================================================
SQL Plan Monitoring Details (Plan Hash Value=913581377)
==========================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (Max) | (%) | (# samples) |
==========================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 454 | +2 | 1 | 25M | | | | | . | . | | |
| 1 | HASH JOIN | | 25M | 111K | 457 | +2 | 1 | 25M | 410 | 404MB | 410 | 404MB | 2GB | 410MB | | |
| 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 218 | +1 | 1 | 25M | 1595 | 2GB | | | . | . | | |
| 3 | TABLE ACCESS FULL | JT2 | 25M | 55303 | 215 | +218 | 1 | 25M | 1595 | 2GB | | | . | . | | |
==========================================================================================================================================================================
SQL>
Observations:
1. /oradata fs free space reduced from 1.9G to 1.6GB (300MB additional demand + 100MB of what was already left in TEMP tablespace!)
2. OS randomly killing sessions, as soon as paging started - which is quite risky.
3. The Memory allocation started by freeing up FS cache this oracle process and then once all memory is consumed, it started paging.
Query ran from 21:21 to 21:29
Charts:
Option 3:
Let us see if forcing Nested Loop can help now.
temp tablespace situation:
SQL> select con_id,tablespace_name,sum(bytes)/1024/1024,sum(maxbytes)/1024/1024 from cdb_temp_files group by con_id,tablespace_name;
CON_ID TABLESPACE_NAME SUM(BYTES)/1024/1024 SUM(MAXBYTES)/1024/1024
---------- ------------------------------ -------------------- -----------------------
1 TEMP 31.9921875 32767.9844
3 TEMP 411 32767.9844 <<<<<<<<< our concern
SQL>
[oracle@vcentos79-oracle-sa1 ORA19C]$ df -h /oradata
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/appdata--vg-oradata--lv 10G 8.5G 1.6G 85% /oradata <<< 1.6GB free
[oracle@vcentos79-oracle-sa1 ORA19C]$
22:06:39 SQL> sho parameter workarea
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
workarea_size_policy string AUTO
22:06:45 SQL> sho parameter hash_area
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hash_area_size integer 131072
22:06:50 SQL>
OS Monitors ON (vmstat and iostat)
NL with t2 on t1 or t1 on t2 doesnt make any difference, since both of this tables are of same size.
set autotrace trace statistics
select /*+ monitor use_nl(t2 t1) */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;
SELECT dbms_sqltune.Report_sql_monitor(SQL_ID=>'742f55pj83rmf', TYPE=>'text',report_level=>'ALL') FROM dual;
live sample 1:
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 742f55pj83rmf
SQL Execution ID : 16777216
Execution Started : 05/14/2023 22:07:52
First Refresh Time : 05/14/2023 22:07:52
Last Refresh Time : 05/14/2023 22:09:49
Duration : 118s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 6
Global Stats
================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
================================================================
| 134 | 109 | 25 | 6 | 17M | 130K | 126GB |
================================================================
SQL Plan Monitoring Details (Plan Hash Value=3927851833)
===================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (%) |(# samples) |
===================================================================================================================================================
| -> 0 | SELECT STATEMENT | | | | 118 | +0 | 1 | 82 | | | | |
| -> 1 | NESTED LOOPS | | 25M | 1T | 118 | +0 | 1 | 82 | | | | |
| -> 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 118 | +0 | 1 | 81 | 2 | 224KB | | |
| -> 3 | TABLE ACCESS FULL | JT2 | 1 | 55343 | 119 | +0 | 82 | 82 | 130K | 126GB | 100.00 | Cpu (91) |
| | | | | | | | | | | | | resmgr:cpu quantum (4) |
| | | | | | | | | | | | | direct path read (23) |
===================================================================================================================================================
...
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 742f55pj83rmf
SQL Execution ID : 16777216
Execution Started : 05/14/2023 22:07:52
First Refresh Time : 05/14/2023 22:07:52
Last Refresh Time : 05/14/2023 22:11:51
Duration : 240s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 13
Global Stats
================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
================================================================
| 277 | 228 | 49 | 13 | 37M | 290K | 281GB |
================================================================
SQL Plan Monitoring Details (Plan Hash Value=3927851833)
==============================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (%) |(# samples) | |
==============================================================================================================================================================
| -> 0 | SELECT STATEMENT | | | | 240 | +0 | 1 | 182 | | | | | |
| -> 1 | NESTED LOOPS | | 25M | 1T | 240 | +0 | 1 | 182 | | | | | |
| -> 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 240 | +0 | 1 | 181 | 2 | 224KB | | | 0% |
| -> 3 | TABLE ACCESS FULL | JT2 | 1 | 55343 | 241 | +0 | 182 | 182 | 290K | 281GB | 100.00 | Cpu (187) | |
| | | | | | | | | | | | | resmgr:cpu quantum (4) | |
| | | | | | | | | | | | | direct path read (49) | |
==============================================================================================================================================================
The fact, this Nested Loop isnt going to finish anytime soon. Look it has read 281GB already and it has feteched only 181 records until now. we need 25M records fetched!
I dont see disk activity neither swap. I guess it cached the tables in memory. # of buffer gets is increasing with each sample.
All the waits are on CPU. For each 60 rows, its doing a buffer read of 20M!!!
Anyway this plan isnt going to break the system, let it run for sometime
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 742f55pj83rmf
SQL Execution ID : 16777216
Execution Started : 05/14/2023 22:07:52
First Refresh Time : 05/14/2023 22:07:52
Last Refresh Time : 05/14/2023 22:16:32
Duration : 521s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 23
Global Stats
================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
================================================================
| 548 | 422 | 126 | 23 | 68M | 535K | 518GB |
================================================================
SQL Plan Monitoring Details (Plan Hash Value=3927851833)
===============================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (%) | (# samples) | |
===============================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 521 | +0 | 1 | 336 | | | | | |
| 1 | NESTED LOOPS | | 25M | 1T | 521 | +0 | 1 | 336 | | | | | |
| -> 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 523 | +0 | 1 | 335 | 2 | 224KB | | | 0% |
| -> 3 | TABLE ACCESS FULL | JT2 | 1 | 55343 | 523 | +0 | 336 | 336 | 535K | 518GB | 100.00 | Cpu (356) | 100% |
| | | | | | | | | | | | | resmgr:cpu quantum (27) | |
| | | | | | | | | | | | | direct path read (137) | |
===============================================================================================================================================================
It did a bunch of physical IO now. Let us leave it for sometime to see if it makes 1% atleast?
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Global Information
------------------------------
Status : EXECUTING
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 742f55pj83rmf
SQL Execution ID : 16777216
Execution Started : 05/14/2023 22:07:52
First Refresh Time : 05/14/2023 22:07:52
Last Refresh Time : 05/14/2023 22:51:10
Duration : 2598s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 139
Global Stats
==============================================================================
| Elapsed | Cpu | IO | Concurrency | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
==============================================================================
| 3003 | 2441 | 562 | 0.10 | 139 | 421M | 3M | 3TB |
==============================================================================
SQL Plan Monitoring Details (Plan Hash Value=3927851833)
===============================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (%) | (# samples) | |
===============================================================================================================================================================
| -> 0 | SELECT STATEMENT | | | | 2599 | +0 | 1 | 2078 | | | | | |
| -> 1 | NESTED LOOPS | | 25M | 1T | 2599 | +0 | 1 | 2078 | | | | | |
| -> 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 2599 | +0 | 1 | 2077 | 3 | 344KB | | | 0% |
| -> 3 | TABLE ACCESS FULL | JT2 | 1 | 55343 | 2599 | +0 | 2078 | 2078 | 3M | 3TB | 100.00 | Cpu (1986) | 100% |
| | | | | | | | | | | | | resmgr:cpu quantum (35) | |
| | | | | | | | | | | | | direct path read (574) | |
===============================================================================================================================================================
close to 1hr past, not even 1% done.
22:07:08 SQL> select /*+ monitor use_nl(t2 t1) */ t1.*
,t2.*
from jt1 t1
,jt2 t2
where t1.id=t2.id
;22:07:16 2 22:07:16 3 22:07:16 4 22:07:16 5 22:07:16 6
^CERROR:
ORA-01013: user requested cancel of current operation
2100 rows selected.
Elapsed: 00:44:03.95
Statistics
----------------------------------------------------------
6 recursive calls
0 db block gets
428240139 consistent gets
428227443 physical reads
376 redo size
190714 bytes sent via SQL*Net to client
1987 bytes received via SQL*Net from client
141 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2100 rows processed
22:51:56 SQL>
22:55:54 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +cost +bytes'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 742f55pj83rmf, child number 0
-------------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2
where t1.id=t2.id
Plan hash value: 3927851833
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 1383G(100)| 2115 |00:44:01.33 | 428M| 428M|
| 1 | NESTED LOOPS | | 1 | 25M| 2593M| 1383G (1)| 2115 |00:44:01.33 | 428M| 428M|
| 2 | TABLE ACCESS FULL| JT1 | 1 | 25M| 1287M| 55303 (1)| 2115 |00:00:00.03 | 158 | 43 |
|* 3 | TABLE ACCESS FULL| JT2 | 2115 | 1 | 54 | 55343 (1)| 2115 |00:44:01.30 | 428M| 428M|
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("T1"."ID"="T2"."ID")
21 rows selected.
Elapsed: 00:00:00.76
22:56:03 SQL>
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor use_nl(t2 t1) */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id
Error: ORA-1013
------------------------------
ORA-01013: user requested cancel of current operation
Global Information
------------------------------
Status : DONE (ERROR)
Instance ID : 1
Session : SYS (1:9741)
SQL ID : 742f55pj83rmf
SQL Execution ID : 16777216
Execution Started : 05/14/2023 22:07:52
First Refresh Time : 05/14/2023 22:07:52
Last Refresh Time : 05/14/2023 22:51:56
Duration : 2644s
Module/Action : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-
Service : opdb1
Program : sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)
Fetch Calls : 142
Global Stats
==============================================================================
| Elapsed | Cpu | IO | Concurrency | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
==============================================================================
| 3058 | 2487 | 572 | 0.10 | 142 | 428M | 3M | 3TB |
==============================================================================
SQL Plan Monitoring Details (Plan Hash Value=3927851833)
==========================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (%) | (# samples) |
==========================================================================================================================================
| 0 | SELECT STATEMENT | | | | 2645 | +0 | 1 | 2115 | | | | |
| 1 | NESTED LOOPS | | 25M | 1T | 2645 | +0 | 1 | 2115 | | | | |
| 2 | TABLE ACCESS FULL | JT1 | 25M | 55303 | 2645 | +0 | 1 | 2114 | 3 | 344KB | | |
| 3 | TABLE ACCESS FULL | JT2 | 1 | 55343 | 2645 | +0 | 2115 | 2115 | 3M | 3TB | | |
==========================================================================================================================================
SQL>
>>>>>> I cancelled this operation, since it will cause havoc to my Memory & SSD with IOs, life will dimish for them!
Anyway this is ruled out as not a solution.
So we have 2 options..
1. Increase the physical memory and /oradata FS to accomadate the HASH_JOIN demand
2. Do the Join in batches.
This closes the blog.
I know this is a lengthy blog. But hope you enjoyed reading it.
Thanks
No comments:
Post a Comment