Monday, May 15, 2023

Oracle: HASH_JOIN - MEMORY, SWAP and TEMP Tablespace an impact analysis - Part1

Objective: Assess the impact of HASH_JOIN on SWAP and TEMP Tablespace

Find out how much disk IO and swapping is created when the Oracle's *_area_size is suboptimal.

For this test, we will have 2 test tables identical by defination and rowdata (atleast the join column). Both the tables were loaded with 25M records. When they are joined, they yield 25M records as well.

Current Setting:

CPU:2
Memory:3GB
SGA:1504M
PGA:500M
hash_area_size:131072
sort_area_size:65536
bitmap_area_size:1048576
workarea_size_policy: AUTO (PGA in effect)
Container deployment: yes
CON_ID: 3 (OPDB1)

Instrumentations used:

1. vmstat

2. iostat

3. ASH

4. dbms_sqltune

5. dbms_xplan

Test table creation:

1) Test table creation step

create table jt1

as
with generator as (
select rownum id
from dual
connect by
rownum <= 5000
)
select rownum id
,trunc((rownum - 1) / 1000) clustered
,mod(rownum - 1,1000) scattered
,trunc(dbms_random.value(0,1000)) randomised
,substr(dbms_random.string('U',12),1,12) random_string
,lpad(rownum,10) vc_small
,rpad('x',10,'x') vc_padding
from
generator g1,
generator g2
;

create table jt2
as
with generator as (
select rownum id
from dual
connect by
rownum <= 5000
)
select rownum id
,trunc((rownum - 1) / 1000) clustered
,mod(rownum - 1,1000) scattered
,trunc(dbms_random.value(0,1000)) randomised
,substr(dbms_random.string('U',12),1,12) random_string
,lpad(rownum,10) vc_small
,rpad('x',10,'x') vc_padding
from
generator g1,
generator g2
;

No indexes.

>>> Tables were created.

2) Dont change any of the db settings, run a query which joins both

alter session set statistics_level=all;

statistics_level string ALL

set autotrace trace statistics

select /*+ monitor */ t1.*

,t2.*

from jt1 t1

,jt2 t2

where t1.id=t2.id

;

Actual output:

19:04:06 SQL> select /*+ monitor */ t1.*

,t2.*

from jt1 t1

,jt2 t2

where t1.id=t2.id

;19:04:22 2 19:04:22 3 19:04:22 4 19:04:22 5 19:04:22 6

from jt1 t1

ERROR at line 3:

ORA-27072: File I/O error

Linux-x86_64 Error: 28: No space left on device

Additional information: 4

Additional information: 267015

Additional information: 4294967295

ORA-01114: IO error writing block to file 203 (block # 266949)

...

ORA-27072: File I/O error

Linux-x86_64 Error: 28: No space left on device

Additional information: 4

Additional information: 265894

Additional information: 4294967295

ORA-01114: IO error writing block to file 203 (

Elapsed: 00:01:44.30

19:06:08 SQL>

>>> the query failed because TEMP tablespace grew and consumed all /oradata fs. Query ran between 19:04:06 and 19:06:08.

df -h /oradata

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/appdata--vg-oradata--lv 10G 10G 108K 100% /oradata

from same session - the last query execution stats:

19:16:05 SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +cost +bytes'));

PLAN_TABLE_OUTPUT

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 6hz2uy35hb6jy, child number 0

-------------------------------------

select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id

Plan hash value: 913581377

------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | | | | 266K(100)| 0 |00:00:00.01 | 0 | 0 | | | |

|* 1 | HASH JOIN | | 1 | 25M| 2593M| 1573M| 266K (1)| 0 |00:00:00.01 | 0 | 0 | 2478M| 41M| 56M (1)|

| 2 | TABLE ACCESS FULL| JT1 | 1 | 25M| 1287M| | 55303 (1)| 25M|00:00:02.80 | 202K| 202K| | | |

| 3 | TABLE ACCESS FULL| JT2 | 1 | 25M| 1287M| | 55303 (1)| 6804K|00:00:01.65 | 54333 | 54465 | | | |

------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("T1"."ID"="T2"."ID")

20 rows selected.

Elapsed: 00:00:00.25

19:18:40 SQL>

Before the query:

Filesystem                           Size  Used Avail Use% Mounted on

/dev/mapper/appdata--vg-oradata--lv   10G  8.1G  2.0G  81% /oradata

After query failure:

/dev/mapper/appdata--vg-oradata--lv   10G   10G  156K 100% /oradata

ran in parallel these commands:

vmstat 2 1000 -t >/tmp/vms14may23.out

iostat -x /dev/sdc1 2 1000 -t >/tmp/ios14may23.out

/dev/sdc1 is the /oradata mountpoint's disk.

Let us pick live sql monitoring:

SET trimspool ON

SET TRIM ON

SET pages 0

SET linesize 32767

SET LONG 1000000

SET longchunksize 1000000

set feedback off

set echo off

set verify off

SELECT dbms_sqltune.Report_sql_monitor(SQL_ID=>'6hz2uy35hb6jy', TYPE=>'text',report_level=>'ALL')

FROM dual;

Output:

SQL> SELECT dbms_sqltune.Report_sql_monitor(SQL_ID=>'6hz2uy35hb6jy', TYPE=>'text',report_level=>'ALL')

FROM dual;  2

SQL Monitoring Report

SQL Text

------------------------------

select /*+ monitor */ t1.* ,t2.* from jt1 t1 ,jt2 t2 where t1.id=t2.id

Error: ORA-1114

------------------------------

ORA-01114: IO error writing block to file 203 (block # 266468)

ORA-27072: File I/O error

Linux-x86_64 Error: 28: No space left on device

Additional information: 4

Additional information: 266468

Additional information: 4294967295

Global Information

------------------------------

 Status              :  DONE (ERROR)

 Instance ID         :  1

 Session             :  SYS (1:9741)

 SQL ID              :  6hz2uy35hb6jy

 SQL Execution ID    :  16777216

 Execution Started   :  05/14/2023 19:04:22

 First Refresh Time  :  05/14/2023 19:04:22

 Last Refresh Time   :  05/14/2023 19:06:07

 Duration            :  105s

 Module/Action       :  sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)/-

 Service             :  opdb1

 Program             :  sqlplus@vcentos79-oracle-sa1 (TNS V1-V3)

 Fetch Calls         :  1

Global Stats

==========================================================================================================

| Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Fetch | Buffer | Read  | Read  | Write | Write |

| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs  | Bytes | Reqs  | Bytes |

==========================================================================================================

|     104 |      20 |       69 |        0.00 |       15 |     1 |   267K | 29290 |   2GB | 23043 |   2GB | 	<< look here 60% iowaits and 2GB temp writes. 

==========================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=913581377)

==========================================================================================================================================================================

| Id |      Operation       | Name |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Write | Write |  Mem  | Temp  | Activity | Activity Detail |

|    |                      |      | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes | Reqs  | Bytes | (Max) | (Max) |     (%)    |   (# samples)   |

==========================================================================================================================================================================

|  0 | SELECT STATEMENT     |      |         |       |       104 |     +2 |     1 |        0 |      |       |       |       |       . |     . |          |                 |

|  1 |   HASH JOIN          |      |     25M |  267K |       104 |     +2 |     1 |        0 |      |       |  8398 |   2GB |  56MB |     2GB |          |                 |

|  2 |    TABLE ACCESS FULL | JT1  |     25M | 55303 |        79 |     +2 |     1 |      25M | 1595 |   2GB |       |       |       . |     . |          |                 |

|  3 |    TABLE ACCESS FULL | JT2  |     25M | 55303 |        24 |    +82 |     1 |       7M |  435 | 426MB |       |       |       . |     . |          |                 |

==========================================================================================================================================================================

SQL>

This how memory & swap looks - now after failure:

[root@vcentos79-oracle-sa1 ~]# free -k

              total        used        free      shared  buff/cache   available

Mem:        2914064      730944       22228     1346740     2160892      797796

Swap:       3145724       11008     3134716

[root@vcentos79-oracle-sa1 ~]#

SQL> exec dbms_workload_repository.create_snapshot; << secured a snap after failure.
SQL>

Let us look @ how the memory usage and disk io looked:

Look at the chart we draw from the data collected by vmstat and iostat.

There is no swapping activity, which means the Oracle contained this queries demand to stay within PGA_AGGREGATE_TARGET which is 500MB (actual usage was 56MB).

The spilled over operation did go to tempfile pushing the temp file usage to 2GB and ended up consuming all of the /oradata FS free space left.

SQL> 1

  1* select con_id,name,open_mode from v$pdbs

SQL> /

    CON_ID NAME                 OPEN_MODE

---------- -------------------- ----------

         2 PDB$SEED             READ ONLY

         3 OPDB1                READ WRITE			<<< conid=3

         4 INVENTORYPDB         MOUNTED

SQL>

SQL> 1

  1* select con_id,tablespace_name,sum(bytes)/1024/1024 from cdb_temp_files group by con_id,tablespace_name order by 1,2

SQL> /

    CON_ID TABLESPACE SUM(BYTES)/1024/1024

---------- ---------- --------------------

         1 TEMP                         32

         3 TEMP                       2133		<<< 2GB in use.

SQL>

Query ran between 19:04:06 and 19:06:08

A Linux shell script to get relevant info from the vmstat and iostat file:

while read line
do
if [[ "${line}" == "05/14/2023"* ]];
then
tmstmp=${line};
fi
if [[ "${line}" == "sdc1"* ]];
then
deviceperformance=$(echo ${line}|awk '{print $6","$7","$13","$14}');
textout=${tmstmp},${deviceperformance};
echo ${textout};
fi
done </tmp/ios14may23.out


while read line
do
if ! [[ "${line}" == "procs"* || "${line}" == *"free"* ]];
then 
tmstmp=$(echo $line|awk '{print $19}');
restofthefields=$(echo $line|awk '{print $7","$8","$9","$10","$16}');
echo ${tmstmp},${restofthefields};
fi
done </tmp/vms14may23.out

If we look below from the OS stats we collected, we could see the db generated a large amount of IO to /oradata FS during query execution.

We also get the ASH reports dumped...

col event for a40
set lines 1000
set colsep |
set time on
set timing on
alter session set nls_date_format='DD/MM/YYYY HH24:MI:SS';
select cast(sample_Time as date) sampltm
,SQL_ID
,SQL_OPCODE
,SQL_OPNAME
,SQL_PLAN_LINE_ID
,SQL_PLAN_OPERATION
,SQL_PLAN_OPTIONS
,event
,SESSION_STATE
,TIME_WAITED
,DELTA_READ_IO_BYTES
,DELTA_WRITE_IO_BYTES
,PGA_ALLOCATED
,TEMP_SPACE_ALLOCATED
from dba_hist_active_sess_history
where session_id=1
and con_id=3
and cast(sample_Time as date) between '14/05/2023 19:00:00' and '14/05/2023 19:07:00'
order by 1;

ASH confirms the total PGA and Temp tablespace we consumed as we saw from the dbms_xplan and dbms_sqltune outputs.

So we defined the issue. Now let us look at the solution in next blogs.

Thanks

Let us learn Oracle Database Administration skills

Monday, May 15, 2023

Oracle: HASH_JOIN - MEMORY, SWAP and TEMP Tablespace an impact analysis - Part1

Objective: Assess the impact of HASH_JOIN on SWAP and TEMP Tablespace

No comments:

Post a Comment

MSSQL: Backup/Restore logically a large table