11gR2 introduced the DBMS_PARALLEL_EXECUTE package.
The basic idea behind it is you divide some large task into smaller chunks and then execute all the chunks in parallel.
The API and the examples in the Oracle Documentation might lead one to believe the package can only be used to split large SQL statements into parallel chunks.
You can however use the same package to execute a pl/sql block multiple times simultaneously.
With proper use of the chunk ids you can even execute multiple different procedures in parallel as single task.
I’ll illustrate both below.
First, create a table to hold some results:
CREATE TABLE parallel_exec_test_table ( test_name VARCHAR2(100), chunk_name VARCHAR2(100), chunk_value INTEGER, chunk_start TIMESTAMP, chunk_end TIMESTAMP );
Next, I’ll create a procedure that does some dummy work, pausing for a second and updating the test table multiple times.
CREATE OR REPLACE PROCEDURE parallel_exec_test1_chunk( p_test_name IN VARCHAR2, p_chunk_name IN VARCHAR2, p_chunk_value IN INTEGER ) IS BEGIN IF p_chunk_value = 4 THEN RAISE VALUE_ERROR; END IF; INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, p_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); FOR i IN 1 .. p_chunk_value LOOP UPDATE parallel_exec_test_table SET chunk_value = i WHERE test_name = p_test_name AND chunk_name = p_chunk_name; END LOOP; UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = p_chunk_name; END;
You might note the intentional RAISE for an input of 4, that’s to illustrate the parallel execution exception handling.
Next, create a procedure to call the test chunk multiple times in parallel.
Important to note your pl/sql block must have bind variables called :start_id and :end_id.
If you don’t have both, or name them something different you’ll get an ORA-01006 exception.
In this example, I generate a list of chunks with ids 1-10. Each chunk starts and ends with a single id, so 10 chunks.
I then pass these ids to the chunk procedure and execute 5 of them at a time. Since each invocation of the chunk procedure will log when it starts this should create 2 clusters of five parallel executions.
CREATE OR REPLACE PROCEDURE parallel_exec_test_proc IS v_task_name VARCHAR2(30) := 'parallel_test1'; v_plsql_block VARCHAR2(32767); BEGIN DELETE parallel_exec_test_table WHERE test_name = v_task_name; DBMS_PARALLEL_EXECUTE.create_task(task_name => v_task_name); DBMS_PARALLEL_EXECUTE.create_chunks_by_sql( task_name => v_task_name, sql_stmt => 'SELECT level start_id, level end_id FROM dual connect by level <=10', by_rowid => FALSE ); v_plsql_block := q'[ begin parallel_exec_test1_chunk('parallel_test1',to_char(to_date(:start_id,'j'),'jsp'),:end_id); end; ]'; DBMS_PARALLEL_EXECUTE.run_task( task_name => v_task_name, sql_stmt => v_plsql_block, language_flag => DBMS_SQL.native, parallel_level => 5 ); DBMS_OUTPUT.put_line( TO_CHAR(SYSTIMESTAMP, 'yyyy-mm-dd hh24:mi:ss.ff') || ' ' || DBMS_PARALLEL_EXECUTE.task_status(v_task_name) ); END; /
Testing produces the following results…
SQL> set timing on SQL> EXEC parallel_exec_test_proc PL/SQL procedure successfully completed. Elapsed: 00:00:06.03 SQL> SELECT * FROM parallel_exec_test_table WHERE test_name = 'parallel_test1' ORDER BY chunk_start; TEST_NAME CHUNK_NAME CHUNK_VALUE CHUNK_START CHUNK_END -------------------- ---------- ----------- -------------------------- ------------------------------ parallel_test1 three 3 2013-10-17 00:44:27.494957 2013-10-17 00:44:29.496812 parallel_test1 five 5 2013-10-17 00:44:27.495694 2013-10-17 00:44:29.499113 parallel_test1 six 6 2013-10-17 00:44:27.496019 2013-10-17 00:44:29.499743 parallel_test1 seven 7 2013-10-17 00:44:27.496139 2013-10-17 00:44:29.498747 parallel_test1 eight 8 2013-10-17 00:44:27.510579 2013-10-17 00:44:29.514314 parallel_test1 nine 9 2013-10-17 00:44:29.498327 2013-10-17 00:44:31.500831 parallel_test1 ten 10 2013-10-17 00:44:29.500020 2013-10-17 00:44:31.503208 parallel_test1 one 1 2013-10-17 00:44:29.500746 2013-10-17 00:44:31.502249 parallel_test1 two 2 2013-10-17 00:44:29.501337 2013-10-17 00:44:31.502495 9 rows selected. Elapsed: 00:00:00.04
As you can see, 5 runs at 44:27 and 4 runs at 44:29. You might also note that even though the chunks were defined in order 1-10, the actual execution order is effectively random.
Also note, 10 chunks were submitted but only 9 records were written. This is because for id 4 raised an exception.
Checking the chunk status in the dictionary we can see the error
SQL> SELECT start_id,end_id,status,start_ts,error_message FROM dba_parallel_execute_chunks WHERE task_name = 'parallel_test1'; START_ID END_ID STATUS START_TS ERROR_MESSAGE ---------- ---------- -------------------- --------------------------------------------------------------------------- -------------------------------------------------- 3 3 PROCESSED 2013-10-17 00:44:27.494342 5 5 PROCESSED 2013-10-17 00:44:27.494354 6 6 PROCESSED 2013-10-17 00:44:27.495262 7 7 PROCESSED 2013-10-17 00:44:27.495951 8 8 PROCESSED 2013-10-17 00:44:27.510169 9 9 PROCESSED 2013-10-17 00:44:29.497890 10 10 PROCESSED 2013-10-17 00:44:29.499605 1 1 PROCESSED 2013-10-17 00:44:29.500361 2 2 PROCESSED 2013-10-17 00:44:29.500928 4 4 PROCESSED_WITH_ERROR 2013-10-17 00:44:27.494370 ORA-06502: PL/SQL: numeric or value error 10 rows selected.
Another, potentially more useful option is executing different procedures simultaneously.
To demonstrate this I’ll create 5 new procedures.
CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk1(p_test_name IN VARCHAR2) IS c_chunk_name CONSTANT VARCHAR2(6) := 'chunk1'; BEGIN INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = c_chunk_name; END; CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk2(p_test_name IN VARCHAR2) IS c_chunk_name CONSTANT VARCHAR2(6) := 'chunk2'; BEGIN INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = c_chunk_name; END; CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk3(p_test_name IN VARCHAR2) IS c_chunk_name CONSTANT VARCHAR2(6) := 'chunk3'; BEGIN INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = c_chunk_name; END; CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk4(p_test_name IN VARCHAR2) IS c_chunk_name CONSTANT VARCHAR2(6) := 'chunk4'; BEGIN INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = c_chunk_name; END; CREATE OR REPLACE PROCEDURE parallel_exec_test2_chunk5(p_test_name IN VARCHAR2) IS c_chunk_name CONSTANT VARCHAR2(6) := 'chunk5'; BEGIN INSERT INTO parallel_exec_test_table(test_name, chunk_name, chunk_start) VALUES (p_test_name, c_chunk_name, SYSTIMESTAMP); DBMS_LOCK.sleep(2); UPDATE parallel_exec_test_table SET chunk_end = SYSTIMESTAMP WHERE test_name = p_test_name AND chunk_name = c_chunk_name; END;
Just as in the first test, I’ll create a procedure that will execute them simultaneously. The trick here will be that the API requires chunk ids not procedure names. This is easily handled though with a simple CASE statement to call the procedure I want for each id.
CREATE OR REPLACE PROCEDURE parallel_exec_test2_proc IS v_task_name VARCHAR2(30) := 'parallel_test2'; v_plsql_block VARCHAR2(32767); BEGIN DELETE parallel_exec_test_table WHERE test_name = v_task_name; DBMS_PARALLEL_EXECUTE.create_task(task_name => v_task_name); DBMS_PARALLEL_EXECUTE.create_chunks_by_sql( task_name => v_task_name, sql_stmt => 'SELECT level start_id, level end_id FROM dual connect by level <=5', by_rowid => FALSE ); v_plsql_block := q'[ declare v_dummy integer := :end_id; begin case :start_id when 1 then parallel_exec_test2_chunk1('parallel_test2'); when 2 then parallel_exec_test2_chunk2('parallel_test2'); when 3 then parallel_exec_test2_chunk3('parallel_test2'); when 4 then parallel_exec_test2_chunk4('parallel_test2'); when 5 then parallel_exec_test2_chunk5('parallel_test2'); end case; end; ]'; DBMS_PARALLEL_EXECUTE.run_task( task_name => v_task_name, sql_stmt => v_plsql_block, language_flag => DBMS_SQL.native, parallel_level => 5 ); DBMS_OUTPUT.put_line( TO_CHAR(SYSTIMESTAMP, 'yyyy-mm-dd hh24:mi:ss.ff') || ' ' || DBMS_PARALLEL_EXECUTE.task_status(v_task_name) ); END; /
And testing produces these results…
SQL> EXEC parallel_exec_test2_proc PL/SQL procedure successfully completed. Elapsed: 00:00:06.02 SQL> SELECT * 2 FROM parallel_exec_test_table 3 WHERE test_name = 'parallel_test2' 4 ORDER BY chunk_start; TEST_NAME CHUNK_NAME CHUNK_VALUE CHUNK_START CHUNK_END -------------------- ---------- ----------- -------------------------- ---------------------------- parallel_test2 chunk3 2013-10-17 01:11:18.236996 2013-10-17 01:11:20.238041 parallel_test2 chunk4 2013-10-17 01:11:18.242705 2013-10-17 01:11:20.244119 parallel_test2 chunk5 2013-10-17 01:11:18.251068 2013-10-17 01:11:20.252110 parallel_test2 chunk1 2013-10-17 01:11:18.259176 2013-10-17 01:11:20.260017 parallel_test2 chunk2 2013-10-17 01:11:18.269944 2013-10-17 01:11:20.271156
If I was working with 11gR1 or older I’d use dbms_scheduler or dbms_job (for really old versions like 9i or lower.) Using dbms_parallel_execute is certainly easier. One important thing to note is each chunk is committed if it completes successfully. So, this is not a good technique if you need to maintain an atomic transaction across all the chunks. Still, it’s another tool for the toolbox and one I appreciate.
I hope it helps.
Questions and comments, as always, are welcome.
Hi Sean,
Your blog about the topic (Parallel PL/SQL with DBMS_PARALLEL_EXECUTE) was what i was looking for to implement. It was very helpful example
Calling a procedure multiple times with different parameters (Parallel processing)
Topic : Parallel PL/SQL with DBMS_PARALLEL_EXECUTE
URL : https://seanstuber.wordpress.com/2013/10/17/parallel-plsql-with-dbms_parallel_execute/
I am just using the same code what was written by you and compiled it.
I am able to compile the code but when executing the procedure i am getting the below error :
Command user for execution : EXEC parallel_exec_test_proc
Error Message :
[Error] Execution (2: 1): ORA-29497: duplicate task name
ORA-06512: at “SYS.DBMS_PARALLEL_EXECUTE_INTERNAL”, line 170
ORA-00001: unique constraint (SYS.PK_DBMS_PARALLEL_EXECUTE_1) violated
ORA-06512: at “SYS.DBMS_PARALLEL_EXECUTE”, line 86
ORA-06512: at “NDS.PARALLEL_EXEC_TEST_PROC”, line 8
ORA-06512: at line 1
Your help regarding this issue will be much appreciated..
The test procedures create and execute the tasks but do not remove them.
I did that intentionally for the tests to leave objects around to investigate after execution.
You can see yours with
SELECT task_name, status FROM user_parallel_execute_tasks;
Then to clean them up, assuming you ran both tests cases…
BEGIN
DBMS_PARALLEL_EXECUTE.drop_task(‘parallel_test1’);
DBMS_PARALLEL_EXECUTE.drop_task(‘parallel_test2’);
END;
It would be ok to add these to the test procedures yourself so they do their own cleanup, but then you lose the status tracking.
Alternately, change the code to create a unique task name each time and then you can compare multiple runs over time and do clean up later when you no longer need them.
This is a very good and well explained topic that I’ll use soon in my company. The use case that I have is a procedure to generate csv files based on a sql query. This procedure is working fine with small amount of data but when exporting more than 1M-2M is getting very slow. So I’ll try to use a parallel processing and export the files in chunks. Thanks for doing this useful tutorial even today in 2019.
I’m glad you found it useful.
1 to 2 million rows doesn’t seem like it should be that much.
You might want to try using something like sqlcl csv option, or some version of the c unloader originally written by Tom Kyte
The original (I think) can be found here:
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:459020243348)