Dear Patrick,
Last year I did a presentation on table functions at KScope. One of the questions I got was: ‘If pipelined table functions provide their results faster, why would you want to use a normal table function?’ I couldn’t come up with the answer then, maybe you can help?
Erik van Roon
Dear Erik,
Let’s start with explaining a bit what table functions are. Table Functions are functions that return a collection of data and which can be called in a normal SQL statement by using the TABLE() operator. Let’s create a simple function. Please note this is a demonstration only, not something you would normally solve using table functions (or PL/SQL as a whole).
First step is to create a record type in the database:
[PATRICK]SQL>CREATE TYPE emp_rt AS OBJECT ( empno NUMBER(4) , ename VARCHAR2(10) , mgr NUMBER(4) ) /
Then you need to create a table type in the database:
[PATRICK]SQL>CREATE TYPE emp_tt AS TABLE OF emp_rt /
Then it is time for the simple function. The DBMS_LOCK.SLEEP call is in there to show the difference between Table Functions and Pipelined Table Functions.
[PATRICK]SQL>CREATE OR REPLACE FUNCTION tf_emp RETURN emp_tt AS l_returnvalue emp_tt; BEGIN SELECT emp_rt(e.empno, e.ename, e.mgr) BULK COLLECT INTO l_returnvalue FROM emp e ORDER BY e.deptno; FOR indx IN l_returnvalue.first .. l_returnvalue.last LOOP l_returnvalue(indx).ename := INITCAP(l_returnvalue(indx).ename); dbms_lock.sleep(.25); -- for demo purposes only END LOOP; RETURN l_returnvalue; END; /
Now you can call the function in the FROM clause of your SQL statement as if it were a relational table:
[PATRICK]SQL>SELECT * FROM TABLE(tf_emp) /
Notice that the result is displayed after all records have been processed, i.e. after 3.5 seconds (due to the DBMS_LOCK.SLEEP statement).
Now let’s create a PIPELINED table function, which produces the same result but in a different manner:
[PATRICK]SQL>CREATE OR REPLACE FUNCTION ptf_emp RETURN emp_tt PIPELINED AS l_emps emp_tt; l_returnvalue emp_rt; BEGIN SELECT emp_rt(e.empno, e.ename, e.mgr) BULK COLLECT INTO l_emps FROM emp e ORDER BY e.deptno; FOR indx IN l_emps.first .. l_emps.last LOOP l_returnvalue := emp_rt(empno => l_emps(indx).empno ,ename => INITCAP(l_emps(indx).ename) ,mgr => l_emps(indx).mgr); PIPE ROW (l_returnvalue); dbms_lock.sleep(.25); -- for demo purposes only END LOOP; RETURN; END; /
If you set the arraysize of your SQL*Plus session (or your command window in PL/SQL Developer) you can see how the results are being returned as they are produced, i.e. 0.25 seconds apart.
[PATRICK]SQL>SET ARRAYSIZE 1 [PATRICK]SQL>SELECT * FROM TABLE(ptf_emp) /
Now you can see the difference between a Table Function and a Pipelined Table Function. Pipelined Table Functions are best used when you are executing different stages of transformation of your data, for example reading from an OLTP system and writing to a DataWareHouse system. If you PARALLEL_ENABLE your functions AND your source table can be read parallel then you could really see some performance benefits.
But all of this doesn’t explain why you should NOT use pipelining in a table function. The ONLY reason I can think of is when you want to be able to call the function from plain PL/SQL. PL/SQL does one call to a function and expects one result from it. Not a result spread out over many ‘callback’s.
If you create a function like this:
[PATRICK]SQL>CREATE OR REPLACE PROCEDURE call_tf AS l_emps emp_tt; BEGIN l_emps := tf_emp; FOR indx IN l_emps.first .. l_emps.last LOOP dbms_output.put_line(l_emps(indx).empno || ' ' || l_emps(indx).ename || ' ' || l_emps(indx).mgr); END LOOP; END; /
And call this function, then everything works ok. It takes about 3.5 seconds for the function to complete, due to the DBMS_LOCK.SLEEP call in the tf_emp function.
If you call the pipelined table function you are stopped at compile time. The call to a pipelined table function is not allowed.
[PATRICK]SQL>CREATE OR REPLACE PROCEDURE call_ptf AS l_emps emp_tt; BEGIN l_emps := ptf_emp; FOR indx IN l_emps.first .. l_emps.last LOOP dbms_output.put_line(l_emps(indx).empno || ' ' || l_emps(indx).ename || ' ' || l_emps(indx).mgr); END LOOP; END; / Warning: Procedure created with compilation errors. [PATRICK]SQL>sho err Errors for PROCEDURE CALL_PTF: LINE/COL ERROR -------- --------------------------------------------------------------------- 1/10 PLS-00653: aggregate/table functions are not allowed in PL/SQL scope
You can of course wrap the call to the pipelined table function in a cursor like this:
[PATRICK]SQL>CREATE OR REPLACE PROCEDURE call_ptf2 AS CURSOR c_emps IS SELECT emp_rt(t.empno, t.ename, t.mgr) FROM TABLE(ptf_emp) t; l_emps emp_tt; BEGIN OPEN c_emps; FETCH c_emps BULK COLLECT INTO l_emps; FOR indx IN l_emps.first .. l_emps.last LOOP dbms_output.put_line(l_emps(indx).empno || ' ' || l_emps(indx).ename || ' ' || l_emps(indx).mgr); END LOOP; END; / Procedure created.
But when you call this function you will see that it takes about 3.5 seconds to fetch all the records, effectively using the pipelined table function as a normal table function. This might be your escape to use pipelined table functions in a SQL only environment and still use the same function in a PL/SQL environment.
‘But you said pipelined table functions are best used then executing different stages of transformation. That includes multiple PL/SQL functions.’ you might wonder. That is correct. A pipelined table function may call another pipelined table function. It can use for instance a collection as its input like this
[PATRICK]SQL>CREATE OR REPLACE FUNCTION ptf_emp3 (emps_in IN emp_tt) RETURN emp_tt PIPELINED AS l_emp emp_rt; BEGIN FOR indx IN emps_in.first .. emps_in.last LOOP l_emp := emps_in(indx); l_emp.ename := UPPER(l_emp.ename); PIPE ROW (l_emp); dbms_lock.sleep(.25); -- for demo purposes only END LOOP; RETURN; END; /
When you call this function using this:
[PATRICK]SQL>SET ARRAYSIZE 1 [PATRICK]SQL>SELECT * FROM TABLE(ptf_emp3(CAST(MULTISET (SELECT * FROM TABLE(ptf_emp)) AS emp_tt))) /
You will notice the output will not start displaying until after about 3.5 seconds which tells me the call to ptf_emp must be completed before pft_emp3 can start doing its work.
If you change the parameter to accept a cursor (and of course change the processing as well) like this:
[PATRICK]SQL>CREATE OR REPLACE FUNCTION ptf_emp4 (cursor_in IN SYS_REFCURSOR) RETURN emp_tt PIPELINED AS l_emp emp_rt; BEGIN LOOP FETCH cursor_in INTO l_emp; EXIT WHEN cursor_in%NOTFOUND; l_emp.ename := upper(l_emp.ename); PIPE ROW(l_emp); dbms_lock.sleep(.25); -- for demo purposes only END LOOP; RETURN; END; /
And call this function using this statement:
[PATRICK]SQL>SET arraysize 1 [PATRICK]SQL>SELECT * FROM TABLE(ptf_emp4(CURSOR (SELECT emp_rt(empno ,ename ,mgr) FROM TABLE(ptf_emp)))) /
You will notice the output starting to display after about half a second. The total time needed for this function to complete is roughly the same as the previous, but this function starts processing sooner, which might be exactly what you need.
I hope this sheds a bit of light on the subject. I think bottom line is to always use pipelined table functions instead of normal table functions, except when the function is only meant for PL/SQL or client consumption. You can always work around implementation restrictions if you need to use the same codebase in both SQL and PL/SQL.
Happy Oracle’ing,
Patrick Barel
If you have any comments on this subject or you have a question you want answered, please send an email to patrick[at]bar-solutions[dot]com. If I know the answer, or can find it for you, maybe I can help. |
This question has been published in OTech Magazine of Spring 2015.