Pipelined Table Functions

I have had trouble with a certain view I have to create at the customer site I am currently working at. The view involves 3 SQL queries combined by using a UNION ALL, since the separate queries are mutually exclusive. Using the UNION ALL makes it a bit faster since Oracle doesn’t have to do all the work of sorting and checking for duplicate rows.

Retrieving data from this view resulted in memory errors since there is too much data for the server to come up with and to return to me. After checking all kinds of initialization parameters I decided to try and build a couple of pipelined table functions that would retrieve the same data for me, but I assumed it wouldn’t put such a big pressure on my memory usage.

A pipelined table function will come up with a single row for the result set and pipe it out of the function when it’s available. In my opinion there would only be need for enough memory for this single row instead of enough memory for all the rows of the entire result set.

Since the function gets called from SQL the SQL engine needs to be aware of the result it produces. So I need to create types that define a row of the result. And since the function will return more than a single row, I also need to define a NESTED TABLE, a collection of these rows. Using two types I can mimic the appearance of a table.

Now all I have to do is create functions that will give me the result I am interested in.

My ‘normal’ view looks like this:

create or replace view v_emps as

select e.empno, e.ename, e.job, e.mgr, e.hiredate, e.sal, e.comm

   from emp e

  where e.deptno = 10

union all

select e.empno, e.ename, e.job, e.mgr, e.hiredate, e.sal, e.comm

  from emp e

where e.deptno = 20

union all

select e.empno, e.ename, e.job, e.mgr, e.hiredate, e.sal, e.comm

  from emp e

where e.deptno = 30

union all

select e.empno, e.ename, e.job, e.mgr, e.hiredate, e.sal, e.comm

  from emp e

where e.deptno = 40

/

In the following example I create a new view based on the table functions for the EMP table. The SQL approach is better, easier to read and faster, but it shows what should be done if you want to move over to table functions.

The way such a function look is kind of strange at first sight.

Function f1 return mynestedtabletype pipelined

Is

  <declaration>

  l_record myrecordtype;

begin

  loop

    <do stuff to define l_record>

    pipe row(l_record); — send the row out

  end loop;

  return; — return nothing but control

end f1;

First I create a type for the records I return from my function:

CREATE TYPE emp_t AS OBJECT

(

  empno    NUMBER(4),

  ename    VARCHAR2(10),

  job      VARCHAR2(9),

  mgr      NUMBER(4),

  hiredate DATE,

  sal      NUMBER(7, 2),

  comm     NUMBER(7, 2)

)

/

Then I create a nested table of the record type I just created. The functions will return (or say they will return) a variable of this type:

CREATE TYPE emps_tt AS TABLE OF emp_t

/

Next is the package which will contain the table functions. I create 4 functions to retrieve all the datasets I need.

CREATE OR REPLACE PACKAGE emp_tf IS 
  — Author  : Patrick Barel

  — Purpose : Table Functions for EMP

  — Public function and procedure declarations

  FUNCTION get_accounting RETURN emps_tt PIPELINED;

  FUNCTION get_research RETURN emps_tt PIPELINED;

  FUNCTION get_sales RETURN emps_tt PIPELINED;

  FUNCTION get_operations RETURN emps_tt PIPELINED;

END emp_tf;

/

CREATE OR REPLACE PACKAGE BODY emp_tf IS

  — Function and procedure implementations

  FUNCTION get_accounting RETURN emps_tt

    PIPELINED IS

    CURSOR c_emps IS

      SELECT e.empno, e.ename, e.job, e.mgr

            ,e.hiredate, e.sal, e.comm

        FROM emp e

       WHERE e.deptno = 10;

    r_emps        c_emps%ROWTYPE;

    l_returnvalue emp_t;

  BEGIN

    OPEN c_emps;

    LOOP

      FETCH c_emps INTO r_emps;

      l_returnvalue := emp_t(r_emps.empno, r_emps.ename, r_emps.job

                            ,r_emps.mgr, r_emps.hiredate

                            ,r_emps.sal, r_emps.comm);

      EXIT WHEN c_emps%NOTFOUND;

      PIPE ROW(l_returnvalue);

    END LOOP;

    CLOSE c_emps;

    RETURN;

  END get_accounting;

  —

  FUNCTION get_research RETURN emps_tt

    PIPELINED IS

    CURSOR c_emps IS

      SELECT e.empno, e.ename, e.job, e.mgr

            ,e.hiredate, e.sal, e.comm

        FROM emp e

       WHERE e.deptno = 20;

    r_emps        c_emps%ROWTYPE;

    l_returnvalue emp_t;

  BEGIN

    OPEN c_emps;

    LOOP

      FETCH c_emps

        INTO r_emps;

      l_returnvalue := emp_t(r_emps.empno, r_emps.ename, r_emps.job

                            ,r_emps.mgr, r_emps.hiredate

                            ,r_emps.sal, r_emps.comm);

      EXIT WHEN c_emps%NOTFOUND;

      PIPE ROW(l_returnvalue);

    END LOOP;

    CLOSE c_emps;

    RETURN;

  END get_research;

  —

  FUNCTION get_sales RETURN emps_tt

    PIPELINED IS

    CURSOR c_emps IS

      SELECT e.empno, e.ename, e.job, e.mgr

            ,e.hiredate, e.sal, e.comm

        FROM emp e

       WHERE e.deptno = 30;

    r_emps        c_emps%ROWTYPE;

    l_returnvalue emp_t;

  BEGIN

    OPEN c_emps;

    LOOP

      FETCH c_emps

        INTO r_emps;

      l_returnvalue := emp_t(r_emps.empno, r_emps.ename, r_emps.job

                            ,r_emps.mgr, r_emps.hiredate

                            ,r_emps.sal, r_emps.comm);

      EXIT WHEN c_emps%NOTFOUND;

      PIPE ROW(l_returnvalue);

    END LOOP;

    CLOSE c_emps;

    RETURN;

  END get_sales;

  —

  FUNCTION get_operations RETURN emps_tt

    PIPELINED IS

    CURSOR c_emps IS

      SELECT e.empno, e.ename, e.job, e.mgr

            ,e.hiredate, e.sal, e.comm

        FROM emp e

       WHERE e.deptno = 40;

    r_emps        c_emps%ROWTYPE;

    l_returnvalue emp_t;

  BEGIN

    OPEN c_emps;

    LOOP

      FETCH c_emps

        INTO r_emps;

      l_returnvalue := emp_t(r_emps.empno, r_emps.ename, r_emps.job

                            ,r_emps.mgr, r_emps.hiredate

                            ,r_emps.sal, r_emps.comm);

      EXIT WHEN c_emps%NOTFOUND;

      PIPE ROW(l_returnvalue);

    END LOOP;

    CLOSE c_emps;

    RETURN;

  END get_operations;

  —

BEGIN

  NULL;

END emp_tf;

/

In the implementation of the functions I could of course use BULK OPERATIONS to speed up the processing instead of the row-by-row (or slow-by-slow) approach I used now, but using bulk operations use more memory (the retrieved data is loaded into memory) and that is exactly what I am trying to avoid. Using BULK COLLECT with a limit clause might be an option, but that is something to explore later on.

The new view looks like this:

create or replace view v_emps_tf as

select * from table(emp_tf.get_accounting)

union all

select * from table(emp_tf.get_research)

union all

select * from table(emp_tf.get_sales)

union all

select * from table(emp_tf.get_operations)

/

Pretty much the same as the original one, but now based on my functions.

Let’s take a look at the ‘strange’ things here. When you look at the function specification it says it will return a nested table type, but in fact it will return nothing. Every result gets piped out of the function (PIPE ROW) when it is done.

Another thing is the from part of the queries. We tell the SQL engine to treat the results of the function as if it were a table. For this we use the TABLE() operator.

Doing some checks (select * from v_emp minus select * from v_emp_tf) tells me the result of both views is the same, since there are no rows returned. When I turn timing on I notice that the ‘normal’ SQL approach is faster than the table function approach. But the issue wasn’t speed, it was memory usage.

Leave a Reply

Your email address will not be published. Required fields are marked *