Pipelined Functions: Better Than DBMS_OUTPUT

In Portability on 2009-11-11 at 15:36

Every now and then I need some PL/SQL that prints something to the terminal. The traditional solution for this is DBMS_OUTPUT.PUT_LINE. If you used it before, you probably know that there are some obstacles:

Don’t forget to set serveroutput on
Don’t forget to set an appropriate buffer size (but there is still a absolute maximum)
Don’t wonder about missing blanks
No way to flush the output

In case you don’t know yet, the first three can be taken care of by issuing the following in SQL*Plus:

set serveroutput on size 1000000 format wrapped;

However, the absolute buffer size limit remains. The years have passed, and the limitations were accepted. Until I noticed that there is an alternative.

With release 9iR2, Oracle delivered a package to format the PLAN_TABLE‘s content: DBMS_XPLAN. A very nifty tool which finally rendered the SQLs to format the explain plan useless. But explain plan is not today’s topic; lets just focus in how the formatted plan reaches our screen:

select * from table(DBMS_XPLAN.DISPLAY);

The easiest way to use the package is shown above. It’s obvious that the data is actually fetched by regular select statement. You will also notice this by SQL*Plus artifacts like the feedback line (8 rows selected):

SQL> explain plan for select * from dual;

Explained.

SQL>  select * from table(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------
Plan hash value: 272002086

---------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time  |
---------------------------------------------------------------------
|  0 | SELECT STATEMENT  |      |    1 |     2 |     2   (0)| 00:01 |
|  1 |  TABLE ACCESS FULL| DUAL |    1 |     2 |     2   (0)| 00:01 |
---------------------------------------------------------------------

8 rows selected.

SQL>

The table you see isn’t actually a table (as the database knows it), it’s just formatted output. When using DBMS_XPLAN, I always noted to myself to investigate this. Some years later, the time has come.

Pipelined Table Functions

Oracle’s (pipelined) table functions are PL/SQL functions that return data as (more or less) regular tables that can be accessed with regular SQL. The pipelined feature makes it possible to fetch the first records while the function is still running.

It seems that the primary target of this feature are custom analytic functions and data re-arranging like pivot. However, I use it to get rid of DBMS_OUTPUT.

Hello World

The following is the most simplistic example how to use a pipelined function to return arbitrary output:

create or replace type piped_output_table as table of varchar2(4000);
/

create or replace
 function hello_world return piped_output_table pipelined IS
 text varchar2(4000);
begin
 text := 'Hello world';
 pipe row(text);
 RETURN;
END;
/

The hello_world function must be wrapped in a table() expression to access it like a regular table in SQL:

SQL>  select * from table(hello_world);
COLUMN_VALUE
---------------------------------------------------------------------
Hello world

1 row selected.

SQL>

We have already mastered everything important to replace DBMS_OUTPUT:

Define a table type to be used as return type of the function
Add the pipelined keyword
The PIPE ROW() statement to actually produce a row.
The RETURN statement without an argument.

However, there are two more issues to consider:

You might want to have a different column name (instead of COLUMN_VALUE)
You might need to produce output from nested functions

Custom Column Names

The first issue is rather simple; to name the returned column(s) you must create a object type and base the table type upon this object type:

create or replace type piped_output is object (output varchar2(4000));
/

create or replace type piped_output_table is table of piped_output;
/

create or replace function
   hello_world return piped_output_table pipelined IS
   rec piped_output := piped_output(NULL);
begin
   rec.output := 'Hello world';
   pipe row(rec);
   RETURN;
END;
/

The column header can have an arbitrary column name. Since it is a column name, all respective limitations apply (e.g., no spaces).

Nested Functions

The final requirement to replace DBMS_OUTPUT is produce output from nested function calls. With DBMS_OUTPUT this is rather straight:

CREATE OR REPLACE PACKAGE dbms_output_nested IS
   PROCEDURE hello_world;
END;
/

CREATE OR REPLACE PACKAGE body dbms_output_nested IS
   PROCEDURE nested_procedure IS
   BEGIN
      DBMS_OUTPUT.PUT_LINE('  hello world from nested function') ;
   END nested_procedure;

   PROCEDURE hello_world IS
   BEGIN
      DBMS_OUTPUT.ENABLE();
      DBMS_OUTPUT.PUT_LINE('hello world') ;
      nested_procedure;
   END hello_world;
END dbms_output_nested;
/

Unfortunately, I am not aware of an easy way to accomplish this with pipelined table functions. It seems that there is no way to directly pipe one functions output into the output pipe of another function. So far, the only solution I found is to explicitly copy the records over:

CREATE OR REPLACE PACKAGE piped_nested IS
   TYPE piped_output IS RECORD (output VARCHAR2(4000));
   TYPE piped_output_table IS TABLE OF piped_output;

   FUNCTION hello_world
     RETURN piped_nested.piped_output_table PIPELINED;
   FUNCTION nested_procedure
     RETURN piped_nested.piped_output_table PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY piped_nested IS
   FUNCTION nested_procedure
     RETURN piped_nested.piped_output_table PIPELINED IS
      rec piped_nested.piped_output;
   BEGIN
      rec.output := '   hello world from nested function';
      PIPE ROW(rec);
      RETURN;
   END;

   FUNCTION hello_world
     RETURN piped_nested.piped_output_table PIPELINED IS
      rec piped_nested.piped_output;
   BEGIN
      rec.output := 'Hello World';
      PIPE ROW(rec);

      FOR r IN (select * from table(nested_procedure)) LOOP
         rec.output := r.output;
         PIPE ROW(rec);
      END LOOP;

      RETURN;
   END;
END piped_nested;
/

You see that there are two major drawbacks:

The nested_procedure is public
The copy bloats the hello_world function

As mentioned above, I don’t have a good solution for this (let me know if you find something).

Mixing DBMS_OUTPUT with Pipelined Functions

If you have legacy PL/SQL code that uses DBMS_OUTPUT and would like to migrate to pipelined functions, you might consider the following wrapper:

CREATE OR REPLACE PACKAGE dbms_output_table IS
   TYPE piped_output IS RECORD (dbms_output VARCHAR2(4000));
   TYPE piped_output_table IS TABLE OF piped_output;
   FUNCTION display RETURN piped_output_table PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY dbms_output_table IS
   FUNCTION display RETURN piped_output_table PIPELINED IS
      rec piped_output;
      p_status NUMBER;
   BEGIN
      DBMS_OUTPUT.GET_LINE(rec.dbms_output, p_status);
      WHILE p_status = 0 LOOP
         PIPE ROW(rec);
         DBMS_OUTPUT.GET_LINE(rec.dbms_output, p_status);
      END LOOP;
      RETURN;
   END;
END;
/

With this function you can retrieve the current content of the DBMS_OUTPUT buffer via a pipelined table function. Please note that the producer of the DBMS_OUTPUT must enable the DBMS_OUTPUT facility and SQL*Plus must be set serverout off (otherwise SQL*Plus would retrieve it):

SQL>  exec dbms_output_nested.hello_world;
hello world
hello world from nested function

PL/SQL procedure successfully completed.

SQL> set serverout off
SQL>  exec dbms_output_nested.hello_world;

PL/SQL procedure successfully completed.

SQL> select * from table(dbms_output_table.display);

DBMS_OUTPUT
---------------------------------------------------------------------
hello world
 hello world from nested function

SQL>

The Best of Both Worlds?

This—and similar methods—can be used so that nested functions still use the DBMS_OUTPUT interface, while the front end function returns its output as pipelined table. A mixed approach allows you to hide the nested function—no need to have them in the public package declaration—but reapplies some of the DBMS_OUTPUT limitations; the maximum buffer size (although at smaller level), no flush.

CREATE OR REPLACE PACKAGE mixed_package IS
   TYPE piped_output IS RECORD (output VARCHAR2(4000));
   TYPE piped_output_table IS TABLE OF piped_output;
   FUNCTION hello_world RETURN piped_output_table PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY mixed_package IS
   PROCEDURE prepare_dbms_output IS
   BEGIN
      DBMS_OUTPUT.DISABLE; -- removes any stale data
      DBMS_OUTPUT.ENABLE(1000000);
   END;

   PROCEDURE nested_function IS
   BEGIN
      DBMS_OUTPUT.PUT_LINE('   Hello world from nested function');
   END;

   FUNCTION hello_world RETURN piped_output_table PIPELINED IS
      rec piped_output;
      p_status NUMBER;
   BEGIN
      prepare_dbms_output;

      rec.output := 'Hello World';
      PIPE ROW(rec);

      nested_function;

      DBMS_OUTPUT.GET_LINE(rec.output, p_status);
      WHILE p_status = 0 LOOP
         PIPE ROW(rec);
         DBMS_OUTPUT.GET_LINE(rec.output, p_status);
      END LOOP;

      RETURN;
   END;
END;
/

You can also use other methods to return row data from nested functions and pass it on to the PIPE ROW statement in the topmost stack frame. No matter in which way pass rows from nested functions, you must be aware that they are passed as a bulk when the nested function finishes. That means, the are kind of flushed once for each nested function call. If all you data is returned from a nested function, you might effectively disable the pipelined feature.

Final Comparison and Critique

The user interface of pipelined table functions is much better than the well established DBMS_OUTPUT for several reasons:

No separate API needed to access the returned data
Most programming languages don’t have a convenient way to access the serveroutput. Very often the only way to fetch the buffer is to manually call DBMS_OUTPUT.GET_LINES. On a quick view, only the perl DBI/DBD::Oracle framework has addressed the problem (a little bit):
But wait, your scripts are anyways intended for SQL*Plus usage only? Besides the fact that you never know in which context your scripts are used, there are also other user interfaces than SQL*Plus to access an Oracle database. There are many IDE that support database access. Most of those have support for DBMS_OUTPUT in the meanwhile but it’s different in each implementation. You can avoid a great deal of questions where to find the output of your script in tool XYZ by using table functions.
No size limitation
But wait, your script isn’t returning much data? Yes, you usually know how much data your script will return. With a call to DBMS_OUTPUT.ENABLE you can make the buffer big enough. But if you are wrong? You are lucky if the user knows how to enlarge the buffer. You are even more happy if the hard limit of the buffer size is not reached (1,000,000 bytes). I know of a case where the user was required to change a script to use UTL_FILE because the output was a few megabytes (luckily they had access to the database back-end, where the file was written).
Pipelined
Self-explaining? No major usability benefit, I guess. However, it is a benefit to see that something is happening while the function is running. It is definitely a benefit not to buffer everything in memory until the end—that is probably the reason why there is a hard limit on the buffer size.

I guess the article gave you insight to the major drawback of the pipelined table function method as well:

Producing output from nested function calls is a pain in the a**
It is possible, but inconvenient—if not delusive because it exploits internal functions in the public interface. The use of collections in the nested function makes it possible to maintain a clean public interface. However, copying the data over and over again is inconvenient, if not problematic, because it might reintroduce the size issue (when using DBMS_OUTPUT the already mentioned limit of 1,000,000 bytes per nested function call apply).
DML in the table function cannot participate in the caller’s transaction
UPDATE 2010-03-11: Gary commented that
```
select * from table(hello_world)
```
works only if the function does not perform any DML. Gary is of course right; any attempt to do so results in ORA-14551. The only possibility to perform DML in the function is to use an autonomous transaction.

My personal Best Current Practice is to use other methods (collections or even DBMS_OUTPUT) in the private functions, but use a pipelined function interface for the user. This is sometimes very inconvenient to code but the improved user experience is worth that effort.

Of course I would have some wishes how the situation could be improved. The most powerful way would be to allow something like this:

PIPE ROW(nested_pipelined_function);

Where nested_pipelined_function doesn’t need to be part of the public package interface. Each PIPE ROW() in the nested function should directly output as if was written in the calling function. On the other hand, it would also be great to have something to pipe a complete collection at once:

PIPE ROWS(collection);

I am well aware that the usage of pipelined table functions as demonstrated in this article might not reflect the intended usage of that feature, thus its fitness for that purpose is limited.

It’s not fatal to open your mind to new approaches.

Markus Winand's Blog

Software Quality is Quality of Life