Tuesday, September 25, 2007

PL/SQL: Taking it to the next level.

Disclaimer: The PL/SQL Examples presented here are theory and will not compile. However there are plenty of resources from oracle on this subject.

Ever wonder how you could query a pl/sql table? Well there is a way to do this now, since 9i. You must define a function to be “pipelined.” This is one of the PL/SQL’s more powerful abilities, ironically, it’s rarely used, it’s relatively unknown in the PL/SQL Development community. If you talked to your top PL/SQL guru and said, "Hey man these pipelined functions could be really awesome for reporting." The response from your guru would be this, "How dare you feel my head with such vile and lonesome propaganda!"

Essentially, for report writers this opens up some doors that we haven’t ever thought about or wish we had when writing some really challenging reports. This really comes in handy when your working with tough data models. With pipelined functions there is no need for staging tables, long concatenated lexical’s or multiple functions.

While a function will only return one value, there is an exception to this rule which is a pipelined function which can returns n-values. The pseudo syntax for defining a function of that nature is below.


function get_valid_employee_table (p_dept in varchar2) RETURN emp_tbl_type PIPELINED
is
l_emp_rec emp_pkg.emp_rec;
begin
loop through some pl/sql table that you pre-populated in your before report trigger or the calling program
l_emp_rec := some_pl_sql_table;
if ..
--do something
pipe row(l_emp_rec);
elsif ..
--do something
pipe row(l_emp_rec);
else
--do something
pipe row(l_emp_rec);
end if;
end loop;
return;
end get_valid_employee_table;


Sample Call:

select *
from TABLE(emp_pkg.get_valid_employee_table(‘finance’) managers
where managers.employee_id in (select employee_id from employees)

The inside skinny: When, Where and Why

When:
1. You want to avoid a huge concatenated lexical variable in your where clause that will cause performance issues.

2. There is a need for staging table. Typically, with staging tables your already be storing the data in memory so why make the report transaction based. Note: Staging tables in reports are caused by a bad data model or unruly requirements….it can happen :-)

Where:
1. Code this in your pl/sql package.
2. Use it either for reporting (see when) or an ETL Process.

Why:
1. There is no need to create half a dozen functions, you can create a record and call one function and cast it as a table. Boom, you saved yourself some redundancy in your query and in PL/SQL
2. Writes to disk are eliminated, performance plus
3. No need to manage transactions in a report (ie commit/rollbacks into staging tables). Less complex report…
4. Complex calculations can be preformed in SQL natively
a. ad-hoc counts
b. group by
c. joins/outer joins to sql tables or even other parallel enable functions.
5. Technical this can be viewed as polymorphism. Since its code that is logical being represented as a table or view (yes can you put in a view) it can be queried n- possible ways with your limitation only being that of SQL and or PL/SQL. Think about it...


Send me a shout! Is there a mistake or would you like to comment about this article?

3 comments:

Gareth said...

Hi Ike,
I'm interested to know how the pipelined functions work with large datasets. Generally I've used global temp (staging) tables in the past when the data volumes are large ... I'm talking say 30million large rows in base tables. Any bemusements on this?
Gareth

Ike Wiggins said...

Gareth,

Pipelined functions are totally awesome. There so under-utilized it's disgusting. 99% of developers don't even know the true potential of this functionality.

Pipelined functions are designed to work with large-data sets. They should in thoery save you one read and one write. With your application of 30 million writes and reads the performance advantage of saving the write and the read is very noticeable (particularly if your using oracle loader). I can't say if global temp tables store that information in memory or if it's on disk....

If the function is parellel enabled and it's partioning the rows by any (note parellelism is typically not setup on 11i databases and you will not get any benefit from parellel options) the performance gains are significant. This especially true in data conversions, data warehouse apps, or etl in general.

I don't think there is real need to use global temp tables anymore, since this is also just a table in memory....I can't specifically say that it's better than pl/sql table, I think there logical the same when pipelined, I can't say if there physically different.

Did that answer your question?

Ike Wiggins

Gareth said...

Hi Ike,
Thanks for the response. Will try next time I hit a staging table scenario.
Gareth