Generating Rows Automatically in Oracle
Quite often there is a need to generate rows in SQL. Whether it be to generate a row for every date between X and Y (the most common I personally find), or simply just to generate some dummy data for testing purposes. Or even just as an academic exercise it is good practice to explore the options for achieving something which may seem very simple. Traditionally I have seen people do something like this:
SELECT 'stuff' FROM user_objects WHERE rownum < :n;
Which of course is perfectly valid. If you can guarantee that user_objects contains enough rows for your :n limit, or if you prefer to use all_ or dba_ then the user is able to select from them. If not, you tend to find cartesian joins and such going on, or sometimes pl/sql loops (which, depending on how they are being used, can be a big performance hit).
So, are there any other alternative (future-proof, more reliable, faster…) techniques we can use? Of course there are! As with anything there is more than one way to solve a particular problem, but here are a few of the most common techniques.
Connect By
My personal favourite, simply for its ease of use and implementation, plus speed of execution (fast-dual). This relies on exploiting the connect by functionality by not supplying any prior clause. You’re effectively connecting every row to itself. It’s worth mentioning here that officially this technique isn’t documented, and is simply taking advantage of a side-effect. However it is so commonly used I can’t see it ever being deprecated or removed (then again, this is Oracle we’re talking about!).
SELECT 'stuff' FROM dual CONNECT BY LEVEL < :n;
Where I tend to personally use this is to generate calendars. Let’s say I want a row for every day in the current year.
SELECT TRUNC(SYSDATE,'YYYY')+LEVEL - 1 FROM dual CONNECT BY LEVEL <= ADD_MONTHS(TRUNC(SYSDATE,'YYYY'),12)-TRUNC(SYSDATE,'YYYY')
The addition and subtraction of 1 is because Level starts at 1.
Recursive Query
This technique is more valid than the one described above and so will possibly be preferred by the purists. It is however slightly more complex. It utilizes a recursive query with an exit strategy when your desired number of rows is reached.
WITH i(n) AS ( SELECT 1 n FROM Dual UNION ALL SELECT n+1 n FROM i WHERE n < :n) SELECT n FROM i
Model Clause
Possibly a contender for the most obscure of the bunch, simply because the Model clause in SQL is not very well understood by most. I have to admit I’m not an expert in it by any means! It is however extremely powerful and a simple statement can produce our row generator.
SELECT 'stuff' FROM dual MODEL RETURN UPDATED ROWS dimension by (0 as i) measures (0 j) rules iterate(100) (j[iteration_number]=iteration_number+1);
Pipelined Function
Another technique that is often overlooked is the pipelined function (and again a contender for the most obscure). It carried some overhead, however it is very reusable and flexible. We first have to define an additional type object to support it, however once done, the results can be surprisingly pleasing and very quick to develop with!
create type gen_num is table of number; create function n_rows(n in number) return gen_num pipelined is begin for i in 1 .. n loop pipe row (i); end loop; return; end; / select * from table(n_rows(:n));
Summary
Each of the techniques above has its own pro’s and con’s and there is no right way. I am yet to compare real-world performance between the different methods, however one must be cautious that simply generating rows is only half the problem. The key is how we can then join to those rows within another query, and how that query as a whole performs.
Please feel free to post any comments or questions.