From: Tonkuma on
On Nov 24, 10:04 am, Tonkuma <tonk...(a)fiberbit.net> wrote:
>
> I tried a way without nested table expression.
> But, it looks the spec. of current DB2 SQL that OLAP functions can't
> nest each other,
> then I used a nested table expression.
I'm sorry!

This worked(tested on DB2 9.7 for Windows.):
------------------------------ Commands Entered
------------------------------
WITH
/**************************************************
***** Start of sample data *****
**************************************************/
sample_data(COL1, COL2, COL3) AS (
VALUES
( 1, 2, '11/16/2009')
,( 1, 2, '11/17/2009')
,( 2, 300, '11/18/2009')
,( 1, 300, '11/19/2009')
,( 1, 2, '11/20/2009')
)
/**************************************************
***** End of sample data *****
**************************************************/
SELECT col1, col2, col3
, COUNT( CASE
WHEN col1 <>
LAG(col1) OVER(ORDER BY col3, col1, col2)
THEN 'Y'
/* ELSE NULL */
END)
OVER(ORDER BY col3, col1, col2
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) + 1 AS group_id
FROM sample_data s
ORDER BY
col3, col1, col2
;
------------------------------------------------------------------------------

COL1 COL2 COL3 GROUP_ID
----------- ----------- ---------- -----------
1 2 11/16/2009 1
1 2 11/17/2009 1
2 300 11/18/2009 2
1 300 11/19/2009 3
1 2 11/20/2009 3

5 record(s) selected.

From: Tonkuma on
This is shorter than previous.
------------------------------ Commands Entered
------------------------------
WITH
/**************************************************
***** Start of sample data *****
**************************************************/
sample_data(COL1, COL2, COL3) AS (
VALUES
( 1, 2, '11/16/2009')
,( 1, 2, '11/17/2009')
,( 2, 300, '11/18/2009')
,( 1, 300, '11/19/2009')
,( 1, 2, '11/20/2009')
)
/**************************************************
***** End of sample data *****
**************************************************/
SELECT col1, col2, col3
, COUNT( NULLIF( col1
, LAG(col1)
OVER(ORDER BY col3, col1, col2)
)
)
OVER(ORDER BY col3, col1, col2
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS group_id
FROM sample_data
ORDER BY
col3, col1, col2
;
------------------------------------------------------------------------------

COL1 COL2 COL3 GROUP_ID
----------- ----------- ---------- -----------
1 2 11/16/2009 1
1 2 11/17/2009 1
2 300 11/18/2009 2
1 300 11/19/2009 3
1 2 11/20/2009 3

5 record(s) selected.

From: Tonkuma on
This is a default, then you can omit the clause.
/*
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
*/

Like this:
------------------------------ Commands Entered
------------------------------
WITH
/**************************************************
***** Start of sample data *****
**************************************************/
sample_data(COL1, COL2, COL3) AS (
VALUES
( 1, 2, '11/16/2009')
,( 1, 2, '11/17/2009')
,( 2, 300, '11/18/2009')
,( 1, 300, '11/19/2009')
,( 1, 2, '11/20/2009')
)
/**************************************************
***** End of sample data *****
**************************************************/
SELECT col1, col2, col3
, COUNT( NULLIF( col1
, LAG(col1)
OVER(ORDER BY col3, col1, col2)
)
)
OVER(ORDER BY col3, col1, col2)
AS group_id
FROM sample_data
ORDER BY
col3, col1, col2
;
------------------------------------------------------------------------------

COL1 COL2 COL3 GROUP_ID
----------- ----------- ---------- -----------
1 2 11/16/2009 1
1 2 11/17/2009 1
2 300 11/18/2009 2
1 300 11/19/2009 3
1 2 11/20/2009 3

5 record(s) selected.


From: Ian on
esmith2112 wrote:
>> Evan,
>>
>> What are the grouping columns? Also, are you looking for a trigger-
>> based solution, or just a query that would merely reflect the current
>> data?
>>
>> --Jeff
>
> The data from the first listing above is the output from another
> complicated business logic query it's been sorted by all three
> columns. The second listing (the 23 is actually a typo--should have
> been 2) is what is what my target results would ideally be. The only
> thing I have to go on is the value from COL1.
>
> Humanly I can see that rows one and two share the same value. They
> belong in the first group. The value row three is different from the
> previous two rows. Therefore it would be next logical grouping. Rows
> three and four share the same value and represent a change from row
> three. However, they do share the same value as rows one and two, but
> should NOT be grouped with those rows.
>
> I know I could write procedural code to iterate through the result
> set, summarizing when I see the row change. I was just hoping that
> there might be a construct in SQL using the OLAP functions that might
> be able to take the pain and need for external programming away from
> the equation.

It is *possible* to do this in SQL, but it may not be very efficient.
Here's a hint to get you started:

select
col1
,col2
,col3
,min(col1)
over (order by col3,col1,col2
rows between 1 preceding and 1 preceding) as prev_col1
from
your_table

The last expression will give you the value of col1 on the previous
row. You can use the to compare col1 with the previous row's value
of col1 to see if it's changing.

However, it would be far more efficient to do it in the application
that will be consuming the data. Alternatively, you could write an
external UDF that could do this more efficiently than SQL.

From: esmith2112 on
Thanks, Tonkuma (anata-wa ichi-ban SQL-no sensei desu!) and Ian. I've
learned multiple new concepts. Very useful!

We're in one of those binds, where it's less invasive to change the
code in a stored procedure and push that, rather than to have to file
the paperwork to open the application code and get permission to take
an outage to promote it. That's why doing it in the database was
preferable for the short term.


Regards,
Evan