From: Michel Esber on
Hello,

Db2 UDB v9.5 Fp3.

I need to return the set of rows that have the highest timestamp. This
is very easy:

drop table DATA;

create table DATA (sample timestamp, field1 integer);

insert into DATA values ('2010-01-18-14.00.00.000000', 1),
('2010-01-18-14.00.00.000000',2), ('2010-01-18-14.00.00.000000',3),
('2010-01-18-14.01.00.000000',4), ('2010-01-18-14.01.00.000000',5),
('2010-01-18-14.02.00.000000',6), ('2010-01-18-14.02.00.000000',7);

Easy query:

select * from DATA D,
(select MAX(SAMPLE) MAX_SAMPLE from DATA) as M
where D.SAMPLE = M.MAX_SAMPLE;

My source table has hundreds of Millions of rows and the access plan
shows two index scans.
Is there any magic trick to achieve the same result with only one
index scan?

Thanks in advance, -Michel
From: jefftyzzer on
On Jan 18, 9:59 am, Michel Esber <smes...(a)gmail.com> wrote:
> Hello,
>
> Db2 UDB v9.5 Fp3.
>
> I need to return the set of rows that have the highest timestamp. This
> is very easy:
>
> drop table DATA;
>
> create table DATA (sample timestamp, field1 integer);
>
> insert into DATA values ('2010-01-18-14.00.00.000000', 1),
> ('2010-01-18-14.00.00.000000',2), ('2010-01-18-14.00.00.000000',3),
> ('2010-01-18-14.01.00.000000',4), ('2010-01-18-14.01.00.000000',5),
> ('2010-01-18-14.02.00.000000',6), ('2010-01-18-14.02.00.000000',7);
>
> Easy query:
>
> select * from DATA D,
>     (select MAX(SAMPLE) MAX_SAMPLE from DATA) as M
> where D.SAMPLE = M.MAX_SAMPLE;
>
> My source table has hundreds of Millions of rows and the access plan
> shows two index scans.
> Is there any magic trick to achieve the same result with only one
> index scan?
>
> Thanks in advance, -Michel

SELECT * FROM DATA ORDER BY SAMPLE DESC FETCH FIRST 1 ROW ONLY;

Ideally your index is on (or leads with) the SAMPLE column and is
DESCending or is set to ALLOW REVERSE SCANS

--Jeff
From: Lennart on
On 18 Jan, 18:59, Michel Esber <smes...(a)gmail.com> wrote:
> Hello,
>
> Db2 UDB v9.5 Fp3.
>
> I need to return the set of rows that have the highest timestamp. This
> is very easy:
>
> drop table DATA;
>
> create table DATA (sample timestamp, field1 integer);
>
> insert into DATA values ('2010-01-18-14.00.00.000000', 1),
> ('2010-01-18-14.00.00.000000',2), ('2010-01-18-14.00.00.000000',3),
> ('2010-01-18-14.01.00.000000',4), ('2010-01-18-14.01.00.000000',5),
> ('2010-01-18-14.02.00.000000',6), ('2010-01-18-14.02.00.000000',7);
>
> Easy query:
>
> select * from DATA D,
>     (select MAX(SAMPLE) MAX_SAMPLE from DATA) as M
> where D.SAMPLE = M.MAX_SAMPLE;
>
> My source table has hundreds of Millions of rows and the access plan
> shows two index scans.
> Is there any magic trick to achieve the same result with only one
> index scan?
>
> Thanks in advance, -Michel

Two suggestions:

1)
select sample, field1 from (
select sample, field1, row_number() over (order by sample desc) as
rn
from data
) where rn = 1;

2)
select * from data
order by sample desc
fetch first 1 rows only;

From: jefftyzzer on
On Jan 18, 9:59 am, Michel Esber <smes...(a)gmail.com> wrote:
> Hello,
>
> Db2 UDB v9.5 Fp3.
>
> I need to return the set of rows that have the highest timestamp. This
> is very easy:
>
> drop table DATA;
>
> create table DATA (sample timestamp, field1 integer);
>
> insert into DATA values ('2010-01-18-14.00.00.000000', 1),
> ('2010-01-18-14.00.00.000000',2), ('2010-01-18-14.00.00.000000',3),
> ('2010-01-18-14.01.00.000000',4), ('2010-01-18-14.01.00.000000',5),
> ('2010-01-18-14.02.00.000000',6), ('2010-01-18-14.02.00.000000',7);
>
> Easy query:
>
> select * from DATA D,
>     (select MAX(SAMPLE) MAX_SAMPLE from DATA) as M
> where D.SAMPLE = M.MAX_SAMPLE;
>
> My source table has hundreds of Millions of rows and the access plan
> shows two index scans.
> Is there any magic trick to achieve the same result with only one
> index scan?
>
> Thanks in advance, -Michel

This is untested, but if you at least know that your result set will
never be more than n rows (say 100), perhaps you could do this:

SELECT * FROM DATA ORDER BY SAMPLE DESC FETCH FIRST 100 ROWS ONLY;

Ideally your index is on (or leads with) the SAMPLE column and is
DESCending or is set to ALLOW REVERSE SCANS

--Jeff
From: jefftyzzer on
On Jan 18, 9:59 am, Michel Esber <smes...(a)gmail.com> wrote:
> Hello,
>
> Db2 UDB v9.5 Fp3.
>
> I need to return the set of rows that have the highest timestamp. This
> is very easy:
>
> drop table DATA;
>
> create table DATA (sample timestamp, field1 integer);
>
> insert into DATA values ('2010-01-18-14.00.00.000000', 1),
> ('2010-01-18-14.00.00.000000',2), ('2010-01-18-14.00.00.000000',3),
> ('2010-01-18-14.01.00.000000',4), ('2010-01-18-14.01.00.000000',5),
> ('2010-01-18-14.02.00.000000',6), ('2010-01-18-14.02.00.000000',7);
>
> Easy query:
>
> select * from DATA D,
>     (select MAX(SAMPLE) MAX_SAMPLE from DATA) as M
> where D.SAMPLE = M.MAX_SAMPLE;
>
> My source table has hundreds of Millions of rows and the access plan
> shows two index scans.
> Is there any magic trick to achieve the same result with only one
> index scan?
>
> Thanks in advance, -Michel

I'm assuming a FETCH FIRST 1 ROW-type query won't work because you
said you need to return the SET of rows. I've not tested this query,
but perhaps it might give you what you need:

SELECT
A,
B,
C,
SAMPLE,
MAX(SAMPLE) OVER() M_SAMPLE
FROM
DATA
WHERE
SAMPLE = M_SAMPLE;

--Jeff