From: Uri Dimant on
Haha, my mistake Bob....





"Bob Barrows" <reb01501(a)NOyahoo.SPAMcom> wrote in message
news:OvHll0tuKHA.4752(a)TK2MSFTNGP04.phx.gbl...
> The decimal point after the 2 in the divisor which forces the result to
> retain the fraction.
>
> Uri Dimant wrote:
>> Bob
>>> Change
>>> (MAX(year) + MIN(year)) * COUNT(*) / 2 g
>>> to
>>> (MAX(year) + MIN(year)) * COUNT(*) / 2. g
>>
>> Where is a difference in above?-))))
>>
>>
>>
>>
>>
>>
>> "Bob Barrows" <reb01501(a)NOyahoo.SPAMcom> wrote in message
>> news:en7DNrtuKHA.5316(a)TK2MSFTNGP05.phx.gbl...
>>> Change
>>> (MAX(year) + MIN(year)) * COUNT(*) / 2 g
>>> to
>>> (MAX(year) + MIN(year)) * COUNT(*) / 2. g
>>>
>>> and it will return the correct data.
>>>
>>> To make it return distinct staff_ids, just modify the select clause:
>>>
>>> WITH gauss AS (
>>> SELECT
>>> staff_id,
>>> SUM(year) s,
>>> (MAX(year) + MIN(year)) * COUNT(*) / 2. g
>>> FROM test GROUP BY staff_id
>>> )
>>> SELECT distinct t.staff_id FROM test t
>>> INNER JOIN gauss g
>>> ON g.staff_id = t.staff_id AND g.s <> g.g
>>>
>>>
>>>
>>>
>>> M.K wrote:
>>>> Hi Stefan,
>>>>
>>>> Thanks for your help. insert few more rows
>>>>
>>>> insert into TEST select 8,2003
>>>> insert into TEST select 8,2004
>>>> insert into TEST select 8,2006
>>>>
>>>>
>>>> Now there is no record for staff_id 8 for year 2005. I also want
>>>> this ID to added in my list.
>>>> It is not returning row to show that staff_id 8 is having a missing
>>>> year(i.e. 2005).
>>>> I don't want to get missing years but list of ID's with having any
>>>> year missing.
>>>> The ID with only one row (data for only one year) is not required to
>>>> be returned.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> "Stefan Hoffmann" <ste5an(a)ste5an.de> wrote in message
>>>> news:ur4sA#suKHA.800(a)TK2MSFTNGP04.phx.gbl...
>>>>> hi Uri,
>>>>>
>>>>> On 03.03.2010 10:25, Uri Dimant wrote:
>>>>>> SELECT [Staff_id],[Year] FROM
>>>>>> (
>>>>>> SELECT *, MAX([Staff_id])OVER ()mx
>>>>>> ,COUNT (*) OVER (PARTITION BY [Staff_id]) AS cn
>>>>>> FROM TEST[Staff_id]
>>>>>> )AS D WHERE cn<mx
>>>>> This does not work when adding more test data:
>>>>>
>>>>> insert into TEST select 1,2004
>>>>> insert into TEST select 10,2004
>>>>> insert into TEST select 11,2004
>>>>> insert into TEST select 12,2004
>>>>>
>>>>> The Gauss algorithm should work:
>>>>>
>>>>> WITH gauss AS (
>>>>> SELECT
>>>>> staff_id,
>>>>> SUM(year) s,
>>>>> (MAX(year) + MIN(year)) * COUNT(*) / 2 g
>>>>> FROM test GROUP BY staff_id
>>>>> )
>>>>> SELECT t.*
>>>>> FROM test t
>>>>> INNER JOIN gauss g
>>>>> ON g.staff_id = t.staff_id AND g.s <> g.g
>>>>>
>>>>>
>>>>> mfG
>>>>> --> stefan <--
>>>
>>> --
>>> HTH,
>>> Bob Barrows
>
> --
> HTH,
> Bob Barrows
>
>


From: Uri Dimant on
Nice one Stefan
I wonder which one would be faster on huge table? :-))))




"Stefan Hoffmann" <ste5an(a)ste5an.de> wrote in message
news:eVTHN0tuKHA.2436(a)TK2MSFTNGP04.phx.gbl...
> hi Bob,
>
> On 03.03.2010 15:07, Bob Barrows wrote:
>> Change
>> (MAX(year) + MIN(year)) * COUNT(*) / 2 g
>> to
>> (MAX(year) + MIN(year)) * COUNT(*) / 2. g
>>
>> and it will return the correct data.
> Of course, an implicit conversion to int happens.
>
>> To make it return distinct staff_ids, just modify the select clause:
> Or drop the CTE:
>
> SELECT staff_id
> FROM test
> GROUP BY staff_id
> HAVING (MAX(year) + MIN(year)) * COUNT(*) / 2.0 <> SUM(year);
>
>
> mfG
> --> stefan <--


From: Stefan Hoffmann on
hi Uri,

On 03.03.2010 15:33, Uri Dimant wrote:
> Nice one Stefan
> I wonder which one would be faster on huge table? :-))))
My statement uses one table scan as yours needs two, but on the other
side if the actual query plan is right, my HAVING clause requires a sort
which requires ~20 times more I/O.


mfG
--> stefan <--
From: Plamen Ratchev on
Here is another solution:

SELECT staff_id
FROM test
GROUP BY staff_id
HAVING MAX(year) - MIN(year) <> COUNT(DISTINCT year) - 1;

--
Plamen Ratchev
http://www.SQLStudio.com
From: Uri Dimant on
Stefan
SET STATISTICS IO on shows that my query is more expencive in terms of
logical reads, yes you are right the SORT operations is pretty costly in
that case





"Stefan Hoffmann" <ste5an(a)ste5an.de> wrote in message
news:uJV1LUuuKHA.4220(a)TK2MSFTNGP05.phx.gbl...
> hi Uri,
>
> On 03.03.2010 15:33, Uri Dimant wrote:
>> Nice one Stefan
>> I wonder which one would be faster on huge table? :-))))
> My statement uses one table scan as yours needs two, but on the other side
> if the actual query plan is right, my HAVING clause requires a sort which
> requires ~20 times more I/O.
>
>
> mfG
> --> stefan <--