Distinct issues [Programming SQL]

Prev: need help with query
Next: Sum and aggregations

From: tshad on 20 Apr 2010 17:59

We have procedures that have many DISTINCTs in them.

It appears that the original programmer just decided to put DISTINCT on
everything.

I can't just take them off since there may be a good reason for it to be one
or more the queries.

Is there an easy way to check if they are needed or not?

Thanks,

Tom

From: Plamen Ratchev on 20 Apr 2010 18:12

Run the queries without DISTINCT and see if you get the same result set (no duplicate rows). In general using DISTINCT
indicates some problems with data (or design) resulting in duplication.

--
Plamen Ratchev
http://www.SQLStudio.com

From: tshad on 20 Apr 2010 20:08

"Plamen Ratchev" <Plamen(a)SQLStudio.com> wrote in message
news:yL-dnW2whepXu1PWnZ2dnUVZ_qKmnZ2d(a)speakeasy.net...
> Run the queries without DISTINCT and see if you get the same result set
> (no duplicate rows). In general using DISTINCT indicates some problems
> with data (or design) resulting in duplication.
>
I agree.

But the problem is that one SP has about 7 queries with DISTINCT in it and
running the query with one set of parameters may have unique rows and
another may have one duplicate row. But no way to knowing whether this is
the case.

Normally, I would think you would add the DISTINCT on a query because you
ran into a query that gave you duplicate results.

But in this case, it looks like the programmer just will nilly added it to
all the queries. And this is done in a bunch of programs.

Thanks,

Tom

> --
> Plamen Ratchev
> http://www.SQLStudio.com

From: Plamen Ratchev on 20 Apr 2010 21:22

You can analyze the queries and the underlaying tables. In most cases based on joins, knowledge of the data,
primary/unique constraints, and predicates you can tell if the query can produce duplicate rows in the result set.

--
Plamen Ratchev
http://www.SQLStudio.com

From: --CELKO-- on 20 Apr 2010 23:10

>> It appears that the original programmer just decided to put DISTINCT on everything. <<

That is often a sign that the original programmer was a code generator
and not a human being. This was the safest way to do things when
people used to use IDENTITY as a key instead of the actual key for
tables in the early days.

You are probably screwed. Each query will have to be examined by hand
for keys in the results.

But quick trick is to look for a SELECT DISTINCT with a GROUP BY that
has the same columns so you can drop the DISTINCT. Generated code was
that awful.

| Next | Last
Pages: 1 2
Prev: need help with query
Next: Sum and aggregations