what is a phantom read ? [Java Programming]

Prev: consuming the web service
Next: Need help designing some JUnit tests

From: gk on 20 May 2010 08:50

Here is a Phantom read example I read:

/* Query 1 */

SELECT * FROM users
WHERE age BETWEEN 10 AND 30;

return 2 records.

/* Query 2 */
INSERT INTO users VALUES ( 3, 'Bob', 27 );
COMMIT;

/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;

return 3 records.

see It seems to me normal . I don't understand why they are called
'Phantom read' . Look , all transactions are happening in different
time , so we are getting the latest data always. Initially there
were 2 records , Later on , 1 record inserted ..so when we run Query 1
again, we get the updated data i.e 3 records.

So,what is wrong here ? what to be worried here ? why its called
phantom read ?

From: Jeff Higgins on 20 May 2010 09:14

On 5/20/2010 8:50 AM, gk wrote:
> Here is a Phantom read example I read:
>
> /* Query 1 */
>
> SELECT * FROM users
> WHERE age BETWEEN 10 AND 30;
>
> return 2 records.
>
>
>
> /* Query 2 */
> INSERT INTO users VALUES ( 3, 'Bob', 27 );
> COMMIT;
>
>
>
> /* Query 1 */
> SELECT * FROM users
> WHERE age BETWEEN 10 AND 30;
>
>
> return 3 records.
>
>
>
> see It seems to me normal . I don't understand why they are called
> 'Phantom read' . Look , all transactions are happening in different
> time , so we are getting the latest data always.

What has time got to do with it?

Initially there
> were 2 records , Later on , 1 record inserted ..so when we run Query 1
> again, we get the updated data i.e 3 records.
>
> So,what is wrong here ? what to be worried here ? why its called
> phantom read ?

Did you read the entire article?
<http://en.wikipedia.org/wiki/Isolation_(database_systems)>

From: gk on 20 May 2010 09:35

On May 20, 6:14 pm, Jeff Higgins <oohigg...(a)yahoo.com> wrote:
> On 5/20/2010 8:50 AM, gk wrote:
>
>
>
>
>
> > Here is a Phantom read example I read:
>
> > /* Query 1 */
>
> > SELECT * FROM users
> > WHERE age BETWEEN 10 AND 30;
>
> > return 2 records.
>
> > /* Query 2 */
> > INSERT INTO users VALUES ( 3, 'Bob', 27 );
> > COMMIT;
>
> > /* Query 1 */
> > SELECT * FROM users
> > WHERE age BETWEEN 10 AND 30;
>
> > return 3 records.
>
> > see It seems to me normal . I don't understand why they are called
> > 'Phantom read' . Look , all transactions are happening in different
> > time , so we are getting the latest data always.
>
> What has time got to do with it?
>
> Initially there
>
> > were 2 records , Later on , 1 record inserted ..so when we run Query 1
> > again, we get the updated data i.e 3 records.
>
> > So,what is wrong here ? what to be worried here ? why its called
> > phantom read ?
>
> Did you read the entire article?
> <http://en.wikipedia.org/wiki/Isolation_(database_systems)>- Hide quoted text -
>
> - Show quoted text -

it seems to me perfectly normal . Do you see any trouble in this
scenario ? I don't understand where is the trouble yet . why the
trouble will come up ?

From: Lew on 20 May 2010 09:59

gk wrote:
>>> Here is a Phantom read example I read:
>>
>>> /* Query 1 */
>>
>>> SELECT * FROM users
>>> WHERE age BETWEEN 10 AND 30;
>>
>>> return 2 records.
>>
>>> /* Query 2 */
>>> INSERT INTO users VALUES ( 3, 'Bob', 27 );
>>> COMMIT;
>>
>>> /* Query 1 */
>>> SELECT * FROM users
>>> WHERE age BETWEEN 10 AND 30;
>>
>>> return 3 records.
>>
>>> see It seems to me normal . I don't understand why they are called
>>> 'Phantom read' . Look , all transactions are happening in different
>>> time , so we are getting the latest data always.

Jeff Higgins wrote:
>> What has time got to do with it?

Pay attention here, gk!

gk wrote:
>> Initially there
>>
>>> were 2 records , Later on , 1 record inserted ..so when we run Query 1
>>> again, we get the updated data i.e 3 records.
>>
>>> So,what is wrong here ? what to be worried here ? why its called
>>> phantom read ?

Jeff Higgins wrote:
>> Did you read the entire article?
>> <http://en.wikipedia.org/wiki/Isolation_(database_systems)>

The key word here is "entire". You might otherwise miss
"Note that *transaction 1 executed the same query twice*. [emph. orig.] If the
highest level of isolation were maintained, the same set of rows should be
returned both times, and indeed that is what is mandated to occur in a
database operating at the SQL SERIALIZABLE isolation level. However, at the
lesser isolation levels, a different set of rows may be returned the second time."

The fact that the same query returns different results at lesser levels means
that one or both results are "phantoms" - not the real answer.

> it seems to me perfectly normal . Do you see any trouble in this
> scenario ? I don't understand where is the trouble yet . why the
> trouble will come up ?

The part you're missing is that the two queries occur *inside the same
transaction*. That's *inside the same transaction*. It's the fact that it's
the *same* transaction getting different results that makes it a "problem".
If the isolation level is low, then the transaction is not isolated (get it?)
from the effects of the other transaction. Were the two queries in different
transactions the isolation level would be irrelevant, but they're in the same
transaction.

This is not to say you always need repeatable-read isolation, but when you do,
phantom reads are a "problem".

Why might you need repeatable read? Well, if you're building intermediate
results, say bringing in a set of rows to process, you could get bizarre
results if that set changes while the transaction progresses. It's sort of
like a 'ConcurrentModificationException' in the collections classes. You
can't build a house on shifting sands.

If you could, you wouldn't bother putting the multiple queries in the *same
transaction*.

--
Lew

From: Lew on 20 May 2010 10:03

Lew wrote:
> This is not to say you always need repeatable-read isolation, but when
> you do, phantom reads are a "problem".

s/repeatable[- ]read/serializable/g

--
Lew

| Next | Last
Pages: 1 2 3 4 5
Prev: consuming the web service
Next: Need help designing some JUnit tests