How to do...well...anything...in DB2 SQL [DB2]

Prev: SQL0901N Error.
Next: Convert DECIMAL to DATE

From: Ian Boyd on 9 Mar 2006 13:57

> Which tool are you using to run your SQL?
> I guess I'm missing the overall picture, did you post it somewhere in
> this thread?

i'm using a variety of tools in a variety of environments. None of them
involve a command line and runnnig saved files containing SQL statements.
Most involve writing a query in an editor of some sort and running it. Or
then taking that query and embedding it into code. And we prefer our queries
to be nicely formatted and (ideally) commented.

It is safe to say that our tools are Windows desktop machines, Windows
applications, Windows services, Websphere Java Servlets, editors that let us
type in and run SQL.

From: Ian Boyd on 9 Mar 2006 14:19

> I think Ian might have written this before discovering (in a separate
> note somewhere in this thread) that something on the client side is
> stripping line breaks causing problems with line comments.

It is important to note that it is either inside IBM's ODBC driver or IBM's
OLEDB provider. When we use the same tools to send queries to Microsoft's
SQL Server using either Microsoft's ODBC driver or Microsoft's OLEDB
provider, carriage returns and are put onto the ethernet intact. So it is
something to do with DB2. Maybe not the server side, but it's still IBM's
solution. And so, with IBM's solution i generally cannot use comments.

It really is irrelavant if DB2 is choking on it because it received mangled
SQL, IBMs drivers did it. They are DB2 drivers. Hence DB2 did it. If the
roles were reversed, it would be an SQL Server problem.

>> > 8. You cannot give DB2 some arbitrary SQL to run. You are only
>> > allowed to give DB2 one "statement" at a time. If you try to give
>> > it more than one statement, it will choke. Examples of statements
>> > include CREATE TABLE, CREATE TRIGGER, CREATE PROCEDURE, DECLARE
>> > CURSOR, CALL (which executes a stored procedure).
>>
>> Not true. You have to separate the statements with the statement
>> terminator.
>
> Actually, Ian is correct in a limited sense here. While it is certainly
> true that one can write a script containing several statements
> (separated by whatever means one wishes) and give this to the CLP for
> execution, this still doesn't mean you can execute multiple
> statements...

Yes. The tools can do me a favor and chop up a bunch of SQL into chunks that
DB2 can manage. It would be much nicer if DB2 could manage all the code at
once. And as we've already seen in other database products, there is no
technical limitation to doing so - just a design one.

So change the design. People can still be free to only send one statement to
DB2 at a time. But now DB2 can also be powerful and intelligent enough to
handle more than one statement.

>Quoting from Microsoft's own documentation of the SQLExecDirect call in
>the ODBC API:
> [snip]
> No mention of multiple statements there.

From the Microsoft Books Online:
<quote>
Batches
A batch is a group of one or more Transact-SQL statements sent at one time
from an application to Microsoft? SQL Server? for execution. SQL Server
compiles the statements of a batch into a single executable unit, called an
execution plan. The statements in the execution plan are then executed one
at a time.
....
Assume there are 10 statements in a batch. If the fifth statement has a
syntax error, none of the statements in the batch are executed. If the batch
is compiled, and the second statement then fails while executing, the
results of the first statement are not affected because it has already
executed.
</quote>

Here is a phrase that i really take issue with:
> If one can do this with MS SQL
> Server, it's certainly non-standard behaviour, and shouldn't be
> relied upon to be implemented by other databases.

SQL Server is easier to *use* because of it. DB2 is free to keep their
product harder to use.
Pretend it didn't work in SQL Server. Pretend that this wasn't the way it
works. Why shouldn't IBM innovate and include such a feature? Why not look
for reasons to DO new features, rather than looking for reasons NOT to do
new features.

An overarching theme i'm trying to express is usability. Make things more
usable, rather than less. Just because that's they way it is doesn't mean
that's the way it has to be. There are things that SQL Server does that are
completely self-consisent with their own little world of rules - but they're
just stupid. Just make it work! We all know what you were trying to do. Why
don't you do it? Is it a technical limitation? Fix it! Is it new idea that
you can't stick into your model of how the product is supposed to work?
Maybe the model is broken, maybe you need to extend it!

And it's many many things. It's not just defaults on columns. It's not just
implicitly casting. It's not just being able to select a scalar. It's a lot
of things.

Another very very very common example that people ask over and over and over
and OVER in the Microsoft newsgroups:

"How come i can't do:

SET MyVariable = CALL MyStoredProcedure

when my procedure only returns one row and one column?"

We all know what the person was trying to do. We all KNOW it. So, now before
going into any arguments about why what he is doing is wrong, you have to
stop and choose: Are we going to try to helpful, or are we going to be a
hinderance?

Do you want to make that syntax work, but you can't for technical reasons?
Fix them!
Are you going to refuse to allow that syntax on moral grounds (i.e. Celko)?
You can do that. But perhaps another database vendor will incorporate that
functionality into their system, and theirs will become easier to use.

i realize DB2 is trying to follow the standards, and standards are a good
thing. But please don't argue with your users telling them they need to do
things the hard way because it's "standard" or because "we've decided how
you should do things." Innovate!

From: Dave Hughes on 9 Mar 2006 14:48

Ian Boyd wrote:

> > > - IBM Command Editor (db2ce.bat) with it's "Statement termination
> > > character" changed to =
> >
> > Hmmm, using = as a statement terminator is probably a bad idea
> > (given the ambiguity).
>
> Sorry, sorry, sorry. That didn't translate through the news server
> properly. The character i changed it to was the "Vulgar Fraction
> One Half" U+00BD
> Alt+0189 on the keyboard if you're using Windows
> "=" <-- It shows up as 1/2 in Outlook Express's editor :)

Argh! I was scratching my head wondering how anyone could be so crazy
as to use equals as a statement terminator, but from experience I
should have known better (being in the UK, I'm used to mail and news
servers translating the British pound sign into =3A or just =)!

> > Hmmm ... you shouldn't need those CASTs around the NULLs, not in an
> > INSERT statement anyway. Let me just try it:
>
> You do - sometimes.
>
> Try something of the form:
> INSERT INTO TestTable (Name, Address, Phone)
> SELECT fname, addr1, NULL FROM Customers

Damn, I'd forgotten about that particular construction.

[snip]
> Now in this case, DB2 actually can infer the type - it just isn't
> looking far enough ahead. So rather than risk it sometimes working
> and sometimes not, i will just adhere to the rule that you should
> always do it. It's just safer that way.

Fair point ... probably a better strategy for someone new to DB2 and
just wishing to "get on with it".

> > In this case it's because a NULL on its own has no datatype, and
> > each column in a query result must have a datatype. Ergo, the NULL
> > must be cast to some datatype in this particular case.
>
> No reason DB2 can't just pick a type. If the extent of my statement
> was: SELECT NULL AS SomeNullValue FROM sysibm.sysdummy1
>
> Then just go ahead and make it an integer. Nobody's gonna care.

Pick an arbitrary data type... Certainly a viable option, but I'm
beginning to wonder...

> > Incidentally, this often causes confusion with the set operators
> > (UNION, INTERSECT and EXCEPT). For example (again using the TEST
> > table from above):
> >
> >db2 => SELECT A, B, C FROM TEST
> >db2 (cont.) => UNION
> >db2 (cont.) => SELECT NULL, B, C FROM TEST;
> > SQL0206N "NULL" is not valid in the context where it is used.
> > SQLSTATE=42703
>
> > One could argue that, in this case DB2 ought to be able to figure
> > out that the NULL in the query on the right-hand side of the UNION
> > should be implicitly cast to an INTEGER as that is the datatype of
> > the first column in the query on the left-hand side of the UNION.
>
> Yes, one would :)

Yes, there's that idea again: "infer the type".

<tangent>

The more I look at SQL (ignoring the hybrid stuff like stored
procedures, triggers, and dynamic compound statements), the more I'm
convinced it's just a set-based pure functional programming language in
disguise (the disguise being a truly bizarre syntax that wraps a whole
bunch of operations up in a single expression called SELECT).

The type inferencing mentioned above is exactly what certain pure
functional languages like ML do. Which makes me wonder, in the case of
picking an arbitrary datatype (as mentioned above), whether it wouldn't
be a good idea to steal an idea from functional programming (e.g.
permit a column in an output set with "arbitrary" type). Yeah, I know
.... crazy talk ...

</tangent>

Now returning to our scheduled program...

[snip another type inference example]
>
> > Hmm, I'd say there's very little in DB2 that's non-sensical, and
> > generally I do find there is a philosophy behind DB2's way of doing
> > things. But as I mentioned in another post, I'm beginning to
> > understand just how alien it must seem when "switching
> > philosophies" so to speak.
>
> It's not so much switching that is a problem, or maybe it is. Maybe
> it is the design standard itself that is weird. But there are things
> that "work there", but "don't work there". And often-times the answer
> as to why it behaves that way is: "that's the standard" "by
> design"
>
> But as a human using the system there are things that just shouldn't
> be that way. If the standard says it, maybe the standard needs to be
> revisited. If it's a technical limitation, then it needs to be
> overcome. If what i want makes no logical sense, then there will be a
> logical reason why. But if it turns out that "i want to do this, i
> can't think of any reason why your product can't shouldn't do it."
> "Well, we don't, and we have no plans to do it that way." "Then
> can you point to me to a competitors product that will do this?"
>
> On some level, IBM is writing software to be used by developers. It
> might not be a bad idea to make their jobs easier, rather than harder.

Again, fair point.

>
> > The VALUES expression, to me, is a good example of "DB2's
> > philosophy". A number of other databases use
> >
> > SELECT <value>, <value>, ...
> >
> > as a way of generating a row on an adhoc basis. However, it's
> > unclear with this syntax how one could generate an adhoc table.
>
> Yes, i agree. VALUES in an expression that has no equivalent in SQL
> Server. And IBM has added value to their product with this
> innovation. And other RDBMs would do well to steal the idea :)

Actually, I'm not entirely sure VALUES is a DB2 "innovation" ... it
could just be standard SQL that DB2's implemented and which other
databases have ignored. Anyone know for sure?

> > As someone
> > else mentioned, in SQL "everything is a table" (a single row is
> > just a special case of a table, and a single value is another such
> > special case).
>
> Not everything is a table. USER is a special register. There are
> plenty of "special registers". i guess i would need to ask, since i
> am speaking without knowing... Is the following valid:
>
> ALTER TABLE MyTable ALTER COLUMN Tag SET WITH DEFAULT asdfasdf ;
> how about
> ALTER TABLE SUPERDUDE.AUDIT_LOG ALTER COLUMN TAG SET WITH DEFAULT
> USER ; how about
> ALTER TABLE SUPERDUDE.AUDIT_LOG ALTER COLUMN TAG SET WITH DEFAULT
> (select username from employees fetch first 1 rows only) ;

Granted, I should qualify that over-generalizing statement: "In SQL
everything is a table ... except when it's not" :-)

In general, SQL could be said to have three "meta-types": scalars,
tuples, and sets of tuples (well, strictly speaking, bags of tuples as
one can have a table without a unique key, but we'll stick with "set"
as per convention). However, a scalar is still a special case of a
one-element tuple, and a tuple a special case of a one-element set.

Which is not to say that one can use a set of tuples anywhere a scalar
is required, but you can still see how a scalar is a "special case" of
a set of tuples.

Anyway, to the above statements. The first two are fine (with the
exception that the constant asdfasdf ought to be quoted). Column
defaults must be scalar, and must either evaluate to a constant or use
a special register as their only variable component, and both
expressions meet these criteria. (I recall from a prior post that not
all special registers, i.e. CURRENT CLIENT_APPLNAME etc., can currently
be used in this manner, however that turned out to be because of a lack
of implementation rather than any fundamental technical or theoretical
restriction).

The third statement, however, is problematic. It's not going to work
because it doesn't evaluate to a constant or special register. That's
the primary reason it won't work, but there's another aspect to it that
begs comment:

FETCH FIRST n ROWS is an oft abused modifier. As I understand it
(hopefully, someone more knowledgeable will correct me if I'm wrong
about this), FETCH FIRST n ROWS is meant to be used to grab a quick
sample of data from a potentially long running query.

For example, if you've put together a query that might take a *long*
time to run, and you'd just like to check a sample of the output, you
add FETCH FIRST n ROWS. I don't believe it's meant to be used in the
sense of limiting a query to a single row, or the top n results, or
whatever. This is simply because a table (theoretically at least) has
no intrinsic order and therefore without an ORDER BY clause in the
query one cannot guarantee the result is deterministic. Even with an
ORDER BY clause, this still wouldn't be the "right" to perform these
tasks.

The right way of limiting a query result to one row would be to ensure
that the WHERE clause specifies enough conditions to guarantee a unique
match in the table (i.e. by limiting on all columns of a unique key).

Likewise, the right way of grabbing the top n results would be to use
the RANK() or DENSE_RANK() functions. Unfortunately, the documentation
for the OLAP functions (of which RANK and DENSE_RANK are two) seems to
have been moved somewhere obscure in the Info Center (they're now all
buried in the "Reference / SQL / Language elements / Expressions"
subject). Just search for DENSE_RANK and you should find them easily
enough.

[snip]
> It's difficult to be frustrated and pleasent at the same time;
> especially when i also turn around and vent off to colleagues here so
> easily :)

I recommend several therapeutic sessions of fragging friends and
colleagues in Quake III (honestly, why this isn't considered mandatory
for the mental well-being of office workers is beyond me :-)

Dave.

--

From: Dave Hughes on 9 Mar 2006 14:56

Ian Boyd wrote:

> > And the query changes per application,
> > so no problem.
>
> Not when a query is the query was written in one application and used
> by another.
>
> > If the query is in a stored PROCEDURE, and outputs a
> > result set, it is best to use the VALUES inside a SELECT statement
> > as Knut pointed out.
>
> How do you give names to the fields returned from a SELECT of a
> VALUES table?

I'm pretty sure I've provided the syntax for this in a couple of other
comments, but maybe I didn't indicate it explicitly:

SELECT COLA, COLB, COLC
FROM (
VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9)
) AS TEMP(COLA, COLB, COLC)

The "AS" bit is mandatory and provides a name for the table produced by
VALUES. The field aliases in brackets after AS TEMP are optional, but
allow you to assign names to the columns in the VALUES table, which can
then be referenced in the enclosing SELECT statement as above. Though I
could just as easily have done:

SELECT *
FROM (
VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9)
) AS TEMP(COLA, COLB, COLC)

HTH,

Dave.

--

From: Knut Stolze on 9 Mar 2006 15:17

Dave Hughes wrote:

> The third statement, however, is problematic. It's not going to work
> because it doesn't evaluate to a constant or special register. That's
> the primary reason it won't work,

Actually, standardized SQL allows check constraints to contain sub-selects.
None of the products I know of actually implemented this, however. The
issue is probably (a) starting table scans on an insert is usually not such
a good idea, and (b) if it is really necessary, one could use triggers. So
it boils down to question we always have with new functionality: Who's
gonna pay for it and which feature should be moved back instead?

--
Knut Stolze
DB2 Information Integration Development
IBM Germany

First | Prev | Next | Last
Pages: 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
Prev: SQL0901N Error.
Next: Convert DECIMAL to DATE