Database development [Java Programming]

From: Tom Anderson on 26 Apr 2010 15:41

On Mon, 26 Apr 2010, Martin Gregorie wrote:

> On Mon, 26 Apr 2010 13:37:46 +0200, Zlatko Duric wrote:
>
>> Well, my objects are documents - those are "nodes". But so are the
>> "folders" holding the documents. And so are their parents. And I want,
>> for example, all the docs that have the keyword FOO and their parent is
>> "Reports".
>
> Thats what additional indexes are for. For example, to support the
> following example you'd want indexes on doc.keyword and folder.name,
> since they'll probably be used often for data selection and sorting.
> We'll assume that the DB designer was sensible and put indexes on both
> table's prime keys. Your example requirement could be satisfied with a
> single query, which would involve a prime key join and use the additional
> indexes to select the rows to be included in the dataset. Something like
> this:
>
> SELECT required fields
> from folder f, document d
> where f.key = d.folderkey
> and d.keyword = 'FOO'
> and f.name = 'Reports';
>
> Any decent RDBMS should be able to optimise that type of query and would
> return just the required result set.

I suspect documents may have more than one keyword, in which case your
query might look like:

SELECT required fields
from folder f, document d
where f.key = d.folderkey
and d.keyword LIKE '%FOO%' -- if keywords are packed space-separated into one column (bad idea)
and f.name = 'Reports';

Or:

SELECT required fields
from folder f, document d, keyword k
where f.key = d.folderkey and d.key = k.documentkey
and k.word = 'FOO' -- if keyword is a relation (documentkey, word)
and f.name = 'Reports';

Or even:

SELECT required fields
from folder f, document d, document_keywords dk, keyword k
where f.key = d.folderkey and d.key = dk.documentkey and dk.keywordkey = k.key
and k.word = 'FOO' -- if keywords are first-class and there is a join table (fully normalised but overcomplicated)
and f.name = 'Reports';

>> That's messy - iterate through all the folders to find the stuff I need.
>
> Why would you need to do that? A correctly written query will only
> return the data you need, present it in the order you want and not slow
> database updates down by requiring additional reference tables which
> must be maintained.

I think the OP is saying he would have to navigate in that way if he was
using ORM rather than SQL. That still isn't true, though, because in JPA
you have JP-QL, which is more or less isomorphic to SQL, and in fact
simpler, because the mapping of relationships to properties makes joins
easier to express:

select doc from Document doc
where
'FOO' in doc.keywords
and doc.folder.name = 'Reports'

(i think)

tom

--
Everybody with a heart votes love

From: Lew on 26 Apr 2010 15:48

Tom Anderson wrote:
> I think the OP is saying he would have to navigate in that way if he was
> using ORM rather than SQL. That still isn't true, though, because in JPA
> you have JP-QL, which is more or less isomorphic to SQL, and in fact
> simpler, because the mapping of relationships to properties makes joins
> easier to express:
>
> select doc from Document doc
> where
> 'FOO' in doc.keywords
> and doc.folder.name = 'Reports'
>
> (i [sic] think)
>

What you wrote looks correct, but I am fairly certain it requires
'keywords' to be expressed as a collection (probably a 'Set'), which
is best done if the keywords are in their own table, which they should
be anyway. I don't think it works with the space-separated list of
keywords as in your first of three examples.

Your second example, as you hint, is probably optimal and the third
overkill.

--
Lew

From: Robert Klemme on 26 Apr 2010 16:14

On 04/26/2010 02:14 AM, Arne Vajhøj wrote:

> I don't think the big benefits of ORM (Hibernate or JPA or one of the
> alternatives) are in the writing of the code. It still requires
> somebody that knows both the ORM framework and the database well
> to write really efficient code.
>
> The big benefits are for reading the code. Everyone can read the
> the code using ORM and immediately understand what it does without
> looking at tons of code that uses JDBC and SQL. It is maintenance
> friendly.

So you are saying that I need skilled people to write the initial code
and can give maintenance to less skilled people because the ORM using
code is easy to read? I am not sure that is a good strategy. Over time
software tends to decay because more and more bug fixes are applied and
features added. If only the people knew internals of ORM that wrote the
initial code I see a good chance that maintainers wreck havoc on the
performance and potentially the whole application if they change /
extend the easy readable code without knowing the tool they are using.
Even a change as seemingly simple as that of a field type from "int" to
"String" might have dramatic consequences. And just think of the woes
of schema migration: if you have an installed base you urgently need
someone who understands the DB underneath and the ORM tool to come up
with a feasible migration strategy that.

Btw, did I mention that I believe database independence is a myth? :-)

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

From: Tom Anderson on 26 Apr 2010 16:26

On Mon, 26 Apr 2010, Lew wrote:

> Tom Anderson wrote:
>
>> I think the OP is saying he would have to navigate in that way if he
>> was using ORM rather than SQL. That still isn't true, though, because
>> in JPA you have JP-QL, which is more or less isomorphic to SQL, and in
>> fact simpler, because the mapping of relationships to properties makes
>> joins easier to express:
>>
>> select doc from Document doc
>> where
>> � 'FOO' in doc.keywords
>> � and doc.folder.name = 'Reports'
>>
>> (i [sic] think)
>
> What you wrote looks correct, but I am fairly certain it requires
> 'keywords' to be expressed as a collection (probably a 'Set'), which is
> best done if the keywords are in their own table, which they should be
> anyway. I don't think it works with the space-separated list of
> keywords as in your first of three examples.

I believe you're absolutely right - sorry, i should have been clearer on
what context that query required.

> Your second example, as you hint, is probably optimal and the third
> overkill.

Probably. The advantage of the fully normal design is that queries like:

"find me all the documents with keywords with 'brew' in them"

are fast, because doing that in any schema involves a LIKE over the
keyword text, which more or less means a table scan, and the keyword table
in that design is smaller than in the second one (proportional to the
number of distinct keywords, not the sum of the number of keywords over
all documents), and much smaller than in the first one, where it's the
whole table.

But then, if you have full-text indexing, you can search the packed string
without doing a table scan. Although that's not necessarily fast enough:

http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html

But then, you can full-text index the more normal schemas too:

http://www.opensymphony.com/compass/content/about.html

Er, so, yeah, anyway, glad to have entirely eliminated the OP's confusion,
i'm sure.

tom

--
Mass motoring effects an absolute triumph of bourgeois ideology on the
level of daily life. It gives and supports in everyone the illusion
that each individual can seek his or her own benefit at the expense of
everyone else. -- Andre Gorz

From: Robert Klemme on 26 Apr 2010 16:31

On 04/26/2010 12:48 PM, Lew wrote:
>
> junw2000(a)gmail.com says...
>>> When I work on database development projects, I use JDBC and SQL. Many
>>> people use hibernate/spring. Can somebody explain the pros and cons of
>>> using JDBC and SQL vs using hibernate/spring on database
>>> developments?
>
> Pitch wrote:
>> I always believed that ORM systems are forcing you to write your own
>> business-rules layer apart from the persistence layer. That way
>> database access is kept simple and easy mantainable.
>
> ORM doesn't force business rules into a separate layer and raw JDBC
> calls don't force them into the same layer as persistence.

Exactly. I believe there is a fundamental dilemma that I haven't seen a
satisfying solution to: with a relational database and an object
oriented (or not) application which implements the business logic you
automatically have a distribution of business rules between several tiers.

If you manage to place all business consistency rules into the database
(which is often impossible because either of limitations of the DB or
complexity of the model) you leave very little for the application layer
(mostly presentation) so you might wonder why not directly implement all
the business logic in PL/SQL or T-SQL (just to name two well known
brands). Advantage is that you cannot break the model if you need to do
changes in the DB (this can happen for migration, repair or other one
off tasks).

If you place all the rules in the application consequently you would
have to even get rid of foreign keys. Downside is of course that you
now have zero consistency enforcement for the data model in the DB (e.g.
during all those tasks mentioned above) and you are only using 10% of a
potentially expensive installation (in the case of a commercial RDBMS).

In reality I have often seen a mix between the two approaches: some
consistency checking (FK, PK, CHECK constraints, triggers) is done in
the database and the "rest" of the business logic lives in the
application tier. This may actually be the worst approach: it's not
only that you don't have a single place where the business model is
consistently defined and enforced - that is merely a violation of some
form of purity rule (which _does_ have its advantages). But this might
also make people feel safe when they change the database while breaking
business rules that live elsewhere...

We probably should get rid of persistence altogether - maybe Alzheimer
DB or WORN is the future. :-)

>> Also, this multi-tier architecture allows for easier load-balancing,
>> architecture changes, integration with other systems, development..

Pitch, you can have multi tier with JDBC and ORM - you can even have
multi tier without persistence altogether.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12
Prev: Null pointer issues
Next: Urgent