JPA slow on remote database queryes [Java Programming]

Prev: Let the Experts to solve all your queries
Next: environment variables

From: Lew on 17 Apr 2010 15:34

carmelo wrote:
> Thank you guys for your answers.
>
> < how slow is slow? Roughly how many seconds for what size result
> sets?
> 59 seconds for 439 records!

I load a 3,000-entry List from JPA over a relatively slow network in one
project virtually instantly. It's not JPA that is at fault.

I don't bother with pagination when I'm using JPA. It ends up defeating the
internal caching and available external caching capabilities of the JPA
provider. It also messes up your architecture.

When you use JPA, you need to avoid thinking of your persistent information as
"data". The whole point of JPA is to give an *object-oriented* view of
persistent entities and their relationships. The fact that you are tangled in
questions of "performance" (and trying to blame JPA for it) and "batches"
indicates that you are not working from an object-oriented view of your
architecture. That is the real mistake.

You get the best performance out of JPA when you use it in its natural mode.
Then it's quite fast, at least as much so as normal database access will let
it be. Done properly, it'll never be "JPA" that is to blame for performance
issues.

You have given us absolutely no information about the data or object models
you use, or how you try to integrate them. Then you expect miracles from
Usenet to solve your problem, while you go haring off after chimeras like "JPA
hurts performance". It doesn't work that way.

Well, I guess it does, now that you are reading this post.

JPA has built-in support for certain natural kinds of "batches", that is,
one-to-many, many-to-one and many-to-many relationships of entities or sets of
entities to each other. It fills these batches quite efficiently
appropriately to the programmer's choice of 'Fetch.EAGER' or 'Fetch.LAZY'.
The former uses JOIN syntax (assuming proper integrity constraints in the
database) and the latter conditional queries.

With only 439 entities ("records" is a data-oriented term) you don't need to
page. That's just silly.

If you keep entity size reasonable, let's say about a KiB apiece, you can hold
a thousand per MiB. In batch sizes over about ten thousand it starts to
become useful to think about paging.

Proper data models on the database side are an absolute must. Screw up the
data model and too bad for you.

JPA works best for simpler data models, ones that have clean representations
of entities. It can handle beautifully any relationships expressible in
collection terms, and since relational databases are set oriented by
definition, that is anything that a relational database expresses.

As relationship complexity grows, particularly with many-to-many links, the
@JoinX() annotations become mandatory. You have to get pretty specific in
your mappings, telling JPA not only the fields through which you link but the
table and column names involved.

The Java EE 5 tutorial on java.sun.com goes through some of this, and googling
around will find you more examples. I find the documentation for each of
EclipseLink, Hibernate and Apache OpenJPA all to be very useful, regardless of
which one you actually happen to use.

--
Lew

From: carmelo on 19 Apr 2010 03:21

Thank you Lew for your exaustive answer, but I've got this performance
problem and I need to solve it. I hope you can help me finding a
reasonable solution. I'm happy that you can load thousands of records
with JPA, so I hope I can load them so fast too.. However, locally, in
my local area network, I can load thousands of records with JPA too..
the performance problem begins when I try to load data from a remote
host database.

To be clear, here is my JPA query and my entity code:

//query
String query = "SELECT a FROM Articoli a";
Query q = entityManager.createQuery(query);
java.util.Collection data = q.getResultList();
for (Object entity : data) {
entityManager.refresh(entity);
}
list.clear();
list.addAll(data);

//entity, Articoli.java
@Entity
@Table(name = "articoli")
@NamedQueries({
..here are NamedQueries for each field..
})
public class Articoli implements Serializable {
@Transient
private PropertyChangeSupport changeSupport = new
PropertyChangeSupport(this);
private static final long serialVersionUID = 1L;
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Basic(optional = false)
@Column(name = "CodiceArticolo")
private Integer codiceArticolo;
@Basic(optional = false)
@Column(name = "Descrizione")
private String descrizione;
@Column(name = "H")
private BigDecimal h;
@Column(name = "L")
private BigDecimal l;
@Column(name = "Diametro")
private BigDecimal diametro;
@Column(name = "Spessore")
private BigDecimal spessore;
@Basic(optional = false)
@Column(name = "prezzo")
private BigDecimal prezzo;
@Column(name = "Lungh")
private BigDecimal lungh;
@Column(name = "Peso")
private BigDecimal peso;
@Column(name = "Lung_scatola")
private BigDecimal lungscatola;
@Column(name = "Larg_scatola")
private BigDecimal largscatola;
@Column(name = "H_scatola")
private BigDecimal hscatola;
@Column(name = "Peso_scatola")
private BigDecimal pesoscatola;
@Column(name = "Spessore_max")
private BigDecimal spessoremax;
@Column(name = "Lampada")
private String lampada;
@Column(name = "Watt")
private BigDecimal watt;
@Column(name = "Lumen")
private BigDecimal lumen;
@Column(name = "Temp_colore")
private BigDecimal tempcolore;
@Column(name = "Trattamento")
private String trattamento;
@Column(name = "T_50_min")
private BigDecimal t50min;
@Column(name = "T_50_max")
private BigDecimal t50max;
@Column(name = "Id")
private BigDecimal id;
@Column(name = "Od")
private BigDecimal od;
@Version
protected int version;
@JoinColumn(name = "CodiceSezione", referencedColumnName =
"codice")
@ManyToOne
private Sezione codiceSezione;
@JoinColumn(name = "CodiceCategoria1", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria1;
@JoinColumn(name = "CodiceCategoria2", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria2;
@JoinColumn(name = "CodiceCategoria3", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria3;
@JoinColumn(name = "CodiceCategoria4", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria4;
@OneToMany(cascade = CascadeType.ALL, mappedBy =
"codicearticolomagaz")
private Collection<Magazzinodettaglio>
magazzinodettaglioCollection;
@OneToMany(cascade = CascadeType.ALL, mappedBy =
"codicearticolofornitore")
private Collection<Articolifornitori> articolifornitoriCollection;

public Articoli() {
}

public Articoli(Integer codiceArticolo) {
this.codiceArticolo = codiceArticolo;
}

public Articoli(Integer codiceArticolo, String descrizione,
BigDecimal prezzo) {
this.codiceArticolo = codiceArticolo;
this.descrizione = descrizione;
this.prezzo = prezzo;
}

..here are getter/setter methods for each field..

public Collection<Magazzinodettaglio>
getMagazzinodettaglioCollection() {
return magazzinodettaglioCollection;
}

public void
setMagazzinodettaglioCollection(Collection<Magazzinodettaglio>
magazzinodettaglioCollection) {
this.magazzinodettaglioCollection =
magazzinodettaglioCollection;
}

public Collection<Articolifornitori>
getArticolifornitoriCollection() {
return articolifornitoriCollection;
}

public void
setArticolifornitoriCollection(Collection<Articolifornitori>
articolifornitoriCollection) {
this.articolifornitoriCollection =
articolifornitoriCollection;
}

@Override
public int hashCode() {
int hash = 0;
hash += (codiceArticolo != null ? codiceArticolo.hashCode() :
0);
return hash;
}

@Override
public boolean equals(Object object) {
if (!(object instanceof Articoli)) {
return false;
}
Articoli other = (Articoli) object;
if ((this.codiceArticolo == null && other.codiceArticolo !=
null) || (this.codiceArticolo != null && !
this.codiceArticolo.equals(other.codiceArticolo))) {
return false;
}
return true;
}

@Override
public String toString() {
return "desktopapplication1.Articoli[codiceArticolo=" +
codiceArticolo + "]";
}

public void addPropertyChangeListener(PropertyChangeListener
listener) {
changeSupport.addPropertyChangeListener(listener);
}

public void removePropertyChangeListener(PropertyChangeListener
listener) {
changeSupport.removePropertyChangeListener(listener);
}

}

If it's not enough, please let me know, and I'll prepare a sample
application.

I hope you can help me to achive a better performance.

Thank you very much in advance for your help guys!

From: Alessio Stalla on 19 Apr 2010 04:17

On Apr 19, 9:21 am, carmelo <csa...(a)tiscali.it> wrote:
> To be clear, here is my JPA query and my entity code:
>
> //query
> String query = "SELECT a FROM Articoli a";
> Query q = entityManager.createQuery(query);
> java.util.Collection data = q.getResultList();
> for (Object entity : data) {
> entityManager.refresh(entity);}

Are you aware of the fact that refresh() will reload the entity from
the database, therefore issuing a query? And that many-to-one
relations are eager by default, so for each entity you'll be fetching
all the related objects each time?

hth,
Alessio

From: carmelo on 19 Apr 2010 05:05

Thank you Alessio for your reply.

> Are you aware of the fact that refresh() will reload the entity from the database, therefore issuing a query?

You're right! refresh() issues a new query. I removed it, but the
loading time is now about 49 seconds, still too high. Even if it's
better that 59 seconds :-)

> many-to-one relations are eager by default, so for each entity you'll be fetching all the related objects each time?

How can I do? There are n:1 relations between Sezione, Categoria and
Articoli. In other words, Articoli.sezione can have multiple values,
which values are taken from Sezione. And the same is for
Articoli.categoria1, Articoli.categoria2, Articoli.categoria3,
Articoli.categoria, which values are taken from Categoria.
However I tried to remove these relations, and the resulting loading
time was 46 secods, therefore I saved only 3 seconds.

I hope you can help me achiving better results. Thanks guys

From: Martin Gregorie on 19 Apr 2010 10:16

On Mon, 19 Apr 2010 00:21:47 -0700, carmelo wrote:

> Thank you Lew for your exaustive answer, but I've got this performance
> problem and I need to solve it. I hope you can help me finding a
> reasonable solution. I'm happy that you can load thousands of records
> with JPA, so I hope I can load them so fast too.. However, locally, in
> my local area network, I can load thousands of records with JPA too..
> the performance problem begins when I try to load data from a remote
> host database.
>
I think your best bet is to do some monitoring to see exactly what is
going on at the network.

Start by instrumenting your code so you can see how big the retrieved
dataset has to be to cause problems.

Then install a copy of Wireshark (available for most OS) and use it to
inspect the network traffic when your query is running. Look at smaller
queries as well as ones big enough to cause the problem to pick up any
significant differences, e.g. does the nature of the traffic change as
the retrieved data expands from a few display lines to more than a
screenful? Does scrolling to the next page cause network activity?

Turning on JDBC and/or DBMS tracing can help too.

You're probably going to have to wade through great heaps of captured
data, especially if the problem only occurs with large data sets, but the
GUI used by later copies of Wireshark can help there.

I've seen awfully dumb tricks played by both software and database
designers that have only come to light when large data volumes and/or
slower data links were used, e.g.:

- really stupid auto-generated SQL
(PowerBuilder - fortunately it would also accept manually written SQL)

- dumb developer tricks
(Microsoft Foundation Classes - MFC queries managed to scan the whole
table when getting the column names for 'SELECT * FROM...' queries)

- poor DB schema design
( missing indexes aren't apparent with typical programmer small test
data sets but boy do they bite once row counts are in the ten
thousands)

--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |

First | Prev |
Pages: 1 2 3
Prev: Let the Experts to solve all your queries
Next: environment variables