Sunday, June 21, 2015

JDBC fetch size - commonly forgotten setting

JDBC fetch size parameter can significantly reduce the result set fetching time. It can be set on any Statement or PreparedStatement object. Many frameworks like Spring or Hibernate give you very convenient API to do this.

Although this parameter is widely available, I discovered that a lot of projects I have visited did not configure the fetch size and thus was using its default value. Yet, the default value can result in a poor performance on some jdbc drivers.

I want to show you how much you can improve your fetching performance if you adjust the fetch size to your statements. The scale of improvement depends on the driver you use.

Oracle jdbc driver

Assume we have table with 1 million rows and for any reason we have to fetch all records into JVM. How fast you can get all data? What fetch size will Oracle use if you don’t set it explicitly?

jdbc fetch size benchmark, fetch size from 10 to 2000
Figure 1 - fetching 1M rows with different fetchSize values (local oracle database)
Figure 1 shows fetching times for different fetch size values for Oracle database. In this example database and java application are located on the same machine. I will show later how it looks for a remote database.

Setting Fetch Size with standard JDBC calls

This is how you can set fetch size for given PreparedStatement using JDBC API:


Lets see what happens if fetchSize property is set to 10. When rs.next() is called for first time, the oracle driver fetches first 10 records from database and store them in a memory buffer. So, for next 9 calls to rs.next() records are retrieved from this buffer. After the buffer is fully read, subsequent rs.next() will force driver to fetch a new bunch of rows (10) into the buffer.

So if we want to read 10k rows with fetch size set to 10, the driver will make 1000 round trips to the database using the underlying connection. If we set the fetchSize to 500 the driver will perform only 20 round trips to our database.

Look at Figure 1. Setting fetchSize to 100 gives you a 6 times shorter fetching time then with setting fetchSize to 10. Now, you should know that the default fetchSize for the oracle driver is 10...

Two important comments:

  • fetchSize can be set on each Statement or PreparedStatement or even on ResultSet. By default, ResultSet uses fetchSize of Statement from which is born. The default value for Statement or PreparedStatement is jdbc driver specific
  • fetchSize is only a hint for the driver – the Oracle driver respects this setting, while other drivers may ignore it and fetch all the records at once, for instance.

Setting Fetch Size with Spring JdbcTemplate

When using Spring jdbc support you can do this in 2 ways:

Ad hoc JdbcTemplate instance


Shared JdbcTemplate instance


When implementing a DAO that extends JdbcDaoSupport every call to getJdbcTemplate() returns the same shared JdbcTemplate instance. You can mix this with ad-hoc instances. For example, override initTemplateConfig() to set the default for this DAO but use ad-hoc JdbcTemplate for selected queries.

Setting Fetch Size with Hibernate


With Hibernate you can specify jdbc fetchSize in two ways. First, you can set this as global setting. Second, you can overwrite this global setting for every individual query you want.

Global setting

Hibernate gives you property called hibernate.jdbc.fetch_size which you can set on your SessionFactory object. If configured, Hibernate will set this value on each PreparedStatement it creates.

With Spring + Hibernate you can use LocalSessionFactoryBean:

With Spring + JPA you can use LocalContainerEntityManagerFactoryBean:

Or you can set this property in any other way (hibernate.cfg.xml, etc.).

If you set hibernate.jdbc.fetch_size property and turn Hibernate logs to debug:

you should be able to see a log message telling you that Hibernate will use your setting:

Fetch size per query

Another method is to set fetchSize for particular query only. With Hibernate you can create HQL or SQL query in the following way:

With JPA you can use similar API:


Summary

Figure 2 shows you the same experiment for a remote oracle database.

jdbc fetch size benchmark, remote oracle database
Figure 2 - fetching 1M rows (remote oracle database)

In this experiment (Oracle database) when using fetchSize 500 I was able to fetch 1M rows 12 times faster than with default fetchSize of 10.

You can ask what it means:

  • does a larger fetch size imply better performance? 
  • if not – what is optimal value?

I will try to answer this for Oracle JDBC driver in my next post. For now I can only say that a larger fetch size:

  • usually gives you faster fetching
  • requires more memory for buffers
  • with some configurations consumes surprisingly much memory (oracle driver)

In my projects I use the following settings:

  • fetch size 50200 - as global setting
  • fetch size 2002000 - for some specific queries

Final notes

Experiment conditions:

  • local and remote Oracle 11g database
  • jdbc driver used: 11.1.0.7.0
  • only 1 fetching thread running at a time

Experiment was also performed for oracle drivers 10.2.0.4.0 and 11.2.0.3.0 with similar results.

Feel free to share your own experiences with fetchSize parameter on Oracle or any other database.

And final word worth remembering – if you don't set fetchSize explicitly, jdbc driver will use its default value which is 10 for oracle drivers.


6 comments:

  1. If my result set is smaller than the fetch size, would it negatively affect performance?

    ReplyDelete
    Replies
    1. Hi Tim, assuming your jvm has enough memory, then there is no negative effect on performance.

      For statement with fetchSize equal to N, Oracle driver maintains in-memory buffer for N potential result rows.

      If your query returns for example 10% of N, then then rest of buffer is simply not used for this very result set.

      Sometimes there is only 1 global setting of fetchSize. In such configuration there is considerable memory waste for all SQL statements that always return small result sets. For example all counting statements (select count... from...) requires only 1 bucket in buffer.

      Therefore, if your jvm has not so much memory on heap, setting large fetchSize for small queries leads to allocating unncessary memory and more frequent garbage collections. And this of course results in lower performance.

      Delete
  2. Would setting fetchSize at query level override the fetchSize settings on the application server's connection pool?

    ReplyDelete
    Replies
    1. Yes, setting fetchSize at query level (Statement or PreparedStatement) will override the fetchSize value set on Connection.

      Delete
  3. Thank you for you post.
    I just realized while loop for resultset is so slow, then I started googling about fetchSize. Finally, I got this useful information. Thank you so much.

    ReplyDelete