JDBC fetch size - commonly forgotten setting

JDBC fetch size parameter can significantly reduce the result set fetching time. It can be set on any Statement or PreparedStatement object. Many frameworks like Spring or Hibernate give you very convenient API to do this.

Although this parameter is widely available, I discovered that a lot of projects I have visited did not configure the fetch size and thus was using its default value. Yet, the default value can result in a poor performance on some jdbc drivers.

I want to show you how much you can improve your fetching performance if you adjust the fetch size to your statements. The scale of improvement depends on the driver you use.

Oracle jdbc driver

Assume we have table with 1 million rows and for any reason we have to fetch all records into JVM. How fast you can get all data? What fetch size will Oracle use if you don’t set it explicitly?

jdbc fetch size benchmark, fetch size from 10 to 2000
Figure 1 - fetching 1M rows with different fetchSize values (local oracle database)
Figure 1 shows fetching times for different fetch size values for Oracle database. In this example database and java application are located on the same machine. I will show later how it looks for a remote database.

Setting Fetch Size with standard JDBC calls

This is how you can set fetch size for given PreparedStatement using JDBC API:


Lets see what happens if fetchSize property is set to 10. When rs.next() is called for first time, the oracle driver fetches first 10 records from database and store them in a memory buffer. So, for next 9 calls to rs.next() records are retrieved from this buffer. After the buffer is fully read, subsequent rs.next() will force driver to fetch a new bunch of rows (10) into the buffer.

So if we want to read 10k rows with fetch size set to 10, the driver will make 1000 round trips to the database using the underlying connection. If we set the fetchSize to 500 the driver will perform only 20 round trips to our database.

Look at Figure 1. Setting fetchSize to 100 gives you a 6 times shorter fetching time then with setting fetchSize to 10. Now, you should know that the default fetchSize for the oracle driver is 10...

Two important comments:

  • fetchSize can be set on each Statement or PreparedStatement or even on ResultSet. By default, ResultSet uses fetchSize of Statement from which is born. The default value for Statement or PreparedStatement is jdbc driver specific
  • fetchSize is only a hint for the driver – the Oracle driver respects this setting, while other drivers may ignore it and fetch all the records at once, for instance.

Setting Fetch Size with Spring JdbcTemplate

When using Spring jdbc support you can do this in 2 ways:

Ad hoc JdbcTemplate instance


Shared JdbcTemplate instance


When implementing a DAO that extends JdbcDaoSupport every call to getJdbcTemplate() returns the same shared JdbcTemplate instance. You can mix this with ad-hoc instances. For example, override initTemplateConfig() to set the default for this DAO but use ad-hoc JdbcTemplate for selected queries.

Setting Fetch Size with Hibernate


With Hibernate you can specify jdbc fetchSize in two ways. First, you can set this as global setting. Second, you can overwrite this global setting for every individual query you want.

Global setting

Hibernate gives you property called hibernate.jdbc.fetch_size which you can set on your SessionFactory object. If configured, Hibernate will set this value on each PreparedStatement it creates.

With Spring + Hibernate you can use LocalSessionFactoryBean:

With Spring + JPA you can use LocalContainerEntityManagerFactoryBean:

Or you can set this property in any other way (hibernate.cfg.xml, etc.).

If you set hibernate.jdbc.fetch_size property and turn Hibernate logs to debug:

you should be able to see a log message telling you that Hibernate will use your setting:

Fetch size per query

Another method is to set fetchSize for particular query only. With Hibernate you can create HQL or SQL query in the following way:

With JPA you can use similar API:


Summary

Figure 2 shows you the same experiment for a remote oracle database.

jdbc fetch size benchmark, remote oracle database
Figure 2 - fetching 1M rows (remote oracle database)

In this experiment (Oracle database) when using fetchSize 500 I was able to fetch 1M rows 12 times faster than with default fetchSize of 10.

You can ask what it means:

  • does a larger fetch size imply better performance? 
  • if not – what is optimal value?

I will try to answer this for Oracle JDBC driver in my next post. For now I can only say that a larger fetch size:

  • usually gives you faster fetching
  • requires more memory for buffers
  • with some configurations consumes surprisingly much memory (oracle driver)

In my projects I use the following settings:

  • fetch size 50200 - as global setting
  • fetch size 2002000 - for some specific queries

Final notes

Experiment conditions:

  • local and remote Oracle 11g database
  • jdbc driver used: 11.1.0.7.0
  • only 1 fetching thread running at a time

Experiment was also performed for oracle drivers 10.2.0.4.0 and 11.2.0.3.0 with similar results.

Feel free to share your own experiences with fetchSize parameter on Oracle or any other database.

And final word worth remembering – if you don't set fetchSize explicitly, jdbc driver will use its default value which is 10 for oracle drivers.


Comments

  1. If my result set is smaller than the fetch size, would it negatively affect performance?

    ReplyDelete
    Replies
    1. Hi Tim, assuming your jvm has enough memory, then there is no negative effect on performance.

      For statement with fetchSize equal to N, Oracle driver maintains in-memory buffer for N potential result rows.

      If your query returns for example 10% of N, then then rest of buffer is simply not used for this very result set.

      Sometimes there is only 1 global setting of fetchSize. In such configuration there is considerable memory waste for all SQL statements that always return small result sets. For example all counting statements (select count... from...) requires only 1 bucket in buffer.

      Therefore, if your jvm has not so much memory on heap, setting large fetchSize for small queries leads to allocating unncessary memory and more frequent garbage collections. And this of course results in lower performance.

      Delete
  2. Would setting fetchSize at query level override the fetchSize settings on the application server's connection pool?

    ReplyDelete
    Replies
    1. Yes, setting fetchSize at query level (Statement or PreparedStatement) will override the fetchSize value set on Connection.

      Delete
  3. Thank you for you post.
    I just realized while loop for resultset is so slow, then I started googling about fetchSize. Finally, I got this useful information. Thank you so much.

    ReplyDelete
  4. Hi Yunhu,
    I'm glad this post was helpful.

    ReplyDelete
  5. Hello Herhor,
    Thanks for your post. Super helpful.

    In my tests I found that using fetchsize=1 when I knew the result would be only 1, was actually worse than using fetchsize=50.

    ReplyDelete
  6. Thank you for sharing such a nice article. This article will surely enhance reader’s knowledge. Keep posting!

    ReplyDelete
    Replies
    1. Good to know :)
      I'm planning more posts about java/database issues.

      Delete
  7. Thanks a lot. I worked a lot on a project and it was taking a long time for generating response. The vendor was going to ditch my product. But your post came as a saviour.

    ReplyDelete
  8. Thanks for this amazing post. Very informative! I've been struggling with this and wondering if you've ever tried setting the Fetch size as JVM argument?

    ReplyDelete
  9. Thanks for this useful article.
    Just want to ask about performance if we use both:
    ...
    stmt.setFetchSize(200);
    ...
    rs.setFetchSize(200);

    ReplyDelete
  10. Hey, there is no performance degradation. Setting fetchsize on ResultSet simply overwrites fetchsize that was set on Statement object.

    ReplyDelete
    Replies
    1. Hi, SetfetchSize() is not working for MS-SQL-Server.
      I have used both mssql-jdbc and sqljdbc4 driver.
      Please help

      Delete
  11. Thanks, your article is very good !
    Just in case, do you know any way to tune a similar Oracle parameter 'defaultLobPrefetchSize' with JPA ?
    I asked a question on stackoverflow but I counldn't get any answer https://stackoverflow.com/questions/51872943/oracle-lob-prefetch-with-jpa-and-c3p0.

    ReplyDelete
  12. Hi, SetfetchSize() is not working for MS-SQL-Server.
    I have used both mssql-jdbc and sqljdbc4 driver.
    Please help

    ReplyDelete
  13. As stated lets say that fetch size is 1000 ,when result set has 1000 rows in memory buffer,rs.next() will fetch from memory buffer.After 1000 rows ,it will make a DB trip and fetch next 1000 rows. My question is will the memory of previously fetched 1000 rows in ResultSet get released by JVM ?

    ReplyDelete
  14. Hi Darshan, you are right. The next 1000 rows are written to the same memory buffer and overwrite the old 1000 rows. So, yes, we can say that memory occupied by previously fetched rows is freed.

    ReplyDelete
  15. I've heard mySQL has a single cursor during fetch and hence does not support fetch_size: it's either all rows or one-by-one. In that case, does setting hibernate fetch property have any effect on MySQL DB?

    ReplyDelete
  16. Herhor you are an absolute champion
    I've been working integrating Ingres with the cloud and adapting your learnings from this post into an Ingres data pipeline changed total runtime from 6 hours to 16 mins
    (1 table went from 3hours 9 mins to 4 mins)
    so thank you!

    ReplyDelete
  17. Hi, I'm really happy to hear that. Sometimes it turns out there is a simple setting that can speed up our application.

    ReplyDelete
  18. I really acknowledge your comprehensive effort regarding the content. Hoping to continue this effort in future. Thanks
    mywakehealth


    ReplyDelete
  19. Hi Herhor, In our scenario, we have a Postres DB, from which we retrieve 1 million records through a complex query. Do you think setting setFetchSize(50000) is good idea, to avoid too many DB roundtrip? Simply what would be the maximum recommended number for setFetchSize?

    ReplyDelete
    Replies
    1. Hi, PostgreSQL presents a different approach: by default it fetches all records at once. So it is done with one round trip to DB. However is has memory impact as all your 1M records are collected in ResultSet object in the same thime. By using setFetchSize(100k) you swtich 10 DB round trips, but also you save the heap usage, as ResultSet buffer is 10 times smaller.
      You need to experiment with this setting to discover which is most performant. I think fetch size = 50k should be quite good compromise.

      Delete
  20. I'm setting fetchsize to some reasonable value and the same query was taking time as say 10 - 15 mins for some internal purpose so that streaming is good in case.
    My doubt is what happens if I set max_execution_time to 5 mins. Whether the streaming was closed and the query will be killed ?

    ReplyDelete
  21. Hi, what database are you using? And at what level you are setting max_execution_time?

    ReplyDelete
  22. Hi, Which is the difference between fetch size and max rows? What would you suggest if we need to decrease RAM memory usage?

    ReplyDelete
  23. Hi,
    in JDBC specification there are 2 completely different methods:
    Statement.setMaxRows(int) limits the numer of rows returned by ResultSet. This method will not change the way the database will perform query. Just some rows will be dropped if they exceed the limit.
    On the other hand Statement.setFetchSize(int) sets the number of rows that will be held in in-memory buffer at the same time, however ResultSet will return all query rows eventually.

    Summarizing - lowering fetchSize will decrease memory consumption and will not change the result set (all records will be returned); setting max rows doesn't change memory needs, but it will affect the result set (possibly less records will be returned).

    ReplyDelete
  24. Does fetch size property support in DB2? If yes what would be the ideal fetch size. I am going to 10 million records

    ReplyDelete
    Replies
    1. I didn't use DB2, but from what I can see, it supports this parameter. So you should be able to take advantage of it. Check the performance with fetch size equal to 1000 or 5000 and compare to the default.

      Delete
  25. Hi herhor, thanks for sharing this information about fetchsize. Its fabulous.
    here has a question, will it query again table in each loop?

    ReplyDelete
  26. same case for sqlserver will it work?

    ReplyDelete
  27. Hi, I haven't experimented with SQLServer jdbc drivers, but it seems to me this is very similar to oracle db, so setting larger fetch size should reduce total fetch time for larger result sets.

    ReplyDelete
  28. Thank you so much.Do you ever try spring-batch chunk? what's the difference between them?

    ReplyDelete

Post a Comment