JDBC fetch size - commonly forgotten setting
JDBC fetch size parameter can significantly reduce the result set fetching time. It can be set on any Statement or PreparedStatement object. Many frameworks like Spring or Hibernate give you very convenient API to do this.
Although this parameter is widely available, I discovered that a lot of projects I have visited did not configure the fetch size and thus was using its default value. Yet, the default value can result in a poor performance on some jdbc drivers.
I want to show you how much you can improve your fetching performance if you adjust the fetch size to your statements. The scale of improvement depends on the driver you use.
Figure 1 shows fetching times for different fetch size values for Oracle database. In this example database and java application are located on the same machine. I will show later how it looks for a remote database.
Lets see what happens if fetchSize property is set to 10. When rs.next() is called for first time, the oracle driver fetches first 10 records from database and store them in a memory buffer. So, for next 9 calls to rs.next() records are retrieved from this buffer. After the buffer is fully read, subsequent rs.next() will force driver to fetch a new bunch of rows (10) into the buffer.
So if we want to read 10k rows with fetch size set to 10, the driver will make 1000 round trips to the database using the underlying connection. If we set the fetchSize to 500 the driver will perform only 20 round trips to our database.
Look at Figure 1. Setting fetchSize to 100 gives you a 6 times shorter fetching time then with setting fetchSize to 10. Now, you should know that the default fetchSize for the oracle driver is 10...
Two important comments:
With Hibernate you can specify jdbc fetchSize in two ways. First, you can set this as global setting. Second, you can overwrite this global setting for every individual query you want.
With Spring + Hibernate you can use LocalSessionFactoryBean:
In this experiment (Oracle database) when using fetchSize 500 I was able to fetch 1M rows 12 times faster than with default fetchSize of 10.
You can ask what it means:
I will try to answer this for Oracle JDBC driver in my next post. For now I can only say that a larger fetch size:
In my projects I use the following settings:
Although this parameter is widely available, I discovered that a lot of projects I have visited did not configure the fetch size and thus was using its default value. Yet, the default value can result in a poor performance on some jdbc drivers.
I want to show you how much you can improve your fetching performance if you adjust the fetch size to your statements. The scale of improvement depends on the driver you use.
Oracle jdbc driver
Assume we have table with 1 million rows and for any reason we have to fetch all records into JVM. How fast you can get all data? What fetch size will Oracle use if you don’t set it explicitly?Figure 1 - fetching 1M rows with different fetchSize values (local oracle database) |
Setting Fetch Size with standard JDBC calls
This is how you can set fetch size for given PreparedStatement using JDBC API:Lets see what happens if fetchSize property is set to 10. When rs.next() is called for first time, the oracle driver fetches first 10 records from database and store them in a memory buffer. So, for next 9 calls to rs.next() records are retrieved from this buffer. After the buffer is fully read, subsequent rs.next() will force driver to fetch a new bunch of rows (10) into the buffer.
So if we want to read 10k rows with fetch size set to 10, the driver will make 1000 round trips to the database using the underlying connection. If we set the fetchSize to 500 the driver will perform only 20 round trips to our database.
Look at Figure 1. Setting fetchSize to 100 gives you a 6 times shorter fetching time then with setting fetchSize to 10. Now, you should know that the default fetchSize for the oracle driver is 10...
Two important comments:
- fetchSize can be set on each Statement or PreparedStatement or even on ResultSet. By default, ResultSet uses fetchSize of Statement from which is born. The default value for Statement or PreparedStatement is jdbc driver specific
- fetchSize is only a hint for the driver – the Oracle driver respects this setting, while other drivers may ignore it and fetch all the records at once, for instance.
Setting Fetch Size with Spring JdbcTemplate
When using Spring jdbc support you can do this in 2 ways:
When implementing a DAO that extends JdbcDaoSupport every call to getJdbcTemplate() returns the same shared JdbcTemplate instance. You can mix this with ad-hoc instances. For example, override initTemplateConfig() to set the default for this DAO but use ad-hoc JdbcTemplate for selected queries.
Ad hoc JdbcTemplate instance
Shared JdbcTemplate instance
When implementing a DAO that extends JdbcDaoSupport every call to getJdbcTemplate() returns the same shared JdbcTemplate instance. You can mix this with ad-hoc instances. For example, override initTemplateConfig() to set the default for this DAO but use ad-hoc JdbcTemplate for selected queries.
Setting Fetch Size with Hibernate
With Hibernate you can specify jdbc fetchSize in two ways. First, you can set this as global setting. Second, you can overwrite this global setting for every individual query you want.
Global setting
Hibernate gives you property called hibernate.jdbc.fetch_size which you can set on your SessionFactory object. If configured, Hibernate will set this value on each PreparedStatement it creates.With Spring + Hibernate you can use LocalSessionFactoryBean:
With Spring + JPA you can use
LocalContainerEntityManagerFactoryBean:
Or you can set this property in any other way (hibernate.cfg.xml, etc.).
If you set hibernate.jdbc.fetch_size property and turn Hibernate logs to debug:
you should be able to see a log message telling you that
Hibernate will use your setting:
Fetch size per query
Another
method is to set fetchSize for particular query only. With Hibernate you can create HQL or SQL query in the following way:
With JPA you can use similar API:
With JPA you can use similar API:
Summary
Figure 2 shows you the same experiment for a remote oracle database.Figure 2 - fetching 1M rows (remote oracle database) |
In this experiment (Oracle database) when using fetchSize 500 I was able to fetch 1M rows 12 times faster than with default fetchSize of 10.
You can ask what it means:
- does a larger fetch size imply better performance?
- if not – what is optimal value?
I will try to answer this for Oracle JDBC driver in my next post. For now I can only say that a larger fetch size:
- usually gives you faster fetching
- requires more memory for buffers
- with some configurations consumes surprisingly much memory (oracle driver)
In my projects I use the following settings:
- fetch size 50 – 200 - as global setting
- fetch size 200 – 2000 - for some specific queries
Final notes
Experiment conditions:- local and remote Oracle 11g database
- jdbc driver used: 11.1.0.7.0
- only 1 fetching thread running at a time
Experiment was also performed for oracle drivers 10.2.0.4.0 and 11.2.0.3.0 with similar results.
Feel free to share your own experiences with fetchSize parameter on Oracle or any other database.
And final word worth remembering – if you don't set fetchSize explicitly, jdbc driver will use its default value which is 10 for oracle drivers.
If my result set is smaller than the fetch size, would it negatively affect performance?
ReplyDeleteHi Tim, assuming your jvm has enough memory, then there is no negative effect on performance.
DeleteFor statement with fetchSize equal to N, Oracle driver maintains in-memory buffer for N potential result rows.
If your query returns for example 10% of N, then then rest of buffer is simply not used for this very result set.
Sometimes there is only 1 global setting of fetchSize. In such configuration there is considerable memory waste for all SQL statements that always return small result sets. For example all counting statements (select count... from...) requires only 1 bucket in buffer.
Therefore, if your jvm has not so much memory on heap, setting large fetchSize for small queries leads to allocating unncessary memory and more frequent garbage collections. And this of course results in lower performance.
Would setting fetchSize at query level override the fetchSize settings on the application server's connection pool?
ReplyDeleteYes, setting fetchSize at query level (Statement or PreparedStatement) will override the fetchSize value set on Connection.
Deleteyiihaaa
ReplyDeleteThank you for you post.
ReplyDeleteI just realized while loop for resultset is so slow, then I started googling about fetchSize. Finally, I got this useful information. Thank you so much.
Hi Yunhu,
ReplyDeleteI'm glad this post was helpful.
Hello Herhor,
ReplyDeleteThanks for your post. Super helpful.
In my tests I found that using fetchsize=1 when I knew the result would be only 1, was actually worse than using fetchsize=50.
Thank you for sharing such a nice article. This article will surely enhance reader’s knowledge. Keep posting!
ReplyDeleteGood to know :)
DeleteI'm planning more posts about java/database issues.
Thanks a lot. I worked a lot on a project and it was taking a long time for generating response. The vendor was going to ditch my product. But your post came as a saviour.
ReplyDeleteI'm glad I could help.
DeleteThanks for this amazing post. Very informative! I've been struggling with this and wondering if you've ever tried setting the Fetch size as JVM argument?
ReplyDeleteThanks for this useful article.
ReplyDeleteJust want to ask about performance if we use both:
...
stmt.setFetchSize(200);
...
rs.setFetchSize(200);
Hey, there is no performance degradation. Setting fetchsize on ResultSet simply overwrites fetchsize that was set on Statement object.
ReplyDeleteHi, SetfetchSize() is not working for MS-SQL-Server.
DeleteI have used both mssql-jdbc and sqljdbc4 driver.
Please help
Thanks, your article is very good !
ReplyDeleteJust in case, do you know any way to tune a similar Oracle parameter 'defaultLobPrefetchSize' with JPA ?
I asked a question on stackoverflow but I counldn't get any answer https://stackoverflow.com/questions/51872943/oracle-lob-prefetch-with-jpa-and-c3p0.
Hi, SetfetchSize() is not working for MS-SQL-Server.
ReplyDeleteI have used both mssql-jdbc and sqljdbc4 driver.
Please help
As stated lets say that fetch size is 1000 ,when result set has 1000 rows in memory buffer,rs.next() will fetch from memory buffer.After 1000 rows ,it will make a DB trip and fetch next 1000 rows. My question is will the memory of previously fetched 1000 rows in ResultSet get released by JVM ?
ReplyDeleteHi Darshan, you are right. The next 1000 rows are written to the same memory buffer and overwrite the old 1000 rows. So, yes, we can say that memory occupied by previously fetched rows is freed.
ReplyDeleteI've heard mySQL has a single cursor during fetch and hence does not support fetch_size: it's either all rows or one-by-one. In that case, does setting hibernate fetch property have any effect on MySQL DB?
ReplyDeleteHerhor you are an absolute champion
ReplyDeleteI've been working integrating Ingres with the cloud and adapting your learnings from this post into an Ingres data pipeline changed total runtime from 6 hours to 16 mins
(1 table went from 3hours 9 mins to 4 mins)
so thank you!
Hi, I'm really happy to hear that. Sometimes it turns out there is a simple setting that can speed up our application.
ReplyDeleteI really acknowledge your comprehensive effort regarding the content. Hoping to continue this effort in future. Thanks
ReplyDeletemywakehealth
Hi Herhor, In our scenario, we have a Postres DB, from which we retrieve 1 million records through a complex query. Do you think setting setFetchSize(50000) is good idea, to avoid too many DB roundtrip? Simply what would be the maximum recommended number for setFetchSize?
ReplyDeleteHi, PostgreSQL presents a different approach: by default it fetches all records at once. So it is done with one round trip to DB. However is has memory impact as all your 1M records are collected in ResultSet object in the same thime. By using setFetchSize(100k) you swtich 10 DB round trips, but also you save the heap usage, as ResultSet buffer is 10 times smaller.
DeleteYou need to experiment with this setting to discover which is most performant. I think fetch size = 50k should be quite good compromise.
I'm setting fetchsize to some reasonable value and the same query was taking time as say 10 - 15 mins for some internal purpose so that streaming is good in case.
ReplyDeleteMy doubt is what happens if I set max_execution_time to 5 mins. Whether the streaming was closed and the query will be killed ?
Hi, what database are you using? And at what level you are setting max_execution_time?
ReplyDeleteHi, Which is the difference between fetch size and max rows? What would you suggest if we need to decrease RAM memory usage?
ReplyDeleteHi,
ReplyDeletein JDBC specification there are 2 completely different methods:
Statement.setMaxRows(int) limits the numer of rows returned by ResultSet. This method will not change the way the database will perform query. Just some rows will be dropped if they exceed the limit.
On the other hand Statement.setFetchSize(int) sets the number of rows that will be held in in-memory buffer at the same time, however ResultSet will return all query rows eventually.
Summarizing - lowering fetchSize will decrease memory consumption and will not change the result set (all records will be returned); setting max rows doesn't change memory needs, but it will affect the result set (possibly less records will be returned).
Does fetch size property support in DB2? If yes what would be the ideal fetch size. I am going to 10 million records
ReplyDeleteI didn't use DB2, but from what I can see, it supports this parameter. So you should be able to take advantage of it. Check the performance with fetch size equal to 1000 or 5000 and compare to the default.
DeleteHi herhor, thanks for sharing this information about fetchsize. Its fabulous.
ReplyDeletehere has a question, will it query again table in each loop?
No, the query is executed only once.
Deletesame case for sqlserver will it work?
ReplyDeleteHi, I haven't experimented with SQLServer jdbc drivers, but it seems to me this is very similar to oracle db, so setting larger fetch size should reduce total fetch time for larger result sets.
ReplyDeleteThank you so much.Do you ever try spring-batch chunk? what's the difference between them?
ReplyDelete