Hibernate Performance Tips
Some people I know are looking at porting their (very large) J2EE application from using a homegrown OR framework on top of Entity Beans with CMP (without CMR) to Hibernate, and they asked me for some tips. I'm not claiming to be a Hibernate expert, but I've used it on an enterprise product, added features, submitted patches, etc., so I know a bit. Since I was writing this up anyway, I figured I'd blog it.
- Use the latest. I would suggest jumping on Hibernate 3 even though it's in beta now as it has a lot of very useful new features.
- Make all associations lazy (Hibernate 3 makes this the default) and make it a conscious choice to eagerly join or fetch data for specific use cases.
- Define your Session management strategy early. Will it be one session per request, session per request with detached objects, or session per "application transaction" across multiple requests? I'm not sure how this would play in a client-server app; this is more from a web application perspective.
- Define the flush strategy early: either let Hibernate auto-flush, or define your own synchronization points where you flush to the database.
- Define your caching strategy early: will you cache transactional data? If so, go to www.tangosol.com and talk to them about buying their product, Coherence. If you are only caching lookup / setup data, where a small time lag before changes reach all cluster members is acceptable, then you can get away with an open-source implementation like OSCache. The transactional cache is very much preferred, as it will allow you to use the query cache, which is where a lot of the power of the cache lies. I've talked to people who use non-transactional caches for the query cache, but the Hibernate docs and Gavin say that it's not safe. It will probably work 99.9% of the time, but... That said, the query cache could still be useful for caching queries over non-transactional data even without a transactional cache.
- Cache your entities, and make sure to apply the cache settings to their association mappings as well (set, map, bag, list).
- You can only eagerly join from an entity to one of its one-to-many or many-to-many associations, so choose the right one. For the other collections that you want loaded eagerly, set an appropriate batch size (the number of related records to load at once).
- Examine your object graphs to make sure that references to entities won't be held anywhere, preventing garbage collection. This matters most for entities that have lots of associations and change frequently, as you can end up with lots of objects filling up memory.
- Use optimistic concurrency with a version column if possible. Timestamps work too, but not as cleanly. Avoid making Hibernate examine all values to determine if records have changed.
- Understand the lifecycle of identifier creation. If you are using sequences or other database-generated identifiers, your identifiers won't be set until the new entity has been saved to the database. If you use the identifier as part of your equals() and hashCode() implementation, this means that if you add the entity to a Set before you save it, you won't be able to find it again after it's saved, because its hash code has changed. I've gone to UUIDs assigned in the constructor of the entity instances for this reason, and it makes life much easier. I've heard that some people, faced with this, have made their hashCode() implementation always return the same number for every instance of an entity. While this technically fulfills the contract of hashCode(), it's hardly optimal, since every instance then lands in the same hash bucket.
- Set your IDE up with a reference to the Hibernate source code of the distribution you are using so you can trace into it. Not only will it help you understand what's causing the behavior you're seeing, but it will help you understand how Hibernate works.
- Understand the second-level cache (and understand that the Session is your first-level cache). Understand how it caches your entity data (as a Map of identifier to array of field values) and collections (it just saves the identifiers and reconstitutes the objects from the entity cache).
- Go buy Hibernate in Action. It covers Hibernate 2.1, but the ideas are the same; there are just some new features in 3.0.
- Speaking of new features, look at Filters. I haven't had a chance to use them yet, but the mere fact that they make the effective date problem so easy makes them a huge win. They can also let you find the number of records in a collection, etc. without having to load the whole thing, which is very nice.
- Baseline your first working setup with memory and CPU profilers and also with a SQL profiler like IronTrack SQL, then re-run your test cases with the profilers for every configuration tweak to see what the effects are.
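
To make the identifier tip above concrete, here is a minimal plain-Java sketch with no Hibernate involved. The classes SequenceIdOrder and UuidOrder are hypothetical, and the "save" is simulated by assigning the id by hand. It only illustrates why a hash code that depends on a database-generated id breaks Set lookups, and why an id assigned in the constructor avoids the problem.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.UUID;

// Hypothetical entity whose equality is based on a database-generated id.
class SequenceIdOrder {
    Long id; // null until the row is inserted; set by hand below to simulate a save

    @Override
    public boolean equals(Object other) {
        return other instanceof SequenceIdOrder
                && Objects.equals(id, ((SequenceIdOrder) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, a different value once it is set
    }
}

// Hypothetical entity that assigns its own identifier in the constructor.
class UuidOrder {
    final String id = UUID.randomUUID().toString();

    @Override
    public boolean equals(Object other) {
        return other instanceof UuidOrder && id.equals(((UuidOrder) other).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode(); // never changes after construction
    }
}

class IdentifierLifecycleDemo {
    // Adds an unsaved entity to a Set, simulates the save, then looks it up again.
    static boolean sequenceIdFoundAfterSave() {
        Set<SequenceIdOrder> orders = new HashSet<>();
        SequenceIdOrder order = new SequenceIdOrder();
        orders.add(order); // stored under the hash of a null id
        order.id = 42L;    // stand-in for the database assigning the id on save
        return orders.contains(order);
    }

    // Same steps with a constructor-assigned id; a save would not touch the id.
    static boolean uuidFoundAfterSave() {
        Set<UuidOrder> orders = new HashSet<>();
        UuidOrder order = new UuidOrder();
        orders.add(order);
        return orders.contains(order);
    }

    public static void main(String[] args) {
        System.out.println("sequence id found after save: " + sequenceIdFoundAfterSave());
        System.out.println("UUID found after save: " + uuidFoundAfterSave());
    }
}
```

In this sketch the first lookup fails, because the entity was stored under the hash of a null id and is looked up under the hash of 42. The second lookup succeeds, because the UUID never changes.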
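
The description of the second-level cache in the tips above (entity data as a map of identifier to field values, collections as lists of identifiers) can be pictured with a toy model. This is a conceptual sketch in plain Java, not Hibernate's actual implementation; ToySecondLevelCache, Item, and the region names are all made up for illustration. It also suggests why a cached collection is of little use if its member entities aren't cached too.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity with a single field besides its id.
class Item {
    final long id;
    final String name;

    Item(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy model of the two kinds of cache region described in the tips.
class ToySecondLevelCache {
    // Entity region: identifier -> array of field values, not object instances.
    final Map<Long, Object[]> itemRegion = new HashMap<>();
    // Collection region: owner identifier -> identifiers of the members.
    final Map<Long, List<Long>> orderItemsRegion = new HashMap<>();
    // Counts member lookups that missed the entity region.
    int entityMisses = 0;

    void putItem(Item item) {
        itemRegion.put(item.id, new Object[] { item.name });
    }

    void putOrderItems(long orderId, List<Item> items) {
        List<Long> ids = new ArrayList<>();
        for (Item item : items) {
            ids.add(item.id);
        }
        orderItemsRegion.put(orderId, ids);
    }

    // Rebuilds the collection from cached ids plus cached field values.
    List<Item> loadOrderItems(long orderId) {
        List<Long> ids = orderItemsRegion.get(orderId);
        if (ids == null) {
            return null; // collection miss: a real ORM would query the database here
        }
        List<Item> result = new ArrayList<>();
        for (Long id : ids) {
            Object[] state = itemRegion.get(id);
            if (state == null) {
                entityMisses++; // a real ORM would have to load this row from the database
                continue;
            }
            result.add(new Item(id, (String) state[0])); // fresh instance built from cached state
        }
        return result;
    }
}
```

If only the collection is cached, every member lookup in this model counts as a miss, which is the situation the tip about applying cache settings to both entities and their association mappings is meant to avoid.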
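
For the optimistic concurrency tip above, the idea behind a version column can be sketched in plain Java as well. ToyTable is a hypothetical in-memory stand-in for a database table, and its update method mimics an UPDATE statement that includes the expected version in its WHERE clause. It is a rough illustration of the technique under those assumptions, not of how Hibernate implements it.

```java
import java.util.HashMap;
import java.util.Map;

// One row of the toy table: a value plus the version counter.
class VersionedRow {
    String value;
    int version;
}

// Hypothetical in-memory table keyed by id.
class ToyTable {
    private final Map<Long, VersionedRow> rows = new HashMap<>();

    void insert(long id, String value) {
        VersionedRow row = new VersionedRow();
        row.value = value;
        row.version = 0;
        rows.put(id, row);
    }

    int versionOf(long id) {
        return rows.get(id).version;
    }

    String valueOf(long id) {
        return rows.get(id).value;
    }

    // Mimics: update t set value = ?, version = version + 1 where id = ? and version = ?
    // Returns false when no row matched, meaning someone else changed the row first.
    boolean update(long id, String newValue, int expectedVersion) {
        VersionedRow row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return false; // stale read detected by comparing one column only
        }
        row.value = newValue;
        row.version++;
        return true;
    }
}
```

Two writers that both read version 0 illustrate the point: the first update succeeds and bumps the version to 1, and the second is rejected without comparing any of the other column values.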