Potential memory leak when using RAMDirectory, CloseableThreadLocal and a thread pool
Lucene's cached per-document data is bound to the thread that created it and is cleared only when that thread exits.
On Thu, Jan 3, 2013 at 12:16 PM, Alon Muchnick <alon [at] datonics> wrote:
> Hi Mike,
>
> At the peak there are 500 live threads going through Lucene (not all of
> them at the same time; the Tomcat thread pool uses round robin). Regarding
> the Directory impl, we are using RAMDirectory.
> The object that takes most of the heap is the "hardRefs" WeakHashMap class
> member in the CloseableThreadLocal class. The size of the map is 500,
> with one entry for each thread that went through Lucene.
Hmm ... I'm curious what's actually using up all the RAM here.
A buffered Directory impl (eg NIOFSDirectory) would have IndexInput
clones that have 1 KB buffers, but RAMDirectory doesn't do that.
> When I run the test with only one thread, the RAM usage did not grow much
> beyond the initial 30 MB.
OK that's good. So somehow each thread ties up ~ 1 MB RAM.
> When looking at the code I can see that the entries in the hardRefs map
> will be cleared when:
>
> 1. A thread which searched Lucene is no longer alive, so its corresponding
> entry in the map will be cleared either by the GC or by the purge() method.
>
> 2. The close() method is called, which "should only be called when all
> threads are done using the instance".
Right.
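For readers following along, here is a minimal sketch of that mechanism, simplified from Lucene's org.apache.lucene.util.CloseableThreadLocal rather than copied from its source: hardRefs holds each value strongly, keyed weakly by the owning Thread, so a value can only fall out of the map once its thread is dead.

    import java.lang.ref.WeakReference;
    import java.util.Iterator;
    import java.util.Map;
    import java.util.WeakHashMap;

    // Simplified sketch of the CloseableThreadLocal mechanism under
    // discussion; not Lucene's exact source.
    public class SimpleCloseableThreadLocal<T> {

      // Fast path: a per-thread weak reference, so close() can drop values
      // without waiting for each thread to die.
      private ThreadLocal<WeakReference<T>> local =
          new ThreadLocal<WeakReference<T>>();

      // The "hardRefs" map from the heap dump: Thread keys are held weakly,
      // values are held strongly until the key thread dies (or close() runs).
      private Map<Thread, T> hardRefs = new WeakHashMap<Thread, T>();

      public synchronized void set(T value) {
        local.set(new WeakReference<T>(value));
        hardRefs.put(Thread.currentThread(), value);
      }

      public T get() {
        WeakReference<T> ref = local.get();
        return ref == null ? null : ref.get();
      }

      // Clearing condition 1 above: remove entries whose thread has died.
      // Live threads are never pruned, which is why long-lived pool threads
      // keep their entry indefinitely.
      public synchronized void purge() {
        for (Iterator<Thread> it = hardRefs.keySet().iterator(); it.hasNext();) {
          if (!it.next().isAlive()) {
            it.remove();
          }
        }
      }

      // Clearing condition 2: drop everything at once when no thread is
      // still using this instance.
      public synchronized void close() {
        hardRefs = null;
        local = null;
      }
    }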
> What happens to threads that go through Lucene and stay alive for a very
> long time?
> Will there always be a key in the hardRefs map for them?
Yes ... we never prune live threads.
> If so, will the value for the corresponding entry in the map be
> overwritten each time this thread makes a new search, or will it somehow
> accumulate? :
Well, we set the value once when we first see the thread, and then use
that value from then on.
> value ---- org.apache.lucene.index.TermInfosReader$ThreadResources --->
>
>   termInfoCache | org.apache.lucene.util.cache.SimpleLRUCache
>   termEnum      | org.apache.lucene.index.SegmentTermEnum
>
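That "set once, then reuse" pattern, together with the ThreadResources layout shown in the dump above, works roughly as follows. This is a self-contained sketch, not Lucene's source: it reuses SimpleCloseableThreadLocal from the earlier sketch, and the stand-in field types and the 1024-entry cache cap are assumptions.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class PerThreadResourcesDemo {

      // Stand-in for TermInfosReader$ThreadResources from the heap dump.
      static final class ThreadResources {
        // Stand-in for the per-thread SegmentTermEnum clone.
        Object termEnum = new Object();
        // Stand-in for SimpleLRUCache: an access-ordered LinkedHashMap
        // capped at 1024 entries (the cap is an assumed value).
        Map<String, Object> termInfoCache =
            new LinkedHashMap<String, Object>(16, 0.75f, true) {
              @Override
              protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > 1024;
              }
            };
      }

      private final SimpleCloseableThreadLocal<ThreadResources> threadResources =
          new SimpleCloseableThreadLocal<ThreadResources>();

      ThreadResources getThreadResources() {
        ThreadResources resources = threadResources.get();
        if (resources == null) {
          // First call from this thread: allocate its resources once.
          resources = new ThreadResources();
          threadResources.set(resources);
        }
        // Every later call from the same thread reuses the same instance,
        // so nothing accumulates per search; the entry simply lives as long
        // as the thread does.
        return resources;
      }
    }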
> I have reduced the size of the thread pool and changed to a more
> aggressive thread-closing policy: threads are now terminated when they are
> not needed (before that, the thread pool would reach its max size and stay
> there).
> This should kick in the purge() method and reclaim memory from the
> terminated threads.
>
> I'll give the system a few hours to run and update on the results.
OK thanks.
Really, anything more than 2 * number-of-cores threads coming through
Lucene is overkill ...
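Putting the two remedies from this thread together: cap the pool near 2 * cores, and let idle threads terminate so the purge()/GC path can reclaim their hardRefs entries. A hedged sketch; the queue choice and the 60-second idle timeout are assumptions.

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class SearchPoolConfig {
      public static ThreadPoolExecutor newSearchPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2 * cores,                       // core size: ~2 * cores is plenty
            2 * cores,                       // never grow past that
            60L, TimeUnit.SECONDS,           // idle threads terminate after 60s
            new LinkedBlockingQueue<Runnable>());
        // Without this, core threads never time out, and their entries in
        // hardRefs survive for the life of the pool.
        pool.allowCoreThreadTimeOut(true);
        return pool;
      }
    }

Once an idle thread actually exits, its WeakHashMap key is cleared, and the next purge() call (or a GC cycle) frees the per-thread resources it was holding.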