<< 分布式搜索ElasticSearch构建集群与简单搜索实例应用 - 苏若年 - 博客园 | 首页 | ElasticSearch: Java API | Javalobby >>

Lucene 4.4 以后近实时NRT检索

Lucene4.4之后,NRTManager 及NRTManagerReopenThread 已经都没有了,如果做近实时搜索的话,就要这么做,

初始化:

   Directory directory = new RAMDirectory();
   IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(ver));
   IndexWriter indexWriter = new IndexWriter(directory, iwc);
   TrackingIndexWriter trackWriter = new TrackingIndexWriter(indexWriter);
   searcherManager = new SearcherManager(indexWriter, true, new SearcherFactory());
   
   ControlledRealTimeReopenThread<IndexSearcher> CRTReopenThread = 
     new ControlledRealTimeReopenThread<IndexSearcher>(trackWriter, searcherManager, 5.0, 0.025) ;

   CRTReopenThread.setDaemon(true);
   CRTReopenThread.setName("后台刷新服务");
   CRTReopenThread.start();

添加文档:

trackWriter.addDocument(doc);

进行搜索:

IndexSearcher searcher = searcherManager.acquire();

......

searcherManager.release(searcher);

 
 
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
 
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TrackingIndexWriter;
import org.apache.lucene.search.ControlledRealTimeReopenThread;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ReferenceManager;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;
 
 
public class LuceneIndex {
private final static Logger log = LogManager.getLogger(LuceneIndex.class);
    private final IndexWriter _indexWriter;
    private final TrackingIndexWriter _trackingIndexWriter;
    private final ReferenceManager<IndexSearcher> _indexSearcherReferenceManager;
    private final ControlledRealTimeReopenThread<IndexSearcher> _indexSearcherReopenThread;
 
 
    private long _reopenToken;      // index update/delete methods returned token
 
////////// CONSTRUCTOR & FINALIZE
    /**
     * Constructor based on an instance of the type responsible of the lucene index persistence
     */
    
    public LuceneIndex(final Directory luceneDirectory,
                       final Analyzer analyzer) {
        try {
            // [1]: Create the indexWriter
            _indexWriter = new IndexWriter(luceneDirectory,
                                           new IndexWriterConfig(Version.LATEST,
                                                                 analyzer));
 
            // [2a]: Create the TrackingIndexWriter to track changes to the delegated previously created IndexWriter 
            _trackingIndexWriter = new TrackingIndexWriter(_indexWriter);
 
            // [2b]: Create an IndexSearcher ReferenceManager to safelly share IndexSearcher instances across
            //       multiple threads
            _indexSearcherReferenceManager = new SearcherManager(_indexWriter,
                                                                 true,
                                                                 null);
 
            // [3]: Create the ControlledRealTimeReopenThread that reopens the index periodically having into 
            //      account the changes made to the index and tracked by the TrackingIndexWriter instance
            //      The index is refreshed every 60sc when nobody is waiting 
            //      and every 100 millis whenever is someone waiting (see search method)
            //      (see http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/NRTManagerReopenThread.html)
            _indexSearcherReopenThread = new ControlledRealTimeReopenThread<IndexSearcher>(_trackingIndexWriter,
                                                                                           _indexSearcherReferenceManager,
                                                                                           60.00,   // when there is nobody waiting
                                                                                           0.1);    // when there is someone waiting
            _indexSearcherReopenThread.start(); // start the refresher thread
        } catch (IOException ioEx) {
            throw new IllegalStateException("Lucene index could not be created: " + ioEx.getMessage());
        }
    }
    @Override
    protected void finalize() throws Throwable {
        this.close();
        super.finalize();
    }
    /**
     * Closes every index
     */
    public void close() {
        try {
            // stop the index reader re-open thread
            _indexSearcherReopenThread.interrupt();
            _indexSearcherReopenThread.close();
 
            // Close the indexWriter, commiting everithing that's pending
            _indexWriter.commit();
            _indexWriter.close();
 
        } catch(IOException ioEx) {
            log.error("Error while closing lucene index: {}",
                                                             ioEx);
        }
    }
////////// INDEX
    /**
     * Index a Lucene document
     * @param doc the document to be indexed
     */
    public void index(final Document doc) { 
        try {
            _reopenToken = _trackingIndexWriter.addDocument(doc);
            log.debug("document indexed in lucene");
        } catch(IOException ioEx) {
            log.error("Error while in Lucene index operation: {}",
                                                                  ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while commiting changes to Lucene index: {}",
                                                                              ioEx);
            }
        }
    }
    /**
     * Updates the index info for a lucene document
     * @param doc the document to be indexed
     */
    public void reIndex(final Term recordIdTerm,
                        final Document doc) {   
        try {
            _reopenToken = _trackingIndexWriter.updateDocument(recordIdTerm, 
                                                               doc);
            log.debug("{} document re-indexed in lucene");
        } catch(IOException ioEx) {
            log.error("Error in lucene re-indexing operation: {}",
                                                                  ioEx);
        } finally {
            try {
                _indexWriter.commit();
            } catch (IOException ioEx) {
                log.error("Error while commiting changes to Lucene index: {}",
                                                                              ioEx);
            }
        }
    }
    /**
     * Unindex a lucene document
     * @param idTerm term used to locate the document to be unindexed
     *               IMPORTANT! the term must filter only the document and only the document
     *                          otherwise all matching docs will be unindexed
     */
    public void unIndex(final Term idTerm) {
        try {
            _reopenToken = _trackingIndexWriter.deleteDocuments(idTerm);
            log.debug("{}={} term matching records un-indexed from lucene");
        } catch(IOException ioEx) {
            log.error("Error in un-index lucene operation: {}",
                                                               ioEx);           
        } finally {
            try {
                _indexWriter.commit(); 
            } catch (IOException ioEx) {
                log.error("Error while commiting changes to Lucene index: {}",
                                                                              ioEx);
            }
        }
    }
    /**
     * Delete all lucene index docs
     */
    public void truncate() {
        try {
            _reopenToken = _trackingIndexWriter.deleteAll();
            log.warn("lucene index truncated");
        } catch(IOException ioEx) {
            log.error("Error truncating lucene index: {}",
                                                          ioEx);            
        } finally {
            try {
                _indexWriter.commit(); 
            } catch (IOException ioEx) {
                log.error("Error truncating lucene index: {}",
                                                              ioEx);
            }
        }
    }
/////// COUNT-SEARCH
    /**
     * Count the number of results returned by a search against the lucene index
     * @param qry the query
     * @return
     */
    public long count(final Query qry) {
        long outCount = 0;
        try {
            _indexSearcherReopenThread.waitForGeneration(_reopenToken);     // wait untill the index is re-opened
            IndexSearcher searcher = _indexSearcherReferenceManager.acquire();
            try {
                TopDocs docs = searcher.search(qry,0);
                if (docs != null) outCount = docs.totalHits;
                log.debug("count-search executed against lucene index returning {}");
            } finally {
                _indexSearcherReferenceManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error re-opening the index {}",
                                                      ioEx);
        } catch (InterruptedException intEx) {
            log.error("The index writer periodically re-open thread has stopped",
                                                                                 intEx);
        }
        return outCount;
    }
    /**
     * Executes a search query
     * @param qry the query to be executed
     * @param sortFields the search query criteria
     * @param firstResultItemOrder the order number of the first element to be returned
     * @param numberOfResults number of results to be returnee
     * @return a page of search results
     */
    public LucenePageResults search(final Query qry,Set<SortField> sortFields,
                                    final int firstResultItemOrder,final int numberOfResults) {
        LucenePageResults outDocs = null;
        try {
            _indexSearcherReopenThread.waitForGeneration(_reopenToken); // wait until the index is re-opened for the last update
            IndexSearcher searcher = _indexSearcherReferenceManager.acquire();
            try {
                // sort crieteria
 
                Sort theSort = new Sort();
                // number of results to be returned
                int theNumberOfResults = firstResultItemOrder + numberOfResults;
 
                // Exec the search (if the sort criteria is null, they're not used)
                TopDocs scoredDocs = searcher.search(qry,
                                                                       theNumberOfResults,
                                                                       theSort);
                log.debug("query {} {} executed against lucene index: returned {} total items, {} in this page");
 
 
                ScoreDoc[] hits = scoredDocs.scoreDocs;
 
    List items = new ArrayList();
    for (int i = 0; i < hits.length; i++) {
   
    org.apache.lucene.document.Document doc = searcher.doc(hits[i].doc);
   
 
 
    items.add(doc);
 
    }
   
            } finally {
                _indexSearcherReferenceManager.release(searcher);
            }
        } catch (IOException ioEx) {
            log.error("Error freeing the searcher {}",
                                                      ioEx);
        } catch (InterruptedException intEx) {
            log.error("The index writer periodically re-open thread has stopped",
                                                                                 intEx);
        }
        return outDocs;
    }
/////// INDEX MAINTEINANCE
    /**
     * Mergest the lucene index segments into one
     * (this should NOT be used, only rarely for index mainteinance)
     */
    public void optimize() {
        try {
            _indexWriter.forceMerge(1);
            log.debug("Lucene index merged into one segment");
        } catch (IOException ioEx) {
            log.error("Error optimizing lucene index {}",
                                                         ioEx);
        }
    }
}
标签 : , ,



发表评论 发送引用通报