
Lucene Filters - baobeituping - ITeye Tech Site

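Some applications require that certain kinds of content never appear in search results, even when a document matches the query. Lucene supports this through filters: you write a custom class that extends Filter and implements the specific filtering logic you need.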

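A Filter describes the filtering behavior, and it expresses its decision as a BitSet. A BitSet is a set of bits indexed by document number, where each bit is either true or false, and Lucene uses that value to indicate whether the document may be returned. When Lucene returns results, it first consults the BitSet and returns only those documents whose bit is true.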
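To make the bit-per-document idea concrete, here is a minimal sketch that uses only java.util.BitSet and has no Lucene dependency. The class name BitSetFilterSketch, the method visibleDocs, and the document numbers are hypothetical and chosen only for illustration. It marks every document as visible, clears the bits of the documents to hide, and lists what remains. Note that the end index of BitSet.set(from, to) is exclusive, and that size() reports allocated storage rather than the number of documents.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class BitSetFilterSketch {

    // Returns the document numbers whose bit is still set after hiding `hidden`.
    static List<Integer> visibleDocs(int maxDoc, int... hidden) {
        BitSet bits = new BitSet(maxDoc);
        // Mark documents 0..maxDoc-1 as visible; the upper bound is exclusive.
        bits.set(0, maxDoc);
        // Clear the bit of each document that must not be returned.
        for (int doc : hidden) {
            bits.clear(doc);
        }
        // Collect the document numbers whose bit is still true.
        List<Integer> visible = new ArrayList<>();
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            visible.add(doc);
        }
        return visible;
    }

    public static void main(String[] args) {
        // Three documents (0, 1, 2); document 0 is hidden.
        System.out.println(visibleDocs(3, 0)); // prints [1, 2]
    }
}
```

A filter written against Lucene follows the same pattern, with the hidden document numbers coming from the index instead of a hard-coded list.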

 

The filter:

package com.filter;

import java.io.IOException;
import java.util.BitSet;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.Filter;

 

public class AdvancedSecurityFilter extends Filter {

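 // Constant for the highest security level
 public static final int ADVANCED=0;
 @Override
 public BitSet bits(IndexReader reader) throws IOException {
  // Create a BitSet with one bit per document in the index
  final BitSet bits = new BitSet(reader.maxDoc());
  // Set every bit to true first, so that all documents start out searchable.
  // The end index of set(from, to) is exclusive, and size() reports allocated
  // storage rather than the document count, so use maxDoc() as the bound.
  bits.set(0,reader.maxDoc());
  // Build a Term that represents the highest security level
  Term term = new Term("securitylevel",ADVANCED+"");
  
  // Look up every document in the index that has the highest security level
  TermDocs termDocs = reader.termDocs(term);
  
  // Walk through those documents and clear their bits so they are filtered out
  try {
   while(termDocs.next())
   {
    bits.set(termDocs.doc(),false);
   }
  } finally {
   termDocs.close();
  }
  return bits;
 }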

 

}

 

Example of using the filter:

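import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RangeQuery;

import com.filter.AdvancedSecurityFilter;

public class FilterDemo {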

 // Security levels stored in the "securitylevel" field
 public static final int ADVANCED=0;
 public static final int MIDDLE =1;
 public static final int NORMAL=2;
 public static void main(String[] args) {
  try {
   
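   // One-time index construction, left commented out. To build the index at d://demo,
   // uncomment this block and add the imports it needs; on the first run, pass true as
   // the third IndexWriter argument so that a new index is created.
   /*File file = new File("d://demo");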
   Analyzer luceneAnalyzer = new StandardAnalyzer();
   IndexWriter writer = new IndexWriter(file, luceneAnalyzer, false);
   Document doc1 = new Document();
   Field f1 = new Field("bookNumber","0003",Field.Store.YES,Field.Index.UN_TOKENIZED);
   Field f2 = new Field("bookName","非对称模型",Field.Store.YES,Field.Index.UN_TOKENIZED);
   Field f3 = new Field("securitylevel",ADVANCED+"",Field.Store.YES,Field.Index.UN_TOKENIZED);
   doc1.add(f1);
   doc1.add(f2);
   doc1.add(f3);
   
   Document doc2 = new Document();
   Field f4 = new Field("bookNumber","0001",Field.Store.YES,Field.Index.UN_TOKENIZED);
   Field f5 = new Field("bookName","钢铁战士",Field.Store.YES,Field.Index.TOKENIZED);
   Field f6 = new Field("securitylevel",MIDDLE+"",Field.Store.YES,Field.Index.UN_TOKENIZED);
   doc2.add(f4);
   doc2.add(f5);
   doc2.add(f6);
   
   Document doc3 = new Document();
   Field f7 = new Field("bookNumber","0004",Field.Store.YES,Field.Index.UN_TOKENIZED);
   Field f8 = new Field("bookName","黑猫警长",Field.Store.YES,Field.Index.TOKENIZED);
   Field f9 = new Field("securitylevel",NORMAL+"",Field.Store.YES,Field.Index.UN_TOKENIZED);
   doc3.add(f7);
   doc3.add(f8);
   doc3.add(f9);
   
   writer.addDocument(doc1);
   writer.addDocument(doc2);
   writer.addDocument(doc3);
   
   writer.setUseCompoundFile(true);
   writer.optimize();
   writer.close();*/
   
   Term begin = new Term("bookNumber","0001");
   Term end = new Term("bookNumber","0004");
   RangeQuery q = new RangeQuery(begin,end,true);
   
   IndexSearcher searcher = new IndexSearcher("d://demo");
   System.out.println(q.toString());

   // Passing the custom filter to search() applies it to the query results.
   Hits hits = searcher.search(q,new AdvancedSecurityFilter());
   for(int i=0;i<hits.length();i++)
   {
    System.out.println(hits.doc(i));
   }
   searcher.close();
   
  } catch (Exception e) {
   e.printStackTrace();
  }

 }

}
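To see what the demo is expected to return, the sketch below simulates the same search with plain Java collections instead of a Lucene index. It assumes the index holds exactly the three books from the commented-out indexing code; the class name FilterDemoSketch, the BOOKS table, and the search method are hypothetical. The inclusive range 0001 to 0004 matches all three books, and the filter then drops the one whose security level is ADVANCED (book 0003).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Lucene index; names are for illustration only.
class FilterDemoSketch {
    static final int ADVANCED = 0;
    static final int MIDDLE = 1;
    static final int NORMAL = 2;

    // Rows of {bookNumber, securitylevel}, mirroring the three documents in the demo.
    static final String[][] BOOKS = {
        {"0003", String.valueOf(ADVANCED)},
        {"0001", String.valueOf(MIDDLE)},
        {"0004", String.valueOf(NORMAL)},
    };

    // Inclusive lexicographic range match on bookNumber, then drop ADVANCED documents.
    static List<String> search(String begin, String end) {
        List<String> hits = new ArrayList<>();
        for (String[] book : BOOKS) {
            boolean inRange = book[0].compareTo(begin) >= 0 && book[0].compareTo(end) <= 0;
            boolean allowed = !book[1].equals(String.valueOf(ADVANCED));
            if (inRange && allowed) {
                hits.add(book[0]);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(search("0001", "0004")); // prints [0001, 0004]
    }
}
```

With the real index, the equivalent outcome would be that only the documents for books 0001 and 0004 are printed.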
