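I am using Lucene 5 to build an index over a database blog table, with fields including tid, title, and content. Because the same row may end up being indexed more than once, I add the id to the index like this: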
doc.add(new Field("tid", post.getTid().toString(), LuceneType.indexType()));
indexType() is defined as:
public static FieldType indexType() {
    FieldType indexType = new FieldType();
    indexType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
    // Do not tokenize: index the value as a single term
    indexType.setTokenized(false);
    // Store the original value so it can be read back from search hits
    indexType.setStored(true);
    return indexType;
}
The query code is:
Query query = new TermQuery(new Term("content", "故事"));
Filter filter = new DuplicateFilter("tid");
ScoreDoc[] hits = indexSearcher.search(query, filter, 1000).scoreDocs;
for (int i = 0; i < hits.length; i++) {
    Document hitDoc = indexSearcher.doc(hits[i].doc);
    System.out.println(hitDoc.get("tid"));
}
However, the search still returns multiple documents with the same id. How can I fix this? Thanks.
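For reference, one stopgap that does not depend on DuplicateFilter would be to drop repeated tid values in application code after the search returns. This is only a minimal sketch, assuming the tid values have already been read from the hits into a list in rank order; the class and method names (TidDedup, firstPerTid) are hypothetical and not part of Lucene:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class TidDedup {
    // Hypothetical helper (not Lucene API): keeps the first occurrence of
    // each tid and preserves the original hit order.
    public static List<String> firstPerTid(List<String> tids) {
        // LinkedHashSet removes duplicates while keeping insertion order
        return new ArrayList<>(new LinkedHashSet<>(tids));
    }

    public static void main(String[] args) {
        // Assumed input: tid values collected from hits, in rank order
        List<String> tids = Arrays.asList("7", "3", "7", "9", "3");
        System.out.println(firstPerTid(tids)); // prints [7, 3, 9]
    }
}
```

This only hides the duplicates at read time, so the limit of 1000 hits would still count the repeated documents.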