Java 全文搜索引擎,Lucene 5.2.0 发布

jopen 7年前

Lucene 是apache软件基金会一个开放源代码的全文检索引擎工具包,是一个全文检索引擎的架构,提供了完整的查询引擎和索引引擎,部分文本分析引擎。 Lucene的目的是为软件开发人员提供一个简单易用的工具包,以方便的在目标系统中实现全文检索的功能,或者是以此为基础建立起完整的全文检索引擎。

Lucene 最初是由Doug Cutting所撰写的,是一位资深全文索引/检索专家,曾经是V-Twin搜索引擎的主要开发者,后来在Excite担任高级系统架构设计师,目前从事 于一些INTERNET底层架构的研究。他贡献出Lucene的目标是为各种中小型应用程式加入全文检索功能。

Lucene 5.2.0 发布,该版本包含一些 bug 修复,优化,和一些改进,值得关注的如下:

  • * Span queries now share document conjunction/intersection code with boolean queries, and use two-phased iterators for faster intersection by avoiding loading positions in certain cases.

  •  * Added two-phase support to SpanNotQuery, and SpanPositionCheckQuery and its subclasses: SpanPositionRangeQuery, SpanPayloadCheckQuery,
    SpanNearPayloadCheckQuery, SpanFirstQuery.

  •  * Added a new query time join to the join module that uses global ordinals, which is faster for subsequent joins between reopens.

  •  * New CompositeSpatialStrategy combines speed of RPT with accuracy of SDV. Includes optimized Intersect predicate to avoid many geometry checks. Uses TwoPhaseIterator.

  •  * New LimitTokenOffsetFilter that limits tokens to those before a configured maximum start offset.