Lucene Indexing

December 9, 2013

The index store is at the core of everything Lucene does: documents are indexed into it, and queries are run against it.
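To make the "build into it, query from it" idea concrete, here is a toy inverted index in plain Java. It is only an illustrative sketch, not how Lucene stores its index: the class name `ToyInvertedIndex` and its methods are made up for this example, and the tokenizer simply splits on anything that is not a letter or digit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class for illustration; Lucene's real index format is far more elaborate.
class ToyInvertedIndex {
    // term -> sorted set of ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // "Indexing": split the text into lowercase terms and record the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // "Querying": look the term up directly instead of scanning every document.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Red cotton shirt");
        index.add(2, "Blue cotton jacket");
        System.out.println(index.search("cotton")); // prints [1, 2]
        System.out.println(index.search("jacket")); // prints [2]
    }
}
```

The point of the sketch is the shape of the data: the work is done once at indexing time, so a lookup by term is a single map access.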

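1. First, download the Lucene jar from the official Apache site. I am using Lucene 3.0.3. I suggest not picking too new a version: a very recent release may have few users and therefore little reference material, and the official site does not necessarily offer much documentation for newer versions either.

Only one jar is needed: lucene-core-3.0.3.jar, Lucene's core jar, is enough for this example.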

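2. Next, create a database table and try building an index from it.

The sample code on the official site still seems to date from the Lucene 2.x era and has not been updated since.

The code below works with Lucene 3.x; the newest releases may need small changes. The official sample needed heavy modification before it would run, and this is my modified version from that time.

Lucene's original architecture was not particularly good and did not program to interfaces, so it was later reworked substantially to follow Java conventions. Lucene 3.x in particular is largely incompatible with Lucene 2.x.

Versions change quickly; if the code does not work for you, adapt it using the API documentation for your own version.

Here is the test example:

The A_GoodsInfo table contains the fields GoodsId and GoodsName.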

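    import java.io.File;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    // DBUtil is the author's own JDBC helper class; it is not shown in this post.
    public class GoodsIndexer {

        /**
         * Builds a basic Lucene index from the A_GoodsInfo table.
         */
        public void createDBTableIndex() {
            // Directory that holds Lucene's index files
            String indexPath = "D:\\luceneIndex";
            Connection conn = null;
            IndexWriter indexWriter = null;
            try {
                conn = DBUtil.getConnection();
                // Select the same columns that are read from the result set below
                String sql = "select GoodsId, GoodsName from A_GoodsInfo";
                PreparedStatement pstmt = conn.prepareStatement(sql);
                ResultSet rs = pstmt.executeQuery();
                System.out.println("Connected to the database.");

                File indexDir = new File(indexPath);
                Analyzer luceneAnalyzer = new StandardAnalyzer(Version.LUCENE_30);
                Directory dir = FSDirectory.open(indexDir);
                indexWriter = new IndexWriter(dir, luceneAnalyzer, IndexWriter.MaxFieldLength.LIMITED);

                // Turn each table row into one Lucene document
                while (rs.next()) {
                    int id = rs.getInt("GoodsId");
                    String goodsName = rs.getString("GoodsName");
                    if (goodsName == null) {
                        goodsName = ""; // avoid indexing the literal string "null"
                    }

                    Document document = new Document();
                    // Add the indexed fields; Store.YES also keeps the raw value for display in results.
                    // For an identifier, Field.Index.NOT_ANALYZED is usually the better choice.
                    document.add(new Field("id", String.valueOf(id), Field.Store.YES, Field.Index.ANALYZED));
                    document.add(new Field("GoodsName", goodsName, Field.Store.YES, Field.Index.ANALYZED));

                    indexWriter.addDocument(document);
                    System.out.println("Indexed GoodsName: " + goodsName);
                }

                // Optimize the index (merges segments; optional and slow on large indexes)
                indexWriter.optimize();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // Close each resource separately and only if it was actually opened
                try {
                    if (indexWriter != null) {
                        indexWriter.close();
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
                try {
                    if (conn != null) {
                        conn.close();
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }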

The example above loads data from a database into index files, which is the index-building process.

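An index can be built not only from a database but also from other sources such as HTML, PDF, and XML files. The approach of Baidu and Google is roughly this: at regular intervals they crawl data from the web into their distributed file systems and then build the corresponding indexes. We skipped the crawling step here. It needs support from other frameworks, for example a crawler such as Apache Nutch, together with Apache Tika for extracting text from the fetched files.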
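As a rough illustration of indexing from files instead of a database, the sketch below collects the contents of plain-text files under a directory using only the JDK. The class `TextFileCollector` and its `collect` method are hypothetical names for this example. It handles `.txt` files only and assumes they are UTF-8; HTML or PDF input would need a real parser first. Each returned entry could then be turned into a Lucene `Document` the same way a table row is in the database example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; it only gathers text and does not touch Lucene.
class TextFileCollector {

    // Returns relative file path -> file content for every .txt file under root, in path order.
    static Map<String, String> collect(Path root) {
        Map<String, String> contents = new LinkedHashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            List<Path> textFiles = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path file : textFiles) {
                // Assumption: the files are UTF-8 encoded
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                contents.put(root.relativize(file).toString(), text);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return contents;
    }

    public static void main(String[] args) {
        // Directory to scan: first argument, or the current directory by default
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, String> entry : collect(root).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().length() + " chars");
        }
    }
}
```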

That completes the index, the simplest possible one.

http://blog.csdn.net/xiaozhengdong/article/details/7035384

Author: diluted

Posted by diluted in the General category; last updated December 9, 2013.