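Almost every website ends up needing full-text search, and Lucene-based full-text search is known to nearly every Java programmer. Below I walk through a few examples.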
1. Building the index
This is simple and takes two lines of code. Information is a class I defined myself.
    SearchIndexer searchIndexer = new SearchIndexer("c:/index");
    searchIndexer.addInformation(new Information(DateTimeUtil.format(new Date(), "yyyy-MM-dd"),
            dt, linkurl, hname, "", html));
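The Information class itself is not shown here. As a rough idea of its shape, the following is a minimal sketch of such a value class. The name InformationSketch, the four fields, and the toFields() helper are all assumptions made for illustration. The real class takes six constructor arguments whose meaning is not documented in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the author's Information class.
// Field names are guesses; the original class is not shown in the post.
final class InformationSketch {
    private final String date;    // already formatted by the caller, e.g. "yyyy-MM-dd"
    private final String title;
    private final String url;
    private final String content; // the text (or HTML) that should be searchable

    InformationSketch(String date, String title, String url, String content) {
        // Normalize nulls so later indexing code never has to null-check.
        this.date = date == null ? "" : date;
        this.title = title == null ? "" : title;
        this.url = url == null ? "" : url;
        this.content = content == null ? "" : content;
    }

    // Field name -> value, in insertion order. An indexer would typically
    // copy each entry into one field of the document it adds to the index.
    Map<String, String> toFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("date", date);
        fields.put("title", title);
        fields.put("url", url);
        fields.put("content", content);
        return fields;
    }
}
```

An indexer wrapper such as SearchIndexer could then iterate over toFields() and decide per field whether it should be analyzed for search or only stored for display.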
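2. Paged search
    ArrayList al = searchIndexer.processHits(hits, (currPage - 1) * recordInEachPage, currPage * recordInEachPage - 1);
This returns the search results for the current page only. The second and third arguments are the first and last hit indexes of that page (0-based, inclusive).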
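The index arithmetic in that call can be sketched on its own. The example below assumes that processHits treats both bounds as inclusive, which is how the expression reads but is not confirmed by the post. PageRange and totalHits are hypothetical names introduced here, and clamping the end index to the last hit is an addition that the original call does not show.

```java
// Hypothetical helper that reproduces the start/end arithmetic of the
// processHits call and clamps the end to the number of available hits.
final class PageRange {
    final int start; // index of the first hit on the page, inclusive
    final int end;   // index of the last hit on the page, inclusive

    private PageRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    static PageRange of(int currPage, int recordInEachPage, int totalHits) {
        if (currPage < 1 || recordInEachPage < 1 || totalHits < 0) {
            throw new IllegalArgumentException("page and page size start at 1, totalHits at 0");
        }
        int start = (currPage - 1) * recordInEachPage;
        // Same upper bound as the original call, but never past the last hit.
        int end = Math.min(currPage * recordInEachPage - 1, totalHits - 1);
        return new PageRange(start, end);
    }

    // True when the requested page lies beyond the last hit.
    boolean isEmpty() {
        return end < start;
    }

    int size() {
        return isEmpty() ? 0 : end - start + 1;
    }

    // Number of pages needed to show all hits (ceiling division).
    static int pageCount(int totalHits, int recordInEachPage) {
        return (totalHits + recordInEachPage - 1) / recordInEachPage;
    }
}
```

With 25 hits and 10 records per page, page 3 covers indexes 20 to 24, and page 4 is empty. Checking isEmpty() before calling into the indexer avoids asking for hits that do not exist.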
3. Keyword highlighting
I wrote this part myself with regular expressions; Lucene seems to ship a highlighter of its own as well.
    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    public static String setKeywordSplitWord(String kw, int ignoreLen, String str, int beforeLen,
            String bgcolor, String fcolor, int fontsize, int kwnum)
    {
        if (str == null || str.equals("")) return "";
        //if(str.length()>1000) return setKeywordSplitWord(kw,ignoreLen,str.substring(0,1000),beforeLen,bgcolor,fcolor,fontsize,kwnum);
        // Strip HTML tags
        str = str.replaceAll("<[^>]*>", "");
        int start = 0;
        String hstr = "", tstr = "";
        String mstr = "";
        int halfLen = ignoreLen / 2;
        // Space-separated keywords become regex alternatives: "a b" -> "a|b"
        kw = Utils.replace(kw, " ", "|");
        Pattern p = Pattern.compile(kw, Pattern.DOTALL);
        int kwnumber = 0;
        if (str.length() > ignoreLen)
        {
            // Long text: keep the head and the tail, and excerpt the middle around each match
            hstr = str.substring(0, halfLen);
            tstr = str.substring(str.length() - halfLen, str.length());
            str = str.substring(halfLen, str.length() - halfLen);
            Matcher m = p.matcher(str);
            int prepos = 0;
            while (m.find(start)) {
                kwnumber++;
                if (kwnumber > kwnum) break; // emit at most kwnum excerpts
                int is = m.start();
                int ie = m.end();
                // Up to beforeLen characters of context on each side of the match.
                // The leading context is dropped if it would overlap the previous excerpt.
                String s = "..." + (str.substring((is - beforeLen < prepos ? is : (is - beforeLen)), is))
                        + str.substring(is, ie) +
                        (str.substring(ie, (ie + beforeLen) > str.length() ? str.length() : (ie + beforeLen))) + "...";
                mstr += s;
                start = ie + beforeLen;
                prepos = ie + beforeLen;
                if (start > str.length()) break;
            }
            String tempStr = hstr + "..." + mstr + "..." + tstr;
            return setKeywordColor(kw, tempStr, bgcolor, fcolor, fontsize);
        } else {
            // Short text: highlight the keywords in place
            return setKeywordColor(kw, str, bgcolor, fcolor, fontsize);
        }
    }
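The setKeywordColor helper called above is not shown in this post. The following is only a sketch of what such a helper might look like, not the original implementation. The KeywordHighlighter class, its method names, the span markup, and the case-insensitive matching are all assumptions. It also drops the fontsize parameter for brevity. Unlike the method above, it quotes each keyword with Pattern.quote, so characters such as "." or "(" in the user's query are matched literally instead of being interpreted as regex syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical highlighter; a sketch, not the author's setKeywordColor.
final class KeywordHighlighter {

    // Builds "kw1|kw2|..." from a space-separated query, quoting each keyword.
    // Returns null when the query contains no keywords.
    static Pattern keywordPattern(String kw) {
        List<String> parts = new ArrayList<>();
        for (String word : kw.trim().split("\\s+")) {
            if (!word.isEmpty()) parts.add(Pattern.quote(word));
        }
        if (parts.isEmpty()) return null;
        return Pattern.compile(String.join("|", parts), Pattern.CASE_INSENSITIVE);
    }

    // Wraps every keyword occurrence in a colored span.
    static String highlight(String kw, String text, String bgcolor, String fcolor) {
        if (text == null || text.isEmpty()) return "";
        Pattern p = (kw == null) ? null : keywordPattern(kw);
        if (p == null) return text;
        Matcher m = p.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String wrapped = "<span style=\"background-color:" + bgcolor
                    + ";color:" + fcolor + "\">" + m.group() + "</span>";
            // quoteReplacement keeps "$" and "\" in the matched text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(wrapped));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, highlight("java lucene", "Java and Lucene", "yellow", "red") wraps both "Java" and "Lucene" while keeping their original capitalization. The sketch does not HTML-escape the text, so it is meant for text whose tags have already been stripped, as setKeywordSplitWord does.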
The final result looks like this:
The paging result looks like this: