
Using SecondaryDatabase in Berkeley DB to build indexes

October 22, 2013

       A brief introduction to building indexes in BDB.

       Why build an index?

       BDB is a key/value database, so we normally retrieve data by key. In practice, though, we often want to fetch a whole record based on some field inside its value, and the key alone cannot do that. The Berkeley DB Getting Started Guide puts it this way: "Usually you find database records by means of the record's key. However, the key that you use for your record will not always contain the information required to provide you with rapid access to the data that you want to retrieve."

       We do not want to read the records one by one and check each for the content we need. Instead, we can build an index on the fields we care about, which makes lookups far faster. From the same guide: "Rather than iterate through all of the records in your database, examining each in turn for a given person's name, you create indexes based on names and then just search that index for the name that you want. You can do this using secondary databases."

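       So we need an index. In BDB it is called a SecondaryDatabase and, like the primary database, it is a key/value database. The difference is that you cannot write to it directly: inserts and updates through the secondary database are not allowed, although deletes are. When a record in the primary database changes, the secondary database changes with it; this happens automatically and is maintained by BDB. Note that a delete performed on the secondary database also affects the primary database: the data is removed from both.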

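       A secondary database usually contains duplicate keys, because several primary records can share the same indexed value. If duplicate keys can occur, the secondary database must be configured to allow them (setSortedDuplicates(true) in the secondary configuration).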
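       To make the relationship between the two databases concrete, here is a minimal in-memory sketch. It is not the Berkeley DB API: the class IndexedStoreSketch and all of its methods are hypothetical names invented for illustration. It only models the behavior described above: writes go to the primary, the index is derived automatically, one secondary key may map to several primary keys, and a delete through the index removes the primary records too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// Toy model of a primary store plus one automatically maintained index.
// NOT the Berkeley DB API; every name here is hypothetical.
final class IndexedStoreSketch {
    private final Map<Integer, String> primary = new TreeMap<>();
    // secondary key -> sorted set of primary keys (duplicates allowed)
    private final Map<String, TreeSet<Integer>> secondary = new TreeMap<>();
    // derives the secondary key from a primary value; null means "do not index"
    private final Function<String, String> keyCreator;

    IndexedStoreSketch(Function<String, String> keyCreator) {
        this.keyCreator = keyCreator;
    }

    // Writes go to the primary; the index entry is derived from the value.
    void put(int key, String value) {
        deleteByPrimaryKey(key); // drop any stale index entry first
        primary.put(key, value);
        String sKey = keyCreator.apply(value);
        if (sKey != null) {
            secondary.computeIfAbsent(sKey, k -> new TreeSet<>()).add(key);
        }
    }

    // Removing a primary record also removes its index entry.
    void deleteByPrimaryKey(int key) {
        String old = primary.remove(key);
        if (old == null) return;
        String sKey = keyCreator.apply(old);
        if (sKey == null) return;
        TreeSet<Integer> keys = secondary.get(sKey);
        if (keys != null) {
            keys.remove(key);
            if (keys.isEmpty()) secondary.remove(sKey);
        }
    }

    // Deleting through the index removes the primary records as well.
    void deleteBySecondaryKey(String sKey) {
        TreeSet<Integer> keys = secondary.remove(sKey);
        if (keys == null) return;
        for (Integer k : keys) primary.remove(k);
    }

    // Lookup by secondary key returns the matching primary values.
    List<String> findBySecondaryKey(String sKey) {
        List<String> out = new ArrayList<>();
        for (Integer k : secondary.getOrDefault(sKey, new TreeSet<>())) {
            out.add(primary.get(k));
        }
        return out;
    }

    int primarySize() { return primary.size(); }
}
```

       There is deliberately no method that writes to the index directly; the only way to change it is through the primary or through a delete, which mirrors the restriction described above.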

       Part of the initialization code:

private SecondaryConfig sdbConfig;
private SecondaryDatabase sdb;
private SecondaryCursor sCursor;

// "env" is the already opened Environment, "db" the primary Database
sdbConfig = new SecondaryConfig();
sdbConfig.setAllowCreate(true);
sdbConfig.setSortedDuplicates(true); // important for a secondary database: keys may repeat
sdbConfig.setKeyCreator(new ObjectKeyCreator());

// the secondary database is named "name"
sdb = env.openSecondaryDatabase(null, "name", db, sdbConfig);

// sdb.openSecondaryCursor(null, null) has been deprecated; use openCursor instead
sCursor = sdb.openCursor(null, null);

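      The code above uses a class called ObjectKeyCreator. Its job is to take the piece of data in a primary record that we want to index and turn it into a key, which becomes the key of the corresponding record in the secondary database. The value of that secondary record is maintained by BDB and is the key of the primary record. (A little convoluted, but important.) So the main work in building an index is producing the key, that is, implementing the SecondaryKeyCreator interface.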

       The ObjectKeyCreator class is shown below. (If you are indexing an attribute of a stored object, pass the TupleBinding for that object type into the constructor.)

package com.berkeley.dbje;

import java.nio.charset.StandardCharsets;

import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.SecondaryDatabase;
import com.sleepycat.je.SecondaryKeyCreator;

// Creates the keys of the secondary database that indexes "name".
public class ObjectKeyCreator implements SecondaryKeyCreator {

    // Markers of the primary value format, e.g. "person$$name:sa$$age:23".
    private static final String NAME_MARKER = "$$name:";
    private static final String FIELD_SEPARATOR = "$$";

    // Extracts the name from the primary value and stores it in "result",
    // which becomes the key of the secondary record.
    @Override
    public boolean createSecondaryKey(SecondaryDatabase secondDb,
            DatabaseEntry key, DatabaseEntry value, DatabaseEntry result) {
        // "value" is the primary data record; honor its offset and size
        String mvalue = new String(value.getData(), value.getOffset(),
                value.getSize(), StandardCharsets.UTF_8);
        int index = mvalue.indexOf(NAME_MARKER);
        if (index == -1) {
            return false; // no name field: this record is not indexed
        }
        int start = index + NAME_MARKER.length();
        int end = mvalue.indexOf(FIELD_SEPARATOR, start);
        if (end == -1) {
            end = mvalue.length(); // the name is the last field
        }
        result.setData(mvalue.substring(start, end).getBytes(StandardCharsets.UTF_8)); // set the key
        return true;
    }

}
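       Since the key creator is mostly string parsing, it can help to keep that parsing in a plain method that is easy to check without opening a database. The following is a minimal sketch with no Berkeley DB types. The class name NameFieldExtractor is hypothetical, and the sketch assumes the "$$name:" field format used by the test data in this post.

```java
import java.util.Optional;

// Stand-alone version of the name extraction; hypothetical helper,
// assuming values shaped like "person$$name:sa$$age:23".
final class NameFieldExtractor {
    private static final String NAME_MARKER = "$$name:";
    private static final String FIELD_SEPARATOR = "$$";

    private NameFieldExtractor() {}

    // Returns the name field, or empty if the value has no name field.
    static Optional<String> extractName(String value) {
        int index = value.indexOf(NAME_MARKER);
        if (index == -1) {
            return Optional.empty();
        }
        int start = index + NAME_MARKER.length();
        // the name ends at the next separator, or at the end of the value
        int end = value.indexOf(FIELD_SEPARATOR, start);
        if (end == -1) {
            end = value.length();
        }
        return Optional.of(value.substring(start, end));
    }
}
```

       A key creator could then call extractName on the decoded primary value and return false when the result is empty, so that records without a name are simply left out of the index.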

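      With that, the index is built. From now on, whenever the primary database changes, the secondary database is updated accordingly, so there is nothing to maintain by hand (stating the obvious, :D).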

       How do we use this index?

       The code is as follows:

//get the data from primary by searching secondary database using "name"!
public String getRecord(String name){
    String res = "";
    try{
        DatabaseEntry skey = new DatabaseEntry(name.getBytes("UTF-8"));
        DatabaseEntry sdata = new DatabaseEntry();
        
        OperationStatus val = sCursor.getSearchKey(skey, sdata, LockMode.DEFAULT);
        while(OperationStatus.SUCCESS == val){
            res += new String(sdata.getData(),"UTF-8")+ ",";
            val = sCursor.getNextDup(skey, sdata, LockMode.DEFAULT);//find the next "name"!
        }
        
    }catch(Exception e){
        e.printStackTrace();
    }
    return res;
}

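       The cursor gives us access to the index: given a name, we get back every full record in the primary database whose value carries that name. Note that the three-argument getSearchKey() used above returns only the primary data ("the primary data returned as output"), not the primary key. If you also need the primary key, SecondaryCursor has an overload that takes an additional pKey entry: getSearchKey(key, pKey, data, lockMode).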

       Test data:

        myBdb.insert(12, "person$$name:peter$$age:12");
        myBdb.insert(11, "person$$name:sa$$age:23");// same name as the next record, to test the index
        myBdb.insert(10, "person$$name:sa$$age:23");

        System.out.println(myBdb.getRecord("sa"));
        System.out.println(myBdb.getRecord("peter"));

       Output:

       person$$name:sa$$age:23,person$$name:sa$$age:23,
       person$$name:peter$$age:12,

       
       Finally, the order of the close calls matters: close the cursor first, then the secondary database, then the primary database, and the Environment last.
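       One way to get this order without thinking about it is try-with-resources, which closes resources in the reverse of the order in which they were declared. The sketch below uses a hypothetical Handle class as a stand-in for the real handles, so it runs without Berkeley DB. Whether the Environment, Database and cursor classes in your JE version implement AutoCloseable is something to check; if they do not, the same order applies with explicit close() calls in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

// Demonstrates close ordering with stand-in handles (not Berkeley DB classes).
final class CloseOrderSketch {
    // Hypothetical handle that records its name when it is closed.
    static final class Handle implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Handle(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    // Opens four handles in dependency order and returns the order
    // in which try-with-resources closed them.
    static List<String> openAndClose() {
        List<String> closed = new ArrayList<>();
        try (Handle env = new Handle("environment", closed);
             Handle db = new Handle("primary", closed);
             Handle sdb = new Handle("secondary", closed);
             Handle cursor = new Handle("cursor", closed)) {
            // work with the handles here
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(openAndClose());
        // prints [cursor, secondary, primary, environment]
    }
}
```

       Declaring the handles in the order they are opened (environment, primary, secondary, cursor) is enough; the reverse order on close follows from the language rule.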
