An Analysis of Jedis's Default Sharding Implementation

February 9, 2018 · General

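When the volume of data is large, we need to store it across multiple machines. How can the data be distributed evenly?
redis.clients.jedis.ShardedJedisPool.java provides a simple, easy-to-use implementation of data sharding. Its principles are analyzed below.

Let's start with the constructor: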

    public ShardedJedisPool(final GenericObjectPool.Config poolConfig,
            List<JedisShardInfo> shards, Hashing algo, Pattern keyTagPattern) {
        super(poolConfig, new ShardedJedisFactory(shards, algo, keyTagPattern));
    }

poolConfig: the same settings as for JedisPool; see "Analysis of Common JedisConnectionException Errors", http://blog.csdn.net/fachang/article/details/7984123
ShardedJedisFactory: extends org.apache.commons.pool.BasePoolableObjectFactory<T>, which provides an abstract base implementation for creating the instances a connection pool serves. Its API is as follows:

API documentation: http://commons.apache.org/pool/apidocs/index.html?org/apache/commons/pool/impl/GenericObjectPool.html

Return type   Method                     Description
void          activateObject(T obj)      No-op.
void          destroyObject(T obj)       No-op.
abstract T    makeObject()               Creates an instance that can be served by the pool.
void          passivateObject(T obj)     No-op.
boolean       validateObject(T obj)      This implementation always returns true.
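
The shape of that API can be sketched without the commons-pool dependency. The classes below are hypothetical stand-ins (SketchFactory and ShardListFactory are names invented for this illustration, not part of commons-pool or Jedis): a base class with no-op hooks and one abstract method, and a subclass that, like ShardedJedisFactory, only supplies makeObject().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BasePoolableObjectFactory<T>:
// every hook is a no-op except the abstract makeObject().
abstract class SketchFactory<T> {
    public void activateObject(T obj) { /* no-op */ }
    public void destroyObject(T obj) { /* no-op */ }
    public abstract T makeObject() throws Exception;
    public void passivateObject(T obj) { /* no-op */ }
    public boolean validateObject(T obj) { return true; }
}

// Hypothetical factory that overrides only makeObject().
// A plain list of shard names stands in for the pooled object.
class ShardListFactory extends SketchFactory<List<String>> {
    private final List<String> shards;

    ShardListFactory(List<String> shards) {
        this.shards = shards;
    }

    @Override
    public List<String> makeObject() {
        // Each call builds a fresh object from the same shard list.
        return new ArrayList<String>(shards);
    }
}
```

The point of the sketch is that a pool only needs makeObject() to be meaningful; the remaining hooks can stay as inherited defaults.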

What matters here is how ShardedJedisFactory implements the abstract makeObject() method. The source is:

    public Object makeObject() throws Exception {
        ShardedJedis jedis = new ShardedJedis(shards, algo, keyTagPattern);
        return jedis;
    }

Next, look at the constructor of redis.clients.jedis.ShardedJedis, which extends BinaryShardedJedis:

    public ShardedJedis(List<JedisShardInfo> shards, Hashing algo, Pattern keyTagPattern) {
        super(shards, algo, keyTagPattern);
    }

It calls the constructor of its parent class, redis.clients.jedis.BinaryShardedJedis, which extends Sharded<Jedis, JedisShardInfo>. The source is:

    public BinaryShardedJedis(List<JedisShardInfo> shards, Hashing algo, Pattern keyTagPattern) {
        super(shards, algo, keyTagPattern);
    }

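Next, look at its parent class, redis.clients.util.Sharded<R, S extends ShardInfo<R>>. It keeps two pieces of state.

nodes stores the connection descriptors (JedisShardInfo) held by the pool. Each JedisShardInfo is mapped, based on its name, to 160 × weight virtual nodes (the Long keys of the map). The more virtual nodes there are, the more evenly the data is distributed.

    private TreeMap<Long, S> nodes;

resources stores the mapping from each connection descriptor (JedisShardInfo) to the actual connection instance (a Jedis, since R is Jedis here).

    private final Map<ShardInfo<R>, R> resources = new LinkedHashMap<ShardInfo<R>, R>();

The constructor:

    public Sharded(List<S> shards, Hashing algo, Pattern tagPattern) {
        this.algo = algo;
        this.tagPattern = tagPattern;
        initialize(shards);
    }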

The actual sharding work is done by the initialize(List<S> shards) method. The source is:

    private void initialize(List<S> shards) {
        nodes = new TreeMap<Long, S>();
        for (int i = 0; i != shards.size(); ++i) {
            final S shardInfo = shards.get(i);
            if (shardInfo.getName() == null)
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n), shardInfo);
                }
            else
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash(shardInfo.getName() + "*" + shardInfo.getWeight() + n), shardInfo);
                }
            resources.put(shardInfo, shardInfo.createResource());
        }
    }
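
To see what initialize() leaves in the map, here is a minimal standalone sketch of the named-shard branch. It is not the Jedis code: plain strings stand in for JedisShardInfo, and the hash is an assumption (the first 8 bytes of an MD5 digest) chosen only because it ships with the JDK; Jedis uses whatever Hashing instance was passed to the constructor.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeMap;

class RingBuildSketch {
    // Stand-in hash (an assumption, not Jedis's Hashing): first 8 bytes of MD5 as a long.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Mirrors the named-shard branch: 160 * weight virtual nodes per shard,
    // each keyed by hash(name + "*" + weight + n).
    static TreeMap<Long, String> buildRing(String[] names, int[] weights) {
        TreeMap<Long, String> nodes = new TreeMap<Long, String>();
        for (int i = 0; i < names.length; i++) {
            for (int n = 0; n < 160 * weights[i]; n++) {
                nodes.put(hash(names[i] + "*" + weights[i] + n), names[i]);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring =
                buildRing(new String[] {"master1", "master2"}, new int[] {1, 2});
        System.out.println("virtual nodes: " + ring.size());
    }
}
```

With weights 1 and 2 the loop inserts 160 and 320 entries, so the shard with weight 2 owns twice as many positions in the map. The total is 480 unless two hashes happen to collide.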

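From this we can see that ShardedJedisPool shards by the name attribute of each configured JedisShardInfo. Specifically, for each Redis connection it hashes the name, the weight, and a counter 160 × weight times. Each resulting hash (a Long) becomes the key of one virtual node, and every one of those keys maps to that connection's JedisShardInfo instance. With a weight of 1, that is 160 virtual nodes per connection. (A shard without a name falls back to "SHARD-" + index + "-NODE-" + counter.) Suppose there are two Redis connections configured as follows: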

    List<JedisShardInfo> shards = new ArrayList<JedisShardInfo>();
    shards.add(new JedisShardInfo("localhost", 6379, "master1"));
    shards.add(new JedisShardInfo("localhost", 6380, "master2"));

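Then:
The connection named master1 on port 6379 is represented in the pool by 160 virtual nodes generated from the name master1, the weight, and the counter.
The connection named master2 on port 6380 is represented in the pool by 160 virtual nodes generated from the name master2, the weight, and the counter.
Note: why 160? In my opinion, this number was chosen so that Redis keys are distributed evenly across the connections.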
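
One way to probe that guess is to count where sample keys land for different numbers of virtual nodes. The sketch below is an experiment harness, not Jedis code: the MD5-based hash, the key names "key0", "key1", ... and the class name SpreadSketch are all assumptions made for illustration, and the exact counts will differ under Jedis's real hash function.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class SpreadSketch {
    // Stand-in hash (an assumption): first 8 bytes of MD5 as a long.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts how many of keyCount sample keys land on each shard
    // when every shard (weight 1) gets `vnodes` virtual nodes.
    static Map<String, Integer> spread(String[] names, int vnodes, int keyCount) {
        TreeMap<Long, String> ring = new TreeMap<Long, String>();
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String name : names) {
            counts.put(name, 0);
            for (int n = 0; n < vnodes; n++) {
                ring.put(hash(name + "*1" + n), name);
            }
        }
        for (int k = 0; k < keyCount; k++) {
            SortedMap<Long, String> tail = ring.tailMap(hash("key" + k));
            String owner = tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
            counts.put(owner, counts.get(owner) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] names = {"master1", "master2"};
        for (int vnodes : new int[] {1, 10, 160}) {
            System.out.println(vnodes + " virtual nodes per shard -> " + spread(names, vnodes, 10000));
        }
    }
}
```

Running main prints the per-shard key counts for 1, 10, and 160 virtual nodes, which makes it easy to compare how close each setting comes to an even split.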

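When we then operate on Redis by key, the same hash algorithm is applied to that key. The lookup finds, among all virtual nodes (the keys under which the connections are stored in the pool), the first one whose hash is greater than or equal to the key's hash, wrapping around to the first virtual node if there is none. The Redis connection behind that virtual node is then used for the operation.

For example, take the following operation: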

    ShardedJedis jedis = shardedJedisPool.getResource();
    jedis.set("key1", "value1");

The relevant source in ShardedJedis.java is:

    public String set(String key, String value) {
        Jedis j = getShard(key);
        return j.set(key, value);
    }

It first calls getShard(key) to obtain the connection instance for the key being operated on. The source of getShard() is:

    public R getShard(String key) {
        return resources.get(getShardInfo(key));
    }

To fetch the connection instance from resources, we first need to know which virtual node in TreeMap<Long, S> nodes the key maps to:

    public S getShardInfo(byte[] key) {
        SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
        if (tail.isEmpty()) {
            return nodes.get(nodes.firstKey());
        }
        return tail.get(tail.firstKey());
    }

    public S getShardInfo(String key) {
        return getShardInfo(SafeEncoder.encode(getKeyTag(key)));
    }
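
The String overload passes the key through getKeyTag before hashing, and the listings above do not show that method. Given the keyTagPattern constructor argument, one plausible reading is that a regular expression extracts part of the key so that related keys hash to the same shard. The sketch below illustrates that idea only; the brace pattern, the class name KeyTagSketch, and the method body are assumptions, not the Jedis source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyTagSketch {
    // Assumed pattern for illustration: the text inside the first pair of braces is the tag.
    static final Pattern TAG = Pattern.compile("\\{(.+?)\\}");

    // Returns the tagged part of the key if the pattern matches, otherwise the whole key.
    static String keyTag(Pattern tagPattern, String key) {
        if (tagPattern != null) {
            Matcher m = tagPattern.matcher(key);
            if (m.find()) {
                return m.group(1);
            }
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(keyTag(TAG, "user:{42}:name"));  // 42
        System.out.println(keyTag(TAG, "user:{42}:email")); // 42, so it hashes like the key above
        System.out.println(keyTag(null, "user:{42}:name")); // no pattern: the whole key is hashed
    }
}
```

Under this reading, passing a null pattern means every key is hashed in full, while a pattern lets the caller pin a group of keys to one shard.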

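The key part is the implementation of getShardInfo(byte[] key): it takes the key's hash and looks for the virtual nodes in nodes whose hash is greater than or equal to it. If there are none, it takes the first virtual node in the map; otherwise it takes the first of those at or above the hash.
As a result, save and query operations on the same key always map to the same virtual node and obtain the same connection, which keeps the data consistent and complete.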
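
The lookup rule is easiest to check with hand-picked positions instead of real hashes. In this minimal sketch the ring positions 100, 200, and 300 and the class name RingLookupSketch are made up for illustration; the method body follows the same tailMap logic as getShardInfo(byte[]), with plain strings in place of JedisShardInfo.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class RingLookupSketch {
    // Same shape as getShardInfo(byte[]), but takes an already-computed hash value.
    static String lookup(TreeMap<Long, String> nodes, long keyHash) {
        SortedMap<Long, String> tail = nodes.tailMap(keyHash); // entries with key >= keyHash
        if (tail.isEmpty()) {
            return nodes.get(nodes.firstKey()); // past the last node: wrap to the first
        }
        return tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        TreeMap<Long, String> nodes = new TreeMap<Long, String>();
        nodes.put(100L, "master1");
        nodes.put(200L, "master2");
        nodes.put(300L, "master1");
        System.out.println(lookup(nodes, 150L)); // master2: next node at or above 150 is 200
        System.out.println(lookup(nodes, 200L)); // master2: tailMap is inclusive, 200 matches itself
        System.out.println(lookup(nodes, 301L)); // master1: nothing at or above 301, wrap to 100
    }
}
```

Because the result depends only on the key's hash and the contents of the map, repeating a lookup for the same key always returns the same shard as long as the shard list is unchanged.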

The above is the default implementation of Jedis's sharding mechanism, but in real applications we do not necessarily have to use it.


Author: swirling

Last updated: February 9, 2018.