Our cluster previously ran hadoop-0.20.3, hbase-0.90.4, zookeeper-3.3.4, and hive-0.8.1. Hadoop was reasonably stable with few bugs, but running Hive queries on top of HBase was riddled with problems: HBase bugs of every kind, such as lost data, lost tables, and frequent regionserver crashes. All the patching and firefighting was about to make my head explode, so I decided to give HBase a thorough upgrade.
I. First I upgraded HBase to 0.94.0. The upgrade itself (a simple replacement of the installation package) went smoothly enough, but startup failed with:
- 2012-06-26 15:59:19,051 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to create cluster ID file in hdfs://server:9000/hbase, retrying in 10000msec: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String, org.apache.hadoop.fs.permission.FsPermission, java.lang.String, boolean, boolean, short, long)
- at java.lang.Class.getMethod(Class.java:1605)
- at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:517)
- at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
- at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:396)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
- at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
- at org.apache.hadoop.ipc.Client.call(Client.java:1066)
- at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
- at $Proxy10.create(Unknown Source)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
- at java.lang.reflect.Method.invoke(Method.java:597)
- at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
- at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
- at $Proxy10.create(Unknown Source)
- at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3245)
- at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
- at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
- at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
- at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
- at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
- at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
- at org.apache.hadoop.hbase.util.FSUtils.setClusterId(FSUtils.java:463)
- at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:357)
- at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
- at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
- at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:480)
- at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:343)
- at java.lang.Thread.run(Thread.java:662)
Looking at the source shows that the ClientProtocol.create method in hadoop-0.20.3 is:
- public void create(String src, FsPermission masked, String clientName,
-                    boolean overwrite, short replication, long blockSize) throws IOException;
This is incompatible with the create method that hbase-0.94.0 invokes: the signature in the error above carries an extra boolean parameter that the old server-side interface does not expose.
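The failure mode can be reproduced in miniature. Hadoop's RPC server resolves the method a client asks for by reflection (the Class.getMethod call at RPC.java:517 in the trace above), so a client built against a newer interface surfaces as NoSuchMethodException on the server. A minimal sketch, using a toy interface rather than Hadoop's real ClientProtocol (FsPermission is dropped to keep it self-contained; all names here are illustrative):

```java
// Toy reproduction of the RPC mismatch: the server looks up the requested
// method reflectively, so a parameter-list difference becomes
// NoSuchMethodException at call time rather than a compile error.
public class RpcMethodLookup {

    // Old-style server interface: create() takes ONE boolean (overwrite),
    // like ClientProtocol.create in hadoop-0.20.3 (FsPermission omitted).
    interface OldProtocol {
        void create(String src, String clientName,
                    boolean overwrite, short replication, long blockSize);
    }

    // Mimics the server-side reflective lookup: does the interface expose
    // a create() method with exactly this parameter list?
    static boolean hasMethod(Class<?> iface, Class<?>... paramTypes) {
        try {
            iface.getMethod("create", paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An old-style call (one boolean) resolves fine.
        System.out.println(hasMethod(OldProtocol.class,
            String.class, String.class, boolean.class,
            short.class, long.class));                      // prints true

        // A new-style call (two booleans, as in the error log) does not.
        System.out.println(hasMethod(OldProtocol.class,
            String.class, String.class, boolean.class, boolean.class,
            short.class, long.class));                      // prints false
    }
}
```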
II. So Hadoop itself had to be upgraded. After looking over the release versions, I decided to move from 0.20.3 to 1.0.3.
Unlike an HBase upgrade, a Hadoop upgrade may involve incompatible data versions (this was especially true before the 1.0 releases), so it is considerably more involved. Besides replacing the Hadoop installation files, the HDFS data and metadata have to be upgraded as well.
Upgrading the HDFS data and metadata carries real risk, including possible data loss, so before upgrading a production cluster the procedure should first be verified on a test cluster. The upgrade steps:
1. Make sure the filesystem is healthy (a full check with the fsck tool)
2. Clear out HDFS and MapReduce temporary data
3. Make sure the previous upgrade has been finalized.
4. Shut down MapReduce and kill any orphaned tasks on the tasktrackers.
5. Shut down HDFS and back up the metadata
6. Replace the installation files (on every machine in the cluster)
7. Start HDFS with the -upgrade option (start-dfs.sh -upgrade)
8. Wait until the upgrade completes (check progress with hadoop dfsadmin -upgradeProgress status)
9. Check that HDFS is running normally (fsck tool)
10. Start MapReduce
11. Roll back (start-dfs.sh -rollback) or finalize the upgrade (hadoop dfsadmin -finalizeUpgrade)
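The steps above, expressed as the commands run from the namenode, look roughly like this (a sketch for Hadoop 1.0.x; the backup paths are illustrative, substitute your own dfs.name.dir):

```shell
# 1. Verify filesystem health before touching anything
hadoop fsck /

# 4. Stop MapReduce first so no jobs write during the upgrade
stop-mapred.sh

# 5. Stop HDFS and back up the namenode metadata
#    (/data/dfs/name is an example dfs.name.dir; use yours)
stop-dfs.sh
tar czf /backup/namenode-meta.tar.gz /data/dfs/name

# 6. Replace the Hadoop installation files on every node, then:

# 7. Start HDFS in upgrade mode
start-dfs.sh -upgrade

# 8. Poll until the upgrade reports complete
hadoop dfsadmin -upgradeProgress status

# 9. Re-check filesystem health, then restart MapReduce
hadoop fsck /
start-mapred.sh

# 11. Once satisfied, finalize (or roll back with start-dfs.sh -rollback)
hadoop dfsadmin -finalizeUpgrade
```

Note that until -finalizeUpgrade is run, HDFS keeps the pre-upgrade copy of the metadata so a rollback remains possible.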
III. Upgrade ZooKeeper to 3.4.3 (details omitted).
IV. Start HBase. At this point the following warning appears:
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/usr/local/hadoop/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
This is because the slf4j-log4j12 jar under HBase's lib directory conflicts with the one under Hadoop's lib directory (per the warning above, 1.5.8 under /usr/local/hbase/lib and 1.4.3 under /usr/local/hadoop/lib). Deleting the slf4j-log4j12 jar under HBase's lib resolves it.
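Using the path from the warning above, the fix is a one-liner:

```shell
# Drop HBase's bundled SLF4J binding so only Hadoop's copy stays on the classpath
rm /usr/local/hbase/lib/slf4j-log4j12-1.5.8.jar
```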
After that HBase starts normally. Upgrade complete.
V. Replace the relevant jars and configuration files in Hive. On starting Hive, problems appeared yet again:
- 2012-06-28 16:27:55,303 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
- 2012-06-28 16:27:55,303 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
- 2012-06-28 16:27:55,307 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
- 2012-06-28 16:27:55,307 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
- 2012-06-28 16:27:55,307 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
- 2012-06-28 16:27:55,307 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
- 2012-06-28 16:27:58,605 WARN parse.SemanticAnalyzer (SemanticAnalyzer.java:genBodyPlan(5821)) - Common Gby keys:null
- 2012-06-28 16:27:58,850 WARN hbase.HBaseConfiguration (HBaseConfiguration.java:<init>(48)) - instantiating HBaseConfiguration() is deprecated. Please use HBaseConfiguration#create() to construct a plain Configuration
- 2012-06-28 16:27:59,006 WARN hbase.HBaseConfiguration (HBaseConfiguration.java:<init>(48)) - instantiating HBaseConfiguration() is deprecated. Please use HBaseConfiguration#create() to construct a plain Configuration
- 2012-06-28 16:27:59,539 ERROR CliDriver (SessionState.java:printError(