
Common HDFS Java APIs Explained

April 7, 2013

1. Reading Data with a Hadoop URL

package hadoop;

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {

    static {
        // Teach java.net.URL to recognize the hdfs:// scheme.
        // Note: setURLStreamHandlerFactory may be called at most once per JVM.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void readHdfs(String url) throws Exception {
        InputStream in = null;
        try {
            in = new URL(url).openStream();
            // Copy the stream to stdout in 4 KB chunks; the final argument
            // is false so the streams are closed in the finally block instead.
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws Exception {
        readHdfs("hdfs://192.168.49.131:9000/user/hadoopuser/input20120828/file01");
    }
}
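Because java.net.URL.setURLStreamHandlerFactory can be called at most once per JVM, the approach above fails if another component of your application has already installed a factory. In that case the FileSystem API is the usual alternative; the following is a minimal sketch (the class name FileSystemCat is illustrative, and the sample path is the same one used above):

package hadoop;

import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {

    public static void readHdfs(String uri) throws Exception {
        // Loads core-site.xml / hdfs-site.xml from the classpath, if present.
        Configuration conf = new Configuration();
        // Obtain a handle to the HDFS instance named in the URI.
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws Exception {
        readHdfs("hdfs://192.168.49.131:9000/user/hadoopuser/input20120828/file01");
    }
}

Unlike the URL approach, this does not touch global JVM state, which is the main reason it is usually preferred.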

Among the jar files I used (the full list appeared as an image in the original post), the key dependency is hadoop-core, and its version must match the Hadoop version installed on the cluster; otherwise you will get an error such as:

12/09/11 14:18:59 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/thirdparty/guava/common/collect/LinkedListMultimap
    at org.apache.hadoop.hdfs.SocketCache.<init>(SocketCache.java:48)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:240)

The Hadoop version installed on the cluster was shown in a screenshot here (image not preserved).
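To confirm which version your client jar was built from, the VersionInfo utility class in hadoop-core can print it at runtime; a minimal sketch follows (the class name PrintVersion is just for illustration). On the cluster side, running `hadoop version` reports the same information.

package hadoop;

import org.apache.hadoop.util.VersionInfo;

public class PrintVersion {
    public static void main(String[] args) {
        // Prints the Hadoop version compiled into the client-side jar,
        // e.g. "1.0.4"; this should match `hadoop version` on the cluster.
        System.out.println(VersionInfo.getVersion());
    }
}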
