
Hadoop Development Cycle (3): Unit Testing

February 20, 2018

      Once a MapReduce job has been packaged and submitted to a distributed cluster, any problem means locating and debugging it remotely, then repackaging and redeploying. Unit testing a MapReduce job before deployment, to eliminate obvious coding bugs and logic errors, improves development efficiency.

     MRUnit is a framework developed by Cloudera specifically for unit testing MapReduce code written for Hadoop. MapDriver tests a Mapper on its own, ReduceDriver tests a Reducer on its own, and MapReduceDriver tests a complete MapReduce job. (In the project's own words: "Apache MRUnit ™ is a Java library that helps developers unit test Apache Hadoop map reduce jobs.")

      Below, MRUnit is used to unit test the word count example from Hadoop Development Cycle (2): Writing Mapper and Reducer Programs. Before running the tests, the MRUnit jar must be added to the classpath. The unit test code is as follows:
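If the project is built with Maven (an assumption; the original post adds the jar manually), the MRUnit dependency can be declared roughly as below. The `classifier` element selects the build matching the Hadoop major version:

```xml
<!-- MRUnit test-scoped dependency; the classifier picks the build
     for your Hadoop major version (hadoop1 or hadoop2). -->
<dependency>
  <groupId>org.apache.mrunit</groupId>
  <artifactId>mrunit</artifactId>
  <version>0.9.0-incubating</version>
  <classifier>hadoop1</classifier>
  <scope>test</scope>
</dependency>
```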

package cn.com.yz.test;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

import cn.com.yz.mapreduce.WordCountMapper;
import cn.com.yz.mapreduce.WordCountReducer;

public class WordCountMapperReducerTest {

	MapDriver<Object, Text, Text, IntWritable> mapDriver;
	ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
	MapReduceDriver<Object, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;

	@Before
	public void setUp() {
		WordCountMapper mapper = new WordCountMapper();
		WordCountReducer reducer = new WordCountReducer();
		mapDriver = MapDriver.newMapDriver(mapper);
		reduceDriver = ReduceDriver.newReduceDriver(reducer);
		mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
	}// end setUp()

	@Test
	public void testMapper() { 
		String line = "Google cooperates with IBM in cloud area";

		mapDriver.withInput(new Object(), new Text(line));
		mapDriver.withOutput(new Text("Google"), new IntWritable(1))
				.withOutput(new Text("cooperates"), new IntWritable(1))
				.withOutput(new Text("with"), new IntWritable(1))
				.withOutput(new Text("IBM"), new IntWritable(1))
				.withOutput(new Text("in"), new IntWritable(1))
				.withOutput(new Text("cloud"), new IntWritable(1))
				.withOutput(new Text("area"), new IntWritable(1));

		mapDriver.runTest();
	}// end testMapper()

	@Test
	public void testReducer() {
		List<IntWritable> values = new ArrayList<IntWritable>();
		values.add(new IntWritable(1));
		values.add(new IntWritable(1));
		reduceDriver.withInput(new Text("Google"), values);
		reduceDriver.withOutput(new Text("Google"), new IntWritable(2));

		reduceDriver.runTest();
	}// end testReducer()
	
	
	@Test
	public void testMapperReducer() throws IOException {
		String line = "Google uses Map Reduce Model";
		List<Pair<Text, IntWritable>> expected = new ArrayList<Pair<Text, IntWritable>>();

		mapReduceDriver.withInput(new Object(), new Text(line));
		List<Pair<Text, IntWritable>> out = mapReduceDriver.run();

		// run() returns results in shuffle-sorted key order (Text byte
		// order, so uppercase sorts before lowercase).
		expected.add(new Pair<Text, IntWritable>(new Text("Google"), new IntWritable(1)));
		expected.add(new Pair<Text, IntWritable>(new Text("Map"), new IntWritable(1)));
		expected.add(new Pair<Text, IntWritable>(new Text("Model"), new IntWritable(1)));
		expected.add(new Pair<Text, IntWritable>(new Text("Reduce"), new IntWritable(1)));
		expected.add(new Pair<Text, IntWritable>(new Text("uses"), new IntWritable(1)));

		Assert.assertEquals(expected, out);
	}// end testMapperReducer()
}
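One caveat about string test data: if the mapper under test splits lines with StringTokenizer (a common choice for word count, assumed here since part two's code is not shown), punctuation stays attached to its word, so an input like "Google uses Map Reduce Model." yields the token "Model.", not "Model". A plain-Java sketch of that behavior (`TokenizeCheck` is a hypothetical helper, not part of the test class):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeCheck {
	// Splits a line on whitespace, the way a StringTokenizer-based
	// word count mapper would.
	static List<String> tokens(String line) {
		List<String> result = new ArrayList<String>();
		StringTokenizer tokenizer = new StringTokenizer(line);
		while (tokenizer.hasMoreTokens()) {
			result.add(tokenizer.nextToken());
		}
		return result;
	}

	public static void main(String[] args) {
		// The trailing period is kept as part of the last token.
		System.out.println(tokens("Google uses Map Reduce Model."));
		// prints [Google, uses, Map, Reduce, Model.]
	}
}
```

Expected outputs in MRUnit tests therefore have to match the mapper's tokenization exactly, punctuation included.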

Test results: (screenshot from the original post not reproduced)


Troubleshooting a common error:

Error:

java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskInputOutputContext, but interface was expected

Fix:

Change the jar from mrunit-0.9.0-incubating-hadoop2.jar (the one suggested in the tutorial, running against hadoop-1.0.4) to mrunit-0.9.0-incubating-hadoop1.jar, and it works.
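The root cause is that TaskInputOutputContext is a class in Hadoop 1.x but an interface in Hadoop 2.x, so the MRUnit classifier must match the Hadoop major version on the classpath. A small diagnostic sketch (the `ClasspathCheck` class is illustrative, not from the original post) shows how to check which shape the classpath actually provides:

```java
// Diagnostic for the IncompatibleClassChangeError above: reports whether
// a named type on the classpath is a class or an interface.
public class ClasspathCheck {
	static String describe(String className) {
		try {
			Class<?> c = Class.forName(className);
			return c.isInterface() ? "interface" : "class";
		} catch (ClassNotFoundException e) {
			return "not on classpath";
		}
	}

	public static void main(String[] args) {
		// Hadoop 1.x: prints "class"; Hadoop 2.x: prints "interface".
		// The MRUnit jar's classifier must match this.
		System.out.println(describe(
				"org.apache.hadoop.mapreduce.TaskInputOutputContext"));
	}
}
```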

