Hadoop Cluster Performance Testing

Posted by JerryXia

Disk reads:

# hdparm -tT --direct /dev/vdb1
/dev/vdb1:
 Timing O_DIRECT cached reads:   3286 MB in  2.00 seconds = 1613.15 MB/sec
 Timing O_DIRECT disk reads: 3000MB in  3.01 seconds = 1022.49 MB/sec

Network I/O

Network transfer: a point-to-point copy averaged 101.6 MB/s.

iperf measured an average network throughput of about 110 MB/s.
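As a rough sanity check on the two figures above, the measured rates can be compared with the raw bit rate of the link. The sketch below assumes a 1 Gbit/s link, which the post does not state, and treats 1 MB as 10^6 bytes; `pct_of_line_rate` and `LINK_MBIT` are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Assumption: the cluster nodes are connected by a 1 Gbit/s link.
# The post does not say so; change LINK_MBIT to match the real hardware.
LINK_MBIT=1000

# pct_of_line_rate MEASURED_MB_PER_S [LINK_MBIT]
# Prints the measured rate as a percentage of the link's raw bit rate,
# treating 1 MB as 10^6 bytes (8 Mbit).
pct_of_line_rate() {
  awk -v mbps="$1" -v link="${2:-$LINK_MBIT}" \
    'BEGIN { printf "%.1f\n", mbps * 8 / link * 100 }'
}

pct_of_line_rate 101.6   # point-to-point copy
pct_of_line_rate 110     # iperf
```

Under that assumption both measurements fall between 80 and 90 percent of the raw link rate, which would mean the network is close to saturated in both tests. If the tools actually reported binary megabytes (2^20 bytes), the percentages would be somewhat higher.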

Hadoop Benchmark

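Benchmark tools

There are quite a few benchmark tools available online. Broadly, these are the main ones:

  • The tests bundled with Hadoop
  • Intel's HiBench
  • BigDataBench from the Chinese Academy of Sciences
  • Berkeley's benchmark
  • eBay's benchmark (I can't recall the name)

These are the better-known Hadoop benchmarks I have found so far. After narrowing the field, I planned to pick one of the first three. Each has its own strengths, but since this round only tests I/O, and I don't have root access to the cluster, I went with the least-effort option: the tests bundled with Hadoop.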

Script

I wrote a small script.

jar_path=hadoop-test-mr1.jar
main_class=TestDFSIO

echo "Starting Hadoop cluster test!"
echo "-------------------------------------------------------------"
echo "Clearing the test directory!"
hadoop jar "$jar_path" "$main_class" -clean

# Tiny-file case: many files, a few bytes each
echo "Starting tiny-file test!"
echo "-------------------------------------------------------------"
echo "Reading and writing 1000 files of 10 B each"
hadoop jar "$jar_path" "$main_class" -write -nrFiles 1000 -size "10B"
hadoop jar "$jar_path" "$main_class" -read -nrFiles 1000 -size "10B"
# ......
hadoop jar "$jar_path" "$main_class" -clean

# Huge-file case: a few files, 100 GB each
echo "Starting huge-file test!"
echo "-------------------------------------------------------------"
echo "Reading and writing 5 files of 100 GB each"
hadoop jar "$jar_path" "$main_class" -write -nrFiles 5 -size "100GB"
hadoop jar "$jar_path" "$main_class" -read -nrFiles 5 -size "100GB"
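The script repeats the same write, read, and clean sequence for each file size, so one possible variation is to drive it from a list of cases. This is only a sketch: `run`, `run_case`, `DRY_RUN`, and the `cases` list are hypothetical names, only the two cases visible in the script are listed, and it cleans after every case, which differs slightly from the original. With the default `DRY_RUN=1` it prints the commands instead of executing them, so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Sketch of a table-driven variant of the test script.
# DRY_RUN defaults to 1 so the commands are only printed; set DRY_RUN=0
# on a machine with the hadoop CLI to actually run them.
jar_path=hadoop-test-mr1.jar
main_class=TestDFSIO

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# run_case NR_FILES SIZE: write, read back, then clean up one test case.
run_case() {
  local nr_files="$1" size="$2"
  echo "Reading and writing $nr_files files of $size each"
  run hadoop jar "$jar_path" "$main_class" -write -nrFiles "$nr_files" -size "$size"
  run hadoop jar "$jar_path" "$main_class" -read -nrFiles "$nr_files" -size "$size"
  run hadoop jar "$jar_path" "$main_class" -clean
}

# Only the two cases visible in the original script; format is FILES:SIZE.
cases="1000:10B 5:100GB"
for c in $cases; do
  run_case "${c%%:*}" "${c##*:}"
done
```

Adding another file size then means appending one `FILES:SIZE` entry to `cases` rather than copying three more `hadoop jar` lines.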

Test results

Each test run appends its results to TestDFSIO_results.log in the current directory.

----- TestDFSIO ----- : write
           Date & time: Tue Apr 12 12:20:18 CST 2016
       Number of files: 1000
Total MBytes processed: 0.009536743
     Throughput mb/sec: 9.813281434897923E-5
Average IO rate mb/sec: 9.844686428550631E-5
 IO rate std deviation: 5.294680350263851E-6
    Test exec time sec: 184.055

----- TestDFSIO ----- : read
           Date & time: Tue Apr 12 12:23:37 CST 2016
       Number of files: 1000
Total MBytes processed: 0.009536743
     Throughput mb/sec: 0.0029361893978024937
Average IO rate mb/sec: 0.003687877906486392
 IO rate std deviation: 0.002046490931134166
    Test exec time sec: 184.024
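Because the log grows with every run, a small parser makes the blocks easier to compare. The sketch below assumes the log has one `key: value` field per line, as TestDFSIO writes it, and that the reported throughput is total data divided by I/O time summed over all tasks (my reading of TestDFSIO, so treat it as an assumption). `summarize_dfsio` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# summarize_dfsio LOGFILE
# Prints one line per TestDFSIO result block: test type, file count,
# I/O time summed over tasks (MB processed / throughput, an assumption
# about how TestDFSIO defines throughput), and job wall-clock time.
summarize_dfsio() {
  awk '
    /TestDFSIO ----- :/       { type = $NF }
    /Number of files:/        { files = $NF }
    /Total MBytes processed:/ { mb = $NF }
    /Throughput mb\/sec:/     { tput = $NF }
    /Test exec time sec:/     {
      io_time = (tput > 0) ? mb / tput : 0
      printf "%s files=%d io_time_s=%.2f exec_s=%s\n", type, files, io_time, $NF
    }
  ' "$1"
}

# Summarize the default log if it exists in the current directory.
log="${1:-TestDFSIO_results.log}"
if [ -f "$log" ]; then
  summarize_dfsio "$log"
fi
```

For the two result blocks in this post, this works out to roughly 97.18 s of summed write I/O time and 3.25 s of summed read I/O time across the 1000 files, against about 184 s of job time in each case. If the throughput assumption holds, that suggests per-file and job overhead, not data volume, dominates when the files are only 10 bytes.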