0.22.x (in development)
0.90.2
NO
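As the chart above shows, HBase 0.90.2 is not compatible with the mainline Hadoop 0.20.0 release. It will run, but in a production environment it can lose data.
For example, the HBase web UI displays the following warning: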
You are currently running the HMaster without HDFS append support enabled. This may result in data loss. Please see the HBase wiki for details.
As of today, Hadoop 0.20.2 is the latest stable release of Apache Hadoop that is marked as ready for production (neither 0.21 nor 0.22 is).
Unfortunately, the Hadoop 0.20.2 release is not compatible with the latest stable version of HBase: if you run HBase on top of Hadoop 0.20.2, you risk losing data! Hence HBase users are required to build their own Hadoop 0.20.x version if they want to run HBase on a production Hadoop cluster. In this article, I describe how to build such a production-ready version of Hadoop 0.20.x that is compatible with HBase 0.90.2.
The official HBase 0.20.2 book says the same: This version of HBase will only run on Hadoop 0.20.x. It will not run on hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an HDFS that has a durable sync. Currently only the branch-0.20-append branch has this attribute [1]. No official releases have been made from this branch up to now so you will have to build your own Hadoop from the tip of this branch. Check it out using this url, branch-0.20-append. Scroll down in the Hadoop How To Release to the section Build Requirements for instruction on how to build Hadoop. Or rather than build your own, you could use Cloudera's CDH3. CDH has the 0.20-append patches needed to add a durable sync (CDH3 betas will suffice; b2, b3, or b4).
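The web UI warning and the durable-sync requirement both come down to whether HDFS append is switched on. As a rough sketch, assuming the property is named `dfs.support.append` and is set in an `hdfs-site.xml` that uses the usual `<name>`/`<value>` layout (the helper name `has_append_enabled` is made up for illustration), a quick check might look like this:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Hadoop/HBase XML config file
# sets dfs.support.append to true. Assumes <value> directly follows <name>.
has_append_enabled() {
    conf_file="$1"
    # Exit status 2 signals that the file itself is missing.
    [ -f "$conf_file" ] || return 2
    # Join the file into one line so <name> and <value> can be matched together.
    tr -d '\n' < "$conf_file" |
        grep -Eq '<name>[[:space:]]*dfs\.support\.append[[:space:]]*</name>[[:space:]]*<value>[[:space:]]*true[[:space:]]*</value>'
}

# Usage: sh check_append.sh /path/to/hdfs-site.xml
if [ "$#" -gt 0 ]; then
    if has_append_enabled "$1"; then
        echo "append enabled"
    else
        echo "append NOT enabled"
    fi
fi
```

The flag by itself is not enough: it only helps on a Hadoop build that actually contains the append patches, which is why the branch still has to be built.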
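This article therefore describes how to build Hadoop's append branch and integrate it into the mainline Hadoop release.
First, install git (a version control tool similar to svn).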
$ apt-get install git
Use git to fetch the source code and create a local repository. The download takes quite a while.
$ git clone git://git.apache.org/hadoop-common.git
Change into the repository and check out the branch-0.20-append branch:
$ cd hadoop-common
$ git checkout -t origin/branch-0.20-append
Branch branch-0.20-append set up to track remote branch branch-0.20-append from origin.
Switched to a new branch 'branch-0.20-append'
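Before building, it can be worth confirming that the working tree really is on the append branch. The following is a minimal sketch, assuming a git version that supports the `-C` option. The function names `current_branch` and `on_branch` are hypothetical helpers, not part of git or Hadoop:

```shell
#!/bin/sh
# Print the name of the branch currently checked out in repository $1.
current_branch() {
    git -C "$1" rev-parse --abbrev-ref HEAD
}

# Succeed only if repository $1 is on branch $2.
on_branch() {
    [ "$(current_branch "$1")" = "$2" ]
}

# Usage (run from the directory that contains the clone):
#   on_branch hadoop-common branch-0.20-append && echo "ready to build"
```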