Incompatible changes and deprecated APIs in 0.10.0
Gerrit #3737 The Java client has been repackaged under org.apache.kudu instead of org.kududb. Import statements for Kudu classes must be modified in order to compile against 0.10.0.
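In practice only the package prefix changes; for example (a minimal sketch, shown in Scala, for the synchronous client class):

    // Before 0.10.0:
    // import org.kududb.client.KuduClient
    // From 0.10.0 onward:
    import org.apache.kudu.client.KuduClient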
Gerrit #3055 The Java client’s synchronous API methods now throw KuduException instead of Exception. Existing code that catches Exception should still compile, but code that introspects an exception’s message may be affected. This change was made so that thrown exceptions can be inspected more easily by calling KuduException.getStatus and then one of Status’s methods. For example, an operation that tries to delete a table that doesn’t exist now produces a Status for which isNotFound() returns true.
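A minimal sketch of the new error-handling pattern (Scala against the Java client; the master address and table name are placeholders):

    import org.apache.kudu.client.{KuduClient, KuduException}

    val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()
    try {
      // Deleting a table that does not exist now surfaces a typed KuduException.
      client.deleteTable("no_such_table")
    } catch {
      case e: KuduException if e.getStatus.isNotFound =>
        // The status carries a structured error code; no message parsing is needed.
        println(s"Table was already gone: ${e.getStatus}")
    } finally {
      client.shutdown()
    }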
The Java client’s KuduTable.getTabletsLocations family of methods is now deprecated. Additionally, these methods now take an exclusive end partition key instead of an inclusive key. Applications are encouraged to use the scan tokens API instead.
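As a sketch of the recommended replacement, scan tokens can be built centrally, serialized, and rehydrated into scanners wherever the work runs (Scala; the master address and table name are placeholders):

    import scala.collection.JavaConverters._
    import org.apache.kudu.client.{KuduClient, KuduScanToken}

    val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()
    val table = client.openTable("my_table")

    // One token per scan unit; each token can be shipped to a remote worker as bytes.
    val tokens = client.newScanTokenBuilder(table).build().asScala
    for (token <- tokens) {
      val scanner = KuduScanToken.deserializeIntoScanner(token.serialize(), client)
      while (scanner.hasMoreRows) {
        val rows = scanner.nextRows()
        // ... process rows ...
      }
    }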
The C++ API for specifying split points on range-partitioned tables has been improved to make it easier for callers to properly manage the ownership of the provided rows.
The TableCreator::split_rows API took a vector, which made it very difficult for the calling application to do proper error handling with cleanup when setting the fields of the KuduPartialRow. This API has now been deprecated and replaced by a new method, TableCreator::add_range_split, which allows easier use of smart pointers for safe memory management.
The Java client’s internal buffering has been reworked. Previously, the number of buffered write operations was constrained on a per-tablet-server basis. Now, the configured maximum buffer size limits the total number of buffered operations across all tablet servers.
This change can negatively affect the write performance of Java clients which rely on buffered writes. Consider using the setMutationBufferSpace API to increase a session’s maximum buffer size if write performance seems degraded after upgrading.
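A minimal sketch of raising the per-session limit (Scala; the buffer size shown is an arbitrary example, not a recommendation):

    import org.apache.kudu.client.{KuduClient, SessionConfiguration}

    val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()
    val session = client.newSession()
    session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND)
    // The limit now applies across all tablet servers, so size it for the whole session.
    session.setMutationBufferSpace(10000)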
The "remote bootstrap" process used to copy a tablet replica from one host to another has been renamed to "Tablet Copy". This resulted in the renaming of several RPC metrics. Any users previously explicitly fetching or monitoring metrics>
The SparkSQL datasource for Kudu no longer supports the Overwrite mode. Users should use the new KuduContext.upsertRows method instead. Additionally, inserts using the datasource are now upserts by default. The older behavior can be restored by setting the operation parameter to insert.
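A workload that previously relied on Overwrite mode might instead upsert through the context, roughly like this (Scala; the master address, table name, and DataFrame are placeholders):

    import org.apache.spark.sql.DataFrame
    import org.apache.kudu.spark.kudu.KuduContext

    def writeToKudu(df: DataFrame): Unit = {
      val kuduContext = new KuduContext("kudu-master:7051")
      // Existing rows (matched by primary key) are updated; new rows are inserted.
      kuduContext.upsertRows(df, "my_table")
    }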
New features
Users may now manually manage the partitioning of a range-partitioned table. When a table is created, the user may specify a set of range partitions that do not cover the entire available key space. Range partitions may also be added to or dropped from existing tables (see the sketch below).
This feature can be particularly helpful with time series workloads in which new partitions can be created on an hourly or daily basis. Old partitions may be efficiently dropped if the application does not need to retain historical data past a certain point.
This feature is considered experimental for the 0.10 release.
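A sketch of how this might look through the Java client API from Scala (the schema, table name, and time ranges are illustrative placeholders):

    import scala.collection.JavaConverters._
    import org.apache.kudu.{Schema, Type}
    import org.apache.kudu.ColumnSchema.ColumnSchemaBuilder
    import org.apache.kudu.client.{AlterTableOptions, CreateTableOptions, KuduClient}

    val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()

    // A single int64 key column "ts", range-partitioned by time.
    val schema = new Schema(List(
      new ColumnSchemaBuilder("ts", Type.INT64).key(true).build()
    ).asJava)

    // Create the table covering only the range [0, 100); the rest of the
    // key space is intentionally left without a partition.
    val create = new CreateTableOptions().setRangePartitionColumns(List("ts").asJava)
    val lower = schema.newPartialRow(); lower.addLong("ts", 0L)
    val upper = schema.newPartialRow(); upper.addLong("ts", 100L)
    create.addRangePartition(lower, upper)
    client.createTable("metrics", schema, create)

    // Later: add a partition for the next time range and drop the oldest one.
    val nextLower = schema.newPartialRow(); nextLower.addLong("ts", 100L)
    val nextUpper = schema.newPartialRow(); nextUpper.addLong("ts", 200L)
    client.alterTable("metrics", new AlterTableOptions().addRangePartition(nextLower, nextUpper))

    val oldLower = schema.newPartialRow(); oldLower.addLong("ts", 0L)
    val oldUpper = schema.newPartialRow(); oldUpper.addLong("ts", 100L)
    client.alterTable("metrics", new AlterTableOptions().dropRangePartition(oldLower, oldUpper))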
Support for running Kudu clusters with multiple masters has been stabilized. Users may start a cluster with three or five masters to provide fault tolerance despite a failure of one or two masters, respectively.
Note that certain tools (e.g. ksck) still lack complete support for multiple masters. These deficiencies will be addressed in a following release.
Kudu now supports the ability to reserve a certain amount of free disk space in each of its configured data directories. If a directory’s free disk space drops to less than the configured minimum, Kudu will stop writing to that directory until space becomes available. If no space is available in any configured directory, Kudu will abort.
This feature may be configured using the fs_data_dirs_reserved_bytes and fs_wal_dir_reserved_bytes flags.
The Spark integration’s KuduContext now supports four new methods for writing to Kudu tables: insertRows, upsertRows, updateRows, and deleteRows. These are now the preferred way to write to Kudu tables from Spark.
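A rough sketch of these methods in use (Scala; the master address, table name, and DataFrames are placeholders):

    import org.apache.spark.sql.DataFrame
    import org.apache.kudu.spark.kudu.KuduContext

    def syncWithKudu(newRows: DataFrame, changedRows: DataFrame, deletedKeys: DataFrame): Unit = {
      val kuduContext = new KuduContext("kudu-master:7051")
      kuduContext.insertRows(newRows, "my_table")     // plain inserts; duplicates fail
      kuduContext.updateRows(changedRows, "my_table") // updates to existing rows
      kuduContext.deleteRows(deletedKeys, "my_table") // deletes rows matched by primary key
    }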