
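[Experience Sharing] Hadoop is a software framework from Apache (that is, an open-source parallel computing programming tool and distributed file system, similar in concept to MapReduce and the Google File System)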


Posted on 2016-12-13 07:15:07
  Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license.[1] It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.
  Hadoop is a top-level Apache project being built and used by a global community of contributors,[2] using the Java programming language. Yahoo! has been the largest contributor[3] to the project, and uses Hadoop extensively across its businesses.[4]
  Hadoop was created by Doug Cutting,[5] who named it after his son's toy elephant.[6] It was originally developed to support distribution for the Nutch search engine project.[7]
Contents

  • 1 Architecture

    • 1.1 Filesystems

      • 1.1.1 Hadoop Distributed File System
      • 1.1.2 Other Filesystems

    • 1.2 Job Tracker and Task Tracker: the MapReduce engine

      • 1.2.1 Scheduling

        • 1.2.1.1 Fair scheduler
        • 1.2.1.2 Capacity scheduler

    • 1.3 Other applications

  • 2 Prominent users

    • 2.1 Yahoo!
    • 2.2 Other users

  • 3 Hadoop on Amazon EC2/S3 services
  • 4 Hadoop at Google and IBM
  • 5 Running Hadoop in compute farm environments

    • 5.1 Grid Engine Integration
    • 5.2 Condor Integration

  • 6 Commercially supported Hadoop-related products
  • 7 See also
  • 8 References
  • 9 Bibliography
  • 10 External links
  http://hadoop.apache.org/

  • What Is Apache Hadoop?
  • Who Uses Hadoop?
  • News

    • March 2011 - Apache Hadoop takes top prize at Media Guardian Innovation Awards
    • January 2011 - ZooKeeper Graduates
    • September 2010 - Hive and Pig Graduate
    • May 2010 - Avro and HBase Graduate
    • July 2009 - New Hadoop Subprojects
    • March 2009 - ApacheCon EU
    • November 2008 - ApacheCon US
    • July 2008 - Hadoop Wins Terabyte Sort Benchmark


What Is Apache Hadoop?
  The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.
  The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
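  The idea of handling failures at the application layer can be illustrated with a small sketch. This is not Hadoop code and does not use any Hadoop API; the names `run_with_retries` and `make_flaky_task` and the node labels are hypothetical, chosen only for illustration. It assumes a failed attempt surfaces as a `RuntimeError` and that the work can safely be re-run on another machine.

```python
def run_with_retries(task, nodes, max_attempts=3):
    """Try task(node) on successive nodes until one attempt succeeds.

    Hypothetical helper: software, not hardware, notices the failure
    and re-runs the work elsewhere.
    """
    errors = []
    for node in list(nodes)[:max_attempts]:
        try:
            return node, task(node)
        except RuntimeError as exc:
            # Record the failure and move on to the next candidate node.
            errors.append((node, str(exc)))
    raise RuntimeError(f"task failed on all attempted nodes: {errors}")


def make_flaky_task(bad_nodes):
    """Build a toy task that fails on the given (simulated) bad nodes."""
    def task(node):
        if node in bad_nodes:
            raise RuntimeError(f"{node} is unreachable")
        return f"processed on {node}"
    return task


if __name__ == "__main__":
    task = make_flaky_task(bad_nodes={"node-1"})
    print(run_with_retries(task, ["node-1", "node-2", "node-3"]))
```

  In this toy version the caller only sees the successful result; the failed attempt on the first node is absorbed by the retry loop, which is the behaviour the paragraph above attributes to the library.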
  The project includes these subprojects:

  • Hadoop Common: The common utilities that support the other Hadoop subprojects.
  • Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
  • Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.
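
  To make the MapReduce programming model more concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use the Hadoop API and runs on one machine; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions made for this example. In a real cluster the map and reduce calls would run on many nodes and the grouping step would move data between them.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over the input records."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

  Because each map call sees only one record and each reduce call sees only one key, the work can be split across machines without the functions needing to know about each other, which is what makes the model simple to distribute.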
  Other Hadoop-related projects at Apache include:

  • Avro™: A data serialization system.
  • Cassandra™: A scalable multi-master database with no single points of failure.
  • Chukwa™: A data collection system for managing large distributed systems.
  • HBase™: A scalable, distributed database that supports structured data storage for large tables.
  • Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
  • Mahout™: A scalable machine learning and data mining library.
  • Pig™: A high-level data-flow language and execution framework for parallel computation.
  • ZooKeeper™: A high-performance coordination service for distributed applications.

Who Uses Hadoop?
  A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page.

http://www.oschina.net/p/hadoop
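  Hadoop is not merely a distributed file system for storage; it is a framework designed to run distributed applications on large clusters built from commodity computing hardware.
  The figure below shows the Hadoop architecture:
[Figure: DSC0000.gif, Hadoop architecture diagram]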
