Posted by x625802392 on 2016-12-7 11:13:38

Hadoop: displaying the contents of a file in HDFS

1. Exception in thread "main" java.net.ConnectException: Call to master/192.168.1.101:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
        at org.apache.hadoop.ipc.Client.call(Client.java:1118)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
Cause: HDFS is not running.
To start HDFS, go to the bin directory and run start-dfs.sh.
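Before rerunning the read, you can also confirm the NameNode is reachable from Java. A minimal sketch, assuming fs.default.name in core-site.xml points at hdfs://192.168.1.101:9000; the class name ProbeHdfs is made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ProbeHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // assumption: the same NameNode address as in the error above
    conf.set("fs.default.name", "hdfs://192.168.1.101:9000");
    // FileSystem.get throws a wrapped ConnectException if the NameNode is down
    FileSystem fs = FileSystem.get(conf);
    System.out.println("HDFS root exists: " + fs.exists(new Path("/")));
    fs.close();
  }
}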
2. Rerun the command to read the file; it now fails with a different error:
# ../bin/hadoop URLCat hdfs://192.168.1.101:9000/root/in/test1.txt
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /root/in/test1.txt
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
        at org.apache.hadoop.fs.FsUrlConnection.connect(FsUrlConnection.java:46)
        at org.apache.hadoop.fs.FsUrlConnection.getInputStream(FsUrlConnection.java:56)
        at java.net.URL.openStream(URL.java:1037)
        at URLCat.main(URLCat.java:15)
Yet the file test1.txt does exist, on the local filesystem.
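For context, URLCat is presumably the classic example from Hadoop: The Definitive Guide, which registers Hadoop's URL stream handler so that java.net.URL can open hdfs:// URLs (the stack trace above goes through FsUrlConnection and URL.openStream, which matches it). The poster's actual source is not shown, so treat this as a representative sketch:

import java.io.InputStream;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {
  static {
    // setURLStreamHandlerFactory may only be called once per JVM,
    // so it lives in a static initializer
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
  }

  public static void main(String[] args) throws Exception {
    InputStream in = null;
    try {
      in = new URL(args[0]).openStream(); // e.g. hdfs://192.168.1.101:9000/root/in/test1.txt
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}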
 
Cause: the local file was never put into the HDFS filesystem. The hdfs:// URL refers to a path in HDFS, not on the local disk, so the local files must first be uploaded to HDFS with a hadoop command.
(1). Create the local files; the current directory is /usr/local/hadoop1.2.1:
 mkdir input  // create the input directory
 cd input     // change into the input directory
 echo "hello world ">test1.txt  // the string must not contain an exclamation mark (!); inside double quotes bash treats it as history expansion
 echo "hello hadoop">test2.txt
(2). With the files created, upload them to HDFS with the hadoop command; the current directory is /usr/local/hadoop1.2.1:
 bin/hadoop fs -put input  ./in  // put the local input directory onto HDFS as in
 bin/hadoop fs -ls     // list the HDFS home directory to check
Output:
  Found 1 items
  drwxr-xr-x   - root supergroup          0 2015-05-17 01:11 /user/root/in
 bin/hadoop fs -cat ./in/test1.txt  // view the file contents
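The same upload can also be done programmatically with the FileSystem API instead of bin/hadoop fs -put. A minimal sketch; the class name PutToHdfs and the local path are chosen for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://192.168.1.101:9000"); // assumption, as above
    FileSystem fs = FileSystem.get(conf);
    // equivalent of: bin/hadoop fs -put input ./in
    // a relative target path resolves against the HDFS home directory (/user/root)
    fs.copyFromLocalFile(new Path("/usr/local/hadoop1.2.1/input"),
                         new Path("in"));
    fs.close();
  }
}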
(3). Use the example jar that ships with Hadoop to read the files just uploaded and count the words (a Java sketch of such a job follows the final output below):
 bin/hadoop jar hadoop-examples-1.2.1.jar  wordcount in out  // count the words in the files under the in directory
Output:
  15/05/17 01:13:40 INFO input.FileInputFormat: Total input paths to process : 2
  15/05/17 01:13:40 INFO util.NativeCodeLoader: Loaded the native-hadoop library
  15/05/17 01:13:40 WARN snappy.LoadSnappy: Snappy native library not loaded
  15/05/17 01:13:41 INFO mapred.JobClient: Running job: job_201505170049_0001
  15/05/17 01:13:42 INFO mapred.JobClient:  map 0% reduce 0%
  15/05/17 01:13:52 INFO mapred.JobClient:  map 100% reduce 0%
  15/05/17 01:14:28 INFO mapred.JobClient:  map 100% reduce 100%
  15/05/17 01:14:31 INFO mapred.JobClient: Job complete: job_201505170049_0001
  15/05/17 01:14:31 INFO mapred.JobClient: Counters: 29
  15/05/17 01:14:31 INFO mapred.JobClient:   Job Counters 
  15/05/17 01:14:31 INFO mapred.JobClient:     Launched reduce tasks=1
  15/05/17 01:14:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=14618
  15/05/17 01:14:31 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
  15/05/17 01:14:31 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
  15/05/17 01:14:31 INFO mapred.JobClient:     Launched map tasks=2
  15/05/17 01:14:31 INFO mapred.JobClient:     Data-local map tasks=2
  15/05/17 01:14:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=27918
  15/05/17 01:14:31 INFO mapred.JobClient:   File Output Format Counters 
  15/05/17 01:14:31 INFO mapred.JobClient:     Bytes Written=25
  15/05/17 01:14:31 INFO mapred.JobClient:   FileSystemCounters
  15/05/17 01:14:31 INFO mapred.JobClient:     FILE_BYTES_READ=55
  15/05/17 01:14:31 INFO mapred.JobClient:     HDFS_BYTES_READ=241
  15/05/17 01:14:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=173312
  15/05/17 01:14:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
  15/05/17 01:14:31 INFO mapred.JobClient:   File Input Format Counters 
  15/05/17 01:14:31 INFO mapred.JobClient:     Bytes Read=25
  15/05/17 01:14:31 INFO mapred.JobClient:   Map-Reduce Framework
  15/05/17 01:14:31 INFO mapred.JobClient:     Map output materialized bytes=61
  15/05/17 01:14:31 INFO mapred.JobClient:     Map input records=2
  15/05/17 01:14:31 INFO mapred.JobClient:     Reduce shuffle bytes=61
  15/05/17 01:14:31 INFO mapred.JobClient:     Spilled Records=8
  15/05/17 01:14:31 INFO mapred.JobClient:     Map output bytes=41
  15/05/17 01:14:31 INFO mapred.JobClient:     Total committed heap usage (bytes)=415969280
  15/05/17 01:14:31 INFO mapred.JobClient:     CPU time spent (ms)=1550
  15/05/17 01:14:31 INFO mapred.JobClient:     Combine input records=4
  15/05/17 01:14:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=216
  15/05/17 01:14:31 INFO mapred.JobClient:     Reduce input records=4
  15/05/17 01:14:31 INFO mapred.JobClient:     Reduce input groups=3
  15/05/17 01:14:31 INFO mapred.JobClient:     Combine output records=4
  15/05/17 01:14:31 INFO mapred.JobClient:     Physical memory (bytes) snapshot=318287872
  15/05/17 01:14:31 INFO mapred.JobClient:     Reduce output records=3
  15/05/17 01:14:31 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1041612800
  15/05/17 01:14:31 INFO mapred.JobClient:     Map output records=4
  # bin/hadoop fs -ls
  Found 2 items
  drwxr-xr-x   - root supergroup          0 2015-05-17 01:11 /user/root/in
  drwxr-xr-x   - root supergroup          0 2015-05-17 01:14 /user/root/out
  # bin/hadoop fs -ls ./out
  Found 3 items
  -rw-r--r--   1 root supergroup          0 2015-05-17 01:14 /user/root/out/_SUCCESS
  drwxr-xr-x   - root supergroup          0 2015-05-17 01:13 /user/root/out/_logs
  -rw-r--r--   1 root supergroup         25 2015-05-17 01:14 /user/root/out/part-r-00000
  # bin/hadoop fs -cat ./out/part-r-00000
  hadoop  1
  hello   2
  world   1
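For reference, the wordcount example invoked above is essentially the canonical MapReduce word count. The following is a sketch against the Hadoop 1.x org.apache.hadoop.mapreduce API, not necessarily the exact source shipped in hadoop-examples-1.2.1.jar:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Mapper: emits (word, 1) for every whitespace-separated token
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count"); // Hadoop 1.x constructor
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. in
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. out (must not already exist)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The Combine input records and Combine output records counters in the job output above come from the combiner stage configured via setCombinerClass.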
 
 
