
Hadoop cluster deployment error, help needed! 400 error

I'd like to ask for some advice. This is a fully distributed demo setup with hadoop-0.20.2. The systems are AIX 5.3 and HP-UX rx4640: 172.168.1.240 (AIX, datanode) and 172.168.1.243 (HP-UX, namenode).

<!--core-site.xml -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://172.168.1.243:10000</value>
</property>
</configuration>

<!--hdfs-site.xml -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/public/interf/hadoop/data/dfs.name.dir</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/public/interf/hadoop/data/dfs.data.dir</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>/public/interf/hadoop/data/fs.checkpoint.dir</value>
</property>
</configuration>

<!--mapred-site.xml-->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>172.168.1.243:10005</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/public/interf/hadoop/mapred/mapred.system.dir</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/public/interf/hadoop/mapred/mapred.local.dir</value>
</property>
</configuration>

slaves:172.168.1.240
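When comparing settings between the two machines, it can help to dump the effective properties from each `*-site.xml` file. The following is a minimal sketch, not part of the original post: `parse_hadoop_conf` is a hypothetical helper name, and the sample text is the core-site.xml shown above.

```python
import xml.etree.ElementTree as ET


def parse_hadoop_conf(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Sample input: the core-site.xml from this post.
CORE_SITE = """<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://172.168.1.243:10000</value>
</property>
</configuration>"""

conf = parse_hadoop_conf(CORE_SITE)
print(conf["fs.default.name"])  # prints hdfs://172.168.1.243:10000
```

Running the same helper on the files from both 240 and 243 and comparing the dictionaries is one way to spot a setting that differs between the nodes.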

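Current situation: I have no root access. SSH from 243 to 240 is set up, start-dfs.sh starts normally, the page at 243:50070 shows Live Nodes: 1, and 240:50075 displays normally, but running start-mapred.sh produces errors.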
jobtracker.log:
2012-03-08 07:49:06,955 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /public/interf/hadoop/mapred/mapred.system.dir/ jobtracker.info could only be replicated to 0 nodes, instead of 1

2012-03-08 07:49:06,956 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
2012-03-08 07:49:06,956 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/public/interf/hadoop/mapred/mapred.system.dir/jobtracker.info" - Aborting...
2012-03-08 07:49:06,957 WARN org.apache.hadoop.mapred.JobTracker: Writing to file hdfs://rx4640:10000/public/interf/hadoop/mapred/mapred.system.dir/jobtracker.info failed!
2012-03-08 07:49:06,957 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready yet!
2012-03-08 07:49:06,967 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager. 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /public/interf/hadoop/mapred/mapred.system.dir/ jobtracker.info could only be replicated to 0 nodes, instead of 1

2012-03-08 07:49:16,982 WARN org.apache.hadoop.mapred.JobTracker: Retrying...
2012-03-08 07:49:17,019 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /public/interf/hadoop/mapred/mapred.system.dir/ jobtracker.info could only be replicated to 0 nodes, instead of 1

namenode.log:
2012-03-08 07:49:06,949 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2012-03-08 07:49:06,951 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 10000, call addBlock(/public/interf/hadoop/mapred/mapred.system.dir/jobtracker.info, DFSClient_-739369049) from 172.168.1.243:52154: error: java.io.IOException: File /public/interf/hadoop/mapred/mapred.system.dir/ jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /public/interf/hadoop/mapred/mapred.system.dir/ jobtracker.info could only be replicated to 0 nodes, instead of 1

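I looked up some material online, re-ran the format, and changed the directory permissions, but the problem persists. 240 starts without reporting errors, but 243 keeps logging errors. After 243 is shut down, a connection failure is reported: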

2012-03-08 10:21:50,974 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to /172.168.1.243:10005 failed on local exception: java.io.IOException: A connection with a remote socket was reset by that socket.
Caused by: java.io.IOException: A connection with a remote socket was reset by that socket.

Posts online say this may be a firewall or /etc/hosts problem, but start-dfs.sh starts normally.
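Without root access, a firewall can still be probed from user space by attempting plain TCP connections to the ports involved. Below is a minimal sketch under that assumption; `can_connect` is a hypothetical helper, and ports 10000 and 10005 are the ones configured in this post.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, resets, timeouts, and name-resolution failures.
        return False
```

For example, calling `can_connect("172.168.1.243", 10000)` and `can_connect("172.168.1.243", 10005)` on the 240 machine would show whether the namenode and jobtracker ports are reachable from there. A successful result only shows that the port accepts connections at that moment; it does not rule out other network problems.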
243:50070/

Cluster Summary

6 files and directories, 0 blocks = 6 total. Heap Size is 17 MB / 888.94 MB (1%) 

Configured Capacity : 180 GB
DFS Used : 8 KB
Non DFS Used : 180 GB
DFS Remaining : 67 KB
DFS Used% : 0 %
DFS Remaining% : 0 %
Live Nodes : 1
Dead Nodes : 0

Live Datanodes : 1

Node | Last Contact | Admin State | Configured Capacity (GB) | Used (GB) | Non DFS Used (GB) | Remaining (GB) | Used (%) | Used (%) | Remaining (%) | Blocks
ltbss1   In Service   180   180
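One detail in the summary above may be worth checking: DFS Remaining is 67 KB while Non DFS Used is the full 180 GB, so the datanode appears to have almost no space left for HDFS blocks. A namenode that finds no datanode with enough room typically reports "could only be replicated to 0 nodes". This is only a possible explanation and is not confirmed anywhere in the thread. The sketch below does the arithmetic; the 64 MB block size is the 0.20 default, and `min_blocks=5` is an assumption about how much headroom the namenode wants before it picks a node.

```python
# Binary size units as used by the namenode web page.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def to_bytes(text):
    """Convert a size such as '67 KB' to a number of bytes."""
    number, unit = text.split()
    return int(float(number) * UNITS[unit.upper()])


def has_room_for_block(remaining, block_size="64 MB", min_blocks=5):
    """Rough check (assumed rule): is there space for min_blocks blocks of block_size?"""
    return to_bytes(remaining) >= min_blocks * to_bytes(block_size)


print(has_room_for_block("67 KB"))  # prints False
```

If this is the cause, freeing space on the volume that holds dfs.data.dir, or pointing dfs.data.dir at a volume with free space, would be the thing to try.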


爱吃鱼的程序员 2020-06-03 13:11:43 673 0
1 answer
  • https://developer.aliyun.com/profile/5yerqm5bn5yqg?spm=a2c6h.12873639.0.0.6eae304abcjaIB

    Frustrating. I wanted to try a different version, but the JDK turned out to be incompatible...

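    I switched to a different machine and the problem was solved. The TestDFSIO test in hadoop-0.20.2-test.jar and the sort test in hadoop-0.20.2-examples.jar both passed. Along the way I ran into two more problems.

    Host name could not be resolved: edit /etc/hosts (requires root access)

    Name node is in safe mode: hadoop dfsadmin -safemode leave

    It is finally working. I suspect a firewall problem on the 240 machine, but I don't have the permissions to check, which is frustrating.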
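The host name problem can be confirmed without root access before asking an administrator to edit /etc/hosts. A minimal sketch follows; `resolve` is a hypothetical helper, and the host names to test would be the ones that appear in the logs and web page of this thread, such as rx4640 and ltbss1.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address that hostname resolves to, or None if resolution fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None
```

Running `resolve("rx4640")` on the datanode and `resolve("ltbss1")` on the namenode, and comparing the results with the expected 172.168.1.243 and 172.168.1.240, would show whether each node can find the other by name.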

    2020-06-03 13:45:16