Background
After the Hadoop cluster starts, it has no live DataNodes. The symptom is that hdfs dfsadmin -report shows 0 for every value:
xf@master01:~/hadoop-2.6.5$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Troubleshooting and fix
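- Investigation: check whether safe mode is off
Some posts online suggest that this can happen when HDFS is in safe mode (safe mode is a NameNode state, not a DataNode one). So I ran hdfs dfsadmin -safemode get on slave01 to check it, but got the following error instead: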
safemode: Call From slave01/127.0.1.1 to master01:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
- Tracing the problem: the link in the error message, http://wiki.apache.org/hadoop/ConnectionRefused , contains this line:
Check that there isn’t an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
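In other words, check whether the hosts file maps your hostname to 127.0.0.1 or 127.0.1.1; Ubuntu is especially prone to this. When master01 resolves to a loopback address, the NameNode most likely listens only on that loopback address, so connections from other nodes to master01:9000 are refused.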
- Attempted fix: my system is Ubuntu. Opening /etc/hosts on master01 showed exactly this situation:
127.0.0.1 localhost
127.0.1.1 master01 --- delete this line
192.168.0.111 master01
192.168.0.112 master02
192.168.0.121 slave01
192.168.0.122 slave02
192.168.0.123 slave03
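The same check can be scripted so it is easy to repeat on every node. This is only a minimal sketch: the function name loopback_mappings and the SAMPLE text are made up for illustration, and it only reads hosts-file text, it does not modify anything.

```python
import ipaddress


def loopback_mappings(hosts_text, hostname):
    """Return the loopback IPs that hosts_text maps `hostname` to.

    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    hits = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # skip entries that are not plain IP addresses
        if addr.is_loopback:  # True for all of 127.0.0.0/8 and ::1
            hits.append(ip)
    return hits


# Sample text modeled on the hosts file shown above.
SAMPLE = """\
127.0.0.1 localhost
127.0.1.1 master01
192.168.0.111 master01
"""

print(loopback_mappings(SAMPLE, "master01"))  # ['127.0.1.1']
```

To check a real node, pass the contents of /etc/hosts and the node's own hostname (for example from socket.gethostname()); a non-empty result points at the line to delete.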
- Verifying the result
After deleting the 127.0.1.1 <hostname> line from the hosts file on every node in the cluster, I ran hdfs dfsadmin -safemode get on a slave node again and got:
Safe mode is OFF
This shows that safe mode is off and that the slave node can now reach the NameNode.
Finally, I ran hdfs dfsadmin -report on the master node to verify. The output is now normal: Live datanodes (3) instead of the earlier 0.
Configured Capacity: 25990606848 (24.21 GB)
Present Capacity: 16147132416 (15.04 GB)
DFS Remaining: 16147058688 (15.04 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.0.122:50010 (slave02)
Hostname: slave02
Decommission Status : Normal
Configured Capacity: 8663535616 (8.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3268976640 (3.04 GB)
DFS Remaining: 5394534400 (5.02 GB)
DFS Used%: 0.00%
DFS Remaining%: 62.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Sep 09 23:04:24 CST 2020
Name: 192.168.0.121:50010 (slave01)
Hostname: slave01
Decommission Status : Normal
Configured Capacity: 8663535616 (8.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3268980736 (3.04 GB)
DFS Remaining: 5394530304 (5.02 GB)
DFS Used%: 0.00%
DFS Remaining%: 62.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Sep 09 23:04:22 CST 2020
Name: 192.168.0.123:50010 (slave03)
Hostname: slave03
Decommission Status : Normal
Configured Capacity: 8663535616 (8.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3305517056 (3.08 GB)
DFS Remaining: 5357993984 (4.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 61.85%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Sep 09 23:04:23 CST 2020
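
For a repeatable check, the live DataNode count can be pulled out of the report text. The sketch below assumes the "Live datanodes (N):" line format seen in the Hadoop 2.6.5 output above; other versions may print it differently. The function name live_datanode_count is hypothetical.

```python
import re


def live_datanode_count(report_text):
    """Return N from a 'Live datanodes (N):' line, or 0 if no such line exists.

    Assumes the report format shown above (Hadoop 2.6.5).
    """
    match = re.search(r"^Live datanodes \((\d+)\):", report_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


# Excerpts modeled on the two reports in this post.
BROKEN = "Configured Capacity: 0 (0 B)\nMissing blocks: 0\n"
HEALTHY = "Missing blocks: 0\nLive datanodes (3):\nName: 192.168.0.122:50010 (slave02)\n"

print(live_datanode_count(BROKEN))   # 0
print(live_datanode_count(HEALTHY))  # 3
```

On a real cluster the text would come from the output of hdfs dfsadmin -report, for example captured with subprocess.run(["hdfs", "dfsadmin", "-report"], capture_output=True, text=True).stdout.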