Bootstrap

Common problems when installing Ambari

Hortonworks problem summary:

https://community.hortonworks.com/questions/118453/ambari-not-using-available-repos.html?page=2&pageSize=10&sort=oldest


RPM download location (Baidu pan): https://pan.baidu.com/s/1a154fMIAiPqsu9X64vV5tA (password: 6t3s)

Summary of Ambari installation problems
1. Missing package libtirpc-devel-0.2.4-0.6.el7.x86_64.rpm
  • Problem description:
Environment: HDP 2.6, Red Hat 7.2 / Oracle Linux 7.3
Installing package hadoop_2_6_0_3_8-hdfs ('/usr/bin/yum -d 0 -e 0 -y install hadoop_2_6_0_3_8-hdfs')
2017-05-26 17:07:30,977 - Execution of '/usr/bin/yum -d 0 -e 0 -y install hadoop_2_6_0_3_8-hdfs' returned 1.
Error: Package: hadoop_2_6_0_3_8-hdfs-2.7.3.2.6.0.3-8.x86_64 (HDP-2.6)
       Requires: libtirpc-devel
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
2017-05-26 17:07:30,977 - Failed to install package hadoop_2_6_0_3_8-hdfs. Executing '/usr/bin/yum clean metadata'
2017-05-26 17:07:31,544 - Retrying to install package hadoop_2_6_0_3_8-hdfs after 30 seconds
  • Solution:
For Red Hat 7.2: yum install libtirpc-devel-0.2.4-0.6.el7.x86_64.rpm -y
For Oracle Linux 7.3: yum install libtirpc-devel-0.2.4-0.6.el7.i686.rpm -y
For CentOS 6.5 / Ambari 2.5.1.0: yum install libtirpc-0.2.1-13.el6.x86_64.rpm libtirpc-devel-0.2.1-13.el6.x86_64.rpm -y
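These RPMs are usually not present in the HDP repos, so download them first (for example from the link above) and install from the local files. A minimal sketch, assuming the RPMs sit in the current directory:

# assumption: the libtirpc RPMs have already been downloaded to the current directory
yum localinstall -y libtirpc-devel-0.2.4-0.*.rpm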
2. Clock synchronization problem
  • Problem description: the Ambari host checks warn that the ntpd service is not running, so host clocks can drift apart.
  • Solution:
[root@ambari02 ~]$ sudo service ntpd start
Starting ntpd: [ OK ]
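To keep the fix across reboots and confirm the daemon is actually syncing, something like the following should work (a sketch; chkconfig applies to EL6-style init, systemctl to EL7):

chkconfig ntpd on     # EL6; on EL7 use: systemctl enable ntpd
ntpq -p               # peers list; a '*' marks the peer currently being synced against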
3. snappy version too new
  • Problem description: environment HDP 2.6, Red Hat 7.2
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 177, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 314, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 52, in install
    self.install_packages(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 605, in install_packages
    retry_count=agent_stack_retry_count)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 51, in install_package
    self.checked_call_with_retries(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 86, in checked_call_with_retries
    return self._call_with_retries(cmd, is_checked=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 98, in _call_with_retries
    code, out = func(cmd, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/yum -d 0 -e 0 -y install snappy-devel' returned 1.
Error: Package: snappy-devel-1.0.5-1.el6.x86_64 (HDP-UTILS-1.1.0.21)
       Requires: snappy(x86-64) = 1.0.5-1.el6
       Installed: snappy-1.1.0-3.el7.x86_64 (@anaconda/7.2)
           snappy(x86-64) = 1.1.0-3.el7
       Available: snappy-1.0.5-1.el6.x86_64 (HDP-UTILS-1.1.0.21)
           snappy(x86-64) = 1.0.5-1.el6
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
  • Solution:
# rpm -qa | grep snappy
# rpm -e snappy-1.1.0-3.el7.x86_64 --nodeps
# yum install snappy-1.0.5-1.el6.x86_64 -y

Alternatively, for the libtirpc-devel dependency from problem 1, the downloaded RPMs can be installed directly:

# rpm -ivh libtirpc-devel-0.2.4-0.* --nodeps
warning: libtirpc-devel-0.2.4-0.10.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
 Preparing...                          ################################# [100%]
Updating / installing...
   1:libtirpc-devel-0.2.4-0.8.el7     ################################# [ 50%]
   2:libtirpc-devel-0.2.4-0.10.el7    ################################# [100%]

snappy download location:

http://rpmfind.net/linux/rpm2html/search.php?query=snappy
On every host, remove the newer snappy and install the older one:
rpm -e snappy-1.1.0-3.el7.x86_64
yum install snappy-1.0.5-1.el6.x86_64 -y
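Doing this node by node is tedious; here is a hedged sketch for pushing the downgrade over SSH (hosts.txt with one FQDN per line and passwordless root SSH are assumptions, not part of the original notes):

# assumptions: hosts.txt lists every cluster node; root SSH keys are in place
while read host; do
  ssh "root@${host}" "rpm -e snappy-1.1.0-3.el7.x86_64 --nodeps; yum install -y snappy-1.0.5-1.el6.x86_64"
done < hosts.txt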
4. /usr/hdp/current/hadoop-client/conf doesn't exist; this mostly shows up when retrying after a failed install
  • Problem description
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_client.py", line 123, in <module>
    HdfsClient().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 314, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_client.py", line 39, in install
    self.configure(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 117, in locking_configure
    original_configure(obj, *args, **kw)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_client.py", line 44, in configure
    hdfs()
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs.py", line 61, in hdfs
    group=params.user_group
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/xml_config.py", line 66, in action_create
    encoding = self.resource.encoding
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 120, in action_create
    raise Fail("Applying %s failed, parent directory %s doesn't exist" % (self.resource, dirname))
resource_management.core.exceptions.Fail: Applying File['/usr/hdp/current/hadoop-client/conf/hadoop-policy.xml'] failed, parent directory /usr/hdp/current/hadoop-client/conf doesn't exist
  • Solution:
The conf directory under /etc/hadoop/ is missing (/usr/hdp/current/hadoop-client/conf is a symlink into it). Copy one over from a healthy host:
scp -r <good machine>:/etc/hadoop/* /etc/hadoop
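Before copying, it can help to see where the broken path actually points; a quick sketch (readlink -f resolves the whole symlink chain; standard HDP layout assumed):

readlink -f /usr/hdp/current/hadoop-client/conf    # normally resolves under /etc/hadoop
ls /usr/hdp/current/hadoop-client/conf             # after the copy, hadoop-policy.xml etc. should be listed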
5. /usr/sbin/hst: line 321: install-activity-analyzer.sh: command not found
  • Problem description
2016-09-12 16:34:18,905 - User['activity_analyzer'] {'gid': 'hadoop', 'groups': [u'hdfs']}
Deploying activity analyzer
Command: /usr/sbin/hst activity-analyzer setup root:root '/etc/rc.d/init.d'
Exit code: 127
Std Out: None
Std Err: /usr/sbin/hst: line 321: install-activity-analyzer.sh: command not found
Command failed after 1 tries
  • Solution: remove smartsense-hst, then reinstall it
yum remove smartsense-hst
rm -rf /var/log/smartsense/
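To complete the cycle, reinstall the package and then retry the SmartSense deployment from Ambari. A sketch, assuming smartsense-hst is served from the configured Ambari repo (where it normally lives):

yum remove -y smartsense-hst
rm -rf /var/log/smartsense/
yum install -y smartsense-hst   # assumption: available from the ambari repo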
6. When re-enabling Kerberos, YARN ResourceManager fails to start: Couldn't set ACLs on parent ZNode: /yarn-leader-election
  • Problem description
2017-06-14 10:03:29,878 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService failed in state INITED; cause: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1399)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1050)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1067)
    at org.apache.hadoop.ha.ActiveStandbyElector.setAclsWithRetries(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:349)
    ... 9 more
2017-06-14 10:03:29,880 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:processWatchEvent(600)) - Session connected.
2017-06-14 10:03:29,889 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:processWatchEvent(626)) - Successfully authenticated to ZooKeeper using SASL.
2017-06-14 10:03:29,890 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:quitElection(406)) - Yielding from election
2017-06-14 10:03:29,891 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(835)) - Terminating ZK connection for elector id=482307698 appData=null cb=Service org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService in state org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService: STOPPED
2017-06-14 10:03:29,910 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x15ca4396eda001a closed
2017-06-14 10:03:29,911 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(832)) - terminateConnection, zkConnectionState = TERMINATED
2017-06-14 10:03:29,911 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 7 more
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1399)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1050)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1067)
    at org.apache.hadoop.ha.ActiveStandbyElector.setAclsWithRetries(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:349)
    ... 9 more
2017-06-14 10:03:29,912 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 7 more
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1399)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1050)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1067)
    at org.apache.hadoop.ha.ActiveStandbyElector.setAclsWithRetries(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:349)
    ... 9 more
2017-06-14 10:03:29,913 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1078)) - Transitioning to standby state
2017-06-14 10:03:29,914 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1085)) - Transitioned to standby state
2017-06-14 10:03:29,914 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 7 more
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1399)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1050)
    at org.apache.hadoop.ha.ActiveStandbyElector$7.run(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1067)
    at org.apache.hadoop.ha.ActiveStandbyElector.setAclsWithRetries(ActiveStandbyElector.java:1044)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:349)
    ... 9 more
2017-06-14 10:03:29,910 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2017-06-14 10:03:29,922 INFO resourcemanager.ResourceManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at bigdata-nn-01/172.16.0.124
************************************************************/
  • Solution:
Log in to ZooKeeper:
[root@bigdata-nn-01 ~]$ su - hadoop
[hadoop@bigdata-nn-01 ~]$ zookeeper-client -server bigdata-nn-01.cars.com:2181
Note: the host passed to -server must be the FQDN.
Delete /yarn-leader-election:
[zk: bigdata-nn-01.cars.com:2181(CONNECTED) 1] rmr /yarn-leader-election
Then restart the ResourceManagers; the znode is recreated with the correct Kerberos ACLs.
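To confirm the stale ACLs before deleting anything, the znode can be inspected first from the same session (a sketch; the prompt counter will differ):

[zk: bigdata-nn-01.cars.com:2181(CONNECTED) 0] getAcl /yarn-leader-election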
7. Missing package python-argparse-1.2.1-2.1.el6.noarch.rpm (CentOS 6.5, Ambari 2.5.1.0)
  • Problem description
2017-08-16 09:44:19,927 - Stack Feature Version Info: stack_version=2.6, version=None, current_cluster_version=None -> 2.6
2017-08-16 09:44:19,929 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
User Group mapping (user_group) is missing in the hostLevelParams
2017-08-16 09:44:19,931 - Group['hadoop'] {}
2017-08-16 09:44:19,933 - Group['users'] {}
2017-08-16 09:44:19,934 - User['hadoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-08-16 09:44:19,935 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2017-08-16 09:44:19,937 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hadoop /tmp/hadoop-hadoop,/tmp/hsperfdata_hadoop,/home/hadoop,/tmp/hadoop,/tmp/sqoop-hadoop'] {'not_if': '(test $(id -u hadoop) -gt 1000) || (false)'}
2017-08-16 09:44:19,964 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hadoop /tmp/hadoop-hadoop,/tmp/hsperfdata_hadoop,/home/hadoop,/tmp/hadoop,/tmp/sqoop-hadoop'] due to not_if
2017-08-16 09:44:19,965 - Directory['/tmp/hbase-hbase'] {'owner': 'hadoop', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
2017-08-16 09:44:19,967 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2017-08-16 09:44:19,969 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hadoop /home/hadoop,/tmp/hadoop,/usr/bin/hadoop,/var/log/hadoop,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hadoop) -gt 1000) || (false)'}
2017-08-16 09:44:19,990 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hadoop /home/hadoop,/tmp/hadoop,/usr/bin/hadoop,/var/log/hadoop,/tmp/hbase-hbase'] due to not_if
2017-08-16 09:44:19,991 - Group['hadoop'] {}
2017-08-16 09:44:19,992 - User['hadoop'] {'fetch_nonlocal_groups': True, 'groups': ['users', 'hadoop']}
2017-08-16 09:44:19,993 - FS Type:
2017-08-16 09:44:19,993 - Directory['/etc/hadoop'] {'mode': 0755}
2017-08-16 09:44:20,020 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hadoop', 'group': 'hadoop'}
2017-08-16 09:44:20,021 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hadoop', 'group': 'hadoop', 'mode': 01777}
2017-08-16 09:44:20,041 - Initializing 2 repositories
2017-08-16 09:44:20,042 - Repository['HDP-2.6'] {'base_url': 'http://172.17.109.11/HDP/centos6/', 'action': ['create'], 'components': ['HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP', 'mirror_list': None}
2017-08-16 09:44:20,052 - File['/etc/yum.repos.d/HDP.repo'] {'content': '[HDP-2.6]\nname=HDP-2.6\nbaseurl=http://172.17.109.11/HDP/centos6/\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-08-16 09:44:20,053 - Repository['HDP-UTILS-1.1.0.21'] {'base_url': 'http://172.17.109.11/HDP-UTILS/', 'action': ['create'], 'components': ['HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None}
2017-08-16 09:44:20,057 - File['/etc/yum.repos.d/HDP-UTILS.repo'] {'content': '[HDP-UTILS-1.1.0.21]\nname=HDP-UTILS-1.1.0.21\nbaseurl=http://172.17.109.11/HDP-UTILS/\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-08-16 09:44:20,057 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:44:20,147 - Skipping installation of existing package unzip
2017-08-16 09:44:20,148 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:44:20,155 - Skipping installation of existing package curl
2017-08-16 09:44:20,155 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:44:20,164 - Skipping installation of existing package hdp-select
2017-08-16 09:44:20,444 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-08-16 09:44:20,453 - call['ambari-python-wrap /usr/bin/hdp-select status hive-server2'] {'timeout': 20}
2017-08-16 09:44:20,508 - call returned (0, 'hive-server2 - None')
2017-08-16 09:44:20,510 - Failed to get extracted version with /usr/bin/hdp-select
2017-08-16 09:44:20,510 - Stack Feature Version Info: stack_version=2.6, version=None, current_cluster_version=None -> 2.6
2017-08-16 09:44:20,527 - Package['mysql-connector-java'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:44:20,615 - Installing package mysql-connector-java ('/usr/bin/yum -d 0 -e 0 -y install mysql-connector-java')
2017-08-16 09:45:09,408 - Package['hive_2_6_1_0_129'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:45:09,421 - Installing package hive_2_6_1_0_129 ('/usr/bin/yum -d 0 -e 0 -y install hive_2_6_1_0_129')
2017-08-16 09:45:46,355 - Package['hive_2_6_1_0_129-hcatalog'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:45:46,368 - Installing package hive_2_6_1_0_129-hcatalog ('/usr/bin/yum -d 0 -e 0 -y install hive_2_6_1_0_129-hcatalog')
2017-08-16 09:45:52,828 - Package['hive_2_6_1_0_129-webhcat'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:45:52,844 - Installing package hive_2_6_1_0_129-webhcat ('/usr/bin/yum -d 0 -e 0 -y install hive_2_6_1_0_129-webhcat')
2017-08-16 09:46:00,697 - Package['hive2_2_6_1_0_129'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-08-16 09:46:00,710 - Installing package hive2_2_6_1_0_129 ('/usr/bin/yum -d 0 -e 0 -y install hive2_2_6_1_0_129')
2017-08-16 09:46:05,216 - Execution of '/usr/bin/yum -d 0 -e 0 -y install hive2_2_6_1_0_129' returned 1. There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them. The program yum-complete-transaction is found in the yum-utils package.
Error: Package: hive2_2_6_1_0_129-2.1.0.2.6.1.0-129.noarch (HDP-2.6)
       Requires: python-argparse
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
2017-08-16 09:46:05,216 - Failed to install package hive2_2_6_1_0_129. Executing '/usr/bin/yum clean metadata'
2017-08-16 09:46:05,400 - Retrying to install package hive2_2_6_1_0_129 after 30 seconds
Command failed after 1 tries
  • Solution:
yum install python-argparse-1.2.1-2.1.el6.noarch.rpm -y
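The log above also warns about unfinished yum transactions; it may be worth finishing those first. A sketch, per the hint in the log itself (yum-complete-transaction comes from yum-utils):

yum install -y yum-utils
yum-complete-transaction --cleanup-only
yum install -y python-argparse-1.2.1-2.1.el6.noarch.rpm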




Regarding Ambari hst agent registration failures, check the log:
=============
INFO 2017-09-21 10:52:33,435 security.py:178 - Server certificate not exists, downloading
INFO 2017-09-21 10:52:33,435 security.py:191 - Downloading server cert from https://ambari-test1.com:9440/cert/ca/
ERROR 2017-09-21 10:52:33,510 ServerAPI.py:137 - POST https://ambari-test1.com:9441/api/v1/register failed. (SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)'),)

[qtp-ambari-agent-42] nio:720 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
============
This error occurs because the python-2.7.5-e58 build enables SSL certificate verification by default. To fix it, edit /etc/python/cert-verification.cfg:
[https]
verify=disable
Alternatively, downgrade the Python version.
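The same edit as a one-liner, for scripting across hosts (a sketch; it assumes the stock [https] section already exists in the file):

# assumption: /etc/python/cert-verification.cfg already contains a [https] section
sed -i 's/^verify=.*/verify=disable/' /etc/python/cert-verification.cfg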

References:

https://community.hortonworks.com/questions/145/openssl-error-upon-host-registration.html







2017-09-30 19:49:22,743 - Execution of '/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor' returned 1. Error: Package: ambari-metrics-monitor-2.5.0.3-7.x86_64 (ambari-2.5.0.3)
           Requires: gcc
Error: Package: ambari-metrics-monitor-2.5.0.3-7.x86_64 (ambari-2.5.0.3)
           Requires: python-devel
 You could try using --skip-broken to work around the problem
** Found 2 pre-existing rpmdb problem(s), 'yum check' output follows:




Please install gcc, python-devel, and libtirpc-devel.
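All three in one transaction (package names taken from the error above, plus libtirpc-devel from problem 1):

yum install -y gcc python-devel.x86_64 libtirpc-devel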




File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/yum -d 0 -e 0 -y install rpcbind' returned 1. 
Error: Nothing to do




Please install rpcbind.
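"Error: Nothing to do" usually means no enabled repo provides the package, so first make sure an OS base repo is reachable (see the ol73.repo example near the end of these notes). A sketch:

yum clean all
yum list rpcbind        # confirm a repo actually offers it
yum install -y rpcbind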






File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/yum -d 0 -e 0 -y install zookeeper_2_6_0_3_8-server' returned 1. Error: Package: zookeeper_2_6_0_3_8-server-3.4.6.2.6.0.3-8.noarch (HDP-2.6)
           Requires: redhat-lsb
  
  
Please install redhat-lsb.
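redhat-lsb also comes from the OS repos rather than the HDP ones, so with a base repo configured:

yum install -y redhat-lsb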


The "Please check openssl library versions" problem

ERROR 2014-05-08 22:21:18,891 NetUtil.py:56 - [Errno 1] _ssl.c:492: error:100AE081:elliptic curve routines:EC_GROUP_new_by_curve_name:unknown group
ERROR 2014-05-08 22:21:18,892 NetUtil.py:58 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
INFO 2014-05-08 22:21:18,892 NetUtil.py:81 - Server at https://m1.hdp2:8440 is not reachable, sleeping for 10 seconds...


The good news is that this message is referenced in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-4_2x.html; following the information provided there, I found I needed a newer version of OpenSSL, which I got by running "sudo yum upgrade openssl" on all the boxes to get past these errors.
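Per node, the check and fix amount to something like the following (a sketch; restarting ambari-agent just forces an immediate re-registration attempt):

openssl version               # affected builds trigger the elliptic-curve error above
sudo yum upgrade -y openssl
sudo ambari-agent restart     # the agent then retries the handshake on port 8440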

I also found a warning from each of the four nodes that the ntpd service was not running. I thought I had taken care of this earlier, but either way, after following the instructions from the earlier post on building a virtualized 5-node HDP 2.0 cluster (all within a Mac), the warnings cleared up.

Unlike the other cluster install instructions, for this setup we want all services checked on the Choose Services page; after that you can take some creative liberty on the Assign Masters page (the original post includes a screenshot of those selections).



https://martin.atlassian.net/wiki/spaces/lestermartin/blog/2014/05/07/23494834/setting+up+hdp+2.1+with+non-standard+users+for+hadoop+services+why+not+use+a+non-standard+user+for+ambari+too

http://www.iops.cc/fix-openssl-version-with-python/
http://download.csdn.net/download/mgb_123456/5698885

http://blog.csdn.net/freeman1975/article/details/51475249


Problem description

2018-04-24 09:41:52,554 - Installing package hadoop_2_6_0_3_8-yarn ('/usr/bin/yum -d 0 -e 0 -y install hadoop_2_6_0_3_8-yarn')
2018-04-24 09:41:54,545 - Execution of '/usr/bin/yum -d 0 -e 0 -y install hadoop_2_6_0_3_8-yarn' returned 1. Error: Package: hadoop_2_6_0_3_8-2.7.3.2.6.0.3-8.x86_64 (HDP-2.6)
           Requires: nc
Error: Package: hadoop_2_6_0_3_8-2.7.3.2.6.0.3-8.x86_64 (HDP-2.6)
           Requires: redhat-lsb-core
Error: Package: hadoop_2_6_0_3_8-2.7.3.2.6.0.3-8.x86_64 (HDP-2.6)
           Requires: psmisc
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
2018-04-24 09:41:54,545 - Failed to install package hadoop_2_6_0_3_8-yarn. Executing '/usr/bin/yum clean metadata'
2018-04-24 09:41:54,680 - Retrying to install package hadoop_2_6_0_3_8-yarn after 30 seconds

Command failed after 1 tries

Solution

The errors above show that nc, redhat-lsb-core, and psmisc are missing. Point yum at an OS repository, then install them:

# cd /etc/yum.repos.d

# vi ol73.repo

[Server]
name=Server
baseurl=ftp://192.168.0.70/pub/os/OL-73
gpgcheck=0


# yum clean all

# yum install nc

# yum install redhat-lsb-core

# yum install psmisc
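The three installs can equally be done in a single transaction:

# same packages as above, one yum run
yum install -y nc redhat-lsb-core psmisc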
