目前使用hive的方式主要有2种:
a、仅把hive作为一个可以对文件使用sql查询的工具来使用,即常见的在hive客户端上,指定hive的metastore和hive的数据仓库目录(即hdfs目录),然后指定hadoop的目录以及相应执行的jar包即可,在命令行下调用:
1 mammut@classb-ds-bigdata16:~/apache-hive-1.2.1-bin$ bin/hive 2 Logging initialized using configuration in /home/mammut/apache-hive-0.13.1-bin/conf/hive-log4j.properties.self 3 Hive history file=/tmp/mammut/hive_job_log_21e28d9e-7756-402e-b0d2-242f4ceef095_1602622773.txt 4 hive>
b、将hive作为一个可以提供SQL接口的数据库来使用,这样可以通过jdbc连接
这里就是对第二种方式做一下整理和记录,在实现的过程中,遇到了一些问题,解决的过程中也顺带学习了一些概念和方法。
分解成各个子模块后,主要是下面几点:
1、搭建hiveserver2服务器
2(可选)、认证方式为kerberos认证,则需要配置相应的kerberos
3、代码实现:
4、验证结果a、认证kerberosb、建立jdbc连接
一、搭建hiveserver2服务器:
a、分别设置hiveclient中的hive-site.xml中:warehouse路径、jdbc(metastore)为mysql以及相应连接方式、用户名密码 ---这几个是hive-client搭建必须的配置
b、设置hiveserver2的配置,配置认证方式为kerberos,keytab为一个有代理权限的keytab和相应principal(这里选择代理权限的keytab,1是因为hiveserver2是个代理服务器,2是为了方便多个不同用户通过jdbc连接时,可以通过代理拥护代理执行,否则一个普通用户访问会有如下错误)
1 16/03/02 17:32:33 [main]: ERROR transport.TSaslTransport: SASL negotiation failure 2 javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)] 3 at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) 4 at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) 5 at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) 6 at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) 7 at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) 8 at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) 9 at java.security.AccessController.doPrivileged(Native Method)
c、设置hiveserver2的port和host(host设置成0.0.0.0,来接收未知来源的ip):
<property> <name>hive.server2.thrift.port</name> <value>10000</value> <description>Port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description> </property> <property> <name>hive.server2.thrift.bind.host</name> <value>0.0.0.0</value> <description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description> </property>
测试hiveserver2是否可用:
1 nohup ./bin/hive --service hiveserver2 & 2 ./bin/beeline 3 ~/apache-hive-1.2.1-bin$ bin/beeline 4 SLF4J: Class path contains multiple SLF4J bindings. 5 SLF4J: Found binding in [jar:file:/home/mammut/spark-1.5.2-SNAPSHOT-bin-2.4.0/lib/spark-assembly-1.5.2-SNAPSHOT-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 6 SLF4J: Found binding in [jar:file:/home/mammut/hadoop-binjiang/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] 7 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 8 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 9 SLF4J: Class path contains multiple SLF4J bindings. 10 SLF4J: Found binding in [jar:file:/home/mammut/spark-1.5.2-SNAPSHOT-bin-2.4.0/lib/spark-assembly-1.5.2-SNAPSHOT-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 11 SLF4J: Found binding in [jar:file:/home/mammut/hadoop-binjiang/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] 12 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 13 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 14 Beeline version 1.2.1 by Apache Hive 15 beeline> !connect jdbc:hive2://10.165.139.121:10000/mammut_auto_test;principal=mammut/[email protected];User=;Password=; 16 Connecting to jdbc:hive2://10.165.139.121:10000/mammut_auto_test;principal=mammut/[email protected];User=;Password=; 17 Enter username for jdbc:hive2://10.165.139.121:10000/mammut_auto_test;principal=mammut/[email protected];User=;Password=;: 18 Enter password for jdbc:hive2://10.165.139.121:10000/mammut_auto_test;principal=mammut/[email protected];User=;Password=;: 19 Connected to: Apache Hive (version 1.2.1) 20 Driver: Hive JDBC (version 1.2.1) 21 Transaction isolation: TRANSACTION_REPEATABLE_READ 22 0: jdbc:hive2://10.165.139.121:10000/mammut_a> 23 0: jdbc:hive2://10.165.139.121:10000/mammut_a> 24 0: jdbc:hive2://10.165.139.121:10000/mammut_a> show tables; 25 +-------------+--+ 26 | tab_name | 27 +-------------+--+ 28 | auto_test1 | 29 +-------------+--+ 30 1 row selected (0.145 seconds) 31 0: jdbc:hive2://10.165.139.121:10000/mammut_a>
至此,hiveserver2搭建完成
遇到的问题:
遇到的最大的问题,是hiveserver2启动时报错,这个问题耽误了一段时间,后来发现啥hive自身的一个bug,在hive_0.14版本就修复了,我用的是0.13,因此版本换到1.2.1后就可以正常启动了(
https://issues.apache.org/jira/browse/HIVE-7620)
2016-03-02 17:02:22,602 ERROR [Thread-6]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(93)) - Error: java.lang.NoSuchFieldError: SASL_PROPS at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S.getHadoopSaslProperties(HadoopThriftAuthBridge20S.java:126) at org.apache.hive.service.auth.HiveAuthFactory.getSaslProperties(HiveAuthFactory.java:116) at org.apache.hive.service.auth.HiveAuthFactory.getAuthTransFactory(HiveAuthFactory.java:133) at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:43) at java.lang.Thread.run(Thread.java:701) 2016-03-02 17:02:22,605 INFO [Thread[Thread-7,5,main]]: delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:updateCurrentKey( 222)) - Updating the current master key for generating delegation tokens 2016-03-02 17:02:22,612 INFO [Thread-3]: server.HiveServer2 (HiveStringUtils.java:run(623)) - SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down HiveServer2 at ************************************************************/
2、代码实现kerberos认证:
在早期配置认证的时候,出现如下报错:
1 java.io.IOException: Login failure for qatest/[email protected] from keytab D:\00Tools\keytab\interface\qatest.keytab 2 at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:695) 3 at test.common.AuthKrb5.authKrb5(AuthKrb5.java:23) 4 at test.common.BaseHive.<init>(BaseHive.java:21) 5 at test.common.DataCheck.checkMysqlHive(DataCheck.java:23) 6 at test.verify.CheckRdsHiveTest.checkMsqToHive(CheckRdsHiveTest.java:24) 7 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 8 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 9 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 10 at java.lang.reflect.Method.invoke(Method.java:606) 11 at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85) 12 at org.testng.internal.Invoker.invokeMethod(Invoker.java:696) 13 at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:882) 14 at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1189) 15 at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:124) 16 at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108) 17 at org.testng.TestRunner.privateRun(TestRunner.java:767) 18 at org.testng.TestRunner.run(TestRunner.java:617) 19 at org.testng.SuiteRunner.runTest(SuiteRunner.java:348) 20 at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:343) 21 at org.testng.SuiteRunner.privateRun(SuiteRunner.java:305) 22 at org.testng.SuiteRunner.run(SuiteRunner.java:254) 23 at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) 24 at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) 25 at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224) 26 at org.testng.TestNG.runSuitesLocally(TestNG.java:1149) 27 at org.testng.TestNG.run(TestNG.java:1057) 28 at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111) 29 at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204) 30 at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175) 31 at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:125) 32 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 33 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 34 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 35 at java.lang.reflect.Method.invoke(Method.java:606) 36 at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) 37 Caused by: javax.security.auth.login.LoginException: java.lang.IllegalArgumentException:
看路径等都是正确的,将配置放到linux上去跑ok,怀疑是krb5的配置没有加载上,于是谷歌之,在IBM的知识社区中找到了可能的原因: JVM启动时没有加载krb5的配置
(kerberos的配置也可以参照这篇文章)
加上参数
System.setProperty("java.security.krb5.conf", Config.KRB5CONFIG_PATH);
就可以正常work了
PS:由于后续还要写操作hdfs的代码,也是通过kerberos认证,故把kerberos认证这块当做一个单独的类去提炼出来,也便于后期维护,代码如下:
1 public static void authKrb5(){ 2 //设置jvm启动时krb5的读取路径参数 3 System.setProperty("java.security.krb5.conf", Config.KRB5CONFIG_PATH); 4 //配置kerberos认证 5 Configuration conf = new Configuration(); 6 conf.setBoolean("hadoop.security.authorization", true); 7 conf.set("hadoop.security.authentication", "kerberos"); 8 // System.out.println(System.getProperty("java.security.krb5.conf")); 9 UserGroupInformation.setConfiguration(conf); 10 try { 11 UserGroupInformation.loginUserFromKeytab("qatest/[email protected]", "D:\\00Tools\\keytab\\interface\\qatest.keytab"); 12 } catch (IOException e) { 13 e.printStackTrace(); 14 } 15 // System.out.println("Succeeded in authenticating through Kerberos!"); 16 }
3、代码实现hive的jdbc连接
这个就不赘述了,由于经常调用,单独使用一个类来维护,还有一些常用方法的封装:
1 protected Statement stmt; 2 protected Connection con; 3 private static String driverName = "org.apache.hive.jdbc.HiveDriver"; 4 5 // public static void main(String[] args) throws SQLException{ 6 public BaseHive() { 7 //认证kerberos 8 AuthKrb5 authentkrb5 = new AuthKrb5(); 9 authentkrb5.authKrb5(); 10 11 try { 12 Class.forName(driverName); 13 } catch (ClassNotFoundException e) { 14 e.printStackTrace(); 15 System.exit(1); 16 } 17 18 //连接hiveserver2 19 try{ 20 con = DriverManager.getConnection("jdbc:hive2://"+Config.HIVE2_URL+"/mammut_auto_test;principal=mammut/[email protected];User=;Password=;", "", ""); 21 stmt = con.createStatement(); 22 }catch (SQLException e){ 23 e.printStackTrace(); 24 } 25 }