Contents
1, Configure HiveServer2
1) Configure hive-site.xml on the Hive server
2) Configure core-site.xml on each Hadoop node, and remember to send it to all nodes
3) Restart HDFS and Hive, and start the Metastore and HiveServer2 services on the Hive server
4) Connect to Hive through beeline on the client
2, Install the Ranger Hive plugin
1) Send the compiled Hive plugin to the "/software" directory of node1 and unpack it
2) Configure the "install.properties" file
3) Execute the "enable-hive-plugin.sh" script to enable the Hive plugin
3, Configure Ranger to connect to the Hive service
1) Start HDFS, Hive, Hive Metastore and HiveServer2
2) Configure Hive on the Ranger page
3) Test whether Hive can be connected via JDBC
4, Manage Hive user permissions with Ranger
Managing Hive security with Ranger
1, Configure HiveServer2
There are two ways to access Hive: HiveServer2 and the Hive client. The Hive client requires the Hive and Hadoop jar packages and environment to be configured on the node where it runs. HiveServer2 decouples the connecting client from the Yarn and HDFS clusters, so the Hive and Hadoop jar packages and environment do not have to be configured on every node.
Ranger can manage Hive permissions only for connections made to HiveServer2 over JDBC, so HiveServer2 needs to be configured here.
The steps to configure HiveServer2 are as follows:
1) Configure hive-site.xml on the Hive server
# Configuration in $HIVE_HOME/conf/hive-site.xml on the Hive server:
<!-- Configure HiveServer2 -->
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>192.168.179.4</value>
</property>
<!-- ZooKeeper quorum used by HiveServer2 -->
<property>
  <name>hive.zookeeper.quorum</name>
  <value>node3:2181,node4:2181,node5:2181</value>
</property>
Note: "hive.zookeeper.quorum" is a configuration item for HiveServer2 HA and can be left out. If it is left out, however, HiveServer2 keeps trying to connect to a local ZooKeeper at startup, which produces a large number of error logs (/tmp/root/hive.log), makes beeline connections to HiveServer2 on node1 unstable, and causes errors reporting that the connection could not be established.
2) Configure core-site.xml on each Hadoop node, and remember to send it to all nodes
<!-- Configure the proxy user; without the following properties, Hive JDBC connections report an error -->
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
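Because the same core-site.xml has to reach every Hadoop node, a small loop can save some typing. The following is only a sketch: the node list and the Hadoop configuration directory are assumptions that do not come from this article, and by default the script only prints the scp commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: copy core-site.xml from this node to the other Hadoop nodes.
# NODES and HADOOP_CONF_DIR are assumptions; adjust them to your cluster.
NODES="${NODES:-node2 node3 node4 node5}"
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/software/hadoop/etc/hadoop}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run scp

distribute_core_site() {
  for node in $NODES; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "scp ${HADOOP_CONF_DIR}/core-site.xml ${node}:${HADOOP_CONF_DIR}/"
    else
      scp "${HADOOP_CONF_DIR}/core-site.xml" "${node}:${HADOOP_CONF_DIR}/"
    fi
  done
}

distribute_core_site
```

Once the printed commands look right, running the script with DRY_RUN=0 performs the copy.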
3) Restart HDFS and Hive, and start the Metastore and HiveServer2 services on the Hive server
[root@node1 conf]# hive --service metastore &
[root@node1 conf]# hive --service hiveserver2 > /root/hiveserver2_log.txt &
4) Connect to Hive through beeline on the client
[root@node3 test]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: root
# Any password can be entered; it is not verified
Enter password for jdbc:hive2://node1:10000: ****
0: jdbc:hive2://node1:10000> show tables;
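The same connection can also be made non-interactively, which is convenient for quick checks. This is a minimal sketch that uses beeline's -u (JDBC URL), -n (user name) and -e (statement) options; the helper function hive_jdbc_url is a hypothetical name introduced here, and the script only prints the command when beeline is not installed on the machine.

```shell
#!/usr/bin/env bash
# Build the HiveServer2 JDBC URL used throughout this article.
hive_jdbc_url() {
  # $1 = host, $2 = port (defaults to the thrift port 10000 configured above)
  echo "jdbc:hive2://${1}:${2:-10000}"
}

URL="$(hive_jdbc_url node1 10000)"

# Run one statement non-interactively if beeline is available on this machine.
if command -v beeline >/dev/null 2>&1; then
  beeline -u "$URL" -n root -e 'show tables;'
else
  echo "beeline not found; would run: beeline -u $URL -n root -e 'show tables;'"
fi
```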
2, Install the Ranger Hive plugin
Ranger can be used to manage Hive data security. To do so, install the Hive plug-in "ranger-2.1.0-hive-plugin". This plug-in can only manage permissions for requests that connect to Hive over JDBC; it cannot manage permissions for the Hive CLI client (generally, the Hive client can be used on any node where Hive is installed). The steps are as follows:
1) Send the compiled Hive plugin to the "/software" directory of node1 and unpack it
Send "ranger-2.1.0-hive-plugin.tar.gz" from "/software/apache-ranger-2.1.0/target/" to "/software" on node1:
[root@node3 /]# scp /software/apache-ranger-2.1.0/target/ranger-2.1.0-hive-plugin.tar.gz node1:/software/
# Operate on node1
[root@node1 ~]# cd /software/
[root@node1 software]# tar -zxvf ./ranger-2.1.0-hive-plugin.tar.gz
2) Configure the "install.properties" file
Enter the "/software/ranger-2.1.0-hive-plugin" directory and modify the "install.properties" file:
[root@node1 ranger-2.1.0-hive-plugin]# vim install.properties
# Ranger Admin access address
POLICY_MGR_URL=http://node1:6080
# Hive repository name; it can be customized and is needed later in Ranger
REPOSITORY_NAME=hive_repo
# Hive installation directory
COMPONENT_INSTALL_DIR_NAME=/software/hive-3.1.2/
# User and user group that use the plug-in
CUSTOM_USER=root
CUSTOM_GROUP=root
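Before enabling the plug-in, it can help to confirm that the five keys edited above actually have values. The following sketch does that with grep; the function name check_install_properties is a hypothetical helper, not part of Ranger.

```shell
#!/usr/bin/env bash
# Report any of the keys edited above that are missing or empty in a properties file.
check_install_properties() {
  file="$1"
  missing=0
  for key in POLICY_MGR_URL REPOSITORY_NAME COMPONENT_INSTALL_DIR_NAME CUSTOM_USER CUSTOM_GROUP; do
    # A key counts as set only if an uncommented line gives it a non-empty value.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (path taken from the step above):
# check_install_properties /software/ranger-2.1.0-hive-plugin/install.properties
```

The function returns a non-zero status when any key is missing, so it can be chained in front of the enable script with "&&".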
3) Execute the "enable-hive-plugin.sh" script to enable the Hive plugin
Enter the "/software/ranger-2.1.0-hive-plugin" directory and execute the following command to enable the plug-in:
[root@node1 ~]# cd /software/ranger-2.1.0-hive-plugin
[root@node1 ranger-2.1.0-hive-plugin]# ./enable-hive-plugin.sh
3, Configure Ranger to connect to the Hive service
After installing the Hive plugin, restart HDFS and start Hive, Hive Metastore and HiveServer2. To manage table and column permissions for users who connect to Hive, add a corresponding Hive service in Ranger. Through this service, Ranger can then manage each user's permissions on Hive databases, tables and columns. The configuration is as follows:
1) Start HDFS, Hive, Hive Metastore and HiveServer2
# Start HDFS, Hive Metastore and HiveServer2 on node1
[root@node1 conf]# start-all.sh
[root@node1 conf]# hive --service metastore &
[root@node1 conf]# hive --service hiveserver2 > /root/hiveserver2_log.txt &
2) Configure Hive on the Ranger page
The parameters are explained as follows:
- "Service Name" is the name of the current Hive service and must match the "REPOSITORY_NAME" parameter in the plug-in's "install.properties" file.
- The "user" and "password" fields are kept consistent with "CUSTOM_USER=root" and "CUSTOM_GROUP=root" in the "install.properties" file.
- Set "jdbc.url" to "jdbc:hive2://node1:10000", which connects to HiveServer2 on node1.
After adding:
3) Test whether Hive can be connected via JDBC
Note: on a single machine, HiveServer2 needs some time after startup before the connection succeeds.
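Instead of retrying by hand, one option is to poll the HiveServer2 port until it accepts connections. The sketch below relies on bash's /dev/tcp pseudo-device, so it assumes bash rather than a plain POSIX shell; wait_for_port is a hypothetical helper name, and the 120-second timeout in the usage line is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Poll host:port once per second until it accepts a TCP connection or the timeout expires.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-60}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # The subshell opens (and on exit closes) a TCP connection via bash's /dev/tcp.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Usage: wait up to 120 seconds for HiveServer2 on node1 before running the connection test.
# wait_for_port node1 10000 120 && echo "HiveServer2 is accepting connections"
```

An open port only shows that the thrift listener is up; the connection test on the Ranger page is still the real check.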
4, Manage Hive user permissions with Ranger
View the Hive permission management service configured in Ranger:
In the above figure, only the root user has permissions on all databases, tables and columns. After modification, it is as follows:
Log in to beeline on node3 and connect to Hive on node1:
# node3 connects to Hive via beeline
[root@node3 ~]# beeline
# Connect to HiveServer2 over JDBC
beeline> !connect jdbc:hive2://node1:10000
# At present any user name can be entered here, because Hive performs no verification.
# Users can connect to Hive freely, and Ranger then manages their fine-grained access rights.
# As the above figure shows, only the root user can currently access table data,
# so test with a non-root user, here "diaochan":
Enter username for jdbc:hive2://node1:10000: diaochan
# Hive performs no password verification, so any password can be entered
Enter password for jdbc:hive2://node1:10000: ****
# Query the tables in the database: permission is denied
0: jdbc:hive2://node1:10000> show tables;
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [diaochan] does not have [USE] privilege on [default] (state=42000,code=40000)
# Log in to beeline again as root and run the same query: permission is granted
[root@node3 ~]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: root
# The password is optional
Enter password for jdbc:hive2://node1:10000: ****
0: jdbc:hive2://node1:10000> show tables;
Next, create two tables in Hive for permission management:
# Create two tables in Hive
create table student (id int,name string,age int) row format delimited fields terminated by '\t';
create table score (id int,name string,score int) row format delimited fields terminated by '\t';
Prepare the two tab-separated data files below, students.txt and scores.txt, and upload them to "/software/test" on node3. Note that the load commands further down read from "/root/test", so make sure the path in those commands matches the directory that actually holds the files.
1	zhangsan	18
2	lisi	19
3	wangwu	20
4	maliu	21
5	tianqi	22
6	zhaoba	23
1	zhangsan	100
2	lisi	200
3	wangwu	300
4	maliu	400
5	tianqi	500
6	zhaoba	600
# Load data:
hive> load data local inpath '/root/test/students.txt' into table student;
hive> load data local inpath '/root/test/scores.txt' into table score;
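Since both tables are declared with fields terminated by '\t', the data files must contain real tab characters, which are easy to lose when copying from a web page. The following sketch writes both files with printf so the separators are guaranteed to be tabs. The file names follow the load commands in this section; DATA_DIR is an assumption and defaults to a temporary directory.

```shell
#!/usr/bin/env bash
# Write the two sample files with a literal tab between fields.
# DATA_DIR is an assumption; point it at the directory your load commands read from.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"

# printf reuses the format string for every group of three arguments.
printf '%s\t%s\t%s\n' \
  1 zhangsan 18 \
  2 lisi 19 \
  3 wangwu 20 \
  4 maliu 21 \
  5 tianqi 22 \
  6 zhaoba 23 > "$DATA_DIR/students.txt"

printf '%s\t%s\t%s\n' \
  1 zhangsan 100 \
  2 lisi 200 \
  3 wangwu 300 \
  4 maliu 400 \
  5 tianqi 500 \
  6 zhaoba 600 > "$DATA_DIR/scores.txt"

echo "Wrote students.txt and scores.txt to $DATA_DIR"
```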
Permission requirements: user "user1" is granted query and modification permissions on the two tables above, while user "user2" is granted only query permission on them.
The configuration steps are as follows:
1) Create two users on node1, each with a password equal to the user name
# Create the two users user1 and user2
[root@node1 ~]# useradd user1
[root@node1 ~]# passwd user1
[root@node1 ~]# useradd user2
[root@node1 ~]# passwd user2
2) On the Ranger page, open the "hive_repo" service. The configuration is as follows:
To configure permissions for the Student table:
The final configuration is as follows:
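For readers who prefer text to screenshots, the user1 policy on the "student" table can also be written as a Ranger policy JSON document. This is only a sketch of what such a policy might look like: the policy name is made up, the access types "select" and "update" are assumed to correspond to the query and modification permissions described above, and submitting it through Ranger's public REST API (shown as a comment) needs admin credentials that this article does not give. The script itself only writes the JSON to a file.

```shell
#!/usr/bin/env bash
# Sketch: the user1 policy for default.student as a Ranger policy JSON document.
# RANGER_URL is taken from POLICY_MGR_URL in install.properties; POLICY_FILE is an assumption.
RANGER_URL="${RANGER_URL:-http://node1:6080}"
POLICY_FILE="${POLICY_FILE:-$(mktemp)}"

cat > "$POLICY_FILE" <<'EOF'
{
  "service": "hive_repo",
  "name": "student-user1-read-write",
  "resources": {
    "database": {"values": ["default"]},
    "table":    {"values": ["student"]},
    "column":   {"values": ["*"]}
  },
  "policyItems": [
    {
      "users": ["user1"],
      "accesses": [
        {"type": "select", "isAllowed": true},
        {"type": "update", "isAllowed": true}
      ]
    }
  ]
}
EOF

echo "Policy written to $POLICY_FILE"
# To submit it, replace PASSWORD with your Ranger admin password (not given in this article):
# curl -u admin:PASSWORD -H 'Content-Type: application/json' \
#      -X POST -d @"$POLICY_FILE" "$RANGER_URL/service/public/v2/api/policy"
```

A policy for the "score" table, or a query-only policy for user2, would follow the same shape with the table name, user and access list changed.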
3) Log in to Hive through beeline and test:
Inserting data requires user1 and user2 to write to HDFS and run jobs on Yarn. Therefore change the permissions of the "/user" path in HDFS (the parent of the Hive warehouse path "/user/hive/warehouse") to "777", and change the permissions of the "/tmp" path used by Yarn to "777":
[root@node5 bin]# hdfs dfs -chmod -R 777 /user
[root@node5 bin]# hdfs dfs -chmod -R 777 /tmp
Log in as user1, who has query and modification permissions on the "student" and "score" tables:
[root@node3 ~]# beeline
beeline> !connect jdbc:hive2://node1:10000
0: jdbc:hive2://node1:10000> select * from student;
0: jdbc:hive2://node1:10000> select * from score;
# user1 can also insert data into the tables student and score
0: jdbc:hive2://node1:10000> insert into student values (7,"aa",24);
0: jdbc:hive2://node1:10000> insert into score values (7,"bb",700);
Log in as user2, who has only query permission on the "student" and "score" tables and no modification permission:
[root@node3 software]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: user2
# Enter the password
Enter password for jdbc:hive2://node1:10000: ****
0: jdbc:hive2://node1:10000> select * from student;
0: jdbc:hive2://node1:10000> select * from score;
# Inserting data into "student" and "score" fails because user2 lacks the permission:
0: jdbc:hive2://node1:10000> insert into table student values (8,"cc",25);
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [user2] does not have [UPDATE] privilege on [default/student] (state=42000,code=40000)
0: jdbc:hive2://node1:10000> insert into table score values (8,"dd",800);
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [user2] does not have [UPDATE] privilege on [default/score] (state=42000,code=40000)
Permission requirements: user "user3" is granted query permission on the "id" and "name" columns of the "student" table, and no query permission on the other columns.
The configuration steps are as follows:
1) Add user "user3" on node1
# Create user user3
[root@node1 ~]# useradd user3
[root@node1 ~]# passwd user3
2) Configure access to the table "student" for user "user3"
3) Test
# Log in to beeline as user3
[root@node3 software]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: user3
# When accessing the "student" table, the "age" column cannot be queried, and select * is not allowed
0: jdbc:hive2://node1:10000> select id ,name from student;
Permission requirements: when user "user1" queries the table "student", the "age" column is masked and returned as a null value.
The configuration steps are as follows:
1) Configure a "Masking" policy on the table "student" for user "user1"
2) Log in to Hive through beeline and test
[root@node3 software]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: user1
0: jdbc:hive2://node1:10000> select * from student;
Permission requirements: when user "user2" queries the table "student", only rows whose "age" value is less than or equal to 20 are returned.
The configuration steps are as follows:
1) Configure a "Row Level Filter" policy on the table "student" for user "user2"
2) Log in to Hive through beeline and test
[root@node3 software]# beeline
beeline> !connect jdbc:hive2://node1:10000
Enter username for jdbc:hive2://node1:10000: user2
# Only the 3 rows that satisfy the condition are returned
0: jdbc:hive2://node1:10000> select * from student;
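To sanity-check which rows such a filter should leave visible, the same predicate can be applied to the sample data locally with awk. This only emulates the condition on the sample file; it does not talk to Hive or Ranger, and visible_rows is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Recreate the student sample rows (tab-separated) in a temporary file.
SAMPLE="$(mktemp)"
printf '%s\t%s\t%s\n' \
  1 zhangsan 18 \
  2 lisi 19 \
  3 wangwu 20 \
  4 maliu 21 \
  5 tianqi 22 \
  6 zhaoba 23 > "$SAMPLE"

# Keep only rows whose third field (age) is less than or equal to 20.
visible_rows() {
  awk -F'\t' '$3 <= 20' "$1"
}

visible_rows "$SAMPLE"
```

On the sample data this prints the rows for zhangsan, lisi and wangwu, which matches the three rows the beeline query returns.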
- 📢 Blog home page: https://lansonli.blog.csdn.net
- 📢 This article was originally written by Lansonli and first published on the CSDN blog