Update history
| Update time | Hue version | Update log |
|---|---|---|
| December 8, 2019 | 4.6 | New document |
| December 23, 2020 | 4.8 | Update document |
1 Introduction
Hue (Hadoop User Experience) is an open source web UI for Apache Hadoop. It evolved from Cloudera Desktop, which Cloudera later contributed to the Hadoop community of the Apache Foundation. It is built on the Python web framework Django.
With Hue, we can interact with a Hadoop cluster from a web console in the browser to analyze and process data, for example operating on data in HDFS, running MapReduce jobs, executing Hive SQL statements, and browsing HBase databases.
2 Installation and deployment
Hue official website: https://gethue.com
Hue download address: https://docs.gethue.com/releases/
2.1 Installation environment
CentOS version:
[root@linux01 hue-4.8.0]# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
Python version:
Hue uses Python modules that contain native code, so to install from the tarball you need certain development libraries and build tools on your system. Hue 4.8 currently supports the following Python versions:
- Python 2.7
- Python 3.6+
Check the current Python version:
[root@linux01 hue-4.8.0]# python --version
Python 2.7.5
If you use Python 3.6 or later, set the corresponding version before building:
export PYTHON_VER=python3.x    # x is the specific minor version
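The two supported version ranges listed above can be checked before starting a long build. The following is a minimal sketch and not part of Hue; the helper name `is_supported_python` is hypothetical and only encodes "Python 2.7" and "Python 3.6+" as stated in this section.

```python
import sys


def is_supported_python(version_info):
    """Return True if (major, minor, ...) matches the versions listed for Hue 4.8.

    Hypothetical helper: it only encodes "Python 2.7" and "Python 3.6+".
    """
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    if major == 3:
        return minor >= 6
    return False


if __name__ == "__main__":
    # Check the interpreter that runs this script.
    supported = is_supported_python(sys.version_info)
    print("Python %d.%d supported: %s"
          % (sys.version_info[0], sys.version_info[1], supported))
```

Run it with the same interpreter that PYTHON_VER points to, since that is the one the build will use.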
2.2 Installing dependencies
Hue is compiled locally, so the build dependencies must be installed first. The compiled build can then be copied to other machines and run there.
Since MariaDB is installed locally, replace mysql-devel with mariadb-devel:
yum install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mariadb mariadb-devel openldap-devel python-devel sqlite-devel gmp-devel
Node.js version 10 or later is also required. The version that yum installs by default is too old:
[root@linux01 hue-4.8.0]# yum install nodejs
================================================================================
 Package     Arch       Version                       Repository     Size
================================================================================
Installing:
 nodejs      x86_64     1:6.17.1-1.el7                epel           4.7 M
Installing for dependencies:
 libuv       x86_64     1:1.40.0-1.el7                epel           152 k
 npm         x86_64     1:3.10.10-1.6.17.1.1.el7      epel           2.5 M

Transaction Summary
================================================================================
The commands for installing different Node.js versions on different systems can be found at https://github.com/nodesource/distributions.
The command for Node.js 10 on CentOS:
[root@linux01 opt]# curl -sL https://rpm.nodesource.com/setup_10.x | bash -
Run the installation command again; the version has now changed to 10:
[root@linux01 opt]# yum install nodejs
Resolving Dependencies
--> Running transaction check
---> Package nodejs.x86_64 2:10.23.0-1nodesource will be installed
Installed:
  nodejs.x86_64 2:10.23.0-1nodesource
After the installation succeeds, check the current version:
[root@linux01 opt]# node -v
v10.23.0
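The version requirement can also be checked from a script. The sketch below parses a version string in the form printed by `node -v` and compares the major version with the minimum of 10 mentioned above. The function names are hypothetical, and the string `v6.17.1` is an assumption derived from the EPEL package version shown earlier.

```python
import re


def node_major_version(version_output):
    """Extract the major version from `node -v` output such as "v10.23.0"."""
    match = re.match(r"^v(\d+)\.(\d+)\.(\d+)", version_output.strip())
    if match is None:
        raise ValueError("unrecognized node version string: %r" % version_output)
    return int(match.group(1))


def meets_minimum(version_output, minimum_major=10):
    """Return True if the reported major version is at least minimum_major."""
    return node_major_version(version_output) >= minimum_major


if __name__ == "__main__":
    # The EPEL version (assumed string) and the NodeSource version from above.
    for reported in ("v6.17.1", "v10.23.0"):
        print(reported, meets_minimum(reported))
```

In practice the string would come from running `node -v`, for example through `subprocess.check_output(["node", "-v"])`.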
2.3 Compiling and running
Extract the downloaded package:
[root@linux01 pkg]# tar -zxvf hue-4.8.0.tgz
npm packages are downloaded during compilation. If the download speed is too slow, point npm at the Taobao mirror:
npm config set -g registry https://registry.npm.taobao.org
Method 1: enter the hue directory and install Hue as a service. PREFIX specifies the installation path.
# Compile
PREFIX=/usr/share make install
# Enter the installation path and run Hue
cd /usr/share/hue
build/env/bin/supervisor
Method 2: enter the hue directory and build in place, without installing as a service (the method used here).
# Build directly in the current directory
make apps
# Run Hue
build/env/bin/supervisor
Compilation takes a long time and may hang. If it is interrupted, you can simply run it again. When compilation finishes, the following is displayed:
make[1]: Leaving directory `/opt/hue-4.8.0'
2.4 Configuration
Hue is configured by editing /opt/hue-4.8.0/desktop/conf/hue.ini.
Configure the listening port and time zone:
http_host=0.0.0.0
http_port=8888
time_zone=Asia/Shanghai
Configure startup user
Hue recommends starting the service as a dedicated hue user. Create the hue user on CentOS:
useradd hue
Configure the user in hue.ini:
# Webserver runs as this user
server_user=hue
server_group=hue
# This should be the Hue admin and proxy user
default_user=hue
Create the Hue database in MySQL:
CREATE DATABASE hue CHARACTER SET UTF8;
CREATE USER 'hue'@'%' IDENTIFIED BY 'Hue123';
GRANT ALL ON hue.* TO 'hue'@'%';
FLUSH PRIVILEGES;
Change the default database:
[[database]]
engine=mysql
host=192.168.199.30
port=3306
user=hue
password=Hue123
name=hue
Initialize database
[root@linux01 hue-4.8.0]# ./build/env/bin/hue migrate
System check identified some issues:

WARNINGS:
jobbrowser.DagDetails.dag_info: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
jobbrowser.QueryDetails.hive_query: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
Operations to perform:
  Apply all migrations: admin, auth, axes, beeswax, contenttypes, desktop, jobsub, oozie, pig, search, sessions, sites, useradmin
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying axes.0001_initial... OK
  Applying axes.0002_auto_20151217_2044... OK
  Applying axes.0003_auto_20160322_0929... OK
  Applying axes.0004_auto_20201223_1925... OK
  Applying beeswax.0001_initial... OK
  Applying beeswax.0002_auto_20200320_0746... OK
  Applying desktop.0001_initial... OK
  Applying desktop.0002_initial... OK
  Applying desktop.0003_initial... OK
  Applying desktop.0004_initial... OK
  Applying desktop.0005_initial... OK
  Applying desktop.0006_initial... OK
  Applying desktop.0007_initial... OK
  Applying desktop.0008_auto_20191031_0704... OK
  Applying desktop.0009_auto_20191202_1056... OK
  Applying desktop.0010_auto_20200115_0908... OK
  Applying desktop.0011_document2_connector... OK
  Applying jobsub.0001_initial... OK
  Applying oozie.0001_initial... OK
  Applying oozie.0002_initial... OK
  Applying oozie.0003_initial... OK
  Applying oozie.0004_initial... OK
  Applying oozie.0005_initial... OK
  Applying oozie.0006_auto_20200714_1204... OK
  Applying pig.0001_initial... OK
  Applying pig.0002_auto_20200714_1204... OK
  Applying pig.0003_auto_20200923_0657... OK
  Applying search.0001_initial... OK
  Applying sessions.0001_initial... OK
  Applying sites.0001_initial... OK
  Applying sites.0002_alter_domain_unique... OK
  Applying useradmin.0001_initial... OK
  Applying useradmin.0002_userprofile_json_data... OK
  Applying useradmin.0003_auto_20200203_0802... OK
  Applying useradmin.0004_userprofile_hostname... OK
Configure LDAP login authentication (optional)
Hue supports several login authentication backends. By default, the user name and password entered at the first login are used. To authenticate against LDAP instead, configure the following:
[[auth]]
backend=desktop.auth.backend.LdapBackend
[[ldap]]
base_dn="DC=pazl,DC=com"
ldap_url=ldap://linux01:386
use_start_tls=false
bind_dn="CN=root,DC=pazl,DC=com"
bind_password=123456
ldap_username_pattern="cn=<username>,ou=People,dc=yydjj,dc=com"
create_users_on_login=true
search_bind_authentication=false
HDFS configuration
Hue can manage HDFS. The settings are defined under [hadoop] -> [[hdfs_clusters]] -> [[[default]]]:
fs_defaultfs=hdfs://linux01:8020
webhdfs_url=http://linux01:50070/webhdfs/v1
A Hadoop cluster offers two REST APIs for HDFS: WebHDFS and HttpFS. WebHDFS is built into HDFS and is enabled by default. HttpFS is a separate service that must be installed manually if it is needed; Ambari does not install it by default. WebHDFS was developed by Hortonworks and donated to Apache; HttpFS was developed by Cloudera and donated to Apache. In a high-availability (HA) configuration, HttpFS must be used.
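Before starting Hue, it can be useful to confirm that WebHDFS answers at the configured webhdfs_url. The sketch below only builds the request URL for a directory listing, using the standard WebHDFS `op=LISTSTATUS` operation and the `user.name` query parameter for simple (non-Kerberos) authentication. The helper name `webhdfs_list_url` is hypothetical, and the assumption is that the cluster is not Kerberos-secured.

```python
from urllib.parse import urlencode


def webhdfs_list_url(webhdfs_url, path="/", user="hue"):
    """Build a WebHDFS LISTSTATUS URL for the given HDFS path.

    webhdfs_url is the value configured in hue.ini, for example
    http://linux01:50070/webhdfs/v1 (taken from this document).
    """
    base = webhdfs_url.rstrip("/")
    if not path.startswith("/"):
        path = "/" + path
    query = urlencode([("op", "LISTSTATUS"), ("user.name", user)])
    return "%s%s?%s" % (base, path, query)


if __name__ == "__main__":
    # Print the URL; open it with curl or a browser to see the JSON listing.
    print(webhdfs_list_url("http://linux01:50070/webhdfs/v1", "/"))
```

If the printed URL returns a JSON document with a FileStatuses entry, WebHDFS is reachable with the settings above.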
Since Ambari enables only WebHDFS by default, we need to configure the hue user as a proxy user for all other users and groups so that it can submit requests on their behalf. Add the following configuration to core-site.xml:
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
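The two proxy-user properties follow a fixed naming pattern, `hadoop.proxyuser.<user>.groups` and `hadoop.proxyuser.<user>.hosts`, so they can be generated for any service user. The sketch below builds the same elements with the standard library; the function name `proxyuser_properties` is hypothetical, and it does not edit core-site.xml itself.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(user, groups="*", hosts="*"):
    """Build the two hadoop.proxyuser.<user>.* <property> elements."""
    elements = []
    for suffix, value in (("groups", groups), ("hosts", hosts)):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = "hadoop.proxyuser.%s.%s" % (user, suffix)
        ET.SubElement(prop, "value").text = value
        elements.append(prop)
    return elements


if __name__ == "__main__":
    # Print the fragments for the hue user, ready to paste into core-site.xml.
    for prop in proxyuser_properties("hue"):
        print(ET.tostring(prop, encoding="unicode"))
```

The wildcard `*` matches the configuration above; restricting groups and hosts to specific values is a design choice to consider on shared clusters.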
YARN configuration
[[yarn_clusters]]
  [[[default]]]
    # Enter the host on which you are running the ResourceManager
    resourcemanager_host=linux01
    # The port where the ResourceManager IPC listens on
    resourcemanager_port=8050
    # Whether to submit jobs to this cluster
    submit_to=True
    # Resource Manager logical name (required for HA)
    ## logical_name=
    # Change this if your YARN cluster is Kerberos-secured
    ## security_enabled=false
    # URL of the ResourceManager API
    resourcemanager_api_url=http://linux01:8088
    # URL of the ProxyServer API
    proxy_api_url=http://linux01:8088
    # URL of the HistoryServer API
    history_server_api_url=http://linux01:19888
    # URL of the Spark History Server
    spark_history_server_url=http://linux01:18081
    # Change this if your Spark History Server is Kerberos-secured
    ## spark_history_server_security_enabled=false
    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
    # have to be verified against certificate authority
    ## ssl_cert_ca_verify=True
  # HA support by specifying multiple clusters.
  # Redefine different properties there.
  # e.g.
  # [[[ha]]]
    # Resource Manager logical name (required for HA)
    ## logical_name=my-rm-name
    # Un-comment to enable
    ## submit_to=True
    # URL of the ResourceManager API
    ## resourcemanager_api_url=http://localhost:8088
    # ...
Hive configuration
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=linux01
# Binary thrift port for HiveServer2.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/etc/hive/conf
3 Start the Hue process
[root@linux01 hue-4.8.0]# ./build/env/bin/supervisor
Visit http://linux01:8888/hue to open the interface, then set the initial user name (yang) and password (123456).
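If the page does not load, a quick first check is whether anything is listening on the configured port. The sketch below attempts a plain TCP connection. The helper name `port_is_open` is hypothetical; the host linux01 and port 8888 are the values used in this document. A successful connection only shows that the port is open, not that Hue itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    print("Hue port reachable:", port_is_open("linux01", 8888))
```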