HUE Installation and Deployment Guide

Update history

Update time         HUE version   Update log
December 8, 2019    4.6           New document
December 23, 2020   4.8           Updated document

1 Introduction

Hue (Hadoop User Experience) is an open-source Apache Hadoop UI system that evolved from Cloudera Desktop; Cloudera eventually contributed it to the Apache Hadoop community. It is implemented on top of the Python web framework Django.
With Hue, we can interact with a Hadoop cluster from a browser-based web console to analyze and process data: operating on files in HDFS, running MapReduce jobs, executing Hive SQL statements, browsing HBase databases, and so on.

2 Installation and Deployment

Hue official website: https://gethue.com
Hue download address: https://docs.gethue.com/releases/

2.1 Installation Environment

CentOS version:

[root@linux01 hue-4.8.0]# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

Python version:
Hue builds native Python modules, so some development libraries and tools must be installed on the system before compiling the release tarball. Hue 4.8 currently supports the following Python versions:

  • Python 2.7
  • Python 3.6+

View the current Python version:

[root@linux01 hue-4.8.0]# python --version
Python 2.7.5

If you use a 3.6+ version of Python, set the corresponding version before building:

export PYTHON_VER=python3.x (where x is the specific minor version)
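Rather than hard-coding the minor version, it can be derived from the interpreter itself. A minimal sketch, assuming `python3` on the PATH resolves to the interpreter you intend to build Hue with:

```shell
# Derive PYTHON_VER from the installed python3 instead of typing it by hand.
# Assumption: "python3" is the interpreter Hue should be built against.
export PYTHON_VER=python$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
echo "$PYTHON_VER"   # e.g. python3.6
```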

2.2 Dependency Installation

Because Hue is compiled locally, the build dependencies must be installed; the compiled files can afterwards be copied to other machines and run there.
Since MariaDB is installed locally here, mysql-devel is replaced with mariadb-devel:

yum install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mariadb mariadb-devel openldap-devel python-devel sqlite-devel gmp-devel

NodeJS version 10+ is also required, but the version installed through yum's default repositories is too old:

[root@linux01 hue-4.8.0]# yum install nodejs
==============================================================================================================================================================
 Package                          Arch                             Version                                               Repository                      Size
==============================================================================================================================================================
Installing:
 nodejs                           x86_64                           1:6.17.1-1.el7                                        epel                           4.7 M
Installing for dependencies:
 libuv                            x86_64                           1:1.40.0-1.el7                                        epel                           152 k
 npm                              x86_64                           1:3.10.10-1.6.17.1.1.el7                              epel                           2.5 M

Transaction Summary
==============================================================================================================================================================

Commands to install different NodeJS versions on different systems can be found at https://github.com/nodesource/distributions.

The command corresponding to NodeJS 10 on CentOS:
[root@linux01 opt]# curl -sL https://rpm.nodesource.com/setup_10.x | bash -

Run the installation command again, and you can see that the version has changed to 10:
[root@linux01 opt]# yum install nodejs
Resolving Dependencies
--> Running transaction check
---> Package nodejs.x86_64 2:10.23.0-1nodesource will be installed
Installed:
  nodejs.x86_64 2:10.23.0-1nodesource

The installation succeeded. Check the current version:
[root@linux01 opt]# node -v
v10.23.0

2.3 Compilation and Execution

Unzip the downloaded package:

[root@linux01 pkg]# tar -zxvf hue-4.8.0.tgz

npm packages are downloaded during compilation. If the network is too slow, point npm at the Taobao mirror:

npm config set -g registry https://registry.npm.taobao.org

Method 1: enter the hue directory and install as a service. PREFIX specifies the installation path.

# Compile and install
PREFIX=/usr/share make install
# Enter the installation path and run Hue
cd /usr/share/hue
build/env/bin/supervisor

Method 2: enter the hue directory and build without installing as a service (the approach adopted here).

# Build directly in the current directory
make apps
# Run Hue
build/env/bin/supervisor

The compilation takes a long time and may appear to hang; if it is interrupted, you can simply recompile. When compilation finishes, the following is displayed:

make[1]: Leaving directory `/opt/hue-4.8.0'

2.4 Configuration

Configuration is done by modifying /opt/hue-4.8.0/desktop/conf/hue.ini.
Configure the listening address, port, and time zone:

http_host=0.0.0.0
http_port=8888

time_zone=Asia/Shanghai

Configure the startup user.
Hue recommends starting the service as a dedicated hue user; create it on CentOS:

useradd hue

Configure the user:

# Webserver runs as this user
server_user=hue
server_group=hue

# This should be the Hue admin and proxy user
default_user=hue

Create the Hue database in MySQL:

CREATE DATABASE hue CHARACTER SET utf8;

CREATE USER 'hue'@'%' IDENTIFIED BY 'Hue123';

GRANT ALL ON hue.* TO 'hue'@'%';

FLUSH PRIVILEGES;

Modify the default database configuration:

[[database]]
    engine=mysql
    host=192.168.199.30
    port=3306
    user=hue
    password=Hue123
    name=hue
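Note that hue.ini uses a nested INI dialect ([section], [[subsection]], [[[subsubsection]]]) that the Python standard library's configparser does not understand; Hue itself parses it with configobj. Purely as an illustration of the structure (this is not Hue's real parser), a minimal sketch that pulls the key=value pairs out of the [[database]] block above:

```python
# Minimal illustration of hue.ini's nested-INI structure.
# NOT Hue's actual parser (Hue uses configobj); this only shows how the
# [[database]] keys above map to a flat dict of settings.
def parse_block(text, block_name):
    """Collect key=value lines that appear under the named [[...]] header."""
    result, in_block = {}, False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):                       # any section header
            in_block = line.strip("[]") == block_name  # enter/leave the block
        elif in_block and "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            result[key.strip()] = value.strip()
    return result

sample = """
[[database]]
    engine=mysql
    host=192.168.199.30
    port=3306
    user=hue
    password=Hue123
    name=hue
"""
print(parse_block(sample, "database"))
# {'engine': 'mysql', 'host': '192.168.199.30', 'port': '3306',
#  'user': 'hue', 'password': 'Hue123', 'name': 'hue'}
```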

Initialize database

[root@linux01 hue-4.8.0]# ./build/env/bin/hue migrate
System check identified some issues:

WARNINGS:
jobbrowser.DagDetails.dag_info: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
        HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
jobbrowser.QueryDetails.hive_query: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
        HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
Operations to perform:
  Apply all migrations: admin, auth, axes, beeswax, contenttypes, desktop, jobsub, oozie, pig, search, sessions, sites, useradmin
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying axes.0001_initial... OK
  Applying axes.0002_auto_20151217_2044... OK
  Applying axes.0003_auto_20160322_0929... OK
  Applying axes.0004_auto_20201223_1925... OK
  Applying beeswax.0001_initial... OK
  Applying beeswax.0002_auto_20200320_0746... OK
  Applying desktop.0001_initial... OK
  Applying desktop.0002_initial... OK
  Applying desktop.0003_initial... OK
  Applying desktop.0004_initial... OK
  Applying desktop.0005_initial... OK
  Applying desktop.0006_initial... OK
  Applying desktop.0007_initial... OK
  Applying desktop.0008_auto_20191031_0704... OK
  Applying desktop.0009_auto_20191202_1056... OK
  Applying desktop.0010_auto_20200115_0908... OK
  Applying desktop.0011_document2_connector... OK
  Applying jobsub.0001_initial... OK
  Applying oozie.0001_initial... OK
  Applying oozie.0002_initial... OK
  Applying oozie.0003_initial... OK
  Applying oozie.0004_initial... OK
  Applying oozie.0005_initial... OK
  Applying oozie.0006_auto_20200714_1204... OK
  Applying pig.0001_initial... OK
  Applying pig.0002_auto_20200714_1204... OK
  Applying pig.0003_auto_20200923_0657... OK
  Applying search.0001_initial... OK
  Applying sessions.0001_initial... OK
  Applying sites.0001_initial... OK
  Applying sites.0002_alter_domain_unique... OK
  Applying useradmin.0001_initial... OK
  Applying useradmin.0002_userprofile_json_data... OK
  Applying useradmin.0003_auto_20200203_0802... OK
  Applying useradmin.0004_userprofile_hostname... OK

Configure LDAP login authentication (optional)
Hue supports several authentication backends. By default, the user name and password entered at first login become the admin account. To authenticate against LDAP instead:

[[auth]]
backend=desktop.auth.backend.LdapBackend
[[ldap]]
base_dn="DC=pazl,DC=com"
ldap_url=ldap://linux01:389
use_start_tls=false
bind_dn="CN=root,DC=pazl,DC=com"
bind_password=123456
ldap_username_pattern="cn=<username>,ou=People,dc=yydjj,dc=com"
create_users_on_login=true
search_bind_authentication=false
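With search-bind authentication disabled, Hue builds the bind DN directly from the username pattern by substituting the login name for the <username> placeholder. A sketch of that substitution (illustration only; the pattern is the example value from above):

```python
# Illustration of how the <username> placeholder in ldap_username_pattern
# expands to a full bind DN when direct bind is used.
def bind_dn_for(pattern, username):
    return pattern.replace("<username>", username)

pattern = "cn=<username>,ou=People,dc=yydjj,dc=com"
print(bind_dn_for(pattern, "alice"))
# cn=alice,ou=People,dc=yydjj,dc=com
```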

HDFS configuration
Hue can manage HDFS; the settings live under [hadoop] -> [[hdfs_clusters]] -> [[[default]]]:

fs_defaultfs=hdfs://linux01:8020
webhdfs_url=http://linux01:50070/webhdfs/v1
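webhdfs_url is the base of the WebHDFS REST API; clients append a file path plus an op parameter to it. As a quick way to verify the endpoint by hand, a sketch that builds such REST URLs (linux01:50070 is the example NameNode from this guide):

```python
# Build WebHDFS REST URLs of the form <webhdfs_url><path>?op=<OP>&user.name=<user>.
# Illustration only: host/port are the example values used in this guide.
def webhdfs_op_url(base, path, op, user="hue"):
    return "%s%s?op=%s&user.name=%s" % (base.rstrip("/"), path, op, user)

base = "http://linux01:50070/webhdfs/v1"
print(webhdfs_op_url(base, "/", "LISTSTATUS"))
# http://linux01:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue
```

Opening such a URL with curl or a browser should return a JSON FileStatuses listing if WebHDFS is reachable.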

Hadoop's REST HDFS APIs are WebHDFS and HttpFS. WebHDFS is a built-in HDFS service and is enabled by default; HttpFS is a standalone HDFS service that must be installed manually if needed, and Ambari does not install it by default. WebHDFS was developed by Hortonworks and donated to Apache; HttpFS was developed by Cloudera and donated to Apache. When HDFS high availability is configured, HttpFS must be used.
Since Ambari only enables WebHDFS by default, we need to configure the hue user as a proxy user for all other users and groups, so that it can submit requests on their behalf. Add the following configuration to core-site.xml:

<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
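The two properties follow Hadoop's hadoop.proxyuser.<user>.{groups,hosts} naming scheme. If you script cluster configuration, the fragment can be generated instead of typed; a sketch (illustration only, with hue as the proxy user from this guide):

```python
# Generate the hadoop.proxyuser.<user>.{groups,hosts} <property> elements
# for core-site.xml. Illustration only; "hue" is this guide's proxy user.
import xml.etree.ElementTree as ET

def proxyuser_properties(user, groups="*", hosts="*"):
    props = []
    for suffix, value in (("groups", groups), ("hosts", hosts)):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = "hadoop.proxyuser.%s.%s" % (user, suffix)
        ET.SubElement(prop, "value").text = value
        props.append(ET.tostring(prop, encoding="unicode"))
    return props

for p in proxyuser_properties("hue"):
    print(p)
```

After changing core-site.xml, the NameNode must pick up the new values (a restart, or refreshing the proxy-user configuration, depending on your cluster management tooling).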

YARN configuration

  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=linux01

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8050

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      ## logical_name=

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      resourcemanager_api_url=http://linux01:8088

      # URL of the ProxyServer API
      proxy_api_url=http://linux01:8088

      # URL of the HistoryServer API
      history_server_api_url=http://linux01:19888

      # URL of the Spark History Server
      spark_history_server_url=http://linux01:18081

      # Change this if your Spark History Server is Kerberos-secured
      ## spark_history_server_security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters.
    # Redefine different properties there.
    # e.g.

    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      ## logical_name=my-rm-name

      # Un-comment to enable
      ## submit_to=True

      # URL of the ResourceManager API
      ## resourcemanager_api_url=http://localhost:8088

      # ...

HIVE configuration

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=linux01

  # Binary thrift port for HiveServer2.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/etc/hive/conf

3 Start the HUE Process

[root@linux01 hue-4.8.0]# ./build/env/bin/supervisor

Visit http://linux01:8888/hue to enter the interface. The user name and password entered at first login (here yang / 123456) become the admin account.


Posted by paulbrown83 on Sat, 16 Apr 2022 02:48:05 +0930