Hadoop distributed deployment: deploy one namenode and three datanodes

This post walks through deploying Hadoop with one master and three slaves.
1. First, clone the original CentOS system.

2. nn is the master; dn1, dn2 and dn3 are the slaves. All of them are created quickly by cloning: right-click the VM, choose Manage > Clone, and select "full clone".
3. Configure the cluster network:
Set a static IP for each node, for example
192.168.64.132
192.168.64.133
192.168.64.134
192.168.64.135
Set the static IP addresses according to your own network; only the last octet differs between the nodes.
4. First complete the static IP settings on nn through the shell:

1) Command: vi /etc/sysconfig/network-scripts/ifcfg-ens33 (a sample of this file is shown after this step)

2) Command: vi /etc/hostname
Change the hostname to: nn

3) Command: vi /etc/hosts — map each hostname to its IP (a sample is shown after this step)

4) Do the same on the other three nodes (dn1, dn2, dn3), using each node's own IP address and hostname.
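For reference, a minimal sketch of the two files on nn, assuming the four addresses above map to nn, dn1, dn2 and dn3 in that order; the gateway and DNS values are placeholders for your own network:

/etc/sysconfig/network-scripts/ifcfg-ens33 (only the relevant lines):
TYPE=Ethernet
BOOTPROTO=static          # static address instead of DHCP
DEVICE=ens33
ONBOOT=yes                # bring the interface up at boot
IPADDR=192.168.64.132     # this node's address (nn)
NETMASK=255.255.255.0
GATEWAY=192.168.64.2      # placeholder: your own gateway
DNS1=192.168.64.2         # placeholder: your own DNS server

/etc/hosts (identical on all four nodes):
192.168.64.132 nn
192.168.64.133 dn1
192.168.64.134 dn2
192.168.64.135 dn3

After editing ifcfg-ens33, restart the network (systemctl restart network on CentOS 7) so the new address takes effect.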
5. Add a new user on each of the four nodes: adduser hadoop
6. Set up passwordless SSH login, e.g. so that nn can log in to dn1 without a password.
1) Command: ssh-keygen -t rsa (press Enter at every prompt)

Starting Hadoop will otherwise prompt for a password for every node, so set up passwordless startup now.
The public key is then appended to the authorized_keys file; also fix that file's permissions (important, please do not skip it):

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

2) See the following blog for the underlying principle:
https://blog.csdn.net/wh_19910525/article/details/74331649
The command generates the following files under ~/.ssh:

3) id_rsa (private key), id_rsa.pub (public key), known_hosts (records which hosts have been logged in to)
What we need to do now is pass id_rsa.pub (the public key) to dn1, dn2 and dn3.
The command ssh-copy-id dn1 copies the public key to dn1.

4) Do the same for dn2 and dn3. Afterwards you can log in to dn1, dn2 and dn3 from nn without a password (a consolidated sketch follows).
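Put together, the whole exchange can be run from nn in a few commands. This is only a sketch of the steps above, assuming the hadoop user already exists on every node:

ssh-keygen -t rsa                                   # press Enter at every prompt
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys     # nn can now ssh to itself
chmod 600 ~/.ssh/authorized_keys
for h in dn1 dn2 dn3; do ssh-copy-id $h; done       # push the public key to every datanode
ssh dn1 hostname                                    # should print dn1 without asking for a password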


7. Configure the JDK and JAVA_HOME
1) Install the JDK and Hadoop
2) Create soft links: ln -s jdk1.8.0_152 jdk8 and ln -s hadoop-2.7.5 hadoop2, so the environment variables below point at version-independent names (the unpacking and link commands are sketched at the end of this step)

3) Environment variable configuration (note the single quotes in step 4, so that $HADOOP_HOME is written literally instead of being expanded by the current shell):
1) echo export JAVA_HOME=/home/hadoop/opt/jdk8   (print the line first to check it)
2) echo export JAVA_HOME=/home/hadoop/opt/jdk8 >> ~/.bashrc
3) echo export HADOOP_HOME=/home/hadoop/opt/hadoop2 >> ~/.bashrc
4) echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> ~/.bashrc
5) Alternatively, edit the file directly with vi ~/.bashrc; either way there must be four lines at the end (the expected lines are shown in the sketch after item 6).
After configuring, run source ~/.bashrc to make the settings take effect.


6) Test whether the configuration succeeded: echo $JAVA_HOME and echo $HADOOP_CONF_DIR
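A sketch of what /home/hadoop/opt and ~/.bashrc end up looking like. The tarball names are assumptions based on the versions mentioned above, and the PATH line is an assumption about the fourth expected line (it lets you run hdfs and start-dfs.sh without typing full paths):

cd /home/hadoop/opt
tar -zxf jdk-8u152-linux-x64.tar.gz     # assumed archive name for jdk1.8.0_152
tar -zxf hadoop-2.7.5.tar.gz
ln -s jdk1.8.0_152 jdk8
ln -s hadoop-2.7.5 hadoop2

~/.bashrc should end with these four lines:
export JAVA_HOME=/home/hadoop/opt/jdk8
export HADOOP_HOME=/home/hadoop/opt/hadoop2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin     # assumed fourth line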

7) Hadoop is not distributed out of the box; it has to be switched to distributed mode by changing the file:/// protocol to hdfs in its configuration files (minimal samples of every file touched below are shown after this list):

  1. Command vi $HADOOP_CONF_DIR/core-site.xml (this is where fs.defaultFS is switched to an hdfs:// address)
  2. vi $HADOOP_CONF_DIR/hdfs-site.xml
    Set the replication factor: Hadoop defaults to 3. Since this deployment has one namenode and three datanodes you can leave it unset, but you should know where it is configured.
  3. Configure the namenode storage path, also in vi $HADOOP_CONF_DIR/hdfs-site.xml

  4. To start the second-generation Hadoop engine, YARN, copy mapred-site.xml.template and rename it to mapred-site.xml (run inside $HADOOP_CONF_DIR):
    cp mapred-site.xml.template mapred-site.xml


  5. vi $HADOOP_CONF_DIR/yarn-site.xml

  6. vi $HADOOP_CONF_DIR/slaves (list the datanodes dn1, dn2 and dn3, one per line)
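Minimal samples of the files above, as a sketch only: the port 9000 and the dfs.namenode.name.dir path are assumptions you may choose differently, while the hostnames come from the cluster set up earlier.

core-site.xml:
<configuration>
  <!-- switch from file:/// to HDFS; nn is the namenode host, 9000 is an assumed port -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nn:9000</value>
  </property>
</configuration>

hdfs-site.xml:
<configuration>
  <!-- replication factor; 3 is already the default, shown only so you know where it lives -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- where the namenode keeps its metadata (assumed path) -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/opt/hadoop2/dfs/name</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <!-- run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml:
<configuration>
  <!-- the resource manager runs on the master -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>nn</value>
  </property>
  <!-- auxiliary shuffle service needed by MapReduce jobs -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

slaves:
dn1
dn2
dn3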

8. Command tar zcf opt.tar.gz opt compresses the opt folder so it can be transferred to dn1, dn2 and dn3.
9. Command scp opt.tar.gz dn1:~ copies the archive, and scp ~/.bashrc dn1:~ copies the environment variables,
passing both to dn1; repeat for dn2 and dn3.

10. Log in to dn1, dn2 and dn3, decompress the archive and run source ~/.bashrc (a loop version is sketched below).
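The same distribution written as a loop, as a sketch (run from /home/hadoop on nn; the remote tar is executed over ssh so you do not have to log in to each node by hand):

for h in dn1 dn2 dn3; do
    scp opt.tar.gz ~/.bashrc $h:~       # copy the archive and the environment variables
    ssh $h "tar -zxf opt.tar.gz"        # unpack remotely
done

.bashrc is read automatically on the next login, so a manual source ~/.bashrc is only needed in a shell that is already open.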

11. Format the namenode (on nn only): hdfs namenode -format

12. The format succeeded when the reported exit status is 0.

13. Start HDFS: start-dfs.sh

14. In a browser, open your namenode's address on port 50070 (e.g. http://192.168.64.132:50070). If the page loads, the startup succeeded. Be sure to turn off the firewall first (see below).
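On CentOS 7 (an assumption; the commands differ on older releases) the firewall can be stopped and kept from starting again with:

systemctl stop firewalld
systemctl disable firewalld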

15. Create the HDFS home folder: hdfs dfs -mkdir -p /user/hadoop
Start YARN: start-yarn.sh

16. Use Java to read files from the Hadoop cluster:
Maven project created with IDEA:

pom.xml:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.hadoop</groupId>
  <artifactId>hadoop</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>war</packaging>

  <name>hadoop Maven Webapp</name>
  <!-- FIXME change it to the project's website -->
  <url>http://www.example.com</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.8.1</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.8.1</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.8.1</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
      <version>2.8.1</version>
    </dependency>
    
  </dependencies>

  <build>
    <finalName>hadoop</finalName>
    <pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
      <plugins>
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>3.1.0</version>
        </plugin>
        <!-- see http://maven.apache.org/ref/current/maven-core/default-bindings.html#Plugin_bindings_for_war_packaging -->
        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.8.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.22.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-war-plugin</artifactId>
          <version>3.2.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-install-plugin</artifactId>
          <version>2.5.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-deploy-plugin</artifactId>
          <version>2.8.2</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>
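Note that new Configuration() only knows about the cluster if core-site.xml (and hdfs-site.xml) are on the classpath, for example copied into src/main/resources; otherwise FileSystem.get() falls back to the local file system. An alternative, sketched below with the assumed fs.defaultFS address from the configuration step, is to set the namenode address in code. Relative paths such as out1/part-r-00000 are then resolved against the HDFS home directory /user/hadoop created in step 15.

Configuration conf = new Configuration();
// hdfs://nn:9000 is the assumed fs.defaultFS value used in core-site.xml above
conf.set("fs.defaultFS", "hdfs://nn:9000");
FileSystem fs = FileSystem.get(conf);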



import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

// Read data from the Hadoop cluster, going through the namenode
public class ReadHdfs {
    public static void main(String[] args) throws IOException {
        // Get the file system
        FileSystem fs = FileSystem.get(new Configuration());
        // Open the file at the given (relative) path
        FSDataInputStream fis = fs.open(new Path("out1/part-r-00000"));
        byte[] buffer = new byte[2048];
        while (true) {
            // Read the next chunk of the file into the buffer
            int n = fis.read(buffer);
            // n == -1 means the file has been read completely
            if (n == -1) {
                break;
            }
            System.out.println(new String(buffer, 0, n));
        }
        // Close the resource
        fis.close();
    }
}

17. Write files to hadoop:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.FileInputStream;
import java.io.IOException;

//Write data to hadoop cluster
public class WriteHdfs {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        Path path = new Path("data4");
        // Create the data4 folder
        fs.mkdirs(path);
        // Read the file from the local path (backslashes must be escaped in Java strings)
        FileInputStream fis =
                new FileInputStream("E:\\javaWorkSpace\\hadoop\\src\\main\\java\\ReadHdfs.java");
        // Write the local file into the data4 folder as ReadHdfs.java
        FSDataOutputStream fos = fs.create(new Path(path, "ReadHdfs.java"));
        byte[] buffer = new byte[2048];
        while (true) {
            int n = fis.read(buffer);
            if (n == -1) {
                break;
            }
            // Write bytes 0..n to the Hadoop cluster
            fos.write(buffer, 0, n);
            fos.hflush();
        }
        // Close the resources
        fis.close();
        fos.close();
    }
}

18. If jps and other commands report "command not found", install the JDK development package:
1) yum list java-1.8.0-openjdk-devel
2) yum install java-1.8.0-openjdk-devel.x86_64
