Character stream and encoding format

Character stream

Character stream overview

Character stream: can only operate on character (text) files
// Copying files: whatever the file type, use a byte stream!
// Byte streams are awkward for text files, so a character stream is recommended for them
// Character file: a file that opens as readable text in Windows Notepad
    
The essence of a character stream: underneath it is a byte stream, wrapped in a higher-level API // a character stream is a wrapper stream with a buffer
Character stream = byte stream + encoding format
// Output written to a character output stream stays in the buffer until it is flushed (by flush() or close())
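
The relationship "character stream = byte stream + encoding format" can be sketched with OutputStreamWriter, the JDK class that turns characters into bytes. This is only an illustration, not part of the original notes: the class name CharStreamSketch and the helper encodeThroughWriter are made up for the example, and an in-memory ByteArrayOutputStream stands in for a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharStreamSketch {
    // Hypothetical helper: pushes text through a character stream that sits
    // on top of a byte stream and returns the bytes that reached the bottom.
    static byte[] encodeThroughWriter(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
            // Characters may sit in the writer's buffer until flush() or close()
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // "\u4e2d" is the Chinese character for "middle"
        byte[] utf8 = encodeThroughWriter("A\u4e2d", StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 4: one byte for 'A', three for the Chinese character
    }
}
```

The same text encoded with a different charset would produce different bytes, which is why the encoding format is part of the definition.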

Root of the character stream hierarchy: Reader/Writer -> abstract classes
Plain character streams: FileWriter/FileReader -> less commonly used, less efficient for many small reads and writes
Efficient character streams: BufferedWriter/BufferedReader -> commonly used, more efficient

FileWriter/FileReader

FileWriter/FileReader: plain character streams

FileWriter: file character output stream
Constructors:

FileWriter(String fileName) // target file

FileWriter(File file) // target file
// Append mode (same idea as FileOutputStream): pass true for append to add to the end of the file instead of overwriting it

FileWriter(String fileName, boolean append) 

FileWriter(File file, boolean append)

Write methods:

Write one character : void write(int ch)

Write a whole character array : void write(char[] chs)

Write part of a character array : void write(char[] chs, int offset, int length)

Write a whole string : void write(String str)

Write part of a string : void write(String str, int offset, int length)
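
The constructors and write methods listed above can be tried out together. The following is a small sketch rather than anything from the original notes: the class FileWriterSketch and the method writeDemo are invented names, and a temporary file is used so that no real path has to be assumed.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileWriterSketch {
    // Hypothetical demo: uses every write overload once, appends once,
    // and returns what ended up in the temporary file.
    static String writeDemo() {
        try {
            Path target = Files.createTempFile("filewriter-sketch", ".txt");
            try (FileWriter fw = new FileWriter(target.toFile())) { // overwrite mode
                fw.write('a');                              // one character
                fw.write(new char[] {'b', 'c'});            // a whole character array
                fw.write(new char[] {'x', 'd', 'y'}, 1, 1); // part of an array: "d"
                fw.write("ef");                             // a whole string
                fw.write("__gh__", 2, 2);                   // part of a string: "gh"
            }
            try (FileWriter fw = new FileWriter(target.toFile(), true)) { // append mode
                fw.write("!");
            }
            // Only ASCII was written, so decoding as US-ASCII is safe here
            String content = new String(Files.readAllBytes(target), StandardCharsets.US_ASCII);
            Files.delete(target);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeDemo()); // abcdefgh!
    }
}
```

Opening the second FileWriter without the append flag would have replaced the first eight characters instead of adding to them.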

FileReader: file character input stream
Constructors:

FileReader(String fileName) // source file

FileReader(File file) // source file

Read methods (each returns -1 at the end of the file):

Read one character : int read()

Fill a character array : int read(char[] chs) // returns the number of characters actually read

Fill part of a character array : int read(char[] chs, int offset, int length)
        // FileReader has no method that reads a whole string in one call
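
Because FileReader cannot return a string directly, a common pattern is to collect the characters from read(char[]) into a StringBuilder. The sketch below assumes nothing about real paths: FileReaderSketch and roundTrip are illustrative names, and the method writes its argument to a temporary file first so that there is something to read back.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FileReaderSketch {
    // Hypothetical demo: writes text to a temporary file, then reads it back
    // through FileReader with a deliberately tiny buffer.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("filereader-sketch", ".txt");
            try (FileWriter fw = new FileWriter(tmp.toFile())) {
                fw.write(text);
            }
            StringBuilder sb = new StringBuilder();
            try (FileReader fr = new FileReader(tmp.toFile())) {
                char[] chs = new char[4]; // small on purpose, to force several passes
                int len;                  // number of characters read on this pass
                while ((len = fr.read(chs)) != -1) {
                    // Append only the part of the array that was filled
                    sb.append(chs, 0, len);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, reader")); // hello, reader
    }
}
```

Using len rather than chs.length matters on the last pass, when the array is usually only partly filled.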

Standard code

Plain character streams copying a text file one character at a time:

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyCharByChar {
    public static void main(String[] args) throws IOException {
        // Create the streams; the source is opened first so that a missing
        // source file does not leave an empty destination file behind
        FileReader fr = new FileReader("Source file path");
        FileWriter fw = new FileWriter("Destination file path");

        // Read/write loop
        int ch; // holds the character read on each pass, or -1 at end of file

        while ((ch = fr.read()) != -1) {
            // Write each character as soon as it is read
            fw.write(ch);
            // Manual flush (optional: close() also flushes, and flushing per character is slow)
            fw.flush();
        }

        // Close the streams
        fw.close();
        fr.close();
    }
}

Plain character streams copying a text file one character array at a time:

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyByCharArray {
    public static void main(String[] args) throws IOException {
        // Create the streams; the source is opened first so that a missing
        // source file does not leave an empty destination file behind
        FileReader fr = new FileReader("Source file path");
        FileWriter fw = new FileWriter("Destination file path");

        // Read/write loop
        char[] chs = new char[1024];
        int len; // number of characters read on each pass, or -1 at end of file

        while ((len = fr.read(chs)) != -1) {
            // Write exactly as many characters as were read
            fw.write(chs, 0, len);
            // Manual flush (optional: close() also flushes)
            fw.flush();
        }

        // Close the streams
        fw.close();
        fr.close();
    }
}

BufferedWriter/BufferedReader

BufferedWriter/BufferedReader: efficient character output / input streams (the buffer is flushed automatically when it fills up and when the stream is closed)
    
Constructors:
    BufferedWriter(Writer writer)
    BufferedReader(Reader reader)
        
General read and write methods (inherited from Writer/Reader):
Write methods:

Write one character : void write(int ch)

Write a whole character array : void write(char[] chs)

Write part of a character array : void write(char[] chs, int offset, int length)

Write a whole string : void write(String str)

Write part of a string : void write(String str, int offset, int length)

Read methods (each returns -1 at the end of the stream):

Read one character : int read()

Fill a character array : int read(char[] chs)

Fill part of a character array : int read(char[] chs, int offset, int length)

Special methods:

Read one line, without its line terminator : String readLine() // BufferedReader only; returns null at the end of the stream

Write a line separator : void newLine() // BufferedWriter only
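
BufferedReader does add one way to read text a string at a time, readLine(), and BufferedWriter has the matching newLine(). The sketch below shows how the pair behaves, using in-memory StringReader and StringWriter instead of files; BufferedLineSketch, readLines and joinLines are illustrative names, not part of the original notes.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BufferedLineSketch {
    // Splits text into lines with readLine(); terminators are not included.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) { // null marks the end of the stream
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Writes each line followed by the platform line separator via newLine().
    static String joinLines(List<String> lines) {
        StringWriter out = new StringWriter();
        try (BufferedWriter bw = new BufferedWriter(out)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString(); // the writer was closed above, so its buffer is flushed
    }

    public static void main(String[] args) {
        System.out.println(readLines("a\nb\r\nc")); // [a, b, c]
    }
}
```

Since readLine() drops the terminator, a line-by-line copy has to call newLine() itself, and the copy ends up with the platform's line separator rather than whatever the source used.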

Standard code

Efficient character streams copying a text file one character at a time:

import java.io.*;

public class BufferedCopyCharByChar {
    public static void main(String[] args) throws IOException {
        // Create the streams; the source is opened first so that a missing
        // source file does not leave an empty destination file behind
        BufferedReader br = new BufferedReader(
                new FileReader("Source file path"));
        BufferedWriter bw = new BufferedWriter(
                new FileWriter("Destination file path"));

        // Read/write loop
        int ch; // holds the character read on each pass, or -1 at end of file

        while ((ch = br.read()) != -1) {
            // Write each character as soon as it is read
            bw.write(ch);
        }

        // Close the streams; closing the writer also flushes its buffer
        bw.close();
        br.close();
    }
}

Efficient character streams copying a text file one character array at a time:

import java.io.*;

public class BufferedCopyByCharArray {
    public static void main(String[] args) throws IOException {
        // Create the streams; the source is opened first so that a missing
        // source file does not leave an empty destination file behind
        BufferedReader br = new BufferedReader(new FileReader("Source file path"));
        BufferedWriter bw = new BufferedWriter(new FileWriter("Destination file path"));

        // Read/write loop
        char[] chs = new char[1024];
        int len; // number of characters read on each pass, or -1 at end of file

        while ((len = br.read(chs)) != -1) {
            // Write exactly as many characters as were read
            bw.write(chs, 0, len);
        }

        // Close the streams; closing the writer also flushes its buffer
        bw.close();
        br.close();
    }
}

Efficient character streams copying a text file one line at a time:

import java.io.*;

public class BufferedCopyByLine {
    public static void main(String[] args) throws IOException {
        // Create the streams; the source is opened first so that a missing
        // source file does not leave an empty destination file behind
        BufferedReader br = new BufferedReader(
                new FileReader("Source file path"));
        BufferedWriter bw = new BufferedWriter(
                new FileWriter("Destination file path"));

        // Read/write loop
        String line; // holds the line read on each pass, or null at end of file

        while ((line = br.readLine()) != null) {
            // Write each line as soon as it is read
            bw.write(line);
            // readLine() drops the line terminator, so add one back
            bw.newLine();
        }

        // Close the streams; closing the writer also flushes its buffer
        bw.close();
        br.close();
    }
}
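
The copies above close their streams by hand, so an exception in the loop would skip the close() calls. One possible variation, sketched here under assumptions, uses try-with-resources and names the charset explicitly, since FileReader and FileWriter otherwise fall back on the JVM's default charset. LineCopySketch, copyLines and demo are illustrative names, and the demo runs against temporary files.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineCopySketch {
    // Copies a text file line by line; both streams are closed automatically,
    // even if reading or writing throws.
    static void copyLines(File source, File target, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(
                     new InputStreamReader(new FileInputStream(source), charset));
             BufferedWriter bw = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(target), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line);
                bw.newLine();
            }
        }
    }

    // Hypothetical demo on temporary files; returns the copied content.
    static String demo() {
        try {
            Path source = Files.createTempFile("copy-source", ".txt");
            Path target = Files.createTempFile("copy-target", ".txt");
            Files.write(source, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
            copyLines(source.toFile(), target.toFile(), StandardCharsets.UTF_8);
            String copied = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(source);
            Files.delete(target);
            return copied;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Wrapping FileInputStream in InputStreamReader is the "byte stream + encoding format" idea written out in full.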

Encoding format

Encoding and decoding

Stored data: binary (01010101110011) -> bytes on the hard disk -> file (character file)

Encoding: writing binary (text -> binary) -> from human-readable to unreadable

Decoding: parsing binary (binary -> text) -> from unreadable to human-readable
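
In Java the two directions correspond to String.getBytes(charset) and new String(bytes, charset). The following sketch makes the round trip visible; the class EncodeDecodeSketch and its helper names are invented for the example, and UTF-8 is chosen arbitrarily as the encoding.

```java
import java.nio.charset.StandardCharsets;

class EncodeDecodeSketch {
    // Encoding: text -> bytes
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding: bytes -> text
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Renders each byte as eight binary digits, separated by spaces.
    static String toBits(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            String bits = Integer.toBinaryString(b & 0xFF);
            for (int i = bits.length(); i < 8; i++) {
                sb.append('0'); // left-pad to a full byte
            }
            sb.append(bits).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] bytes = encode("Hi");
        System.out.println(toBits(bytes)); // 01001000 01101001
        System.out.println(decode(bytes)); // Hi
    }
}
```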

Garbled text in files

Encoding: writing binary (text -(by which rule?)-> binary) -> from human-readable to unreadable

Decoding: parsing binary (binary -(by which rule?)-> text) -> from unreadable to human-readable

The rule for encoding / decoding: the encoding format (encoding table)

Root cause of garbled text: the bytes are decoded with a different encoding than the one used to encode them

Fix for a garbled file: make the two sides agree, either by converting the file to the encoding the platform uses or by reading it with the encoding it was written in
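
The mismatch is easy to reproduce in memory. The sketch below encodes a string with one charset and decodes it with another; MojibakeSketch and reencode are illustrative names, and the code assumes the running JRE ships the GBK charset (Charset.forName throws if it does not). The Chinese text is written with \u escapes so that the source file's own encoding cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeSketch {
    // Assumption: this JRE provides GBK; otherwise forName throws UnsupportedCharsetException
    static final Charset GBK = Charset.forName("GBK");

    // Encodes text with one charset and decodes the bytes with another.
    static String reencode(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the word "Chinese", two characters

        // Mismatch: written as GBK, read as UTF-8, so the output is garbled
        System.out.println(reencode(text, GBK, StandardCharsets.UTF_8));

        // Match: written and read as GBK, so the text comes back intact
        System.out.println(reencode(text, GBK, GBK).equals(text)); // true
    }
}
```

For files, the same fix applies at the stream level: pass the charset the file was written in to InputStreamReader instead of relying on the default.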

 

Encoding table

Encoding table: the "codebook" for converting between bytes and text // if encoding and decoding use different "codebooks", the result is garbled!

Character set: one character set can have several encoding tables (Unicode, for example, has UTF-8, UTF-16 and UTF-32)
The most basic character set: ASCII (American Standard Code for Information Interchange) -> only covers text in an English-language environment
// English letters and digits are not garbled between the tables listed here: the first 128 codes of GBK and UTF-8 are the same as ASCII
    
Chinese character sets: the GB series (Chinese national standards)
GB2312: early Chinese encoding table
GBK: a widely used Chinese encoding table, the default on Simplified Chinese Windows
// GBK uses 2 bytes to represent a Chinese character
                
International standardization (the Unicode Consortium): unified character encoding across languages, including for the World Wide Web
Designed a character set: the Unicode character set
UTF-8 encoding table: the dominant encoding for data on the Internet
// UTF-8 uses 3 bytes for most Chinese characters (rare ones outside the Basic Multilingual Plane take 4)
            UTF-16
            UTF-32
                
ANSI: not an encoding table itself -> the local character set: Windows maps "ANSI" to a concrete encoding according to the system locale (language and region settings), for example GBK on Simplified Chinese Windows
ANSI = the default character set of the current computer
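
The byte counts quoted above can be checked directly. This is a rough sketch with invented names (ByteCountSketch, byteCount); it assumes the JRE provides the GBK and UTF-32BE charsets, and it uses the big-endian UTF-16 and UTF-32 variants so that no byte order mark is added to the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteCountSketch {
    // Assumption: both charsets are available in this JRE
    static final Charset GBK = Charset.forName("GBK");
    static final Charset UTF_32BE = Charset.forName("UTF-32BE");

    // Number of bytes the text occupies in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "A";
        String common = "\u4e2d";     // a common Chinese character (U+4E2D)
        String rare = "\uD840\uDC00"; // U+20000, outside the Basic Multilingual Plane

        System.out.println(byteCount(letter, GBK));                         // 1
        System.out.println(byteCount(letter, StandardCharsets.UTF_8));      // 1
        System.out.println(byteCount(common, GBK));                         // 2
        System.out.println(byteCount(common, StandardCharsets.UTF_8));      // 3
        System.out.println(byteCount(common, StandardCharsets.UTF_16BE));   // 2
        System.out.println(byteCount(common, UTF_32BE));                    // 4
        System.out.println(byteCount(rare, StandardCharsets.UTF_8));        // 4

        // The charset this JVM uses when none is specified
        System.out.println(Charset.defaultCharset());
    }
}
```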

Tags: Java

Posted by newbiehacker on Tue, 19 Jul 2022 10:59:05 +0930