Introduction to machine learning: Chapter 19, deep belief networks

Deep belief network (DBN)

The deep belief network (DBN) was proposed by Geoffrey Hinton in 2006. It is a generative model: by training the weights among its neurons, we make the whole network assign maximum probability to the training data. A DBN can therefore be used not only to extract features and classify data, but also to generate data.

Since its introduction, the DBN has been widely adopted because of its effectiveness, and it helped set the direction of machine learning research.
A DBN is composed of two parts: a bottom network and a top network. The bottom network is a stack of RBMs (restricted Boltzmann machines); the top layer is usually a logistic regression layer. During training, the bottom network is pre-trained first, and then the top network is trained.
The bottom network is structured as follows: the number of layers is customizable, and the hidden units h of one RBM serve as the visible units v of the next, and so on up the stack.

The top layer is a logistic regression layer, and the hidden units h of the last bottom layer serve as its input during training.

Build DBN structure:

# construct the multi-layer stack (inside the DBN constructor)
for i in range(self.n_layers):
    # layer size: the first layer reads the raw input,
    # later layers read the previous hidden layer
    if i == 0:
        input_size = n_ins
    else:
        input_size = hidden_layer_sizes[i - 1]

    # layer input: sampled hidden activations of the previous layer
    if i == 0:
        layer_input = self.x
    else:
        layer_input = self.sigmoid_layers[-1].sample_h_given_v()

    # construct sigmoid_layer
    sigmoid_layer = HiddenLayer(input=layer_input,
                                n_in=input_size,
                                n_out=hidden_layer_sizes[i],
                                numpy_rng=numpy_rng,
                                activation=sigmoid)
    self.sigmoid_layers.append(sigmoid_layer)

    # construct rbm_layer; W and b are shared with the sigmoid layer,
    # so pre-training the RBM also trains the stack
    rbm_layer = RBM(input=layer_input,
                    n_visible=input_size,
                    n_hidden=hidden_layer_sizes[i],
                    W=sigmoid_layer.W,
                    hbias=sigmoid_layer.b)
    self.rbm_layers.append(rbm_layer)


# output layer: logistic regression on the last hidden layer's samples
self.log_layer = LogisticRegression(input=self.sigmoid_layers[-1].sample_h_given_v(),
                                    label=self.y,
                                    n_in=hidden_layer_sizes[-1],
                                    n_out=n_outs)
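
The snippet relies on HiddenLayer and its sample_h_given_v method, which are defined elsewhere in the source. As a rough sketch of what they likely do (the constructor body and the uniform weight initialization here are assumptions for illustration, not the original code):

import numpy

def sigmoid(x):
    return 1.0 / (1.0 + numpy.exp(-x))

class HiddenLayer(object):
    # minimal sketch; the real class also honors the activation argument
    def __init__(self, input, n_in, n_out, numpy_rng, activation=None):
        self.input = input
        self.numpy_rng = numpy_rng
        a = 1.0 / n_in
        self.W = numpy_rng.uniform(low=-a, high=a, size=(n_in, n_out))
        self.b = numpy.zeros(n_out)

    def output(self, input=None):
        if input is not None:
            self.input = input
        return sigmoid(numpy.dot(self.input, self.W) + self.b)

    def sample_h_given_v(self, input=None):
        # draw binary hidden states from the sigmoid activations
        p = self.output(input=input)
        return self.numpy_rng.binomial(n=1, p=p, size=p.shape)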

Pre-train the bottom network layer by layer, paying attention to how each layer's input is derived from the output of the layer below:

def pretrain(self, lr=0.1, k=1, epochs=100):
    # pre-train layer-wise: each RBM is trained on the (sampled)
    # output of the layer below it
    for i in range(self.n_layers):
        if i == 0:
            layer_input = self.x
        else:
            layer_input = self.sigmoid_layers[i - 1].sample_h_given_v(layer_input)
        rbm = self.rbm_layers[i]

        for epoch in range(epochs):
            rbm.contrastive_divergence(lr=lr, k=k, input=layer_input)
            # cost = rbm.get_reconstruction_cross_entropy()
            # print('Pre-training layer %d, epoch %d, cost %f' % (i, epoch, cost),
            #       file=sys.stderr)
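
The pre-training loop calls rbm.contrastive_divergence, which is not shown. Below is a minimal sketch of one CD-k step for a binary RBM; the attribute names vbias and numpy_rng, and the mean-field positive phase, are assumptions for illustration rather than the original implementation:

import numpy

def sigmoid(x):
    return 1.0 / (1.0 + numpy.exp(-x))

def contrastive_divergence(self, lr=0.1, k=1, input=None):
    v0 = input
    # positive phase: hidden activations driven by the data
    ph_mean = sigmoid(numpy.dot(v0, self.W) + self.hbias)
    ph_sample = self.numpy_rng.binomial(n=1, p=ph_mean, size=ph_mean.shape)

    # negative phase: k steps of alternating Gibbs sampling
    h_sample = ph_sample
    for _ in range(k):
        nv_mean = sigmoid(numpy.dot(h_sample, self.W.T) + self.vbias)
        nv_sample = self.numpy_rng.binomial(n=1, p=nv_mean, size=nv_mean.shape)
        nh_mean = sigmoid(numpy.dot(nv_sample, self.W) + self.hbias)
        h_sample = self.numpy_rng.binomial(n=1, p=nh_mean, size=nh_mean.shape)

    # CD-k update: move toward the data statistics,
    # away from the model's reconstruction statistics
    n = v0.shape[0]
    self.W += lr * (numpy.dot(v0.T, ph_mean) - numpy.dot(nv_sample.T, nh_mean)) / n
    self.vbias += lr * numpy.mean(v0 - nv_sample, axis=0)
    self.hbias += lr * numpy.mean(ph_mean - nh_mean, axis=0)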

Finally, train the logistic regression layer:

def finetune(self, lr=0.1, epochs=100):
    layer_input = self.sigmoid_layers[-1].sample_h_given_v()

    # train log_layer on the top-level features
    epoch = 0
    done_looping = False          # hook for an early-stopping criterion
    while (epoch < epochs) and (not done_looping):
        self.log_layer.train(lr=lr, input=layer_input)
        # self.finetune_cost = self.log_layer.negative_log_likelihood()
        # print('Training epoch %d, cost is %f' % (epoch, self.finetune_cost),
        #       file=sys.stderr)

        lr *= 0.95   # decay the learning rate geometrically
        epoch += 1
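
Note that lr *= 0.95 decays the learning rate geometrically, so after 100 epochs it has shrunk to roughly 0.1 × 0.95^99 ≈ 0.0006. The log_layer.train call presumably performs one gradient step on the softmax cross-entropy; a minimal sketch, assuming attributes self.x, self.y (one-hot labels), self.W, and self.b:

import numpy

def softmax(x):
    e = numpy.exp(x - numpy.max(x, axis=1, keepdims=True))  # numerically stabilized
    return e / numpy.sum(e, axis=1, keepdims=True)

class LogisticRegression(object):
    # assumed attributes: self.x (inputs), self.y (one-hot labels),
    # self.W (n_in x n_out), self.b (n_out)
    def train(self, lr=0.1, input=None):
        if input is not None:
            self.x = input
        p_y_given_x = softmax(numpy.dot(self.x, self.W) + self.b)
        d_y = self.y - p_y_given_x              # gradient of the log-likelihood
        self.W += lr * numpy.dot(self.x.T, d_y)
        self.b += lr * numpy.mean(d_y, axis=0)

    def predict(self, x):
        return softmax(numpy.dot(x, self.W) + self.b)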

Because a logistic regression layer sits on top, a DBN is used for prediction exactly like a softmax classifier. The prediction code is as follows:

def predict(self, x):
    layer_input = x

    # propagate the input through every sigmoid layer
    for i in range(self.n_layers):
        sigmoid_layer = self.sigmoid_layers[i]
        layer_input = sigmoid_layer.output(input=layer_input)

    # the logistic layer turns the top-level features into class probabilities
    out = self.log_layer.predict(layer_input)
    return out
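
Tying the pieces together, a complete run on a toy dataset might look like the following; the DBN constructor signature and the tiny binary dataset are assumptions based on the snippets above, not part of the original post:

import numpy

rng = numpy.random.RandomState(123)

# tiny synthetic dataset: 6 binary samples, 2 classes (one-hot labels)
train_x = numpy.array([[1, 1, 1, 0, 0, 0],
                       [1, 0, 1, 0, 0, 0],
                       [1, 1, 1, 0, 0, 0],
                       [0, 0, 1, 1, 1, 0],
                       [0, 0, 1, 1, 0, 0],
                       [0, 0, 1, 1, 1, 0]])
train_y = numpy.array([[1, 0]] * 3 + [[0, 1]] * 3)

dbn = DBN(input=train_x, label=train_y, n_ins=6,
          hidden_layer_sizes=[4, 3], n_outs=2, numpy_rng=rng)

dbn.pretrain(lr=0.1, k=1, epochs=100)   # layer-wise RBM pre-training
dbn.finetune(lr=0.1, epochs=100)        # train the logistic layer on top
print(dbn.predict(train_x))             # class probabilities per sample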

The DBN is an important direction in machine learning and one reason the field has become so hot. Many theoretical points are involved, but the derivations are not given here; readers who need them can consult the literature, since dwelling on derivations would make the application workflow harder to follow. Although the DBN looks similar to earlier neural networks, its theoretical basis is different, and it opened a new direction for deep learning.
