[MindSpore] Classifying dog and wolf pictures with ResNet50, with the full code available for download

The examples used in this article come from the course on the MindSpore website. Here I share my own understanding of them and my integration of the related code.

Environment Configuration:

  • Windows 10
  • MindSpore 1.6.1 (CPU version)
  • Python 3.9.0

1. Downloading the dataset and the pre-trained model

Dataset Download

Pre-trained Model Download

Note: The above links are from the MindSpore website.

2. Define command line parameters and save them as config.yml

import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    # Project name (also used as the folder name for saved files)
    parser.add_argument("--name", default="resnet50_classify", help="The name of project.")
    # Dataset name
    parser.add_argument('--dataset', default='Canidae', help='dataset name')
    parser.add_argument("--epochs", default=200, type=int, metavar='N')
    parser.add_argument('--batch_size', default=16, type=int, metavar='N')

    # Path of the pre-trained weight file
    parser.add_argument('--pre_ckpt', default="./pre_ckpt/resnet50.ckpt")
    # Whether to delete the fully connected layer of the pre-trained model.
    # No type is given, so a value passed on the command line arrives as a string.
    parser.add_argument("--delfc_flag", default=True)
    # Input picture channels, default is an RGB three-channel picture
    parser.add_argument('--input_channels', default=3, type=int, help='input channels')
    # Number of categories
    parser.add_argument('--num_classes', default=20, type=int, help='number of classes')
    # Size of input image
    parser.add_argument('--image_size', default=128, type=int, help='image size')

    # Optimizer
    parser.add_argument('--optimizer', default='Adam')
    # Loss function
    parser.add_argument('--loss', default='SoftmaxCrossEntropyWithLogits')
    parser.add_argument('--dataset_sink_mode', default=False)

    config = parser.parse_args()
    return config

With argparse you can define your own command line parameters, which makes it easy to change training parameters. The parameters can then be saved to config.yml as a configuration file, which also makes it easy for you or others to understand the parameters associated with this network.
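
As a small illustration of overriding the defaults, the sketch below builds a reduced parser and passes an argument list straight to parse_args. One detail worth knowing: an option declared without a type receives the raw string from the command line, so passing "--delfc_flag False" to the parser above would yield the non-empty string "False", which counts as true. The str2bool helper and the script name train.py are hypothetical and not part of the original project.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn common true/false spellings into a bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    # Reduced version of the parser above, for illustration only
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", default=200, type=int, metavar="N")
    parser.add_argument("--batch_size", default=16, type=int, metavar="N")
    parser.add_argument("--delfc_flag", default=True, type=str2bool)
    return parser


# Equivalent to running: python train.py --epochs 50 --delfc_flag False
args = build_parser().parse_args(["--epochs", "50", "--delfc_flag", "False"])
config = vars(args)
print(config)
```

Options that are not passed keep their defaults, so batch_size stays at 16 here.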

def main():
    # Parse command line parameters. The config type is a dictionary, the key is the previously defined command line parameter name, and the value is its content.
    config = vars(parse_args())
    # Create a model project folder to save training files
    if config["name"] is None:
        config['name'] = "test"
    os.makedirs(os.path.join("models", config["name"]), exist_ok=True)

    # Create config file and write parameters
    config_path = 'models/%s/config.yml' % config['name']
    with open(config_path, 'w') as f:
        yaml.dump(config, f)

3. Data Set Preprocessing and Loading

The dataset is divided into training set and validation set.

MindSpore provides the mindspore.dataset.ImageFolderDataset function, which makes it easy to read datasets. Official Explanation of Functions

To use this function, however, you need to store the data in a prescribed layout. For example, this dataset can be stored in the following way.

  │   └─dogs
  │   └─wolves

The data is stored in a single directory by creating folders for the pictures to be classified, one for each category.

For example, the function automatically reads all folders in the train directory, treating all pictures in the same folder as the same category.

Each category of picture is given a numeric label, assigned in the order of the folder names. Pictures in the first folder read are labeled 0, pictures in the second folder are labeled 1, and so on until all folders have been read.

This is the function's default behavior, but it can leave you unsure which category corresponds to which label, which makes later testing and inference more cumbersome. So we can use the class_indexing parameter of this function, which directly specifies the relationship between folder and label.

You can pass in a dictionary type for this parameter. The key of the dictionary is the folder name, and the value is the corresponding number.

For example:

dataset = ds.ImageFolderDataset(dataset_dir=image_folder_dataset_dir,
                                class_indexing={"dogs":0, "wolves":1})

This directly specifies that dogs are 0 and wolves are 1. This clarifies the specific relationship between label and category.

Therefore, I recommend using the category name directly as the folder name. You can then read the folder names with a function, assign a number to each folder, build a dictionary, and save it to the config file, so that the relationship between category and label is clear at test time.

def getClasses(data_path, config_path):
    """
    Load the names of all folders in the path and sort them in ascending order.
    Record the first folder as 0, the second as 1, and so on.
    Then write the folder names and their labels to the config file, for use as dataset labels.

    Args:
        data_path (str): Dataset path
        config_path (str): Config file path

    Returns:
        dict: Mapping from folder name to label
    """
    res_dict = {"classes": {}}
    # os.listdir returns names in arbitrary order, so sort them explicitly
    data_list = sorted(os.listdir(data_path))
    for index, data in enumerate(data_list):
        res_dict["classes"][data] = index
    # Save to config file
    # RoundTripDumper comes from ruamel.yaml (e.g. "from ruamel import yaml"); PyYAML has no such class
    with open(config_path, 'a+') as f:
        yaml.dump(res_dict, f, Dumper=yaml.RoundTripDumper)
    return res_dict["classes"]

# Get the labels, i.e. the mapping from category names to numbers
classesDict = getClasses(train_data_path, config_path)

The create_dataset function is given in the official tutorial. It reads and processes the dataset fairly completely. I'll skip a detailed walkthrough and put what I have to say in the comments below.

Note the training parameter. When it is True, the training set is being processed; when it is False, the validation set. That is, only the training set is augmented, and the validation set is not.

import mindspore.dataset as ds
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore import dtype as mstype


def create_dataset(data_path, image_size, classDict=None, batch_size=24, repeat_num=1, training=True):
    """Define Dataset"""
    # By default, folder names are sorted (alphabetically), and each class is given a unique index starting at 0.
    # Passing class_indexing fixes the folder-to-label mapping explicitly.
    data_set = ds.ImageFolderDataset(data_path, class_indexing=classDict)

    # Augment data
    mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
    std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
    if training:
        trans = [
            # Random crop, decode, resize
            CV.RandomCropDecodeResize(image_size, scale=(0.8, 1.0), ratio=(0.75, 1.333)),
            # Horizontal flip
            CV.RandomHorizontalFlip(),
            # Random rotation (the angle is not given in the text; 15 degrees is an assumed value)
            CV.RandomRotation(degrees=15),
            # Normalization
            CV.Normalize(mean=mean, std=std),
            # h,w,c -> c,h,w
            CV.HWC2CHW(),
        ]
    else:
        trans = [
            # Decode
            CV.Decode(),
            # Resize
            CV.Resize((image_size, image_size)),
            CV.Normalize(mean=mean, std=std),
            # h,w,c -> c,h,w
            CV.HWC2CHW(),
        ]
    type_cast_op = C.TypeCast(mstype.int32)
    # Map the transforms over the data, then batch and repeat
    data_set = data_set.map(operations=trans, input_columns="image", num_parallel_workers=8)
    data_set = data_set.map(operations=type_cast_op, input_columns="label", num_parallel_workers=8)
    data_set = data_set.batch(batch_size, drop_remainder=True)
    data_set = data_set.repeat(repeat_num)

    return data_set

4. Training Process

The main parameter configuration and the dataset loading are now done. MindSpore wraps model loading and training very well, and they can be invoked in a few lines, so there is not much to say here. All the project files are available for download at the end of the article.

5. Testing Process

The first thing to load is the class labels, which were previously saved in the config file.

You also need to swap the keys and values of the dictionary so that category names can be looked up directly from the predicted results.

# Mapping from category name to label, as saved in the config file
classesDict = config["classes"]
# Swap the keys and values of the dictionary
class_name = dict(zip(classesDict.values(), classesDict.keys()))
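
A minimal sketch of this lookup, assuming the two-class mapping from the class_indexing example (dogs as 0, wolves as 1). The predicted labels are made up for illustration.

```python
# Mapping as it would be loaded from config["classes"] (assumed values)
classesDict = {"dogs": 0, "wolves": 1}

# Swap keys and values: label number -> category name
class_name = {label: name for name, label in classesDict.items()}

# Hypothetical predicted labels, one per test picture
predictions = [1, 0, 0]
names = [class_name[p] for p in predictions]
print(names)  # ['wolves', 'dogs', 'dogs']
```

The dict comprehension gives the same result as dict(zip(values, keys)); it only works as intended when every label is unique.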

Loading Test Data and Preprocessing

This part is critical. Unlike in training, the test data is read with a custom loading class.

class testDataset:
    """
    Custom class for reading pictures.
    For test use only; it does not return a label.
    The first value is the image used for classification.
    The second value is the image ID (file name and suffix).
    The third value is the image used for display.
    """
    def __init__(self, data_path):
        self.data_path = data_path
        self.imgList = os.listdir(self.data_path)

    def __getitem__(self, index):
        imgID = self.imgList[index]
        img = cv2.imread(os.path.join(self.data_path, imgID))
        # Image preprocessing
        img = cv2.resize(img, (256, 256))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = img.astype(np.float32)
        return img, imgID, img

    def __len__(self):
        return len(self.imgList)

# Read test data and preprocess it
testData = testDataset(test_data_path)
# Image is used for classification processing; imgID is the picture name and suffix; imageInit is used to display the original image
dataset = ds.GeneratorDataset(testData, ["image", "imgID", "imageInit"])

# Image preprocessing: normalization and channel reordering (HWC -> CHW)
mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
dataset = dataset.map(operations=[CV.Normalize(mean=mean, std=std), CV.HWC2CHW()],
                      input_columns="image")

This part may be more difficult to understand.

First, you define your own testDataset class to read the dataset and do simple preprocessing. This class's __getitem__ function returns three values: img, imgID, img. What these three values mean is explained in the class docstring.

ds.GeneratorDataset creates a dataset. The first parameter is an instance of the testDataset class written earlier, and the second is the list of column names (the column_names parameter), which I think can be understood as channel names. The three values returned by __getitem__ of testDataset are each given a separate name through ds.GeneratorDataset, and the corresponding data can then be accessed by that name.

This part may be hard to follow, and my explanation may not be detailed or concise enough, so I recommend stepping through it in a debugger to get a better feel for how these functions are used.
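
To make the role of the column names concrete, here is a plain Python sketch of the idea. It does not use MindSpore and is not how ds.GeneratorDataset is implemented; FakeSource and iterate_as_dicts are hypothetical names. It only shows how each value returned by __getitem__ is paired with a column name by position.

```python
class FakeSource:
    """Stand-in for testDataset: returns three values per index."""

    def __init__(self, names):
        self.names = names

    def __getitem__(self, index):
        name = self.names[index]
        image = "pixels of " + name  # placeholder for the image array
        return image, name, image

    def __len__(self):
        return len(self.names)


def iterate_as_dicts(source, column_names):
    """Yield one dict per sample, pairing each returned value with a column name."""
    for index in range(len(source)):
        yield dict(zip(column_names, source[index]))


source = FakeSource(["a.jpg", "b.jpg"])
rows = list(iterate_as_dicts(source, ["image", "imgID", "imageInit"]))
print(rows[0]["imgID"])  # a.jpg
```

The first returned value ends up under "image", the second under "imgID", and the third under "imageInit", which is why the order of the names must match the order of the returned values.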

This is followed by the basic operations of normalizing the image and changing the channel order from HWC to CHW.
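
As a rough NumPy sketch of what these two operations compute (this mimics the arithmetic of CV.Normalize and CV.HWC2CHW as I understand them; it is not MindSpore code, and the random input image is made up):

```python
import numpy as np

mean = np.array([0.485 * 255, 0.456 * 255, 0.406 * 255], dtype=np.float32)
std = np.array([0.229 * 255, 0.224 * 255, 0.225 * 255], dtype=np.float32)

# Made-up RGB image in height x width x channel layout, values in [0, 255]
rng = np.random.default_rng(0)
img_hwc = rng.integers(0, 256, size=(256, 256, 3)).astype(np.float32)

# Normalize: subtract the per-channel mean and divide by the per-channel std
normalized = (img_hwc - mean) / std

# HWC -> CHW: move the channel axis to the front
img_chw = normalized.transpose(2, 0, 1)

print(img_chw.shape)  # (3, 256, 256)
```

After normalization the values are no longer in the 0 to 255 range, which is why a separate unprocessed copy of the image is kept for display.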

Prediction process

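import numpy as np
from mindspore import Tensor
from mindspore.ops import ExpandDims

# Used to store predictions
pred_list = []
# Store display pictures
img_list = []
# Picture names
imgID_list = []
# Operator that adds a dimension
expand = ExpandDims()
for data in dataset.create_dict_iterator():
    # Add the unprocessed image to the list of pictures to display
    img_list.append(data["imageInit"].asnumpy())
    # Predict using the processed image
    image = data["image"].asnumpy()
    # Add a batch dimension: (C, H, W) -> (1, C, H, W)
    img = expand(Tensor(image), 0)
    output = model.predict(img)
    pred = np.argmax(output.asnumpy(), axis=1)[0]
    # Store the predicted label
    pred_list.append(pred)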

My prediction here does not work in batches (batch_size); it predicts one picture at a time, so a dimension expansion is required.

To predict in batches instead, apply batch processing to the dataset before this step.

The final output, pred, is a number corresponding to one of the dataset's categories. The specific correspondence can be looked up in the class_name dictionary.
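
The shape handling can be sketched with NumPy alone. The image, the scores, and the two-class mapping below are made up for illustration; np.expand_dims stands in for the ExpandDims operator.

```python
import numpy as np

# Made-up single image in channel x height x width layout
image = np.zeros((3, 256, 256), dtype=np.float32)

# Add a leading batch axis: (3, 256, 256) -> (1, 3, 256, 256)
batch = np.expand_dims(image, 0)
print(batch.shape)  # (1, 3, 256, 256)

# Made-up model output for that batch: one row of scores per picture
output = np.array([[0.2, 1.7]], dtype=np.float32)

# argmax along axis 1 picks the highest-scoring class in each row;
# [0] takes the result for the only picture in the batch
pred = int(np.argmax(output, axis=1)[0])

# Assumed mapping, matching the class_indexing example (dogs 0, wolves 1)
class_name = {0: "dogs", 1: "wolves"}
print(class_name[pred])  # wolves
```

With a real batch of N pictures the output would have N rows, and np.argmax(output, axis=1) would return N labels at once.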

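# Visualize the model predictions
plt.figure(figsize=(12, 5))
# Show 24 pictures on a 3 x 9 grid
for i in range(len(img_list)):
    plt.subplot(3, 9, i+1)
    # Scale to [0, 1] so that float data can be displayed
    picture_show = img_list[i]/np.amax(img_list[i])
    picture_show = np.clip(picture_show, 0, 1)
    plt.imshow(picture_show)
    # Title each picture with its predicted category name
    plt.title(class_name[pred_list[i]])
plt.show()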

This produces the visual output: the predictions are displayed together with the corresponding pictures.

Note that the pictures shown here come from the imageInit column. The pictures in the image column have been normalized and had their channel order changed, so they cannot be displayed directly.


6. Complete Code Download

GitHub Download

CSDN Download

Tags: Python image processing Deep Learning Computer Vision

Posted by BrettCarr on Sat, 16 Apr 2022 01:33:14 +0930