The examples used in this article come from the course on the MindSpore website. Here I share my own understanding of them, along with the related code integrated into a single project.
Environment Configuration:
- Windows 10
- MindSpore 1.6.1, CPU version
- Python 3.9.0
1. Downloading the dataset and the pre-trained model
Note: The above links are from the MindSpore website.
2. Define command line parameters and save them as config.yml
import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    # Project name
    parser.add_argument("--name", default="resnet50_classify", help="The name of project.")
    # Dataset name
    parser.add_argument('--dataset', default='Canidae', help='dataset name')
    parser.add_argument("--epochs", default=200, type=int, metavar='N')
    parser.add_argument('--batch_size', default=16, type=int, metavar='N')
    # Pre-trained weights: path of the pre-trained checkpoint file
    parser.add_argument('--pre_ckpt', default="./pre_ckpt/resnet50.ckpt")
    # Whether to delete the fully connected layer of the pre-trained model
    # (no type is given, so a value passed on the command line arrives as a string)
    parser.add_argument("--delfc_flag", default=True)
    # Number of input image channels; the default is a three-channel RGB image
    parser.add_argument('--input_channels', default=3, type=int, help='input channels')
    # Number of classes
    parser.add_argument('--num_classes', default=20, type=int, help='number of classes')
    # Size of the input image
    parser.add_argument('--image_size', default=128, type=int, help='image size')
    # Optimizer
    parser.add_argument('--optimizer', default='Adam')
    # Loss function
    parser.add_argument('--loss', default='SoftmaxCrossEntropyWithLogits')
    parser.add_argument('--dataset_sink_mode', default=False)
    config = parser.parse_args()
    return config
With argparse you can define custom command line parameters, which makes it easy to change the training settings. The parameters can then be saved as config.yml, which serves as a configuration file and also makes it easy for you or others to see the parameters associated with this network.
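One detail worth knowing about argparse: an option declared without a type receives its command line value as a string, and any non-empty string (including "False") is truthy in Python. The sketch below shows one possible way to parse boolean options explicitly. The helper name str2bool and the function build_parser are made up for this illustration; they are not part of argparse or MindSpore.

```python
import argparse


def str2bool(value):
    """Convert a command line string such as 'true' or '0' to a bool.

    Hypothetical helper for illustration; not part of argparse or MindSpore.
    """
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    parser = argparse.ArgumentParser()
    # Same option names as in parse_args(), here with an explicit type.
    parser.add_argument("--delfc_flag", default=True, type=str2bool)
    parser.add_argument("--dataset_sink_mode", default=False, type=str2bool)
    return parser


# Simulate passing "--delfc_flag false" on the command line.
args = build_parser().parse_args(["--delfc_flag", "false"])
print(args.delfc_flag, args.dataset_sink_mode)  # False False
```

With this in place, "--delfc_flag false" really switches the flag off, while the defaults are unchanged when the option is omitted.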
import os
import yaml


def main():
    # Parse the command line parameters. vars() turns the result into a dictionary:
    # the keys are the parameter names defined above, the values are their contents.
    config = vars(parse_args())

    # Create a project folder for the model, used to save the training files
    if config["name"] is None:
        config['name'] = "test"
    os.makedirs(os.path.join("models", config["name"]), exist_ok=True)

    # Create the config file and write the parameters to it
    config_path = 'models/%s/config.yml' % config['name']
    with open(config_path, 'w') as f:
        yaml.dump(config, f)
3. Data Set Preprocessing and Loading
The dataset is divided into training set and validation set.
MindSpore provides mindspore.dataset.ImageFolderDataset, which makes it easy to read datasets. See the official documentation of the function for details.
To use it, however, the data must be stored in a prescribed layout. This dataset, for example, can be stored as follows.
└─Canidae
    ├─train
    │   ├─dogs
    │   └─wolves
    └─val
        ├─dogs
        └─wolves
That is, inside a single directory you create one folder per category and put the pictures to be classified into the matching folder.
For example, the function automatically reads all folders in the train directory and treats all pictures in the same folder as one category.
The folders are read in order of folder name, and each category is given a numeric label: pictures in the first folder read are labeled 0, pictures in the second folder are labeled 1, and so on until all folders have been read.
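The default numbering described above can be imitated in a few lines of plain Python. This is only a sketch of the rule (sort the folder names, then count from 0), not MindSpore's actual implementation, and default_class_index is a name invented for this example.

```python
def default_class_index(folder_names):
    """Map folder names to labels in ascending name order, starting at 0."""
    return {name: label for label, name in enumerate(sorted(folder_names))}


# Folder names from the Canidae example, deliberately given out of order.
mapping = default_class_index(["wolves", "dogs"])
print(mapping)  # {'dogs': 0, 'wolves': 1}
```

The result depends only on how the folder names sort, which is why renaming or adding a folder can silently shift the labels.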
This is the function's default behavior, but it can leave you unsure which label corresponds to which category, which makes later testing and inference more cumbersome. The function's class_indexing parameter solves this by directly specifying the mapping between folders and labels.
You can pass in a dictionary type for this parameter. The key of the dictionary is the folder name, and the value is the corresponding number.
For example:
dataset = ds.ImageFolderDataset(dataset_dir=image_folder_dataset_dir, class_indexing={"dogs":0, "wolves":1})
This directly specifies that dogs are 0 and wolves are 1. This clarifies the specific relationship between label and category.
Therefore, I recommend using the category name directly as the folder name. A function can then read the folder names, assign a number to each one, build the dictionary, and save it to the config file, so that the category-to-label mapping is also available at test time.
def getClasses(data_path, config_path):
    """
    Read the names of all folders under data_path and sort them in ascending order.
    The first folder is labeled 0, the second 1, and so on.
    The folder names and their labels are then appended to the config file,
    where they are used to build the dataset labels.

    Args:
        data_path (str): dataset path
        config_path (str): config file path

    Returns:
        dict: mapping from folder name to numeric label
    """
    res_dict = {"classes": {}}
    data_list = os.listdir(data_path)
    data_list.sort()
    for index, data in enumerate(data_list):
        res_dict["classes"][data] = index

    # Append to the config file.
    # RoundTripDumper is provided by ruamel.yaml (from ruamel import yaml);
    # PyYAML's yaml module has no such class, so drop the Dumper argument there.
    with open(config_path, 'a+') as f:
        yaml.dump(res_dict, f, Dumper=yaml.RoundTripDumper)

    return res_dict["classes"]


# Get the labels, i.e. the mapping from category names to numbers
classesDict = getClasses(train_data_path, config_path)
The official tutorial provides a create_dataset function, which reads and processes the dataset in a fairly complete way. I won't walk through it here; everything I have to say about it is in the comments below.
Note the training parameter. True means the training set is being processed, False the validation set. In other words, data augmentation is applied only to the training set, not to the validation set.
# Import aliases assumed from their use below (module paths as in MindSpore 1.6)
import mindspore.dataset as ds
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore import dtype as mstype


def create_dataset(data_path, image_size, classDict=None, batch_size=24, repeat_num=1, training=True):
    """Define the dataset"""
    # By default, folder names are sorted (alphabetically), and each class is
    # given a unique index starting at 0.
    data_set = ds.ImageFolderDataset(data_path, num_parallel_workers=8, shuffle=True,
                                     class_indexing=classDict)

    # Normalization constants, scaled to the 0-255 pixel range
    mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
    std = [0.229 * 255, 0.224 * 255, 0.225 * 255]

    if training:
        # Training set: data augmentation
        trans = [
            # Random crop, decode, resize
            CV.RandomCropDecodeResize(image_size, scale=(0.8, 1.0), ratio=(0.75, 1.333)),
            # Horizontal flip
            CV.RandomHorizontalFlip(prob=0.5),
            # Random rotation
            CV.RandomRotation(90),
            # Normalization
            CV.Normalize(mean=mean, std=std),
            # h,w,c -> c,h,w
            CV.HWC2CHW()
        ]
    else:
        # Validation set: no augmentation
        trans = [
            # Decode
            CV.Decode(),
            # Resize
            CV.Resize((image_size, image_size)),
            CV.Normalize(mean=mean, std=std),
            CV.HWC2CHW()
        ]
    type_cast_op = C.TypeCast(mstype.int32)

    # Apply the map operations, then batch and repeat the data
    data_set = data_set.map(operations=trans, input_columns="image", num_parallel_workers=8)
    data_set = data_set.map(operations=type_cast_op, input_columns="label", num_parallel_workers=8)
    data_set = data_set.batch(batch_size, drop_remainder=True)
    data_set = data_set.repeat(repeat_num)

    return data_set
4. Training Process
With the main parameters recorded and the dataset loaded, the preparation is done. MindSpore wraps model loading and training very well, so both take only a few lines of code and there is not much to say about them. All the project files are available for download at the end of the article.
5. Testing Process
The first thing to load is the class labels, which were previously saved in the config file.
You also need to swap the keys and values of that dictionary, so that the category name can be looked up directly from the predicted label.
# Class labels saved in the config file (folder name -> label)
classesDict = config["classes"]
# Swap the keys and values of the dictionary (label -> folder name)
class_name = dict(zip(classesDict.values(), classesDict.keys()))
Loading Test Data and Preprocessing
This part is critical. Unlike in training, the data here is loaded with a custom class that reads the dataset.
import os

import cv2
import numpy as np


class testDataset:
    """
    Custom class for reading pictures. For test use only; it does not return a label.

    Return:
        The first value is the image used for classification.
        The second value is the image ID (file name and suffix).
        The third value is the image used for display.
    """
    def __init__(self, data_path):
        self.data_path = data_path
        self.imgList = os.listdir(self.data_path)

    def __getitem__(self, index):
        imgID = self.imgList[index]
        img = cv2.imread(os.path.join(self.data_path, imgID))
        # Image preprocessing: resize, BGR -> RGB, cast to float32
        img = cv2.resize(img, (256, 256))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = img.astype(np.float32)
        return img, imgID, img

    def __len__(self):
        return len(self.imgList)


# Read the test data and preprocess it
testData = testDataset(test_data_path)
# image is used for classification; imgID is the picture name and suffix;
# imageInit is used to display the original image
dataset = ds.GeneratorDataset(testData, ["image", "imgID", "imageInit"])

# Image preprocessing: mainly normalization and channel reordering
mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
dataset = dataset.map(operations=[CV.Normalize(mean=mean, std=std), CV.HWC2CHW()],
                      input_columns="image", num_parallel_workers=8)
This part may be harder to understand.
First, a custom testDataset class is defined to read the images and do some simple preprocessing. Its __getitem__ method returns three values: img, imgID, img. What these three values mean is explained in the class docstring.
ds.GeneratorDataset creates a dataset. Its first argument is an instance of the testDataset class written earlier, and its second argument is the list of column names (the column_names parameter). I think of a column name as the name of a channel: the three values returned by __getitem__ of testDataset are each given their own name by ds.GeneratorDataset, and the corresponding data can then be retrieved by that name.
My explanation may not be detailed or concise enough, so I recommend stepping through this part in a debugger to get a better feel for how these functions are used.
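The idea of column names can be illustrated without MindSpore at all. The sketch below is only a plain-Python analogy, not how ds.GeneratorDataset is implemented: a source object with __getitem__ and __len__ returns a tuple per sample, and each position in the tuple is paired with a name. TinySource and iterate_as_dicts are names invented for this illustration.

```python
class TinySource:
    """A stand-in for testDataset: random access via __getitem__ and __len__."""

    def __init__(self, names):
        self.names = names

    def __getitem__(self, index):
        name = self.names[index]
        # Three values per sample, just as testDataset returns three values.
        return "pixels of " + name, name, "display copy of " + name

    def __len__(self):
        return len(self.names)


def iterate_as_dicts(source, column_names):
    """Yield one dict per sample, pairing each returned value with its column name."""
    for index in range(len(source)):
        yield dict(zip(column_names, source[index]))


columns = ["image", "imgID", "imageInit"]
rows = list(iterate_as_dicts(TinySource(["a.jpg", "b.jpg"]), columns))
print(rows[0]["imgID"])  # a.jpg
```

Each row behaves like the dictionaries used in the prediction loop: the value is fetched by column name rather than by position.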
This is followed by the basic operations of normalizing the image and changing the channel order from HWC to CHW.
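To make these two operations concrete, here is a NumPy sketch of the arithmetic: subtract the per-channel mean, divide by the per-channel standard deviation, then move the channel axis to the front. It is an illustration under the assumption that the input is an HWC float image in the 0-255 range, not the MindSpore operators themselves, and normalize_and_transpose is a name invented for this example.

```python
import numpy as np

# Same constants as in the article, scaled to the 0-255 pixel range.
mean = np.array([0.485 * 255, 0.456 * 255, 0.406 * 255], dtype=np.float32)
std = np.array([0.229 * 255, 0.224 * 255, 0.225 * 255], dtype=np.float32)


def normalize_and_transpose(img_hwc):
    """Per-channel (x - mean) / std on an HWC image, then reorder to CHW."""
    normalized = (img_hwc.astype(np.float32) - mean) / std
    return normalized.transpose(2, 0, 1)


# A dummy 256 x 256 RGB image in which every pixel equals the channel mean.
dummy = np.broadcast_to(mean, (256, 256, 3))
out = normalize_and_transpose(dummy)
print(out.shape)  # (3, 256, 256)
```

Because every pixel of the dummy image equals the mean, the normalized result is all zeros, which is a quick way to check that the constants are applied per channel.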
Prediction process
import numpy as np
from mindspore import Tensor
from mindspore.ops import ExpandDims

# Stores the predictions
pred_list = []
# Stores the pictures to display
img_list = []
# Stores the picture names
imgID_list = []
# Operator that adds a dimension
expand = ExpandDims()

for data in dataset.create_dict_iterator():
    # Add the image to the list of pictures to display
    img_list.append(data["imageInit"].asnumpy())
    imgID_list.append(str(data["imgID"]))
    # Predict using the processed image
    image = data["image"].asnumpy()
    # Add a batch dimension: (C, H, W) -> (1, C, H, W)
    img = expand(Tensor(image), 0)

    output = model.predict(img)
    pred = np.argmax(output.asnumpy(), axis=1)[0]
    pred_list.append(pred)
My prediction here does not use batches: it predicts one picture at a time, so a dimension expansion is needed to add the batch axis.
To predict in batches of batch_size instead, the dataset has to be batched before this step.
The final prediction, pred, is a number that corresponds to a category in the dataset. The category it stands for can be looked up in the class_name dictionary.
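The step from network output to category name can be sketched with NumPy alone. The logits below are made-up values for a batch of three images and two classes, and class_name is filled in by hand with the dogs/wolves mapping used earlier; in the real project both would come from the model and the config file.

```python
import numpy as np

# Label-to-name dictionary, as obtained by swapping the keys and values above.
class_name = {0: "dogs", 1: "wolves"}

# Made-up network outputs: one row per image, one column per class.
logits = np.array([[2.1, -0.3],
                   [0.2, 1.7],
                   [3.0, 2.9]], dtype=np.float32)

# The index of the largest value in each row is the predicted label.
pred = np.argmax(logits, axis=1)
names = [class_name[int(p)] for p in pred]
print(names)  # ['dogs', 'wolves', 'dogs']
```

For a single picture the output has one row, which is why the loop above takes element [0] of the argmax result.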
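import matplotlib.pyplot as plt

# Visualize the model predictions
plt.figure(figsize=(12, 5))
# Show the pictures in a 3 x 9 grid (room for up to 27 pictures)
for i in range(len(img_list)):
    plt.subplot(3, 9, i+1)
    # Title each picture with its predicted category name
    plt.title('{}'.format(class_name[pred_list[i]]))
    # Scale the pixel values into [0, 1] for display
    picture_show = img_list[i]/np.amax(img_list[i])
    picture_show = np.clip(picture_show, 0, 1)
    plt.imshow(picture_show)
    plt.axis('off')
plt.show()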
This produces the visual output: each picture is displayed together with its prediction.
Note that the pictures shown here come from the imageInit column. The pictures in the image column have been normalized and had their channel order changed, so they cannot be displayed directly.
Visualization: