Contents
1. Detailed explanation of the FCN paper
2. FCN project practice (using the PASCAL VOC 2012 dataset)
3. Collection of image datasets
4. Selection of tools
5. Location of dataset files
6. Training the dataset
1. Detailed explanation of the FCN paper
https://mydreamambitious.blog.csdn.net/article/details/125966298
2. FCN project practice (using the PASCAL VOC 2012 dataset)
https://mydreamambitious.blog.csdn.net/article/details/125774545
Note: common datasets for semantic segmentation in deep learning include Pascal VOC 2007/2012, NYUDv2, SUN RGB-D, Cityscapes, CamVid, and SIFT Flow; see the introduction to these seven datasets on the Keep_Trying_Go CSDN blog.
3. Collection of image datasets
Here is a small training dataset made from downloaded pictures; it contains only 39 images. Readers can train on this dataset first and make their own dataset once everything works (images can also be collected from everyday life rather than downloaded from the Internet, which may be more meaningful).
Link: https://pan.baidu.com/s/1hUZYmy0iQ5dG4dbDD69ZwA
Extraction code: 25d6
4. Selection of tools
(1) Tool 1: labelme
labelme is a polygon annotation tool that can label object outlines precisely, so it is often used for segmentation. I had not used it before this project; in practice it is used in much the same way as labelImg, which is introduced below.
Installation: pip install labelme
Open: enter labelme directly in the environment where it was installed.
Reference link: https://blog.csdn.net/weixin_44245653/article/details/119150966
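labelme saves one JSON file per image, containing the polygon points of every labeled shape. As a rough illustration of what can be done with that output, the sketch below (my own addition; the class list and file names are hypothetical) rasterizes the polygons into a grayscale label mask with PIL:

import json
from PIL import Image, ImageDraw

# Hypothetical class list; index 0 is the background
CLASSES = ['_background_', 'cat', 'dog']

def labelme_json_to_mask(json_path, save_path):
    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # labelme records the image size and a list of labeled shapes
    mask = Image.new('L', (data['imageWidth'], data['imageHeight']), 0)
    draw = ImageDraw.Draw(mask)
    for shape in data['shapes']:
        class_idx = CLASSES.index(shape['label'])
        polygon = [tuple(point) for point in shape['points']]
        draw.polygon(polygon, fill=class_idx)
    mask.save(save_path)

labelme_json_to_mask('example.json', 'example_mask.png')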
(2) Tool 2: labelImg
labelImg is a rectangle annotation tool commonly used for image recognition and object detection; it can directly generate the txt label format read by YOLO, but it can only annotate rectangular boxes. Since the tool is aimed at recognition and detection, why can it be used to build a segmentation dataset? Because it can write PASCAL VOC-style annotations. It does not generate the final label masks, so in our practical project this tool is mainly used to generate the .xml files.
Installation: pip install labelImg
Open: enter labelImg directly in the environment where it was installed.
First open the tool; after clicking Save, the .xml file is generated:
That covers the basic use of this tool. For more details, refer to the video tutorials on Bilibili.
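The .xml files saved by labelImg follow the PASCAL VOC annotation format, with one <object> element per box. As a quick sanity check of a saved file, the boxes can be read back with the standard library (a minimal sketch of my own; 'example.xml' is a hypothetical file name):

import xml.etree.ElementTree as ET

def parse_voc_xml(xml_path):
    # Each <object> holds a class name and a <bndbox> with pixel coordinates
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter('object'):
        name = obj.find('name').text
        bndbox = obj.find('bndbox')
        xmin = int(float(bndbox.find('xmin').text))
        ymin = int(float(bndbox.find('ymin').text))
        xmax = int(float(bndbox.find('xmax').text))
        ymax = int(float(bndbox.find('ymax').text))
        boxes.append((name, xmin, ymin, xmax, ymax))
    return boxes

print(parse_voc_xml('example.xml'))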
(3) Tool 3: EISeg
Installation:
Step 1: pip install paddlepaddle -i https://pypi.tuna.tsinghua.edu.cn/simple
Step 2: pip install eiseg -i https://pypi.tuna.tsinghua.edu.cn/simple
Open: enter eiseg directly in the environment where it was installed.
This error may be reported when the tool is opened after installation:
AttributeError: module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline'
Solution:
pip install --user --upgrade opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple
Step 1: load the model parameters:
Model type | Applicable scenario | Model structure | Model download
High-precision model | General scene annotation | HRNet18_OCR64 | static_hrnet18_ocr64_cocolvis
Lightweight model | General scene annotation | HRNet18s_OCR48 | static_hrnet18s_ocr48_cocolvis
High-precision model | General image annotation | EdgeFlow | static_edgeflow_cocolvis
High-precision model | Portrait annotation | HRNet18_OCR64 | static_hrnet18_ocr64_human
Lightweight model | Portrait annotation | HRNet18s_OCR48 | static_hrnet18s_ocr48_human
Note: which model to load depends on the dataset you are annotating, i.e., its applicable scenario.
Step 2: open the file:
Step 3: add labels:
Step 4: select the save format: JSON or COCO.
Selecting JSON saves one JSON file per image, whereas the COCO format saves the whole dataset's annotations in a single JSON file.
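To make the difference concrete, the sketch below (my own addition; 'annotations.json' is a hypothetical file name) walks a COCO-format file, where one JSON holds the 'images', 'annotations', and 'categories' lists for the entire dataset:

import json
from collections import defaultdict

with open('annotations.json', 'r', encoding='utf-8') as f:
    coco = json.load(f)

# Group the annotations by the id of the image they belong to
anns_per_image = defaultdict(list)
for ann in coco['annotations']:
    anns_per_image[ann['image_id']].append(ann['category_id'])

# List the category ids annotated on each image
for img in coco['images']:
    print(img['file_name'], anns_per_image[img['id']])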
Step 5: click the target in the image to annotate it:
The grayscale label image is what we actually want, but it is hard to inspect directly, so we apply a color palette to it:
import os
import cv2
import numpy as np
from PIL import Image


def palette():
    # Get the current working directory
    root = os.getcwd()
    # Location of the grayscale label images
    imgFile = root + '\\img'
    # Apply the palette to every grayscale image
    for img in os.listdir(imgFile):
        filename, _ = os.path.splitext(img)
        # The full image path is needed here, otherwise the file read below does not exist
        img = 'img/' + img
        img = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
        save_path = root + '\\imgPalette\\' + filename + '.png'
        # Convert the image from a numpy array to a PIL image
        img = Image.fromarray(img)
        palette = []
        for j in range(256):
            palette.extend((j, j, j))
        # Set the first 21 colors, one per class (21 categories in total,
        # including the black background)
        palette[:3 * 21] = np.array([[0, 0, 0], [0, 255, 255], [0, 128, 0], [128, 128, 0], [0, 0, 128],
                                     [128, 0, 128], [0, 128, 128], [128, 128, 128], [64, 0, 0], [192, 0, 0],
                                     [64, 128, 0], [192, 128, 0], [64, 0, 128], [192, 0, 128], [64, 128, 128],
                                     [192, 128, 128], [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
                                     [0, 64, 128]], dtype='uint8').flatten()
        img.putpalette(palette)
        # print(np.shape(palette))  # output: (768,)
        img.save(save_path)


if __name__ == '__main__':
    print('Pycharm')
    palette()
Code reference: https://blog.csdn.net/weixin_39886251/article/details/111704330
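Note that putpalette converts the image to PIL's 'P' (palette) mode, so the saved PNG still stores the class index at each pixel; only the display colors change. A quick check (my own addition; the file name is hypothetical):

import numpy as np
from PIL import Image

# Reading a palettized PNG back yields the class indices, not RGB colors
mask = np.array(Image.open('imgPalette/example.png'))
print(np.unique(mask))  # e.g. [0 1]: background plus one foreground class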
Once a tool has been chosen and the dataset annotated, the dataset files can be arranged as described below.
5. Location of dataset files
However the dataset is stored, what matters is that the program can find it; it does not have to be organized exactly like mine, since you can modify the program (if an error occurs, don't be afraid: follow the error message and keep adjusting).
Note: if you are unsure how to lay out the files above, download the PASCAL VOC 2012 dataset and look at how its files are organized (again, you can modify the program yourself, so this exact format is not mandatory).
For common semantic segmentation datasets (Pascal VOC 2007/2012, NYUDv2, SUN RGB-D, Cityscapes, CamVid, SIFT Flow), see the seven-dataset introduction on the Keep_Trying_Go CSDN blog.
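For reference, the sketch below (my own addition, based on the public PASCAL VOC 2012 layout; adapt the root path to whatever your program expects) creates the usual directory skeleton:

import os

root = 'data/Mydata/VOCdevkit/VOC2012'
for sub in ('JPEGImages',               # original .jpg images
            'SegmentationClass',        # palettized .png label masks
            'ImageSets/Segmentation'):  # holds the train.txt / val.txt file lists
    os.makedirs(os.path.join(root, sub), exist_ok=True)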
6. Training the dataset
A few places in the train.py file need to be modified: the location of the dataset and the number of classes. Adjust the rest according to your own needs.
# Location of the dataset files
parser.add_argument("--data-path", default="data/Mydata/", help="VOCdevkit root")
# Number of classes in the dataset
parser.add_argument("--num-classes", default=1, type=int)
# Whether to use the auxiliary branch
parser.add_argument("--aux", default=True, type=bool, help="auxiliary loss")
# Device used for training (cuda by default)
parser.add_argument("--device", default="cuda", help="training device")
# Batch size
parser.add_argument("-b", "--batch-size", default=4, type=int)
# Number of training epochs
parser.add_argument("--epochs", default=30, type=int, metavar="N",
                    help="number of total epochs to train")
# Initial learning rate
parser.add_argument('--lr', default=0.0001, type=float, help='initial learning rate')
# Training momentum
parser.add_argument('--momentum', default=0.9, type=float, metavar='M', help='momentum')
# Weight decay
parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float,
                    metavar='W', help='weight decay (default: 1e-4)', dest='weight_decay')
# How often to print training information
parser.add_argument('--print-freq', default=10, type=int, help='print frequency')
# If training was interrupted, reload the last checkpoint and resume from it
parser.add_argument('--resume', default='', help='resume from checkpoint')
# Epoch to start training from
parser.add_argument('--start-epoch', default=0, type=int, metavar='N', help='start epoch')
# Mixed precision training parameters
parser.add_argument("--amp", default=False, type=bool,
                    help="Use torch.cuda.amp for mixed precision training")
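With the arguments above in place, training can be started from the command line, for example (an illustrative invocation only; the values match the defaults shown):

python train.py --data-path data/Mydata/ --num-classes 1 --epochs 30 -b 4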
predict.py: