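MAX78000-Based Rectal Polyp Recognition Device

Contents

Project Overview
This project is a medical image detection system built on the PyTorch deep learning framework and the YOLOv5 algorithm. It aims to identify rectal polyps automatically and provide diagnostic assistance. The team members are from Dalian Neusoft University of Information. The project's goal is to advance the early diagnosis of rectal polyps so that treatment can be more accurate and effective.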


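Background

Diagnosing rectal polyps, which can be precancerous, is an urgent and important problem. As the population grows and ages, the prevalence of colorectal cancer and its precursor lesions is expected to rise year by year. Early detection is one of the most effective ways to prevent and treat colorectal cancer. Because we wanted to address a real societal problem, we chose an object detection model for rectal polyps as the model to deploy.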

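Project Design

The model is trained with the YOLOv5 framework. The work covers data preprocessing, feature extraction and model construction, setting up the training environment, model optimization and tuning, and practical application and evaluation of the results. The trained model is then quantized to half precision and converted to C array code for deployment and inference on the MAX78000.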

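Dataset

The dataset is the officially annotated dataset of the 8th National College Students' Biomedical Engineering Innovation Design Competition. It contains nearly 30,000 rectal polyp samples, each an image and label pair, used for training and validation. Every image has a txt label file with the same name that marks the position and class of the polyp: a class annotation plus the relative position of the bounding-box center and of its top-left corner.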
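To make the label files more concrete, here is a minimal sketch of reading one label line. It assumes the standard YOLOv5 layout (`class x_center y_center width height`, all normalized to the image size), which may differ from the competition's raw annotation format; the function name `parse_yolo_label` and the `Box` type are hypothetical helpers for illustration.

```python
from typing import NamedTuple

# Class names as listed in data.yaml
CLASS_NAMES = ["adenoma", "hyperplasia"]


class Box(NamedTuple):
    label: str
    x1: float  # left edge in pixels
    y1: float  # top edge in pixels
    x2: float  # right edge in pixels
    y2: float  # bottom edge in pixels


def parse_yolo_label(line: str, img_w: int, img_h: int) -> Box:
    """Convert one normalized YOLO label line into a pixel-space box.

    Assumption: the line is 'class x_center y_center width height'.
    """
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    return Box(
        CLASS_NAMES[int(cls)],
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


if __name__ == "__main__":
    print(parse_yolo_label("0 0.5 0.5 0.25 0.5", 640, 480))
```

Converting to corner coordinates like this is handy for drawing boxes on an image to check the annotations by eye before training.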


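Pre-training Implementation
Defining the data information file data.yaml: data.yaml holds the dataset configuration, including the class names and the paths to the training and validation sets. We specify 2 classes, named 'adenoma' and 'hyperplasia', to tell the two polyp types apart.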

train:  data/dataset/train
val: data/dataset/val

# number of classes
nc: 2
names: [ 'adenoma', 'hyperplasia' ]

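YOLOv5 defines a data-loading function, load_image. load_image() takes the index i of an image in the dataset and returns the processed image, the height and width of the original image, and the height and width of the processed image. It loads a single image and works together with the data generator to read data in batches.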

    def load_image(self, i):
        # Loads 1 image from dataset index 'i', returns (im, original hw, resized hw)
        im, f, fn = self.ims[i], self.im_files[i], self.npy_files[i],
        if im is None:  # not cached in RAM
            if fn.exists():  # load npy
                im = np.load(fn)
            else:  # read image
                im = cv2.imread(f)  # BGR
                assert im is not None, f'Image Not Found {f}'
            h0, w0 = im.shape[:2]  # orig hw
            r = self.img_size / max(h0, w0)  # ratio
            if r != 1:  # if sizes are not equal
                interp = cv2.INTER_LINEAR if (self.augment or r > 1) else cv2.INTER_AREA
                im = cv2.resize(im, (math.ceil(w0 * r), math.ceil(h0 * r)), interpolation=interp)
            return im, (h0, w0), im.shape[:2]  # im, hw_original, hw_resized
        return self.ims[i], self.im_hw0[i], self.im_hw[i]  # im, hw_original, hw_resized
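The resize step in load_image can be isolated as plain arithmetic. The sketch below reproduces only the ratio, target size, and interpolation choice from the code above, without OpenCV; `plan_resize` is a hypothetical helper name, and the interpolation is returned as a string rather than a cv2 constant.

```python
import math


def plan_resize(h0, w0, img_size, augment=False):
    """Return (new_h, new_w, interpolation_name) for the longest-side resize.

    The longest side is scaled to img_size; the aspect ratio is preserved.
    interpolation_name is None when no resize is needed.
    """
    r = img_size / max(h0, w0)  # scale ratio
    if r == 1:
        return h0, w0, None
    # Linear interpolation when upscaling or augmenting, area interpolation when shrinking
    interp = "INTER_LINEAR" if (augment or r > 1) else "INTER_AREA"
    return math.ceil(h0 * r), math.ceil(w0 * r), interp


if __name__ == "__main__":
    print(plan_resize(960, 1280, 640))  # shrink by half
    print(plan_resize(240, 320, 640))   # enlarge by two
```

Because only the longest side is matched to img_size, the shorter side ends up smaller than img_size and is presumably padded later in the pipeline.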

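Choosing the YOLO Model

The YOLOv5 framework needs the location of a pretrained model, which is given as the default value of the 'weights' command-line argument. Using a pretrained model is a form of transfer learning. When a model starts training from scratch, its parameters are randomly generated and carry no structure. With transfer learning, knowledge learned on one task is carried over to another, so the new task can refine a set of already-trained parameters, which greatly speeds up training. The pretrained models come in several variants, including yolov5x, yolov5s, yolov5n, yolov5m and yolov5l. Of these, yolov5s is among the smallest, with few parameters (only yolov5n is smaller) and a good frame rate, which makes it suitable for embedded deployment.

After choosing the yolov5s pretrained model, I studied the parameters in its configuration. nc is the number of classes. depth_multiple and width_multiple scale the depth and the width (channel count) of the model, respectively. anchors defines prediction boxes of different sizes for large, medium and small objects. backbone and head define the feature extraction network and the prediction network. The network matching the yolov5s pretrained model is generated from these parameters and used for training.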

nc: 2  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
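   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]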
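To show what depth_multiple and width_multiple do to the numbers in this configuration, here is a small sketch of the scaling rule as I understand it from YOLOv5's model parser: repeat counts are multiplied by depth_multiple and rounded (never below 1), and channel counts are multiplied by width_multiple and rounded up to a multiple of 8. The helper names `scale_depth` and `scale_width` are hypothetical.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_depth(n, depth_multiple):
    """Scale a module repeat count; single modules are left alone."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scale_width(channels, width_multiple):
    """Scale an output channel count and keep it a multiple of 8."""
    return make_divisible(channels * width_multiple, 8)


if __name__ == "__main__":
    # yolov5s values from the configuration above
    depth_multiple, width_multiple = 0.33, 0.50
    for n, c in [(3, 128), (6, 256), (9, 512), (3, 1024)]:
        print(n, c, "->", scale_depth(n, depth_multiple), scale_width(c, width_multiple))
```

With the yolov5s multipliers, the C3 blocks listed with 3, 6 and 9 repeats shrink to 1, 2 and 3 repeats, and every channel count is halved, which is where most of the size reduction relative to the larger variants comes from.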

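Setting Up the Training Environment

The environment was configured and the model trained on a server rented from a compute-cloud platform (算力云).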


After selecting the PyTorch framework, open the server console and install the environment.


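Start training. To keep improving performance, the model had already been trained several times before this run. In that situation, the command-line argument resume can be set to true so that training continues from the existing model parameters.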
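The two command-line arguments mentioned so far, weights and resume, can be sketched with argparse. This is an illustration modeled on how such options are commonly declared, not a copy of YOLOv5's train.py; the default 'yolov5s.pt' and the checkpoint path in the usage lines are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with the two options discussed in the text."""
    parser = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    # Path of the pretrained weights used as the starting point (assumed default)
    parser.add_argument("--weights", type=str, default="yolov5s.pt",
                        help="initial weights path")
    # '--resume' alone means True; '--resume PATH' resumes from a specific checkpoint
    parser.add_argument("--resume", nargs="?", const=True, default=False,
                        help="resume the most recent training run")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args([]))
    print(parser.parse_args(["--resume"]))
    # Hypothetical checkpoint path, for illustration only
    print(parser.parse_args(["--resume", "runs/train/exp/weights/last.pt"]))
```

Declaring resume with `nargs="?"` lets the same flag act as a plain switch or carry an explicit checkpoint path.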


Model performance on rectal polyp detection and classification


Saving the model locally


Pulling the MAX78000 Model Quantization Source Code

Install git with sudo apt-get install git. Once it is installed, create a max78000 folder to hold the MAX78000-related downloads.

(base) root@fugubiao:~# sudo apt-get install git
Reading package lists... Done
······
Processing triggers for libc-bin (2.27-3ubuntu1.6) ...
(base) root@fugubiao:~# mkdir max78000
(base) root@fugubiao:~# ls
anaconda3  Anaconda3-2020.07-Linux-x86_64.sh  auto_install  max78000  NVIDIA_CUDA-11.0_Samples
(base) root@fugubiao:~# cd max8000
-bash: cd: max8000: No such file or directory
(base) root@fugubiao:~# cd max78000
(base) root@fugubiao:~/max78000# git clone --recursive https://github.com/MaximIntegratedAI/ai8x-training.git
Cloning into 'ai8x-training'...
······
Submodule path 'distiller': checked out '94b0857bfc5c455991ee84fc0195fc1079366df1'

Pull the MAX78000 model quantization code with the following command:

git clone --recursive https://github.com/MaximIntegratedAI/ai8x-synthesis.git

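Because training is done with the YOLOv5 framework, ai8x-training is not used for training here. The key part of the environment setup is installing PyTorch. Run nvcc -V in a command-line window to query the NVIDIA CUDA compiler version: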

C:\Users\Hasee>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:24:09_Pacific_Daylight_Time_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

My CUDA version is 11.3, so the PyTorch installation must also use the CUDA 11.3 build:

# CUDA 11.3
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
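Matching the wheel to the CUDA release can be checked mechanically. The sketch below extracts the release number from nvcc -V output and derives the wheel suffix used in the command above (11.3 becomes cu113); `cuda_release` and `wheel_tag` are hypothetical helper names, and the sample text is the output shown earlier.

```python
import re


def cuda_release(nvcc_output):
    """Return the CUDA release such as '11.3' from nvcc -V output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def wheel_tag(release):
    """Turn a release such as '11.3' into a wheel suffix such as 'cu113'."""
    return "cu" + release.replace(".", "")


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.3, V11.3.58"
    release = cuda_release(sample)
    print(release, wheel_tag(release))
```

In practice the text would come from running nvcc -V (for example through subprocess), and the resulting tag should agree with the suffix in both the package versions and the extra index URL.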

The remaining packages can be installed from the project's own requirements.txt. First activate the environment named max78000, then install them with pip.

(base) root@fugubiao:~/max78000# conda activate max78000
(base) root@fugubiao:~/max78000# pip install -r requirements.txt

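Model Quantization

There are two main approaches to quantization: quantization-aware training and post-training quantization. Move the model file into the ai8x-synthesis/proj folder to prepare for quantization.

Copy the trained best.pth.tar checkpoint into the ai8x-synthesis/proj folder (the command below reads it as proj/qat_best.pth.tar), then change into the ai8x-synthesis directory and run the following command: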

(max78000) root@iZ2ze9gdjnlb2cwpcldexpZ:~/max78000/ai8x-synthesis# python ./quantize.py proj/qat_best.pth.tar proj/qat_best-ai8x-q.pth.tar --device MAX78000 -v -c networks/faceid.yaml
Configuring device: MAX78000
Reading networks/faceid.yaml to configure network...
Converting checkpoint file proj/qat_best.pth.tar to proj/qat_best-ai8x-q.pth.tar

Model keys (state_dict):
conv1.output_shift, conv1.weight_bits, conv1.bias_bits, conv1.quantize_activation, conv1.adjust_output_shift, conv1.shift_quantile, conv1.op.weight, conv2.output_shift, conv2.weight_bits, conv2.bias_bits, conv2.quantize_activation, conv2.adjust_output_shift, conv2.shift_quantile, conv2.op.weight, conv3.output_shift, conv3.weight_bits, conv3.bias_bits, conv3.quantize_activation, conv3.adjust_output_shift, conv3.shift_quantile, conv3.op.weight, conv4.output_shift, conv4.weight_bits, conv4.bias_bits, conv4.quantize_activation, conv4.adjust_output_shift, conv4.shift_quantile, conv4.op.weight, conv5.output_shift, conv5.weight_bits, conv5.bias_bits, conv5.quantize_activation, conv5.adjust_output_shift, conv5.shift_quantile, conv5.op.weight, conv6.output_shift, conv6.weight_bits, conv6.bias_bits, conv6.quantize_activation, conv6.adjust_output_shift, conv6.shift_quantile, conv6.op.weight, conv7.output_shift, conv7.weight_bits, conv7.bias_bits, conv7.quantize_activation, conv7.adjust_output_shift, conv7.shift_quantile, conv7.op.weight, conv8.output_shift, conv8.weight_bits, conv8.bias_bits, conv8.quantize_activation, conv8.adjust_output_shift, conv8.shift_quantile, conv8.op.weight, avgpool.output_shift, avgpool.weight_bits, avgpool.bias_bits, avgpool.quantize_activation, avgpool.adjust_output_shift, avgpool.shift_quantile
conv1.op.weight avg_max: 0.23429993 max: 0.2981393 mean: -0.0012111976 factor: [256.] bits: 8
conv2.op.weight avg_max: 0.29567257 max: 0.49292472 mean: -0.013966456 factor: [256.] bits: 8
conv3.op.weight avg_max: 0.3058117 max: 0.5719163 mean: -0.0075516785 factor: [128.] bits: 8
conv4.op.weight avg_max: 0.2525767 max: 0.50934964 mean: -0.015919587 factor: [128.] bits: 8
conv5.op.weight avg_max: 0.24346802 max: 0.5357917 mean: -0.0096125575 factor: [128.] bits: 8
conv6.op.weight avg_max: 0.3251059 max: 0.5835651 mean: -0.010212447 factor: [128.] bits: 8
conv7.op.weight avg_max: 0.29670662 max: 0.59505373 mean: -0.010930746 factor: [128.] bits: 8
conv8.op.weight avg_max: 0.3084465 max: 0.68859434 mean: 0.00021292236 factor: [128.] bits: 8
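The quantizer output above reports 8-bit weights for each layer. As a generic illustration of what mapping floating-point weights to 8-bit integers involves, here is a sketch of symmetric post-training quantization in plain Python. It is not the algorithm implemented by quantize.py, whose scaling factors follow its own rules; the function names are hypothetical and the weight values are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale.

    Returns (quantized_values, scale) so that weight is approximately q * scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.298, -0.12, 0.0, 0.05]  # made-up example values
    q, scale = quantize_int8(weights)
    print(q, scale)
    print(dequantize(q, scale))
```

The rounding error of each weight is bounded by half the scale, which is why layers with a larger weight range lose more precision at the same bit width.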

I also explored another quantization approach on my own:

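import torch
from torch.quantization import quantize_dynamic

# Load the trained weights through torch.hub; the raw string keeps the Windows backslashes literal
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path=r'D:\Users\Hasee\Desktop\yolov5_polyp/runs/train\exp10\weights/best.pt',
                       force_reload=True)

# Put the model in evaluation mode
model.eval()

# Create an input tensor
input_tensor = torch.randn(1, 3, 640, 640)

# Run inference
with torch.no_grad():
    outputs = model(input_tensor)

# Convert the model to FP16
model = model.half()

# Quantize the model (dynamic quantization only converts the listed module types, here nn.Linear)
quantized_model = quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

# Save the quantized model
torch.save(quantized_model.state_dict(), 'quantized_model.pt')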