Raspberry Pi and YOLOv5-Lite: Setup, Pitfalls, and Deployment

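Preface

For my project design I wanted something simple, so I picked hardware-based object detection. Just slap YOLO on it, right? There are examples all over the internet; download one, tweak the code, done. Sure enough, a quick Baidu search turned up 基于树莓派4B的YOLOv5-Lite目标检测的移植与部署（含训练教程）, a tutorial on porting and deploying YOLOv5-Lite object detection on the Raspberry Pi 4B, training included. Easy win!

Or so I thought.

It turned out that there are very few tutorials for the latest Raspberry Pi OS, many of the configuration screens look different, and I ran into all kinds of mysterious problems. I had to feel my way through the pitfalls on my own.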

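Environment Setup

I like to grab the latest image and set up the environment myself.

As usual, enable the big three first: Wi-Fi, SSH, and VNC. Steps omitted.

Next, install the dependencies. Note that the system Python on Raspberry Pi OS is locked down to avoid dependency conflicts, so packages have to be installed through apt, for example sudo apt install python3-opencv.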

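Some packages are not available that way, though, such as python3-onnxruntime. I found a workaround in 树莓派5 问题汇总 - 知乎 (a Zhihu post collecting Raspberry Pi 5 issues):

sudo mv /usr/lib/python3.x/EXTERNALLY-MANAGED /usr/lib/python3.x/EXTERNALLY-MANAGED.bk

Here python3.x is the actual Python version on your Raspberry Pi. With the marker file renamed, pip no longer refuses to install packages into the system Python.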
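To avoid guessing the python3.x part, the marker location can be asked from Python itself. The sketch below assumes the layout described in PEP 668, where the EXTERNALLY-MANAGED file sits in the interpreter's stdlib directory; the function name is made up for illustration.

```python
import sysconfig
from pathlib import Path


def externally_managed_marker() -> Path:
    """Return the path where PEP 668 expects the EXTERNALLY-MANAGED marker."""
    # PEP 668 places the marker in the interpreter's stdlib directory,
    # e.g. /usr/lib/python3.x on Raspberry Pi OS.
    return Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"


if __name__ == "__main__":
    marker = externally_managed_marker()
    state = "present" if marker.exists() else "absent"
    print(f"{marker} ({state})")
```

Run it with the system python3; the printed path is the one to use in the mv command.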

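Camera

I like the latest of everything, so I picked the newest Raspberry Pi OS when flashing. The new system no longer has the old Legacy Camera option; the camera seems to be enabled by default. However, libcamera-hello showed nothing, and vcgencmd get_camera printed supported=0 detected=0, libcamera interfaces=0, meaning no camera was detected at all.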

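Edit /boot/config.txt (/boot/firmware/config.txt on newer releases) and change the existing camera detection lines to:
#camera_auto_detect=1
gpu_mem=128
start_x=1
Edit /etc/modules and append bcm2835-v4l2 at the end; this loads the old driver.
Reboot the Raspberry Pi.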

After that, vcgencmd get_camera reports supported=1 detected=1, libcamera interfaces=0, meaning the camera is now detected and loaded.
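If you want to check this from a script instead of by eye, the status line is easy to parse. A minimal sketch, assuming the output keeps the key=value shape quoted above; both function names are hypothetical, and read_camera_status only works on a Pi where vcgencmd is installed.

```python
import re
import subprocess


def parse_camera_status(output: str) -> dict:
    """Turn 'supported=1 detected=1, libcamera interfaces=0' into a dict of ints."""
    pairs = re.findall(r"([A-Za-z ]+?)=(\d+)", output)
    return {key.strip(): int(value) for key, value in pairs}


def read_camera_status() -> dict:
    """Run vcgencmd on the Pi and parse its answer (not called here)."""
    result = subprocess.run(["vcgencmd", "get_camera"],
                            capture_output=True, text=True, check=True)
    return parse_camera_status(result.stdout)


status = parse_camera_status("supported=1 detected=1, libcamera interfaces=0")
print(status["detected"] == 1)
```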

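Reading frames through cv2 still failed, though. From what I found, since the Bullseye release Raspberry Pi OS has switched its underlying camera driver from Raspicam to libcamera. So we drive the camera with the official picamera2 library instead: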

sudo apt install -y python3-picamera2
Edit /boot/config.txt (/boot/firmware/config.txt on newer releases) and append a line matching your camera model at the end: dtoverlay=ov5647

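Following 树莓派4B使用opencv获取Camera Module 3摄像头图像（解决无法直接获取图像的问题）, an article on getting Camera Module 3 frames into OpenCV on a Raspberry Pi 4B, wrap image capture in a small class: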

#!/usr/bin/python
# Picamera2_Img_et.py

from picamera2 import Picamera2
from libcamera import controls


class Imget:
    def __init__(self):
        # Create a Picamera2 instance
        self.cam = Picamera2()

        # Set the preview resolution
        # A smaller size noticeably improves the frame rate
        self.cam.preview_configuration.main.size = (320, 320)
        self.cam.preview_configuration.main.format = "RGB888"
        # Set the preview frame rate
        self.cam.preview_configuration.controls.FrameRate = 10
        # Align the preview configuration
        self.cam.preview_configuration.align()
        # Configure the camera in preview mode
        self.cam.configure("preview")
        # Switch the camera to continuous autofocus
        # The official Raspberry Pi camera v1.3 (ov5647) I use does not support autofocus
        # self.cam.set_controls({"AfMode": controls.AfModeEnum.Continuous})
        # Start the camera
        self.cam.start()

    def getImg(self):
        # Grab the captured frame as a numpy array
        frame = self.cam.capture_array()
        # Return the captured frame
        return frame

    def __del__(self):
        self.cam.stop()
        self.cam.close()

Then call it:

import cv2
from threading import Thread
import os
import time
import uuid
from Picamera2_Img_et import Imget  # import the Imget class

count = 0  # number of images saved so far

# The old capture function below no longer works as-is; use the wrapper above instead
def image_collect_old(cap):
    global count
    while True:
        success, img = cap.read()
        if success:
            file_name = str(uuid.uuid4())+'.jpg'
            cv2.imwrite(os.path.join('images',file_name),img)
            count = count+1
            print("save %d %s"%(count,file_name))
        time.sleep(0.4)

# New capture function
def image_collect_new(getImg):
    global count
    while True:
        frame = getImg.getImg()  # grab a frame through Imget
        if frame is not None:
            file_name = str(uuid.uuid4()) + '.jpg'
            cv2.imwrite(os.path.join('images', file_name), frame)
            count = count + 1
            print("save %d %s" % (count, file_name))
        time.sleep(0.4)


if __name__ == "__main__":
    os.makedirs("images", exist_ok=True)

    getImg = Imget()  # create the Imget instance

    m_thread = Thread(target=image_collect_new, args=(getImg,), daemon=True)

    while True:
        frame = getImg.getImg()  # grab a frame through Imget

        if frame is not None:
            cv2.imshow("video", frame)

        key = cv2.waitKey(1) & 0xFF

        # Press "c" to start collecting images (a thread can only be started once)
        if key == ord('c'):
            if not m_thread.is_alive():
                m_thread.start()
            continue
        elif key == ord('q'):
            break

    cv2.destroyAllWindows()

And with that, image capture works!

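Collection, Training, and Inference

Dataset Collection and Annotation

Use the img_collection.py script above to collect images. If it shoots too fast and keeps catching your hand, increase the 0.4 s interval.

I collected 100 images, then randomly picked 40 of them and rotated them left and right (the capture resolution is set to 320x320 after all, which speeds up both training and detection).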

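For annotation I used wkentaro/labelme. YOLO does not understand its output, though, so the labels have to be converted. The conversion script in the porting-and-deployment tutorial is buggy: the converted YOLO coordinates can come out negative, which later makes training print Ignoring corrupted image and/or label and skip those images automatically, leaving a very small dataset. ~~So that's why apples were being recognized as oranges~~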
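A quick sanity check on the converted label files catches this before training does. The sketch below assumes the usual YOLO text format of one "class cx cy w h" line per box with all four coordinates normalized to [0, 1]; find_bad_label_lines is a hypothetical helper, not part of YOLOv5-Lite.

```python
def find_bad_label_lines(lines):
    """Return (line_number, reason) for every line that is not a valid YOLO box."""
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # blank lines are harmless
        if len(parts) != 5:
            problems.append((number, "expected 5 fields"))
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            problems.append((number, "non-numeric coordinate"))
            continue
        # Negative or >1 values are what makes the loader reject a label file
        if any(v < 0.0 or v > 1.0 for v in values):
            problems.append((number, "coordinate outside [0, 1]"))
    return problems


sample = ["0 0.5 0.5 0.2 0.2", "1 -0.1 0.5 0.2 0.2"]
print(find_bad_label_lines(sample))
```

Feeding it the lines of each .txt file under labels/ gives a list of files worth re-converting.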

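So instead, use the conversion script provided in labelme生成的标注数据转换成yolov5格式 (an article on converting labelme annotations to YOLOv5 format):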

# -*- coding: utf-8 -*-
"""
Time:     2021.10.26
Author:   Athrunsunny
Version:  V 0.1
File:     toyolo.py
Describe: Functions in this file is change the dataset format to yolov5
"""
 
import os
import numpy as np
import json
from glob import glob
import cv2
import shutil
import yaml
from sklearn.model_selection import train_test_split
from tqdm import tqdm
 
ROOT_DIR = os.getcwd()
 
 
def change_image_format(label_path=ROOT_DIR, suffix='.jpg'):
    """
    Convert every image in the folder to one format, e.g. '.jpg'
    :param suffix: image file suffix
    :param label_path: path of the working folder
    :return:
    """
    externs = ['png', 'jpg', 'JPEG', 'BMP', 'bmp']
    files = list()
    for extern in externs:
        # os.path.join keeps the pattern valid on both Windows and Linux
        files.extend(glob(os.path.join(label_path, "*." + extern)))
    for file in files:
        # splitext leaves dots in the directory part of the path untouched
        name, file_suffix = os.path.splitext(file)
        if file_suffix != suffix:
            new_name = name + suffix
            image = cv2.imread(file)
            cv2.imwrite(new_name, image)
            os.remove(file)
 
 
def get_all_class(file_list, label_path=ROOT_DIR):
    """
    Collect all classes present in the data from the json files
    :param file_list: names of all files in the folder
    :param label_path: path of the working folder
    :return:
    """
    classes = list()
    for filename in tqdm(file_list):
        json_path = os.path.join(label_path, filename + '.json')
        json_file = json.load(open(json_path, "r", encoding="utf-8"))
        for item in json_file["shapes"]:
            label_class = item['label']
            if label_class not in classes:
                classes.append(label_class)
    print('read file done')
    return classes
 
 
def split_dataset(label_path, test_size=0.3, isUseTest=False, useNumpyShuffle=False):
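    """
    Split the files into training, test and validation sets
    :param useNumpyShuffle: split the dataset with numpy
    :param test_size: fraction used for the test or validation split
    :param isUseTest: whether to create a test set, False by default
    :param label_path: path of the working folder
    :return:
    """
    # os.path.join keeps the pattern valid on both Windows and Linux
    files = glob(os.path.join(label_path, "*.json"))
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]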
 
    if useNumpyShuffle:
        file_length = len(files)
        index = np.arange(file_length)
        np.random.seed(32)
        np.random.shuffle(index)
 
        test_files = None
        if isUseTest:
            trainval_files, test_files = np.array(files)[index[:int(file_length * (1 - test_size))]], np.array(files)[
                index[int(file_length * (1 - test_size)):]]
        else:
            trainval_files = files
        train_files, val_files = np.array(trainval_files)[index[:int(len(trainval_files) * (1 - test_size))]], \
                                 np.array(trainval_files)[index[int(len(trainval_files) * (1 - test_size)):]]
    else:
        test_files = None
        if isUseTest:
            trainval_files, test_files = train_test_split(files, test_size=test_size, random_state=55)
        else:
            trainval_files = files
        train_files, val_files = train_test_split(trainval_files, test_size=test_size, random_state=55)
 
    return train_files, val_files, test_files, files
 
 
def create_save_file(label_path=ROOT_DIR):
    """
    Create the image and label folders in the layout expected for training
    :param label_path: path of the working folder
    :return:
    """
    # Training set
    train_image = os.path.join(label_path, 'train', 'images')
    if not os.path.exists(train_image):
        os.makedirs(train_image)
    train_label = os.path.join(label_path, 'train', 'labels')
    if not os.path.exists(train_label):
        os.makedirs(train_label)
    # Validation set
    val_image = os.path.join(label_path, 'valid', 'images')
    if not os.path.exists(val_image):
        os.makedirs(val_image)
    val_label = os.path.join(label_path, 'valid', 'labels')
    if not os.path.exists(val_label):
        os.makedirs(val_label)
    # Test set
    test_image = os.path.join(label_path, 'test', 'images')
    if not os.path.exists(test_image):
        os.makedirs(test_image)
    test_label = os.path.join(label_path, 'test', 'labels')
    if not os.path.exists(test_label):
        os.makedirs(test_label)
    return train_image, train_label, val_image, val_label, test_image, test_label
 
 
def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h
 
 
def push_into_file(file, images, labels, label_path=ROOT_DIR, suffix='.jpg'):
    """
    Copy the files generated in the working folder into the image and label folders of the training/validation/test set
    :param file: list of file names
    :param images: folder that receives the images
    :param labels: folder that receives the labels
    :param label_path: path of the working folder
    :param suffix: image file suffix
    :return:
    """

    for filename in file:
        image_file = os.path.join(label_path, filename + suffix)
        label_file = os.path.join(label_path, filename + '.txt')
        if not os.path.exists(os.path.join(images, filename + suffix)):
            try:
                shutil.copy2(image_file, images)
            except OSError:
                pass
        # Labels are .txt files, so check for the .txt name rather than the image suffix
        if not os.path.exists(os.path.join(labels, filename + '.txt')):
            try:
                shutil.copy2(label_file, labels)
            except OSError:
                pass
 
 
def json2txt(classes, txt_Name='allfiles', label_path=ROOT_DIR, suffix='.jpg'):
    """
    Convert the json files to txt files and archive the json files in a separate folder
    :param classes: class names
    :param txt_Name: txt file that stores the paths of all files
    :param label_path: path of the working folder
    :param suffix: image file suffix
    :return:
    """
    store_json = os.path.join(label_path, 'json')
    if not os.path.exists(store_json):
        os.makedirs(store_json)

    _, _, _, files = split_dataset(label_path)
    tmp_dir = os.path.join(label_path, 'tmp')
    if not os.path.exists(tmp_dir):
        os.makedirs(tmp_dir)

    # Context managers make sure every file is flushed and closed
    with open(os.path.join(tmp_dir, '%s.txt' % txt_Name), 'w') as list_file:
        for json_file_ in tqdm(files):
            json_filename = os.path.join(label_path, json_file_ + ".json")
            imagePath = os.path.join(label_path, json_file_ + suffix)
            list_file.write('%s\n' % imagePath)
            with open(json_filename, "r", encoding="utf-8") as f:
                json_file = json.load(f)
            with open(os.path.join(label_path, json_file_ + '.txt'), 'w') as out_file:
                if os.path.exists(imagePath):
                    height, width, channels = cv2.imread(imagePath).shape
                    for multi in json_file["shapes"]:
                        if len(multi["points"][0]) == 0:
                            continue
                        points = np.array(multi["points"])
                        # Clamp negative coordinates to 0
                        xmin = min(points[:, 0]) if min(points[:, 0]) > 0 else 0
                        xmax = max(points[:, 0]) if max(points[:, 0]) > 0 else 0
                        ymin = min(points[:, 1]) if min(points[:, 1]) > 0 else 0
                        ymax = max(points[:, 1]) if max(points[:, 1]) > 0 else 0
                        label = multi["label"]
                        # Skip degenerate boxes with zero width or height
                        if xmax <= xmin or ymax <= ymin:
                            continue
                        cls_id = classes.index(label)
                        b = (float(xmin), float(xmax), float(ymin), float(ymax))
                        bb = convert((width, height), b)
                        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
            if not os.path.exists(os.path.join(store_json, json_file_ + '.json')):
                try:
                    shutil.copy2(json_filename, store_json)
                except OSError:
                    pass
 
 
def create_yaml(classes, label_path, isUseTest=False):
    nc = len(classes)
    if not isUseTest:
        desired_caps = {
            'path': label_path,
            'train': 'train/images',
            'val': 'valid/images',
            'nc': nc,
            'names': classes
        }
    else:
        desired_caps = {
            'path': label_path,
            'train': 'train/images',
            'val': 'valid/images',
            'test': 'test/images',
            'nc': nc,
            'names': classes
        }
    yamlpath = os.path.join(label_path, "data" + ".yaml")
 
    # Write to the yaml file
    with open(yamlpath, "w+", encoding="utf-8") as f:
        for key, val in desired_caps.items():
            yaml.dump({key: val}, f, default_flow_style=False)
 
 
# First make sure all images in the folder share one suffix such as .jpg; for any other suffix, set suffix accordingly, e.g. .png
def ChangeToYolo5(label_path=ROOT_DIR, suffix='.jpg', test_size=0.1, isUseTest=True):
    """
    Generate the final dataset in the standard layout
    :param test_size: fraction used for the test or validation split
    :param label_path: path of the working folder
    :param suffix: file suffix
    :param isUseTest: whether to create a test set
    :return:
    """
    # Pass label_path and suffix through so non-default arguments take effect everywhere
    change_image_format(label_path, suffix)
    train_files, val_files, test_file, files = split_dataset(label_path, test_size=test_size, isUseTest=isUseTest)
    classes = get_all_class(files, label_path)
    json2txt(classes, label_path=label_path, suffix=suffix)
    create_yaml(classes, label_path, isUseTest=isUseTest)
    train_image, train_label, val_image, val_label, test_image, test_label = create_save_file(label_path)
    push_into_file(train_files, train_image, train_label, label_path=label_path, suffix=suffix)
    push_into_file(val_files, val_image, val_label, label_path=label_path, suffix=suffix)
    if test_file is not None:
        push_into_file(test_file, test_image, test_label, label_path=label_path, suffix=suffix)
    print('create dataset done')
 
 
if __name__ == "__main__":
    ChangeToYolo5()

That does the trick.
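The core arithmetic of the conversion fits in a few lines. This is a simplified sketch, not the script's own convert function: it clamps the labelme corner points to the image on both sides and leaves out the 1-pixel offset the script subtracts. The function name is hypothetical.

```python
def labelme_points_to_yolo(points, img_w, img_h):
    """Map labelme corner points to a normalized YOLO box (cx, cy, w, h)."""
    # Clamp every point into the image so nothing can go negative or exceed 1
    xs = [min(max(x, 0.0), float(img_w)) for x, _ in points]
    ys = [min(max(y, 0.0), float(img_h)) for _, y in points]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    if xmax <= xmin or ymax <= ymin:
        return None  # degenerate box, nothing to write
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h


# A rectangle from (80, 160) to (240, 320) in a 320x320 image
print(labelme_points_to_yolo([[80, 160], [240, 320]], 320, 320))
```

For that rectangle the centre lands at (0.5, 0.75) and the box covers half the image in each direction.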

Training Begins!

Tweak train.py a little:

...
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='my/v5lite-s.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='models/v5Lite-s.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/my.yaml', help='data.yaml path')

    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=8, help='total batch size for all GPUs')
    parser.add_argument('--img-size', nargs='+', type=int, default=[320, 320], help='[train, test] image sizes')

    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    opt = parser.parse_args()

    ...

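Where:

weights: the initial baseline weights
cfg: the config file of the baseline model; leave it alone

data: the dataset config file for training, with contents like this: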

path: data/path
train: train/images
val: valid/images
test: test/images
nc: 3
names:
 - your
 - label

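epoch: number of training epochs
batch: batch size; keep it modest so you don't run out of VRAM or RAM
img-size: the image size of your data source
device: the device to train on. 0 means the default CUDA device. Change it to cpu to train on the CPU only.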
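Editing the defaults is one way; since these are ordinary argparse flags, the same values can also be passed on the command line. The sketch below rebuilds only the flags shown above (it is not the real train.py) and feeds parse_args a hand-written argument list in place of sys.argv.

```python
import argparse


def build_parser():
    """Rebuild only the flags discussed above, with the edited defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='my/v5lite-s.pt')
    parser.add_argument('--data', type=str, default='data/my.yaml')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=8)
    parser.add_argument('--img-size', nargs='+', type=int, default=[320, 320])
    parser.add_argument('--device', default='0')
    return parser


# Equivalent to running: python train.py --device cpu --epochs 100
opt = build_parser().parse_args(['--device', 'cpu', '--epochs', '100'])
print(opt.device, opt.epochs, opt.batch_size, opt.img_size)
```

Flags given explicitly win; everything else falls back to the default, and argparse turns --batch-size into opt.batch_size.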

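I collected 140 images and split them 9:1. Let the alchemy begin!

While setting up the environment I hit a puzzle: I had uninstalled the CUDA environment from my PC, yet nvidia-smi still reported CUDA 12.7. That number comes with the GPU driver (it is the highest CUDA version the driver supports), so it can be ignored. When installing PyTorch, just pick the nightly release built for CUDA 12.6.

~~4 GB of VRAM and 24 GB of RAM still free, so how on earth did it run out of memory, may I ask~~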

Training result plots:
confusion_matrix: https://webp.esing.dev/img/confusion_matrix_250107_1959_YCLi.png
P_curve: https://webp.esing.dev/img/P_curve_250107_2000_Ywei.png
PR_curve: https://webp.esing.dev/img/PR_curve_250107_2001_nIg5.png
R_curve: https://webp.esing.dev/img/R_curve_250107_2001_xRDx.png
results: https://webp.esing.dev/img/results_250107_2001_2t7K.png

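I can't make sense of the training results myself, either.

Export to ONNX

Here, follow the export method from the issue YOLOv5-Lite (onnx)(v1.5版本 5月22日) 类似报错 (a report of a similar error with v1.5):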

python .\export.py --weights runs\train\91\weights\best.pt --end2end

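Here --end2end bundles the extra post-processing into the exported model.

Inference

The code in the porting-and-deployment tutorial is too old; it targets an earlier YOLOv5-Lite release. The same 类似报错 issue had the fix: paste class yolov5_lite() {...} from python_demo/onnxruntime/v5lite.py in the latest repository into the original inference code, then adjust it a little. The code is below.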

#!/usr/bin/python
# test_video.py

import cv2
import argparse
import numpy as np
import onnxruntime as ort
import time
import sys

sys.path.append('camera_driver')  # add camera_driver to the module search path
from Picamera2_Img_et import Imget  # import the Imget class

class yolov5_lite():
    def __init__(self, model_pb_path, label_path, confThreshold=0.5, nmsThreshold=0.5):
        so = ort.SessionOptions()
        so.log_severity_level = 3
        self.net = ort.InferenceSession(model_pb_path, so)
        self.classes = list(map(lambda x: x.strip(), open(label_path, 'r').readlines()))

        self.confThreshold = confThreshold
        self.nmsThreshold = nmsThreshold
        self.input_shape = (self.net.get_inputs()[0].shape[2], self.net.get_inputs()[0].shape[3])

    def letterBox(self, srcimg, keep_ratio=True):
        top, left, newh, neww = 0, 0, self.input_shape[0], self.input_shape[1]
        if keep_ratio and srcimg.shape[0] != srcimg.shape[1]:
            hw_scale = srcimg.shape[0] / srcimg.shape[1]
            if hw_scale > 1:
                newh, neww = self.input_shape[0], int(self.input_shape[1] / hw_scale)
                img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA)
                left = int((self.input_shape[1] - neww) * 0.5)
                img = cv2.copyMakeBorder(img, 0, 0, left, self.input_shape[1] - neww - left, cv2.BORDER_CONSTANT,
                                         value=0)  # add border
            else:
                newh, neww = int(self.input_shape[0] * hw_scale), self.input_shape[1]
                img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA)
                top = int((self.input_shape[0] - newh) * 0.5)
                img = cv2.copyMakeBorder(img, top, self.input_shape[0] - newh - top, 0, 0, cv2.BORDER_CONSTANT, value=0)
        else:
            img = cv2.resize(srcimg, self.input_shape, interpolation=cv2.INTER_AREA)
        return img, newh, neww, top, left

    def postprocess(self, frame, outs, pad_hw):
        newh, neww, padh, padw = pad_hw
        frameHeight = frame.shape[0]
        frameWidth = frame.shape[1]
        ratioh, ratiow = frameHeight / newh, frameWidth / neww
        classIds = []
        confidences = []
        boxes = []
        for detection in outs:
            scores, classId = detection[4], detection[5]
            if scores > self.confThreshold:  # and detection[4] > self.objThreshold:
                x1 = int((detection[0] - padw) * ratiow)
                y1 = int((detection[1] - padh) * ratioh)
                x2 = int((detection[2] - padw) * ratiow)
                y2 = int((detection[3] - padh) * ratioh)
                classIds.append(classId)
                confidences.append(scores)
                boxes.append([x1, y1, x2, y2])

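        # Perform non maximum suppression to eliminate redundant overlapping boxes with
        # lower confidences. cv2.dnn.NMSBoxes expects boxes as [x, y, w, h].
        nms_boxes = [[x1, y1, x2 - x1, y2 - y1] for x1, y1, x2, y2 in boxes]
        indices = cv2.dnn.NMSBoxes(nms_boxes, confidences, self.confThreshold, self.nmsThreshold)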

        for ind in indices:
            frame = self.drawPred(frame, classIds[ind], confidences[ind], boxes[ind][0], boxes[ind][1], boxes[ind][2], boxes[ind][3])
        return frame, classIds

    def drawPred(self, frame, classId, conf, x1, y1, x2, y2):
        # Draw a bounding box.
        cv2.rectangle(frame, (x1, y1), (x2, y2), (116, 24, 138), thickness=2)

        label = '%.2f' % conf
        text = '%s:%s' % (self.classes[int(classId)], label)

        # Display the label at the top of the bounding box
        labelSize, baseLine = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, 0.75, 1)
        y1 = max(y1, labelSize[1])
        cv2.putText(frame, text, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), thickness=1)
        return frame

    def detect(self, srcimg):
        img, newh, neww, top, left = self.letterBox(srcimg)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = img.astype(np.float32) / 255.0
        blob = np.expand_dims(np.transpose(img, (2, 0, 1)), axis=0)

        t1 = time.time()
        outs = self.net.run(None, {self.net.get_inputs()[0].name: blob})[0]
        cost_time = time.time() - t1
        # print(outs.shape)

        srcimg, classIds = self.postprocess(srcimg, outs, (newh, neww, top, left))
        infer_time = 'Inference Time: ' + str(int(cost_time * 1000)) + 'ms'
        cv2.putText(srcimg, infer_time, (120, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), thickness=2)
        return srcimg, classIds

def test1():
    print("test1")
    time.sleep(2)
    print("gogogo")

def test2():
    print("test2")
    time.sleep(2)
    print("hahaha")
    
active_target = ''
pre_target = ''

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--modelpath', type=str, default='onnx/best.onnx', help="onnx filepath")
    parser.add_argument('--classfile', type=str, default='labels.txt', help="classname filepath")
    parser.add_argument('--confThreshold', default=0.6, type=float, help='class confidence')
    parser.add_argument('--nmsThreshold', default=0.4, type=float, help='nms iou thresh')
    parser.add_argument('--eink', action='store_true')

    args = parser.parse_args()
    
    if (args.eink):
        import concurrent.futures
        from my_epd_func import epd_init, epd_display, epd_clear, epd_sleep
        myepd = epd_init()
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)  # create the thread pool

    detector = yolov5_lite(args.modelpath, args.classfile, confThreshold=args.confThreshold, nmsThreshold=args.nmsThreshold)

    # Detection is toggled with the "s" key; frames come from Picamera2, so no cv2.VideoCapture is needed
    flag_det = False

    getImg = Imget()  # create the Imget instance
    while True:
        frame = getImg.getImg()  # grab a frame through Imget
        if frame is not None:

            if flag_det:
                t1 = time.time()
                frame, classIds = detector.detect(frame.copy())
                t2 = time.time()

                if len(classIds) == 1:
                    pre_target = active_target

                    if detector.classes[int(classIds[0])] == 'apple':
                        active_target = 'apple'
                    elif detector.classes[int(classIds[0])] == 'orange':
                        active_target = 'orange'
                    elif detector.classes[int(classIds[0])] == 'banana':
                        active_target = 'banana'

                if (active_target != pre_target) and args.eink:
                    # Pass the callable and its arguments so the refresh runs inside the pool
                    executor.submit(epd_display, myepd, active_target)
                    pre_target = active_target  # do not resubmit the same target on later frames

                str_FPS = "FPS: %.2f" % (1. / (t2 - t1))

                cv2.putText(frame, str_FPS, (20, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

            cv2.imshow("video", frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        elif key == ord('s'):
            flag_det = not flag_det
            print(flag_det)

    if (args.eink):
        executor.shutdown()  # shut down the thread pool
        epd_clear(myepd)
        epd_sleep(myepd)

    cv2.destroyAllWindows()
    getImg.cam.stop()
    getImg.cam.close()

I added e-ink display code of my own; pass --eink to enable it. The driver part is based on Waveshare's official Raspberry Pi Python examples.
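An e-ink refresh is slow, which is why it goes through a thread pool instead of running inside the detection loop. A minimal sketch of that hand-off: slow_display is a hypothetical stand-in for epd_display, and the sleep just simulates a slow panel.

```python
import concurrent.futures
import time


def slow_display(target):
    """Hypothetical stand-in for a slow e-ink refresh."""
    time.sleep(0.2)
    return "shown " + target


executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)

# submit() takes the callable and its arguments separately.
# Writing executor.submit(slow_display("apple")) would run the refresh
# right here in the calling thread and hand the pool only its return value.
future = executor.submit(slow_display, "apple")

result = future.result()  # blocks until the worker is done
executor.shutdown()
print(result)
```

In the main loop you would not call result(); the point is that submit returns right away and the next frame can be processed while the panel redraws.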

Live Demo

After tuning the confThreshold and nmsThreshold values above, the problem went away.

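In the YOLO (You Only Look Once) object detection algorithm, confThreshold and nmsThreshold are two important parameters for filtering the predictions.

confThreshold (confidence threshold):
- This parameter sets the minimum confidence a detection box needs in order to be kept. The YOLO model outputs a confidence score for every predicted bounding box, expressing how sure the model is that the box contains an object.
- A bounding box whose confidence score is below confThreshold is discarded and not treated as a valid detection.
- A higher confThreshold reduces false positives but may cause some real objects to be missed.
nmsThreshold (non-maximum suppression threshold):
- NMS (Non-Maximum Suppression) is a post-processing technique for the case where the same object is detected several times. When several bounding boxes overlap on one object, NMS keeps the one with the highest confidence score and removes the other overlapping boxes.
- nmsThreshold is the amount of overlap allowed between two bounding boxes, usually measured by IoU (Intersection over Union). If the IoU of two boxes exceeds nmsThreshold, the lower-scoring one is suppressed (removed).
- A lower nmsThreshold means stricter suppression: only boxes that barely overlap, or do not overlap at all, survive. A higher nmsThreshold lets more overlapping boxes through, which can increase redundant detections.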
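To make the two thresholds concrete, here is a plain-Python sketch of greedy NMS. It is an illustration of the idea rather than what cv2.dnn.NMSBoxes does internally; boxes are assumed to be (x1, y1, x2, y2) tuples and the function names are made up.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold, nms_threshold):
    """Return indices of the boxes that survive both thresholds."""
    # Step 1: drop low-confidence boxes, then visit the rest best-first
    order = sorted((i for i, s in enumerate(scores) if s > conf_threshold),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Step 2: keep a box only if it does not overlap a kept box too much
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.3]
print(nms(boxes, scores, conf_threshold=0.6, nms_threshold=0.4))
print(nms(boxes, scores, conf_threshold=0.6, nms_threshold=0.7))
```

The first two boxes overlap with an IoU of 81/119, about 0.68. At nms_threshold=0.4 the second one is suppressed; at 0.7 it survives. The last box never gets that far because its score of 0.3 is below the confidence threshold.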