Video Processing with OpenCV: Face Detection, Part 1

jf_78858299 2023-02-07 785

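So far we have a modular image processing pipeline that detects faces in a batch of image files and saves them to a separate folder, together with a nicely structured JSON summary file.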

Let's do the same for a video stream.

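First, we need to capture the video stream. This pipeline task produces a sequence of images, frame by frame, from a video file or a webcam. Next, we detect the faces in each frame and save them. The following three blocks are optional; their purpose is to create an annotated output video, for example with boxes drawn around the detected faces. We can display the annotated video and save it. The last task collects information about the detected faces and saves a JSON summary file with the box coordinates and confidence of each face.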
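The flow described above can be sketched with plain Python generators. This is only an illustration of the idea, not the repository's `Pipeline` class: the function names `capture`, `detect_faces`, `summarize`, and `fake_detector` are hypothetical, and strings stand in for real image arrays.

```python
import json


def capture(frames):
    """Yield one data dict per frame (stand-in for reading a video)."""
    for idx, frame in enumerate(frames):
        yield {"image_id": f"{idx:05d}", "image": frame}


def detect_faces(stream, detector):
    """Attach the detector output to each item; `detector` is any callable."""
    for data in stream:
        data["faces"] = detector(data["image"])
        yield data


def summarize(stream):
    """Collect box coordinates and confidences per frame, dropping the pixels."""
    summary = {}
    for data in stream:
        summary[data["image_id"]] = [
            {"box": list(box), "confidence": float(conf)}
            for box, conf in data["faces"]
        ]
    return summary


def fake_detector(image):
    """Hypothetical detector that pretends every frame contains one face."""
    return [((10, 20, 110, 140), 0.9)]


frames = ["frame-a", "frame-b"]  # placeholders for real image arrays
summary = summarize(detect_faces(capture(frames), fake_detector))
print(json.dumps(summary, indent=2))
```

Because every stage is a generator, frames flow through one at a time, so a long video never has to be held in memory as a whole.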

If you have not yet set up the jagin/image-processing-pipeline repository to view the source code and run the examples, you can do so now:

$ git clone https://github.com/jagin/image-processing-pipeline.git
$ cd image-processing-pipeline
$ git checkout 7df1963247caa01b503980fe152138b88df6c526
$ conda env create -f environment.yml
$ conda activate pipeline

If you have already cloned the repository and set up the environment, update it with the following commands:

$ git pull
$ git checkout 7df1963247caa01b503980fe152138b88df6c526
$ conda env update -f environment.yml

Capturing video

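Capturing video with OpenCV is simple. We create a VideoCapture object whose argument is either a device index (a number that selects the camera) or the name of a video file. We can then read the video stream frame by frame.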
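As a minimal sketch of that read loop, independent of the pipeline classes: `read_frames` works with any object that offers `read()` and `release()` the way `cv2.VideoCapture` does. The helper names and the `FakeCapture` class are assumptions made for illustration, so that the loop can be tried without a camera or a video file.

```python
def read_frames(cap, max_frames=None):
    """Yield frames from an object exposing read() -> (ok, frame) and release()."""
    count = 0
    try:
        while max_frames is None or count < max_frames:
            ok, frame = cap.read()
            if not ok:  # end of file, or the device stopped delivering frames
                break
            count += 1
            yield frame
    finally:
        # Always free the file or device once the loop ends
        cap.release()


def open_capture(src=0):
    """Open a webcam index or a file path with OpenCV (imported lazily)."""
    import cv2

    cap = cv2.VideoCapture(src)
    if not cap.isOpened():
        raise IOError(f"Cannot open video {src}")
    return cap


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for this demo."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


fake = FakeCapture(["f0", "f1", "f2"])
frames = list(read_frames(fake))
print(frames, fake.released)
```

With a real source the same loop would be driven by `read_frames(open_capture(0))` for the default webcam, or by passing a file path instead of `0`.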

We can implement the video capture task with the following CaptureVideo class, which extends Pipeline:

import cv2
from pipeline.pipeline import Pipeline


class CaptureVideo(Pipeline):
    def __init__(self, src=0):
        # src is a device index (webcam) or a path to a video file
        self.cap = cv2.VideoCapture(src)
        if not self.cap.isOpened():
            raise IOError(f"Cannot open video {src}")

        # Stream properties, used later for the progress bar and for saving video
        self.fps = int(self.cap.get(cv2.CAP_PROP_FPS))
        self.frame_count = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))

        super(CaptureVideo, self).__init__()

    def generator(self):
        image_idx = 0
        while self.has_next():
            ret, image = self.cap.read()
            if not ret:
                # No frame was grabbed (end of stream or read error)
                break

            data = {
                "image_id": f"{image_idx:05d}",
                "image": image,
            }

            if self.filter(data):
                image_idx += 1
                yield self.map(data)

    def cleanup(self):
        # Close the video file or capturing device
        self.cap.release()

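In __init__ we create the VideoCapture object and read the properties of the video stream, such as the frames per second and the number of frames. We will need them to display a progress bar and to save the video correctly. The image frames are yielded by the generator function as dictionaries with the following structure: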

data = {
    "image_id": f"{image_idx:05d}",
    "image": image,
}

The data contains the sequence number of the image and the raw image data of the frame.
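The two stream properties deserve a little care before they are used. Converting the frame rate with `int()` truncates a fractional rate such as 29.97 to 29, and a live source may not report a meaningful frame count at all. The helper below is a hedged sketch of one way to handle this; `video_stats` is a hypothetical name, and treating zero or negative values as "unknown" is an assumption rather than documented OpenCV behaviour.

```python
def video_stats(fps, frame_count):
    """Return (total_frames, duration_seconds), using None for unknown values."""
    # Assumption: a zero or negative frame count means the length is unknown
    total = int(frame_count) if frame_count and frame_count > 0 else None
    if total is None or not fps or fps <= 0:
        return total, None
    return total, total / fps


print(video_stats(29.97, 300))  # file with a known length
print(video_stats(30.0, -1))    # live source, length unknown
```

A progress bar can then fall back to an open-ended counter whenever the total is `None`.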

Detecting faces

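We are ready to detect faces. This time we will use OpenCV's deep neural network module instead of the Haar cascades I promised in the previous story. The model we are going to use is more accurate and also gives us a confidence score.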

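Since version 3.3, OpenCV has supported models from a number of deep learning frameworks, such as Caffe, TensorFlow, and PyTorch, which lets us load a model, preprocess the input images, and run inference to obtain the output classifications.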

Adrian Rosebrock has written an excellent blog post explaining how to implement face detection with OpenCV and deep learning. We will use part of that code in the FaceDetector class:

import cv2
import numpy as np


class FaceDetector:
    def __init__(self, prototxt, model, confidence=0.5):
        self.confidence = confidence
        self.net = cv2.dnn.readNetFromCaffe(prototxt, model)

    def detect(self, images):
        # Convert the images into a blob
        blob = self.preprocess(images)

        # Pass the blob through the network and obtain the detections and predictions
        self.net.setInput(blob)
        detections = self.net.forward()

        # Prepare storage for the faces of every image in the batch
        faces = {i: [] for i in range(len(images))}

        # Loop over the detections
        for i in range(0, detections.shape[2]):
            # Extract the confidence (i.e., probability) associated with the prediction
            confidence = detections[0, 0, i, 2]

            # Filter out weak detections by ensuring the confidence is
            # at least the minimum confidence
            if confidence < self.confidence:
                continue

            # Grab the image index
            image_idx = int(detections[0, 0, i, 0])
            # Grab the image dimensions
            (h, w) = images[image_idx].shape[:2]
            # Compute the (x, y)-coordinates of the bounding box for the object
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])

            # Add the result
            faces[image_idx].append((box, confidence))

        return faces

    def preprocess(self, images):
        return cv2.dnn.blobFromImages(images, 1.0, (300, 300), (104.0, 177.0, 123.0))
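To make the post-processing in `detect` easier to follow, here is a pure-Python sketch of the same filtering and scaling, run on hand-written rows instead of real network output. It assumes each detection row is laid out as `[image_id, class_id, confidence, x1, y1, x2, y2]` with coordinates normalized to the 0..1 range; indices 0, 2, and 3 to 6 match what the class above reads, while the meaning of index 1 is an assumption. The name `scale_detections` and the rounding to integer pixels are illustrative additions, not part of the original class.

```python
def scale_detections(rows, sizes, min_confidence=0.5):
    """Group detections by image and scale boxes to pixel coordinates.

    rows:  iterable of [image_id, class_id, confidence, x1, y1, x2, y2]
           with coordinates normalized to the 0..1 range (assumed layout).
    sizes: list of (height, width), one entry per image in the batch.
    """
    faces = {i: [] for i in range(len(sizes))}
    for image_id, _class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence < min_confidence:
            continue  # drop weak detections
        idx = int(image_id)
        h, w = sizes[idx]
        # Rounding to whole pixels is an addition made for this sketch
        box = (round(x1 * w), round(y1 * h), round(x2 * w), round(y2 * h))
        faces[idx].append((box, confidence))
    return faces


rows = [
    [0, 1, 0.98, 0.25, 0.10, 0.50, 0.40],
    [0, 1, 0.20, 0.60, 0.60, 0.70, 0.70],  # below the threshold, dropped
    [1, 1, 0.75, 0.00, 0.50, 0.50, 1.00],
]
faces = scale_detections(rows, sizes=[(200, 400), (100, 100)])
print(faces)
```

Each box is scaled with the size of the image it belongs to, which is why the image index has to be read before the coordinates are converted.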
