This article presents a worked example of improving OCR results through image processing, with detailed steps and full source code.
Background
In many situations text recognition runs into trouble: non-uniform backgrounds, noise, partially missing characters, and so on.
Here we want to recognize the black text "12-14" in the image, but the background is cluttered and there is other interference. Feeding the image directly into Tesseract (code below) returns an empty result.
# -*- coding:utf-8 -*-
import pytesseract
from PIL import Image
# Open the image
image = Image.open('0.png')
# Run OCR (lang defaults to English)
text = pytesseract.image_to_string(image)
# Print the recognized text
print(text)
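Note: pytesseract is only a wrapper; the Tesseract engine itself must be installed separately. If the tesseract binary is not on your PATH (a common situation on Windows), you can point pytesseract at it explicitly. The path below is only an illustrative assumption; adjust it to your own installation.
import pytesseract
# Hypothetical install location -- change this to wherever tesseract.exe actually lives
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'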
For text in a cluttered scene like this, direct recognition is very likely to fail. This suggests a question: can we first use image processing to segment or highlight the part we need, and only then run recognition? The rest of this article demonstrates exactly that, using this image as the example.
**Detailed Implementation Steps**
【1】Otsu binarization
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)
【2】Distance transform + normalization
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)
【3】Otsu binarization of the distance-transform result
_,dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)
【4】Morphological opening to remove noise
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)
【5】Contour filtering to locate the text regions
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# loop over the contours
for c in cnts:
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    if w >= 35 and h >= 100:
        chars.append(c)
cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("chars", black_img)
【6】Convex hull of the contours to obtain the text-region mask
# Stack all character contour points and compute their convex hull
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)
【7】Text recognition with Tesseract
text = pytesseract.image_to_string(final)
# Print the recognized text
print(text)
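If the plain call still struggles, Tesseract's page segmentation mode and a character whitelist are worth trying for a short string such as "12-14". The configuration below is a suggestion, not something used in the original article:
# --psm 7 treats the image as a single text line; the whitelist limits output to digits and '-'
custom_config = r'--psm 7 -c tessedit_char_whitelist=0123456789-'
text = pytesseract.image_to_string(final, config=custom_config)
print(text)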
【8】Complete code:
# WeChat Official Account: OpenCV与AI深度学习
import cv2
import numpy as np
import imutils
import pytesseract
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)
_,dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
for c in cnts:
    (x, y, w, h) = cv2.boundingRect(c)
    if w >= 35 and h >= 100:
        chars.append(c)
cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("chars", black_img)
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)
text = pytesseract.image_to_string(final)
print(text)
cv2.waitKey()
cv2.destroyAllWindows()
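If the script is run in a headless environment where cv2.imshow windows cannot be displayed, a simple substitute (a small adaptation, not from the original) is to write the intermediate results to disk instead:
cv2.imwrite('otsu.png', thresh)
cv2.imwrite('opening.png', opening)
cv2.imwrite('final.png', final)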
**References**
(1) https://pyimagesearch.com/2021/11/22/improving-ocr-results-with-basic-image-processing/
(2) https://stackoverflow.com/questions/33881175/remove-background-noise-from-image-to-make-text-more-clear-for-ocr