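Machine Vision Basics: Getting Started with PaddleOCR

1. Installing Python

Downloads from the official site are slow, so you can install Python 3.8 or 3.9 from the npmmirror (formerly Taobao) mirror at https://registry.npmmirror.com/binary.html?path=python/. On Windows, do not install 3.10 or later: the opencv dependency of paddleocr fails to install there.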
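Before installing anything, it can help to confirm that the interpreter on your PATH is one of the versions recommended above. The following is a minimal sketch; the function name `python_version_supported` is a hypothetical helper for illustration, and the accepted versions simply mirror the 3.8/3.9 recommendation in this article.

```python
import sys


def python_version_supported(version_info=None):
    """Return True when the version is Python 3.8 or 3.9.

    The accepted range follows this article's recommendation; it is not
    an official PaddleOCR compatibility statement.
    """
    info = sys.version_info if version_info is None else version_info
    return info[0] == 3 and info[1] in (8, 9)


# Report on the interpreter that is running this script
print("Python", sys.version.split()[0],
      "supported:", python_version_supported())
```

Run it with the same `python` command you intend to use for `pip install`, so that the check covers the interpreter that will receive the packages.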

2. Installing dependencies

pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

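# You may see the following prompt:
# Microsoft Visual C++ 14.0 is required.
# Get it with "Build Tools for Visual Studio":
# https://visualstudio.microsoft.com/downloads/
# Always specify the version; otherwise the 1.x release is installed by default, and its recognition accuracy is lower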
pip install "paddleocr>=2.0.1" -i https://mirror.baidu.com/pypi/simple

# If the paddleocr command reports a protobuf version error, downgrade protobuf to 3.20.0
pip uninstall protobuf
pip install protobuf==3.20.0 -i https://mirror.baidu.com/pypi/simple
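After the installs above, you may want to confirm which versions actually ended up in the environment, especially for protobuf. This is a small sketch using the standard library's `importlib.metadata` (available from Python 3.8); `installed_version` and `report` are hypothetical helper names, and the package list is taken from the commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("paddlepaddle", "paddleocr", "protobuf")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


for name, version in report().items():
    print(name, version or "not installed")
```

If protobuf is listed with a version other than 3.20.0 and the paddleocr command fails with a protobuf error, the downgrade shown above is the fix this article suggests.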

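3. Test example

By default, the models are downloaded automatically into the user directory, and the default is the 3.0 (PP-OCRv3) model set. You can also download the models manually and point PaddleOCR at the model directories.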

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False)

# ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False,
#                 rec_model_dir='./model/v3.0/ch_PP-OCRv3_rec_infer/',
#                 cls_model_dir='./model/v3.0/ch_ppocr_mobile_v2.0_cls_infer/',
#                 det_model_dir='./model/v3.0/ch_PP-OCRv3_det_infer/')

img_path = './img/1.png'
result = ocr.ocr(img_path, det=True)
for line in result:
    print(line)
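Each printed line pairs a bounding box with a `(text, confidence)` tuple. The nesting of the result has varied between paddleocr releases: some return the lines directly, others wrap them in one list per image. The following sketch, which assumes only those two layouts, flattens either one into plain `(text, confidence)` pairs; `extract_text` is a hypothetical helper, not part of the paddleocr API, and the sample data is made up.

```python
def extract_text(result):
    """Collect (text, confidence) pairs from an ocr() result.

    Assumes each recognized line looks like [box, (text, confidence)],
    either directly in `result` or nested one level per image.
    """
    pairs = []
    for item in result or []:
        # Skip empty entries and anything that is not a sequence
        if not isinstance(item, (list, tuple)):
            continue
        is_line = (
            len(item) == 2
            and isinstance(item[1], (list, tuple))
            and len(item[1]) == 2
            and isinstance(item[1][0], str)
        )
        if is_line:
            pairs.append((item[1][0], float(item[1][1])))
        else:
            # Treat the item as a per-image list of lines
            pairs.extend(extract_text(item))
    return pairs


# Made-up sample data in both layouts
box = [[0, 0], [10, 0], [10, 5], [0, 5]]
flat = [[box, ("hello", 0.99)], [box, ("world", 0.95)]]
nested = [flat]
print(extract_text(flat))
print(extract_text(nested))
```

Both calls produce the same list of pairs, so downstream code does not need to know which layout the installed release uses.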

4. Simple HTTP server and client code

Install the server-side dependency

pip install flask

import base64

from flask import Flask, request, render_template, jsonify
from paddleocr import PaddleOCR
import numpy as np
import cv2

app = Flask(__name__)
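# Maximum request body size: 50 MB
app.config['MAX_CONTENT_LENGTH'] = 50 * 1000 * 1000
# Return non-ASCII (e.g. Chinese) text unescaped in JSON responses.
# Flask 2.3 and later no longer read this key; there, set app.json.ensure_ascii = False instead.
app.config['JSON_AS_ASCII'] = False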
iocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False,
                 rec_model_dir='./model/v3.0/ch_PP-OCRv3_rec_infer/',
                 cls_model_dir='./model/v3.0/ch_ppocr_mobile_v2.0_cls_infer/',
                 det_model_dir='./model/v3.0/ch_PP-OCRv3_det_infer/')

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/ocr', methods=['POST'])
def ocr():
    image_bytes = None
    # Multipart file upload
    if 'image' in request.files:
        image_bytes = request.files['image'].read()
    else:
        # JSON body carrying the image as a base64 string;
        # silent=True yields None instead of raising when the body is not JSON
        payload = request.get_json(silent=True) or {}
        if 'image' in payload:
            image_bytes = base64.b64decode(payload['image'])
    # Neither form supplied an image
    if image_bytes is None:
        return jsonify({
            'code': 400,
            'msg': 'Bad request'
        })

    try:
        # np.frombuffer replaces the deprecated np.fromstring for binary input
        image_ndarray = np.frombuffer(image_bytes, np.uint8)
        image = cv2.imdecode(image_ndarray, cv2.IMREAD_COLOR)
        result = iocr.ocr(image)
        return jsonify({
            'code': 200,
            'data': [line[1] for line in result],
            'msg': 'Success'
        })
    except Exception as exp:
        # Catch Exception rather than BaseException so Ctrl+C still stops the server
        print("OCR failed", exp)
        return jsonify({
            'code': 500,
            'msg': 'OCR failed'
        })

if __name__ == '__main__':
    app.run(port=6666, debug=True, threaded=True)

Note: do not use this code in production. Flask's built-in server prints the following warning at startup:

Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
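One practical consequence of the 50 MB request limit configured in the server: when the image travels as base64 inside JSON, the encoded text is about one third larger than the raw file. The sketch below works through that arithmetic; `base64_length` and `max_raw_bytes` are hypothetical helper names, and the calculation ignores the few bytes taken by the JSON wrapper itself.

```python
import base64


def base64_length(raw_bytes):
    """Length of the padded base64 encoding of raw_bytes bytes: 4 * ceil(n / 3)."""
    return 4 * ((raw_bytes + 2) // 3)


def max_raw_bytes(limit):
    """Largest raw payload whose base64 encoding fits within limit bytes."""
    return (limit // 4) * 3


# Cross-check the formula against the real encoder on a small input
assert len(base64.b64encode(b"x" * 10)) == base64_length(10)

limit = 50 * 1000 * 1000  # same value as MAX_CONTENT_LENGTH in the server code
print("largest image that fits as base64:", max_raw_bytes(limit), "bytes")
```

Images close to that size are better sent with the multipart file upload, which carries the bytes without the base64 overhead.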

Client code

import base64
import os
import requests
import time

OCR_URL = "http://127.0.0.1:6666/ocr"

def ocr_file_request():
    # Upload the image as a multipart form field named 'image'
    root = os.path.join(os.getcwd(), 'img')
    name = "1.png"
    start_time = time.time()
    file = os.path.join(root, name)
    headers = {'File-Name': name}

    # The with block closes the file even if the request raises
    with open(file, 'rb') as file_ptr:
        files_t = {'image': (name, file_ptr)}
        r = requests.post(OCR_URL, files=files_t, headers=headers)
    end_time = time.time()
    print("ocr_request took {:.2f} s".format(end_time - start_time))
    print(r.text)

def ocr_json_request():
    # Send the image as a base64 string inside a JSON body
    img_path = os.path.join(os.getcwd(), 'img', '7.png')
    with open(img_path, "rb") as img_file:
        img_byte_obj = base64.b64encode(img_file.read())

    data = {"image": img_byte_obj.decode()}
    headers = {'Content-Type': 'application/json;charset=UTF-8'}  # request headers
    start_time = time.time()
    r = requests.post(OCR_URL, json=data, headers=headers)
    end_time = time.time()
    print("ocr_json_request took {:.2f} s".format(end_time - start_time))
    print(r.text)


ocr_json_request()
# ocr_file_request()
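The client above only prints the raw response text. A caller will usually want to check the `code` field and pull the recognized text out of `data`. The following sketch assumes the response shape produced by the server code in this article (`code`, `data`, `msg`) and assumes that each entry of `data` is a `[text, confidence]` pair; `parse_ocr_response` is a hypothetical helper and the sample body is made up.

```python
import json


def parse_ocr_response(body):
    """Return (text, confidence) pairs from the server's JSON response.

    Raises RuntimeError when the server reports a non-200 code.
    """
    payload = json.loads(body)
    if payload.get("code") != 200:
        raise RuntimeError("OCR request failed: {}".format(payload.get("msg")))
    # JSON has no tuples, so each pair arrives as a two-element list
    return [(entry[0], float(entry[1])) for entry in payload.get("data", [])]


# Made-up response body in the assumed shape
sample = json.dumps({"code": 200, "data": [["hello", 0.98]], "msg": "ok"})
print(parse_ocr_response(sample))
```

In the client functions, `parse_ocr_response(r.text)` could replace the final `print(r.text)` once the response shape has been confirmed against your installed paddleocr version.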

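5. Summary

For general-purpose text recognition, PaddleOCR is somewhat better than alternatives such as chinese_lite_ocr in both accuracy and resource usage. For specific scenarios such as ID documents or instrument readings, you need to train a dedicated model, using PPOCRLabel to annotate a sufficient number of images. To be continued.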
