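I am using the algorithm below to segment a sentence into words and the words into characters. As you can see in the output below, the letters "S" and "T" in the word "STAND" are joined into a single box. I can't figure out what I'm doing wrong and would be glad of any help.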

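I have trained a model on the EMNIST Letters dataset. The model can only predict one letter at a time, so to go further I need to extract each character box into an array of character images. Ultimately, my goal is to have one array containing all the character images, and then use the model to predict each character individually.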

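I also need to resize each character to 28 x 28 pixels, because the model was trained to predict letters from images of that size. I'm having trouble with this step and hope you can help.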

import cv2



# Preprocessing

def preProcessing(myImage):
    grayImg = cv2.cvtColor(myImage, cv2.COLOR_BGR2GRAY)
    # cv2.imshow('Gray Image', grayImg)
    # cv2.waitKey()

    ret, thresh1 = cv2.threshold(grayImg, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
    # cv2.imshow('After threshold', thresh1)
    # cv2.waitKey()

    print(f'The threshold value applied to the image is: {ret}')
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))
    dilation = cv2.dilate(thresh1, horizontal_kernel, iterations=1)
    horizontal_contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    im2 = myImage.copy()

    for cnt in horizontal_contours:
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(im2, (x, y), (x + w, y + h), (255, 255, 255), 0)
    # cv2.rectangle draws in place, so im2 already holds every box
    im2 = seg_word(im2)
    #im2=character_seg(im2)
    return im2

# Word segmentation
def seg_word(wordImage):
    # convert the input image into gray scale
    grayImg = cv2.cvtColor(wordImage, cv2.COLOR_BGR2GRAY)

    # Binarize the gray image with OTSU algorithm
    ret, thresh2 = cv2.threshold(grayImg, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
    #print(ret)

    # create a Structuring Element size of 8*10 for the vertical contouring
    vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (8, 10))

    # apply Dilation for once only
    dilation = cv2.dilate(thresh2, vertical_kernel, iterations=1)

    # find the vertical contours
    vertical_contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    word_img = wordImage.copy()

    # Run through each contour and extract the bounding box
    for cnt in vertical_contours:
        #computes the minimum rectangle
        x, y, w, h = cv2.boundingRect(cnt)
        # Draw a rectangular from the top left to the bottom right with the
        # given Coordinates x,y and height and width
        cv2.rectangle(word_img, (x, y), (x + w, y + h), (0, 255, 0), 0)
    # apply character segmentation and return the output image
    word_img = character_seg(word_img)
    return word_img

# Character segmentation
def character_seg(img):
    # convert the input image into gray scale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Threshold the image
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    # Apply morphological erosion to remove small artifacts
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,5))
    eroded = cv2.erode(thresh, kernel, iterations=1)

    # Apply morphological dilation to expand the characters
    dilated = cv2.dilate(eroded, kernel, iterations=3)

    # Find contours in the image
    contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Iterate through each contour and extract the bounding box
    for contour in contours:
        (x, y, w, h) = cv2.boundingRect(contour)
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    return img

# Load the test image
image_path = r"C:\Users\student\Desktop\FinalProject\Flask\uploads\1_lWmB8FGf1uWT6r1TichK-Q-ezgif.com-webp-to-png-converter.png"
myImage = cv2.imread(image_path)
# Display the image
cv2.imshow('Text Image', myImage)
cv2.waitKey(0)

processed_img = preProcessing(myImage)
cv2.imshow('Text Image', processed_img)
cv2.waitKey(0)

Recommended answer

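"As you can see in the output below, the letters 'S' and 'T' in the word 'STAND' are joined into a single box. I can't figure out what I'm doing wrong and would be glad of any help."

This problem can be fixed.

On line 73, change iterations=3: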

dilated = cv2.dilate(eroded, kernel, iterations=3)  

to:

dilated = cv2.dilate(eroded, kernel, iterations=1)  # change iterations to 1

