I need to recognize captchas from a certain site.

I think this captcha is simple to recognize, because it contains only digits, which are not distorted but can be slightly rotated.

Original grayscale image.

The problem is that I do not know how to pre-process the image before recognition.

For example, how do I convert the image to binary form (black background with white characters, or vice versa)?

I use OpenCV.

Sample source image

    1 answer

    It is not hard to get rid of the background and leave only the digits. Do this:

    • Canny
    • findContours
    • loop over the contours and discard those that touch the image edges, are too wide, or contain vertical or horizontal straight lines.
    • the remaining contours will be the outlines of the characters with minor noise; you can, for example, draw them in black on white (drawContours), apply erode, and send the result to OCR

    Filtering the contours will be the hardest part. The criteria above are only a rough sketch; in practice they will need refining, and you may even have to discard fragments of contours rather than whole contours. There is plenty of room for creativity here.
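
    As one possible starting point for the straight-line criterion, the check below looks for a long vertical or horizontal segment in a single contour. It is a heuristic sketch, not a definitive filter: the name has_long_axis_aligned_run and the min_len default are hypothetical, and it assumes the contour came from findContours with CHAIN_APPROX_SIMPLE, which compresses straight runs down to their end points.

```python
import numpy as np


def has_long_axis_aligned_run(contour, min_len=20):
    """True if the contour has a vertical or horizontal segment of
    at least min_len pixels between two consecutive points.

    Assumes an OpenCV-style contour of shape (N, 1, 2) produced with
    CHAIN_APPROX_SIMPLE. min_len is an illustrative threshold.
    """
    pts = np.asarray(contour).reshape(-1, 2).astype(np.int64)
    # Pair each point with the next one, wrapping around so the
    # closing segment of the contour is checked as well.
    delta = np.roll(pts, -1, axis=0) - pts
    dx = np.abs(delta[:, 0])
    dy = np.abs(delta[:, 1])
    vertical = (dx == 0) & (dy >= min_len)
    horizontal = (dy == 0) & (dx >= min_len)
    return bool(np.any(vertical | horizontal))
```

    Contours for which this returns True would be dropped in the filtering loop. Since the digits may contain straight strokes of their own, min_len probably needs to be longer than the tallest digit stroke.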

    • Thanks for the answer. I will try it. - rekrut