
opencv – Using Python to extract timestamp text from video frames


I have a series of videos that contain a timestamp in a specific ROI, which I have already located. The text varies between black and white, and both can appear in the same timestamp. I am struggling to get Tesseract to extract the text from some videos. My processing is as follows:

import re

import cv2
import pytesseract

# roi_frame is the timestamp region cropped from the current video frame
gray_roi_frame = cv2.cvtColor(roi_frame, cv2.COLOR_BGR2GRAY)

# Gaussian blur to reduce noise before edge detection
blurred = cv2.GaussianBlur(gray_roi_frame, (5, 5), 0)

# Canny edge detection
edges = cv2.Canny(blurred, 50, 150, apertureSize=7)

# Display the grayscale image and the edge map
cv2.imshow('Original Image', gray_roi_frame)
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

# OCR on the edge map
extracted_text = pytesseract.image_to_string(edges, lang='eng', config='--psm 6')

print("Extracted Text:", extracted_text)

# Regex for the expected DD-MM-YYYY <weekday> HH:MM:SS layout
date_time_match = re.search(r'(\d{2}-\d{2}-\d{4} \w+ \d{2}:\d{2}:\d{2})', extracted_text)
if date_time_match:
    extracted_date_time = date_time_match.group(1)
    print("Extracted Date and Time:", extracted_date_time)
else:
    print("Date and time not found in the extracted text.")

# Release the video capture object
cap.release()
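As a sanity check on the pattern itself, independent of OCR quality, the regex can be exercised against a hand-written string. The sample timestamp below is made up purely to match the DD-MM-YYYY weekday HH:MM:SS layout the pattern expects:

```python
import re

# Hypothetical OCR output containing a timestamp in the expected layout
sample = "CAM1 03-11-2023 Fri 14:22:05 REC"

pattern = r'(\d{2}-\d{2}-\d{4} \w+ \d{2}:\d{2}:\d{2})'
match = re.search(pattern, sample)
print(match.group(1) if match else "no match")  # → 03-11-2023 Fri 14:22:05
```

If this prints the expected substring, the regex is fine and the failure is in the OCR stage, not the parsing stage.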

Greyscale and Canny edge images:
Example grab from clip A
Example grab from clip B (this one was easy to extract)
Output is essentially garbage:

Extracted Text: i ore Wei) is O25
oe aie oF ew ales (Oe)

What else could I do to allow OCR to obtain the text accurately given black/white text on a variety of backgrounds?

I have tried Canny edge detection, Sobel edge detection, and adaptive thresholding. The current iteration is the visually clearest I've gotten, but OCR still fails to produce the correct output.

