Optical Character Recognition (OCR) Filter

The FilterOpticalCharacterRecognition is a pluggable filter that extracts text from image frames using Optical Character Recognition (OCR). It supports multiple OCR backends and offers flexible configuration for language support, output, and debug logging.

Features

Dual OCR Engine Support
Choose between two backends with the ocr_engine parameter:
- tesseract
- easyocr
Multi-language OCR
Use the ocr_language option to specify one or more language codes (e.g., ['en', 'fr']).
Output to JSON
Extracted text is written to a newline-delimited JSON file at the path specified by output_json_path.
Debug Mode
Setting debug: true increases logging verbosity for troubleshooting.
Frame-level Skipping
Add the metadata flag skip_ocr: true to individual frames to bypass OCR processing.
Custom Tesseract Path
You can specify a custom tesseract_cmd binary path if using the Tesseract engine (defaults to a bundled AppImage).
Safe Streaming Output
Results are flushed to disk immediately after processing each frame.

Note
Flushing after every frame may lead to heavy I/O. A configurable flushing strategy is planned for future releases.
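The frame-level skipping and per-frame flushing described above can be sketched roughly as follows. This is a minimal illustration, not the filter's actual implementation: the `write_ocr_results` function and the dict-shaped frames (with `frame_id`, `texts`, and an optional `meta` dict) are hypothetical stand-ins, and the OCR step itself is assumed to have already produced `texts`.

```python
import json
from typing import Iterable, TextIO


def write_ocr_results(frames: Iterable[dict], out: TextIO) -> int:
    """Write one JSON line per processed frame, flushing after each one.

    `frames` is a hypothetical stand-in for the filter's frame objects:
    each is a dict with "frame_id", "texts", and an optional "meta" dict.
    Returns the number of records written.
    """
    written = 0
    for frame in frames:
        # Frames flagged with skip_ocr: true are bypassed entirely.
        if frame.get("meta", {}).get("skip_ocr", False):
            continue
        record = {"frame_id": frame["frame_id"], "texts": frame["texts"]}
        out.write(json.dumps(record) + "\n")
        # Flush Python's buffer so this frame's line is handed to the OS right away.
        out.flush()
        written += 1
    return written
```

Flushing per record keeps the output file usable even if the pipeline stops mid-stream, at the cost of one write per frame, which is the I/O trade-off the note above refers to.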

Example Output

Each processed frame produces one JSON line similar to the following (shown here across several lines for readability):

{
  "frame_id": "abc123",
  "texts": ["Detected text line 1", "Detected text line 2"]
}
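Because the output is newline-delimited JSON, downstream code can read it one record at a time. The helper below is a small sketch of that; `iter_ocr_records` is a hypothetical name, and it assumes each non-empty line holds one complete JSON object with the `frame_id` and `texts` fields shown above.

```python
import json
from typing import Iterable, Iterator


def iter_ocr_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)
```

It accepts any iterable of strings, so an open file handle works directly, for example `iter_ocr_records(open(path, encoding="utf-8"))` where `path` is the value of output_json_path.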

When to Use

This filter is ideal for any pipeline that requires reading printed or handwritten text from images, such as:

- Scanned documents
- Signboards or product packaging in photos
- Scene text in videos

Configuration Reference

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ocr_engine` | `string` | `"easyocr"` | OCR engine to use: `"tesseract"` or `"easyocr"` |
| `ocr_language` | `string[]` | `["en"]` | Language codes for OCR |
| `output_json_path` | `string` | `"./output/ocr_results.json"` | Path to save output results |
| `debug` | `boolean` | `false` | Enable debug logging |
| `tesseract_cmd` | `string` | Packaged AppImage path | Path to Tesseract binary |
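As an illustration of how these keys fit together, the sketch below collects the documented defaults in a plain Python dict and applies a few sanity checks. The `build_config` helper and its validation rules are assumptions made for this example; the filter's real configuration loader may behave differently.

```python
import copy

# Defaults as listed in the table above. tesseract_cmd is left unset here
# because its default is the packaged AppImage path, which this sketch
# does not know.
DEFAULTS = {
    "ocr_engine": "easyocr",
    "ocr_language": ["en"],
    "output_json_path": "./output/ocr_results.json",
    "debug": False,
}

ALLOWED_KEYS = set(DEFAULTS) | {"tesseract_cmd"}
ALLOWED_ENGINES = ("tesseract", "easyocr")


def build_config(**overrides) -> dict:
    """Merge overrides into the documented defaults and validate the result."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")

    config = copy.deepcopy(DEFAULTS)
    config.update(overrides)

    if config["ocr_engine"] not in ALLOWED_ENGINES:
        raise ValueError(f"ocr_engine must be one of {ALLOWED_ENGINES}")

    langs = config["ocr_language"]
    if not isinstance(langs, list) or not langs or not all(isinstance(c, str) for c in langs):
        raise ValueError("ocr_language must be a non-empty list of language codes")

    return config
```

For example, `build_config(ocr_engine="tesseract", ocr_language=["en", "fr"])` returns the defaults with those two keys overridden, while an unsupported engine name raises `ValueError`.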
