Workflows for OCR¶
Below are example workflows you can use as inspiration when building your own apps.
Workflow with DocTR model¶
This example showcases a fairly sophisticated Workflows usage scenario that assumes the following:
- we have a generic object detection model capable of recognising cars
- we have a specialised object detection model trained to detect license plates in images depicting a single car only
- we have a generic OCR model capable of recognising lines of text in images
Our goal is to read the license plate of every car detected in the picture, and the workflow in this example achieves exactly that. In the definition, the generic object detection model is applied first. To make the job easier for the secondary (plate detection) model, we enlarge the bounding boxes by slightly offsetting their dimensions with the Detections Offset block, then crop the image so that license plate detection can run on every detected car instance separately (increasing the depth of the batch). Once the secondary model has produced bounding boxes for license plates, we crop the previously cropped car images again to extract the plates. Finally, the plate crops are passed to the OCR step, which turns images of plates into text.
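For intuition, here is a minimal sketch of what the offset operation amounts to. This is our illustration of the concept, not the block's actual implementation; in particular, the assumption that offset_width/offset_height are split evenly across both sides of the box is ours.

def offset_detection(xyxy, offset_width, offset_height, image_width, image_height):
    # Expand a bounding box by the given offsets, clipped to image bounds.
    # Assumes the offsets are total pixels added to the box dimensions,
    # split evenly between both sides (our assumption, for illustration).
    x_min, y_min, x_max, y_max = xyxy
    return (
        max(0, x_min - offset_width // 2),
        max(0, y_min - offset_height // 2),
        min(image_width, x_max + offset_width // 2),
        min(image_height, y_max + offset_height // 2),
    )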
Workflow definition
{
  "version": "1.0",
  "inputs": [
    {
      "type": "WorkflowImage",
      "name": "image"
    }
  ],
  "steps": [
    {
      "type": "RoboflowObjectDetectionModel",
      "name": "detection",
      "image": "$inputs.image",
      "model_id": "yolov8n-640",
      "class_filter": [
        "car"
      ]
    },
    {
      "type": "DetectionOffset",
      "name": "offset",
      "predictions": "$steps.detection.predictions",
      "image_metadata": "$steps.detection.image",
      "prediction_type": "$steps.detection.prediction_type",
      "offset_width": 10,
      "offset_height": 10
    },
    {
      "type": "Crop",
      "name": "cars_crops",
      "image": "$inputs.image",
      "predictions": "$steps.offset.predictions"
    },
    {
      "type": "RoboflowObjectDetectionModel",
      "name": "plates_detection",
      "image": "$steps.cars_crops.crops",
      "model_id": "vehicle-registration-plates-trudk/2"
    },
    {
      "type": "DetectionOffset",
      "name": "plates_offset",
      "predictions": "$steps.plates_detection.predictions",
      "image_metadata": "$steps.plates_detection.image",
      "prediction_type": "$steps.plates_detection.prediction_type",
      "offset_width": 50,
      "offset_height": 50
    },
    {
      "type": "Crop",
      "name": "plates_crops",
      "image": "$steps.cars_crops.crops",
      "predictions": "$steps.plates_offset.predictions"
    },
    {
      "type": "OCRModel",
      "name": "ocr",
      "image": "$steps.plates_crops.crops"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "cars_crops",
      "selector": "$steps.cars_crops.crops"
    },
    {
      "type": "JsonField",
      "name": "plates_crops",
      "selector": "$steps.plates_crops.crops"
    },
    {
      "type": "JsonField",
      "name": "plates_ocr",
      "selector": "$steps.ocr.result"
    }
  ]
}
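The definition above can be executed with the Roboflow inference SDK. Below is a minimal sketch, assuming inference-sdk is installed, an inference server is reachable at the given URL, and your SDK version supports passing an inline specification to run_workflow; the file name, server URL, and image path are placeholders.

import json
from inference_sdk import InferenceHTTPClient

# Load the workflow definition shown above, saved as a JSON file (placeholder name).
with open("workflow.json") as f:
    workflow_definition = json.load(f)

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # placeholder: your inference server
    api_key="<YOUR_ROBOFLOW_API_KEY>",
)
result = client.run_workflow(
    specification=workflow_definition,
    images={"image": "path/to/car_scene.jpg"},  # placeholder image
)
# Results are keyed by the names declared in the "outputs" section.
print(result[0]["plates_ocr"])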
Google Vision OCR¶
In this example, Google Vision OCR is used to extract text from the input image. The example also shows how to combine the structured output of the Google API with visualisation blocks.
Workflow definition
{
  "version": "1.0",
  "inputs": [
    {
      "type": "WorkflowImage",
      "name": "image"
    },
    {
      "type": "WorkflowParameter",
      "name": "api_key"
    }
  ],
  "steps": [
    {
      "type": "roboflow_core/google_vision_ocr@v1",
      "name": "google_vision_ocr",
      "image": "$inputs.image",
      "ocr_type": "text_detection",
      "api_key": "$inputs.api_key"
    },
    {
      "type": "roboflow_core/bounding_box_visualization@v1",
      "name": "bounding_box_visualization",
      "predictions": "$steps.google_vision_ocr.predictions",
      "image": "$inputs.image"
    },
    {
      "type": "roboflow_core/label_visualization@v1",
      "name": "label_visualization",
      "predictions": "$steps.google_vision_ocr.predictions",
      "image": "$steps.bounding_box_visualization.image"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "extracted_text",
      "selector": "$steps.google_vision_ocr.text"
    },
    {
      "type": "JsonField",
      "name": "text_detections",
      "selector": "$steps.google_vision_ocr.predictions"
    },
    {
      "type": "JsonField",
      "name": "text_visualised",
      "selector": "$steps.label_visualization.image"
    }
  ]
}
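Because api_key is declared as a WorkflowParameter, it is supplied at execution time rather than embedded in the definition. A sketch using the same SDK as above (file names, key values, and paths are placeholders):

import json
from inference_sdk import InferenceHTTPClient

# Load the Google Vision OCR workflow definition shown above (placeholder name).
with open("google_vision_workflow.json") as f:
    workflow_definition = json.load(f)

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # placeholder: your inference server
    api_key="<YOUR_ROBOFLOW_API_KEY>",
)
result = client.run_workflow(
    specification=workflow_definition,
    images={"image": "path/to/document.jpg"},
    parameters={"api_key": "<YOUR_GOOGLE_VISION_API_KEY>"},
)
print(result[0]["extracted_text"])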
Workflow with model detecting individual characters and text stitching (tolerance algorithm)¶
This workflow extracts and organizes text from an image using OCR with the tolerance-based stitching algorithm.
It detects individual characters or words and their positions, then groups nearby text into lines based on a
specified pixel tolerance for spacing and arranges them in reading order (left-to-right).
The tolerance algorithm is best for consistent font sizes and well-aligned horizontal/vertical text.
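For intuition, the tolerance strategy can be pictured as bucketing detections by vertical position: a box joins an existing line when its vertical centre lies within tolerance pixels of that line's centre, and each line is then read left-to-right. A simplified sketch of that idea (our illustration, not the block's actual implementation):

def stitch_by_tolerance(boxes, tolerance=10):
    # boxes: (x_min, y_min, x_max, y_max, character) tuples.
    # A box joins the first line whose centre is within `tolerance`
    # pixels of its own vertical centre; otherwise it starts a new line.
    lines = []  # each entry: [line_centre_y, list_of_boxes]
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        centre = (box[1] + box[3]) / 2
        for line in lines:
            if abs(line[0] - centre) <= tolerance:
                line[1].append(box)
                break
        else:
            lines.append([centre, [box]])
    # Within each line, read characters left-to-right.
    return "\n".join(
        "".join(b[4] for b in sorted(line[1], key=lambda b: b[0]))
        for line in lines
    )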
Workflow definition
{
  "version": "1.0",
  "inputs": [
    {
      "type": "WorkflowImage",
      "name": "image"
    },
    {
      "type": "WorkflowParameter",
      "name": "model_id",
      "default_value": "ocr-oy9a7/1"
    },
    {
      "type": "WorkflowParameter",
      "name": "tolerance",
      "default_value": 10
    },
    {
      "type": "WorkflowParameter",
      "name": "confidence",
      "default_value": 0.4
    }
  ],
  "steps": [
    {
      "type": "roboflow_core/roboflow_object_detection_model@v2",
      "name": "ocr_detection",
      "image": "$inputs.image",
      "model_id": "$inputs.model_id",
      "confidence": "$inputs.confidence"
    },
    {
      "type": "roboflow_core/stitch_ocr_detections@v2",
      "name": "detections_stitch",
      "predictions": "$steps.ocr_detection.predictions",
      "stitching_algorithm": "tolerance",
      "reading_direction": "left_to_right",
      "tolerance": "$inputs.tolerance"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "ocr_text",
      "selector": "$steps.detections_stitch.ocr_text"
    }
  ]
}
Workflow with model detecting individual characters and text stitching (Otsu algorithm)¶
This workflow extracts and organizes text from an image using OCR with the Otsu thresholding algorithm. It detects individual characters and uses Otsu's method on normalized gap distances to automatically find the optimal threshold separating character gaps from word gaps.
The Otsu algorithm is resolution-invariant and works well with variable font sizes and automatic word boundary detection. It detects bimodal distributions to distinguish single words from multi-word text.
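To make the gap analysis concrete, Otsu's criterion over a one-dimensional distribution of horizontal gaps looks roughly like this: histogram the (normalized) gap sizes, then pick the threshold that maximizes between-class variance; gaps above it become word boundaries. A hedged sketch of the criterion only (the binning choice is our assumption, and this is not the block's actual code):

import numpy as np

def otsu_gap_threshold(gaps, bins=64):
    # Pick the threshold over normalized gap sizes that maximizes
    # between-class variance (Otsu's criterion). Gaps above the threshold
    # are treated as word boundaries; below it, as intra-word spacing.
    hist, edges = np.histogram(gaps, bins=bins)
    p = hist / hist.sum()
    centres = (edges[:-1] + edges[1:]) / 2
    best_threshold, best_variance = edges[0], -1.0
    for i in range(1, bins):
        w0, w1 = p[:i].sum(), p[i:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:i] * centres[:i]).sum() / w0
        mu1 = (p[i:] * centres[i:]).sum() / w1
        variance = w0 * w1 * (mu0 - mu1) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, edges[i]
    return best_threshold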
Workflow definition
{
  "version": "1.0",
  "inputs": [
    {
      "type": "WorkflowImage",
      "name": "image"
    },
    {
      "type": "WorkflowParameter",
      "name": "model_id",
      "default_value": "ocr-oy9a7/1"
    },
    {
      "type": "WorkflowParameter",
      "name": "confidence",
      "default_value": 0.4
    },
    {
      "type": "WorkflowParameter",
      "name": "threshold_multiplier",
      "default_value": 1.0
    }
  ],
  "steps": [
    {
      "type": "roboflow_core/roboflow_object_detection_model@v2",
      "name": "ocr_detection",
      "image": "$inputs.image",
      "model_id": "$inputs.model_id",
      "confidence": "$inputs.confidence"
    },
    {
      "type": "roboflow_core/stitch_ocr_detections@v2",
      "name": "detections_stitch",
      "predictions": "$steps.ocr_detection.predictions",
      "stitching_algorithm": "otsu",
      "reading_direction": "left_to_right",
      "otsu_threshold_multiplier": "$inputs.threshold_multiplier"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "ocr_text",
      "selector": "$steps.detections_stitch.ocr_text"
    }
  ]
}
Workflow with model detecting individual characters and text stitching (collimate algorithm)¶
This workflow extracts and organizes text from an image using OCR with the collimate algorithm. It detects individual characters and uses greedy parent-child traversal to follow text flow, building lines through traversal rather than bucketing.
The collimate algorithm is best for skewed, curved, or non-axis-aligned text where traditional bucket-based line grouping may fail.
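Our loose reading of the greedy parent-child traversal: start a line at the leftmost unvisited character and repeatedly hop to the nearest remaining box to its right whose vertical centre stays within a tolerance of the current box (not of a fixed global line position), so the line can drift along skewed or curved text. A simplified illustration of that idea, not the block's actual implementation:

def collimate_lines(boxes, tolerance=10):
    # boxes: (x_min, y_min, x_max, y_max) tuples. Each line grows by
    # greedy hops: the next box must sit to the right of the current one
    # with a vertical centre within `tolerance` of the *current* box, so
    # the tolerance tracks the line's moving tip rather than a bucket.
    remaining = sorted(boxes, key=lambda b: b[0])
    lines = []
    while remaining:
        current = remaining.pop(0)  # leftmost unvisited box starts a line
        line = [current]
        while True:
            centre_y = (current[1] + current[3]) / 2
            candidates = [
                b for b in remaining
                if b[0] >= current[0]
                and abs((b[1] + b[3]) / 2 - centre_y) <= tolerance
            ]
            if not candidates:
                break
            current = min(candidates, key=lambda b: b[0])
            remaining.remove(current)
            line.append(current)
        lines.append(line)
    return lines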
Workflow definition
{
  "version": "1.0",
  "inputs": [
    {
      "type": "WorkflowImage",
      "name": "image"
    },
    {
      "type": "WorkflowParameter",
      "name": "model_id",
      "default_value": "ocr-oy9a7/1"
    },
    {
      "type": "WorkflowParameter",
      "name": "confidence",
      "default_value": 0.4
    },
    {
      "type": "WorkflowParameter",
      "name": "collimate_tolerance",
      "default_value": 10
    }
  ],
  "steps": [
    {
      "type": "roboflow_core/roboflow_object_detection_model@v2",
      "name": "ocr_detection",
      "image": "$inputs.image",
      "model_id": "$inputs.model_id",
      "confidence": "$inputs.confidence"
    },
    {
      "type": "roboflow_core/stitch_ocr_detections@v2",
      "name": "detections_stitch",
      "predictions": "$steps.ocr_detection.predictions",
      "stitching_algorithm": "collimate",
      "reading_direction": "left_to_right",
      "collimate_tolerance": "$inputs.collimate_tolerance"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "ocr_text",
      "selector": "$steps.detections_stitch.ocr_text"
    }
  ]
}