VLM as Detector

v2

Class: VLMAsDetectorBlockV2

Source: inference.core.workflows.core_steps.formatters.vlm_as_detector.v2.VLMAsDetectorBlockV2

Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning

The block expects string input produced by blocks that expose Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into an object-detection prediction and returned as the block's output.

Accepted formats:

  • valid JSON strings

  • JSON documents wrapped with Markdown tags

Example

```json
{"my": "json"}
```

Details regarding block behavior:

  • error_status is set to True whenever parsing cannot be completed

  • if the input contains multiple Markdown blocks with raw JSON content, only the first one is parsed
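
To make this behavior concrete, here is a minimal sketch of the kind of parsing the block performs. It is an illustration under the assumptions stated in the comments, not the block's actual implementation, and the helper name is hypothetical:

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence delimiter
JSON_MARKDOWN_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_vlm_output(vlm_output: str):
    """Hypothetical helper returning (error_status, parsed_document)."""
    blocks = JSON_MARKDOWN_BLOCK.findall(vlm_output)
    # When Markdown fences are present, only the first block is considered,
    # mirroring the documented behavior above.
    candidate = blocks[0] if blocks else vlm_output
    try:
        return False, json.loads(candidate)
    except json.JSONDecodeError:
        return True, None


# Both accepted formats parse to the same document:
fenced = FENCE + "json\n" + '{"my": "json"}\n' + FENCE
assert parse_vlm_output('{"my": "json"}') == (False, {"my": "json"})
assert parse_vlm_output(fenced) == (False, {"my": "json"})
```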

Type identifier

Use the following identifier in the step "type" field: roboflow_core/vlm_as_detector@v2 to add the block as a step in your workflow.

Properties

| Name | Type | Description | Refs |
|------|------|-------------|------|
| name | str | Enter a unique identifier for this step. | |
| classes | List[str] | List of all classes used by the model, required to generate the mapping between class name and class id. | ✅ |
| model_type | str | Type of the model that generated the prediction. | |
| task_type | str | Task type to be performed by the model. | |

The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
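
Since classes exists only to build the class-name to class-id mapping, a minimal sketch of what such a mapping presumably looks like follows; it is illustrative only, and the assumption that ids follow list order is ours, not confirmed by the source:

```python
# Illustrative only: class ids are assumed to follow list order.
classes = ["class_a", "class_b"]
class_name_to_id = {name: idx for idx, name in enumerate(classes)}
# -> {"class_a": 0, "class_b": 1}
```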

Available Connections

Compatible Blocks

Check what blocks you can connect to VLM as Detector in version v2.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds VLM as Detector in version v2 has.

Bindings
  • input

    • image (image): The image which was the base to generate the VLM prediction.
    • vlm_output (language_model_output): The string with the raw prediction to parse.
    • classes (list_of_values): List of all classes used by the model, required to generate the mapping between class name and class id.
  • output

    • error_status (boolean): Flag set to True whenever parsing cannot be completed.
    • predictions (object_detection_prediction): The object-detection prediction parsed from the VLM output.

Example JSON definition of step VLM as Detector in version v2:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_detector@v2",
    "image": "$inputs.image",
    "vlm_output": "$steps.lmm.output",
    "classes": "$inputs.classes",
    "model_type": "google-gemini",
    "task_type": "<block_does_not_provide_example>"
}
```

Here, model_type accepts values such as google-gemini, anthropic-claude, or florence-2, and classes may be given as a literal list (e.g. ["class_a", "class_b"]), a workflow input such as $inputs.classes, or a step output such as $steps.lmm.classes.
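
For context, a hedged sketch of how a deployed workflow containing this step might be executed with the inference_sdk client. The workspace name, workflow id, and image path are placeholders, and the workflow is assumed to declare image and classes inputs wired into the step as in the example above:

```python
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",  # or a self-hosted inference server
    api_key="<your_api_key>",               # placeholder
)

result = client.run_workflow(
    workspace_name="<your_workspace>",      # placeholder
    workflow_id="<your_workflow_id>",       # placeholder
    images={"image": "path/to/image.jpg"},  # fills $inputs.image
    parameters={"classes": ["class_a", "class_b"]},  # fills $inputs.classes
)
```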

v1

Class: VLMAsDetectorBlockV1

Source: inference.core.workflows.core_steps.formatters.vlm_as_detector.v1.VLMAsDetectorBlockV1

Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning

The block expects string input produced by blocks that expose Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into an object-detection prediction and returned as the block's output.

Accepted formats:

  • valid JSON strings

  • JSON documents wrapped with Markdown tags

Example

```json
{"my": "json"}
```

Details regarding block behavior:

  • error_status is set to True whenever parsing cannot be completed

  • if the input contains multiple Markdown blocks with raw JSON content, only the first one is parsed

Type identifier

Use the following identifier in the step "type" field: roboflow_core/vlm_as_detector@v1 to add the block as a step in your workflow.

Properties

| Name | Type | Description | Refs |
|------|------|-------------|------|
| name | str | Enter a unique identifier for this step. | |
| classes | List[str] | List of all classes used by the model, required to generate the mapping between class name and class id. | ✅ |
| model_type | str | Type of the model that generated the prediction. | |
| task_type | str | Task type to be performed by the model. | |

The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.

Available Connections

Compatible Blocks

Check what blocks you can connect to VLM as Detector in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds VLM as Detector in version v1 has.

Bindings
  • input

    • image (image): The image which was the base to generate the VLM prediction.
    • vlm_output (language_model_output): The string with the raw prediction to parse.
    • classes (list_of_values): List of all classes used by the model, required to generate the mapping between class name and class id.
  • output

    • error_status (boolean): Flag set to True whenever parsing cannot be completed.
    • predictions (object_detection_prediction): The object-detection prediction parsed from the VLM output.

Example JSON definition of step VLM as Detector in version v1:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_detector@v1",
    "image": "$inputs.image",
    "vlm_output": "$steps.lmm.output",
    "classes": "$inputs.classes",
    "model_type": "google-gemini",
    "task_type": "<block_does_not_provide_example>"
}
```

Here, model_type accepts values such as google-gemini, anthropic-claude, or florence-2, and classes may be given as a literal list (e.g. ["class_a", "class_b"]), a workflow input such as $inputs.classes, or a step output such as $steps.lmm.classes.