SmolVLM2
Class: SmolVLM2BlockV1
Source: inference.core.workflows.core_steps.models.foundation.smolvlm.v1.SmolVLM2BlockV1
This workflow block runs SmolVLM2, a multimodal vision-language model. You can ask questions about images -- including documents and photos -- and get answers in natural language.
Type identifier
Use the following identifier in the step "type" field: `roboflow_core/smolvlm2@v1` to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `prompt` | `str` | Optional text prompt that gives SmolVLM2 additional context. Defaults to None when omitted. | ❌ |
| `model_version` | `str` | The SmolVLM2 model to be used for inference. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
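The Refs rule can be illustrated with a small check. This is a minimal sketch, not part of the Roboflow API: the helper names `is_selector` and `find_invalid_selectors` are hypothetical, the workflow input `$inputs.model_version` is an assumed name, and treating any string that starts with `$` as a selector is an assumption based on the `$inputs.image` value used in the example on this page.

```python
# Which step properties accept dynamic selectors, per the Refs column and
# the input bindings listed on this page.
ACCEPTS_SELECTOR = {
    "name": False,
    "prompt": False,
    "images": True,
    "model_version": True,
}


def is_selector(value):
    """Assumption: a string starting with '$' is a runtime selector."""
    return isinstance(value, str) and value.startswith("$")


def find_invalid_selectors(step):
    """Return names of properties that use a selector but must be static."""
    return sorted(
        key
        for key, value in step.items()
        if is_selector(value) and not ACCEPTS_SELECTOR.get(key, False)
    )


step = {
    "name": "smolvlm",
    "type": "roboflow_core/smolvlm2@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    # Hypothetical workflow input named "model_version".
    "model_version": "$inputs.model_version",
}

print(find_invalid_selectors(step))
```

With this step the check reports nothing, because only `images` and `model_version` carry selectors. Setting `prompt` to a selector would be flagged, since the table marks it as not parametrisable.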
Available Connections
Compatible Blocks
Check what blocks you can connect to SmolVLM2 in version v1.
- inputs: Instance Segmentation Model, Heatmap Visualization, Background Subtraction, Ellipse Visualization, Line Counter Visualization, Polygon Zone Visualization, Depth Estimation, SIFT Comparison, Camera Focus, QR Code Generator, Image Contours, Absolute Static Crop, Label Visualization, Morphological Transformation, Classification Label Visualization, Stability AI Image Generation, Multi-Label Classification Model, Polygon Visualization, Image Threshold, Text Display, SIFT, Triangle Visualization, Bounding Box Visualization, Blur Visualization, Image Preprocessing, Camera Calibration, Corner Visualization, Dot Visualization, Relative Static Crop, Image Slicer, Model Comparison Visualization, Object Detection Model, Icon Visualization, Contrast Equalization, Keypoint Detection Model, Reference Path Visualization, Pixelate Visualization, Halo Visualization, Background Color Visualization, Mask Visualization, Stability AI Outpainting, Circle Visualization, Halo Visualization, Image Blur, Dynamic Crop, Keypoint Visualization, Single-Label Classification Model, Perspective Correction, Image Convert Grayscale, Grid Visualization, Polygon Visualization, Stitch Images, Camera Focus, Stability AI Inpainting, Color Visualization, Trace Visualization, Crop Visualization, Image Slicer
- outputs: Detections Consensus
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
SmolVLM2 in version v1 has.
Bindings
- input
    - `images` (image): The image to infer on.
    - `model_version` (roboflow_model_id): The SmolVLM2 model to be used for inference.
- output
    - `parsed_output` (dictionary): Dictionary.
Example JSON definition of step SmolVLM2 in version v1
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/smolvlm2@v1",
  "images": "$inputs.image",
  "prompt": "What is in this image?",
  "model_version": "smolvlm2/smolvlm-2.2b-instruct"
}
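To show how the step definition above might sit inside a complete workflow, here is a minimal Python sketch that builds one as a dictionary and prints it as JSON. The step fields come from the example above. The surrounding `version`, `inputs`, and `outputs` keys, the `WorkflowImage` and `JsonField` types, the output name `answer`, and the helper `build_workflow` are assumptions for illustration and are not described on this page.

```python
import json


def build_workflow(step_name, prompt,
                   model_version="smolvlm2/smolvlm-2.2b-instruct"):
    """Build a workflow definition containing a single SmolVLM2 step."""
    step = {
        "name": step_name,
        "type": "roboflow_core/smolvlm2@v1",
        "images": "$inputs.image",
        "prompt": prompt,
        "model_version": model_version,
    }
    return {
        # Assumed wrapper structure; only the step itself is taken
        # from the example on this page.
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [step],
        "outputs": [
            {
                "type": "JsonField",
                "name": "answer",
                # Points at the block's parsed_output binding.
                "selector": f"$steps.{step_name}.parsed_output",
            }
        ],
    }


workflow = build_workflow("smolvlm", "What is in this image?")
print(json.dumps(workflow, indent=2))
```

Building the definition in code keeps the step name and the output selector in sync, since both are derived from the same `step_name` argument.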