Moondream2
Class: Moondream2BlockV1
Source: inference.core.workflows.core_steps.models.foundation.moondream2.v1.Moondream2BlockV1
This workflow block runs Moondream2, a multimodal vision-language model. You can use this block to run zero-shot object detection.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/moondream2@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| prompt | str | Optional text prompt to provide additional context to Moondream2. | ✅ |
| model_version | str | The Moondream2 model to be used for inference. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
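As an illustration of the Refs column, the sketch below binds the two parametrisable properties to workflow inputs instead of literal values. The input names `prompt` and `model_version` and the helper `dynamic_properties` are hypothetical, chosen only for this example; the `$inputs.<name>` selector form follows the `$inputs.image` value used in the example step definition on this page.

```python
# A step definition in which the Refs-enabled properties (prompt, model_version)
# are bound to workflow inputs rather than hard-coded.
# Assumption: "prompt" and "model_version" are input names you would declare
# yourself in the workflow; they are not defined by the block.
step = {
    "name": "moondream",  # Refs: no, so this must stay a literal string
    "type": "roboflow_core/moondream2@v1",
    "images": "$inputs.image",
    "prompt": "$inputs.prompt",
    "model_version": "$inputs.model_version",
}


def dynamic_properties(step_definition: dict) -> list:
    """Return the names of properties whose value is a selector ("$...")."""
    return sorted(
        key
        for key, value in step_definition.items()
        if isinstance(value, str) and value.startswith("$")
    )


bound = dynamic_properties(step)
```

Here `bound` lists `images`, `model_version` and `prompt`, while `name` and `type` remain static.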
Runtime compatibility
- hard: runtime self_hosted_cpu; execution local. Requires a GPU; run_locally() loads a model that needs CUDA.
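Because local execution needs CUDA, it can help to check for a usable GPU before starting a workflow that contains this block. The following is a minimal sketch, assuming the local runtime relies on PyTorch (this page does not state that); `cuda_available` is a hypothetical helper name.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        # Assumption: the model is loaded through PyTorch in the local runtime.
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA-backed runtime to use.
        return False
    return bool(torch.cuda.is_available())


gpu_ready = cuda_available()
if not gpu_ready:
    print("No CUDA device detected; local execution of Moondream2 is likely to fail.")
```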
Available Connections
Compatible Blocks
Check what blocks you can connect to Moondream2 in version v1.
- inputs:
Circle Visualization,Roboflow Asset Library Attributes,MoonshotAI Kimi,Twilio SMS Notification,Semantic Segmentation Model,Image Blur,Email Notification,S3 Sink,Reference Path Visualization,Camera Focus,Event Writer,Slack Notification,Halo Visualization,VLM As Classifier,Google Gemma,Qwen 3.6 API,Object Detection Model,Dot Visualization,Single-Label Classification Model,Image Slicer,Label Visualization,Background Color Visualization,LMM For Classification,Llama 3.2 Vision,Email Notification,OCR Model,Pixelate Visualization,OpenAI-Compatible LLM,Heatmap Visualization,Google Gemini,Anthropic Claude,OpenAI,Google Gemma API,Stitch Images,Morphological Transformation,EasyOCR,OpenAI,Single-Label Classification Model,Current Time,Blur Visualization,Trace Visualization,Stitch OCR Detections,Llama 3.2 Vision,Clip Comparison,Camera Focus,OpenAI,Florence-2 Model,Google Gemini,GLM-OCR,Corner Visualization,OpenRouter,Model Comparison Visualization,MQTT Writer,SIFT Comparison,CSV Formatter,Webhook Sink,Model Monitoring Inference Aggregator,Google Vision OCR,Image Threshold,Image Contours,Local File Sink,Instance Segmentation Model,Google Gemini,MoonshotAI Kimi,LMM,Single-Label Classification Model,Polygon Visualization,Polygon Visualization,SIFT,Stability AI Image Generation,Classification Label Visualization,Line Counter Visualization,CogVLM,Multi-Label Classification Model,Relative Static Crop,Qwen3.5-VL,Instance Segmentation Model,Keypoint Detection Model,Grid Visualization,Image Preprocessing,Keypoint Visualization,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Instance Segmentation Model,Icon Visualization,Color Visualization,Triangle Visualization,QR Code Generator,Contrast Enhancement,Roboflow Dataset Upload,Absolute Static Crop,Dynamic Crop,Stability AI Inpainting,Background Subtraction,Qwen 3.5 API,Multi-Label Classification Model,Bounding Box Visualization,Polygon Zone Visualization,Stability AI Outpainting,Multi-Label Classification Model,Crop 
Visualization,Image Convert Grayscale,OpenAI,Mask Visualization,Halo Visualization,Image Slicer,Semantic Segmentation Model,Qwen-VL,Florence-2 Model,Perspective Correction,Twilio SMS/MMS Notification,Text Display,Morphological Transformation,Anthropic Claude,Roboflow Vision Events,Microsoft SQL Server Sink,Instance Segmentation Model,Roboflow Dataset Upload,Depth Estimation,Roboflow Custom Metadata,Contrast Equalization,Object Detection Model,Camera Calibration,Ellipse Visualization,VLM As Detector,Keypoint Detection Model,Keypoint Detection Model,Object Detection Model
- outputs:
Size Measurement,Circle Visualization,Path Deviation,Path Deviation,Overlap Filter,PTZ Tracking (ONVIF),Event Writer,SAM2 Video Tracker,Dot Visualization,Byte Tracker,Label Visualization,Background Color Visualization,SAM 3 Interactive,Velocity,Pixelate Visualization,Mask Area Measurement,Heatmap Visualization,Track Class Lock,Time in Zone,Trace Visualization,Stitch OCR Detections,Detection Event Log,ByteTrack Tracker,Blur Visualization,Camera Focus,Detections List Roll-Up,Florence-2 Model,Corner Visualization,Detections Stabilizer,Model Comparison Visualization,Model Monitoring Inference Aggregator,Byte Tracker,Segment Anything 2 Model,Time in Zone,Line Counter,Per-Class Confidence Filter,Stitch OCR Detections,Icon Visualization,Color Visualization,Detections Combine,Triangle Visualization,Roboflow Dataset Upload,Dynamic Crop,BoT-SORT Tracker,Detections Transformation,Bounding Box Visualization,Crop Visualization,OC-SORT Tracker,Byte Tracker,Detections Stitch,Distance Measurement,Florence-2 Model,Detection Offset,SORT Tracker,Perspective Correction,Roboflow Vision Events,Overlap Analysis,Roboflow Dataset Upload,Detections Consensus,Roboflow Custom Metadata,Detections Filter,Ellipse Visualization,Detections Merge,Detections Classes Replacement,Line Counter,Time in Zone
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Moondream2 in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - prompt (string): Optional text prompt to provide additional context to Moondream2.
    - model_version (roboflow_model_id): The Moondream2 model to be used for inference.
- output
    - predictions (object_detection_prediction): Prediction with detected bounding boxes in the form of an sv.Detections(...) object.
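To show how the `predictions` output can feed a compatible downstream block, here is a minimal sketch that wires it into a Bounding Box Visualization step. The `$steps.<step_name>.<output_name>` selector form, the identifier `roboflow_core/bounding_box_visualization@v1` and that block's `image` and `predictions` fields are assumptions not confirmed by this page; `step_output` is a hypothetical helper.

```python
def step_output(step_name: str, output_name: str) -> str:
    """Build a selector that points at an output of an earlier step."""
    # Assumption: step outputs are addressed as "$steps.<step>.<output>".
    return f"$steps.{step_name}.{output_name}"


detector = {
    "name": "moondream",
    "type": "roboflow_core/moondream2@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
}

# Assumed identifier and field names for the downstream visualization block.
visualizer = {
    "name": "boxes",
    "type": "roboflow_core/bounding_box_visualization@v1",
    "image": "$inputs.image",
    "predictions": step_output(detector["name"], "predictions"),
}

steps = [detector, visualizer]
```

Since `predictions` has the kind object_detection_prediction, any block listed under outputs that accepts that kind could take the place of the visualization step.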
Example JSON definition of step Moondream2 in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/moondream2@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "model_version": "moondream2/moondream2_2b_jul24"
}
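The step definition above is only one entry in a workflow. The sketch below places it inside a complete workflow definition and serialises it to JSON. The surrounding envelope (the `version`, `inputs`, `steps` and `outputs` keys, and the `WorkflowImage` and `JsonField` types) is an assumption about the workflow definition format and is not taken from this page; the step name `moondream` is illustrative.

```python
import json

# The step from the example above, with a concrete name filled in.
moondream_step = {
    "name": "moondream",
    "type": "roboflow_core/moondream2@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "model_version": "moondream2/moondream2_2b_jul24",
}

# Assumed workflow envelope: one image input, one step, one exposed output.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [moondream_step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "predictions",
            "selector": "$steps.moondream.predictions",
        }
    ],
}

# Serialise for storage or for submission to a workflow runtime.
serialized = json.dumps(workflow_definition, indent=2)
```

The input name `image` has to match the `$inputs.image` selector used by the step, and the output selector has to use the step's `name`.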