YOLO-World Model¶
Class: YoloWorldModelBlockV1
Source: inference.core.workflows.core_steps.models.foundation.yolo_world.v1.YoloWorldModelBlockV1
Run YOLO-World, a zero-shot object detection model, on an image.
YOLO-World accepts one or more text classes that you want to identify in an image. The model returns the location of each object it can identify as belonging to one of the specified classes.
We recommend experimenting with YOLO-World to evaluate the model on your use case before using this block in production. For examples of how to prompt YOLO-World effectively, refer to the Roboflow YOLO-World prompting guide.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/yolo_world_model@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| class_names | List[str] | One or more classes that you want YOLO-World to detect. The model accepts any string as an input, though does best with short descriptions of common objects. | ✅ |
| version | str | Variant of YoloWorld model. | ✅ |
| confidence | float | Confidence threshold for detections. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
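As a minimal sketch of what parametrising a property looks like, the snippet below builds a step definition in which `class_names` and `confidence` are bound to workflow inputs instead of literal values. The input names `classes` and `confidence`, the helper names, and the rule that a selector is any string starting with `$` are assumptions made for illustration; only the `$inputs.image` selector and the Refs markings come from this page.

```python
import json

# Properties marked ✅ in the Refs column; "images" is included because the
# example on this page binds it to "$inputs.image".
SELECTOR_ALLOWED = {"images", "class_names", "version", "confidence"}


def is_selector(value):
    """Assumption: a selector is a string that starts with "$"."""
    return isinstance(value, str) and value.startswith("$")


def misplaced_selectors(step):
    """Return the names of properties that use a selector but are not ref-enabled."""
    return sorted(
        key for key, value in step.items()
        if is_selector(value) and key not in SELECTOR_ALLOWED
    )


# Hypothetical input names ("classes", "confidence") chosen for this sketch.
step = {
    "name": "yolo_world",
    "type": "roboflow_core/yolo_world_model@v1",
    "images": "$inputs.image",
    "class_names": "$inputs.classes",
    "version": "v2-s",
    "confidence": "$inputs.confidence",
}

print(json.dumps(step, indent=2))
print(misplaced_selectors(step))
```

The `name` property is marked ❌, so a definition that tried to bind it to a selector would be flagged by this check.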
Available Connections¶
Compatible Blocks
Check what blocks you can connect to YOLO-World Model in version v1.
- inputs:
Halo Visualization,Stitch OCR Detections,GLM-OCR,Image Threshold,Stitch Images,Morphological Transformation,Classification Label Visualization,Twilio SMS/MMS Notification,Crop Visualization,Icon Visualization,Stability AI Outpainting,Blur Visualization,VLM As Classifier,Reference Path Visualization,MoonshotAI Kimi,OpenAI,Google Gemini,Anthropic Claude,Webhook Sink,Camera Focus,QR Code Generator,Size Measurement,Model Comparison Visualization,Florence-2 Model,MQTT Writer,Trace Visualization,Ellipse Visualization,Anthropic Claude,Dot Visualization,Perspective Correction,Label Visualization,Image Convert Grayscale,Florence-2 Model,Text Display,Qwen-VL,Llama 3.2 Vision,Roboflow Dataset Upload,PLC ModbusTCP,Image Blur,Keypoint Detection Model,Absolute Static Crop,SIFT,CSV Formatter,LMM,Google Gemini,Dimension Collapse,EasyOCR,Qwen 3.5 API,Qwen 3.6 API,Local File Sink,Triangle Visualization,Camera Focus,Contrast Equalization,Polygon Visualization,OpenAI,Heatmap Visualization,Clip Comparison,Google Gemma API,Detections List Roll-Up,Contrast Enhancement,Google Gemini,PLC EthernetIP,Halo Visualization,Color Visualization,Morphological Transformation,MoonshotAI Kimi,Stitch OCR Detections,LMM For Classification,Event Writer,VLM As Detector,Llama 3.2 Vision,Buffer,Polygon Visualization,Image Stack,Email Notification,Mask Visualization,Anthropic Claude,Identify Changes,Stability AI Inpainting,Roboflow Asset Library Attributes,Microsoft SQL Server Sink,Keypoint Visualization,OpenAI,Background Subtraction,Multi-Label Classification Model,Roboflow Vision Events,Identify Outliers,Twilio SMS Notification,Email Notification,Image Slicer,Image Contours,Line Counter Visualization,CogVLM,Detections Consensus,Object Detection Model,Image Preprocessing,OPC UA Writer Sink,Dynamic Crop,Depth Estimation,Bounding Box Visualization,Motion Detection,Qwen3.5-VL,Current Time,Clip Comparison,Corner Visualization,Polygon Zone Visualization,Camera Calibration,Roboflow Dataset Upload,Grid 
Visualization,Stability AI Image Generation,Dynamic Zone,OpenAI,S3 Sink,Circle Visualization,Image Slicer,OCR Model,Single-Label Classification Model,Relative Static Crop,Roboflow Custom Metadata,Instance Segmentation Model,Model Monitoring Inference Aggregator,OpenAI-Compatible LLM,Slack Notification,OpenRouter,SIFT Comparison,Pixelate Visualization,Google Vision OCR,Background Color Visualization,Google Gemma
- outputs:
Overlap Analysis,Stitch OCR Detections,SAM 3 Interactive,Crop Visualization,Icon Visualization,Detections Transformation,Blur Visualization,ByteTrack Tracker,Detections Classes Replacement,Byte Tracker,Track Class Lock,Size Measurement,Model Comparison Visualization,Florence-2 Model,Path Deviation,Trace Visualization,Ellipse Visualization,BoT-SORT Tracker,Dot Visualization,Perspective Correction,Label Visualization,Florence-2 Model,Per-Class Confidence Filter,Roboflow Dataset Upload,Detections Stabilizer,Detections Merge,Velocity,OC-SORT Tracker,Triangle Visualization,Camera Focus,Time in Zone,SAM2 Video Tracker,SORT Tracker,Line Counter,Heatmap Visualization,Detections Stitch,Detections List Roll-Up,Stitch OCR Detections,Color Visualization,Event Writer,Detections Filter,Distance Measurement,PTZ Tracking (ONVIF),Time in Zone,Overlap Filter,Roboflow Vision Events,Mask Area Measurement,Detection Offset,Detections Consensus,Byte Tracker,Path Deviation,Dynamic Crop,Byte Tracker,Bounding Box Visualization,Detections Combine,Roboflow Dataset Upload,Corner Visualization,Segment Anything 2 Model,Circle Visualization,Time in Zone,Roboflow Custom Metadata,Model Monitoring Inference Aggregator,Detection Event Log,Pixelate Visualization,Background Color Visualization,Line Counter
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
YOLO-World Model in version v1 has.
Bindings
- input
  - images (image): The image to infer on.
  - class_names (list_of_values): One or more classes that you want YOLO-World to detect. The model accepts any string as an input, though does best with short descriptions of common objects.
  - version (string): Variant of YoloWorld model.
  - confidence (float_zero_to_one): Confidence threshold for detections.
- output
  - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
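The binding kinds above suggest a few checks that can be run on literal property values before a step is submitted. The following is a hedged sketch of such a pre-flight check, not the validation that the inference library performs; `validate_step` is a hypothetical helper, and values that are selectors (strings rather than lists or numbers) are simply skipped.

```python
def validate_step(step):
    """Collect problems with literal values, based on the documented kinds."""
    problems = []

    # list_of_values: the description asks for one or more class strings.
    class_names = step.get("class_names")
    if isinstance(class_names, list):
        if not class_names or not all(isinstance(c, str) for c in class_names):
            problems.append("class_names must be a non-empty list of strings")

    # float_zero_to_one: a literal confidence must lie in [0, 1].
    confidence = step.get("confidence")
    if isinstance(confidence, (int, float)) and not isinstance(confidence, bool):
        if not 0.0 <= confidence <= 1.0:
            problems.append("confidence must be between 0 and 1")

    return problems


ok = {"class_names": ["person", "car"], "version": "v2-s", "confidence": 0.005}
bad = {"class_names": [], "version": "v2-s", "confidence": 1.5}

print(validate_step(ok))
print(validate_step(bad))
```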
Example JSON definition of step YOLO-World Model in version v1
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/yolo_world_model@v1",
  "images": "$inputs.image",
  "class_names": [
    "person",
    "car",
    "license plate"
  ],
  "version": "v2-s",
  "confidence": 0.005
}
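To show where the step definition sits, here is a minimal sketch that wraps it in a complete workflow definition and serialises it to JSON. The surrounding envelope (`version`, `inputs`, `outputs`, the `WorkflowImage` and `JsonField` types, and the `$steps.<name>.predictions` selector) is an assumption about the Workflows definition format and is not stated on this page; `build_workflow` and the step name `yolo_world` are hypothetical.

```python
import json


def build_workflow(class_names, version="v2-s", confidence=0.005):
    """Build a workflow dict with a single YOLO-World step (illustrative only)."""
    step_name = "yolo_world"  # hypothetical step name
    return {
        "version": "1.0",  # assumed definition format version
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [
            {
                "name": step_name,
                "type": "roboflow_core/yolo_world_model@v1",
                "images": "$inputs.image",
                "class_names": list(class_names),
                "version": version,
                "confidence": confidence,
            }
        ],
        "outputs": [
            {
                "type": "JsonField",
                "name": "predictions",
                # Exposes the block's "predictions" output listed under Bindings.
                "selector": f"$steps.{step_name}.predictions",
            }
        ],
    }


workflow = build_workflow(["person", "car", "license plate"])
print(json.dumps(workflow, indent=2))
```

Keeping the step name in one variable means the output selector cannot drift out of sync with the step it refers to.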