SAM 3
v3
Class: SegmentAnything3BlockV3 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.segment_anything3.v3.SegmentAnything3BlockV3
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Run Segment Anything 3 (SAM3), a zero-shot instance segmentation model, on an image.
You can use text prompts for open-vocabulary segmentation - just specify class names and SAM3 will segment those objects in the image.
This block supports two output formats:
- rle (default): Returns masks in RLE (Run-Length Encoding) format, which is more memory-efficient
- polygons: Returns polygon coordinates for each mask
RLE format is recommended for high-resolution images or workflows with many detections.
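To illustrate why run-length encoding tends to be compact for masks, here is a minimal sketch of the general technique on a flattened binary mask. It is a generic illustration only: the exact RLE layout this block emits is not described on this page, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from typing import List


def rle_encode(flat_mask: List[int]) -> List[int]:
    """Encode a flattened binary mask as alternating run lengths.

    Runs alternate between 0s and 1s and always start with a run of 0s,
    so a mask that begins with 1 gets a leading run of length zero.
    """
    counts: List[int] = []
    previous = 0
    run_length = 0
    for pixel in flat_mask:
        if pixel == previous:
            run_length += 1
        else:
            counts.append(run_length)
            previous = pixel
            run_length = 1
    counts.append(run_length)
    return counts


def rle_decode(counts: List[int]) -> List[int]:
    """Rebuild the flattened mask from alternating run lengths."""
    flat_mask: List[int] = []
    value = 0
    for run_length in counts:
        flat_mask.extend([value] * run_length)
        value = 1 - value
    return flat_mask


mask = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
counts = rle_encode(mask)
print(counts)                      # [3, 4, 2, 1]
print(rle_decode(counts) == mask)  # True
```

The encoded size grows with the number of runs rather than the number of pixels, which is why large masks with long uniform regions compress well.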
Type identifier
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/sam3@v3
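As a sketch of where the type identifier goes, the snippet below builds a small workflow definition as a Python dictionary. Only the step fields (`name`, `type`, `images`, `class_names`) come from this page; the surrounding layout (`version`, `inputs`, `outputs`, `WorkflowImage`, `JsonField`) is an assumption about the general Workflows definition format, and the step name `sam3` is hypothetical.

```python
import json

# Hypothetical step name; any unique identifier should work.
STEP_NAME = "sam3"

workflow_definition = {
    # Assumed top-level layout of a workflow definition.
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": STEP_NAME,
            # The type identifier documented for this block version.
            "type": "roboflow_core/sam3@v3",
            "images": "$inputs.image",
            "class_names": ["car", "person"],
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "predictions",
            # Refers to the "predictions" output of the step above.
            "selector": f"$steps.{STEP_NAME}.predictions",
        }
    ],
}

print(json.dumps(workflow_definition, indent=2))
```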
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| model_id | str | Model version. You only need to change this for fine-tuned SAM3 models. | ✅ |
| class_names | Optional[Union[List[str], str]] | List of classes to recognise. | ✅ |
| class_mapping | Dict[str, str] | Maps class names in predictions to different output names. Applied after inference, e.g. {'cat': 'gato'} renames 'cat' predictions to 'gato'. | ✅ |
| confidence | float | Minimum confidence threshold for predicted masks. | ✅ |
| per_class_confidence | List[float] | List of confidence thresholds per class (must match class_names length). | ✅ |
| apply_nms | bool | Whether to apply Non-Maximum Suppression across prompts. | ✅ |
| nms_iou_threshold | float | IoU threshold for cross-prompt NMS. Must be in [0.0, 1.0]. | ✅ |
| output_format | str | 'rle' returns efficient RLE encoding (recommended), 'polygons' returns polygon coordinates. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values
available at workflow runtime. See Bindings for more information.
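The Refs column can be read as a rule about which properties may hold a selector instead of a literal value. The sketch below checks a v3 step against that rule. The `$inputs.` prefix follows the `$inputs.image` example on this page; the `$steps.` prefix, the input names (`classes`, `confidence`, `format`), and the helper names are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Properties marked with a cross in the Refs column of the v3 table.
STATIC_ONLY = {"name", "output_format"}


def is_selector(value: Any) -> bool:
    """Return True for strings that look like workflow selectors."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def invalid_bindings(step: Dict[str, Any]) -> List[str]:
    """List static-only properties that were given a selector."""
    return sorted(key for key in STATIC_ONLY if is_selector(step.get(key)))


step = {
    "name": "sam3",
    "type": "roboflow_core/sam3@v3",
    "images": "$inputs.image",
    "class_names": "$inputs.classes",    # accepts references
    "confidence": "$inputs.confidence",  # accepts references
    "output_format": "$inputs.format",   # does not accept references
}

print(invalid_bindings(step))  # ['output_format']
```

In practice the Workflows engine performs its own validation; this check only mirrors what the table states.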
Available Connections
Compatible Blocks
Check what blocks you can connect to SAM 3 in version v3.
- inputs:
Stability AI Outpainting,Multi-Label Classification Model,Motion Detection,Contrast Enhancement,Camera Focus,Image Preprocessing,Corner Visualization,Ellipse Visualization,Object Detection Model,Roboflow Vision Events,Cosine Similarity,Heatmap Visualization,Trace Visualization,VLM As Classifier,OpenAI,Email Notification,Keypoint Visualization,Detections Consensus,Model Comparison Visualization,JSON Parser,Polygon Zone Visualization,Single-Label Classification Model,Dynamic Crop,Polygon Visualization,QR Code Generator,GLM-OCR,Stitch Images,OpenRouter,Semantic Segmentation Model,Image Blur,Clip Comparison,Dynamic Zone,Model Monitoring Inference Aggregator,Detections List Roll-Up,Instance Segmentation Model,Buffer,Google Gemini,Pixelate Visualization,EasyOCR,SIFT,Contrast Equalization,Image Threshold,Instance Segmentation Model,Polygon Visualization,Anthropic Claude,Halo Visualization,Qwen2.5-VL,Roboflow Custom Metadata,Keypoint Detection Model,Florence-2 Model,Local File Sink,Icon Visualization,Single-Label Classification Model,Image Contours,OpenAI,Grid Visualization,VLM As Detector,Size Measurement,Multi-Label Classification Model,Object Detection Model,LMM,Image Convert Grayscale,Reference Path Visualization,Stitch OCR Detections,Keypoint Detection Model,SIFT Comparison,Identify Changes,Roboflow Dataset Upload,CSV Formatter,S3 Sink,Qwen3-VL,OpenAI-Compatible LLM,Morphological Transformation,Object Detection Model,SIFT Comparison,Identify Outliers,Crop Visualization,Blur Visualization,Mask Visualization,Stability AI Image Generation,Qwen-VL,Stitch OCR Detections,Google Gemma API,Qwen3.5,Image Slicer,Qwen 3.5 API,Background Color Visualization,Slack Notification,Anthropic Claude,Dimension Collapse,Qwen 3.6 API,Webhook Sink,Color Visualization,Bounding Box Visualization,Google Gemma,Relative Static Crop,Detection Event Log,Llama 3.2 Vision,CogVLM,Instance Segmentation Model,Qwen3.5-VL,Camera Focus,Instance Segmentation Model,Google Vision OCR,Google Gemini,Llama 3.2 
Vision,Single-Label Classification Model,Twilio SMS Notification,SmolVLM2,Anthropic Claude,Image Slicer,Depth Estimation,OpenAI,Multi-Label Classification Model,Gaze Detection,Classification Label Visualization,PTZ Tracking (ONVIF),Florence-2 Model,MoonshotAI Kimi,MoonshotAI Kimi,Dot Visualization,Background Subtraction,Keypoint Detection Model,Roboflow Dataset Upload,Stability AI Inpainting,Semantic Segmentation Model,Label Visualization,Absolute Static Crop,Google Gemini,VLM As Classifier,Camera Calibration,Halo Visualization,Email Notification,OpenAI,Clip Comparison,LMM For Classification,Text Display,Circle Visualization,Line Counter Visualization,OCR Model,VLM As Detector,Image Stack,Morphological Transformation,Twilio SMS/MMS Notification,Triangle Visualization,Perspective Correction
- outputs:
BoT-SORT Tracker,Mask Visualization,Crop Visualization,Blur Visualization,Camera Focus,Corner Visualization,Ellipse Visualization,Roboflow Vision Events,Velocity,Heatmap Visualization,Trace Visualization,OC-SORT Tracker,Path Deviation,Background Color Visualization,Time in Zone,Color Visualization,Bounding Box Visualization,Byte Tracker,Detections Consensus,Model Comparison Visualization,Detection Event Log,Bounding Rectangle,Path Deviation,Byte Tracker,Dynamic Crop,Polygon Visualization,Distance Measurement,SORT Tracker,Detections Stabilizer,Dynamic Zone,Model Monitoring Inference Aggregator,Detections Stitch,Time in Zone,Detections List Roll-Up,Segment Anything 2 Model,Pixelate Visualization,PTZ Tracking (ONVIF),Florence-2 Model,Time in Zone,Line Counter,Dot Visualization,Polygon Visualization,Roboflow Dataset Upload,Halo Visualization,Per-Class Confidence Filter,Stability AI Inpainting,Roboflow Custom Metadata,Line Counter,Detections Classes Replacement,Florence-2 Model,Label Visualization,Icon Visualization,Detection Offset,Overlap Filter,Detections Merge,SAM2 Video Tracker,Detections Filter,ByteTrack Tracker,Halo Visualization,Size Measurement,Detections Transformation,Mask Area Measurement,Circle Visualization,Byte Tracker,Detections Combine,Overlap Analysis,Roboflow Dataset Upload,Mask Edge Snap,Triangle Visualization,Perspective Correction
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
SAM 3 in version v3 are listed below.
Bindings
- input
  - images (image): The image to infer on.
  - model_id (roboflow_model_id): Model version. You only need to change this for fine-tuned SAM3 models.
  - class_names (Union[list_of_values, string]): List of classes to recognise.
  - class_mapping (dictionary): Maps class names in predictions to different output names. Applied after inference, e.g. {'cat': 'gato'} renames 'cat' predictions to 'gato'.
  - confidence (float): Minimum confidence threshold for predicted masks.
  - per_class_confidence (list_of_values): List of confidence thresholds per class (must match class_names length).
  - apply_nms (boolean): Whether to apply Non-Maximum Suppression across prompts.
  - nms_iou_threshold (float): IoU threshold for cross-prompt NMS. Must be in [0.0, 1.0].
- output
  - predictions (Union[rle_instance_segmentation_prediction, instance_segmentation_prediction]): An sv.Detections(...) object with detected bounding boxes and either RLE-encoded segmentation masks (kind rle_instance_segmentation_prediction) or segmentation masks (kind instance_segmentation_prediction).
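The `apply_nms` and `nms_iou_threshold` inputs describe Non-Maximum Suppression across prompts. The following is a minimal sketch of greedy NMS, not the block's actual implementation: it uses bounding-box IoU for brevity (the block may compare masks instead), ignores class labels so that overlapping detections from different prompts compete, and treats an overlap strictly above the threshold as a duplicate. All names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def cross_prompt_nms(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of kept detections, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for index in order:
        # Keep a detection only if it does not overlap a better one too much.
        if all(iou(boxes[index], boxes[k]) <= iou_threshold for k in kept):
            kept.append(index)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(cross_prompt_nms(boxes, scores, iou_threshold=0.5))  # [0, 2]
print(cross_prompt_nms(boxes, scores, iou_threshold=0.7))  # [0, 1, 2]
```

The first two boxes have an IoU of 81/119 (about 0.68), so the lower-scoring one is dropped at a threshold of 0.5 and kept at 0.7.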
Example JSON definition of step SAM 3 in version v3
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/sam3@v3",
    "images": "$inputs.image",
    "model_id": "sam3/sam3_final",
    "class_names": [
        "car",
        "person"
    ],
    "class_mapping": {
        "cat": "gato",
        "dog": "perro"
    },
    "confidence": 0.3,
    "per_class_confidence": [
        0.3,
        0.5
    ],
    "apply_nms": "<block_does_not_provide_example>",
    "nms_iou_threshold": 0.5,
    "output_format": "rle"
}
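The `class_mapping` property renames predicted classes after inference. A minimal sketch of that renaming is shown below; the helper name is hypothetical, and passing unmapped names through unchanged is an assumption, since the page does not say what happens to classes missing from the mapping.

```python
from typing import Dict, List


def apply_class_mapping(
    class_names: List[str], class_mapping: Dict[str, str]
) -> List[str]:
    """Rename predicted classes; names without a mapping are kept as-is."""
    return [class_mapping.get(name, name) for name in class_names]


predicted = ["cat", "dog", "car"]
mapping = {"cat": "gato", "dog": "perro"}
print(apply_class_mapping(predicted, mapping))  # ['gato', 'perro', 'car']
```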
v2
Class: SegmentAnything3BlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.segment_anything3.v2.SegmentAnything3BlockV2
Run Segment Anything 3, a zero-shot instance segmentation model, on an image.
You can pass in boxes/predictions from other models as prompts, or use a text prompt for open-vocabulary segmentation. If you pass in box detections from another model, the class names of the boxes will be forwarded to the predicted masks.
Type identifier
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/sam3@v2
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| model_id | str | Model version. You only need to change this for fine-tuned SAM3 models. | ✅ |
| class_names | Optional[Union[List[str], str]] | List of classes to recognise. | ✅ |
| confidence | float | Minimum confidence threshold for predicted masks. | ✅ |
| per_class_confidence | List[float] | List of confidence thresholds per class (must match class_names length). | ✅ |
| apply_nms | bool | Whether to apply Non-Maximum Suppression across prompts. | ✅ |
| nms_iou_threshold | float | IoU threshold for cross-prompt NMS. Must be in [0.0, 1.0]. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values
available at workflow runtime. See Bindings for more information.
Available Connections
Compatible Blocks
Check what blocks you can connect to SAM 3 in version v2.
- inputs:
Stability AI Outpainting,Multi-Label Classification Model,Motion Detection,Contrast Enhancement,Camera Focus,Image Preprocessing,Corner Visualization,Ellipse Visualization,Object Detection Model,Roboflow Vision Events,Cosine Similarity,Heatmap Visualization,Trace Visualization,VLM As Classifier,OpenAI,Email Notification,Keypoint Visualization,Detections Consensus,Model Comparison Visualization,JSON Parser,Polygon Zone Visualization,Single-Label Classification Model,Dynamic Crop,Polygon Visualization,QR Code Generator,GLM-OCR,Stitch Images,OpenRouter,Semantic Segmentation Model,Image Blur,Clip Comparison,Dynamic Zone,Model Monitoring Inference Aggregator,Detections List Roll-Up,Instance Segmentation Model,Buffer,Google Gemini,Pixelate Visualization,EasyOCR,SIFT,Contrast Equalization,Image Threshold,Instance Segmentation Model,Polygon Visualization,Anthropic Claude,Halo Visualization,Roboflow Custom Metadata,Keypoint Detection Model,Florence-2 Model,Local File Sink,Icon Visualization,Single-Label Classification Model,Image Contours,OpenAI,Grid Visualization,VLM As Detector,Size Measurement,Multi-Label Classification Model,Object Detection Model,LMM,Image Convert Grayscale,Reference Path Visualization,Stitch OCR Detections,Keypoint Detection Model,SIFT Comparison,Identify Changes,Roboflow Dataset Upload,CSV Formatter,S3 Sink,SIFT Comparison,OpenAI-Compatible LLM,Morphological Transformation,Object Detection Model,Identify Outliers,Crop Visualization,Blur Visualization,Mask Visualization,Stability AI Image Generation,Qwen-VL,Stitch OCR Detections,Google Gemma API,Image Slicer,Qwen 3.5 API,Background Color Visualization,Slack Notification,Anthropic Claude,Dimension Collapse,Qwen 3.6 API,Webhook Sink,Color Visualization,Bounding Box Visualization,Google Gemma,Relative Static Crop,Llama 3.2 Vision,CogVLM,Instance Segmentation Model,Qwen3.5-VL,Camera Focus,Instance Segmentation Model,Google Vision OCR,Google Gemini,Llama 3.2 Vision,Single-Label Classification Model,Twilio 
SMS Notification,Anthropic Claude,Image Slicer,Depth Estimation,OpenAI,Multi-Label Classification Model,Gaze Detection,Classification Label Visualization,PTZ Tracking (ONVIF),Florence-2 Model,MoonshotAI Kimi,MoonshotAI Kimi,Dot Visualization,Background Subtraction,Keypoint Detection Model,Roboflow Dataset Upload,Stability AI Inpainting,Semantic Segmentation Model,Label Visualization,Absolute Static Crop,Google Gemini,VLM As Classifier,Camera Calibration,Halo Visualization,Email Notification,OpenAI,Clip Comparison,LMM For Classification,Text Display,Circle Visualization,Line Counter Visualization,OCR Model,VLM As Detector,Image Stack,Morphological Transformation,Twilio SMS/MMS Notification,Triangle Visualization,Perspective Correction
- outputs:
BoT-SORT Tracker,Crop Visualization,Mask Visualization,Camera Focus,Blur Visualization,Corner Visualization,Ellipse Visualization,Roboflow Vision Events,Velocity,Heatmap Visualization,Trace Visualization,Path Deviation,OC-SORT Tracker,Background Color Visualization,Time in Zone,Color Visualization,Byte Tracker,Bounding Box Visualization,Detections Consensus,Model Comparison Visualization,Detection Event Log,Bounding Rectangle,Path Deviation,Byte Tracker,Dynamic Crop,Polygon Visualization,Distance Measurement,SORT Tracker,Detections Stabilizer,Dynamic Zone,Model Monitoring Inference Aggregator,Detections Stitch,Time in Zone,Detections List Roll-Up,Segment Anything 2 Model,Pixelate Visualization,PTZ Tracking (ONVIF),Florence-2 Model,Time in Zone,Line Counter,Dot Visualization,Polygon Visualization,Roboflow Dataset Upload,Halo Visualization,Per-Class Confidence Filter,Stability AI Inpainting,Roboflow Custom Metadata,Line Counter,Detections Classes Replacement,Florence-2 Model,Label Visualization,Icon Visualization,Detection Offset,Overlap Filter,Detections Merge,SAM2 Video Tracker,Detections Filter,ByteTrack Tracker,Halo Visualization,Size Measurement,Detections Transformation,Mask Area Measurement,Circle Visualization,Byte Tracker,Detections Combine,Overlap Analysis,Roboflow Dataset Upload,Mask Edge Snap,Triangle Visualization,Perspective Correction
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
SAM 3 in version v2 are listed below.
Bindings
- input
  - images (image): The image to infer on.
  - model_id (roboflow_model_id): Model version. You only need to change this for fine-tuned SAM3 models.
  - class_names (Union[list_of_values, string]): List of classes to recognise.
  - confidence (float): Minimum confidence threshold for predicted masks.
  - per_class_confidence (list_of_values): List of confidence thresholds per class (must match class_names length).
  - apply_nms (boolean): Whether to apply Non-Maximum Suppression across prompts.
  - nms_iou_threshold (float): IoU threshold for cross-prompt NMS. Must be in [0.0, 1.0].
- output
  - predictions (instance_segmentation_prediction): Prediction with detected bounding boxes and segmentation masks in the form of an sv.Detections(...) object.
Example JSON definition of step SAM 3 in version v2
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/sam3@v2",
    "images": "$inputs.image",
    "model_id": "sam3/sam3_final",
    "class_names": [
        "car",
        "person"
    ],
    "confidence": 0.3,
    "per_class_confidence": [
        0.3,
        0.5
    ],
    "apply_nms": "<block_does_not_provide_example>",
    "nms_iou_threshold": 0.5
}
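Because `per_class_confidence` must have the same length as `class_names`, it can help to pair the two lists before use. The sketch below does that and then filters detections by the resulting thresholds. It is an illustration under assumptions: falling back to the global `confidence` when no per-class list is given, and keeping scores equal to the threshold, are guesses about the block's behaviour, and the helper names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def build_thresholds(
    class_names: List[str],
    confidence: float,
    per_class_confidence: Optional[List[float]] = None,
) -> Dict[str, float]:
    """Map each class name to the threshold that applies to it."""
    if per_class_confidence is None:
        return {name: confidence for name in class_names}
    if len(per_class_confidence) != len(class_names):
        raise ValueError(
            "per_class_confidence must have one value per entry in class_names"
        )
    return dict(zip(class_names, per_class_confidence))


def filter_detections(
    detections: List[Tuple[str, float]], thresholds: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Keep (class_name, score) pairs that reach their class threshold."""
    return [(name, score) for name, score in detections if score >= thresholds[name]]


thresholds = build_thresholds(["car", "person"], 0.3, [0.3, 0.5])
detections = [("car", 0.35), ("person", 0.45), ("person", 0.6)]
print(filter_detections(detections, thresholds))
# [('car', 0.35), ('person', 0.6)]
```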
v1
Class: SegmentAnything3BlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.segment_anything3.v1.SegmentAnything3BlockV1
Run Segment Anything 3, a zero-shot instance segmentation model, on an image.
You can pass in boxes/predictions from other models as prompts, or use a text prompt for open-vocabulary segmentation. If you pass in box detections from another model, the class names of the boxes will be forwarded to the predicted masks.
Type identifier
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/sam3@v1
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| model_id | str | Model version. You only need to change this for fine-tuned SAM3 models. | ✅ |
| class_names | Optional[Union[List[str], str]] | List of classes to recognise. | ✅ |
| threshold | float | Threshold for predicted mask scores. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values
available at workflow runtime. See Bindings for more information.
Available Connections
Compatible Blocks
Check what blocks you can connect to SAM 3 in version v1.
- inputs:
Stability AI Outpainting,Multi-Label Classification Model,Motion Detection,Contrast Enhancement,Camera Focus,Image Preprocessing,Corner Visualization,Ellipse Visualization,Object Detection Model,Roboflow Vision Events,Cosine Similarity,Heatmap Visualization,Trace Visualization,OpenAI,Email Notification,Keypoint Visualization,Model Comparison Visualization,Polygon Zone Visualization,Single-Label Classification Model,Dynamic Crop,Polygon Visualization,QR Code Generator,GLM-OCR,Stitch Images,OpenRouter,Semantic Segmentation Model,Image Blur,Clip Comparison,Dynamic Zone,Model Monitoring Inference Aggregator,Detections List Roll-Up,Instance Segmentation Model,Buffer,Google Gemini,Pixelate Visualization,EasyOCR,SIFT,Contrast Equalization,Image Threshold,Instance Segmentation Model,Polygon Visualization,Anthropic Claude,Halo Visualization,Roboflow Custom Metadata,Keypoint Detection Model,Florence-2 Model,Local File Sink,Icon Visualization,Single-Label Classification Model,Image Contours,OpenAI,Grid Visualization,Size Measurement,Multi-Label Classification Model,Object Detection Model,LMM,Image Convert Grayscale,Reference Path Visualization,Stitch OCR Detections,Keypoint Detection Model,SIFT Comparison,Identify Changes,Roboflow Dataset Upload,CSV Formatter,S3 Sink,OpenAI-Compatible LLM,Morphological Transformation,Object Detection Model,Crop Visualization,Blur Visualization,Mask Visualization,Stability AI Image Generation,Qwen-VL,Stitch OCR Detections,Google Gemma API,Image Slicer,Qwen 3.5 API,Background Color Visualization,Slack Notification,Anthropic Claude,Dimension Collapse,Qwen 3.6 API,Webhook Sink,Color Visualization,Bounding Box Visualization,Google Gemma,Relative Static Crop,Llama 3.2 Vision,CogVLM,Instance Segmentation Model,Qwen3.5-VL,Camera Focus,Instance Segmentation Model,Google Vision OCR,Google Gemini,Llama 3.2 Vision,Single-Label Classification Model,Twilio SMS Notification,Anthropic Claude,Image Slicer,Depth Estimation,OpenAI,Multi-Label Classification 
Model,Gaze Detection,Classification Label Visualization,Florence-2 Model,MoonshotAI Kimi,MoonshotAI Kimi,Dot Visualization,Background Subtraction,Keypoint Detection Model,Roboflow Dataset Upload,Stability AI Inpainting,Semantic Segmentation Model,Label Visualization,Absolute Static Crop,Google Gemini,VLM As Classifier,Camera Calibration,Halo Visualization,Email Notification,OpenAI,Clip Comparison,LMM For Classification,Text Display,Circle Visualization,Line Counter Visualization,OCR Model,VLM As Detector,Image Stack,Morphological Transformation,Twilio SMS/MMS Notification,Triangle Visualization,Perspective Correction
- outputs:
BoT-SORT Tracker,Crop Visualization,Mask Visualization,Camera Focus,Blur Visualization,Corner Visualization,Ellipse Visualization,Roboflow Vision Events,Velocity,Heatmap Visualization,Trace Visualization,Path Deviation,OC-SORT Tracker,Background Color Visualization,Time in Zone,Color Visualization,Byte Tracker,Bounding Box Visualization,Detections Consensus,Model Comparison Visualization,Detection Event Log,Bounding Rectangle,Path Deviation,Byte Tracker,Dynamic Crop,Polygon Visualization,Distance Measurement,SORT Tracker,Detections Stabilizer,Dynamic Zone,Model Monitoring Inference Aggregator,Detections Stitch,Time in Zone,Detections List Roll-Up,Segment Anything 2 Model,Pixelate Visualization,PTZ Tracking (ONVIF),Florence-2 Model,Time in Zone,Line Counter,Dot Visualization,Polygon Visualization,Roboflow Dataset Upload,Halo Visualization,Per-Class Confidence Filter,Stability AI Inpainting,Roboflow Custom Metadata,Line Counter,Detections Classes Replacement,Florence-2 Model,Label Visualization,Icon Visualization,Detection Offset,Overlap Filter,Detections Merge,SAM2 Video Tracker,Detections Filter,ByteTrack Tracker,Halo Visualization,Size Measurement,Detections Transformation,Mask Area Measurement,Circle Visualization,Byte Tracker,Detections Combine,Overlap Analysis,Roboflow Dataset Upload,Mask Edge Snap,Triangle Visualization,Perspective Correction
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
SAM 3 in version v1 are listed below.
Bindings
- input
  - images (image): The image to infer on.
  - model_id (roboflow_model_id): Model version. You only need to change this for fine-tuned SAM3 models.
  - class_names (Union[list_of_values, string]): List of classes to recognise.
  - threshold (float): Threshold for predicted mask scores.
- output
  - predictions (instance_segmentation_prediction): Prediction with detected bounding boxes and segmentation masks in the form of an sv.Detections(...) object.
Example JSON definition of step SAM 3 in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/sam3@v1",
    "images": "$inputs.image",
    "model_id": "sam3/sam3_final",
    "class_names": [
        "car",
        "person"
    ],
    "threshold": 0.3
}
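Comparing the property tables, v1 exposes `threshold` while v2 exposes `confidence` plus the per-class and NMS options. The sketch below rewrites a v1 step definition into a v2 one. It assumes that `threshold` and `confidence` play the same role, which this page does not confirm, so results should be compared after switching versions. The helper name is hypothetical.

```python
from typing import Any, Dict


def upgrade_v1_step_to_v2(step: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a v1 step rewritten with the v2 type and field names."""
    if step.get("type") != "roboflow_core/sam3@v1":
        raise ValueError("expected a roboflow_core/sam3@v1 step")
    upgraded = {key: value for key, value in step.items() if key != "threshold"}
    upgraded["type"] = "roboflow_core/sam3@v2"
    if "threshold" in step:
        # Assumption: the v1 mask-score threshold maps onto v2 "confidence".
        upgraded["confidence"] = step["threshold"]
    return upgraded


v1_step = {
    "name": "sam3",
    "type": "roboflow_core/sam3@v1",
    "images": "$inputs.image",
    "model_id": "sam3/sam3_final",
    "class_names": ["car", "person"],
    "threshold": 0.3,
}

print(upgrade_v1_step_to_v2(v1_step))
```

The original dictionary is left untouched, so both versions can be run side by side while validating the change.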