Perception Encoder Embedding Model
Class: PerceptionEncoderModelBlockV1
Use the Meta Perception Encoder model to create semantic embeddings of text and images.
This block accepts an image or string and returns an embedding. The embedding can be used to compare similarity between different images or between images and text.
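As a rough illustration of how two embeddings might be compared, the sketch below computes cosine similarity in plain Python. The vectors are made-up placeholders, not real Perception Encoder outputs, and the helper name `cosine_similarity` is hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for an image embedding and a text embedding.
image_embedding = [0.1, 0.3, 0.5]
text_embedding = [0.2, 0.6, 1.0]  # points in the same direction as image_embedding

# Values near 1 mean the vectors point the same way; values near 0 mean unrelated.
print(cosine_similarity(image_embedding, text_embedding))
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing embeddings.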
Type identifier
To add the block as a step in your workflow, use the following identifier in the step "type" field: roboflow_core/perception_encoder@v1
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Unique name of step in workflows. | ❌ |
| data | str | The string or image to generate an embedding for. | ✅ |
| version | str | Variant of Perception Encoder model. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
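To make the Refs column concrete, here is a minimal sketch that flags properties holding a dynamic reference when the table does not mark them as ref-capable. It assumes that dynamic references are strings beginning with `$inputs.` or `$steps.`; only the `$inputs.image` form appears in this page, and the `$steps.` prefix is an assumption. The helper names and the step name `embed` are hypothetical.

```python
# Properties marked with a check mark in the Refs column of the table above.
REF_CAPABLE = {"data", "version"}


def is_selector(value: object) -> bool:
    """Assumed convention: a reference is a string starting with '$inputs.' or '$steps.'."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def misplaced_selectors(step: dict) -> list:
    """Return property names that hold a reference but are not ref-capable."""
    return [
        key for key, value in step.items()
        if is_selector(value) and key not in REF_CAPABLE
    ]


good_step = {
    "name": "embed",
    "type": "roboflow_core/perception_encoder@v1",
    "data": "$inputs.image",       # dynamic value, allowed for "data"
    "version": "PE-Core-B16-224",  # literal value, also allowed
}
bad_step = dict(good_step, name="$inputs.step_name")  # "name" must be a literal

print(misplaced_selectors(good_step))  # []
print(misplaced_selectors(bad_step))   # ['name']
```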
Runtime compatibility
- hard: runtime self_hosted_cpu; execution local. Requires a GPU; run_locally() loads a model that needs CUDA.
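Because local execution needs CUDA, a pre-flight check before running the block locally may save a failed model load. The sketch below assumes the model is served through PyTorch, which this page does not state; `cuda_available` is a hypothetical helper, and it returns False when PyTorch is not installed.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch  # assumption: the model backend is PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if cuda_available():
    print("CUDA device found; local execution should be possible.")
else:
    print("No CUDA device detected; local execution of this block is likely to fail.")
```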
Available Connections
Compatible Blocks
Check what blocks you can connect to Perception Encoder Embedding Model in version v1.
- inputs:
Roboflow Asset Library Attributes,MoonshotAI Kimi,Image Blur,Reference Path Visualization,Event Writer,Slack Notification,Halo Visualization,VLM As Classifier,Google Gemma,Qwen 3.6 API,Dot Visualization,Label Visualization,Background Color Visualization,Llama 3.2 Vision,Email Notification,Pixelate Visualization,OpenAI-Compatible LLM,Google Gemini,Anthropic Claude,OpenAI,Trace Visualization,Llama 3.2 Vision,Clip Comparison,Camera Focus,OpenAI,GLM-OCR,MQTT Writer,SIFT Comparison,CSV Formatter,Webhook Sink,Image Contours,Local File Sink,Google Gemini,MoonshotAI Kimi,Polygon Visualization,SIFT,Classification Label Visualization,Multi-Label Classification Model,Keypoint Detection Model,Keypoint Visualization,Icon Visualization,Dynamic Crop,Stability AI Inpainting,Bounding Box Visualization,Polygon Zone Visualization,Stability AI Outpainting,Crop Visualization,Image Convert Grayscale,Mask Visualization,Halo Visualization,Text Display,Morphological Transformation,Anthropic Claude,Roboflow Dataset Upload,Object Detection Model,Ellipse Visualization,Circle Visualization,Twilio SMS Notification,Email Notification,S3 Sink,Camera Focus,Image Slicer,LMM For Classification,OCR Model,Heatmap Visualization,OpenAI,Google Gemma API,Stitch Images,Morphological Transformation,EasyOCR,Current Time,Blur Visualization,Stitch OCR Detections,Florence-2 Model,Google Gemini,Corner Visualization,OpenRouter,Model Comparison Visualization,Model Monitoring Inference Aggregator,Google Vision OCR,Image Threshold,LMM,Single-Label Classification Model,Polygon Visualization,Stability AI Image Generation,Line Counter Visualization,CogVLM,Relative Static Crop,Qwen3.5-VL,Grid Visualization,Image Preprocessing,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Color Visualization,Triangle Visualization,QR Code Generator,Contrast Enhancement,Roboflow Dataset Upload,Absolute Static Crop,Qwen 3.5 API,Background Subtraction,OpenAI,Image Slicer,Qwen-VL,Florence-2 Model,Perspective Correction,Twilio 
SMS/MMS Notification,Roboflow Vision Events,Microsoft SQL Server Sink,Instance Segmentation Model,Depth Estimation,Roboflow Custom Metadata,Contrast Equalization,Camera Calibration,VLM As Detector
- outputs:
Cosine Similarity,Identify Changes,Identify Outliers
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Perception Encoder Embedding Model in version v1 has.
Bindings
Example JSON definition of step Perception Encoder Embedding Model in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/perception_encoder@v1",
    "data": "$inputs.image",
    "version": "PE-Core-B16-224"
}
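The step definition above can be placed in a larger workflow that embeds an image and a text prompt and then compares them. The sketch below builds such a definition as a Python dictionary. Only the four Perception Encoder properties come from this page. Everything else is an assumption that should be checked against the relevant block documentation: the top-level layout (`version`, `inputs`, `steps`, `outputs`), the input types, the `embedding` output name, and the Cosine Similarity type identifier and property names. The step and input names are hypothetical.

```python
import json

PE_TYPE = "roboflow_core/perception_encoder@v1"  # type identifier from this page
PE_VERSION = "PE-Core-B16-224"                   # variant used in the example above


def perception_encoder_step(name: str, data: str, version: str = PE_VERSION) -> dict:
    """Build one step definition from the four documented properties."""
    return {"name": name, "type": PE_TYPE, "data": data, "version": version}


workflow = {
    "version": "1.0",  # assumption: workflow schema version
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},        # assumed input type
        {"type": "WorkflowParameter", "name": "prompt"},   # hypothetical text input
    ],
    "steps": [
        perception_encoder_step("image_embedding", "$inputs.image"),
        perception_encoder_step("text_embedding", "$inputs.prompt"),
        {
            "name": "similarity",
            "type": "roboflow_core/cosine_similarity@v1",       # assumed identifier
            "embedding_1": "$steps.image_embedding.embedding",  # assumed output name
            "embedding_2": "$steps.text_embedding.embedding",
        },
    ],
    "outputs": [
        {
            "type": "JsonField",  # assumed output type
            "name": "similarity",
            "selector": "$steps.similarity.similarity",  # assumed output name
        },
    ],
}

print(json.dumps(workflow, indent=2))
```

Building both embedding steps from one helper keeps the model variant consistent, which matters because embeddings from different variants are not generally comparable.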