SIFT¶
Class: SIFTBlockV1
Source: inference.core.workflows.core_steps.classical_cv.sift.v1.SIFTBlockV1
Detect and describe distinctive visual features in images using SIFT (Scale-Invariant Feature Transform). The block extracts keypoints (interest points) and computes 128-dimensional feature descriptors that are invariant to scale and rotation and partially invariant to lighting changes, enabling feature-based image matching, object recognition, and image similarity detection workflows.
How This Block Works¶
This block detects distinctive visual features in an image using SIFT and computes feature descriptors for each detected keypoint. The block:
- Receives an input image to analyze for feature detection
- Converts the image to grayscale (SIFT operates on grayscale images for efficiency and robustness)
- Creates a SIFT detector using OpenCV's SIFT implementation
- Detects keypoints and computes descriptors simultaneously using detectAndCompute:
    - Keypoint Detection: Identifies distinctive interest points (keypoints) in the image that are stable across different viewing conditions
        - Keypoints are detected at multiple scales (pyramid of scale-space images) to handle scale variations
        - Keypoints are detected with orientation assignment to handle rotation variations
        - Each keypoint has properties: position (x, y coordinates), size (scale at which it was detected), angle (orientation), response (strength), octave (scale level), and class_id
    - Descriptor Computation: Computes 128-dimensional feature descriptors for each keypoint that describe the local image region around the keypoint
        - Descriptors encode gradient information in the local region, making them distinctive and robust to lighting changes
        - Descriptors are normalized to be partially invariant to illumination changes
- Draws keypoints on the original image for visualization:
    - Uses OpenCV's drawKeypoints to overlay keypoint markers on the image
    - Visualizes keypoint locations, orientations, and scales
    - Creates a visual representation showing where features were detected
- Converts keypoints to dictionary format:
    - Extracts keypoint properties (position, size, angle, response, octave, class_id) into dictionaries
    - Makes keypoint data accessible for downstream processing and analysis
- Returns the image with keypoints drawn, the keypoints data (as dictionaries), and the descriptors (as numpy array)
SIFT features are scale-invariant (work at different zoom levels), rotation-invariant (handle rotated images), and partially lighting-invariant (robust to illumination changes). This makes them highly effective for matching the same object or scene across different images taken from different viewpoints, distances, angles, or lighting conditions. The 128-dimensional descriptors provide rich information about local image regions, enabling robust feature matching and comparison.
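To illustrate why normalization gives partial illumination invariance, the sketch below follows the normalization scheme described in the original SIFT method (unit-normalize, clip large components, renormalize). It is an assumption that this matches the conceptual behavior rather than OpenCV's exact internal code; the function name and the `clip` default of 0.2 are illustrative.

```python
import numpy as np


def normalize_descriptor(raw, clip: float = 0.2) -> np.ndarray:
    """Unit-normalize, clip large components, then renormalize (illustrative)."""
    v = np.asarray(raw, dtype=np.float64)
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    # Dividing by the norm cancels any uniform scaling of gradient magnitudes,
    # which is what a global contrast change produces.
    v = v / norm
    # Clipping limits the influence of a few very large gradients, which
    # reduces sensitivity to non-linear lighting effects such as saturation.
    v = np.minimum(v, clip)
    return v / np.linalg.norm(v)
```

Scaling the raw gradient histogram by any positive constant leaves the normalized descriptor unchanged, which is the sense in which the descriptor is robust to contrast changes.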
Common Use Cases¶
- Feature-Based Image Matching: Detect features for matching objects or scenes across different images (e.g., match objects in multiple images, find corresponding features across viewpoints, identify matching regions in image pairs)
- Object Recognition: Recognize and identify objects by their distinctive features (e.g., match object features against known references for classification)
- Image Similarity Detection: Detect similar images by comparing SIFT features (e.g., find similar images in databases, detect duplicate images, identify matching scenes)
- Feature Extraction for Analysis: Extract distinctive features from images for further analysis (e.g., analyze image characteristics, identify interesting regions)
- Visual Localization: Use SIFT features for visual localization and mapping (e.g., localize objects in scenes, track features across frames, map feature correspondences)
- Image Registration: Align images using SIFT feature correspondences (e.g., register images for stitching, align images from different viewpoints)
Connecting to Other Blocks¶
This block receives an image and produces SIFT keypoints and descriptors. Typical placements in a workflow:
- After image input blocks to extract SIFT features from input images (e.g., detect features in camera feeds or uploaded images)
- After preprocessing blocks to extract features from preprocessed images (e.g., detect features after filtering or enhancement)
- Before SIFT Comparison blocks to provide SIFT descriptors for image comparison (e.g., supply descriptors for matching or similarity detection)
- Before filtering or logic blocks that use feature counts or properties for decision-making (e.g., filter based on feature count, apply logic based on feature properties)
- Before data storage blocks to store feature data (e.g., save keypoints and descriptors for later analysis)
- Before visualization blocks to display detected features (e.g., visualize keypoint locations, show feature analysis results)
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/sift@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to SIFT in version v1.
- inputs:
Polygon Visualization,Polygon Visualization,SIFT,Circle Visualization,Classification Label Visualization,Line Counter Visualization,Stability AI Image Generation,Relative Static Crop,Image Blur,Grid Visualization,Reference Path Visualization,Camera Focus,Image Preprocessing,Keypoint Visualization,Icon Visualization,Color Visualization,Halo Visualization,Triangle Visualization,Dot Visualization,QR Code Generator,Contrast Enhancement,Absolute Static Crop,Dynamic Crop,Stability AI Inpainting,Image Slicer,Background Subtraction,Label Visualization,Background Color Visualization,Bounding Box Visualization,Polygon Zone Visualization,Stability AI Outpainting,Crop Visualization,Pixelate Visualization,Image Convert Grayscale,Mask Visualization,Halo Visualization,Heatmap Visualization,Image Slicer,Perspective Correction,Stitch Images,Text Display,Morphological Transformation,Morphological Transformation,Image Threshold,Blur Visualization,Depth Estimation,Trace Visualization,Camera Focus,Contrast Equalization,Camera Calibration,Corner Visualization,Ellipse Visualization,Model Comparison Visualization,SIFT Comparison,Image Contours
- outputs:
MoonshotAI Kimi,Image Blur,SmolVLM2,Reference Path Visualization,SIFT Comparison,Event Writer,SAM2 Video Tracker,VLM As Classifier,Halo Visualization,CLIP Embedding Model,Image Stack,Clip Comparison,Google Gemma,Qwen 3.6 API,Object Detection Model,Dot Visualization,Label Visualization,Background Color Visualization,Llama 3.2 Vision,SAM 3 Interactive,Pixelate Visualization,Qwen3-VL,Google Gemini,Track Class Lock,Anthropic Claude,OpenAI,Trace Visualization,Llama 3.2 Vision,ByteTrack Tracker,Clip Comparison,Camera Focus,GLM-OCR,OpenAI,Qwen3.5,Buffer,QR Code Detection,SIFT Comparison,Image Contours,Motion Detection,Google Gemini,MoonshotAI Kimi,Polygon Visualization,SIFT,Classification Label Visualization,Multi-Label Classification Model,Instance Segmentation Model,Keypoint Detection Model,Template Matching,Keypoint Visualization,Instance Segmentation Model,Icon Visualization,Seg Preview,Dynamic Crop,Stability AI Inpainting,Bounding Box Visualization,BoT-SORT Tracker,Multi-Label Classification Model,Polygon Zone Visualization,Crop Visualization,Stability AI Outpainting,Image Convert Grayscale,Mask Visualization,Halo Visualization,Detections Stitch,SORT Tracker,Barcode Detection,Text Display,Anthropic Claude,Morphological Transformation,VLM As Classifier,Roboflow Dataset Upload,VLM As Detector,Object Detection Model,Ellipse Visualization,Keypoint Detection Model,SAM3 Video Tracker,SAM 3,Circle Visualization,Semantic Segmentation Model,Email Notification,Camera Focus,Single-Label Classification Model,SAM 3,Image Slicer,LMM For Classification,Dominant Color,OCR Model,Heatmap Visualization,Google Gemma API,OpenAI,Stitch Images,Morphological Transformation,EasyOCR,Single-Label Classification Model,YOLO-World Model,Blur Visualization,Moondream2,Florence-2 Model,Google Gemini,Corner Visualization,OpenRouter,Detections Stabilizer,Pixel Color Count,Model Comparison Visualization,SAM 3,Google Vision OCR,Byte Tracker,Image Threshold,Instance Segmentation Model,LMM,Single-Label Classification Model,Polygon Visualization,Segment Anything 2 Model,Time in Zone,Mask Edge Snap,Line Counter Visualization,Stability AI Image Generation,CogVLM,Relative Static Crop,Qwen3.5-VL,Image Preprocessing,Gaze Detection,Anthropic Claude,Color Visualization,Triangle Visualization,Roboflow Dataset Upload,Contrast Enhancement,Absolute Static Crop,Qwen 3.5 API,Background Subtraction,Multi-Label Classification Model,OC-SORT Tracker,OpenAI,Image Slicer,Semantic Segmentation Model,Qwen-VL,Florence-2 Model,Perspective Correction,Roboflow Vision Events,Twilio SMS/MMS Notification,Perception Encoder Embedding Model,Instance Segmentation Model,Depth Estimation,Contrast Equalization,Camera Calibration,VLM As Detector,Qwen2.5-VL,Keypoint Detection Model,Object Detection Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds
SIFT in version v1 has.
Bindings
- input
    - image (image): Input image to analyze for SIFT feature detection. The image will be converted to grayscale internally for SIFT processing. SIFT works best on images with good texture and detail - images with rich visual content (edges, corners, patterns) produce more keypoints than uniform or smooth images. Each detected keypoint will have a 128-dimensional descriptor computed. The output includes an image with keypoints drawn for visualization, keypoint data (position, size, angle, response, octave), and descriptor arrays for matching and comparison. SIFT features are scale and rotation invariant, making them effective for matching across different viewpoints and conditions.
- output
    - image (image): Image in workflows.
    - keypoints (image_keypoints): Image keypoints detected by classical Computer Vision method.
    - descriptors (numpy_array): Numpy array.
Example JSON definition of step SIFT in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/sift@v1",
    "image": "$inputs.image"
}
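For context, the step definition above normally sits inside a larger workflow definition. The sketch below builds one as a Python dictionary and serializes it to JSON. Only the step itself comes from this page; the surrounding `version`, `inputs`, and `outputs` fields, the `WorkflowImage` and `JsonField` type names, and the helper name `build_sift_workflow` are assumptions about the general workflow definition format and should be checked against the Workflows documentation.

```python
import json


def build_sift_workflow(step_name: str = "sift") -> dict:
    """Assemble a hypothetical workflow definition containing one SIFT step."""
    # The step fields below are taken from the example step definition.
    sift_step = {
        "name": step_name,
        "type": "roboflow_core/sift@v1",
        "image": "$inputs.image",
    }
    # Everything outside "steps" is an assumed wrapper format.
    return {
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [sift_step],
        "outputs": [
            {
                "type": "JsonField",
                "name": "keypoints",
                "selector": f"$steps.{step_name}.keypoints",
            },
            {
                "type": "JsonField",
                "name": "descriptors",
                "selector": f"$steps.{step_name}.descriptors",
            },
        ],
    }


workflow_json = json.dumps(build_sift_workflow(), indent=2)
```

The selectors reference the `keypoints` and `descriptors` outputs listed in the bindings above; the `image` output could be exposed the same way.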