Perspective Correction¶
Class: PerspectiveCorrectionBlockV1
Transforms detection coordinates, and optionally images, from a perspective view to a top-down orthographic view, correcting camera-angle distortion to enable accurate measurements, top-down analysis, and coordinate normalization in scenarios where objects are viewed at an angle (e.g., surveillance cameras, aerial imagery, or tilted camera setups).
How This Block Works¶
This block corrects perspective distortion by transforming coordinates from a perspective view (where objects appear smaller when further away and angles are distorted) to a top-down orthographic view (as if the camera were directly above the scene). The block:
- Receives input data: images and/or detections (object detection or instance segmentation predictions), along with perspective polygons defining regions to transform
- Processes perspective polygons:
- Selects the largest polygon from provided polygons (if multiple are provided)
- Sorts polygon vertices in clockwise order and orients them starting from the leftmost bottom vertex
- Ensures proper polygon ordering for transformation matrix calculation
- Optionally extends perspective polygons to contain all detections:
- If extend_perspective_polygon_by_detections_anchor is set, extends the polygon to ensure all detection anchor points (or entire bounding boxes if "ALL" is specified) are contained within the polygon
- Calculates extension amounts needed to contain detections outside the original polygon
- Adjusts polygon vertices to create a larger region that encompasses all detections
- Generates perspective transformation matrix:
- Maps the source polygon (4 vertices in the perspective view) to a destination rectangle (top-down view) with the specified width and height
- Uses OpenCV's getPerspectiveTransform to compute the 3x3 transformation matrix
- Handles extended dimensions when polygon extension is enabled
- Applies perspective transformation to detections (if provided):
- Transforms bounding box coordinates from perspective view to top-down coordinates
- Transforms instance segmentation masks by converting masks to polygons, transforming polygon vertices, and converting back to masks in the new coordinate space
- Transforms keypoint coordinates for keypoint detection predictions
- Updates all coordinate data to reflect the corrected perspective
- Optionally warps images (if warp_image is True):
- Applies the perspective transformation to the entire image using OpenCV's warpPerspective
- Produces a top-down view of the image with corrected perspective
- Outputs the warped image at the specified transformed rectangle dimensions (plus any extensions)
- Returns corrected outputs:
- corrected_coordinates: Detections with transformed coordinates in the top-down coordinate space
- warped_image: Perspective-corrected image (if image warping is enabled)
- extended_transformed_rect_width and extended_transformed_rect_height: Final dimensions including any polygon extensions
The transformation effectively "unwarps" the perspective distortion, making coordinates and images appear as if viewed from directly above. This is useful for accurate measurements, area calculations, distance measurements, and spatial analysis where perspective distortion would otherwise introduce errors.
Common Use Cases¶
- Top-Down Analysis: Correct perspective distortion for top-down analysis and measurement (e.g., surveillance camera analysis, overhead view generation, top-down coordinate normalization)
- Accurate Measurements: Enable accurate distance, area, and size measurements by removing perspective distortion (e.g., measure object sizes in real-world units, calculate areas accurately, measure distances without distortion)
- Spatial Analysis: Perform spatial analysis and coordinate-based operations on corrected coordinates (e.g., zone-based analysis, spatial tracking, coordinate-based filtering)
- Aerial and Overhead Imagery: Process aerial imagery or overhead camera feeds with perspective correction (e.g., drone imagery analysis, overhead camera processing, satellite image analysis)
- Quality Control and Inspection: Correct perspective for quality control and inspection workflows (e.g., manufacturing inspection, product quality checks, defect detection with accurate measurements)
- Indoor Navigation and Mapping: Correct perspective for indoor navigation and mapping applications (e.g., floor plan generation, indoor mapping, navigation systems)
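In the top-down view, pixel distances scale uniformly, so a single calibration factor converts them to real-world units. A minimal sketch, assuming a hypothetical calibration in which the 1000 px destination rectangle corresponds to a 20 m wide region:

```python
import numpy as np

# Hypothetical calibration: the destination rectangle is 1000 px wide
# and covers 20 m of real-world width, so each pixel in the top-down
# view represents a fixed, uniform distance.
rect_width_px, real_width_m = 1000, 20.0
meters_per_pixel = real_width_m / rect_width_px

# Two corrected detection centers in top-down coordinates.
a = np.array([120.0, 340.0])
b = np.array([620.0, 340.0])

# With perspective distortion removed, Euclidean pixel distance scales
# linearly to real-world distance: 500 px * 0.02 m/px = 10.0 m.
distance_m = float(np.linalg.norm(a - b) * meters_per_pixel)
```

This linear scaling only holds after correction; in the original perspective view, the same pixel distance corresponds to different real-world distances depending on depth.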
Connecting to Other Blocks¶
This block receives images and/or detections and produces perspective-corrected outputs:
- After detection models to correct coordinates for accurate analysis (e.g., object detection with perspective correction, instance segmentation with corrected coordinates)
- After zone or polygon definition blocks to use defined regions as perspective polygons (e.g., use polygon zones as perspective regions, apply correction to specific regions)
- Before measurement blocks to enable accurate measurements on corrected coordinates (e.g., distance measurement with corrected coordinates, size measurement on top-down view, area calculation on corrected coordinates)
- Before analytics blocks to perform analytics on corrected coordinates (e.g., zone analytics with corrected coordinates, tracking with top-down view, path analysis with corrected paths)
- Before visualization blocks to visualize corrected coordinates and warped images (e.g., display top-down view, visualize corrected detections, show perspective-corrected results)
- In workflow outputs to provide perspective-corrected final outputs (e.g., top-down coordinate outputs, corrected detection outputs, warped image outputs)
Requirements¶
This block requires either images or predictions (detections) as input. The perspective_polygons parameter must contain at least one polygon with exactly 4 vertices defining the region to transform. Polygons can be provided as a list of 4 coordinate pairs [[x1, y1], [x2, y2], [x3, y3], [x4, y4]] or as NumPy arrays. If multiple polygons are provided, the largest polygon (by area) is selected for each batch element. The transformed_rect_width and transformed_rect_height parameters define the dimensions of the output top-down rectangle. The block uses OpenCV's perspective transformation functions, which require proper polygon ordering and valid coordinate data. If polygon extension is enabled, the output dimensions are automatically adjusted to include the extended regions.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/perspective_correction@v1 to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| perspective_polygons | List[Any] | Perspective polygons defining regions to transform from perspective view to top-down view. Each polygon must consist of exactly 4 vertices (coordinates). Format: list of 4 coordinate pairs [[x1, y1], [x2, y2], [x3, y3], [x4, y4]] or NumPy arrays. If multiple polygons are provided for a batch element, the largest polygon (by area) is selected. The polygon defines the source region in the perspective view that will be mapped to the destination rectangle. | ✅ |
| transformed_rect_width | int | Width of the destination rectangle in the top-down view (in pixels). The perspective polygon is transformed to fit this width. Coordinates are scaled to match this dimension. If polygon extension is enabled, the actual output width may be larger to accommodate extended regions. | ✅ |
| transformed_rect_height | int | Height of the destination rectangle in the top-down view (in pixels). The perspective polygon is transformed to fit this height. Coordinates are scaled to match this dimension. If polygon extension is enabled, the actual output height may be larger to accommodate extended regions. | ✅ |
| extend_perspective_polygon_by_detections_anchor | str | Optional setting to extend the perspective polygon to contain all detection anchor points. If set to a Position value (CENTER, CENTER_LEFT, CENTER_RIGHT, TOP_CENTER, TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_CENTER, BOTTOM_RIGHT, CENTER_OF_MASS), extends the polygon to contain that anchor point from all detections. If set to 'ALL', extends to contain entire bounding boxes (all corners). Empty string (default) disables extension. Extension ensures all detections are within the transformed region, automatically adjusting polygon boundaries and output dimensions. | ✅ |
| warp_image | bool | If True, applies perspective transformation to the input image, producing a warped image in the top-down view. The warped image shows the perspective-corrected view at the specified transformed rectangle dimensions (plus any extensions). If False (default), only detection coordinates are transformed, and the original image is returned unchanged. Images must be provided if this is True. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
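One way to picture the extend_perspective_polygon_by_detections_anchor behavior is as computing per-side extension amounts from anchor points that fall outside the polygon's extents. This is an illustrative sketch of the extension-amount calculation only, not the block's actual vertex-adjustment code:

```python
import numpy as np

# Hypothetical 4-vertex polygon (axis-aligned here for clarity).
polygon = np.array([[100, 400], [500, 400], [500, 100], [100, 100]], dtype=float)

# Detection anchor points (e.g. BOTTOM_CENTER of each box), some of
# which fall outside the polygon's extents.
anchors = np.array([[50, 250], [300, 450], [480, 150]], dtype=float)

# Extension amounts per side: how far the anchors overshoot the polygon.
left = max(0.0, polygon[:, 0].min() - anchors[:, 0].min())    # anchor at x=50
right = max(0.0, anchors[:, 0].max() - polygon[:, 0].max())   # none overshoot
top = max(0.0, polygon[:, 1].min() - anchors[:, 1].min())     # none overshoot
bottom = max(0.0, anchors[:, 1].max() - polygon[:, 1].max())  # anchor at y=450
```

These extension amounts are what ultimately enlarge the output rectangle, which is why extended_transformed_rect_width and extended_transformed_rect_height can exceed the requested dimensions.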
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Perspective Correction in version v1.
- inputs:
Contrast Equalization, Clip Comparison, Detections Transformation, VLM as Detector, Polygon Visualization, Image Blur, SIFT Comparison, Text Display, SIFT, Moondream2, Google Vision OCR, Pixelate Visualization, Time in Zone, VLM as Classifier, Detection Offset, Detections Filter, Instance Segmentation Model, Perspective Correction, Halo Visualization, Image Threshold, Path Deviation, Keypoint Detection Model, CSV Formatter, Florence-2 Model, Detections Stabilizer, Twilio SMS Notification, Image Convert Grayscale, Corner Visualization, Dynamic Zone, Identify Changes, Icon Visualization, SAM 3, Detections Consensus, Multi-Label Classification Model, Detections Stitch, Dynamic Crop, Bounding Box Visualization, YOLO-World Model, Detection Event Log, Detections Classes Replacement, Blur Visualization, Camera Calibration, Line Counter, Path Deviation, OpenAI, Camera Focus, Trace Visualization, CogVLM, Image Slicer, Absolute Static Crop, Dot Visualization, Label Visualization, Slack Notification, Google Gemini, Object Detection Model, LMM For Classification, Stitch OCR Detections, OpenAI, Classification Label Visualization, Stitch OCR Detections, Byte Tracker, Velocity, Twilio SMS/MMS Notification, Anthropic Claude, Clip Comparison, VLM as Detector, Webhook Sink, Llama 3.2 Vision, SIFT Comparison, Anthropic Claude, Time in Zone, QR Code Generator, Local File Sink, Email Notification, Roboflow Dataset Upload, Motion Detection, Model Comparison Visualization, Camera Focus, PTZ Tracking (ONVIF), LMM, Byte Tracker, SAM 3, Mask Visualization, Relative Static Crop, Anthropic Claude, Object Detection Model, Detections Merge, Circle Visualization, Seg Preview, EasyOCR, Stability AI Inpainting, Reference Path Visualization, Time in Zone, Detections Combine, Ellipse Visualization, Crop Visualization, Overlap Filter, Line Counter, Image Preprocessing, Detections List Roll-Up, Segment Anything 2 Model, Background Subtraction, Image Contours, Image Slicer, Depth Estimation, Pixel Color Count, Stitch Images, VLM as Classifier, Model Monitoring Inference Aggregator, Instance Segmentation Model, Line Counter Visualization, Morphological Transformation, Polygon Zone Visualization, Single-Label Classification Model, Email Notification, OCR Model, Keypoint Visualization, Distance Measurement, Google Gemini, Roboflow Custom Metadata, OpenAI, Color Visualization, Size Measurement, Byte Tracker, Identify Outliers, Buffer, Florence-2 Model, Google Gemini, JSON Parser, Grid Visualization, Template Matching, OpenAI, Dimension Collapse, Bounding Rectangle, Background Color Visualization, SAM 3, Stability AI Outpainting, Roboflow Dataset Upload, Triangle Visualization, Stability AI Image Generation
- outputs:
Contrast Equalization, Clip Comparison, Detections Transformation, VLM as Detector, Polygon Visualization, Image Blur, SIFT Comparison, Text Display, SIFT, Moondream2, Qwen3-VL, Google Vision OCR, Pixelate Visualization, Time in Zone, VLM as Classifier, Detection Offset, Detections Filter, Instance Segmentation Model, Perspective Correction, Halo Visualization, Image Threshold, Path Deviation, Keypoint Detection Model, Florence-2 Model, Detections Stabilizer, Twilio SMS Notification, Image Convert Grayscale, Perception Encoder Embedding Model, Corner Visualization, Dynamic Zone, Identify Changes, Icon Visualization, SAM 3, Qwen2.5-VL, Detections Consensus, Multi-Label Classification Model, Detections Stitch, Dynamic Crop, QR Code Detection, Bounding Box Visualization, YOLO-World Model, Detection Event Log, Detections Classes Replacement, Blur Visualization, Camera Calibration, Line Counter, Dominant Color, Path Deviation, OpenAI, Camera Focus, Trace Visualization, CogVLM, Image Slicer, Absolute Static Crop, Dot Visualization, Label Visualization, Slack Notification, Google Gemini, Object Detection Model, LMM For Classification, Stitch OCR Detections, OpenAI, Stitch OCR Detections, Classification Label Visualization, Byte Tracker, Velocity, Twilio SMS/MMS Notification, Anthropic Claude, Gaze Detection, Clip Comparison, VLM as Detector, Webhook Sink, Llama 3.2 Vision, SIFT Comparison, Anthropic Claude, Time in Zone, QR Code Generator, SmolVLM2, Email Notification, CLIP Embedding Model, Roboflow Dataset Upload, Motion Detection, Model Comparison Visualization, Camera Focus, PTZ Tracking (ONVIF), LMM, Byte Tracker, Single-Label Classification Model, Mask Visualization, SAM 3, Anthropic Claude, Relative Static Crop, Object Detection Model, Detections Merge, Keypoint Detection Model, Circle Visualization, Seg Preview, EasyOCR, Stability AI Inpainting, Multi-Label Classification Model, Reference Path Visualization, Time in Zone, Detections Combine, Crop Visualization, Ellipse Visualization, Overlap Filter, Line Counter, Image Preprocessing, Barcode Detection, Detections List Roll-Up, Segment Anything 2 Model, Background Subtraction, Image Slicer, Image Contours, Depth Estimation, Pixel Color Count, Stitch Images, VLM as Classifier, Model Monitoring Inference Aggregator, Instance Segmentation Model, Line Counter Visualization, Morphological Transformation, Polygon Zone Visualization, Single-Label Classification Model, Email Notification, OCR Model, Distance Measurement, Roboflow Custom Metadata, Keypoint Visualization, Google Gemini, OpenAI, Color Visualization, Size Measurement, Byte Tracker, Identify Outliers, Buffer, Florence-2 Model, Google Gemini, Grid Visualization, Template Matching, OpenAI, Bounding Rectangle, Background Color Visualization, Roboflow Dataset Upload, Stability AI Outpainting, SAM 3, Triangle Visualization, Stability AI Image Generation
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
Perspective Correction in version v1 has.
Bindings
- input
- predictions (Union[instance_segmentation_prediction, object_detection_prediction]): Optional object detection or instance segmentation predictions to transform. If provided, bounding boxes, masks, and keypoints are transformed to the top-down coordinate space. If not provided, only image warping is performed (if enabled). Either predictions or images must be provided.
- images (image): Input images to optionally warp to top-down view. Required if warp_image is True. Images are transformed using the perspective transformation matrix to produce top-down views. If only images are provided (no predictions), only image warping is performed.
- perspective_polygons (list_of_values): Perspective polygons defining regions to transform from perspective view to top-down view. Each polygon must consist of exactly 4 vertices (coordinates). Format: list of 4 coordinate pairs [[x1, y1], [x2, y2], [x3, y3], [x4, y4]] or NumPy arrays. If multiple polygons are provided for a batch element, the largest polygon (by area) is selected. The polygon defines the source region in the perspective view that will be mapped to the destination rectangle.
- transformed_rect_width (integer): Width of the destination rectangle in the top-down view (in pixels). The perspective polygon is transformed to fit this width. Coordinates are scaled to match this dimension. If polygon extension is enabled, the actual output width may be larger to accommodate extended regions.
- transformed_rect_height (integer): Height of the destination rectangle in the top-down view (in pixels). The perspective polygon is transformed to fit this height. Coordinates are scaled to match this dimension. If polygon extension is enabled, the actual output height may be larger to accommodate extended regions.
- extend_perspective_polygon_by_detections_anchor (string): Optional setting to extend the perspective polygon to contain all detection anchor points. If set to a Position value (CENTER, CENTER_LEFT, CENTER_RIGHT, TOP_CENTER, TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_CENTER, BOTTOM_RIGHT, CENTER_OF_MASS), extends the polygon to contain that anchor point from all detections. If set to 'ALL', extends to contain entire bounding boxes (all corners). Empty string (default) disables extension. Extension ensures all detections are within the transformed region, automatically adjusting polygon boundaries and output dimensions.
- warp_image (boolean): If True, applies perspective transformation to the input image, producing a warped image in the top-down view. The warped image shows the perspective-corrected view at the specified transformed rectangle dimensions (plus any extensions). If False (default), only detection coordinates are transformed, and the original image is returned unchanged. Images must be provided if this is True.
- output
- corrected_coordinates (Union[object_detection_prediction, instance_segmentation_prediction]): Prediction with detected bounding boxes in the form of an sv.Detections(...) object if object_detection_prediction, or prediction with detected bounding boxes and segmentation masks in the form of an sv.Detections(...) object if instance_segmentation_prediction.
- warped_image (image): Image in workflows.
- extended_transformed_rect_width (integer): Integer value.
- extended_transformed_rect_height (integer): Integer value.
Example JSON definition of step Perspective Correction in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/perspective_correction@v1",
    "predictions": "$steps.object_detection_model.predictions",
    "images": "$inputs.image",
    "perspective_polygons": "$steps.perspective_wrap.zones",
    "transformed_rect_width": 1000,
    "transformed_rect_height": 1000,
    "extend_perspective_polygon_by_detections_anchor": "CENTER",
    "warp_image": false
}
```