Image Stack
Class: ImageStackBlockV1
Source: inference.core.workflows.core_steps.fusion.image_stack.v1.ImageStackBlockV1
Accumulate compressed video frames into a fixed-size stack, returning the most recent N frames as JPEG-encoded binary blobs. Designed for shared-hosting safety: frames are always JPEG-compressed and downsampled to fit within resolution limits, preventing out-of-memory conditions.
How This Block Works
- Receives a video frame (WorkflowImageData) each workflow cycle.
- Downsamples the frame if it exceeds the configured resolution limits (default 1920x1080), preserving aspect ratio.
- JPEG-encodes the frame at quality 75 and stores the resulting bytes.
- Maintains a per-camera FIFO buffer (deque) of up to stack_size compressed frames. When the buffer is full, the oldest frame is automatically evicted.
- If stack_size changes between calls (e.g. via a dynamic selector), the buffer is resized and existing frames are preserved up to the new limit.
- If the clear input is True, the buffer is flushed before the current frame is added.
- Outputs the list of JPEG byte blobs (newest first) and the current frame count.
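The buffering steps above can be sketched with a plain Python deque. This is a minimal illustration, not the block's actual implementation: the FrameStack class and its push method are hypothetical names, frames are assumed to arrive already JPEG-encoded, and the choice to keep the most recent frames when the stack shrinks is an assumption, since the description only says frames are preserved up to the new limit.

```python
from collections import deque
from typing import Deque, List, Tuple


class FrameStack:
    """Hypothetical rolling buffer of already-encoded JPEG frames."""

    def __init__(self, stack_size: int) -> None:
        self._frames: Deque[bytes] = deque(maxlen=stack_size)

    def push(
        self, jpeg_bytes: bytes, stack_size: int, clear: bool = False
    ) -> Tuple[List[bytes], int]:
        # Resize when stack_size changed between calls. deque(iterable, maxlen=n)
        # keeps the last n items, which are the most recent frames here because
        # new frames are appended on the right (assumed eviction policy).
        if stack_size != self._frames.maxlen:
            self._frames = deque(self._frames, maxlen=stack_size)
        # Flush before adding the current frame when clear is set.
        if clear:
            self._frames.clear()
        # Appending to a full deque silently drops the oldest frame.
        self._frames.append(jpeg_bytes)
        newest_first = list(reversed(self._frames))
        return newest_first, len(newest_first)


stack = FrameStack(stack_size=3)
for i in range(5):
    frames, frames_count = stack.push(f"frame-{i}".encode(), stack_size=3)
print(frames, frames_count)
```

A per-camera setup could keep one such object per camera identifier in a dictionary, so that frames from different video sources never mix.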
Common Use Cases
- Action / activity recognition: accumulate a clip of N frames and pass them to a vision-language model (e.g. Google Gemini, Qwen) that can reason over multiple images to classify actions, detect events, or describe what is happening in a scene.
- Time-lapse snapshots: collect the last N frames for periodic visual comparison.
- Event buffering: keep a rolling window of frames around an event of interest.
Type identifier
Use the identifier roboflow_core/image_stack@v1 in the step "type" field to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| stack_size | int | Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted. | ✅ |
| resolution_width | int | Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio. | ✅ |
| resolution_height | int | Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio. | ✅ |
| clear | bool | When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more information.
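To make the resolution_width and resolution_height limits concrete, the following sketch computes the output size of a frame that must fit within both limits while keeping its aspect ratio. The fit_within helper is a hypothetical name, and the rounding rule (nearest pixel, at least 1 px) is an assumption; the block may round differently.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, max_width: int = 1920, max_height: int = 1080
) -> Tuple[int, int]:
    """Scale (width, height) down, never up, to fit both limits."""
    # The tighter of the two limits decides the scale; 1.0 caps it so that
    # frames already within the limits are left untouched.
    scale = min(max_width / width, max_height / height, 1.0)
    # Assumed rounding rule: nearest pixel, never below 1 px.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(3840, 2160))           # 4K frame against the default limits
print(fit_within(1280, 720, 640, 480))  # width is the binding limit here
print(fit_within(320, 240, 640, 480))   # already small enough, unchanged
```

With limits of 640x480, a 1280x720 frame ends up at 640x360 rather than 640x480, because the width limit is reached first and the height follows from the aspect ratio.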
Available Connections
Compatible Blocks
Check what blocks you can connect to Image Stack in version v1.
- inputs: Triangle Visualization, Background Color Visualization, Heatmap Visualization, Classification Label Visualization, Dynamic Zone, Pixelate Visualization, Polygon Zone Visualization, SIFT Comparison, Stitch Images, Ellipse Visualization, S3 Sink, Image Stack, Image Blur, Distance Measurement, Background Subtraction, Roboflow Dataset Upload, Camera Focus, Morphological Transformation, Identify Changes, QR Code Generator, Stability AI Outpainting, Image Preprocessing, Detection Event Log, Stability AI Image Generation, Keypoint Visualization, Contrast Enhancement, Email Notification, Line Counter, Crop Visualization, Roboflow Dataset Upload, Twilio SMS/MMS Notification, Image Convert Grayscale, Relative Static Crop, Roboflow Vision Events, Identify Outliers, Perspective Correction, Halo Visualization, VLM As Detector, Bounding Box Visualization, PTZ Tracking (ONVIF), Email Notification, Twilio SMS Notification, Line Counter, Contrast Equalization, Image Contours, Text Display, Icon Visualization, SIFT, Circle Visualization, Mask Visualization, Reference Path Visualization, Image Slicer, Camera Focus, Roboflow Custom Metadata, Image Slicer, Absolute Static Crop, Polygon Visualization, Motion Detection, Polygon Visualization, JSON Parser, Model Monitoring Inference Aggregator, Image Threshold, Halo Visualization, Stability AI Inpainting, Grid Visualization, Dynamic Crop, Color Visualization, Blur Visualization, Corner Visualization, Label Visualization, Detections Consensus, VLM As Detector, Dot Visualization, Camera Calibration, Morphological Transformation, Trace Visualization, Model Comparison Visualization, Slack Notification, VLM As Classifier, SIFT Comparison, Depth Estimation, Template Matching, Pixel Color Count, VLM As Classifier, Local File Sink, Line Counter Visualization, Webhook Sink
- outputs: Keypoint Detection Model, Triangle Visualization, SAM 3, Classification Label Visualization, Dynamic Zone, Florence-2 Model, Pixelate Visualization, Ellipse Visualization, Object Detection Model, Image Stack, Path Deviation, OpenAI, Clip Comparison, Dominant Color, Background Subtraction, Roboflow Dataset Upload, Stability AI Outpainting, Google Gemini, Email Notification, Roboflow Vision Events, Instance Segmentation Model, Detections List Roll-Up, Instance Segmentation Model, VLM As Detector, Email Notification, Image Contours, Icon Visualization, Circle Visualization, Reference Path Visualization, Buffer, Mask Visualization, OpenAI, Image Slicer, Time in Zone, Image Slicer, Polygon Visualization, Polygon Visualization, SAM 3, Clip Comparison, Halo Visualization, Stability AI Inpainting, Stitch OCR Detections, Detections Classes Replacement, Grid Visualization, Color Visualization, Time in Zone, VLM As Detector, Dot Visualization, Size Measurement, Llama 3.2 Vision, Trace Visualization, Slack Notification, Instance Segmentation Model, Qwen 3.5 API, Byte Tracker, Google Gemma API, LMM For Classification, Anthropic Claude, VLM As Classifier, OpenAI, Byte Tracker, Webhook Sink, Mask Edge Snap, Time in Zone, Heatmap Visualization, SORT Tracker, Polygon Zone Visualization, SIFT Comparison, Stitch Images, Image Blur, Byte Tracker, Morphological Transformation, Identify Changes, QR Code Generator, Image Preprocessing, Keypoint Visualization, Line Counter, Crop Visualization, Roboflow Dataset Upload, Twilio SMS/MMS Notification, Identify Outliers, Perspective Correction, Halo Visualization, Bounding Box Visualization, Object Detection Model, Qwen 3.6 API, Seg Preview, Twilio SMS Notification, Line Counter, OC-SORT Tracker, Text Display, MoonshotAI Kimi, Stitch OCR Detections, Anthropic Claude, Florence-2 Model, Google Gemini, Absolute Static Crop, Motion Detection, Keypoint Detection Model, YOLO-World Model, Google Gemini, Line Counter Visualization, Image Threshold, Blur Visualization, Corner Visualization, Object Detection Model, Label Visualization, Detections Consensus, SAM2 Video Tracker, Detection Offset, SAM 3, Morphological Transformation, Detections Stabilizer, Anthropic Claude, Keypoint Detection Model, VLM As Classifier, Path Deviation, SIFT Comparison, ByteTrack Tracker, Pixel Color Count, Cache Set, PTZ Tracking (ONVIF)
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
Image Stack in version v1 are listed below.
Bindings
- input
    - image (image): Video frame to add to the stack.
    - stack_size (integer): Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted.
    - resolution_width (integer): Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio.
    - resolution_height (integer): Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio.
    - clear (boolean): When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.
- output
    - frames (list_of_values): List of values of any type.
    - frames_count (integer): Integer value.
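Since the frames output holds JPEG byte blobs ordered newest first, downstream code that sends a clip to a multi-image model will often want them oldest first and in a text-safe encoding. The helper below is a hypothetical sketch of that post-processing using only the standard library; whether a given model expects base64 strings, and in which order, is an assumption to check against that model's own interface.

```python
import base64
from typing import List


def frames_to_base64(frames: List[bytes], chronological: bool = True) -> List[str]:
    """Base64-encode JPEG blobs, optionally flipping newest-first to oldest-first."""
    # The block emits newest first; reversing yields playback (chronological) order.
    ordered = list(reversed(frames)) if chronological else list(frames)
    return [base64.b64encode(blob).decode("ascii") for blob in ordered]


# Placeholder payloads standing in for real JPEG data (they start with the JPEG SOI marker).
newest_first = [b"\xff\xd8frame-2", b"\xff\xd8frame-1", b"\xff\xd8frame-0"]
encoded = frames_to_base64(newest_first)
print(len(encoded), base64.b64decode(encoded[0]))
```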
Example JSON definition of step Image Stack in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": false
}
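For context, the step definition above could sit inside a complete workflow definition along the following lines. This is a sketch that assumes the usual Workflows layout of version, inputs, steps, and outputs; the step name frame_stack, the stack_size input parameter, and the output names are illustrative choices, not values taken from this page.

```python
import json

# Hypothetical workflow wrapping a single Image Stack step.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # Exposing stack_size as a runtime parameter uses the dynamic binding
        # that the Refs column marks as available for this property.
        {"type": "WorkflowParameter", "name": "stack_size", "default_value": 5},
    ],
    "steps": [
        {
            "name": "frame_stack",
            "type": "roboflow_core/image_stack@v1",
            "image": "$inputs.image",
            "stack_size": "$inputs.stack_size",
            "resolution_width": 640,
            "resolution_height": 480,
            "clear": False,
        }
    ],
    "outputs": [
        {"type": "JsonField", "name": "frames", "selector": "$steps.frame_stack.frames"},
        {
            "type": "JsonField",
            "name": "frames_count",
            "selector": "$steps.frame_stack.frames_count",
        },
    ],
}

print(json.dumps(workflow_definition, indent=2))
```

Binding stack_size to an input rather than a literal is what triggers the resize behaviour described earlier when the value changes between calls.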