Image Stack
Class: ImageStackBlockV1
Source: inference.core.workflows.core_steps.fusion.image_stack.v1.ImageStackBlockV1
Accumulate compressed video frames into a fixed-size stack, returning the most recent N frames as JPEG-encoded binary blobs. Designed for shared-hosting safety: frames are always JPEG-compressed and downsampled to fit within resolution limits, preventing out-of-memory conditions.
How This Block Works
- Receives a video frame (WorkflowImageData) each workflow cycle.
- Downsamples the frame if it exceeds the configured resolution limits (default 1920x1080), preserving aspect ratio.
- JPEG-encodes the frame at quality 75 and stores the resulting bytes.
- Maintains a per-camera FIFO buffer (deque) of up to `stack_size` compressed frames. When the buffer is full, the oldest frame is automatically evicted.
- If `stack_size` changes between calls (e.g. via a dynamic selector), the buffer is resized and existing frames are preserved up to the new limit.
- If the `clear` input is True, the buffer is flushed before the current frame is added.
- Outputs the list of JPEG byte blobs (newest first) and the current frame count.
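The buffering behaviour described above can be sketched with the standard library alone. This is an illustrative approximation, not the block's actual source: the `FrameStack` class, its `push` method, and the `camera_id` key are hypothetical names, and frames are assumed to arrive already JPEG-encoded as `bytes`.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class FrameStack:
    """Hypothetical per-camera FIFO buffers of already-compressed frames."""

    def __init__(self) -> None:
        self._buffers: Dict[str, Deque[bytes]] = {}

    def push(
        self,
        camera_id: str,
        jpeg_bytes: bytes,
        stack_size: int,
        clear: bool = False,
    ) -> Tuple[List[bytes], int]:
        buffer = self._buffers.get(camera_id)
        if buffer is None:
            # First frame for this camera: create a bounded deque.
            buffer = deque(maxlen=stack_size)
            self._buffers[camera_id] = buffer
        elif buffer.maxlen != stack_size:
            # Rebuilding with a new maxlen keeps the most recent frames
            # that still fit within the new limit.
            buffer = deque(buffer, maxlen=stack_size)
            self._buffers[camera_id] = buffer
        if clear:
            # Flush before the current frame is added.
            buffer.clear()
        # A full deque evicts its oldest entry automatically on append.
        buffer.append(jpeg_bytes)
        frames = list(reversed(buffer))  # newest first
        return frames, len(frames)


stack = FrameStack()
for i in range(4):
    frames, count = stack.push("cam-0", b"frame-%d" % i, stack_size=3)
print(frames, count)
```

After four pushes with `stack_size=3`, the oldest frame has been evicted and the remaining three are returned newest first.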
Common Use Cases
- Action / activity recognition: accumulate a clip of N frames and pass them to a vision-language model (e.g. Google Gemini, Qwen) that can reason over multiple images to classify actions, detect events, or describe what is happening in a scene.
- Time-lapse snapshots: collect the last N frames for periodic visual comparison.
- Event buffering: keep a rolling window of frames around an event of interest.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/image_stack@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `stack_size` | `int` | Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted. | ✅ |
| `resolution_width` | `int` | Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio. | ✅ |
| `resolution_height` | `int` | Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio. | ✅ |
| `clear` | `bool` | When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
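To make the effect of the resolution limits concrete, here is a minimal sketch of an aspect-ratio-preserving size calculation. The helper name `fit_within` and the rounding rule are assumptions for illustration; the block's actual resizing code may round differently.

```python
from typing import Tuple


def fit_within(
    width: int,
    height: int,
    max_width: int = 1920,
    max_height: int = 1080,
) -> Tuple[int, int]:
    """Return the target size for a frame so it fits both limits.

    Hypothetical helper: frames already inside the limits are left
    unchanged (never upscaled); larger frames are scaled by one common
    factor so the aspect ratio is preserved.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    if scale >= 1.0:
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(3840, 2160))            # 4K frame against the default limits
print(fit_within(1280, 720))             # already within limits
print(fit_within(4000, 1000, 640, 480))  # wide frame, custom limits
```

The smaller of the two scale factors wins, so a very wide frame is constrained by `resolution_width` while a very tall one is constrained by `resolution_height`.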
Available Connections
Compatible Blocks
Check what blocks you can connect to Image Stack in version v1.
- inputs:
Bounding Box Visualization, Image Slicer, Distance Measurement, Identify Outliers, PTZ Tracking (ONVIF), Roboflow Dataset Upload, Morphological Transformation, Slack Notification, Camera Calibration, Email Notification, Corner Visualization, Heatmap Visualization, Halo Visualization, Motion Detection, Stability AI Image Generation, VLM As Detector, Color Visualization, Text Display, Line Counter Visualization, Line Counter, Image Slicer, VLM As Classifier, Reference Path Visualization, Pixelate Visualization, Email Notification, Ellipse Visualization, Roboflow Asset Library Attributes, Camera Focus, Image Blur, Dynamic Crop, Identify Changes, Relative Static Crop, Halo Visualization, Morphological Transformation, Perspective Correction, Camera Focus, Twilio SMS/MMS Notification, Trace Visualization, QR Code Generator, JSON Parser, Dot Visualization, Detection Event Log, Image Stack, Background Color Visualization, Crop Visualization, VLM As Classifier, Detections Consensus, Roboflow Dataset Upload, Classification Label Visualization, Polygon Visualization, Background Subtraction, Stability AI Outpainting, Pixel Color Count, Contrast Enhancement, Circle Visualization, SIFT Comparison, Model Comparison Visualization, Absolute Static Crop, Grid Visualization, Image Threshold, Triangle Visualization, Dynamic Zone, Local File Sink, Line Counter, Image Preprocessing, Roboflow Vision Events, Depth Estimation, SIFT Comparison, Webhook Sink, Stability AI Inpainting, Template Matching, Mask Visualization, VLM As Detector, Polygon Zone Visualization, Stitch Images, SIFT, Icon Visualization, Label Visualization, Roboflow Custom Metadata, Keypoint Visualization, Model Monitoring Inference Aggregator, Contrast Equalization, S3 Sink, Polygon Visualization, Image Convert Grayscale, Image Contours, Twilio SMS Notification, Blur Visualization
- outputs:
Image Slicer, OpenAI, LMM For Classification, Detections List Roll-Up, SAM 3, Llama 3.2 Vision, Mask Edge Snap, Heatmap Visualization, Keypoint Detection Model, Halo Visualization, Stitch OCR Detections, OpenRouter, Color Visualization, Cache Set, Text Display, Line Counter Visualization, VLM As Classifier, Line Counter, Image Slicer, ByteTrack Tracker, Ellipse Visualization, Roboflow Asset Library Attributes, MoonshotAI Kimi, SORT Tracker, Image Blur, Byte Tracker, Detections Classes Replacement, Google Gemma API, Halo Visualization, Morphological Transformation, Perspective Correction, Trace Visualization, Anthropic Claude, Dot Visualization, Crop Visualization, VLM As Classifier, Time in Zone, SAM2 Video Tracker, Classification Label Visualization, Path Deviation, Stability AI Outpainting, Pixel Color Count, SIFT Comparison, Google Gemini, Absolute Static Crop, Grid Visualization, Dynamic Zone, Instance Segmentation Model, Image Preprocessing, Detection Offset, Clip Comparison, Time in Zone, Stability AI Inpainting, BoT-SORT Tracker, Mask Visualization, VLM As Detector, Google Gemini, Label Visualization, Instance Segmentation Model, Slack Notification, Size Measurement, YOLO-World Model, Anthropic Claude, Bounding Box Visualization, Florence-2 Model, Identify Outliers, OC-SORT Tracker, PTZ Tracking (ONVIF), Qwen 3.6 API, Roboflow Dataset Upload, SAM 3, Morphological Transformation, Google Gemma, Email Notification, Corner Visualization, Motion Detection, VLM As Detector, Time in Zone, Byte Tracker, Instance Segmentation Model, Reference Path Visualization, Anthropic Claude, Pixelate Visualization, Email Notification, Detections Stabilizer, Twilio SMS Notification, Dominant Color, Florence-2 Model, Identify Changes, Twilio SMS/MMS Notification, Seg Preview, QR Code Generator, Qwen-VL, Image Stack, SAM 3, Clip Comparison, OpenAI, Detections Consensus, Roboflow Dataset Upload, Llama 3.2 Vision, Qwen 3.5 API, Polygon Visualization, Google Gemini, Background Subtraction, Circle Visualization, Triangle Visualization, Object Detection Model, Image Threshold, Keypoint Detection Model, Object Detection Model, MoonshotAI Kimi, Line Counter, Buffer, OpenAI, Roboflow Vision Events, SIFT Comparison, Path Deviation, Instance Segmentation Model, Webhook Sink, Object Detection Model, Polygon Zone Visualization, Stitch Images, Icon Visualization, Keypoint Visualization, Polygon Visualization, Image Contours, Byte Tracker, Stitch OCR Detections, Blur Visualization, Keypoint Detection Model
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Image Stack in version v1 has.
Bindings
- input
    - `image` (image): Video frame to add to the stack.
    - `stack_size` (integer): Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted.
    - `resolution_width` (integer): Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio.
    - `resolution_height` (integer): Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio.
    - `clear` (boolean): When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.
- output
    - `frames` (list_of_values): List of values of any type.
    - `frames_count` (integer): Integer value.
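Because `frames` holds JPEG byte blobs ordered newest first, a downstream consumer will often want them in chronological order and in a text-safe encoding. The sketch below assumes the blobs are available in-process as raw `bytes`; how they are serialized over HTTP is not covered here, and the helper name `frames_to_data_urls` is hypothetical.

```python
import base64
from typing import List


def frames_to_data_urls(frames_newest_first: List[bytes]) -> List[str]:
    """Convert JPEG blobs (newest first) to data URLs in chronological order."""
    chronological = list(reversed(frames_newest_first))
    return [
        "data:image/jpeg;base64," + base64.b64encode(blob).decode("ascii")
        for blob in chronological
    ]


# Placeholder byte strings standing in for real JPEG data.
example_frames = [b"\xff\xd8newest", b"\xff\xd8middle", b"\xff\xd8oldest"]
urls = frames_to_data_urls(example_frames)
print(len(urls), urls[0][:23])
```

Data URLs are only one option; many multi-image model APIs accept base64 strings or raw bytes directly, so the encoding step should be adapted to whatever the consumer expects.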
Example JSON definition of step Image Stack in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": false
}
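For context, the step can be embedded in a complete workflow definition. The following is a sketch under stated assumptions: the surrounding envelope (`version`, the `WorkflowImage` and `WorkflowParameter` input types, and `JsonField` outputs) follows the general workflow definition format and should be checked against the version of inference you run; the step name `frame_stack` and the input names are illustrative. Binding `stack_size` and `clear` to `$inputs.*` selectors shows how the properties marked ✅ in the Refs column can be supplied at runtime.

```python
import json

# Assumed workflow envelope; only the step body comes from the example above.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "stack_size", "default_value": 5},
        {"type": "WorkflowParameter", "name": "clear", "default_value": False},
    ],
    "steps": [
        {
            "name": "frame_stack",  # illustrative step name
            "type": "roboflow_core/image_stack@v1",
            "image": "$inputs.image",
            "stack_size": "$inputs.stack_size",  # dynamic selector
            "resolution_width": 640,
            "resolution_height": 480,
            "clear": "$inputs.clear",  # dynamic selector
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "frames",
            "selector": "$steps.frame_stack.frames",
        },
        {
            "type": "JsonField",
            "name": "frames_count",
            "selector": "$steps.frame_stack.frames_count",
        },
    ],
}

print(json.dumps(workflow_definition, indent=2))
```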