Image Stack
Class: ImageStackBlockV1
Source: inference.core.workflows.core_steps.fusion.image_stack.v1.ImageStackBlockV1
Accumulate compressed video frames into a fixed-size stack, returning the most recent N frames as JPEG-encoded binary blobs. Designed for shared-hosting safety: frames are always JPEG-compressed and downsampled to fit within resolution limits, preventing out-of-memory conditions.
How This Block Works
- Receives a video frame (WorkflowImageData) each workflow cycle.
- Downsamples the frame if it exceeds the configured resolution limits (default 1920x1080), preserving aspect ratio.
- JPEG-encodes the frame at quality 75 and stores the resulting bytes.
- Maintains a per-camera FIFO buffer (deque) of up to `stack_size` compressed frames. When the buffer is full the oldest frame is automatically evicted.
- If `stack_size` changes between calls (e.g. via a dynamic selector), the buffer is resized and existing frames are preserved up to the new limit.
- If the `clear` input is True the buffer is flushed before the current frame is added.
- Outputs the list of JPEG byte blobs (newest first) and the current frame count.
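The buffering steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the block's actual implementation: the `FrameStack` class and its `push` method are hypothetical names, and the sketch accepts already-encoded bytes where the real block downsamples and JPEG-encodes each image first.

```python
from collections import deque
from typing import Deque, Dict, List


class FrameStack:
    """Toy per-camera FIFO buffer of compressed frames (hypothetical helper)."""

    def __init__(self) -> None:
        # One deque per video identifier, so cameras do not share history.
        self._buffers: Dict[str, Deque[bytes]] = {}

    def push(self, video_id: str, jpeg_bytes: bytes, stack_size: int,
             clear: bool = False) -> List[bytes]:
        buffer = self._buffers.get(video_id)
        if buffer is None or buffer.maxlen != stack_size:
            # Rebuilding with a new maxlen keeps the most recent frames
            # up to the new limit.
            buffer = deque(buffer or (), maxlen=stack_size)
            self._buffers[video_id] = buffer
        if clear:
            # Flush the history before the current frame is added.
            buffer.clear()
        buffer.append(jpeg_bytes)  # a full deque evicts its oldest item
        return list(reversed(buffer))  # newest first


stack = FrameStack()
for i in range(4):
    frames = stack.push("cam-0", f"frame-{i}".encode(), stack_size=3)
print(frames)  # [b'frame-3', b'frame-2', b'frame-1']

# Shrinking stack_size keeps only the most recent frames.
frames = stack.push("cam-0", b"frame-4", stack_size=2)
print(frames)  # [b'frame-4', b'frame-3']
```

A `deque` with `maxlen` gives the eviction behaviour for free, which is why the sketch rebuilds the deque only when the requested size changes.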
Common Use Cases
- Action / activity recognition: accumulate a clip of N frames and pass them to a vision-language model (e.g. Google Gemini, Qwen) that can reason over multiple images to classify actions, detect events, or describe what is happening in a scene.
- Time-lapse snapshots: collect the last N frames for periodic visual comparison.
- Event buffering: keep a rolling window of frames around an event of interest.
Type identifier
Use the following identifier in the step "type" field: `roboflow_core/image_stack@v1` to add the block as a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `stack_size` | `int` | Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted. | ✅ |
| `resolution_width` | `int` | Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio. | ✅ |
| `resolution_height` | `int` | Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio. | ✅ |
| `clear` | `bool` | When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
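The effect of the `resolution_width` and `resolution_height` limits can be illustrated with a small calculation. This is a sketch only: the `fit_within` helper is hypothetical, and the choice to round to the nearest pixel is an assumption, since the block's exact rounding is not documented here.

```python
from typing import Tuple


def fit_within(width: int, height: int,
               max_width: int = 1920, max_height: int = 1080) -> Tuple[int, int]:
    """Return dimensions that fit both limits while keeping aspect ratio."""
    # Never upscale: the scale factor is capped at 1.0.
    scale = min(1.0, max_width / width, max_height / height)
    # Rounding to the nearest pixel is an assumption of this sketch.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height


print(fit_within(3840, 2160))  # (1920, 1080)
print(fit_within(1280, 720))   # (1280, 720), already within the limits
print(fit_within(4000, 1000, max_width=640, max_height=480))  # (640, 160)
```

Taking the smaller of the two ratios ensures that a single scale factor satisfies both limits at once.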
Runtime compatibility
- `soft`: runtime `hosted_serverless`, `dedicated_deployment`; execution `remote`; input `video`
  - Frame stack is stored in process memory per video_identifier. With remote step execution on stateless or multi-replica HTTP runtimes, successive frames may be served by different worker processes, so the stack resets or contains only a partial frame history. Use local step execution in an InferencePipeline for stable cross-frame results.
- `soft`: input `image`
  - Block depends on temporal context from video or repeated-frame workflows. With a still image/photo, there is no meaningful history to track, compare, aggregate, or visualize, so the block provides little or no benefit.
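The partial-history problem on multi-replica runtimes can be reproduced with a toy simulation. Everything here is illustrative: the two-worker setup and the round-robin routing are assumptions standing in for whatever load balancing a real deployment uses.

```python
from collections import deque
from itertools import cycle

STACK_SIZE = 4
frames = [f"frame-{i}" for i in range(8)]

# Local execution: a single process sees every frame.
local_stack = deque(maxlen=STACK_SIZE)
for frame in frames:
    local_stack.append(frame)

# Remote execution on two replicas: each worker keeps its own in-memory
# stack, and requests alternate between them (round-robin is assumed).
workers = [deque(maxlen=STACK_SIZE), deque(maxlen=STACK_SIZE)]
for frame, worker in zip(frames, cycle(workers)):
    worker.append(frame)

print(list(local_stack))  # ['frame-4', 'frame-5', 'frame-6', 'frame-7']
print(list(workers[0]))   # ['frame-0', 'frame-2', 'frame-4', 'frame-6']
print(list(workers[1]))   # ['frame-1', 'frame-3', 'frame-5', 'frame-7']
```

Each replica ends up with every other frame, so neither holds the contiguous clip that the local stack does.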
Available Connections
Compatible Blocks
Check what blocks you can connect to Image Stack in version v1.
- inputs: Stability AI Outpainting, Camera Focus, Halo Visualization, Event Writer, Email Notification, QR Code Generator, Ellipse Visualization, Local File Sink, Camera Focus, Classification Label Visualization, Bounding Box Visualization, Image Contours, Image Preprocessing, Background Subtraction, Pixelate Visualization, Color Visualization, Crop Visualization, Mask Visualization, Image Slicer, Template Matching, Depth Estimation, Text Display, Roboflow Vision Events, JSON Parser, Relative Static Crop, Distance Measurement, Icon Visualization, Motion Detection, Blur Visualization, VLM As Classifier, SIFT Comparison, Grid Visualization, VLM As Detector, Stability AI Inpainting, Roboflow Dataset Upload, Contrast Enhancement, Model Monitoring Inference Aggregator, Image Threshold, Image Convert Grayscale, Trace Visualization, Circle Visualization, Label Visualization, Pixel Color Count, Morphological Transformation, Morphological Transformation, Polygon Zone Visualization, Image Blur, Keypoint Visualization, Identify Changes, MQTT Writer, VLM As Classifier, Dynamic Crop, Camera Calibration, Polygon Visualization, VLM As Detector, PTZ Tracking (ONVIF), SIFT Comparison, Twilio SMS Notification, Stability AI Image Generation, Perspective Correction, Absolute Static Crop, Roboflow Dataset Upload, Stitch Images, Twilio SMS/MMS Notification, Line Counter, Contrast Equalization, Image Stack, Triangle Visualization, Roboflow Asset Library Attributes, Background Color Visualization, Detections Consensus, Corner Visualization, Model Comparison Visualization, Identify Outliers, Dot Visualization, Line Counter Visualization, Dynamic Zone, Reference Path Visualization, Polygon Visualization, Halo Visualization, Roboflow Custom Metadata, Line Counter, Email Notification, Microsoft SQL Server Sink, Slack Notification, S3 Sink, Detection Event Log, Webhook Sink, SIFT, Heatmap Visualization, OPC UA Writer Sink, Image Slicer
- outputs: Google Gemma API, Event Writer, Google Gemini, Ellipse Visualization, SAM 3, Image Preprocessing, Background Subtraction, Image Contours, YOLO-World Model, Mask Visualization, Crop Visualization, Color Visualization, Seg Preview, SAM2 Video Tracker, Cache Set, SIFT Comparison, Object Detection Model, Google Gemini, Instance Segmentation Model, Llama 3.2 Vision, VLM As Detector, SAM 3, Google Gemma, Google Gemini, Keypoint Detection Model, Image Blur, Keypoint Visualization, Identify Changes, Buffer, MQTT Writer, Anthropic Claude, OC-SORT Tracker, VLM As Detector, PTZ Tracking (ONVIF), SIFT Comparison, Twilio SMS Notification, LMM For Classification, Qwen 3.6 API, Perspective Correction, Time in Zone, SAM 3, Detections Classes Replacement, Detection Offset, Mask Edge Snap, Clip Comparison, SORT Tracker, Qwen-VL, Image Stack, PLC EthernetIP, Identify Outliers, Dot Visualization, OpenAI, Dynamic Zone, Reference Path Visualization, Polygon Visualization, Size Measurement, Detections Stabilizer, Dominant Color, Path Deviation, Webhook Sink, Instance Segmentation Model, Image Slicer, Stability AI Outpainting, Halo Visualization, Email Notification, QR Code Generator, ByteTrack Tracker, Classification Label Visualization, Bounding Box Visualization, MoonshotAI Kimi, Object Detection Model, Pixelate Visualization, Path Deviation, Image Slicer, Text Display, Roboflow Vision Events, Byte Tracker, Object Detection Model, Time in Zone, Icon Visualization, Motion Detection, Blur Visualization, VLM As Classifier, Anthropic Claude, Grid Visualization, OpenRouter, MoonshotAI Kimi, Anthropic Claude, Detections List Roll-Up, Stability AI Inpainting, Roboflow Dataset Upload, Image Threshold, Trace Visualization, Circle Visualization, Instance Segmentation Model, Label Visualization, Pixel Color Count, Morphological Transformation, Morphological Transformation, Polygon Zone Visualization, Keypoint Detection Model, VLM As Classifier, Polygon Visualization, Florence-2 Model, Qwen 3.5 API, Byte Tracker, Absolute Static Crop, Roboflow Dataset Upload, Stitch Images, Florence-2 Model, Twilio SMS/MMS Notification, Line Counter, Time in Zone, Clip Comparison, Byte Tracker, Triangle Visualization, Roboflow Asset Library Attributes, Detections Consensus, Keypoint Detection Model, Corner Visualization, Stitch OCR Detections, Line Counter Visualization, BoT-SORT Tracker, Stitch OCR Detections, Halo Visualization, Line Counter, OpenAI, Email Notification, Slack Notification, OpenAI, Llama 3.2 Vision, Heatmap Visualization, Instance Segmentation Model, OPC UA Writer Sink
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds Image Stack in version v1 has.
Bindings
- input
  - `image` (image): Video frame to add to the stack.
  - `stack_size` (integer): Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted.
  - `resolution_width` (integer): Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio.
  - `resolution_height` (integer): Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio.
  - `clear` (boolean): When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.
- output
  - `frames` (list_of_values): List of values of any type.
  - `frames_count` (integer): Integer value.
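A downstream consumer of the `frames` output might need the blobs in chronological order and in a text-safe encoding. The sketch below assumes `frames` arrives as a Python list of raw JPEG `bytes`, newest first; the `frames_to_data_urls` helper is hypothetical, and base64 data URLs are only one possible target format.

```python
import base64
from typing import List

JPEG_MAGIC = b"\xff\xd8"  # every JPEG stream starts with the SOI marker


def frames_to_data_urls(frames: List[bytes]) -> List[str]:
    """Convert newest-first JPEG blobs to oldest-first base64 data URLs."""
    urls = []
    for blob in reversed(frames):  # restore chronological order
        if not blob.startswith(JPEG_MAGIC):
            raise ValueError("expected JPEG-encoded bytes")
        encoded = base64.b64encode(blob).decode("ascii")
        urls.append(f"data:image/jpeg;base64,{encoded}")
    return urls


# Placeholder blobs: only the two-byte JPEG header is real.
fake_frames = [JPEG_MAGIC + b"newest", JPEG_MAGIC + b"oldest"]
urls = frames_to_data_urls(fake_frames)
print(len(urls))  # 2
```

Checking the two-byte header is a cheap sanity test that the list really holds JPEG data before it is forwarded elsewhere.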
Example JSON definition of step Image Stack in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": false
}
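For context, the step above could sit inside a complete workflow definition along the following lines. This is a sketch under assumptions: the step name `frame_stack` is made up, and the surrounding `version` / `inputs` / `outputs` envelope follows the general Workflows definition layout, so it should be checked against the Workflows documentation. It also shows `stack_size` bound to a runtime parameter, which the Refs column marks as allowed.

```python
import json

workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # Runtime parameter so stack_size can change between calls.
        {"type": "WorkflowParameter", "name": "stack_size", "default_value": 5},
    ],
    "steps": [
        {
            "name": "frame_stack",  # hypothetical step name
            "type": "roboflow_core/image_stack@v1",
            "image": "$inputs.image",
            "stack_size": "$inputs.stack_size",
            "resolution_width": 640,
            "resolution_height": 480,
            "clear": False,
        }
    ],
    "outputs": [
        {"type": "JsonField", "name": "frames",
         "selector": "$steps.frame_stack.frames"},
        {"type": "JsonField", "name": "frames_count",
         "selector": "$steps.frame_stack.frames_count"},
    ],
}

# Serialise to JSON, e.g. for saving alongside other workflow definitions.
print(json.dumps(workflow_definition, indent=2))
```

The output selectors reference the two outputs listed in the bindings, `frames` and `frames_count`, through the step name.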