Image Stack

Class: ImageStackBlockV1

Source: inference.core.workflows.core_steps.fusion.image_stack.v1.ImageStackBlockV1

Accumulate compressed video frames into a fixed-size stack, returning the most recent N frames as JPEG-encoded binary blobs. Designed for shared-hosting safety: frames are always JPEG-compressed and downsampled to fit within resolution limits, preventing out-of-memory conditions.

How This Block Works

Receives a video frame (WorkflowImageData) each workflow cycle.
Downsamples the frame if it exceeds the configured resolution limits (default 1920x1080), preserving aspect ratio.
JPEG-encodes the frame at quality 75 and stores the resulting bytes.
Maintains a per-camera FIFO buffer (deque) of up to stack_size compressed frames. When the buffer is full, the oldest frame is automatically evicted.
If stack_size changes between calls (e.g. via a dynamic selector), the buffer is resized and existing frames are preserved up to the new limit.
If the clear input is True, the buffer is flushed before the current frame is added.
Outputs the list of JPEG byte blobs (newest first) and the current frame count.
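
The buffering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the block's actual source: the `FrameStack` class and its `push` method are hypothetical names, frames are assumed to arrive already JPEG-encoded, and on a shrinking resize the sketch assumes the newest frames are the ones kept.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class FrameStack:
    """Hypothetical stand-in for the block's buffer, for illustration only."""

    def __init__(self) -> None:
        # One FIFO buffer per video source, keyed by its identifier.
        self._buffers: Dict[str, Deque[bytes]] = {}

    def push(
        self,
        video_id: str,
        jpeg_bytes: bytes,
        stack_size: int,
        clear: bool = False,
    ) -> Tuple[List[bytes], int]:
        buffer = self._buffers.get(video_id)
        if buffer is None:
            buffer = deque(maxlen=stack_size)
        elif buffer.maxlen != stack_size:
            # Rebuilding with a new maxlen keeps the most recent frames
            # (assumption: the source does not say which frames survive).
            buffer = deque(buffer, maxlen=stack_size)
        self._buffers[video_id] = buffer

        if clear:
            buffer.clear()  # flush before adding the current frame

        buffer.append(jpeg_bytes)  # a full deque evicts its oldest entry
        frames = list(reversed(buffer))  # newest first
        return frames, len(frames)


stack = FrameStack()
for i in range(4):
    frames, count = stack.push("cam-0", f"frame-{i}".encode(), stack_size=3)
print(count, frames)  # 3 [b'frame-3', b'frame-2', b'frame-1']
```

Using `deque(maxlen=...)` leaves eviction to the standard library, so the only explicit logic is the resize and the optional flush.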

Common Use Cases

Action / activity recognition: accumulate a clip of N frames and pass them to a vision-language model (e.g. Google Gemini, Qwen) that can reason over multiple images to classify actions, detect events, or describe what is happening in a scene.
Time-lapse snapshots: collect the last N frames for periodic visual comparison.
Event buffering: keep a rolling window of frames around an event of interest.

Type identifier

Use the following identifier in the step "type" field: roboflow_core/image_stack@v1 to add the block as a step in your workflow.

Properties

Name	Type	Description	Refs
`name`	`str`	Enter a unique identifier for this step.	❌
`stack_size`	`int`	Maximum number of frames to keep in the stack (1-64). When the stack is full, the oldest frame is evicted.	✅
`resolution_width`	`int`	Maximum frame width in pixels (64-1920). Frames wider than this are downsampled, preserving aspect ratio.	✅
`resolution_height`	`int`	Maximum frame height in pixels (64-1080). Frames taller than this are downsampled, preserving aspect ratio.	✅
`clear`	`bool`	When True, the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.	✅

The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
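
To make the effect of `resolution_width` and `resolution_height` concrete, the sketch below computes the size a frame would be reduced to when it exceeds either limit. The helper name `fit_within` is hypothetical, and the rounding is an assumption; the block's own rounding and interpolation method are not stated here.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, max_width: int, max_height: int
) -> Tuple[int, int]:
    """Return a size that fits both limits while keeping the aspect ratio."""
    # Use the tighter of the two ratios; cap at 1.0 so frames are never upscaled.
    scale = min(max_width / width, max_height / height, 1.0)
    if scale == 1.0:
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(3840, 2160, 1920, 1080))  # (1920, 1080)
print(fit_within(640, 480, 1920, 1080))    # (640, 480), already within limits
print(fit_within(1920, 1080, 640, 480))    # (640, 360)
```

In the last call the width limit is the binding one, so the height ends up below `resolution_height` rather than being stretched to meet it.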

Runtime compatibility

soft — runtime hosted_serverless, dedicated_deployment; execution remote; input video: Frame stack is stored in process memory per video_identifier. With remote step execution on stateless or multi-replica HTTP runtimes, successive frames may be served by different worker processes, so the stack resets or contains only a partial frame history. Use local step execution in an InferencePipeline for stable cross-frame results.
soft — input image: Block depends on temporal context from video or repeated-frame workflows. With a still image/photo, there is no meaningful history to track, compare, aggregate, or visualize, so the block provides little or no benefit.

Available Connections

Compatible Blocks

Check what blocks you can connect to Image Stack in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds Image Stack in version v1 has.

Bindings

input
- image (image): Video frame to add to the stack.
- stack_size (integer): Maximum number of frames to keep in the stack (1-64). When the stack is full, the oldest frame is evicted.
- resolution_width (integer): Maximum frame width in pixels (64-1920). Frames wider than this are downsampled, preserving aspect ratio.
- resolution_height (integer): Maximum frame height in pixels (64-1080). Frames taller than this are downsampled, preserving aspect ratio.
- clear (boolean): When True, the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.
output
- frames (list_of_values): List of values of any type.
- frames_count (integer): Integer value.
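
Because `frames` holds raw JPEG bytes, downstream code usually has to re-encode them before sending them to a text-based API. The sketch below shows one possible way to turn the list into base64 data URIs; the helper `frames_to_data_uris` is a hypothetical name and not part of the block, and whether a given model accepts data URIs is an assumption to check against that model's documentation.

```python
import base64
from typing import List

JPEG_SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG stream


def frames_to_data_uris(frames: List[bytes]) -> List[str]:
    """Encode JPEG blobs as data URIs, preserving the input order."""
    uris = []
    for blob in frames:
        if not blob.startswith(JPEG_SOI):
            raise ValueError("expected JPEG-encoded bytes")
        encoded = base64.b64encode(blob).decode("ascii")
        uris.append(f"data:image/jpeg;base64,{encoded}")
    return uris


# A 4-byte placeholder (start and end markers only), not a displayable image.
placeholder = b"\xff\xd8\xff\xd9"
print(frames_to_data_uris([placeholder]))  # ['data:image/jpeg;base64,/9j/2Q==']
```

Since the block returns frames newest first, reverse the list beforehand if the consumer expects chronological order.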

Example JSON definition of step Image Stack in version v1

{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": false
}
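
For context, the step above could sit inside a complete workflow definition along the following lines. This is a sketch under assumptions: the step name `frame_stack` and the output names are hypothetical, and the surrounding `version`, `inputs`, and `outputs` fields follow the general Workflows definition layout rather than anything stated on this page.

```python
import json

# Hypothetical step name, chosen only for this illustration.
STEP_NAME = "frame_stack"

workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": STEP_NAME,
            "type": "roboflow_core/image_stack@v1",
            "image": "$inputs.image",
            "stack_size": 5,
            "resolution_width": 640,
            "resolution_height": 480,
            "clear": False,
        }
    ],
    # Expose both block outputs under hypothetical output names.
    "outputs": [
        {
            "type": "JsonField",
            "name": "frames",
            "selector": f"$steps.{STEP_NAME}.frames",
        },
        {
            "type": "JsonField",
            "name": "frames_count",
            "selector": f"$steps.{STEP_NAME}.frames_count",
        },
    ],
}

print(json.dumps(workflow_definition, indent=4))
```

Building the definition as a Python dict and serialising it with `json.dumps` keeps the step name in one place, so the output selectors cannot drift out of sync with the step.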