Image Stack

Class: ImageStackBlockV1

Source: inference.core.workflows.core_steps.fusion.image_stack.v1.ImageStackBlockV1

Accumulate compressed video frames into a fixed-size stack, returning the most recent N frames as JPEG-encoded binary blobs. Designed for shared-hosting safety: frames are always JPEG-compressed and downsampled to fit within resolution limits, preventing out-of-memory conditions.

How This Block Works

  1. Receives a video frame (WorkflowImageData) each workflow cycle.
  2. Downsamples the frame if it exceeds the configured resolution limits (default 1920x1080), preserving aspect ratio.
  3. JPEG-encodes the frame at quality 75 and stores the resulting bytes.
  4. Maintains a per-camera FIFO buffer (deque) of up to stack_size compressed frames. When the buffer is full the oldest frame is automatically evicted.
  5. If stack_size changes between calls (e.g. via a dynamic selector), the buffer is resized and existing frames are preserved up to the new limit.
  6. If the clear input is True the buffer is flushed before the current frame is added.
  7. Outputs the list of JPEG byte blobs (newest first) and the current frame count.
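
The buffering steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the block's actual source: `fit_within` and `FrameStack` are hypothetical names, frames are treated as opaque bytes (the JPEG encoding in step 3 is left out so the sketch needs only the standard library), and buffers are keyed by a made-up `camera_id` string.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def fit_within(width: int, height: int,
               max_w: int = 1920, max_h: int = 1080) -> Tuple[int, int]:
    """Return (width, height) scaled down to fit the limits, keeping aspect ratio."""
    scale = min(max_w / width, max_h / height, 1.0)  # 1.0 cap: never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


class FrameStack:
    """Per-camera FIFO buffers of already-compressed frames (hypothetical helper)."""

    def __init__(self) -> None:
        self._buffers: Dict[str, Deque[bytes]] = {}

    def push(self, camera_id: str, jpeg_bytes: bytes,
             stack_size: int, clear: bool = False) -> Tuple[List[bytes], int]:
        buf = self._buffers.get(camera_id)
        if buf is None or buf.maxlen != stack_size:
            # Rebuilding with a new maxlen keeps the right-most (newest) items.
            buf = deque(buf or (), maxlen=stack_size)
            self._buffers[camera_id] = buf
        if clear:
            buf.clear()  # flush before the current frame is added
        buf.append(jpeg_bytes)  # a full deque evicts its oldest item
        return list(reversed(buf)), len(buf)  # newest first, plus the count
```

A bounded `deque` gives the eviction in step 4 for free, and rebuilding it when `stack_size` changes mirrors step 5: shrinking drops the oldest frames, growing keeps everything already stored.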

Common Use Cases

  • Action / activity recognition: accumulate a clip of N frames and pass them to a vision-language model (e.g. Google Gemini, Qwen) that can reason over multiple images to classify actions, detect events, or describe what is happening in a scene.
  • Time-lapse snapshots: collect the last N frames for periodic visual comparison.
  • Event buffering: keep a rolling window of frames around an event of interest.

Type identifier

Use the following identifier in the step "type" field: roboflow_core/image_stack@v1 to add the block as a step in your workflow.

Properties

Name Type Description Refs
name str Enter a unique identifier for this step.
stack_size int Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted.
resolution_width int Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio.
resolution_height int Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio.
clear bool When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.

The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.

Available Connections

Compatible Blocks

Check what blocks you can connect to Image Stack in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds Image Stack in version v1 has.

Bindings
  • input

    • image (image): Video frame to add to the stack.
    • stack_size (integer): Maximum number of frames to keep in the stack (1-64). When the stack is full the oldest frame is evicted.
    • resolution_width (integer): Maximum frame width in pixels (64-1920). Frames wider than this are downsampled preserving aspect ratio.
    • resolution_height (integer): Maximum frame height in pixels (64-1080). Frames taller than this are downsampled preserving aspect ratio.
    • clear (boolean): When True the entire frame buffer is flushed before the current frame is added. Useful for resetting state on scene changes.
  • output

Example JSON definition of step Image Stack in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": false
}
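
Before submitting a step definition like the one above, it can help to check literal values against the ranges listed under Properties. The sketch below is a hypothetical helper, not part of the inference package: `check_image_stack_step` and `LIMITS` are made-up names, values beginning with `$` are assumed to be selectors resolved at runtime, and omitted keys are skipped because this page does not say which properties are optional.

```python
from typing import Any, Dict, List

# Ranges copied from the Properties section of this page.
LIMITS = {
    "stack_size": (1, 64),
    "resolution_width": (64, 1920),
    "resolution_height": (64, 1080),
}


def check_image_stack_step(step: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the literals look fine."""
    problems: List[str] = []
    if step.get("type") != "roboflow_core/image_stack@v1":
        problems.append(f"unexpected type: {step.get('type')!r}")
    for key, (low, high) in LIMITS.items():
        value = step.get(key)
        if value is None:
            continue  # omitted: nothing to check in this sketch
        if isinstance(value, str) and value.startswith("$"):
            continue  # assumed selector, resolved at workflow runtime
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{key} must be an integer, got {value!r}")
        elif not low <= value <= high:
            problems.append(f"{key}={value} is outside {low}-{high}")
    return problems


step = {
    "name": "frame_stack",
    "type": "roboflow_core/image_stack@v1",
    "image": "$inputs.image",
    "stack_size": 5,
    "resolution_width": 640,
    "resolution_height": 480,
    "clear": False,
}
problems = check_image_stack_step(step)
```

For the example definition, `problems` is an empty list. Changing `stack_size` to 100 would produce one entry describing the out-of-range value.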