Model Monitoring Inference Aggregator
Class: ModelMonitoringInferenceAggregatorBlockV1
Periodically aggregates and reports a curated sample of inference predictions to Roboflow Model Monitoring. The block collects predictions in memory, groups them by class, selects the most confident prediction per class, and sends the aggregated results at a configurable interval. This supports video processing monitoring, production analytics, and model performance tracking workflows with minimal performance overhead.
How This Block Works
This block aggregates predictions over time and sends representative samples to Roboflow Model Monitoring at regular intervals, reducing API calls and maintaining video processing performance. The block:
- Receives predictions and configuration:
  - Takes predictions from any supported model type (object detection, instance segmentation, keypoint detection, or classification)
  - Receives model ID for identification in Model Monitoring
  - Accepts frequency parameter specifying reporting interval in seconds
  - Receives execution mode flag (fire-and-forget)
- Validates Roboflow API key:
  - Checks that a valid Roboflow API key is available (required for API access)
  - Raises an error if API key is missing with instructions on how to retrieve one
- Collects predictions in memory:
  - Stores predictions in an in-memory aggregator organized by model ID
  - Accumulates predictions between reporting intervals
  - Maintains state for the duration of the workflow execution session
- Checks reporting interval:
  - Uses cache to track last report time based on unique aggregator key
  - Calculates time elapsed since last report
  - Compares elapsed time to configured frequency threshold
  - Skips reporting if interval has not been reached (returns status message)
- Consolidates predictions when reporting:
  - Formats all collected predictions for Model Monitoring
  - Groups predictions by class name across all collected data
  - For each class, sorts predictions by confidence (highest first)
  - Selects the most confident prediction per class as representative sample
  - Creates a curated set of predictions (one per class with highest confidence)
- Retrieves workspace information:
  - Gets workspace ID from Roboflow API using the provided API key
  - Uses caching (15-minute expiration) to avoid repeated API calls
  - Caches workspace name using MD5 hash of API key as cache key
- Sends aggregated data to Model Monitoring:
  - Constructs inference data payload with timestamp, source info, device ID, and server version
  - Includes system information (if available) for monitoring context
  - Sends aggregated predictions (one per class) to Roboflow Model Monitoring API
  - Flushes in-memory aggregator after sending (starts fresh collection)
  - Updates last report time in cache
- Executes synchronously or asynchronously:
  - Asynchronous mode (fire_and_forget=True): Submits task to background thread pool or FastAPI background tasks, allowing workflow to continue without waiting for API call to complete
  - Synchronous mode (fire_and_forget=False): Waits for API call to complete and returns immediate status, useful for debugging and error handling
- Returns status information:
  - Outputs error_status indicating success (False) or failure (True)
  - Outputs message with reporting status or error details
  - Provides feedback on whether aggregation was sent or skipped
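The consolidation step in the list above (group by class, sort by confidence, keep the top prediction per class) can be sketched as follows. This is a minimal illustration, not the block's actual implementation: the `Prediction` record and the `select_most_confident_per_class` function are hypothetical names, and real predictions carry more fields than the two used here.

```python
from collections import defaultdict
from typing import Dict, List, TypedDict


class Prediction(TypedDict):
    # Hypothetical, simplified prediction record used only for this sketch.
    class_name: str
    confidence: float


def select_most_confident_per_class(predictions: List[Prediction]) -> List[Prediction]:
    """Group predictions by class and keep the highest-confidence one per class."""
    by_class: Dict[str, List[Prediction]] = defaultdict(list)
    for prediction in predictions:
        by_class[prediction["class_name"]].append(prediction)

    representatives: List[Prediction] = []
    for class_predictions in by_class.values():
        # Sort by confidence, highest first, then keep the first element.
        class_predictions.sort(key=lambda p: p["confidence"], reverse=True)
        representatives.append(class_predictions[0])
    return representatives


# Predictions accumulated over several frames (made-up values).
collected: List[Prediction] = [
    {"class_name": "car", "confidence": 0.71},
    {"class_name": "person", "confidence": 0.93},
    {"class_name": "car", "confidence": 0.88},
    {"class_name": "person", "confidence": 0.52},
]
curated = select_most_confident_per_class(collected)
```

Here four collected predictions collapse to two, one per class, which is what keeps each report small regardless of how many frames were processed in the interval.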
The block is optimized for video processing workflows where sending every prediction would create excessive API calls and impact performance. By aggregating predictions and selecting representative samples (most confident per class), the block provides meaningful monitoring data while minimizing overhead. The interval-based reporting ensures regular updates to Model Monitoring without constant API calls.
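The interval-based reporting described above can be sketched as a small buffer that releases its contents at most once per interval. This is an illustration under assumptions: `IntervalAggregator` is a hypothetical class, the last report time is kept as an attribute rather than in a cache, and the first interval is assumed to start when the aggregator is created. The clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, List, Optional


class IntervalAggregator:
    """Collects items in memory and releases them at most once per interval."""

    def __init__(self, frequency_seconds: int, clock: Callable[[], float] = time.monotonic):
        if frequency_seconds < 1:
            raise ValueError("frequency must be at least 1 second")
        self._frequency = frequency_seconds
        self._clock = clock
        # Assumption for this sketch: the first interval starts at creation time.
        self._last_report_time = clock()
        self._buffer: List[dict] = []

    def collect(self, prediction: dict) -> None:
        self._buffer.append(prediction)

    def maybe_flush(self) -> Optional[List[dict]]:
        """Return the buffered items if the interval has elapsed, otherwise None."""
        now = self._clock()
        if now - self._last_report_time < self._frequency:
            return None  # interval not reached: skip reporting
        batch, self._buffer = self._buffer, []  # flush and start a fresh collection
        self._last_report_time = now
        return batch


# Usage with a fake clock so the example does not need to sleep.
fake_now = [0.0]
aggregator = IntervalAggregator(frequency_seconds=5, clock=lambda: fake_now[0])
aggregator.collect({"class_name": "car", "confidence": 0.9})

fake_now[0] = 3.0
skipped = aggregator.maybe_flush()  # 3 s elapsed, below the 5 s threshold

fake_now[0] = 6.0
batch = aggregator.maybe_flush()  # 6 s elapsed, the buffer is released
```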
Common Use Cases
Why Use This Block?
This block is designed for projects that rely on video processing in Workflows. Its aggregation process identifies the most confident prediction for each class and sends those predictions to the Roboflow backend at regular intervals in small messages, so that video processing performance is affected as little as possible.
Perfect for:
- Monitoring production line performance in real time.
- Debugging and validating your model's performance over time.
- Providing actionable insights from inference workflows with minimal overhead.
Limitations
- The block should not be relied on when running a Workflow in the inference server or via HTTP requests to the Roboflow hosted platform. The internal state is not persisted in memory that is accessible to all requests to the server, so aggregation is scoped to a single request. We will address this in future releases if it proves to be a serious limitation for clients.
Connecting to Other Blocks
This block receives predictions and outputs status information:
- After model blocks (Object Detection Model, Instance Segmentation Model, Classification Model, Keypoint Detection Model) to aggregate and report predictions to Model Monitoring (e.g., aggregate detection results, report classification outputs, monitor model predictions), enabling model-to-monitoring workflows
- After filtering or analytics blocks (DetectionsFilter, ContinueIf, OverlapFilter) to aggregate filtered or analyzed results for monitoring (e.g., aggregate filtered detections, report analytics results, monitor processed predictions), enabling analysis-to-monitoring workflows
- In video processing workflows to efficiently monitor video analysis with minimal performance impact (e.g., aggregate video frame detections, report video processing results, monitor video analysis performance), enabling video monitoring workflows
- After preprocessing or transformation blocks to monitor transformed predictions (e.g., aggregate transformed detections, report processed results, monitor transformation outputs), enabling transformation-to-monitoring workflows
- In production deployment workflows to track model performance in production environments (e.g., monitor production inference, track deployment performance, report production metrics), enabling production monitoring workflows
- As a sink block to send aggregated monitoring data without blocking workflow execution (e.g., background monitoring reporting, non-blocking analytics, efficient data collection), enabling sink-to-monitoring workflows
Requirements
This block requires a valid Roboflow API key configured in the environment or workflow configuration. The API key is required to authenticate with the Roboflow API and access Model Monitoring features. Visit https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key to learn how to retrieve an API key.
- The block maintains in-memory state for aggregation, so it works best for long-running workflows (like video processing with InferencePipeline).
- The block should not be relied upon when running workflows in the inference server or via HTTP requests to the Roboflow hosted platform, because the internal state is only accessible within a single request and the aggregation scope is limited to that request.
- The block aggregates data for all video feeds connected to a single InferencePipeline process (it cannot separate aggregations per video feed).
- The frequency parameter must be at least 1 second.
For more information on Model Monitoring at Roboflow, see https://docs.roboflow.com/deploy/model-monitoring.
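The API key is also what the workspace lookup is keyed on: as described under How This Block Works, the workspace is cached for 15 minutes under the MD5 hash of the API key. A minimal sketch of that caching pattern follows. `WorkspaceCache` and the `fetch_workspace` callable are hypothetical stand-ins (the real lookup is a call to the Roboflow API), and a plain dictionary replaces whatever cache backend the block actually uses.

```python
import hashlib
import time
from typing import Callable, Dict, List, Tuple

WORKSPACE_CACHE_TTL_SECONDS = 15 * 60  # 15-minute expiration


class WorkspaceCache:
    """Caches workspace lookups under the MD5 hash of the API key."""

    def __init__(self, fetch_workspace: Callable[[str], str],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_workspace = fetch_workspace  # hypothetical API lookup
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, workspace)

    def get(self, api_key: str) -> str:
        # The raw API key is never used as the cache key, only its hash.
        cache_key = hashlib.md5(api_key.encode("utf-8")).hexdigest()
        now = self._clock()
        entry = self._entries.get(cache_key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no API call
        workspace = self._fetch_workspace(api_key)
        self._entries[cache_key] = (now + WORKSPACE_CACHE_TTL_SECONDS, workspace)
        return workspace


# Usage with a fake lookup and a fake clock.
calls: List[str] = []


def fake_fetch(api_key: str) -> str:
    calls.append(api_key)
    return "my-workspace"


fake_now = [0.0]
cache = WorkspaceCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get("secret-key")   # triggers the lookup
second = cache.get("secret-key")  # served from the cache
fake_now[0] = WORKSPACE_CACHE_TTL_SECONDS + 1
third = cache.get("secret-key")   # entry expired, lookup runs again
```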
Type identifier
Use the following identifier in the step "type" field: roboflow_core/model_monitoring_inference_aggregator@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| frequency | int | Reporting frequency in seconds. Specifies how often aggregated predictions are sent to Roboflow Model Monitoring. For example, if set to 5, the block collects predictions for 5 seconds, then sends the aggregated sample (one most confident prediction per class) to Model Monitoring. Must be at least 1 second. Lower values provide more frequent updates but increase API calls. Higher values reduce API calls but provide less frequent updates. Default: 5 seconds. Works well for video processing where you want regular but not excessive reporting. | ✅ |
| unique_aggregator_key | str | Unique key used internally to track the aggregation session and cache last report time. This key must be unique for each instance of this block in your workflow. The key is used to create cache entries that track when the last report was sent, enabling interval-based reporting. This field is automatically generated and hidden in the UI. | ❌ |
| fire_and_forget | bool | Execution mode flag. When True (default), the block runs asynchronously in the background, allowing the workflow to continue processing without waiting for the API call to complete. This provides faster workflow execution but errors are not immediately available. When False, the block runs synchronously and waits for the API call to complete, returning immediate status and error information. Use False for debugging and error handling, True for production workflows where performance is prioritized. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
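The trade-off behind the fire_and_forget property can be sketched with a thread pool. This is an illustration only: `run_report`, `fake_send_report`, and the status messages are hypothetical, and the block may use FastAPI background tasks instead of a thread pool. The returned tuple mirrors the block's error_status and message outputs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple

Status = Tuple[bool, str]  # (error_status, message)


def run_report(
    send_report: Callable[[], Status],
    fire_and_forget: bool,
    executor: Optional[ThreadPoolExecutor] = None,
) -> Status:
    """Run the report in the background or inline, depending on the flag."""
    if fire_and_forget and executor is not None:
        # Asynchronous mode: schedule the call and return at once.
        # Any error raised later is not visible to the caller.
        executor.submit(send_report)
        return False, "Report scheduled in the background"
    # Synchronous mode: wait for the call and return its real status.
    return send_report()


sent: List[str] = []


def fake_send_report() -> Status:
    # Stand-in for the HTTP request to Model Monitoring.
    sent.append("report")
    return False, "Report sent"


with ThreadPoolExecutor(max_workers=1) as pool:
    async_status = run_report(fake_send_report, fire_and_forget=True, executor=pool)
# Leaving the with-block waits for the background task to finish.

sync_status = run_report(fake_send_report, fire_and_forget=False)
```

In asynchronous mode the caller only learns that the report was scheduled, which is why synchronous mode is the better choice while debugging.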
Runtime compatibility
- requires_internet: air-gapped / offline deployments. This block depends on a service that is not reachable from fully offline / air-gapped deployments.
- soft: runtime hosted_serverless, dedicated_deployment; execution remote; input video. Aggregation buffers are stored in process memory while the reporting interval is tracked in cache. With remote step execution on stateless or multi-replica HTTP runtimes, predictions may be collected by different worker processes, so reports can under-collect or flush partial aggregation windows. Use local step execution in an InferencePipeline for stable video aggregation.
- soft: input image. The block depends on temporal context from video or repeated-frame workflows. With a still image/photo, there is no meaningful history to track, compare, aggregate, or visualize, so the block provides little or no benefit.
Available Connections
Compatible Blocks
Check what blocks you can connect to Model Monitoring Inference Aggregator in version v1.
- inputs:
Roboflow Asset Library Attributes,MoonshotAI Kimi,Path Deviation,Overlap Filter,PTZ Tracking (ONVIF),SIFT Comparison,Event Writer,Slack Notification,SAM2 Video Tracker,VLM As Classifier,Google Gemma,Qwen 3.6 API,Bounding Rectangle,Object Detection Model,Llama 3.2 Vision,Email Notification,SAM 3 Interactive,Velocity,OpenAI-Compatible LLM,Google Gemini,JSON Parser,Track Class Lock,Anthropic Claude,OpenAI,Llama 3.2 Vision,Detection Event Log,ByteTrack Tracker,Clip Comparison,OpenAI,GLM-OCR,MQTT Writer,CSV Formatter,Webhook Sink,SIFT Comparison,Motion Detection,Local File Sink,Google Gemini,MoonshotAI Kimi,Multi-Label Classification Model,Instance Segmentation Model,Keypoint Detection Model,Template Matching,Instance Segmentation Model,Seg Preview,Dynamic Crop,Detections Transformation,BoT-SORT Tracker,Multi-Label Classification Model,Byte Tracker,Detections Stitch,Detection Offset,SORT Tracker,Anthropic Claude,VLM As Classifier,Roboflow Dataset Upload,VLM As Detector,Detections Consensus,Object Detection Model,Detections Filter,Detections Merge,Keypoint Detection Model,SAM3 Video Tracker,Time in Zone,SAM 3,Semantic Segmentation Model,Path Deviation,Twilio SMS Notification,Email Notification,S3 Sink,Identify Changes,Byte Tracker,Single-Label Classification Model,SAM 3,LMM For Classification,OCR Model,Mask Area Measurement,OpenAI,Google Gemma API,Identify Outliers,Time in Zone,EasyOCR,Single-Label Classification Model,YOLO-World Model,Current Time,Stitch OCR Detections,Moondream2,Detections List Roll-Up,Florence-2 Model,Google Gemini,OpenRouter,Detections Stabilizer,SAM 3,Model Monitoring Inference Aggregator,Google Vision OCR,Byte Tracker,Instance Segmentation Model,Single-Label Classification Model,LMM,Segment Anything 2 Model,Time in Zone,Mask Edge Snap,Line Counter,CogVLM,Qwen3.5-VL,Per-Class Confidence Filter,Gaze Detection,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Dynamic Zone,Detections Combine,Roboflow Dataset Upload,Qwen 3.5 API,Multi-Label 
Classification Model,OC-SORT Tracker,OpenAI,Semantic Segmentation Model,Qwen-VL,Florence-2 Model,Perspective Correction,Twilio SMS/MMS Notification,Roboflow Vision Events,Microsoft SQL Server Sink,Instance Segmentation Model,Roboflow Custom Metadata,Detections Classes Replacement,VLM As Detector,Keypoint Detection Model,Object Detection Model
- outputs:
Cache Set,Roboflow Asset Library Attributes,MoonshotAI Kimi,Path Deviation,Image Blur,Reference Path Visualization,PTZ Tracking (ONVIF),Event Writer,Slack Notification,Halo Visualization,CLIP Embedding Model,Image Stack,Google Gemma,Qwen 3.6 API,Object Detection Model,Dot Visualization,Label Visualization,Background Color Visualization,Llama 3.2 Vision,Email Notification,SAM 3 Interactive,Pixelate Visualization,OpenAI-Compatible LLM,Google Gemini,Anthropic Claude,Cache Get,OpenAI,Trace Visualization,Llama 3.2 Vision,OpenAI,Clip Comparison,GLM-OCR,MQTT Writer,Webhook Sink,SIFT Comparison,Motion Detection,Local File Sink,Google Gemini,MoonshotAI Kimi,Polygon Visualization,Classification Label Visualization,Multi-Label Classification Model,Instance Segmentation Model,Keypoint Detection Model,Keypoint Visualization,Template Matching,Instance Segmentation Model,Icon Visualization,Seg Preview,Dynamic Crop,Stability AI Inpainting,BoT-SORT Tracker,Bounding Box Visualization,Multi-Label Classification Model,Polygon Zone Visualization,Crop Visualization,Stability AI Outpainting,Mask Visualization,Halo Visualization,Detections Stitch,Distance Measurement,Text Display,Anthropic Claude,Morphological Transformation,Line Counter,Roboflow Dataset Upload,Detections Consensus,Object Detection Model,Ellipse Visualization,Keypoint Detection Model,SAM3 Video Tracker,Time in Zone,SAM 3,Size Measurement,Circle Visualization,Semantic Segmentation Model,Twilio SMS Notification,Path Deviation,Email Notification,S3 Sink,Single-Label Classification Model,SAM 3,LMM For Classification,Heatmap Visualization,Google Gemma API,OpenAI,Time in Zone,Morphological Transformation,Single-Label Classification Model,YOLO-World Model,Current Time,Blur Visualization,Stitch OCR Detections,Moondream2,Florence-2 Model,Google Gemini,Corner Visualization,OpenRouter,Pixel Color Count,Model Comparison Visualization,SAM 3,Model Monitoring Inference Aggregator,Google Vision OCR,Image Threshold,Instance Segmentation 
Model,Single-Label Classification Model,LMM,Polygon Visualization,Segment Anything 2 Model,Time in Zone,Stability AI Image Generation,Line Counter Visualization,Line Counter,CogVLM,Qwen3.5-VL,Image Preprocessing,Gaze Detection,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Color Visualization,Dynamic Zone,Triangle Visualization,QR Code Generator,Roboflow Dataset Upload,Qwen 3.5 API,Multi-Label Classification Model,OpenAI,Qwen-VL,Florence-2 Model,Perspective Correction,Roboflow Vision Events,Twilio SMS/MMS Notification,Microsoft SQL Server Sink,Perception Encoder Embedding Model,Instance Segmentation Model,Depth Estimation,Roboflow Custom Metadata,Contrast Equalization,Camera Calibration,Detections Classes Replacement,Keypoint Detection Model,Object Detection Model
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Model Monitoring Inference Aggregator in version v1 has.
Bindings
- input
  - predictions (Union[object_detection_prediction, keypoint_detection_prediction, instance_segmentation_prediction, classification_prediction]): Model predictions (object detection, instance segmentation, keypoint detection, or classification) to aggregate and report to Roboflow Model Monitoring. Predictions are collected in memory, grouped by class name, and the most confident prediction per class is selected as a representative sample. Predictions accumulate between reporting intervals based on the frequency setting. Supported prediction types: supervision Detections objects or classification prediction dictionaries.
  - model_id (roboflow_model_id): Roboflow model ID (format: 'project/version') to associate with the predictions in Model Monitoring. This identifies which model generated the predictions being reported. The model ID is included in the monitoring data sent to Roboflow, allowing you to track performance per model in the Model Monitoring dashboard.
  - frequency (string): Reporting frequency in seconds. Specifies how often aggregated predictions are sent to Roboflow Model Monitoring. For example, if set to 5, the block collects predictions for 5 seconds, then sends the aggregated sample (one most confident prediction per class) to Model Monitoring. Must be at least 1 second. Lower values provide more frequent updates but increase API calls. Higher values reduce API calls but provide less frequent updates. Default: 5 seconds. Works well for video processing where you want regular but not excessive reporting.
  - fire_and_forget (boolean): Execution mode flag. When True (default), the block runs asynchronously in the background, allowing the workflow to continue processing without waiting for the API call to complete. This provides faster workflow execution but errors are not immediately available. When False, the block runs synchronously and waits for the API call to complete, returning immediate status and error information. Use False for debugging and error handling, True for production workflows where performance is prioritized.
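- output
  - error_status: indicates success (False) or failure (True).
  - message: reporting status or error details.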
Example JSON definition of step Model Monitoring Inference Aggregator in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/model_monitoring_inference_aggregator@v1",
    "predictions": "$steps.object_detection.predictions",
    "model_id": "my_project/3",
    "frequency": 3,
    "unique_aggregator_key": "session-1v73kdhfse",
    "fire_and_forget": true
}
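To show where the step definition above sits, the sketch below embeds it in a complete workflow definition built as a Python dictionary. Only the aggregator step is taken from this page. The surrounding structure (the version, inputs, and outputs keys, the WorkflowImage and JsonField types) and the object detection step type are assumptions about the general Workflows definition format and should be checked against the Workflows documentation. The step name monitoring and the output name monitoring_message are placeholders.

```python
import json

MODEL_ID = "my_project/3"  # placeholder model ID in 'project/version' format

workflow_definition = {
    "version": "1.0",  # assumed definition format version
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            # Assumed type identifier of the upstream detection step.
            "type": "roboflow_core/roboflow_object_detection_model@v1",
            "name": "object_detection",
            "images": "$inputs.image",
            "model_id": MODEL_ID,
        },
        {
            # The aggregator step, as in the example definition on this page.
            "type": "roboflow_core/model_monitoring_inference_aggregator@v1",
            "name": "monitoring",
            "predictions": "$steps.object_detection.predictions",
            "model_id": MODEL_ID,
            "frequency": 3,
            "unique_aggregator_key": "session-1v73kdhfse",
            "fire_and_forget": True,
        },
    ],
    "outputs": [
        {
            # Exposes the block's status message as a workflow output.
            "type": "JsonField",
            "name": "monitoring_message",
            "selector": "$steps.monitoring.message",
        },
    ],
}

serialized = json.dumps(workflow_definition, indent=2)
```

Using the same model_id in both steps keeps the reported predictions associated with the model that produced them.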