S3 Sink
Class: S3SinkBlockV1
Source: inference.core.workflows.core_steps.sinks.s3.v1.S3SinkBlockV1
Save workflow data directly to an AWS S3 bucket, supporting CSV, JSON, and text file formats with configurable output modes for aggregating multiple entries into single objects or saving each entry as a separate S3 object.
How This Block Works
This block uploads string content from workflow steps to S3 objects. The block:
- Takes string content (from formatters, predictions, or other string-producing blocks) and S3 configuration as input
- Connects to AWS S3 using the provided credentials (or the default AWS credential chain if none are supplied)
- Selects the appropriate upload strategy based on output_mode:
  - Separate Files Mode: Creates a new S3 object for each input, generating unique keys with timestamps
  - Append Log Mode: Buffers content in memory, uploading a complete object when max_entries_per_file is reached or when the block is destroyed
- For separate files mode: Generates a unique S3 key from the prefix, file name prefix, file type, and a timestamp, then uploads the content directly
- For append log mode:
  - Buffers content entries in memory under a single S3 key
  - Applies format-specific handling for appending:
    - CSV: Removes the header row from subsequent appends (CSV content must include headers on first write)
    - JSON: Converts to JSONL (JSON Lines) format, parsing and re-serializing each JSON document to fit on a single line
    - TXT: Appends content directly with newlines
  - Tracks entry count and uploads the full buffer as a complete S3 object when max_entries_per_file is reached, then starts a fresh buffer with a new key
  - Uploads any remaining buffered data when the block is destroyed
- Returns error status and messages indicating save success or failure
The block supports two storage strategies: separate files mode creates individual timestamped S3 objects per input (useful for organizing outputs by execution), while append log mode accumulates entries in memory and writes them as complete S3 objects on rotation (useful for time-series logging with controlled upload frequency). S3 key names include timestamps (format: YYYY_MM_DD_HH_MM_SS_microseconds) for unique keys and chronological ordering.
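The append-log rules described above can be illustrated with a minimal sketch. It is not the block's actual implementation: the class name AppendLogBuffer and its methods are hypothetical, and the S3 upload is replaced by simply returning the object body that would be uploaded.

```python
import json
from typing import List, Optional


class AppendLogBuffer:
    """Hypothetical in-memory buffer following the append-log rules above."""

    def __init__(self, file_type: str, max_entries_per_file: int) -> None:
        if file_type not in {"csv", "json", "txt"}:
            raise ValueError(f"unsupported file_type: {file_type}")
        if max_entries_per_file < 1:
            raise ValueError("max_entries_per_file must be at least 1")
        self.file_type = file_type
        self.max_entries = max_entries_per_file
        self.parts: List[str] = []
        self.entries = 0

    def append(self, content: str) -> Optional[str]:
        """Buffer one entry; return the full object body when rotation is due."""
        if self.file_type == "csv" and self.entries > 0:
            # Drop the header row on every append after the first one.
            content = "".join(content.splitlines(keepends=True)[1:])
        elif self.file_type == "json":
            # Parse and re-serialize so each document fits on one line (JSONL).
            content = json.dumps(json.loads(content))
        if content and not content.endswith("\n"):
            content += "\n"
        self.parts.append(content)
        self.entries += 1
        if self.entries >= self.max_entries:
            return self.flush()
        return None

    def flush(self) -> Optional[str]:
        """Return any buffered data and reset; would also run at teardown."""
        if not self.parts:
            return None
        body = "".join(self.parts)
        self.parts, self.entries = [], 0
        return body


buffer = AppendLogBuffer("csv", max_entries_per_file=2)
buffer.append("a,b\n1,2\n")          # buffered, nothing to upload yet
body = buffer.append("a,b\n3,4\n")   # limit reached: header kept only once
```

In this sketch, the second append returns a body whose header row appears once, which mirrors the requirement that CSV content includes headers on the first write.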
AWS Credentials
Credentials can be supplied in two ways:
1. Workflow inputs — declare aws_access_key_id and aws_secret_access_key as workflow inputs of kind parameter and connect them to the corresponding fields. This keeps credentials out of the workflow definition and allows them to be supplied at runtime.
2. Secrets provider block — connect the credential fields to the output of an Environment Secrets Store block, which reads values from server-side environment variables without embedding them in the workflow. Note: this is only available on self-hosted inference servers and cannot be used on the Roboflow hosted platform.
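As a sketch of the first option, the workflow definition below is written as a Python dictionary and passes credentials through workflow inputs. The input names, step name, bucket, and the log_line parameter are illustrative assumptions; the wildcard output selector is also an assumption, since this page does not list the block's output names.

```python
import json

# Hypothetical workflow definition: credentials are declared as workflow
# inputs and referenced by selector, so no secret value is stored here.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowParameter", "name": "log_line"},
        {"type": "WorkflowParameter", "name": "aws_access_key_id"},
        {"type": "WorkflowParameter", "name": "aws_secret_access_key"},
    ],
    "steps": [
        {
            "type": "roboflow_core/s3_sink@v1",
            "name": "s3_sink",
            "content": "$inputs.log_line",
            "file_type": "txt",
            "output_mode": "separate_files",
            "bucket_name": "my-inference-results",
            "s3_prefix": "logs/detections",
            "file_name_prefix": "my_output",
            "aws_access_key_id": "$inputs.aws_access_key_id",
            "aws_secret_access_key": "$inputs.aws_secret_access_key",
            "aws_region": "us-east-1",
        }
    ],
    "outputs": [
        # Assumption: expose every output of the step via a wildcard selector.
        {"type": "JsonField", "name": "s3_result", "selector": "$steps.s3_sink.*"},
    ],
}

serialized = json.dumps(workflow_definition, indent=2)
```

The serialized definition contains only selectors such as $inputs.aws_access_key_id; the actual values would be supplied as parameters when the workflow is run.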
S3 Key Structure
The final S3 key is composed of:
{s3_prefix}/{file_name_prefix}_{timestamp}.{extension}
For example, with s3_prefix="logs/detections", file_name_prefix="run", and file_type="csv":
logs/detections/run_2024_10_18_14_09_57_622297.csv
If s3_prefix is empty, the key starts directly with the file name.
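The key layout can be reproduced with a short sketch. The function build_s3_key and its parameters are hypothetical and only follow the rules stated on this page (timestamp format, trailing-slash normalization, empty prefix, and the .jsonl extension for JSON in append_log mode); the block's real code may differ.

```python
from datetime import datetime
from typing import Optional


def build_s3_key(
    s3_prefix: str,
    file_name_prefix: str,
    file_type: str,
    output_mode: str = "separate_files",
    now: Optional[datetime] = None,
) -> str:
    """Compose {s3_prefix}/{file_name_prefix}_{timestamp}.{extension}."""
    now = now or datetime.now()
    # YYYY_MM_DD_HH_MM_SS_microseconds; %f is zero-padded to six digits.
    timestamp = now.strftime("%Y_%m_%d_%H_%M_%S_%f")
    # JSON entries are stored as JSON Lines when they are appended to a log.
    is_jsonl = file_type == "json" and output_mode == "append_log"
    extension = "jsonl" if is_jsonl else file_type
    file_name = f"{file_name_prefix}_{timestamp}.{extension}"
    prefix = s3_prefix.rstrip("/")  # normalize trailing slashes
    return f"{prefix}/{file_name}" if prefix else file_name


moment = datetime(2024, 10, 18, 14, 9, 57, 622297)
key = build_s3_key("logs/detections/", "run", "csv", now=moment)
# key == "logs/detections/run_2024_10_18_14_09_57_622297.csv"
```

Because the timestamp fields are zero-padded and ordered from year to microsecond, keys sharing the same prefix sort lexicographically in chronological order.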
Note on Append Log Mode
In append log mode, data is buffered in memory and only uploaded to S3 when:
- The max_entries_per_file limit is reached (object rotation), or
- The block instance is destroyed at workflow teardown
This means data may not be immediately visible in S3 after each step execution. Use separate_files mode if immediate S3 visibility is required.
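To make the visibility delay concrete, the following sketch computes how many objects are uploaded at rotation and how many entries stay in memory until teardown. The helper upload_plan is hypothetical and assumes every step execution contributes exactly one entry.

```python
from typing import Tuple


def upload_plan(total_entries: int, max_entries_per_file: int) -> Tuple[int, int]:
    """Return (objects uploaded at rotation, entries left buffered until teardown)."""
    if max_entries_per_file < 1:
        raise ValueError("max_entries_per_file must be at least 1")
    return divmod(total_entries, max_entries_per_file)


rotated, pending = upload_plan(total_entries=2500, max_entries_per_file=1024)
# rotated == 2, pending == 452
```

With max_entries_per_file set to 1024, a run of 2500 entries produces two complete objects during execution, and the remaining 452 entries reach S3 only when the block is destroyed.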
Common Use Cases
- Cloud Data Logging: Upload detection results, metrics, or workflow outputs directly to S3 for durable cloud storage and downstream processing
- Data Pipeline Integration: Export formatted CSV or JSONL files to S3 for consumption by data pipelines, analytics tools, or ML training jobs
- Batch Result Archival: Store individual inference results as separate S3 objects organized by timestamp and prefix
- Time-Series Collection: Aggregate workflow outputs into batched JSONL or CSV files in S3 for cost-efficient log storage
- Cross-Service Integration: Write data to S3 to trigger Lambda functions, feed SQS queues, or integrate with other AWS services
Type identifier
Use the following identifier in the step "type" field: roboflow_core/s3_sink@v1 to add the block as a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| file_type | str | Type of file to create: 'csv' (CSV format), 'json' (JSON format, or JSONL in append_log mode), or 'txt' (plain text). In append_log mode, JSON files are stored as .jsonl (JSON Lines) format with one JSON object per line. | ❌ |
| output_mode | str | Upload strategy: 'append_log' buffers multiple entries and uploads them as a single S3 object when the entry limit is reached (useful for batched logging), or 'separate_files' uploads each input as a new S3 object with a unique timestamp-based key (useful for per-execution outputs). | ❌ |
| bucket_name | str | Name of the target S3 bucket. Can be a static string or a selector resolving to a string at runtime. | ✅ |
| s3_prefix | str | S3 key prefix (folder path) where objects will be stored. Trailing slashes are normalized automatically. Combined with file_name_prefix and a timestamp to form the full object key. Example: 'logs/detections' produces keys like 'logs/detections/workflow_output_2024_10_18_14_09_57_622297.csv'. | ✅ |
| file_name_prefix | str | Prefix used to generate S3 object names. Combined with a timestamp (format: YYYY_MM_DD_HH_MM_SS_microseconds) and file extension to create unique keys like 'workflow_output_2024_10_18_14_09_57_622297.csv'. | ✅ |
| max_entries_per_file | int | Maximum number of buffered entries before uploading to S3 and starting a new object in append_log mode. When this limit is reached, the accumulated buffer is uploaded as a complete S3 object and a new buffer starts with a fresh key. Only applies when output_mode is 'append_log'. Must be at least 1. | ✅ |
| aws_access_key_id | str | AWS access key ID for authentication. If not provided, boto3's default credential chain is used (environment variables, ~/.aws/credentials, or IAM role). Recommended: connect this to an Environment Secrets Store block rather than hardcoding. | ✅ |
| aws_secret_access_key | str | AWS secret access key for authentication. If not provided, boto3's default credential chain is used. Recommended: connect this to an Environment Secrets Store block rather than hardcoding. | ✅ |
| aws_region | str | AWS region where the bucket is located (e.g., 'us-east-1'). If not provided, boto3's default region is used (AWS_DEFAULT_REGION environment variable or ~/.aws/config). | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Runtime compatibility
- soft (runtime: hosted_serverless, dedicated_deployment; execution: remote): Append-log mode buffers entries in process memory before uploading the accumulated object to S3. With remote step execution on stateless or multi-replica HTTP runtimes, successive requests may be served by different worker processes, so append-log objects can reset or split across workers. Use separate_files mode, or local step execution in an InferencePipeline, when each entry must be captured in a single ordered log.
Available Connections
Compatible Blocks
Check what blocks you can connect to S3 Sink in version v1.
- inputs: VLM As Classifier, Google Gemma API, MoonshotAI Kimi, Stitch OCR Detections, Anthropic Claude, S3 Sink, LMM For Classification, Microsoft SQL Server Sink, Roboflow Custom Metadata, Google Vision OCR, Twilio SMS Notification, Qwen-VL, Email Notification, Roboflow Vision Events, Stitch OCR Detections, Google Gemma, Event Writer, Qwen3.5-VL, Llama 3.2 Vision, Email Notification, Twilio SMS/MMS Notification, OPC UA Writer Sink, Llama 3.2 Vision, Model Monitoring Inference Aggregator, OpenRouter, OpenAI, Florence-2 Model, OpenAI-Compatible LLM, MoonshotAI Kimi, OpenAI, Single-Label Classification Model, OCR Model, CogVLM, Instance Segmentation Model, Anthropic Claude, Google Gemini, Qwen 3.6 API, Clip Comparison, Google Gemini, CSV Formatter, Webhook Sink, Multi-Label Classification Model, LMM, OpenAI, Florence-2 Model, Current Time, OpenAI, VLM As Detector, Google Gemini, Roboflow Visual Search, Slack Notification, EasyOCR, Roboflow Dataset Upload, Roboflow Dataset Upload, PLC Writer, Qwen 3.5 API, Anthropic Claude, Object Detection Model, Local File Sink, MQTT Writer, Keypoint Detection Model, GLM-OCR, Roboflow Asset Library Attributes
- outputs: Line Counter, MoonshotAI Kimi, Stability AI Image Generation, Trace Visualization, Path Deviation, Image Stack, Anthropic Claude, Icon Visualization, SIFT Comparison, Morphological Transformation, Color Visualization, LMM For Classification, Single-Label Classification Model, Perspective Correction, Corner Visualization, Roboflow Custom Metadata, Halo Visualization, Dynamic Zone, Keypoint Detection Model, Qwen-VL, Email Notification, Halo Visualization, Object Detection Model, Google Gemma, Background Color Visualization, Ellipse Visualization, Email Notification, Twilio SMS/MMS Notification, Text Display, Polygon Visualization, Crop Visualization, Image Preprocessing, Template Matching, Model Monitoring Inference Aggregator, OpenRouter, OpenAI, Florence-2 Model, Motion Detection, Heatmap Visualization, OpenAI, Perception Encoder Embedding Model, Blur Visualization, Depth Estimation, Instance Segmentation Model, Stability AI Outpainting, Anthropic Claude, YOLO-World Model, Google Gemini, Clip Comparison, Google Gemini, Keypoint Visualization, Webhook Sink, Florence-2 Model, Current Time, Contrast Equalization, OpenAI, Moondream2, Line Counter, Google Gemini, Slack Notification, Triangle Visualization, Time in Zone, CLIP Embedding Model, Multi-Label Classification Model, Local File Sink, Keypoint Detection Model, Pixel Color Count, GLM-OCR, Roboflow Asset Library Attributes, Polygon Zone Visualization, Time in Zone, Google Gemma API, Stitch OCR Detections, Line Counter Visualization, Semantic Segmentation Model, Distance Measurement, Image Threshold, Multi-Label Classification Model, Camera Calibration, QR Code Generator, S3 Sink, Microsoft SQL Server Sink, Twilio SMS Notification, Google Vision OCR, Image Blur, Morphological Transformation, Roboflow Vision Events, Size Measurement, PTZ Tracking (ONVIF), Stability AI Inpainting, Classification Label Visualization, Stitch OCR Detections, Event Writer, Qwen3.5-VL, Mask Visualization, Llama 3.2 Vision, Reference Path Visualization, Label Visualization, OPC UA Writer Sink, Dot Visualization, Cache Set, Dynamic Crop, Detections Stitch, Circle Visualization, Llama 3.2 Vision, BoT-SORT Tracker, Path Deviation, SAM3 Video Tracker, Gaze Detection, Segment Anything 2 Model, OpenAI-Compatible LLM, MoonshotAI Kimi, Single-Label Classification Model, CogVLM, Object Detection Model, SAM 3 Interactive, Qwen 3.6 API, Detections Consensus, Bounding Box Visualization, Multi-Label Classification Model, LMM, OpenAI, SAM 3, Instance Segmentation Model, Roboflow Visual Search, Roboflow Dataset Upload, SAM 3, Cache Get, Instance Segmentation Model, Detections Classes Replacement, Pixelate Visualization, Keypoint Detection Model, Instance Segmentation Model, Roboflow Dataset Upload, PLC Writer, Qwen 3.5 API, Object Detection Model, Anthropic Claude, Time in Zone, MQTT Writer, Polygon Visualization, SAM 3, Model Comparison Visualization, Single-Label Classification Model, Seg Preview
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds S3 Sink in version v1 has.
Bindings
- input
  - content (string): String content to upload to S3. This should be formatted data from other workflow blocks (e.g., CSV content from CSV Formatter, JSON strings, or plain text). The content format should match the specified file_type. For CSV files in append_log mode, content must include header rows on the first write.
  - bucket_name (string): Name of the target S3 bucket. Can be a static string or a selector resolving to a string at runtime.
  - s3_prefix (string): S3 key prefix (folder path) where objects will be stored. Trailing slashes are normalized automatically. Combined with file_name_prefix and a timestamp to form the full object key. Example: 'logs/detections' produces keys like 'logs/detections/workflow_output_2024_10_18_14_09_57_622297.csv'.
  - file_name_prefix (string): Prefix used to generate S3 object names. Combined with a timestamp (format: YYYY_MM_DD_HH_MM_SS_microseconds) and file extension to create unique keys like 'workflow_output_2024_10_18_14_09_57_622297.csv'.
  - max_entries_per_file (string): Maximum number of buffered entries before uploading to S3 and starting a new object in append_log mode. When this limit is reached, the accumulated buffer is uploaded as a complete S3 object and a new buffer starts with a fresh key. Only applies when output_mode is 'append_log'. Must be at least 1.
  - aws_access_key_id (Union[secret, string]): AWS access key ID for authentication. If not provided, boto3's default credential chain is used (environment variables, ~/.aws/credentials, or IAM role). Recommended: connect this to an Environment Secrets Store block rather than hardcoding.
  - aws_secret_access_key (Union[secret, string]): AWS secret access key for authentication. If not provided, boto3's default credential chain is used. Recommended: connect this to an Environment Secrets Store block rather than hardcoding.
  - aws_region (string): AWS region where the bucket is located (e.g., 'us-east-1'). If not provided, boto3's default region is used (AWS_DEFAULT_REGION environment variable or ~/.aws/config).
- output
Example JSON definition of step S3 Sink in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/s3_sink@v1",
    "content": "$steps.csv_formatter.csv_content",
    "file_type": "csv",
    "output_mode": "append_log",
    "bucket_name": "my-inference-results",
    "s3_prefix": "logs/detections",
    "file_name_prefix": "my_output",
    "max_entries_per_file": 1024,
    "aws_access_key_id": "$steps.secrets.aws_access_key_id",
    "aws_secret_access_key": "$steps.secrets.aws_secret_access_key",
    "aws_region": "us-east-1"
}