
CSV Formatter

Class: CSVFormatterBlockV1

Source: inference.core.workflows.core_steps.formatters.csv.v1.CSVFormatterBlockV1

The CSV Formatter block prepares structured CSV content based on specified data configurations within a workflow. It allows users to:

  • choose which data appears as columns

  • apply operations to transform the data within the block

  • aggregate a whole batch of data into a single CSV document (see the Data Aggregation section)

The generated CSV content can be used as input for other blocks, such as File Sink or Email Notifications.

Defining columns

Use the columns_data property to specify the column names and their data sources. By defining UQL operations in columns_operations, you can apply specific operations to each column.

Timestamp column

The block automatically adds a timestamp column; this column name is reserved and cannot be used.

The timestamp value uses the following format: 2024-10-18T14:09:57.622297+00:00. Values are expressed in the UTC time zone.
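
This format matches a timezone-aware UTC timestamp serialized as ISO 8601. A minimal sketch of how such a value can be reproduced in Python (for illustration only, not the block's internal code):

from datetime import datetime, timezone

# A timezone-aware UTC timestamp rendered in ISO 8601,
# e.g. "2024-10-18T14:09:57.622297+00:00"
timestamp = datetime.now(timezone.utc).isoformat()
print(timestamp)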

For example, the following definition

columns_data = {
    "predictions": "$steps.model.predictions",
    "reference": "$inputs.reference_class_names",
}
columns_operations = {
    "predictions": [
        {"type": "DetectionsPropertyExtract", "property_name": "class_name"}
    ],
}

will generate the following CSV content:

timestamp,predictions,reference
"2024-10-16T11:15:15.336322+00:00","['a', 'b', 'c']","['a', 'b']"

This assumes the definition is applied to object detection predictions from a single image and that $inputs.reference_class_names holds a list of reference class names.
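
Because the list-valued cells in this example are serialized as their Python string representation, downstream code can recover them with the standard library. A minimal sketch, assuming the CSV content above is available as a plain string (the exact cell serialization depends on your data and operations):

import ast
import csv
import io

csv_content = """timestamp,predictions,reference
"2024-10-16T11:15:15.336322+00:00","['a', 'b', 'c']","['a', 'b']"
"""

for row in csv.DictReader(io.StringIO(csv_content)):
    predictions = ast.literal_eval(row["predictions"])  # -> ['a', 'b', 'c']
    reference = ast.literal_eval(row["reference"])      # -> ['a', 'b']
    print(row["timestamp"], predictions, reference)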

Data Aggregation

The block may take input from different blocks, hence its behavior may differ depending on context:

  • data batch_size=1: whenever a single input is provided, the block produces output as in the example above - the CSV header is placed in the first row and the data in the second row

  • data batch_size>1: each datapoint creates one row in the CSV document, but only the last batch element is fed with the aggregated output, leaving the other batch elements' outputs empty

When should I expect batch_size=1?

You may expect batch_size=1 in the following scenarios:

  • CSV Formatter was connected to the output of a block that operates on only one image and produces one prediction

  • CSV Formatter was connected to the output of a block that aggregates data for the whole batch and produces a single non-empty output (which is exactly the characteristic of CSV Formatter itself)

When should I expect batch_size>1?

You may expect batch_size>1 in the following scenarios:

  • CSV Formatter was connected to the output of a block that produces a single prediction per image, but a batch of images was fed - in that case, CSV Formatter will aggregate the CSV content and output it in the position of the last batch element (see the consumption sketch after the diagram):
--- input_batch[0] ----> ┌───────────────────────┐ ---->  <Empty>
--- input_batch[1] ----> │                       │ ---->  <Empty>
        ...              │      CSV Formatter    │ ---->  <Empty>
        ...              │                       │ ---->  <Empty>           
--- input_batch[n] ----> └───────────────────────┘ ---->  {"csv_content": "..."}
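
When consuming such batched outputs programmatically, only the last batch element carries the aggregated document. A minimal sketch, assuming the step outputs arrive as a list with empty entries for all but the last element (the exact result structure depends on how the workflow is executed):

# Hypothetical per-element outputs of the CSV Formatter step for a batch of inputs:
# everything except the last element is empty.
step_outputs = [None, None, None, {"csv_content": "timestamp,predictions,reference\n..."}]

# Keep only non-empty aggregated documents.
csv_documents = [
    out["csv_content"] for out in step_outputs if out and out.get("csv_content")
]

aggregated_csv = csv_documents[-1] if csv_documents else None
print(aggregated_csv)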

Format of CSV document for batch_size>1

If the example presented above is applied to a larger input batch, the output document structure would be as follows:

timestamp,predictions,reference
"2024-10-16T11:15:15.336322+00:00","['a', 'b', 'c']","['a', 'b']"
"2024-10-16T11:15:15.436322+00:00","['b', 'c']","['a', 'b']"
"2024-10-16T11:15:15.536322+00:00","['a', 'c']","['a', 'b']"

Type identifier

Use the following identifier in the step "type" field: roboflow_core/csv_formatter@v1 to add the block as a step in your workflow.

Properties

Name               | Type                                     | Description                                                                        | Refs
name               | str                                      | Enter a unique identifier for this step.                                           |
columns_data       | Dict[str, Union[bool, float, int, str]]  | References data to be used to construct each and every column.                     | yes
columns_operations | Dict[str, List[Union[ClassificationPropertyExtract, ConvertDictionaryToJSON, ConvertImageToBase64, ConvertImageToJPEG, DetectionsFilter, DetectionsOffset, DetectionsPropertyExtract, DetectionsRename, DetectionsSelection, DetectionsShift, DetectionsToDictionary, Divide, ExtractDetectionProperty, ExtractImageProperty, LookupTable, Multiply, NumberRound, NumericSequenceAggregate, RandomNumber, SequenceAggregate, SequenceApply, SequenceLength, SequenceMap, SortDetections, StringMatches, StringSubSequence, StringToLowerCase, StringToUpperCase, ToBoolean, ToNumber, ToString]]] | UQL definitions of operations to be performed on defined data w.r.t. each column. |

The Refs column marks the possibility to parametrise the property with dynamic values available at workflow runtime. See Bindings for more info.
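
Multiple UQL operations can be chained for a single column; the list is applied in order. A hypothetical definition, assuming the DetectionsPropertyExtract and SequenceLength operation types compose as their names suggest, that reduces predictions to a per-image detection count:

columns_data = {
    "detections_count": "$steps.model.predictions",
}
columns_operations = {
    "detections_count": [
        # Extract the class name of every detection ...
        {"type": "DetectionsPropertyExtract", "property_name": "class_name"},
        # ... then replace the extracted list with its length.
        {"type": "SequenceLength"},
    ],
}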

Available Connections

Compatible Blocks

Check what blocks you can connect to CSV Formatter in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds CSV Formatter in version v1 has.

Bindings
  • input

    • columns_data (*): References data to be used to construct each and every column.
  • output

    • csv_content (string): String value.
Example JSON definition of step CSV Formatter in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/csv_formatter@v1",
    "columns_data": {
        "predictions": "$steps.model.predictions",
        "reference": "$inputs.reference_class_names"
    },
    "columns_operations": {
        "predictions": [
            {
                "property_name": "class_name",
                "type": "DetectionsPropertyExtract"
            }
        ]
    }
}
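
The step's output can then be referenced by downstream blocks (for example File Sink or Email Notifications) via the standard output selector. A sketch in which the step name, block type, and property name are placeholders that depend on the target block:

downstream_step = {
    "name": "csv_sink",                                 # hypothetical step name
    "type": "<downstream_block_type>",                  # e.g. a File Sink or Email Notifications block
    "csv": "$steps.<your_step_name_here>.csv_content",  # placeholder property name; the selector points
                                                        # at this block's csv_content output
}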