Name: CSV To Avro Schema
Author: A.Tools

Convert CSV to Avro Schema online — paste, edit, and download Avro.

What Is the CSV to Avro Schema Converter?

Apache Avro is a row-oriented remote procedure call and data serialization framework developed within the Apache Hadoop project. It uses JSON-based schemas to define data structures and a compact binary encoding for efficient storage and transmission. Avro is a common serialization format for Apache Kafka, particularly alongside Confluent Schema Registry, and is widely used in Apache Spark, Apache Flink, and AWS services.

The CSV to Avro Schema Converter on A.Tools reads your CSV file, analyzes the data in each column, and generates a complete Avro schema in JSON format with automatically inferred types.

All processing runs locally in your browser. No data leaves your device.

Core Features

Automatic Type Inference

The tool examines the actual values in each CSV column and maps them to Avro primitive types:

Data Pattern	Avro Type	Example Values
Text, mixed characters	`string`	`"Alice"`, `"NYC"`
Whole numbers (small)	`int`	`30`, `-5`, `145`
Whole numbers (large)	`long`	`9223372036854775807`
Decimal numbers	`float` / `double`	`3.14`, `99.99`
Boolean	`boolean`	`true`, `false`
Empty cells	`["null", "type"]` (union)	(empty)
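
The mapping in the table can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function names `classify` and `infer_column_type` are hypothetical, decimals are always mapped to `double`, and widening a column that mixes whole and decimal numbers to `double` is an assumption.

```python
import re

INT_RE = re.compile(r"[+-]?[0-9]+")
DECIMAL_RE = re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)")

def classify(cell):
    """Return the narrowest Avro primitive type that fits one non-empty cell."""
    if cell in ("true", "false"):
        return "boolean"
    if INT_RE.fullmatch(cell):
        number = int(cell)
        if -2**31 <= number <= 2**31 - 1:
            return "int"
        if -2**63 <= number <= 2**63 - 1:
            return "long"
        return "string"  # does not fit in a 64-bit long
    if DECIMAL_RE.fullmatch(cell):
        return "double"
    return "string"

def infer_column_type(cells):
    """Infer one Avro type for a whole column of raw CSV cell strings."""
    non_empty = [c.strip() for c in cells if c.strip() != ""]
    kinds = {classify(c) for c in non_empty}
    if not kinds:
        base = "string"            # assumption: an all-empty column becomes string
    elif len(kinds) == 1:
        base = next(iter(kinds))
    elif kinds <= {"int", "long"}:
        base = "long"              # widen mixed int/long to long
    elif kinds <= {"int", "long", "double"}:
        base = "double"            # assumption: widen mixed numbers to double
    else:
        base = "string"            # anything else falls back to string
    has_empty = len(non_empty) < len(cells)
    return ["null", base] if has_empty else base

print(infer_column_type(["30", "-5", "145"]))      # int
print(infer_column_type(["49.99", "", "125.00"]))  # ['null', 'double']
```

Scanning every value before choosing a type matters: a single non-numeric cell anywhere in the column is enough to fall back to `string`.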

Nullable Columns (Union Types)

When a column contains empty cells, the tool generates an Avro union type to allow null values:

{

"name": "age",

"type": ["null", "int"]

}

This is essential for real-world data where some fields may be missing.
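
If you later edit the generated schema by hand, a nullable field is often given an explicit `null` default as well. The sketch below shows one way to build such a field in Python; the helper name `nullable_field` is hypothetical, and the `default` key is an optional addition that the schema shown above does not include.

```python
import json

def nullable_field(name, avro_type, default_null=True):
    """Build a field whose type is the union ["null", avro_type]."""
    field = {"name": name, "type": ["null", avro_type]}
    if default_null:
        # The Avro specification has historically required a union field's
        # default to match the first branch, so "null" is listed first.
        field["default"] = None
    return field

print(json.dumps(nullable_field("age", "int")))
# {"name": "age", "type": ["null", "int"], "default": null}
```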

Avro Record Schema Structure

The output follows the standard Avro schema specification:

{

"type": "record",

"name": "MyRecord",

"namespace": "com.example",

"fields": [

{"name": "id", "type": "int"},

{"name": "name", "type": "string"},

{"name": "price", "type": "double"}

]

}

Online Table Editor

Edit your data in-browser before converting:

Undo / Redo — Full edit history.
Add / Delete Rows & Columns — Expand or trim the table.
Transpose — Swap rows and columns.
Delete Empty — Remove empty rows and columns.
Deduplicate — Remove duplicate rows.
ABC / abc / Abc — Batch case conversion.
Find & Replace — With regex support.
First Row as Header — Column headers become Avro field names.
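
Avro names must start with a letter or underscore and contain only letters, digits, and underscores, so headers such as `Unit Price ($)` cannot be used as field names unchanged. How the tool handles such headers is not documented here; the Python sketch below shows one possible cleanup you could apply yourself, with the hypothetical helpers `to_avro_name` and `unique_names`.

```python
import re

def to_avro_name(header, fallback="field"):
    """Turn an arbitrary CSV header into a valid Avro name."""
    # Replace each run of disallowed characters with a single underscore.
    name = re.sub(r"[^A-Za-z0-9_]+", "_", header.strip())
    if not name:
        name = fallback
    if name[0].isdigit():
        name = "_" + name  # Avro names may not start with a digit
    return name

def unique_names(headers):
    """Sanitize all headers and add numeric suffixes to resolve duplicates."""
    used = set()
    result = []
    for header in headers:
        base = to_avro_name(header)
        candidate, suffix = base, 1
        while candidate in used:
            suffix += 1
            candidate = f"{base}_{suffix}"
        used.add(candidate)
        result.append(candidate)
    return result

print(unique_names(["id", "Unit Price ($)", "2024 sales", "id"]))
# ['id', 'Unit_Price_', '_2024_sales', 'id_2']
```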

Privacy & Security

All processing runs client-side via the browser File API. Files are never uploaded, transmitted, or stored. Safe for enterprise data models, proprietary schemas, and production field definitions.

How to Use the CSV to Avro Schema Converter

Step 1 — Load Your Data

Upload a .csv or .tsv file by dragging it onto the upload area, or click to browse. Alternatively, click Enter Data to type or paste data directly.

Step 2 — Edit Your Data (Optional)

Use the toolbar to refine your data:

Add, insert, or delete rows and columns.
Transpose the table.
Remove empty rows/columns or duplicate rows.
Change text case.
Find and replace values (supports regex).
Toggle First Row as Header to define field names.

Step 3 — Convert

Click Convert. The tool analyzes each column's data values and generates an Avro schema with inferred types. The JSON schema appears in the Output Data panel.

Step 4 — Copy and Use

Click Copy to Clipboard and use the schema with:

Confluent Schema Registry — Register the schema for Kafka topics.
Apache Spark — Define the schema for spark.read.format("avro").
Apache Kafka Producers/Consumers — Embed in producer/consumer config.
AWS Glue / Kinesis Data Analytics — Use as table schema definitions.

Practical Examples

Example 1: Kafka Event Schema

Input CSV:

event_id,event_type,user_id,amount,timestamp,processed

1001,purchase,U-501,49.99,2026-05-07T10:30:00Z,true

1002,refund,U-502,,2026-05-07T11:15:00Z,false

1003,purchase,U-503,125.00,2026-05-07T12:00:00Z,true

Output Avro Schema:

{

"type": "record",

"name": "CsvRecord",

"fields": [

{"name": "event_id", "type": "int"},

{"name": "event_type", "type": "string"},

{"name": "user_id", "type": "string"},

{"name": "amount", "type": ["null", "double"]},

{"name": "timestamp", "type": "string"},

{"name": "processed", "type": "boolean"}

]

}

Note: amount is a union ["null", "double"] because row 2 has an empty value.
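
To check in advance which columns of a file will become unions, you can scan for empty cells yourself. The following Python sketch uses only the standard `csv` module on the input from this example; the function name `columns_with_empty_cells` is hypothetical.

```python
import csv
import io

CSV_TEXT = """\
event_id,event_type,user_id,amount,timestamp,processed
1001,purchase,U-501,49.99,2026-05-07T10:30:00Z,true
1002,refund,U-502,,2026-05-07T11:15:00Z,false
1003,purchase,U-503,125.00,2026-05-07T12:00:00Z,true
"""

def columns_with_empty_cells(csv_text):
    """Return the headers of columns that contain at least one empty cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    headers = reader.fieldnames or []
    # A short row yields None for missing cells, so treat None as empty too.
    return [
        header
        for header in headers
        if any((row[header] or "").strip() == "" for row in rows)
    ]

print(columns_with_empty_cells(CSV_TEXT))  # ['amount']
```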

Example 2: Product Catalog Schema

Input CSV:

id,name,category,price,in_stock,rating

1,Widget A,Hardware,12.99,145,4.5

2,Widget B,Hardware,8.50,0,3.8

3,Gadget X,Electronics,45.00,23,4.9

Output Avro Schema:

{

"type": "record",

"name": "CsvRecord",

"fields": [

{"name": "id", "type": "int"},

{"name": "name", "type": "string"},

{"name": "category", "type": "string"},

{"name": "price", "type": "double"},

{"name": "in_stock", "type": "int"},

{"name": "rating", "type": "double"}

]

}

Example 3: IoT Sensor Data

Input CSV:

sensor_id,temperature,humidity,active,reading_time

S-001,22.5,65.0,true,2026-05-07T08:00:00Z

S-002,,78.3,true,2026-05-07T08:00:01Z

S-003,19.0,,false,

Output Avro Schema:

{

"type": "record",

"name": "CsvRecord",

"fields": [

{"name": "sensor_id", "type": "string"},

{"name": "temperature", "type": ["null", "double"]},

{"name": "humidity", "type": ["null", "double"]},

{"name": "active", "type": "boolean"},

{"name": "reading_time", "type": ["null", "string"]}

]

}

Multiple columns have empty values, so most fields use union types.

Understanding Apache Avro

What Is Avro?

Apache Avro is a data serialization system that provides:

Rich data structures — Records, enums, arrays, maps, unions.
Compact binary format — Smaller than JSON or XML.
Schema-based — Every data file includes its schema.
Schema evolution — Add/remove fields without breaking consumers.
Language-agnostic — Bindings for Java, Python, C, C++, C#, Go, Ruby, etc.

Avro is defined by the Apache Avro Specification.

Avro Schema Structure

An Avro schema is a JSON document with this structure:

{

"type": "record",

"name": "RecordName",

"namespace": "com.example.namespace",

"doc": "Description of this record",

"fields": [

{"name": "fieldName", "type": "string", "doc": "Field description"}

]

}

Key elements:

type: "record" — A record is Avro's equivalent of a struct or class.
name — The record type name.
namespace — Java-style package name for uniqueness.
fields — Array of field definitions, each with name and type.
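
After editing a schema by hand, a quick structural check can catch a missing key before you hand the schema to a real Avro library. The Python sketch below only verifies the key elements listed above; the function name `check_record_schema` is hypothetical, and this is not a full validator for the Avro specification.

```python
import json

def check_record_schema(schema_json):
    """Return a list of problems; an empty list means the basic shape is fine."""
    schema = json.loads(schema_json)
    if not isinstance(schema, dict):
        return ["schema must be a JSON object"]
    problems = []
    if schema.get("type") != "record":
        problems.append('"type" must be "record"')
    if not isinstance(schema.get("name"), str):
        problems.append('"name" must be a string')
    fields = schema.get("fields")
    if not isinstance(fields, list):
        problems.append('"fields" must be an array')
        fields = []
    for index, field in enumerate(fields):
        if not isinstance(field, dict) or "name" not in field or "type" not in field:
            problems.append(f"field {index} needs both name and type")
    return problems

good = '{"type": "record", "name": "RecordName", "fields": [{"name": "fieldName", "type": "string"}]}'
print(check_record_schema(good))  # []
```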

Avro Primitive Types

Type	Description	Size
`null`	No value	0 bytes
`boolean`	True or false	1 byte
`int`	32-bit signed integer	variable (zigzag)
`long`	64-bit signed integer	variable (zigzag)
`float`	IEEE 754 single precision	4 bytes
`double`	IEEE 754 double precision	8 bytes
`bytes`	Sequence of 8-bit bytes	variable
`string`	Unicode character sequence	variable
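
The "variable (zigzag)" size for `int` and `long` means small magnitudes take fewer bytes. As a rough illustration, here is a minimal Python sketch of zigzag plus variable-length encoding as described in the Avro specification; the function names are hypothetical, and real applications should rely on an Avro library instead.

```python
def zigzag(n):
    """Map a signed 64-bit integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) ^ (n >> 63)

def encode_long(n):
    """Encode a signed 64-bit integer as zigzag varint bytes."""
    if not -2**63 <= n <= 2**63 - 1:
        raise ValueError("value does not fit in a 64-bit long")
    value = zigzag(n)
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # last byte: high bit clear
            return bytes(out)

print(encode_long(30).hex())                    # 3c
print(encode_long(64).hex())                    # 8001
print(len(encode_long(9223372036854775807)))    # 10
```

Values close to zero, positive or negative, fit in one byte, while the largest `long` needs ten.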

Avro vs. JSON vs. Protobuf

Aspect	Avro	JSON	Protobuf
Schema format	JSON	None (self-describing)	.proto (IDL)
Encoding	Binary	Text	Binary
Schema evolution	Full (add/remove/alias)	N/A	Partial
Type safety	Strong	Weak	Strong
Used by	Kafka, Hadoop, Spark	REST APIs, web	gRPC, Google services
Field ordering	Writer-schema order; reader matches by name	N/A	By field number
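
The schema evolution row can be made concrete with a small Python sketch. It projects an already decoded record onto a newer reader schema by field name, which is the core of Avro's schema resolution rules. The function name `resolve` is hypothetical, and the sketch is deliberately simplified: it ignores aliases, type promotion, and nested types.

```python
def resolve(record, reader_schema):
    """Project a decoded writer record onto a reader schema by field name."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]        # field known to both sides
        elif "default" in field:
            resolved[name] = field["default"]    # new field: use the reader default
        else:
            raise ValueError(f"no value and no default for reader field {name!r}")
    return resolved  # writer-only fields are dropped

reader_schema = {
    "type": "record",
    "name": "CsvRecord",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "price", "type": ["null", "double"], "default": None},
    ],
}
old_record = {"id": 1, "name": "Widget A", "legacy_code": "X1"}
print(resolve(old_record, reader_schema))
# {'id': 1, 'name': 'Widget A', 'price': None}
```

A field added without a default raises an error here, which is why new fields in an evolving schema are usually declared with one.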

Avro in the Kafka Ecosystem

In Confluent Platform and Kafka:

Schema Registry stores Avro schemas with versioning.
Producers serialize data using a specific schema ID.
Consumers deserialize using the same schema or a compatible evolved version.
The generated schema can be registered directly via the Schema Registry REST API:
```
POST /subjects/my-topic-value/versions
{ "schema": "<generated schema JSON>" }
```
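
Note that the `schema` value is a JSON string, so the generated schema has to be serialized and escaped before it is placed in the request body. The Python sketch below builds such a request with the standard library only. The base URL `http://localhost:8081`, the subject name, and the helper `build_register_request` are assumptions for illustration; the request is constructed but not sent.

```python
import json
import urllib.parse
import urllib.request

SCHEMA = {
    "type": "record",
    "name": "CsvRecord",
    "fields": [
        {"name": "event_id", "type": "int"},
        {"name": "amount", "type": ["null", "double"]},
    ],
}

def build_register_request(base_url, subject, schema):
    """Build (but do not send) a POST request that registers a schema."""
    # The inner json.dumps turns the schema into a string; the outer one
    # escapes that string inside the {"schema": "..."} envelope.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    subject_path = urllib.parse.quote(subject, safe="")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject_path}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )

request = build_register_request("http://localhost:8081", "my-topic-value", SCHEMA)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```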

Frequently Asked Questions

Is my CSV data uploaded to a server?
No. All file processing happens entirely in your browser using JavaScript. Your CSV data is never uploaded, transferred, or stored on any server.
What is Apache Avro?
Apache Avro is a data serialization framework that uses JSON schemas to define data structures and binary encoding for compact, efficient serialization. It is a common format for Kafka deployments that use Confluent Schema Registry and is widely used in Hadoop, Spark, and Flink ecosystems.
How does type inference work?
The tool scans the data values in each CSV column. If all non-empty values are whole numbers within int range, it uses int. Larger integers become long. Decimal values become double. true/false becomes boolean. Everything else defaults to string. Columns with empty cells get union types (["null", "type"]).
Can I use the schema with Confluent Schema Registry?
Yes. Copy the generated schema JSON and register it via the Schema Registry REST API: POST /subjects/<topic-name>-value/versions with {"schema": "<your schema>"}.
What Avro types does the tool generate?
The tool generates Avro primitive types: string, int, long, float, double, boolean, and null. Nullable fields use Avro union types (e.g., ["null", "string"]).
What file formats are supported?
The tool accepts .csv (comma-separated values) and .tsv (tab-separated values) files. You can also enter data manually through the built-in table editor.
Can I edit the generated schema manually?
Yes. The output is plain JSON. You can modify field names, types, add doc descriptions, change the record name/namespace, or add logical types (e.g., {"type": "long", "logicalType": "timestamp-millis"}) after generation.
What is a union type in Avro?
A union type is an array of types that allows a field to hold values of different types. The most common use is ["null", "string"] which means the field can be either null or a string. The tool generates unions when a column has empty cells.
