
JSON vs XML vs YAML: Which Data Format Should You Use? (2026)
Compare JSON vs XML vs YAML for APIs, config files, and data exchange. Includes syntax examples, performance benchmarks, readability tests, and use case recommendations for developers.
This tool takes a JSON array and writes each item as one line in a JSONL file. JSONL is the format used by OpenAI, Hugging Face, and most LLM fine-tuning pipelines. Runs in your browser, free.
Example input (JSON array):

[
  {
    "id": 1,
    "name": "Alice",
    "role": "engineer",
    "active": true
  },
  {
    "id": 2,
    "name": "Bob",
    "role": "designer",
    "active": false
  },
  {
    "id": 3,
    "name": "Carol",
    "role": "manager",
    "active": true
  }
]

Output (JSONL):

{"id":1,"name":"Alice","role":"engineer","active":true}
{"id":2,"name":"Bob","role":"designer","active":false}
{"id":3,"name":"Carol","role":"manager","active":true}

| Feature | JSON Array | JSONL/NDJSON |
|---|---|---|
| Structure | [{...}, {...}, {...}] | {...}\n{...}\n{...} |
| Streaming Support | ❌ Must load entire file | ✅ Process line by line |
| BigQuery Compatible | ❌ Requires preprocessing | ✅ Direct import ready |
| Elasticsearch Bulk API | ❌ Not supported | ✅ Native NDJSON format |
| Memory Usage | High (entire array in memory) | Low (process one line at a time) |
| Parallel Processing | ❌ Sequential parsing required | ✅ Each line independent |
Converting JSON to JSONL in Python is straightforward using the built-in json module. Load your JSON array with json.load(), iterate through each object, and write each one as a separate line with json.dumps(). This creates newline-delimited JSON perfect for streaming, log processing, or machine learning datasets. Ideal for preparing data for tools like BigQuery, Elasticsearch, or training AI models that expect JSONL format.
import json

# Read a JSON array file
with open("data.json", "r", encoding="utf-8") as f:
    records = json.load(f)

# Write each object as a single JSONL line.
# ensure_ascii=False keeps non-ASCII text as-is, so the file is opened as UTF-8.
with open("output.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

print(f"Converted {len(records)} records to JSONL")

The jq command-line tool makes JSON to JSONL conversion fast and efficient. Use .[] to iterate through the array elements and the -c flag for compact output, so each object lands on its own line. This one-liner approach suits shell scripts, data pipelines, and quick conversions without writing code. It works well for preparing data exports, converting API responses, or formatting logs for analysis tools.
# Convert a JSON array to JSONL with jq
jq -c '.[]' data.json > output.jsonl
# Verify the line count matches the array length (the two numbers should be equal)
jq 'length' data.json
wc -l < output.jsonl

With a standard JSON array, typical parsers such as Python's json.load() read the whole file into memory before you can process any of it. A 1GB JSON array needs at least 1GB of RAM just to parse, and usually more once the parsed objects are built. With JSONL you can process one line at a time, which is essential for files with millions of records.
Logging systems use JSONL because you can append new events without rewriting the whole file. With a JSON array, adding one new record means reading the entire file, parsing it, adding the record, and writing it all back. With JSONL, you just append a new line to the end.
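The append pattern described above can be sketched in a few lines of standard-library Python. The function name `append_event` and the event fields used below are illustrative assumptions, not part of any particular logging system.

```python
import json


def append_event(path, event):
    """Append one event to a JSONL file as a single line."""
    # Mode "a" writes at the end of the file; existing lines are
    # never re-read, re-parsed, or rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event, ensure_ascii=False) + "\n")
```

Calling `append_event("events.jsonl", {"level": "info", "msg": "started"})` twice leaves two lines in the file, and the cost of each call does not depend on how large the file already is.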
LLM training datasets almost always use JSONL because each training example is one line. OpenAI fine-tuning, Hugging Face datasets, and most ML pipelines expect JSONL input. Each line is an independent training example that can be shuffled, batched, and processed in parallel.
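Because every line is an independent example, shuffling and splitting a dataset needs no JSON parsing at all. The sketch below is one possible way to do it; the function name `split_jsonl_lines`, the 10% default, and the fixed seed are assumptions chosen for illustration. It holds all lines in memory, so it suits datasets that fit in RAM.

```python
import random


def split_jsonl_lines(lines, validation_fraction=0.1, seed=0):
    """Shuffle JSONL lines and split them into (train, validation) lists."""
    # Drop blank lines; every remaining line is one independent example.
    examples = [line for line in lines if line.strip()]
    # A seeded Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(examples)
    n_val = int(len(examples) * validation_fraction)
    return examples[n_val:], examples[:n_val]
```

The lines are treated as opaque strings here, which is only safe because JSONL guarantees one complete record per line.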
If you are working with data that gets processed line-by-line (logs, training data, streaming APIs), JSONL is the right format. If you need to load everything at once for analysis, standard JSON arrays work fine.
Python:
import json
with open('file.jsonl') as f:
    for line in f:
        record = json.loads(line)
This reads one line at a time without loading the whole file into memory. Each line is a complete JSON object that you parse independently.
OpenAI fine-tuning: the API expects JSONL with specific keys like messages or prompt/completion. The tool produces valid JSONL, but it writes your objects as they are, so format your data with the right keys before converting. Example: each line should be {"messages": [{"role": "user", "content": "..."}]}.
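As a rough sketch of that reshaping step, the snippet below turns question/answer pairs into lines with a `messages` key. The `question` and `answer` field names and the `to_chat_line` helper are hypothetical, and the assistant turn is an assumption about what a complete training example looks like; check the current fine-tuning documentation for the exact schema your model needs.

```python
import json


def to_chat_line(question, answer, system_prompt=None):
    """Build one JSONL line in a chat-style "messages" layout."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": answer})
    # Without an indent argument, json.dumps escapes newline characters
    # inside strings, so the result always fits on one line.
    return json.dumps({"messages": messages}, ensure_ascii=False)


# Hypothetical source records; the field names are assumptions.
records = [
    {"question": "What is JSONL?", "answer": "One JSON object per line."},
]
jsonl_lines = [to_chat_line(r["question"], r["answer"]) for r in records]
```

Writing each element of `jsonl_lines` followed by a newline gives a file in the one-example-per-line shape described above.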
Hugging Face datasets: load_dataset('json', data_files='file.jsonl') works directly. The datasets library reads JSONL natively and creates a Dataset object. Each line becomes one example in your dataset. No preprocessing needed.
BigQuery: bq load --source_format=NEWLINE_DELIMITED_JSON dataset.table file.jsonl. BigQuery calls it NEWLINE_DELIMITED_JSON but it is the same as JSONL. Each line is one row in your table.
The key pattern: one line equals one record. Every tool that processes JSONL follows this rule. Read line-by-line, parse each line as JSON, process the record, move to the next line.
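The read, parse, process loop can be wrapped in a small generator. This is a minimal sketch using only the standard library; the name `iter_jsonl` is an assumption, and skipping blank lines is a leniency choice that not every JSONL consumer makes.

```python
import json


def iter_jsonl(lines):
    """Yield one parsed record per non-blank line.

    `lines` can be an open text file or any iterable of strings.
    """
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report which line is broken instead of failing anonymously.
            raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc
```

With a file, this would be used as `for record in iter_jsonl(open("file.jsonl", encoding="utf-8")): ...`, so only one record is held in memory at a time.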
The tool expects a JSON array as input like [{...}, {...}]. If your JSON is a single object like {"name": "Alice"}, wrap it manually: [{"name": "Alice"}]. Then convert.
If your JSON is nested like {"users": [{...}, {...}], "meta": {...}}, extract the array you want first. Copy just the users array part and paste that into the converter. Or use the JSON Flattener tool to restructure it.
The converter splits the top-level array into lines. If your data is not an array at the top level, you need to extract or wrap it first. This is a one-time manual step before conversion.
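If you would rather script the extract-or-wrap step than do it by hand, something like the following works. It is a sketch of the same idea, not the converter's own code; `to_records`, `to_jsonl`, and the `key` parameter are names invented for this example.

```python
import json


def to_records(data, key=None):
    """Return the list of records to write, one per JSONL line."""
    if key is not None:
        # Pull a nested array out, e.g. "users" from
        # {"users": [...], "meta": {...}}.
        data = data[key]
    if isinstance(data, list):
        return data
    # Wrap a single object so it becomes a one-line file.
    return [data]


def to_jsonl(data, key=None):
    """Serialize the selected records as JSONL text."""
    return "".join(
        json.dumps(record, ensure_ascii=False) + "\n"
        for record in to_records(data, key)
    )
```

For example, `to_jsonl({"users": [{"id": 1}, {"id": 2}], "meta": {}}, key="users")` produces two lines, one per user, and drops the `meta` object.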
Hand-picked guides to go deeper

Parse JSON in Python using json.loads() for strings and json.load() for files. Complete guide with code examples, error handling, nested data, and real-world use cases.

Read JSON files in JavaScript using Fetch API, Node.js fs module, and ES6 imports. Complete guide with code examples for browser and server environments, error handling, and large file processing.