6 min read
By Imad Uddin
Tags: split json file into multiple files, how to split json in python, split large json, split json using jq, json file splitter, split nested json file, online json splitter, split json by size, json tools for developers, chunk json file python, how-to guide, tutorial, json operations

How to Split JSON Files: Free Tool + Python + Command Line (2026)

Splitting a JSON file means breaking one large file into multiple smaller files. You need this when files are too large for editors, exceed upload limits, or need parallel processing. There are three main methods: Python for automation, an online tool for one-off tasks, and jq on the command line.

If you need to do this once: use the JSON Splitter at merge-json-files.com/json-file-splitter. If you need to automate it: use Python below.

Method 1: Split JSON File in Python

Best for automation and custom logic.

Basic Array Splitting

Use this when you have a large array of objects and want to split by record count.

import json

with open('large_file.json') as f:
    data = json.load(f)

Once the data is loaded, split it into chunks:

chunk_size = 100  # records per file

for i in range(0, len(data), chunk_size):
    chunk = data[i:i+chunk_size]
    with open(f'chunk_{i//chunk_size + 1}.json', 'w') as f:
        json.dump(chunk, f, indent=4)

Output: creates chunk_1.json, chunk_2.json, etc., each containing up to 100 records.

Making It Reusable with argparse

If you split JSON files frequently, it is worth turning the script into a small command-line tool with argparse:

import argparse
import json

parser = argparse.ArgumentParser(description='Split a JSON array into multiple files')
parser.add_argument('input_file', help='Path to the input JSON file')
parser.add_argument('--size', type=int, default=100, help='Number of records per output file')
parser.add_argument('--output-dir', default='.', help='Directory for output files')
args = parser.parse_args()

with open(args.input_file) as f:
    data = json.load(f)

for i in range(0, len(data), args.size):
    chunk = data[i:i+args.size]
    output_path = f'{args.output_dir}/chunk_{i//args.size + 1}.json'
    with open(output_path, 'w') as f:
        json.dump(chunk, f, indent=2)
    print(f'Wrote {len(chunk)} records to {output_path}')

print(f'\nDone. Split {len(data)} records into {-(-len(data)//args.size)} files.')

Now you can run it like:

python split_json.py data.json --size 500 --output-dir ./chunks

Splitting by File Size Instead of Record Count

Use this when upload limits specify maximum file size (e.g., 5 MB API limit).

import json
import os

def split_by_size(input_file, max_size_mb=5):
    """Split a JSON array into files that don't exceed max_size_mb."""
    with open(input_file) as f:
        data = json.load(f)

    max_bytes = max_size_mb * 1024 * 1024
    current_chunk = []
    chunk_num = 1

    for record in data:
        current_chunk.append(record)
        # Check size of current chunk
        chunk_json = json.dumps(current_chunk, indent=2)
        if len(chunk_json.encode('utf-8')) >= max_bytes:
            # Remove last record, write chunk, start new one
            current_chunk.pop()
            with open(f'chunk_{chunk_num}.json', 'w') as f:
                json.dump(current_chunk, f, indent=2)
            print(f'chunk_{chunk_num}.json: {len(current_chunk)} records')
            current_chunk = [record]
            chunk_num += 1

    # Write remaining records
    if current_chunk:
        with open(f'chunk_{chunk_num}.json', 'w') as f:
            json.dump(current_chunk, f, indent=2)
        print(f'chunk_{chunk_num}.json: {len(current_chunk)} records')

split_by_size('large_data.json', max_size_mb=5)

Output: files stay under specified size limit. Record count varies per chunk.

Handling Very Large Files with Streaming

Use this for gigabyte-scale JSON files that exceed available RAM.

Install the ijson streaming parser first (pip install ijson), then use it to read the array one record at a time:

import ijson
import json

def split_large_json(input_file, chunk_size=1000):
    """Stream-split a large JSON array without loading it all into memory."""
    chunk = []
    chunk_num = 1

    with open(input_file, 'rb') as f:
        for item in ijson.items(f, 'item'):
            chunk.append(item)
            if len(chunk) >= chunk_size:
                with open(f'chunk_{chunk_num}.json', 'w') as out:
                    json.dump(chunk, out, indent=2)
                print(f'Wrote chunk_{chunk_num}.json ({len(chunk)} records)')
                chunk = []
                chunk_num += 1

    if chunk:
        with open(f'chunk_{chunk_num}.json', 'w') as out:
            json.dump(chunk, out, indent=2)
        print(f'Wrote chunk_{chunk_num}.json ({len(chunk)} records)')

split_large_json('massive_file.json', chunk_size=5000)

Output: minimal memory usage regardless of input file size. Reads one record at a time.

Method 2: Use the Online JSON Splitter

No-code option for quick one-off tasks. Runs entirely in your browser.

Try it here: JSON File Splitter

How to use it:

  1. Upload your JSON file.
  2. Choose the number of records per file.
  3. Click Split.
  4. Download the resulting files.

It is especially useful for one-off splits where writing a script would be overkill.

Privacy

The tool processes everything locally in the browser. The file content does not get uploaded to a server.

It also validates JSON before splitting and handles output file naming automatically.

Method 3: Command Line with jq (Linux/macOS)

Fast JSON processing for terminal users.

Install jq

sudo apt install jq   # Debian/Ubuntu
brew install jq       # macOS

Split a JSON array into chunks

jq -c '.[]' large_file.json | split -l 100 - chunk_

This converts each array element into a single line, then uses split to create files of 100 lines each. Output files are named chunk_aa, chunk_ab, continuing alphabetically.

Convert each split file back into a proper JSON array

for file in chunk_*; do
  jq -s '.' "$file" > "$file.json"
  rm "$file"
  mv "$file.json" "$file"
done

Split into a specific number of files

If you want exactly 10 output files regardless of how many records there are:

total=$(jq 'length' large_file.json)
chunk_size=$(( (total + 9) / 10 ))
jq -c '.[]' large_file.json | split -l $chunk_size - part_

Extract specific ranges

# Get records 0-99
jq '.[0:100]' large_file.json > first_100.json

# Get records 100-199
jq '.[100:200]' large_file.json > second_100.json

jq is extremely fast, works natively on Linux and macOS, and fits well into shell scripts and automation pipelines.

The syntax takes some getting used to, but once you are familiar with it, surprisingly complex transformations fit in a single command.
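
For example, here is a rough sketch of a slightly more involved one-liner. It assumes each record has a type field (a made-up field name for illustration) and filters records before splitting, reusing the same pipeline as above:

# Keep only records whose type is "user" (hypothetical field), then split into 100-line chunks
jq -c '.[] | select(.type == "user")' large_file.json | split -l 100 - user_chunk_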

Splitting Nested JSON Structures

Not all JSON files consist of flat arrays. Files with multiple top-level keys containing different data types require splitting by key rather than array index.

Say your file looks like this:

{
  "users": [...],
  "admins": [...],
  "settings": {...}
}

You can split it by key in Python:

import json

with open('nested.json') as f:
    data = json.load(f)

for key, value in data.items():
    with open(f'{key}.json', 'w') as f:
        json.dump(value, f, indent=2)
    print(f'Wrote {key}.json')

Creates users.json, admins.json, and settings.json. Each file contains exclusively that key's data.

More complex nested structures require splitting the inner array while preserving the parent structure:

import json

with open('nested.json') as f:
    data = json.load(f)

users = data['users']
chunk_size = 100

for i in range(0, len(users), chunk_size):
    output = {
        'users': users[i:i+chunk_size],
        'metadata': data.get('metadata', {}),
        'chunk_info': {
            'chunk_number': i // chunk_size + 1,
            'total_records': len(users),
            'records_in_chunk': len(users[i:i+chunk_size])
        }
    }
    with open(f'users_chunk_{i//chunk_size + 1}.json', 'w') as f:
        json.dump(output, f, indent=2)

This preserves the overall structure, splits the large inner array into manageable pieces, and keeps any metadata intact across all chunks.

Real World Use Case: JSON API Pagination

Many APIs return paginated results. A practical workflow collects the paginated data first, then splits it into batches for downstream processing:

import requests
import json

all_data = []
for page in range(1, 6):
    response = requests.get(f'https://api.example.com/data?page={page}')
    all_data.extend(response.json())

# Save the combined result
with open('combined.json', 'w') as f:
    json.dump(all_data, f)

# Then split it into chunks for upload to another system
chunk_size = 200
for i in range(0, len(all_data), chunk_size):
    chunk = all_data[i:i+chunk_size]
    with open(f'upload_batch_{i//chunk_size + 1}.json', 'w') as f:
        json.dump(chunk, f)

Later recombination of split files follows the same idea in reverse: merging chunks back into a single file.
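
As a rough illustration of that reverse step, here is a minimal merge sketch, assuming the upload_batch_*.json naming used above:

import glob
import json

merged = []
# Collect every chunk file and concatenate their records
# (note: sorted() is lexical, so batch_10 sorts before batch_2)
for path in sorted(glob.glob('upload_batch_*.json')):
    with open(path) as f:
        merged.extend(json.load(f))

with open('recombined.json', 'w') as f:
    json.dump(merged, f, indent=2)

print(f'Merged {len(merged)} records into recombined.json')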

Splitting JSONL Files

JSONL (JSON Lines) files split more simply, because each line already represents an independent JSON object. Standard text-splitting tools suffice:

Using split on Linux and macOS

split -l 1000 data.jsonl chunk_ --additional-suffix=.jsonl

Using Python

chunk_size = 1000
chunk_num = 1
current_chunk = []

with open('data.jsonl') as f:
    for line in f:
        current_chunk.append(line)
        if len(current_chunk) >= chunk_size:
            with open(f'chunk_{chunk_num}.jsonl', 'w') as out:
                out.writelines(current_chunk)
            current_chunk = []
            chunk_num += 1

if current_chunk:
    with open(f'chunk_{chunk_num}.jsonl', 'w') as out:
        out.writelines(current_chunk)

This approach is memory efficient: it reads line by line and handles files of any size.

Tools Comparison Table

| Method | Best For | Technical Skill | Scalability | Flexibility |
| --- | --- | --- | --- | --- |
| Python script | Automation, full control | Intermediate/Advanced | Handles any size | Fully customizable |
| Online tool | Quick one-off tasks | Beginner | Limited by browser memory | Basic splitting |
| jq (CLI) | Linux users, large files | Intermediate | Very fast | Good for arrays |

Common Errors and Fixes

JSONDecodeError. Cause: the input file contains malformed JSON (trailing commas, missing brackets, truncated exports). Fix: validate the file before splitting, correct the first syntax error, and re-run the split.

TypeError: list indices must be integers. Cause: dictionary-style access used on a list (or the opposite). Fix: confirm whether the top level is an array or an object, and adjust the splitting logic to match.

PermissionError. Cause: the output directory is not writable. Fix: choose an output directory you have access to, or adjust the folder permissions.

MemoryError. Cause: the file is too large to load into RAM with json.load(). Fix: use streaming with ijson, or split at the text level first (for JSONL).

UnicodeDecodeError. Cause: the file is not encoded as UTF-8. Fix: open the file with the correct encoding (utf-8-sig for BOM, latin-1 as a last resort), and re-save as UTF-8.
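
For the decode and encoding errors above, a small pre-flight check saves a lot of guesswork. This is a minimal sketch; the utf-8-sig fallback is one reasonable choice, not the only one:

import json

def validate_json(path):
    """Parse the file up front and report the first problem instead of failing mid-split."""
    try:
        # utf-8-sig reads plain UTF-8 too, and strips a BOM if one is present
        with open(path, encoding='utf-8-sig') as f:
            json.load(f)
    except json.JSONDecodeError as e:
        print(f'{path}: invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}')
        return False
    except UnicodeDecodeError:
        print(f'{path}: not UTF-8 encoded; re-save the file as UTF-8 first')
        return False
    print(f'{path}: valid JSON')
    return True

validate_json('large_file.json')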

Best Practices When Splitting JSON Files

  • Validate before and after. Validate the input first, then validate a few output chunks to confirm the split did not introduce syntax errors.
  • Keep backup copies. Preserve the original export so you can rerun the split if something goes wrong.
  • Use descriptive filenames. users_part_1.json communicates intent better than chunk_aa.
  • Add logging to scripts. Print records per chunk and the total so it is easy to confirm nothing was dropped (see the verification sketch after this list).
  • Check the target file size limit. Record count does not directly correlate with output size, especially with large nested objects.
  • Consider compression for transfer. Zip or tar.gz can significantly reduce transfer time and storage.
  • Preserve encoding. Output files should use the same encoding as the input, which is UTF-8 in most cases.
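
As referenced in the logging point above, here is a minimal verification sketch, assuming the chunk_*.json naming used earlier in this guide:

import glob
import json

with open('large_file.json') as f:
    original_count = len(json.load(f))

# Sum the record counts across every output chunk
chunk_count = 0
for path in sorted(glob.glob('chunk_*.json')):
    with open(path) as f:
        records = json.load(f)
    print(f'{path}: {len(records)} records')
    chunk_count += len(records)

print(f'Original: {original_count}, chunks: {chunk_count}, match: {original_count == chunk_count}')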

Recap

  • Python: best when you need automation, custom logic, and control over output naming and formatting.
  • Online splitter: best for quick one-off work when a script would be overkill.
  • jq command line: best for Linux and macOS workflows, especially when you are already in a terminal pipeline.

Pick the method that matches your file structure, size constraints, and workflow.

Final Thoughts

Splitting JSON files is mostly about two things: understanding the structure first, and choosing a method that fits your constraints (size limits, memory, and repeatability). If you validate the input before splitting and spot check a few outputs after, the process stays safe and predictable.

For a fast no-code option, use the online splitter above.

If you later need to recombine chunks into a single output, use this guide: How to Merge JSON Files.

Related Tools

Complex nested structures often benefit from a cleanup pass before splitting. JSON Flattener can simplify nested data so chunks are easier to work with.
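
If you would rather do that cleanup in code, here is one small flattening sketch (just one possible approach, collapsing nested dictionaries into dotted key paths):

def flatten(obj, parent_key=''):
    """Collapse nested dicts into a single level using dotted key paths."""
    items = {}
    for key, value in obj.items():
        new_key = f'{parent_key}.{key}' if parent_key else key
        if isinstance(value, dict):
            items.update(flatten(value, new_key))
        else:
            items[new_key] = value
    return items

# {'user': {'name': 'Ada', 'address': {'city': 'London'}}}
# becomes {'user.name': 'Ada', 'user.address.city': 'London'}
print(flatten({'user': {'name': 'Ada', 'address': {'city': 'London'}}}))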

Frequently Asked Questions

How do I split a JSON file by number of records?

Use Python with array slicing: for i in range(0, len(data), chunk_size): chunk = data[i:i+chunk_size]. Set chunk_size to your desired records per file (e.g., 100, 1000). Each output file gets exactly that many records, except the last chunk, which contains the remaining records.

Can I split a JSON file that isn't an array?

Yes. For objects with multiple keys, split by writing each key to a separate file: for key, value in data.items(): json.dump(value, open(f'{key}.json', 'w')). For nested objects with arrays inside, extract and split the inner array while preserving the parent structure.

What's the fastest way to split a large JSON file on the command line?

Use jq: jq -c '.[]' file.json | split -l 100 - chunk_ splits into 100-line chunks. For proper JSON arrays, wrap with jq -s '.' afterward. Works on Linux and macOS. Install with brew install jq or apt install jq.

How do I split a JSON file in Python without loading it all into memory?

Use the ijson library for streaming: for item in ijson.items(f, 'item'). This reads one record at a time instead of loading the entire file, which is essential for gigabyte-scale files. Install with pip install ijson. See the streaming example in Method 1 above.

How do I split a JSON file by a specific field value?

Group records by field value first: groups = {}; for record in data: key = record['category']; groups.setdefault(key, []).append(record). Then write each group to a separate file: for category, records in groups.items(): json.dump(records, open(f'{category}.json', 'w')). Works for splitting by date, region, type, etc.
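
Expanded into a runnable form (category is a stand-in field name; substitute whatever field you are grouping on):

import json
from collections import defaultdict

with open('large_file.json') as f:
    data = json.load(f)

# Group records by the value of the chosen field
groups = defaultdict(list)
for record in data:
    groups[record['category']].append(record)

# Write one file per distinct field value
# (assumes the values are safe to use as filenames)
for category, records in groups.items():
    with open(f'{category}.json', 'w') as f:
        json.dump(records, f, indent=2)
    print(f'{category}.json: {len(records)} records')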
