Streaming JSON
For large JSON files that do not fit in memory, use ijson for incremental parsing.
import ijson

# Parse a 1 GB JSON file without loading it all into memory.
# The "users.item" prefix assumes the file has a top-level "users" array: {"users": [...]}.
with open("large.json", "rb") as f:
    for item in ijson.items(f, "users.item"):
        process_user(item)  # each user dict is yielded one at a time
# Alternative: JSONL (one JSON object per line)
import json

with open("data.jsonl") as f:
    for line in f:
        record = json.loads(line)
        process(record)
JSONL (newline-delimited JSON) is often a better fit than one huge JSON array: each line is a self-contained record, so the file can be streamed line by line and appended to without re-parsing or rewriting the existing data.
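As a minimal sketch of the appending side, the helper below writes one record per line; the append_record name and the record shape are illustrative, not part of any library API.

import json

def append_record(path, record):
    # Appending a single line adds a record; the rest of the file
    # never needs to be read back or rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

append_record("data.jsonl", {"id": 1, "name": "Ada"})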