How to Generate 1 Million JSON Records for Load Testing
Load testing is one of those tasks that sounds simple but is actually painful to set up. The hardest part isn't running the test — it's getting a realistic, large-scale dataset to feed into it.
This guide covers how to generate 1 million JSON records (or any large dataset) for load testing, database stress testing, and performance benchmarking, either with a free no-code tool or with a short script, and without spending money.
Why Load Testing Needs Large, Realistic Datasets
A common mistake is load-testing your API with the same 100 records repeated 10,000 times. The problem: your database's query planner, indexes, and caches behave very differently with 1,000,000 unique records than with 100 distinct records duplicated 10,000 times.
Real load testing requires:
- Volume — Enough records to stress the parts of your system you actually care about: query performance, index efficiency, memory usage, connection pooling.
- Variety — Diverse data that triggers different code paths. A search endpoint tested with 1,000 identical names won't reveal how it performs with mixed-length strings, Unicode, or null values.
- Realistic distribution — Some fields should cluster (e.g., most users in a few cities) while others should be uniformly distributed (e.g., timestamps spread over 2 years). Cookie-cutter generators don't think about this.
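One way to check the "variety" requirement before running a test is to count distinct values per field. The following is a minimal sketch, not part of any tool mentioned here; the function name `cardinalityReport` and the sample data are hypothetical.

```javascript
// Count distinct values per field to spot low-variety datasets.
function cardinalityReport(records) {
  const seen = new Map(); // field name -> Set of serialized values
  for (const record of records) {
    for (const [field, value] of Object.entries(record)) {
      if (!seen.has(field)) seen.set(field, new Set());
      // JSON.stringify keeps null, numbers, and strings distinguishable
      seen.get(field).add(JSON.stringify(value));
    }
  }
  const report = {};
  for (const [field, values] of seen) {
    report[field] = {
      distinct: values.size,
      ratio: values.size / records.length, // 1 means every row is unique
    };
  }
  return report;
}

// Hypothetical sample: 100 distinct names spread over 1,000 rows
const repeated = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  name: `user-${i % 100}`,
}));
console.log(cardinalityReport(repeated));
```

A low ratio on a field you expect to be unique (IDs, emails) suggests the dataset is closer to the "repeated records" case than to real traffic.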
Method 1: Dummy JSON Generator (Easiest — No Code)
Dummy JSON Generator is a free browser-based tool that supports up to 1,000,000 records in a single generation run. It's entirely client-side — generation happens in your browser, so there are no server rate limits.
Step 1: Open the tool
Visit dummyjsongenerator.com/tool. No account needed.
Step 2: Build your schema
Add the fields that match your actual data model. For a typical user-events table, you might use:
| Field Name | Type |
|---|---|
| id | UUID |
| userId | UUID |
| eventType | One of: pageview, click, purchase, signup |
| timestamp | DateTime (past 2 years) |
| sessionId | UUID |
| ipAddress | IP Address |
| country | Country |
| deviceType | One of: mobile, desktop, tablet |
Step 3: Set record count to 1,000,000
Drag the slider to 1M or type the value directly. The tool will generate all records in-browser using JavaScript.
Performance note: Generating 1M records takes 10–30 seconds depending on your machine. Keep your field count reasonable (under 15 fields) for the fastest generation.
Step 4: Download as JSON or CSV
Click Download. For 1M records, the output JSON file will be roughly 200–600MB depending on field types and values. Use CSV if you need a more compact format (~50–150MB for 1M rows).
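To sanity-check the file-size range for your own schema, you can multiply the serialized size of one representative record by the record count. This is a rough sketch under stated assumptions: the sample values below are made up, and the estimate assumes compact (non-pretty-printed) JSON with one record per line.

```javascript
// Hypothetical sample record matching the schema from Step 2
const sample = {
  id: '3f2b8c1e-9d4a-4f6b-8a2e-1c5d7e9f0a1b',
  userId: '7a1c9e2d-4b6f-4d8a-9c3e-2f5b8d0e6a4c',
  eventType: 'pageview',
  timestamp: '2025-03-14T09:26:53.589Z',
  sessionId: 'c4e8a2f6-1b3d-4c5e-8f7a-9d0b2e4c6a8f',
  ipAddress: '203.0.113.42',
  country: 'Germany',
  deviceType: 'mobile',
};

function estimateJsonSizeMB(record, count) {
  // Bytes of the compact JSON form, plus 2 bytes for the ",\n" separator
  const bytesPerRecord = Buffer.byteLength(JSON.stringify(record), 'utf8') + 2;
  return (bytesPerRecord * count) / (1024 * 1024);
}

console.log(`${estimateJsonSizeMB(sample, 1_000_000).toFixed(0)} MB`);
```

Pretty-printed output with indentation will be noticeably larger than this estimate, and longer field values (free text, long country names) push the size up further.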
Method 2: Faker.js Script (Most Flexible)
For datasets that need to be regenerated frequently, or where you need custom logic (conditional fields, referential integrity across tables), write a Node.js script with Faker.js:
import { faker } from '@faker-js/faker';
import { createWriteStream } from 'fs';
import { once } from 'events';

const RECORD_COUNT = 1_000_000;
const OUTPUT_FILE = 'load-test-data.json';
const EVENT_TYPES = ['pageview', 'click', 'purchase', 'signup', 'logout'];
const DEVICE_TYPES = ['mobile', 'desktop', 'tablet'];

const stream = createWriteStream(OUTPUT_FILE);
stream.write('[\n');

for (let i = 0; i < RECORD_COUNT; i++) {
  const record = {
    id: faker.string.uuid(),
    userId: faker.string.uuid(),
    eventType: faker.helpers.arrayElement(EVENT_TYPES),
    timestamp: faker.date.between({
      from: '2024-01-01',
      to: '2026-01-01'
    }).toISOString(),
    sessionId: faker.string.uuid(),
    ipAddress: faker.internet.ipv4(),
    country: faker.location.country(),
    deviceType: faker.helpers.arrayElement(DEVICE_TYPES),
  };
  const isLast = i === RECORD_COUNT - 1;
  const line = JSON.stringify(record) + (isLast ? '\n' : ',\n');
  // Respect backpressure: when the write buffer is full, wait for it to
  // drain so pending records do not pile up in memory
  if (!stream.write(line)) {
    await once(stream, 'drain');
  }
  // Log progress every 100k records
  if ((i + 1) % 100_000 === 0) {
    console.log(`Generated ${(i + 1).toLocaleString()} records...`);
  }
}

// Close the JSON array and wait until everything is flushed to disk
stream.end(']');
await once(stream, 'finish');
console.log(`Done. Output: ${OUTPUT_FILE}`);
Run with: node generate.mjs (or node generate.js if your package.json sets "type": "module"). The script uses ES module imports and top-level await, so no experimental flags are needed.
This streams directly to disk to avoid memory issues with 1M records in RAM.
Performance: Expect roughly 100k records/second, so 1M records takes about 10 seconds. Actual throughput depends on your hardware and on how many fields each record has.
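The custom logic mentioned for this method (conditional fields, referential integrity across tables) can be sketched without any library. This is an illustrative example only: the ID formats, the `amountCents` field, and the helper names are assumptions, and in a real script the values would come from Faker.js.

```javascript
// Pool of user records that events are allowed to reference (hypothetical ID format)
function makeUsers(count) {
  return Array.from({ length: count }, (_, i) => ({ id: `user-${i + 1}` }));
}

function pick(items) {
  return items[Math.floor(Math.random() * items.length)];
}

function makeEvent(users, index) {
  const eventType = pick(['pageview', 'click', 'purchase', 'signup']);
  const event = {
    id: `event-${index + 1}`,
    userId: pick(users).id, // referential integrity: always an existing user
    eventType,
  };
  // Conditional field: only purchases carry an amount (in cents)
  if (eventType === 'purchase') {
    event.amountCents = 100 + Math.floor(Math.random() * 9900);
  }
  return event;
}

const users = makeUsers(50);
const events = Array.from({ length: 500 }, (_, i) => makeEvent(users, i));
console.log(events.slice(0, 3));
```

Generating the parent table first and sampling from it keeps every foreign key valid, which matters if you load both tables into a database with constraints enabled.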
Using Your Dataset in Load Testing Tools
k6 (JavaScript-based load testing)
// k6-load-test.js
import http from 'k6/http';
import { check } from 'k6';
import { SharedArray } from 'k6/data';

// Load the dataset once and share it across VUs instead of copying it per VU.
// k6 reads local files with open(), which only works in the init context.
const data = new SharedArray('events', function () {
  return JSON.parse(open('./load-test-data.json'));
});

export const options = {
  vus: 100,        // 100 virtual users
  duration: '30s', // run for 30 seconds
};

export default function () {
  // Pick a random record from the dataset
  const record = data[Math.floor(Math.random() * data.length)];
  const res = http.post('https://your-api.com/events', JSON.stringify(record), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
}
Run with: k6 run k6-load-test.js
Apache JMeter
- Open JMeter → Add Thread Group → Add Sampler → HTTP Request
- Add "CSV Data Set Config" → point to your downloaded CSV file
- Map CSV columns to JMeter variables: ${id}, ${userId}, etc.
- Use variables in your HTTP request body
- Run with 100–500 threads to simulate concurrent users
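If you generated JSON with a script but need CSV for a CSV-driven tool such as JMeter's CSV Data Set Config, the conversion is small. The sketch below is one possible approach, with hypothetical helper names; it quotes values that contain commas, quotes, or line breaks, and writes null as an empty cell.

```javascript
function escapeCsvValue(value) {
  if (value === null || value === undefined) return '';
  const text = String(value);
  // Quote values containing delimiters, quotes, or line breaks; double inner quotes
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(records, fields) {
  const header = fields.join(',');
  const rows = records.map((record) =>
    fields.map((field) => escapeCsvValue(record[field])).join(','),
  );
  return [header, ...rows].join('\n');
}

const csv = toCsv(
  [{ id: 'a1', country: 'Korea, Republic of', deviceType: null }],
  ['id', 'country', 'deviceType'],
);
console.log(csv);
```

For a full 1M-record file, it is safer to write each row to a stream as it is produced than to build one large string in memory as this sketch does.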
Artillery
# artillery-config.yml
config:
  target: "https://your-api.com"
  phases:
    - duration: 60
      arrivalRate: 50 # 50 new users per second
  payload:
    path: "./load-test-data.csv"
    skipHeader: true # the exported CSV starts with a header row
    fields:
      - "id"
      - "userId"
      - "eventType"
scenarios:
  - flow:
      - post:
          url: "/events"
          json:
            id: "{{ id }}"
            userId: "{{ userId }}"
            eventType: "{{ eventType }}"
Run with: artillery run artillery-config.yml
Database Seeding with 1M Records
If your goal is seeding a database rather than HTTP load testing, here's how to use the SQL export:
PostgreSQL:
# Download the SQL file from Dummy JSON Generator
# Then pipe it directly to psql
psql -U your_user -d your_database -f generated-data.sql
# For very large files, bulk loading with \copy is usually much faster than row-by-row INSERTs.
# In Dummy JSON Generator, export as CSV, then:
psql -U your_user -d your_database -c "\copy events FROM 'generated-data.csv' CSV HEADER"
MySQL:
mysql -u your_user -p your_database < generated-data.sql
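# Or for CSV via LOAD DATA (the file must be readable by the MySQL server;
# use LOAD DATA LOCAL INFILE if the file is on the client machine):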
mysql -u your_user -p your_database -e "
LOAD DATA INFILE '/path/to/generated-data.csv'
INTO TABLE events
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
"
MongoDB:
# Export as JSON array from Dummy JSON Generator, then:
mongoimport --db your_db --collection events --file generated-data.json --jsonArray
For 1M records, mongoimport typically takes 30–60 seconds.
Tips for Realistic Load Test Data
- Model your actual traffic distribution. If 70% of your users are on mobile, set your deviceType field to generate mobile 70% of the time. Most generators don't do this automatically, but you can post-process or script it.
- Include "hot" records. Real databases have hotspots — a handful of records that get queried far more than others. Manually add 10–20 "hot" IDs to your dataset and reference them frequently in your load test script.
- Use timestamps that span your actual expected range. If your app has been live for 18 months, generate timestamps from 18 months ago to now — not just the last 30 days. This matters for time-range queries and index efficiency.
- Test with indexes both present and absent. Generate your 1M records, run your load test, then add indexes and run again. The difference tells you exactly how much your indexes help.
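The first three tips can be scripted in a few helper functions. This is a sketch under stated assumptions: the 70% mobile share comes from the example above, while the desktop/tablet split, the 50% hot-record share, and all function names are hypothetical.

```javascript
// Pick a value according to weights, e.g. [['mobile', 0.7], ...]
function weightedPick(weighted, random = Math.random) {
  const total = weighted.reduce((sum, [, weight]) => sum + weight, 0);
  let threshold = random() * total;
  for (const [value, weight] of weighted) {
    threshold -= weight;
    if (threshold < 0) return value;
  }
  return weighted[weighted.length - 1][0]; // guard against rounding error
}

// Return a "hot" ID with probability hotShare, otherwise any ID
function pickId(hotIds, allIds, hotShare = 0.5, random = Math.random) {
  const pool = random() < hotShare ? hotIds : allIds;
  return pool[Math.floor(random() * pool.length)];
}

// Uniform timestamp between `monthsBack` months ago and now
function randomTimestamp(monthsBack, now = new Date(), random = Math.random) {
  const start = new Date(now);
  start.setMonth(start.getMonth() - monthsBack);
  const ms = start.getTime() + random() * (now.getTime() - start.getTime());
  return new Date(ms).toISOString();
}

// Assumed split: only the 70% mobile figure comes from the tip above
const DEVICE_WEIGHTS = [['mobile', 0.7], ['desktop', 0.25], ['tablet', 0.05]];
console.log(weightedPick(DEVICE_WEIGHTS), randomTimestamp(18));
```

Passing `random` as a parameter makes it possible to plug in a seeded generator, so the same skewed dataset can be reproduced across test runs.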
Summary
Generating 1 million JSON records for load testing doesn't require a paid tool or a complex script. Dummy JSON Generator is a free browser-based tool that supports 1M records with no account or row-limit restrictions, which makes it a fast path from zero to load-test-ready data.
For datasets that need regeneration as part of a CI/CD pipeline, Faker.js with a streaming Node.js script is the most maintainable long-term solution.
Either way, you can have a 1M-record dataset ready in under 5 minutes.