How to Generate 1 Million JSON Records for Load Testing

June 15, 2026

Load testing is one of those tasks that sounds simple but is actually painful to set up. The hardest part isn't running the test — it's getting a realistic, large-scale dataset to feed into it.

This guide covers how to generate 1 million JSON records (or any large dataset) for load testing, database stress testing, and performance benchmarking, either with no code at all or with a short script, and without spending money.

Why Load Testing Needs Large, Realistic Datasets

A common mistake is load-testing your API with the same 100 records repeated 10,000 times. The problem: your database's query planner, indexes, and caches behave very differently with 1,000,000 unique records than with 100 distinct records duplicated up to the same row count.

Real load testing requires:

Volume — Enough records to stress the parts of your system you actually care about: query performance, index efficiency, memory usage, connection pooling.
Variety — Diverse data that triggers different code paths. A search endpoint tested with 1,000 identical names won't reveal how it performs with mixed-length strings, Unicode, or null values.
Realistic distribution — Some fields should cluster (e.g., most users in a few cities) while others should be uniformly distributed (e.g., timestamps spread over 2 years). Cookie-cutter generators don't think about this.
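One way to check a dataset against these three requirements is to profile a sample of it before running any test. The sketch below is a minimal illustration in plain Node.js, not part of any tool mentioned here; the function name profileFields and the sample records are hypothetical. It reports, per field, the share of distinct values, the null rate, and the range of string lengths.

```javascript
// Summarize how varied each field of a dataset is.
// `records` is an array of plain objects; field names come from your own schema.
function profileFields(records) {
  const stats = {};
  for (const record of records) {
    for (const [field, value] of Object.entries(record)) {
      const s = (stats[field] ??= { distinct: new Set(), nulls: 0, minLen: Infinity, maxLen: 0 });
      if (value === null || value === undefined) {
        s.nulls += 1;
        continue;
      }
      s.distinct.add(value);
      const len = String(value).length;
      s.minLen = Math.min(s.minLen, len);
      s.maxLen = Math.max(s.maxLen, len);
    }
  }
  return Object.fromEntries(
    Object.entries(stats).map(([field, s]) => [
      field,
      {
        distinctRatio: s.distinct.size / records.length,
        nullRate: s.nulls / records.length,
        minLen: s.minLen === Infinity ? 0 : s.minLen,
        maxLen: s.maxLen,
      },
    ])
  );
}

// Hypothetical sample: one record repeated three times, plus one varied record.
const sample = [
  { name: 'Ann', city: 'Oslo' },
  { name: 'Ann', city: 'Oslo' },
  { name: 'Ann', city: 'Oslo' },
  { name: 'Zoë Müller-Østergaard', city: null },
];
console.log(profileFields(sample));
```

A distinctRatio near zero on a field you search or index by suggests the dataset is closer to "identical records repeated" than to real traffic. For a full 1M-record file, run this on a sample of a few thousand records so the sets of distinct values stay small.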

Method 1: Dummy JSON Generator (Easiest — No Code)

Dummy JSON Generator is a free browser-based tool that supports up to 1,000,000 records in a single generation run. It's entirely client-side — generation happens in your browser, so there are no server rate limits.

Step 1: Open the tool

Visit dummyjsongenerator.com/tool. No account needed.

Step 2: Build your schema

Add the fields that match your actual data model. For a typical user-events table, you might use:

Field Name	Type
id	UUID
userId	UUID
eventType	One of: pageview, click, purchase, signup
timestamp	DateTime (past 2 years)
sessionId	UUID
ipAddress	IP Address
country	Country
deviceType	One of: mobile, desktop, tablet

Step 3: Set record count to 1,000,000

Drag the slider to 1M or type the value directly. The tool will generate all records in-browser using JavaScript.

Performance note: Generating 1M records takes 10–30 seconds depending on your machine. Keep your field count reasonable (under 15 fields) for the fastest generation.

Step 4: Download as JSON or CSV

Click Download. For 1M records, the output JSON file will be roughly 200–600MB depending on field types and values. Use CSV if you need a more compact format (~50–150MB for 1M rows).
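To see where a size estimate like this comes from, you can measure one representative record and multiply. The sketch below assumes compact (non-indented) JSON, one record per line, and made-up sample values for the schema from Step 2; estimateJsonBytes is a hypothetical helper name.

```javascript
// Estimate output size from one representative record.
// The sample values are invented to match the schema in Step 2.
const sampleRecord = {
  id: '3f2b8c1e-9a4d-4e7b-8f21-6c5d0a9b7e34',
  userId: 'a1c4e6f8-2b3d-4c5e-9f70-1d2e3f4a5b6c',
  eventType: 'pageview',
  timestamp: '2025-03-14T09:26:53.589Z',
  sessionId: '0d9c8b7a-6f5e-4d3c-8b2a-1f0e9d8c7b6a',
  ipAddress: '203.0.113.42',
  country: 'Netherlands',
  deviceType: 'desktop',
};

function estimateJsonBytes(record, count) {
  // +2 bytes per record for the separating comma and newline
  const perRecord = Buffer.byteLength(JSON.stringify(record), 'utf8') + 2;
  return perRecord * count;
}

const bytes = estimateJsonBytes(sampleRecord, 1_000_000);
console.log(`~${(bytes / 1e6).toFixed(0)} MB for 1M records`);
```

For this sample the estimate falls inside the 200–600MB range quoted above. Pretty-printed JSON, longer strings, or more fields push the figure up.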

Method 2: Faker.js Script (Most Flexible)

For datasets that need to be regenerated frequently, or where you need custom logic (conditional fields, referential integrity across tables), write a Node.js script with Faker.js:

import { faker } from '@faker-js/faker';
import { createWriteStream } from 'node:fs';
import { once } from 'node:events';

const RECORD_COUNT = 1_000_000;
const OUTPUT_FILE = 'load-test-data.json';

const EVENT_TYPES = ['pageview', 'click', 'purchase', 'signup', 'logout'];
const DEVICE_TYPES = ['mobile', 'desktop', 'tablet'];

const stream = createWriteStream(OUTPUT_FILE);
stream.write('[\n');

for (let i = 0; i < RECORD_COUNT; i++) {
  const record = {
    id: faker.string.uuid(),
    userId: faker.string.uuid(),
    eventType: faker.helpers.arrayElement(EVENT_TYPES),
    timestamp: faker.date.between({
      from: '2024-01-01',
      to: '2026-01-01'
    }).toISOString(),
    sessionId: faker.string.uuid(),
    ipAddress: faker.internet.ipv4(),
    country: faker.location.country(),
    deviceType: faker.helpers.arrayElement(DEVICE_TYPES),
  };

  const isLast = i === RECORD_COUNT - 1;
  const line = JSON.stringify(record) + (isLast ? '\n' : ',\n');

  // write() returns false once the internal buffer is full; wait for 'drain'
  // so records are flushed to disk instead of piling up in memory
  if (!stream.write(line)) {
    await once(stream, 'drain');
  }

  // Log progress every 100k records
  if ((i + 1) % 100_000 === 0) {
    console.log(`Generated ${(i + 1).toLocaleString()} records...`);
  }
}

// Close the JSON array and wait until everything has been flushed to disk
stream.end(']');
await once(stream, 'finish');
console.log(`Done. Output: ${OUTPUT_FILE}`);

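Install the dependency with npm install @faker-js/faker, save the script as generate.mjs (the .mjs extension tells Node.js to treat the file as an ES module, which the import statements require), and run with: node generate.mjs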

This streams directly to disk to avoid memory issues with 1M records in RAM.

Performance: Expect roughly 100k records/second, so about 10 seconds for 1M records. The exact rate depends on your machine and on how many fields you generate.
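The script above gives every event a fresh random userId, so no two events share a user. If you need the custom logic mentioned earlier (referential integrity across tables, conditional fields), one possible approach is sketched below in plain JavaScript. USER_POOL_SIZE, the ID formats, and the amount field are illustrative assumptions, not part of the original schema.

```javascript
// Sketch: events that reference a fixed pool of users (referential integrity)
// and carry an extra field only for some event types (conditional field).
const USER_POOL_SIZE = 50_000; // assumed size of the users table
const EVENT_TYPES = ['pageview', 'click', 'purchase', 'signup', 'logout'];

// Build the "users" side once; write these IDs to their own file or table.
const userIds = Array.from(
  { length: USER_POOL_SIZE },
  (_, i) => `user-${String(i).padStart(6, '0')}`
);

function pick(items) {
  return items[Math.floor(Math.random() * items.length)];
}

function makeEvent(index) {
  const eventType = pick(EVENT_TYPES);
  const event = {
    id: `event-${index}`,
    userId: pick(userIds), // always an ID that exists in the users set
    eventType,
  };
  if (eventType === 'purchase') {
    // Conditional field: only purchases carry an amount (in cents)
    event.amount = 100 + Math.floor(Math.random() * 49_900);
  }
  return event;
}

const events = Array.from({ length: 1_000 }, (_, i) => makeEvent(i));
console.log(events.slice(0, 3));
```

In the Faker.js script, the same idea means generating the user IDs once with faker.string.uuid() before the loop and picking from that array inside it, so that every event's userId can be joined against a users table.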

Using Your Dataset in Load Testing Tools

k6 (JavaScript-based load testing)

// k6-load-test.js
import http from 'k6/http';
import { check } from 'k6';
import { SharedArray } from 'k6/data';

// SharedArray parses the file once and shares it read-only across all VUs,
// instead of every VU holding its own copy of a multi-hundred-MB dataset
const data = new SharedArray('events', function () {
  return JSON.parse(open('./load-test-data.json'));
});

export const options = {
  vus: 100,        // 100 virtual users
  duration: '30s', // run for 30 seconds
};

export default function () {
  // Pick a random record from the dataset
  const record = data[Math.floor(Math.random() * data.length)];

  const res = http.post('https://your-api.com/events', JSON.stringify(record), {
    headers: { 'Content-Type': 'application/json' },
  });

  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
}

Run with: k6 run k6-load-test.js

Apache JMeter

Open JMeter → Add Thread Group → Add Sampler → HTTP Request
Add "CSV Data Set Config" → point to your downloaded CSV file
Map CSV columns to JMeter variables: ${id}, ${userId}, etc.
Use variables in your HTTP request body
Run with 100–500 threads to simulate concurrent users

Artillery

# artillery-config.yml
config:
  target: "https://your-api.com"
  phases:
    - duration: 60
      arrivalRate: 50  # 50 new users per second
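  payload:
    path: "./load-test-data.csv"
    skipHeader: true  # first CSV row holds column names, not data
    fields:           # variable names, assigned to CSV columns by position
      - "id"
      - "userId"
      - "eventType"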

scenarios:
  - flow:
      - post:
          url: "/events"
          json:
            id: "{{ id }}"
            userId: "{{ userId }}"
            eventType: "{{ eventType }}"

Run with: artillery run artillery-config.yml

Database Seeding with 1M Records

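If your goal is seeding a database rather than HTTP load testing, load the tool's SQL or CSV export directly. The CSV commands below assume an events table already exists with its columns in the same order as the CSV columns.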

PostgreSQL:

# Download the SQL file from Dummy JSON Generator
# Then pipe it directly to psql

psql -U your_user -d your_database -f generated-data.sql

# For very large files, COPY is usually much faster than row-by-row INSERTs.
# In Dummy JSON Generator, export as CSV, then:
psql -U your_user -d your_database -c "\copy events FROM 'generated-data.csv' CSV HEADER"
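If your data came from the Faker.js script rather than the tool's CSV export, it has to be converted to CSV before COPY can read it. The sketch below shows one way to do that with correct quoting, which matters as soon as a value (a country name, for example) contains a comma. The helper names csvField and toCsv are hypothetical, and the column list assumes the schema used throughout this guide.

```javascript
// Convert JSON records to CSV text that "COPY ... CSV HEADER" can read.
// COLUMNS must list the fields in the same order as the target table's columns.
const COLUMNS = ['id', 'userId', 'eventType', 'timestamp', 'sessionId', 'ipAddress', 'country', 'deviceType'];

function csvField(value) {
  // An unquoted empty field is read as NULL by PostgreSQL COPY in CSV mode
  if (value === null || value === undefined) return '';
  const text = String(value);
  // Quote fields containing a comma, quote, or line break; double embedded quotes
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(records, columns = COLUMNS) {
  const lines = [columns.join(',')];
  for (const record of records) {
    lines.push(columns.map((col) => csvField(record[col])).join(','));
  }
  return lines.join('\n') + '\n';
}

// Small demonstration with two columns only
console.log(toCsv([{ id: '1', country: 'Korea, Republic of' }], ['id', 'country']));
```

For the full 1M-record file, write each line to a stream as it is produced instead of building one large string in memory.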

MySQL:

mysql -u your_user -p your_database < generated-data.sql

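# Or for CSV via LOAD DATA. Plain LOAD DATA INFILE reads a file on the database
# server; use LOAD DATA LOCAL INFILE (with local_infile enabled) for a file on your machine.
mysql -u your_user -p your_database -e "
  LOAD DATA INFILE '/path/to/generated-data.csv'
  INTO TABLE events
  FIELDS TERMINATED BY ','
  ENCLOSED BY '\"'
  LINES TERMINATED BY '\n'
  IGNORE 1 ROWS;
"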

MongoDB:

# Export as JSON array from Dummy JSON Generator, then:
mongoimport --db your_db --collection events --file generated-data.json --jsonArray

For 1M records, mongoimport typically takes 30–60 seconds.

Tips for Realistic Load Test Data

Model your actual traffic distribution. If 70% of your users are on mobile, set your deviceType field to generate mobile 70% of the time. Most generators don't do this automatically, but you can post-process or script it.
Include "hot" records. Real databases have hotspots — a handful of records that get queried far more than others. Manually add 10–20 "hot" IDs to your dataset and reference them frequently in your load test script.
Use timestamps that span your actual expected range. If your app has been live for 18 months, generate timestamps from 18 months ago to now — not just the last 30 days. This matters for time-range queries and index efficiency.
Test with indexes both present and absent. Generate your 1M records, run your load test, then add indexes and run again. The difference tells you exactly how much your indexes help.
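The first two tips can be scripted in a few lines. The sketch below is one possible implementation in plain JavaScript; the 70/25/5 device mix, the 50% hot share, and the helper names weightedPick and pickId are assumptions to replace with figures from your own analytics.

```javascript
// Weighted choice: pick a value according to relative weights.
function weightedPick(options) {
  const total = options.reduce((sum, o) => sum + o.weight, 0);
  let roll = Math.random() * total;
  for (const o of options) {
    roll -= o.weight;
    if (roll < 0) return o.value;
  }
  return options[options.length - 1].value; // guard against floating-point drift
}

// Assumed traffic mix: 70% mobile, 25% desktop, 5% tablet
const DEVICE_MIX = [
  { value: 'mobile', weight: 70 },
  { value: 'desktop', weight: 25 },
  { value: 'tablet', weight: 5 },
];

// Hot records: a small set of IDs that receives a fixed share of lookups.
function pickId(allIds, hotIds, hotShare = 0.5) {
  const pool = Math.random() < hotShare ? hotIds : allIds;
  return pool[Math.floor(Math.random() * pool.length)];
}

// Quick check that the generated mix matches the intended weights
const counts = { mobile: 0, desktop: 0, tablet: 0 };
for (let i = 0; i < 100_000; i++) counts[weightedPick(DEVICE_MIX)] += 1;
console.log(counts);
```

Use weightedPick in place of a uniform random choice when generating deviceType, and pickId in the load test script when choosing which record to request. If you are already using Faker.js, its faker.helpers.weightedArrayElement helper accepts the same { weight, value } shape.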

Summary

Generating 1 million JSON records for load testing doesn't require a paid tool or a complex script. Dummy JSON Generator is currently the only free browser-based tool that supports 1M records with no account or row-limit restrictions — making it the fastest path from zero to load-test-ready data.

For datasets that need regeneration as part of a CI/CD pipeline, Faker.js with a streaming Node.js script is the most maintainable long-term solution.

Either way, you can have a 1M-record dataset ready in under 5 minutes.