WaChat to PDF

How to Export WhatsApp Chat as JSON (AI-Ready Format)

WaChat to PDF's Pro plan includes AI-ready JSON export - a structured version of your WhatsApp chat that can be used for AI analysis, data pipelines, or CRM imports.

WhatsApp's own export format is a plain text file designed for human reading, not machine processing. Feeding that raw text into an AI model or data pipeline requires significant parsing work, and the results are often unreliable. WaChat to PDF's AI-ready JSON export solves this by converting the chat into a clean, structured format where every message is a properly typed object with consistent fields - ready to load directly into ChatGPT, a Python script, or a CRM import tool.

What Is AI-Ready JSON Export?

The AI-ready JSON export is a machine-readable representation of the same WhatsApp conversation that WaChat to PDF uses to generate the PDF output. Instead of rendering messages as formatted text on a page, the export serialises every message as a structured JSON object with explicit fields for sender, timestamp, message type, and content. This makes the data suitable for ingestion by any system that can parse JSON - which is effectively every modern programming language, AI platform, and business tool.

The format differs fundamentally from the raw .txt file that WhatsApp produces. The raw export uses a loose, locale-dependent text format where dates and sender names are mixed into a single line of text with varying delimiters. The JSON export separates every attribute into its own field and normalises timestamps to ISO 8601 format, removing the ambiguity that makes raw WhatsApp text unreliable as a data source.
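The ambiguity is easy to show with a short Python sketch. The raw line format used here (day/month/year, then a comma and the time) is only an assumed illustration of one locale's output; the actual layout of a raw WhatsApp export varies by device and region.

```python
from datetime import datetime

# Assumed example of a date as it might appear in a raw .txt export.
raw_date = "03/04/2024, 14:05"

# The same string is valid under two different locale conventions.
as_day_first = datetime.strptime(raw_date, "%d/%m/%Y, %H:%M")    # 3 April 2024
as_month_first = datetime.strptime(raw_date, "%m/%d/%Y, %H:%M")  # 4 March 2024

# An ISO 8601 string has exactly one reading.
iso_timestamp = datetime.fromisoformat("2024-04-03T14:05:00")

print(as_day_first.date(), as_month_first.date(), iso_timestamp.date())
```

Nothing in the raw string itself says which reading is correct, which is why a normalised timestamp field is safer to build on.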

What the JSON Output Contains

Message fields

Each message in the JSON output is represented as an object containing: sender (the display name or phone number of the message author), timestamp (ISO 8601 datetime string), messageType (one of text, image, video, audio, document, sticker, location, contact, or system), content (the text body of the message, or a description for non-text types), and mediaRef (the filename of any attached media file, where present). System messages - such as 'Alice added Bob' or 'Messages and calls are end-to-end encrypted' - are included with messageType set to system.

Metadata

A top-level metadata object accompanies the messages array and contains: participants (array of all unique sender names in the chat), dateRange (start and end ISO date strings), totalMessages (integer count), and exportedAt (ISO datetime when the conversion was run). This metadata is particularly useful when building dashboards or passing context to an AI model - you can include the metadata block in your AI prompt to give the model a summary of the conversation before it reads the individual messages.
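As a rough illustration, the structure described above might look like the following when loaded in Python. The field names come from the descriptions in this article, but the top-level key names (`metadata`, `messages`), the `start`/`end` keys inside `dateRange`, and all sample values are assumptions; check them against a real export before relying on them. The `check_export` helper is hypothetical and simply sanity-checks a loaded file.

```python
import json

# Hypothetical sample document. Key names "metadata", "messages", "start"
# and "end" are assumptions; the remaining field names follow the article.
sample_export = {
    "metadata": {
        "participants": ["Alice", "Bob"],
        "dateRange": {"start": "2024-03-01", "end": "2024-03-01"},
        "totalMessages": 3,
        "exportedAt": "2024-03-05T10:00:00",
    },
    "messages": [
        {"sender": "Alice", "timestamp": "2024-03-01T09:00:00",
         "messageType": "text", "content": "Morning!", "mediaRef": None},
        {"sender": "Bob", "timestamp": "2024-03-01T09:02:30",
         "messageType": "image", "content": "Photo", "mediaRef": "IMG-0001.jpg"},
        {"sender": "Alice", "timestamp": "2024-03-01T09:10:00",
         "messageType": "text", "content": "Thanks", "mediaRef": None},
    ],
}

REQUIRED_FIELDS = {"sender", "timestamp", "messageType", "content"}
MESSAGE_TYPES = {"text", "image", "video", "audio", "document",
                 "sticker", "location", "contact", "system"}

def check_export(doc):
    """Return a list of human-readable problems found in a loaded export."""
    problems = []
    messages = doc["messages"]
    if doc["metadata"]["totalMessages"] != len(messages):
        problems.append("totalMessages does not match the messages array")
    for index, msg in enumerate(messages):
        missing = REQUIRED_FIELDS - msg.keys()
        if missing:
            problems.append(f"message {index} is missing {sorted(missing)}")
        if msg.get("messageType") not in MESSAGE_TYPES:
            problems.append(f"message {index} has an unknown messageType")
    return problems

# Round-trip through JSON text to mimic reading the downloaded .json file.
loaded = json.loads(json.dumps(sample_export))
print(check_export(loaded))
```

For a real file, `json.load(open(path, encoding="utf-8"))` would take the place of the round-trip line.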

Media references

For chats exported with media, each message that contained an attachment includes a mediaRef field with the original filename as it appeared in the WhatsApp ZIP. If you exported without media, the mediaRef field is present but set to null. The JSON does not embed media files as base64 - it references them by filename, so you can match JSON records to the media files in your original ZIP if you need to process them separately.
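Matching records to files can be sketched with Python's standard `zipfile` module. This is a minimal example under assumptions: the sample filenames are invented, the in-memory ZIP stands in for your original WhatsApp export, and `find_media` is a hypothetical helper name.

```python
import io
import zipfile

def find_media(messages, zip_source):
    """Map each non-null mediaRef to True if the ZIP contains that filename."""
    with zipfile.ZipFile(zip_source) as archive:
        # Compare on base names in case the ZIP stores files in a subfolder.
        names = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return {
        msg["mediaRef"]: msg["mediaRef"] in names
        for msg in messages
        if msg.get("mediaRef")
    }

# Build a tiny in-memory ZIP so the example runs without a real export.
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("IMG-0001.jpg", b"fake image bytes")
demo_zip.seek(0)

demo_messages = [
    {"sender": "Alice", "mediaRef": None},
    {"sender": "Bob", "mediaRef": "IMG-0001.jpg"},
    {"sender": "Bob", "mediaRef": "PTT-0002.opus"},
]

result = find_media(demo_messages, demo_zip)
print(result)
```

With a real export you would pass the path of the original ZIP instead of the in-memory buffer; any `mediaRef` mapped to `False` points at a file that is not in the archive.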

Step-by-Step: Exporting as JSON

  1. Export your WhatsApp chat from your phone as a .zip or .txt file.
  2. Log in to WaChat to PDF with a Pro plan account and upload your export file.
  3. On the Export step, select 'AI-ready JSON' as the output format.
  4. Click Generate and download the .json file when ready.

What You Can Do with the JSON Export

AI analysis with ChatGPT or Claude

The most immediate use case for the JSON export is feeding the conversation to a large language model for analysis. Because the data is already structured, you can paste a subset of messages directly into a ChatGPT or Claude prompt and ask questions like 'summarise the key decisions made in this conversation' or 'identify all dates and deadlines mentioned'. The consistent field names make it straightforward to filter by sender or date range before sending to the model, which reduces token usage on long conversations.
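Filtering before prompting might look like the sketch below. It assumes the messages have already been loaded into a Python list, that timestamps carry no timezone offset (mixing offset and non-offset values would make the comparisons fail), and that `select_messages` is a hypothetical helper rather than part of any tool.

```python
import json
from datetime import datetime

def select_messages(messages, sender=None, start=None, end=None):
    """Keep messages matching an optional sender and inclusive date range."""
    selected = []
    for msg in messages:
        when = datetime.fromisoformat(msg["timestamp"])
        if sender is not None and msg["sender"] != sender:
            continue
        if start is not None and when < start:
            continue
        if end is not None and when > end:
            continue
        selected.append(msg)
    return selected

# Invented sample data using the field names described in this article.
sample_messages = [
    {"sender": "Alice", "timestamp": "2024-03-01T09:00:00",
     "messageType": "text", "content": "Deadline is Friday."},
    {"sender": "Bob", "timestamp": "2024-03-01T09:05:00",
     "messageType": "text", "content": "Noted."},
    {"sender": "Alice", "timestamp": "2024-03-02T11:30:00",
     "messageType": "text", "content": "Moved to Monday."},
]

subset = select_messages(
    sample_messages,
    sender="Alice",
    start=datetime(2024, 3, 1),
    end=datetime(2024, 3, 1, 23, 59, 59),
)

prompt = (
    "Identify all dates and deadlines mentioned in these messages:\n\n"
    + json.dumps(subset, indent=2)
)
print(prompt)
```

Only the filtered subset is serialised into the prompt, so the token count scales with the slice you care about and not with the whole chat.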

For longer chats that exceed a model's context window, the JSON structure makes it easy to implement a sliding window approach: process the messages array in batches, passing each batch with the metadata block as context. The clean field boundaries mean you can do this programmatically without any custom text parsing.
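One possible shape for this is sketched below. The window size is counted in messages here for simplicity; a real pipeline would more likely budget by tokens. The optional `overlap` parameter, the top-level key names `metadata` and `messages`, and both function names are assumptions made for the example.

```python
import json

def iter_windows(messages, size, overlap=0):
    """Yield consecutive windows of messages, optionally sharing `overlap` items."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap smaller than size")
    step = size - overlap
    for start in range(0, len(messages), step):
        yield messages[start:start + size]
        if start + size >= len(messages):
            break  # the last window already reached the end of the chat

def build_prompts(doc, question, size, overlap=0):
    """Create one prompt per window, each prefixed with the metadata block."""
    header = "Conversation metadata:\n" + json.dumps(doc["metadata"])
    prompts = []
    for window in iter_windows(doc["messages"], size, overlap):
        prompts.append(
            header + "\n\nMessages:\n" + json.dumps(window) + "\n\n" + question
        )
    return prompts

# Invented five-message document for demonstration.
demo_doc = {
    "metadata": {"participants": ["Alice", "Bob"], "totalMessages": 5},
    "messages": [
        {"sender": "Alice", "content": f"message {n}"} for n in range(5)
    ],
}

prompts = build_prompts(demo_doc, "Summarise the key decisions.", size=2)
print(len(prompts))
```

With `overlap=0` this is plain batching; a small overlap repeats the last few messages of one window at the start of the next, which can help a model keep track of context that straddles a boundary.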

Data science and statistics

The JSON export loads directly into a pandas DataFrame in Python or a data frame in R, making it suitable for quantitative analysis. Common analyses include message frequency over time (activity heatmaps by hour and day of week), response time distributions between participants, word count and emoji frequency per sender, and topic modelling using the text content of messages. Because timestamps are normalised to ISO 8601, time-series analysis requires no date parsing beyond a single pd.to_datetime() call.
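Two of these analyses can be sketched with the Python standard library alone, which keeps the example dependency-free; the same list of message objects could equally be handed to `pandas.DataFrame(...)`. The sample data is invented, the function names are hypothetical, and timestamps are assumed to carry no timezone offset.

```python
from collections import Counter
from datetime import datetime

def activity_by_weekday_hour(messages):
    """Count messages per (weekday, hour); weekday 0 is Monday."""
    counts = Counter()
    for msg in messages:
        when = datetime.fromisoformat(msg["timestamp"])
        counts[(when.weekday(), when.hour)] += 1
    return counts

def response_times_seconds(messages):
    """Seconds between consecutive messages whenever the sender changes."""
    gaps = []
    previous = None
    for msg in messages:
        if msg["messageType"] == "system":
            continue  # system notices are not replies
        if previous is not None and msg["sender"] != previous["sender"]:
            delta = (datetime.fromisoformat(msg["timestamp"])
                     - datetime.fromisoformat(previous["timestamp"]))
            gaps.append(delta.total_seconds())
        previous = msg
    return gaps

# Invented sample; 1 March 2024 was a Friday (weekday 4).
stats_messages = [
    {"sender": "Alice", "timestamp": "2024-03-01T09:00:00", "messageType": "text"},
    {"sender": "Bob", "timestamp": "2024-03-01T09:02:30", "messageType": "text"},
    {"sender": "Alice", "timestamp": "2024-03-01T09:10:00", "messageType": "text"},
    {"sender": "Alice", "timestamp": "2024-03-01T10:05:00", "messageType": "text"},
]

heatmap = activity_by_weekday_hour(stats_messages)
gaps = response_times_seconds(stats_messages)
print(heatmap, gaps)
```

The `heatmap` counter can feed a plotting library directly, and `gaps` is a plain list of floats suitable for a histogram or summary statistics.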

CRM import

Sales and customer-success teams increasingly need to log WhatsApp conversations against contact records in their CRM. The JSON export provides the sender, timestamp, and content fields needed to create activity records in most CRM import formats. A simple mapping script can convert the messages array to a CSV that matches the import template of Salesforce, HubSpot, Pipedrive, or similar platforms, with each message becoming a logged activity note against the relevant contact.
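A mapping script of that kind might look like the following sketch. The CSV column names (`Contact Email`, `Activity Date`, `Note`) and the sender-to-email lookup are hypothetical; each CRM defines its own import template, so the real headers must be taken from that platform's documentation.

```python
import csv
import io

# Hypothetical column names; replace them with your CRM's import template.
CSV_COLUMNS = ["Contact Email", "Activity Date", "Note"]

def messages_to_csv(messages, contact_by_sender):
    """Convert messages to CSV text, one activity note per message."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for msg in messages:
        email = contact_by_sender.get(msg["sender"])
        # Skip system notices and senders with no known CRM contact.
        if email is None or msg["messageType"] == "system":
            continue
        writer.writerow({
            "Contact Email": email,
            "Activity Date": msg["timestamp"],
            "Note": msg["content"],
        })
    return buffer.getvalue()

# Invented sample data using the field names described in this article.
crm_messages = [
    {"sender": "Bob", "timestamp": "2024-03-01T09:02:30",
     "messageType": "text", "content": "Please send the quote, thanks"},
    {"sender": "Unknown", "timestamp": "2024-03-01T09:03:00",
     "messageType": "text", "content": "Hi"},
    {"sender": "Bob", "timestamp": "2024-03-01T09:04:00",
     "messageType": "system", "content": "Bob changed the group name"},
]

csv_text = messages_to_csv(crm_messages, {"Bob": "bob@example.com"})
print(csv_text)
```

Using `csv.DictWriter` instead of manual string joining means commas, quotes, and line breaks inside message text are escaped correctly.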

Legal cross-referencing

When a chat is converted to both PDF and JSON in the same WaChat to PDF job, the message ordering in the JSON matches the page ordering in the PDF. This means you can use the JSON to programmatically locate specific messages - for example, all messages from a particular sender between two dates - and then cite the corresponding PDF page numbers in legal submissions. The combination of a Bates-stamped PDF and a structured JSON export gives legal teams both the human-readable exhibit and the machine-queryable record in a single operation.

JSON vs PDF - When to Use Which

A PDF is the right output when the audience is human: a solicitor reviewing an exhibit, a judge reading a bundle, an HR manager assessing a complaint. PDFs are self-contained, portable, and display consistently regardless of software. Bates numbers and SHA-256 hashes make them suitable for formal legal use. The JSON export, by contrast, is for machine consumption - whenever the chat needs to be queried, imported, transformed, or passed to an AI model.

For most professional use cases, generating both outputs from the same conversion is the practical choice. The PDF provides the official, human-readable record; the JSON enables downstream processing without any further parsing or manual data entry. Both are produced from the same parsed source data, so they are guaranteed to be consistent with each other.

The JSON export is only available on the Pro plan. Free plan users get PDF output only.

Upgrade to Pro and export your WhatsApp chat as AI-ready JSON.
