Understanding the WhatsApp _chat.txt Format

The WhatsApp _chat.txt file has a specific timestamp and sender format that varies between iOS and Android. Here's a technical breakdown of the format and its edge cases.

_chat.txt is the core file inside every WhatsApp export. It is a plain-text log of the entire conversation, structured in a consistent but locale-sensitive format. Understanding this format is useful if you want to parse the file programmatically, troubleshoot a conversion issue, or verify that an export has been correctly interpreted. This guide covers every aspect of the format, including the edge cases that trip up most parsers.

The Basic Line Format

Each regular message in _chat.txt occupies a single line and follows the pattern: timestamp, sender name, colon and space, message body. The timestamp is enclosed in square brackets on iOS and presented without brackets on many Android builds. A typical iOS line looks like this:

[05/03/2025, 10:14:33] Alice: Can you send me the contract?

Here 05/03/2025 is the date in DD/MM/YYYY format, 10:14:33 is the 24-hour time, Alice is the sender's name as saved in your contacts, and the remainder is the message text.

A message you sent yourself appears with your own name as it is stored in WhatsApp, or in some exports with no sender name at all, depending on your WhatsApp version and region. The format is otherwise identical to received messages.
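The bracketed iOS layout described above can be matched with a single regular expression. The following is a minimal sketch in Python, assuming the UK/EU-style DD/MM/YYYY date and 24-hour time from the example; the names IOS_LINE and parse_ios_line are illustrative and are not part of any WhatsApp or WaChat to PDF API.

```python
import re
from datetime import datetime

# Assumed pattern for the bracketed iOS layout: [DD/MM/YYYY, HH:MM:SS] Sender: body
IOS_LINE = re.compile(
    r"^\[(?P<date>\d{2}/\d{2}/\d{4}), (?P<time>\d{2}:\d{2}:\d{2})\] "
    r"(?P<sender>[^:]+): (?P<body>.*)$"
)


def parse_ios_line(line):
    """Return (timestamp, sender, body), or None if the line does not match."""
    match = IOS_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    timestamp = datetime.strptime(
        f"{match['date']} {match['time']}", "%d/%m/%Y %H:%M:%S"
    )
    return timestamp, match["sender"], match["body"]
```

Because the sender group excludes colons, this sketch handles colons in the message body but not in the sender name, which is one of the edge cases discussed later in this guide.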

iOS vs Android Timestamp Differences

The most significant variation between iOS and Android exports is in the timestamp format. iOS consistently uses square brackets and a comma between the date and time: [DD/MM/YYYY, HH:MM:SS]. Android exports typically omit the brackets, drop the seconds, and separate the timestamp from the sender with a spaced hyphen: DD/MM/YYYY, HH:MM - Sender. On both platforms the details follow the device locale, so the date order, the date separator (for example a period in German-locale exports: DD.MM.YYYY), the length of the year, and the 12- or 24-hour clock can all vary.

  • iOS (UK/EU locale): [05/03/2025, 10:14:33] Alice: message
  • iOS (US locale): [3/5/2025, 10:14 AM] Alice: message (note MM/DD/YYYY and 12-hour time)
  • Android (common): 05/03/2025, 10:14 - Alice: message (no brackets, dash before sender)
  • Android (German locale): 05.03.2025, 10:14 - Alice: message (period date separator)
  • Android (some versions): 05/03/25, 10:14 am - Alice: message (2-digit year and 12-hour time)

WaChat to PDF's parser handles all documented variants by testing a set of timestamp patterns against the first 20 lines of the file and selecting the best match. If your file uses an unusual format, the parser reports which lines it could not match so you can investigate.
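The detection step described above can be sketched roughly as follows. This is an illustration in Python, not the converter's actual code: the pattern table, its key names, and the detect_format function are all assumptions, and the patterns only cover the five variants listed in this section.

```python
import re

# Hypothetical pattern table: one entry per variant listed above.
TIMESTAMP_PATTERNS = {
    "ios_24h": re.compile(r"^\[\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}:\d{2}\] "),
    "ios_12h": re.compile(
        r"^\[\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?::\d{2})? [AP]M\] ", re.I
    ),
    "android_slash_24h": re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - "),
    "android_dot_24h": re.compile(r"^\d{1,2}\.\d{1,2}\.\d{2,4}, \d{1,2}:\d{2} - "),
    "android_slash_12h": re.compile(
        r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} [AP]M - ", re.I
    ),
}


def detect_format(lines, sample_size=20):
    """Return (best_pattern_name, unmatched_line_numbers) for the sample."""
    sample = lines[:sample_size]
    # Score each pattern by how many sampled lines it matches.
    scores = {
        name: sum(1 for line in sample if pattern.match(line))
        for name, pattern in TIMESTAMP_PATTERNS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        best = None  # nothing in the sample looked like a known timestamp
    # 1-based numbers of sampled lines the winning pattern did not match.
    unmatched = [
        number
        for number, line in enumerate(sample, start=1)
        if best is None or not TIMESTAMP_PATTERNS[best].match(line)
    ]
    return best, unmatched
```

Two limitations of this sketch are worth noting. Unmatched lines are not necessarily errors, since continuation lines of multi-line messages have no timestamp. The patterns also cannot tell DD/MM from MM/DD on their own; that would need an extra check, such as looking for a day value above 12.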

System Messages

System messages are lines in _chat.txt that do not follow the sender: message pattern. They are generated by WhatsApp itself to record events in the conversation's lifecycle. System messages appear with a timestamp but no sender name. Examples include:

  • Encryption notice: 'Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them.' - appears at the top of every new conversation.
  • Group events: 'Alice added Bob', 'Bob left', 'Alice changed the group icon', 'Alice changed the subject to Team Updates'.
  • Security code changes: 'Your security code with Alice changed. Tap to learn more.' - appears when a contact reinstalls WhatsApp or changes device.
  • Call logs: 'Missed voice call', 'Video call, 12 min' - voice and video calls are logged as single-line system entries.
  • Ephemeral message notices: notifications that disappearing messages have been turned on or off.

In the WaChat to PDF output, system messages are rendered as small centred cards distinct from the chat bubbles. They are clearly labelled as system events, which helps readers understand the structure of a conversation without confusing them with actual messages from participants.
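Since system messages carry a timestamp but no sender, a parser can separate them from regular messages by checking whether a "sender: " prefix follows the timestamp. The sketch below assumes the common Android layout (DD/MM/YYYY, HH:MM - ...); ANDROID_PREFIX and classify_line are hypothetical names used only for illustration.

```python
import re

# Assumed Android-style prefix, e.g. "05/03/2025, 10:14 - "
ANDROID_PREFIX = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - ")


def classify_line(line):
    """Return ("message", sender, body), ("system", None, text), or None."""
    prefix = ANDROID_PREFIX.match(line)
    if prefix is None:
        return None  # no timestamp, so not the start of an entry
    rest = line[prefix.end():]
    sender, separator, body = rest.partition(": ")
    if separator:
        return "message", sender, body
    # Timestamp present but no "sender: " prefix: treat as a system event.
    return "system", None, rest
```

This heuristic misreads any system text that itself contains a colon and space, such as a group subject with a colon in it. A more careful parser would probably also compare the text against a list of known event phrases.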

Media References

When a media file is attached to a message, the _chat.txt line references the file by name rather than embedding any binary data. On iOS the reference appears in angle brackets: [05/03/2025, 10:14:33] Alice: <attached: IMG-20250305-WA0001.jpg>. On Android the pattern is slightly different: the angle brackets and the 'attached:' prefix are dropped, and the filename is followed by '(file attached)': 05/03/2025, 10:14 - Alice: IMG-20250305-WA0001.jpg (file attached).

WhatsApp names media files using a consistent convention. Images are named IMG-YYYYMMDD-WAxxxx.jpg or IMG-YYYYMMDD-WAxxxx.webp, where the digits after WA are a sequence number (WA0001 in the example above). Voice notes use PTT-YYYYMMDD-WAxxxx.opus (PTT stands for push-to-talk). Videos use VID-YYYYMMDD-WAxxxx.mp4. Documents retain their original filename but are sometimes renamed with a leading sequence number. The YYYYMMDD portion reflects the date the file was received, not the date of the message, so the two can occasionally differ by a day for messages sent around midnight.
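The naming convention above is regular enough to decode mechanically. The following Python sketch extracts the media kind, the date embedded in the name, and the sequence number; MEDIA_NAME, KINDS, and describe_media are illustrative names, and only the three prefixes mentioned in this section are covered.

```python
import re
from datetime import date

# Assumed shape: PREFIX-YYYYMMDD-WA<sequence>.<extension>
MEDIA_NAME = re.compile(
    r"^(?P<prefix>IMG|VID|PTT)-(?P<ymd>\d{8})-WA(?P<seq>\d+)\.(?P<ext>\w+)$"
)
KINDS = {"IMG": "image", "VID": "video", "PTT": "voice note"}


def describe_media(filename):
    """Return (kind, file_date, sequence), or None for other filenames."""
    match = MEDIA_NAME.match(filename)
    if match is None:
        return None  # e.g. a document that kept its original name
    ymd = match["ymd"]
    try:
        file_date = date(int(ymd[:4]), int(ymd[4:6]), int(ymd[6:]))
    except ValueError:
        return None  # eight digits that do not form a real calendar date
    return KINDS[match["prefix"]], file_date, int(match["seq"])
```

As noted above, the date in the filename is the date the file was received, so it should not be used as a substitute for the message timestamp.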

If you exported without media, the reference lines show '<Media omitted>' instead of a filename. This is a WhatsApp-generated placeholder - it means a file was sent but was not included in this export. Re-exporting with Include Media will replace these placeholders with the actual filenames.
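Taken together, the iOS form, the Android form, and the omitted-media placeholder give three cases to recognise in a message body. Here is a minimal Python sketch, assuming exactly the three spellings quoted in this section; media_reference and the constant names are hypothetical.

```python
import re

IOS_ATTACHMENT = re.compile(r"<attached: (?P<name>[^>]+)>")
ANDROID_ATTACHMENT = re.compile(r"^(?P<name>.+?) \(file attached\)$")
MEDIA_OMITTED = "<Media omitted>"


def media_reference(body):
    """Return ("file", name), ("omitted", None), or None for plain text."""
    body = body.strip()
    if body == MEDIA_OMITTED:
        return "omitted", None
    match = IOS_ATTACHMENT.search(body) or ANDROID_ATTACHMENT.match(body)
    if match:
        return "file", match["name"]
    return None
```

The function expects a single-line body. If an attachment carries a caption on following lines, passing only the first line of the message is one way to keep the Android pattern working.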

Multi-line Messages

WhatsApp allows users to send messages containing line breaks (using Shift+Enter on the desktop app, or the Return key on a mobile keyboard). In _chat.txt, a multi-line message is represented as a single entry starting with a timestamp, followed by continuation lines that do not begin with a timestamp. Parsers must treat any line that does not start with a recognisable timestamp pattern as a continuation of the previous message rather than a new message.

This is one of the most common sources of parsing errors in naive implementations. A parser that assumes one-message-per-line will incorrectly split a multi-line message into multiple separate messages, corrupting both the message text and the overall message count. WaChat to PDF's parser uses lookahead to detect continuation lines before committing to a message boundary.
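The continuation rule can be illustrated with a short Python sketch. Note that this is not the converter's lookahead implementation; it is a simpler variant that appends each timestamp-less line to the entry before it. ENTRY_START is an assumed pattern that loosely covers the bracketed and unbracketed layouts shown earlier, and group_entries is a hypothetical name.

```python
import re

# Assumed start-of-entry pattern: "[date, time] " or "date, time - "
ENTRY_START = re.compile(
    r"^(?:\[\d{1,2}[/.]\d{1,2}[/.]\d{2,4}, [^\]]+\] "
    r"|\d{1,2}[/.]\d{1,2}[/.]\d{2,4}, \d{1,2}:\d{2}(?: [APap][Mm])? - )"
)


def group_entries(lines):
    """Merge continuation lines into the entry that precedes them."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if ENTRY_START.match(line) or not entries:
            entries.append(line)  # a new entry begins here
        else:
            entries[-1] += "\n" + line  # continuation of the previous entry
    return entries
```

A continuation line that happens to begin with something that looks like a timestamp would still be split off as a new entry, so a stricter pattern, or a check that timestamps do not go backwards, may be needed for real files.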

Character Encoding

WhatsApp exports _chat.txt as UTF-8 in all modern versions of the app. UTF-8 encoding means the file correctly preserves emoji, accented characters, and right-to-left scripts such as Arabic and Hebrew. Emoji are stored as Unicode code points and render correctly in any UTF-8-aware text editor or PDF renderer.

Very old Android exports from WhatsApp versions prior to approximately 2020 occasionally used a different encoding for non-ASCII characters, which can cause characters from Arabic, Thai, or Chinese scripts to appear as question marks or boxes when the file is opened in a default-encoding editor. If you encounter this, open the file in a code editor such as VS Code or Notepad++, reopen it with the encoding it was actually written in, and then save it as UTF-8. WaChat to PDF also attempts to detect encoding mismatches at upload time.
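A parser can guard against encoding mismatches by trying UTF-8 first and falling back only when decoding fails. The Python sketch below shows the idea. The fallback encoding is purely an assumption (this guide does not say which encodings old exports used), so cp1252 is a placeholder to be replaced with whatever matches your file; decode_chat is a hypothetical name.

```python
def decode_chat(raw, fallback="cp1252"):
    """Decode bytes as UTF-8, falling back to an assumed legacy encoding.

    Returns (text, encoding_used).
    """
    try:
        # "utf-8-sig" also strips a byte-order mark if one is present.
        return raw.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Placeholder fallback; undecodable bytes become U+FFFD.
        return raw.decode(fallback, errors="replace"), fallback
```

Typical use would be to read the file in binary mode, for example with open(path, "rb"), and pass the bytes to decode_chat before splitting the text into lines.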

Edge Cases and Quirks

Several less common situations can complicate parsing _chat.txt. The one encountered most often is a sender name that contains a colon - for example, a contact saved as 'Dr. A. Smith: Legal' or a group member with a colon in their display name. Since the sender–message boundary is delimited by a colon and space, a colon in the sender name causes naive parsers to cut the sender name short and treat the rest of the name as the start of the message.

  • Sender names with colons: the parser cannot simply split at the first colon after the timestamp; it has to establish where the sender name really ends before deciding where the message body begins.
  • Very long messages: WhatsApp does not impose a hard character limit on messages in the export. Single messages exceeding 65,000 characters have been observed in practice. The converter handles these without truncation.
  • Messages from unsaved numbers: when the sender is not in your contacts, their line shows the international phone number instead of a name (e.g. +44 7700 900123). The converter treats this as a valid sender identifier.
  • Forwarded messages: WhatsApp does not specially mark forwarded content in _chat.txt. A forwarded message appears as a regular message from the person who forwarded it.
  • Link previews: when a URL is shared, WhatsApp sometimes adds a preview title and description on the following lines. These continuation lines are parsed as part of the same message.
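One possible way to deal with colons in sender names is to match the text after the timestamp against a set of known participant names, trying the longest name first. This is only an illustrative technique and is not necessarily how WaChat to PDF resolves the ambiguity. In the Python sketch below, split_sender and the known_senders argument are hypothetical, and the caller is assumed to supply the list of names.

```python
def split_sender(rest, known_senders):
    """Split 'sender: body' text that follows the timestamp.

    Known names are tried longest first, so a name containing a colon
    wins over a shorter prefix. Returns (sender, body); sender is None
    when no "sender: " prefix is found.
    """
    for name in sorted(known_senders, key=len, reverse=True):
        if rest.startswith(name + ": "):
            return name, rest[len(name) + 2:]
    # Unknown sender (e.g. a phone number): fall back to the first ": ".
    sender, separator, body = rest.partition(": ")
    return (sender, body) if separator else (None, rest)
```

For example, with known_senders containing 'Dr. A. Smith: Legal', the text 'Dr. A. Smith: Legal: The contract is ready' splits after the full name rather than after 'Dr. A. Smith'. Senders that appear only as phone numbers still go through the fallback split.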

Ready to convert your _chat.txt into a readable, professionally formatted PDF? Upload your file to WaChat to PDF - the parser handles all formats and edge cases automatically.
