Extract E-mail Messages and Attachments

WinHex & X-Ways

Part of volume snapshot refinement.

 

A forensic license allows you to separately list and examine e-mail messages and e-mail attachments stored in the following e-mail archive file formats: Outlook Personal Storage (.pst), Offline Storage (.ost), Exchange (.edb; Exchange 2010 and earlier are supported, with 2010 support still in a testing stage), Outlook Message (.msg), Outlook Template (.oft), Outlook Express, Outlook for Mac, Kerio Connect (store.fdb files, which can be processed like ordinary PST/OST files), AOL PFC files, Mozilla mailbox (including Netscape and Thunderbird), generic mailbox (mbox, Unix mail format), MHT Web Archive (.mht). By default, X-Ways Forensics tries to extract from these file types: pst,ost,edb,dbx,pfc,mbox,eml,emlx,mht,msg,olk14msgsource,olk14message,oft,mbs
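The default file type list above can be read as a simple comma-separated mask. The following Python sketch only illustrates how such a mask could be matched against file names by extension; the function name is hypothetical, and the extension-only matching is an assumption, not a description of how X-Ways Forensics itself identifies file types.

```python
# Default mask as quoted in the text above.
DEFAULT_MASK = "pst,ost,edb,dbx,pfc,mbox,eml,emlx,mht,msg,olk14msgsource,olk14message,oft,mbs"


def matches_default_mask(filename, mask=DEFAULT_MASK):
    """Return True if the file name's extension appears in the mask (hypothetical helper)."""
    if "." not in filename:
        return False
    extension = filename.rsplit(".", 1)[1].lower()
    return extension in set(mask.split(","))
```

With this sketch, a name such as archive.PST would match the mask, while report.docx would not.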

 

E-mail messages are usually output as .eml files. To conveniently focus on all extracted e-mail messages from all e-mail archives (and even processed original .eml files), it is recommended to explore recursively and use the Attribute filter (not the Type or Category filter).

 

The timestamp in the "Date:" line in an e-mail message's header (if accompanied by a time zone indicator like -0700 or +0200) is listed as the creation date & time. The timestamp in the "Delivery-Date:" line (or alternatively, if not available, the first "Received:" line) is listed as the last modification date & time. For extracted e-mails and their attachments, sender and recipient will be displayed in the corresponding columns in the directory browser. You may filter by dates as well as sender and recipient.
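The timestamp mapping described above can be reproduced for a single .eml file with Python's standard email package. This is only a minimal sketch of the rule as stated in the text, not the program's actual implementation; the function names are hypothetical, and the date in a "Received:" line is assumed to follow the last semicolon.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def _parse(value):
    """Parse an RFC 5322 date string; return None if it is missing or unparsable."""
    if not value:
        return None
    try:
        return parsedate_to_datetime(value.strip())
    except (TypeError, ValueError):
        return None


def header_timestamps(raw_eml):
    """Return (created, modified) derived from the header lines (hypothetical helper)."""
    msg = message_from_string(raw_eml)

    # "Date:" counts only if it carries a time zone indicator like -0700 or +0200.
    created = _parse(msg.get("Date"))
    if created is not None and created.tzinfo is None:
        created = None

    # "Delivery-Date:" is preferred; otherwise fall back to the first "Received:" line.
    modified = _parse(msg.get("Delivery-Date"))
    if modified is None:
        received = msg.get("Received")  # first (topmost) Received: line
        if received and ";" in received:
            # Assumption: the timestamp follows the last semicolon.
            modified = _parse(received.rsplit(";", 1)[1])
    return created, modified
```

Either value is None if the corresponding header line is missing or cannot be parsed, so a caller can tell the two cases apart from a valid timestamp.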

 

If an e-mail message has a Sender: line in addition to a From: line, then the sender according to the Sender: line is also shown in the Sender column of the directory browser, after the From: sender, if actually different. The two are delimited by spaces and a pipe (|). For example, English-language versions of MS Outlook show such e-mails as having been sent "on behalf of" someone else (by the Sender: sender on behalf of the From: sender). You can filter for such e-mails by entering a pipe as a substring for the Sender column. Analogously, different kinds of recipients (To:, Cc:, and Bcc:) are delimited by pipes in the Recipient column.
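The pipe convention for the Sender and Recipient columns can be illustrated with a short Python sketch. The function names are hypothetical, and the exact text that X-Ways Forensics displays may differ; the sketch only shows the delimiting rule described above and assumes that "actually different" means the e-mail addresses differ.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_column(raw_eml):
    """Build a Sender column value: From: sender, then ' | ' and the Sender: sender if different."""
    msg = message_from_string(raw_eml)
    from_value = (msg.get("From") or "").strip()
    sender_value = (msg.get("Sender") or "").strip()
    # Assumption: compare the bare addresses, ignoring display names and case.
    from_addr = parseaddr(from_value)[1].lower()
    sender_addr = parseaddr(sender_value)[1].lower()
    if sender_value and sender_addr != from_addr:
        return f"{from_value} | {sender_value}"
    return from_value


def recipient_column(raw_eml):
    """Build a Recipient column value with To:, Cc:, and Bcc: groups delimited by pipes."""
    msg = message_from_string(raw_eml)
    groups = []
    for header in ("To", "Cc", "Bcc"):
        values = msg.get_all(header, [])
        if values:
            groups.append(", ".join(value.strip() for value in values))
    return " | ".join(groups)
```

Under these assumptions, checking whether the Sender value contains a pipe corresponds to the substring filter for "on behalf of" e-mails mentioned above.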

 

Attachments and embedded files are extracted, too, if found in the e-mail archive (with exceptions, e.g. AOL PFC) and usually become child objects of their respective containing e-mail messages in the volume snapshot. All extracted e-mails and attachments actually reside in the evidence object's metadata subdirectory and may occupy a lot of drive space.

 

E-mail extraction from PST can process password-protected PST archives without the password! It supports the following code pages for encoded PST files: ISO8859-1, ISO8859-2, ISO8859-3, ISO8859-4, ISO8859-5, ISO8859-6, ISO8859-7, ISO8859-8, ISO8859-9, ISO8859-10, ISO8859-11, ISO8859-13, ISO8859-14, ISO8859-15, ISO8859-16, koi8-r, koi8-u, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258, 874, UTF16, UTF32, UTF8

 

In certain old AOL PFC files, pictures may be embedded in e-mail messages in a special way. In that case, such an e-mail message will be marked with a paperclip icon, but the picture will not be separately extracted. The picture, if JPEG or PNG, can be found, however, when extracting JPEG and PNG files from *.pfc.

 

Some advantages of the .eml format for output: E-mail messages output as .eml files are represented as simply, authentically and universally as it gets. They are easy to understand, clearly structured into header and body, and extremely easy to view completely in a variety of simple programs (e.g. text editor, word processor, Internet browser, free e-mail clients like Thunderbird and Windows Mail). No commercial software like MS Outlook is needed to view .eml files. .eml is the "natural" format of e-mail, just like a raw image is the natural format of a disk image, if you even want to call it a "format" (actually it has no additional format specifications, it's just a plain representation of the data that it should represent). An .eml file contains the complete original metadata of the e-mail message, fully intact, exactly as it was sent and delivered. You have complete control over the file if you copy it out for someone else: you can see all data and can verify that no unintended data made it into the file. You can easily redact any text in the body manually with a simple text editor, redact any metadata in the header, and retroactively remove any attachment using a simple text editor if needed, all of which is impossible to do with a complex proprietary binary file format such as MSG. The general format of .eml files can be understood by anyone, and it is simply a text file. The format of MSG files can be understood only with a computer science or programming background, and learning it takes a lot of time. Redacting e-mail data hidden in MSG files is difficult.
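Because an .eml file is plain text, the kind of redaction described above can also be scripted. The following Python sketch removes attachments from a copy of a message using the standard email package. It is an illustration only: the function name is hypothetical, "attachment" is assumed to mean a part with a Content-Disposition of attachment, and re-serializing may change header folding, so it should only be applied to a working copy, never to the extracted original.

```python
from email import message_from_string, policy


def strip_attachments(raw_eml):
    """Return the message text with attachment parts removed (hypothetical helper)."""
    msg = message_from_string(raw_eml, policy=policy.default)
    if msg.is_multipart():
        # Keep only the top-level parts that are not marked as attachments.
        kept = [part for part in msg.iter_parts() if not part.is_attachment()]
        msg.set_payload(kept)
    return msg.as_string()
```

Nested multipart structures are not traversed in this sketch; only top-level attachment parts are dropped.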

 

A side task of e-mail processing is to extract files from e-mail-related MIM archives and make them accessible as child objects in the volume snapshot in plain binary form.