Uncover Embedded Data

 

Part of volume snapshot refinement. Forensic license only.

 

Carves files of various types that are embedded in files of other types, by means of a byte-level file header signature search within certain files. This succeeds if the outer file (host file) is intact and the embedded file is not stored in the host file in a fragmented manner. Otherwise the embedded files may appear corrupt. Notably, this function searches for JPEG and PNG pictures, even JPEG pictures in other JPEG files (those that contain thumbnails of themselves). The files found this way are given generic names such as "Embedded 1....jpg", "Embedded 2....png", etc.
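
The idea of a byte-level header signature search can be illustrated with a minimal Python sketch. It is not how X-Ways Forensics is implemented; the signature table and the function name find_embedded are hypothetical, and the sketch only locates candidate start offsets without determining where an embedded file ends.

```python
# Hypothetical, minimal signature table (real tools use far richer definitions).
SIGNATURES = {
    "jpg": b"\xff\xd8\xff",          # JPEG start-of-image marker
    "png": b"\x89PNG\r\n\x1a\n",     # PNG file signature
}

def find_embedded(data: bytes, skip_offset_zero: bool = True):
    """Return sorted (offset, type) pairs for every signature hit in data.

    Offset 0 is skipped by default because a hit there is the host file
    itself (e.g. a JPEG that contains a JPEG thumbnail), not embedded data.
    """
    hits = []
    for ext, sig in SIGNATURES.items():
        pos = data.find(sig)
        while pos != -1:
            if not (skip_offset_zero and pos == 0):
                hits.append((pos, ext))
            pos = data.find(sig, pos + 1)
    return sorted(hits)

# A fake JPEG host that contains a JPEG thumbnail and a PNG picture.
host = (b"\xff\xd8\xff\xe0" + b"A" * 10
        + b"\xff\xd8\xff\xe1" + b"B" * 5
        + b"\x89PNG\r\n\x1a\n" + b"C")
print(find_embedded(host))
```

Because the search only sees a contiguous byte stream, it can only work as described above if the embedded file is stored unfragmented within an intact host file.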

 

This function also extracts .emf files embedded in multi-page printouts (.spl spooler files). .spl files that contain a single .emf file only can be viewed directly with the viewer component. Also extracted this way are .lnk shortcut files from .customdestinations-ms jumplists.

 

Special internal algorithms follow the data structures of the respective file format and therefore extract embedded data properly even if it is fragmented: .lnk shortcut files from .automaticdestinations-ms jump lists, files of various types from OLE2 compound files (e.g. MS Word .doc, MS PowerPoint .ppt), Firefox browser caches (based on "_CACHE_MAP_" files), Safari browser caches, Norton Backup files (N360 backup, .nb20) and Windows Vista/7 Windows.edb databases (from the latter even e-mail messages), as well as pictures that are embedded as Base64 in VCF files (electronic business cards).

 

Chrome browser caches are processed based on "index" files, with support for multiple streams of the same cache entry: The HTTP response (named .chrome1) is output, as are compiled JavaScript entries (.js1) if present. If a no-cache directive was sent by the web server, at least the HTTP response is still cached. In Preview mode you can see a special representation of HTTP responses. Chrome caches can also be processed if their index is not available, for example if cache fragments have been carved or if the cache was partially deleted or corrupted. In some cases a better extraction result may be achieved without the index, even if it is present. To try that, if the index has not been processed before, you can have the uncover function process "data_4" files and omit the index. data_4 is part of the optional "special interest" group.

 

Also extracted are thumbnails from thumb*.db files, from Google's Picasa 3 image organizer and viewer software (thumbindex.db and related files), Photoshop thumbnail caches (Adobe Bridge Cache.bc), Canon ZoomBrowser thumbnail collections (.info), and Paint Shop Pro caches (.jbf). Thumbnails in certain very old "thumbs.db" files cannot be displayed correctly. Such thumbs.db files will be assigned to the report table "Unsupported thumbs.db" and can be viewed e.g. with the freely available program "DM Thumbs" by GreenSpot Technologies Ltd. Thumbcache*.db files of Windows Vista and later are targeted indirectly if thumbcache_idx.db is in the mask and if that file is available in the same directory. That speeds up the extraction and avoids the output of numerous duplicate thumbnails (only the highest available resolution is output). If thumbcache_idx.db is in the mask, that also means that thumbcache*.db files that are specifically selected or tagged for processing are not processed unless the thumbcache_idx.db file is also selected/tagged.

 

From PDF documents, this function extracts any kind of file that is marked as embedded, as well as JPEG and JPEG 2000 pictures, Acrobat form files in XML format and JavaScript objects (the latter may make it easier to determine whether a PDF file should be considered malware). It also extracts individual cookie files from Firefox and Chrome SQLite databases, data blocks embedded as Base64 in XML-formatted PLists (.plist) and raw data blocks embedded in binary PLists (.bplist). It is recommended to verify file types at the same time so that X-Ways Forensics can distinguish between traditional (XML-formatted) PLists and binary PLists (BPLists). Many PLists do not have a .plist extension and need to be identified as PLists first. Since the type of the embedded data is not identified by the PList as such, the output also benefits from a simultaneous file type verification. Nested PLists (PLists embedded in PLists) are also identified and processed recursively. Another child object created for PLists represents parsed text in a human-readable way and serves as a preview of the PList itself.
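
To make the PList part more concrete, here is a small Python sketch based on the standard plistlib module, which reads both XML-formatted and binary PLists and returns embedded data blocks as bytes. The helper names (collect_data_blocks, looks_like_plist, extract_blocks) and the path notation with "!" for nested PLists are assumptions made for this illustration only and say nothing about the internal algorithm of X-Ways Forensics.

```python
import plistlib

def collect_data_blocks(node, path=""):
    """Recursively yield (path, bytes) for each data block in a parsed PList."""
    if isinstance(node, (bytes, bytearray)):
        yield path, bytes(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from collect_data_blocks(value, f"{path}/{key}")
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            yield from collect_data_blocks(value, f"{path}/{index}")

def looks_like_plist(blob: bytes) -> bool:
    """Cheap check for a binary or XML PList stored inside a data block."""
    return blob.startswith(b"bplist00") or blob.lstrip().startswith(b"<?xml")

def extract_blocks(raw: bytes, prefix=""):
    """Return all data blocks of a PList, descending into nested PLists."""
    blocks = []
    for path, blob in collect_data_blocks(plistlib.loads(raw)):
        full_path = prefix + path
        blocks.append((full_path, blob))
        if looks_like_plist(blob):
            try:
                blocks.extend(extract_blocks(blob, full_path + "!"))
            except Exception:
                pass  # not a parsable PList after all; keep the raw block only
    return blocks

# Build a binary PList nested inside an XML PList and take it apart again.
inner = plistlib.dumps({"icon": b"\x89PNG"}, fmt=plistlib.FMT_BINARY)
outer = plistlib.dumps({"name": "example", "payload": inner})
for path, blob in extract_blocks(outer):
    print(path, len(blob))
```

Note that the sketch does not identify the type of each extracted block; as stated above, the PList itself does not record that, which is why a simultaneous file type verification is useful.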

 

Also reconstructs e-mail messages and extracts contact and account information from the Livecomm.edb database, which is used by the Windows Mail client (Windows 7 and newer), and extracts contacts from the Windows Live Mail contacts.edb database as well as from Windows Live Messenger's contacts.edb database.

 

You can also uncover various potentially relevant resources in 32-bit and 64-bit Windows PE executables (programs and libraries) as child objects, in particular RCDATA, named objects, bitmaps, icons and manifests. This is useful, for example, for malware analysis. It does not happen automatically, but only if you specifically target executable files via a suitable series of file masks.

 

Fully Base64-encoded files in the volume snapshot, provided that they have "b64" in the Type column, can be automatically decoded, and the result is output in binary as (surprise) a child object.
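
What decoding a fully Base64-encoded file amounts to can be sketched in a few lines of Python. This is an illustration under the assumption that line breaks and other whitespace are ignored and that any other character outside the Base64 alphabet disqualifies the file; the function name decode_b64_file is hypothetical.

```python
import base64
import binascii

def decode_b64_file(content: bytes):
    """Return the decoded bytes, or None if content is not entirely Base64."""
    compact = b"".join(content.split())  # drop line breaks and other whitespace
    if not compact:
        return None
    try:
        # validate=True rejects any character outside the Base64 alphabet
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None

original = b"\x89PNG\r\n\x1a\n" + bytes(range(50))
encoded = base64.encodebytes(original)   # multi-line Base64 text
print(decode_b64_file(encoded) == original)
print(decode_b64_file(b"not base64!"))
```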

 

Last but not least, this function can decompress hiberfil.sys files from Windows XP, Vista and 7 (32 and 64 bit) and automatically add the result to the case as raw memory dumps. hiberfil.sys slack (compressed data from previous usage of a hiberfil.sys file, as found near the end, if the last usage achieved stronger compression than previous usages) is provided as a child object in its decompressed form.

 

Generally all files produced by this function are added to the volume snapshot as child objects of their respective host files in which they were found. Files smaller than 65 bytes are not touched, for performance reasons.

 

Two separate file masks are maintained for uncovering embedded data in various file types. The second mask is optional and labelled as "special interest". For example, malware investigators may choose to also process executable files that way when needed. You may prepend any element of a mask with a colon to temporarily exclude it, but keep it in the list for future reference. E.g. :*.jpg means that files with jpg as the extension or type are not processed.
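
The effect of a colon prefix can be pictured with the following Python sketch. It rests on two assumptions that go beyond the text above: that a colon-prefixed element keeps matching files from being processed even if another element would match them, and that matching is done on file names only (the real masks also consider the file type). The function name is_targeted is hypothetical.

```python
from fnmatch import fnmatchcase

def is_targeted(filename: str, mask) -> bool:
    """Decide whether a file name is selected by a list of mask elements.

    Assumption: elements starting with ':' are exclusions; all other
    elements are inclusions. Matching is case-insensitive.
    """
    name = filename.lower()
    included = [m.lower() for m in mask if not m.startswith(":")]
    excluded = [m[1:].lower() for m in mask if m.startswith(":")]
    if any(fnmatchcase(name, pattern) for pattern in excluded):
        return False
    return any(fnmatchcase(name, pattern) for pattern in included)

mask = ["*.pdf", "*.doc", ":*.jpg", "thumb*.db"]
for candidate in ("Report.PDF", "photo.jpg", "thumbs.db"):
    print(candidate, is_targeted(candidate, mask))
```

Removing the colon from :*.jpg re-enables that element without having to retype it, which is the point of keeping it in the list.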

 

In files of a type for which no internal extraction algorithm is built in, X-Ways Forensics tries to carve embedded data using those file header signatures that are marked in “File Header Signatures Search.txt” with the “e” flag. That means you can have X-Ways Forensics uncover embedded data in many more file types than it does by default if you like!

 

File header signature search in all files not processed above

 

A separate optional sub-operation allows you to freely carve any kind of file within any file that is not processed by the first sub-operation. By default, file types with the "e" flag are selected for that. Use great caution to avoid delays and copious amounts of garbage files (false positives) and duplicates. Please apply this function very carefully and only with good reason, to specifically targeted files only, such as swap files or storage files in which backup applications concatenate other files without compression, not blindly to all files or random files. Remember: with great power comes great responsibility.

 

Signatures marked with the "E" flag (upper case) are never carved within other files, to prevent the worst effects, for example MPEG frames carved within MPEG videos, zip records carved within zip archives, .eml, .html and .mbox files carved within e-mail archives, .hbin registry fragments carved within registry hives. If you know what you are doing, of course you could remove the E flag.

 

There is an option to apply the carving procedure recursively, that is to also carve in files that were already carved within other files themselves. This can lead to many duplicates if the outer file at level 1 is carved too big so that files can be carved in it that were also carved at level 0 (the original file).

 

For situations where you want to carve embedded files that are not aligned at 512-byte boundaries in the original file, you may make use of the extensive byte-level option. Files are never carved in $MFT.
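
The difference between a search at 512-byte boundaries and the byte-level option can be shown with a short Python sketch. The function name signature_offsets and its parameters are hypothetical and serve only to illustrate the trade-off: the aligned search tests one position per 512 bytes, while the byte-level search tests every offset and therefore finds more, at a higher cost and with more potential false positives.

```python
def signature_offsets(data: bytes, signature: bytes,
                      byte_level: bool = False, sector_size: int = 512):
    """Return the offsets at which signature occurs in data.

    With byte_level=False only offsets that are multiples of sector_size
    are tested; with byte_level=True every offset is tested.
    """
    if byte_level:
        offsets = []
        pos = data.find(signature)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(signature, pos + 1)
        return offsets
    return [off for off in range(0, len(data), sector_size)
            if data.startswith(signature, off)]

jpeg_sig = b"\xff\xd8\xff"
buf = bytearray(2048)
buf[512:515] = jpeg_sig   # aligned at a 512-byte boundary
buf[700:703] = jpeg_sig   # not aligned
data = bytes(buf)
print(signature_offsets(data, jpeg_sig))
print(signature_offsets(data, jpeg_sig, byte_level=True))
```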

 

The default settings will make X-Ways Forensics conduct file header signature searches at the byte level within pagefile.sys files, to find e-mail fragments, .lnk shortcut files, pictures, etc.