Volume Snapshots

WinHex & X-Ways

Volume Snapshots and their Refinement

 

A volume snapshot is a database of the contents of a volume or physical medium (files, directories, ...) at a given point in time. The directory tree and the directory browser present views into this database. Based on the underlying file system's data structures, it consists of one record per file or directory, and remembers practically all metadata (name, path, size, timestamps, attributes, ...), but not the contents of files or data of directories. A volume snapshot usually references both existing and previously existing (e.g. deleted) files, as well as virtual (artificially defined) files if they are useful for a computer forensic examination (e.g. so that even unused parts of a disk or volume are covered). Operations such as logical searches, indexing, and all commands in the directory browser context menu are applied to the files and directories as they are referenced in the volume snapshot. Because of compressed files and because deleted files and the virtual "Free space" file may be associated with the same clusters of a volume multiple times, the sum of all files and directories in a volume snapshot can easily exceed the total physical size of a volume.
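
The following is a minimal sketch of the idea of one record per file and of why the summed sizes can exceed the volume size. It does not reflect the actual on-disk format of a volume snapshot. The class SnapshotRecord, its field names, and all sizes are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SnapshotRecord:
    internal_id: int
    name: str
    size: int              # logical size in bytes
    existing: bool         # False for previously existing (e.g. deleted) files
    virtual: bool = False  # True for artificially defined files


VOLUME_SIZE = 1_000_000    # hypothetical physical size of the volume in bytes

records = [
    SnapshotRecord(0, "report.docx", 400_000, existing=True),
    # Deleted file whose clusters are now also counted as free space.
    SnapshotRecord(1, "old.tmp", 300_000, existing=False),
    SnapshotRecord(2, "Free space", 600_000, existing=True, virtual=True),
]

total_referenced = sum(r.size for r in records)
print(total_referenced, ">", VOLUME_SIZE, "is", total_referenced > VOLUME_SIZE)
```

In this made-up example the clusters of the deleted file are counted twice, once for the file itself and once as part of "Free space", which is why the total is larger than the volume.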

 

A volume snapshot is stored on the disk either as a set of files named Volume*.dir in the folder for temporary files or (if associated with a case) as files named “Main 1”, “Main 2”, “Main 3”, “Names”, …, in the evidence object's metadata directory.

 

Volume Snapshot Options

 

The Specialist menu allows you to expand and refine the standard volume snapshot in various ways. This requires a specialist or forensic license. Full functionality is available only with a forensic license.

 

Run X-Tensions: X-Tensions are DLLs that you can program yourself to extend the functionality of X-Ways Forensics or to automate it for your own purposes.

 

Particularly thorough file system data structure search

 

File header signature search

 

Block-wise hashing and matching

 


 

The operations below are applied after the aforementioned operations, to files that are already contained in the volume snapshot. They are all applied together and file-wise (i.e. first all operations to one file, then all operations to the next file, and so on), so that files are processed in the order of ascending internal IDs. Some of these operations may produce additional files, which will get the next higher available internal ID. Previously existing files whose first cluster is known to have been overwritten or whose first cluster is unknown are not processed unless you specifically target them via tagging.
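
The file-wise processing order described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the snapshot is modelled as a plain list whose index stands in for the internal ID, and the function names refine_file_wise and include_archive_contents are hypothetical.

```python
def refine_file_wise(snapshot, operations):
    """Apply every operation to one file before moving on to the next.

    `snapshot` is a list whose index stands in for the internal ID.
    Each operation takes an item and returns a list of new child items.
    """
    processed_ids = []
    internal_id = 0
    while internal_id < len(snapshot):      # the list may grow while we iterate
        item = snapshot[internal_id]
        for operation in operations:
            for child in operation(item):
                snapshot.append(child)      # child gets the next higher free ID
        processed_ids.append(internal_id)
        internal_id += 1
    return processed_ids


def include_archive_contents(item):
    # Hypothetical operation: archives yield their members as child items.
    return [{"name": n, "members": []} for n in item["members"]]


snapshot = [
    {"name": "a.zip", "members": ["x.txt", "y.txt"]},
    {"name": "b.txt", "members": []},
]
order = refine_file_wise(snapshot, [include_archive_contents])
print(order, [item["name"] for item in snapshot])
```

Because new items are appended at the end, the two archive members are processed after b.txt, in the same pass.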

 

Files that are considered irrelevant based on hash matching can be automatically omitted from all further operations, to save time and to avoid even more potentially irrelevant files that might otherwise be extracted from them. It is also possible to omit not only known irrelevant files, but also known relevant files from further processing. This is useful, for example, if in large cases you have or expect a very large number of such files, having proof of their presence is sufficient for you, and you don't need to extract their internal metadata, don't need to compute their skin tone percentages or PhotoDNA hashes, and don't need to check them for embedded data etc. There is also an option to omit files that are filtered out. All of these options are particularly powerful in that they can target in advance even files that are not yet part of the volume snapshot when the refinement starts. For example, when additional files are added to the snapshot by the file header signature search, these files can be further processed (e.g. hashed) or not depending on their file type, if the Type filter is active during the later stages of the volume snapshot refinement.

 

There is an option to omit additional hard links for the same file in NTFS/HFS+ from volume snapshot refinement just as from logical searches, to save time and reduce the number of redundant identical child objects etc. This can make a big difference on partitions with Windows installations that have a lot of hard links and HFS+ partitions with Mac OS X Time Machine. Which hard links are considered the "additional" hard links internally can be seen in the "Link count" column (gray number means to be omitted) and also in the Description column, which identifies all hard links (i.e. files with a hard link count larger than 1) and the additional ones in particular textually. The hard link that is not marked as "optionally omitted" in the Description column is considered the "main" hard link internally.

 

Compute hash

 

Verify file types with signatures and algorithms

 

Extract internal metadata, browser history, and events

 

Include contents of Zip and RAR archives etc.

 

Extract e-mail messages and attachments

 

Uncover embedded data in various file types

 

Export JPEG pictures from videos

 

Picture analysis and processing

 

File format specific and statistical encryption tests

 

Indexing

 


 

Should processing freeze on a certain file, note that the internal ID and the name of the currently processed file are displayed in the small progress indicator window. If the volume snapshot refinement is applied to an evidence object and the refinement crashes when processing a single file at a time, X-Ways Forensics will tell you which file when you restart the program and associate it with a report table named "Reason for crash?" (depends on the Security Options). All that happens so that you can exclude and omit the file when trying again. It does no harm (does not create duplications and does not cost much time) if you restart snapshot refinement for that volume from scratch, as already processed files will quickly be skipped, up to the point where the refinement progress was last saved, which depends on the auto-save interval of the case. The volume snapshot remembers for each file separately which operations of the volume snapshot refinement have been applied to it already, so the same operations will usually not be applied again to the same file.

 

If the hash value for a problematic (crashing) file was computed, that file and identical files are skipped automatically if you (continue to) refine the volume snapshot and compute hash values (at least if the protection against identical crasher files is active in the properties of the case). To make the case forget previous crasher files, click the Delete button in the case properties. Skipped files are also automatically added to the aforementioned report table.
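
The idea of skipping identical crasher files by hash value can be illustrated as follows. This is a sketch only: the use of SHA-256, the function names, and the sample data are assumptions made for this example and say nothing about which hash type a given case uses or how the program stores the list.

```python
import hashlib


def file_hash(data):
    # SHA-256 is an assumption for this sketch, not a statement about the case.
    return hashlib.sha256(data).hexdigest()


def files_to_process(files, crasher_hashes):
    """Split (name, data) pairs into files to process and files to skip."""
    keep, skipped = [], []
    for name, data in files:
        if file_hash(data) in crasher_hashes:
            skipped.append(name)    # identical to a known crasher file
        else:
            keep.append(name)
    return keep, skipped


crasher_hashes = {file_hash(b"malformed")}
files = [
    ("a.pdf", b"fine"),
    ("b.pdf", b"malformed"),
    ("copy of b.pdf", b"malformed"),
]
keep, skipped = files_to_process(files, crasher_hashes)
print(keep, skipped)
```

The copy is skipped although it has a different name, because only the hash value of the contents is compared.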

 

The file processing part of volume snapshot refinement supports multiple threads (only if not applied to a selection). Depending on the selected suboperations and the types of the files in the volume, and depending on I/O speed, this can double, triple or even quadruple the performance. The faster your mass storage solution (HDD, SSD, RAID) in terms of seek times and data transfer speed, the more time you save percentage-wise. This parallelization feature is still considered experimental and not complete yet, but the potential time saving in one of the most important and most time-consuming functions of the program is enormous. Selecting multiple extra threads has an effect only when searching in evidence objects that are images or directories, not disks. If you select 0 extra threads, it will work as in X-Ways Forensics versions before 19.0. If you select 1 or more extra threads, processing is done in additional worker threads (as many as you select), and the main thread of the process will be idle, which means the GUI will remain highly responsive. In X-Ways Investigator up to 2 worker threads may be used, in X-Ways Forensics up to 8, if your CPU supports that. If multi-threaded processing crashes, the program will probably not be able to tell you the next time you start it which file exactly caused the crash. File-wise processing conducted by X-Tensions (through calls of XT_ProcessItem or XT_ProcessItemEx) is also parallelized if the X-Tension identifies itself as thread-safe. Processing of files in file archives is currently excluded from parallelization internally. Parallelization is currently not offered as an option if indexing is selected.
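
The difference between 0 extra threads and 1 or more extra threads can be sketched with a generic thread pool. This is not how the program is implemented. The functions process_file and refine_files and the item layout are hypothetical, and unlike in the description above, the calling thread in this sketch simply waits for the workers to finish.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(item):
    # Placeholder for the per-file suboperations; here it only measures the data.
    return item["internal_id"], len(item["data"])


def refine_files(items, extra_threads=0):
    if extra_threads == 0:
        # No extra threads: everything runs in the calling thread.
        return [process_file(item) for item in items]
    # Otherwise the work is handed to a pool of worker threads.
    with ThreadPoolExecutor(max_workers=extra_threads) as pool:
        return list(pool.map(process_file, items))   # map keeps the input order


items = [{"internal_id": i, "data": b"x" * i} for i in range(5)]
sequential = refine_files(items, extra_threads=0)
parallel = refine_files(items, extra_threads=4)
print(sequential == parallel)
```

The per-file function must not depend on shared mutable state, which corresponds to the requirement that an X-Tension identifies itself as thread-safe before its file-wise processing is parallelized.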

 

You may schedule a simultaneous search in advance for the time after the volume snapshot refinement.

 

Interdependencies

 

There are various interdependencies between all these operations. For example, if the contents of archives are included in the volume snapshot, among these files there could be pictures that are to be checked for skin colors, or documents that are to be checked for encryption. You can work under the premise that if an additional file is added to the volume snapshot or if the true type of a file is detected as part of Refine Volume Snapshot, all the appropriate other operations are applied to that file, if they are all selected. The output of one operation automatically becomes the input of all other operations (or even the same operation again), where suitable.

 

Imagine someone tries to conceal an incriminating JPEG picture by embedding it in an MS Word document, misnaming that .doc file to .dll, compressing that file in a Zip archive, misnaming the .zip file to .dll, compressing that .dll in another Zip archive, misnaming that .zip file again to .dll, and then sending this .dll file by e-mail as an attachment using MS Outlook. If all the respective options are selected, Refine Volume Snapshot does the following: It extracts the e-mail attachment from the PST e-mail archive. It detects that the .dll attachment is actually a Zip archive. Then it includes the contents of it in the volume snapshot, namely a file with the .dll extension. That file is found to be actually another Zip archive. Consequently that archive will be explored, and the .dll file inside will be detected as a .doc file. Searching for embedded pictures, X-Ways Forensics finds the JPEG file in the .doc file and can immediately check it for skin colors if desired. All of this happens in a single step. Wow.
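
The chain of operations in this scenario can be sketched as a simple work queue in which the output of one operation becomes the input of the next. Everything in the sketch is hypothetical: the file names, the dictionary layout, and the "true_type" field, which merely stands in for the result of file type verification.

```python
from collections import deque

# Hypothetical nested evidence: "content" lists what is hidden inside each file.
jpeg = {"name": "holiday.jpg", "true_type": "jpeg", "content": []}
doc = {"name": "lib3.dll", "true_type": "doc", "content": [jpeg]}
inner_zip = {"name": "lib2.dll", "true_type": "zip", "content": [doc]}
outer_zip = {"name": "lib1.dll", "true_type": "zip", "content": [inner_zip]}
pst = {"name": "mail.pst", "true_type": "pst", "content": [outer_zip]}

# Which operation produces child objects for which detected type.
CONTAINER_OPS = {
    "pst": "extract e-mail attachment",
    "zip": "include archive contents",
    "doc": "uncover embedded data",
}


def refine(root):
    log = []
    queue = deque([root])
    while queue:
        f = queue.popleft()
        detected = f["true_type"]   # stands in for signature-based verification
        if detected in CONTAINER_OPS:
            log.append((f["name"], detected, CONTAINER_OPS[detected]))
            queue.extend(f["content"])      # children are processed in turn
        elif detected == "jpeg":
            log.append((f["name"], detected, "picture analysis"))
    return log


log = refine(pst)
for step in log:
    print(step)
```

Note that the misleading .dll extension plays no role in the sketch: each decision is based on the detected type only.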

 

Notes

 

X-Ways Forensics conveniently remembers for each and every file in the volume snapshot which refinement operations have already been applied to it, so that the file will not unnecessarily be processed again, which would lead to undesirable duplication of child objects, waste of time etc. X-Ways Forensics does not remember the individual suboptions of each operation (e.g. whether "Create previews of browser databases" was selected for the metadata extraction) and cannot catch up on these suboptions individually. The only operations that will be applied repeatedly are indexing and matching of hash values against the hash database. If for any reason you wish to apply certain other operations again to the same file (e.g. with different suboptions or after having updated the signature database for file type verification), you may reset a file to the state of "still to be processed" by volume snapshot refinement, by selecting it and pressing Ctrl+Del. This will also clear any computed skin color percentages, extracted metadata, hash values, hash matches, etc. However, this function does not remove any child objects from the volume snapshot. That would have to be done by the user separately, if desired, by hiding and removing them. Neither does this function delete any events that were created during prior refinement operations. Another keyboard shortcut, Ctrl+Shift+Del, allows you to remove matches with ordinary hash sets, FuzZyDoc hash sets, and PhotoDNA categories from selected files in the volume snapshot. These matches are not discarded otherwise, even if the hash sets are deleted from the hash database.
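
The per-file bookkeeping described in this note can be illustrated with a small sketch. The record layout, the operation names, and the functions pending_operations and reset_file are hypothetical. The sketch only mirrors the behavior stated above: already applied operations are not repeated, indexing and hash matching are the exceptions, and a reset keeps child objects.

```python
# Operations that the text says are applied repeatedly.
ALWAYS_REPEATED = {"indexing", "hash_matching"}


def pending_operations(record, selected):
    """Return the selected operations that still have to run for this file."""
    return [op for op in selected
            if op in ALWAYS_REPEATED or op not in record["applied"]]


def reset_file(record):
    """Rough analogue of Ctrl+Del: forget what was applied and computed."""
    record["applied"].clear()
    record["hash"] = None
    record["metadata"] = None
    # record["children"] is deliberately left alone, as are any events.


record = {
    "applied": {"hashing", "metadata", "indexing"},
    "hash": "ab12",
    "metadata": {"author": "n/a"},
    "children": ["embedded1.jpg"],
}
selected = ["hashing", "metadata", "indexing", "embedded_data"]

before = pending_operations(record, selected)
reset_file(record)
after = pending_operations(record, selected)
print(before, after, record["children"])
```

After the reset all four selected operations are pending again, while the existing child object is still present and could be duplicated if it is not removed first.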

 

Whether a file should be processed by volume snapshot refinement or not is decided only at the time when it is that file's turn, not when you start the operation. That means if you continue to work in the program while a volume snapshot refinement is ongoing, and alter or activate or deactivate filters or tag or untag files or exclude or include files, that may still affect the scope of the operation, depending on the chosen options and depending on whether the files that you tag/untag/exclude/include/... still have to be processed or not. So if for example you find out that the operation takes too much time, you can still make the filter more strict or untag certain very large files etc., without interrupting the process.
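
A minimal sketch of this late decision, assuming the scope is defined by tagging: the tag is checked when it is a file's turn, so untagging a file during the run still takes effect. The names refine, tagged, and process are hypothetical.

```python
def refine(file_ids, tagged, process):
    processed = []
    for file_id in file_ids:
        if file_id in tagged:        # checked when it is this file's turn
            process(file_id)
            processed.append(file_id)
    return processed


tagged = {0, 1, 2}


def process(file_id):
    # Simulates the user untagging file 2 while file 0 is being processed.
    if file_id == 0:
        tagged.discard(2)


processed = refine([0, 1, 2], tagged, process)
print(processed)
```

File 2 was tagged when the operation started but is no longer in scope when its turn comes, so it is left out.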

 

When volume snapshot refinement is in the stage of processing individual files, the progress percentage is simply the internal ID of the currently processed file divided by the total number of items in the volume snapshot. X-Ways Forensics doesn't know beforehand which files need a lot of time to process. Only when actually reading from a file is it decided what should be done with the file and discovered how much data is embedded etc. File type verification and potentially hash database matching may change the decision about what to do with the file, if anything at all. If an entire evidence object consists of just 1 file, e.g. if you added a single file to the case, then the progress percentage will not advance. The progress is 0% initially and 100% for a fraction of a second when done. The displayed percentage does not reflect the sub-progress within a given large file.
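
As a worked example of this calculation, the following sketch computes the percentage from an internal ID and an item count. It assumes that internal IDs start at 0, which is an assumption made for the example. The function name progress_percent is hypothetical.

```python
def progress_percent(current_internal_id, total_items):
    """Progress as internal ID divided by the number of items, in percent."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return 100.0 * current_internal_id / total_items


print(progress_percent(250, 1000))   # a quarter of the IDs have been passed
print(progress_percent(0, 1))        # single-file evidence object
```

With a single item the value stays at 0 for the entire duration of processing, no matter how large the file is, which matches the behavior described above.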

 

An unlabelled (but tooltipped) check box in the volume snapshot refinement dialog window can make X-Ways Forensics reveal which suboperation is currently applied to the currently processed file. A 3-letter abbreviation will be displayed with the following meaning:

Sig: file type verification

Hsh: hashing

Vid: capture sporadic still images from videos

Idx: preprocessing original file contents for indexing

Dec: text decoding for indexing

IdX: preprocessing decoded text for indexing

Emb: search for embedded data

PDN: PhotoDNA database matching

Pic: other picture analysis steps

Eml: e-mail extraction

Fuz: FuzZyDoc database matching

Met: metadata extraction

Enc: file format specific encryption test

Ent: entropy check

Arc: inclusion of files in archives into the volume snapshot

This may be helpful for educational reasons, to give users a better idea of how computationally expensive certain suboperations are and how much time could be saved by not selecting them if not absolutely necessary. It may also prove useful for debugging purposes. Whether this option may slow down processing on certain computers has not been tested.
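
If you note down the displayed abbreviations, for example to estimate which suboperations dominate the processing time, a simple lookup table like the following may help. The table contents are taken from the list above. The function name describe and the fallback text are hypothetical. Note that "Idx" and "IdX" differ only in case, so the lookup must be case-sensitive.

```python
SUBOPERATIONS = {
    "Sig": "file type verification",
    "Hsh": "hashing",
    "Vid": "capture sporadic still images from videos",
    "Idx": "preprocessing original file contents for indexing",
    "Dec": "text decoding for indexing",
    "IdX": "preprocessing decoded text for indexing",
    "Emb": "search for embedded data",
    "PDN": "PhotoDNA database matching",
    "Pic": "other picture analysis steps",
    "Eml": "e-mail extraction",
    "Fuz": "FuzZyDoc database matching",
    "Met": "metadata extraction",
    "Enc": "file format specific encryption test",
    "Ent": "entropy check",
    "Arc": "inclusion of files in archives into the volume snapshot",
}


def describe(abbreviation):
    # Case-sensitive on purpose: "Idx" and "IdX" are different suboperations.
    return SUBOPERATIONS.get(abbreviation, "unknown suboperation")


print(describe("Idx"))
print(describe("IdX"))
```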

 

Certain previously valid timestamps of files are output as events during various suboperations of the particularly thorough file system data structure search on NTFS, depending on the refinement option "Provide by-catch timestamps from various sources as events", which may also affect other operations whose primary purpose is not the retrieval of timestamps/events.