Hash Database

WinHex & X-Ways

Functionality only available with a forensic license. An internal hash database, once created, consists of 257 binary files with the extension .xhd (X-Ways Hash Database). The storage folder is selected in the General Options dialog. The internal hash database is organized for fast lookups when matching hash values. You decide which hash type the database is based on (MD5, SHA-1, SHA-256, ...), and you fill it with hash sets and hash values, either by creating hash sets in X-Ways Forensics yourself or by importing hash sets from other sources. The same hash database can be shared and used simultaneously by multiple users or instances if they select the same storage folder. However, it cannot be updated while other users or instances are using it.

It is possible to maintain two separate hash databases at the same time, based on either the same hash type or different hash types. This is useful, for example, if you receive hash sets from different sources with different hash types (e.g. some with MD5 and some with SHA-1 values) and wish to use them simultaneously. The second hash database may be stored on a different drive. This is useful if, for example, the primary hash database for general use is shared with colleagues on a network drive and you wish to create or import new hash sets into a locally stored second database, either for temporary use only or while the primary hash database is locked by other users.

Each hash value in the hash database belongs to one or more hash sets. Each hash set belongs to either the category irrelevant/known good/harmless or "notable"/known bad/malicious/relevant or can remain uncategorized (meaning "not decided yet" or "uncertain").

Hash values of files can be computed and matched against the hash database when refining the volume snapshot. The directory browser's optional columns "Hash Set" and "Category" will then reveal for each file to which hash sets and category it belongs, if any (which allows you to sort/filter by these aspects and easily ignore irrelevant files or focus on files you are looking for). If the hash value of a file is contained in multiple selected hash sets, the program will report all matching hash sets and indicate the category of one of the hash sets. It also checks whether the matching hash sets all belong to the same category, and if not, will show a warning.
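The matching behavior described above can be illustrated with a minimal Python sketch. This is not the internal .xhd lookup, whose layout is not documented here. The names build_index and match and the category strings are hypothetical, and the choice to report the category of the first matching hash set is an assumption, since the text only says that the category of one of the hash sets is indicated.

```python
from collections import defaultdict

# Hypothetical category labels, chosen for illustration only.
IRRELEVANT, NOTABLE, UNCATEGORIZED = "irrelevant", "notable", "uncategorized"

def build_index(hash_sets):
    """Map each hash value to the names of all hash sets that contain it.

    hash_sets: {set_name: (category, set_of_hex_hash_values)}
    """
    index = defaultdict(list)
    for name, (_category, values) in hash_sets.items():
        for value in values:
            index[value.lower()].append(name)
    return index

def match(file_hash, index, hash_sets):
    """Return (matching set names, reported category, conflict flag)."""
    names = index.get(file_hash.lower(), [])
    if not names:
        return [], None, False
    categories = {hash_sets[name][0] for name in names}
    # Assumption: report the category of the first matching hash set.
    reported = hash_sets[names[0]][0]
    # A conflict means the matching sets do not all share one category.
    return names, reported, len(categories) > 1
```

A caller would show a warning whenever the third element of the result is True, which corresponds to a hash value that is listed in hash sets of different categories.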

An optional second, separate hash database of block hash values (instead of normal file hash values), stored in a separate directory, allows you to search for incomplete remnants of known highly relevant files block-wise on other media.
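To make the idea of block-wise matching concrete, here is a minimal Python sketch. The block size of 512 bytes, the alignment of blocks to multiples of that size, the use of MD5, and the function names block_hashes and find_known_blocks are all assumptions made for illustration. They are not taken from the text above and may differ from what the program does.

```python
import hashlib

def block_hashes(data: bytes, block_size: int = 512, algorithm: str = "md5"):
    """Hash each full block of data; a trailing partial block is skipped."""
    hashes = []
    for offset in range(0, len(data) - block_size + 1, block_size):
        block = data[offset:offset + block_size]
        hashes.append(hashlib.new(algorithm, block).hexdigest())
    return hashes

def find_known_blocks(media: bytes, known: set, block_size: int = 512):
    """Return the offsets in media whose block hash appears in known."""
    return [
        i * block_size
        for i, digest in enumerate(block_hashes(media, block_size))
        if digest in known
    ]
```

The set passed as known would be built from the block hashes of the relevant files. Any offset returned then marks a block on the examined medium that is identical to a block of one of those files, even if the rest of the file is gone.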

Via the Tools menu you can invoke the dialog window to manage the active hash database(s), which allows you to

- start a fresh, blank hash database with the "Initialize" command (this discards the existing current database and gives you the opportunity to select a new hash type),

- view a list of the hash sets that are contained in the database,

- rename hash sets,

- merge hash sets (note that duplicate hash values in the resulting hash set are not removed immediately, but only the next time you add a hash set, and note that you are not warned if you are merging hash sets of different categories),

- toggle the category of hash sets,

- verify the integrity of the hash database,

- import selected hash set text files,

- import all the hash set text files in a certain folder and all its subfolders (ditto), optionally into a single internal hash set whose name you have to specify,

- export selected hash sets (for example if you wish to exchange individual hash sets with other examiners, not the whole database),

- and switch between the normal file hash database and the block hash database.

*NSRL RDS 2.x, HashKeeper, and ILook text files are supported, plus hash sets in the JSON/ODATA format layout as used by Project Vic (versions 1.0, 1.1 and 1.2) as found in the Hubstream Inbox. Another import format, and the only export format, is a very simple and universal hash set text file, where the first line is simply the hash type (e.g. "MD5") and all the following lines are simply the hash values as ASCII hex or (for SHA-1) in Base32 notation, one per line. The line break is 0x0D 0x0A.
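The simple hash set text format described above can be produced and read with a few lines of Python. This is a sketch only: the function names format_hash_set and parse_hash_set are hypothetical, and whether the last line must also end with a line break is an assumption.

```python
def format_hash_set(hash_type, hex_values) -> bytes:
    """Build the file content: hash type first, then one hash value per line."""
    lines = [hash_type, *hex_values]
    # Every line ends with CR LF (0x0D 0x0A), as the format requires.
    return "".join(line + "\r\n" for line in lines).encode("ascii")

def parse_hash_set(data: bytes):
    """Return (hash_type, list_of_hash_values) from file content."""
    lines = [line for line in data.decode("ascii").split("\r\n") if line]
    if not lines:
        raise ValueError("empty hash set file")
    return lines[0], lines[1:]
```

Working on bytes rather than text avoids any automatic line-ending translation by the platform, so the 0x0D 0x0A line break is written exactly as specified.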

When importing hash values from NSRL RDS, if you categorize the hash set as irrelevant, hash values marked as special or malicious will be ignored (not imported). If you categorize the hash set as notable, only hash values that are marked as malicious will be imported. If you set the hash set to the uncategorized state, only hash values that are marked as special or have an unknown flag will be imported. If you wish to import all hash values, you can import the same NSRL hash set file three times, with different categorizations, and all hash values will end up in suitably categorized internal hash sets.
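The filtering rules above can be summarized as a small decision function. The following Python sketch is an illustration, not the program's import code. The function name should_import and the category strings are hypothetical, and the flags "s" (special) and "m" (malicious) are treated as the only recognized flags. The text does not say whether a record with an unrecognized flag is also skipped in an irrelevant import, so the sketch follows the literal wording and skips only special and malicious records there.

```python
def should_import(special_code: str, category: str) -> bool:
    """Decide whether one NSRL record is imported for the chosen category.

    special_code: the record's flag, "" for an ordinary record.
    category: "irrelevant", "notable" or "uncategorized" (hypothetical labels).
    """
    flag = special_code.strip().lower()
    if category == "irrelevant":
        # Special and malicious records are ignored.
        return flag not in ("s", "m")
    if category == "notable":
        # Only malicious records are imported.
        return flag == "m"
    if category == "uncategorized":
        # Only special records or records with an unknown flag are imported.
        is_unknown = flag not in ("", "s", "m")
        return flag == "s" or is_unknown
    raise ValueError(f"unknown category: {category!r}")
```

Running the same record list through this function three times, once per category, mirrors the suggestion of importing the same NSRL file three times with different categorizations.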

The Include in Hash Database command in the directory browser's context menu allows you to create your own hash sets in any of the internal hash databases. Whenever importing/creating hash sets, duplicate hash values within the same hash set will be eliminated. When importing the NSRL RDS hash database, X-Ways Forensics checks for records with the flags "s" (special) and "m" (malicious) so that these hash values are not erroneously included in the same internal hash set that should be categorized as irrelevant. The hash database supports up to 65,535 hash sets.

Duplicate hash values that are already contained in the hash database can optionally be either removed from a newly created or newly imported hash set or from all existing hash sets, to keep the hash database more compact/less redundant if so desired.
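The two deduplication options can be pictured as set operations. This Python sketch models hash sets as plain sets of strings; the function names remove_from_new and remove_from_existing are hypothetical and say nothing about how the database stores or rewrites its files.

```python
def remove_from_new(new_values: set, existing: dict) -> set:
    """Option 1: drop values from the new set that any existing set already holds."""
    already_known = set().union(*existing.values())
    return new_values - already_known

def remove_from_existing(new_values: set, existing: dict) -> dict:
    """Option 2: keep the new set whole and drop its values from every existing set."""
    return {name: values - new_values for name, values in existing.items()}
```

With option 1 the existing hash sets stay untouched and the new set shrinks; with option 2 the new set stays complete and the older sets shrink. Either way each hash value is stored only once.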

There is a way to efficiently delete individual hash values from an existing hash set, by importing a hash set file (simple 1-column format, 1 hash value per line), where the hash values to delete must be listed first and must be prepended with a minus sign ("-"). The file must have the same name as the existing hash set in the database that you wish to update (additional filename extension allowed).
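A file for this kind of update could be generated as follows. This Python sketch makes several assumptions that are not confirmed by the text: that the file starts with the hash type line like the simple hash set text format, that lines end with 0x0D 0x0A, and that hash values to add may follow the deletions. The function name build_update_file is hypothetical.

```python
def build_update_file(hash_type, to_delete, to_add=()) -> bytes:
    """Build file content that lists deletions first, each prefixed with "-"."""
    lines = [hash_type]                          # assumed header line
    lines += ["-" + value for value in to_delete]  # deletions must come first
    lines += list(to_add)                        # assumed: additions may follow
    return "".join(line + "\r\n" for line in lines).encode("ascii")
```

The resulting bytes would be saved under the same name as the hash set to update, for example a hash set named "Case 12" could be updated from a file named "Case 12.txt", since an additional filename extension is allowed.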

There is an option to unload the hash database, if it is currently loaded, at the moment when the last open data window is closed. This saves main memory and specifically allows other concurrent users or instances to change the hash database.

PhotoDNA

FuzZyDoc
