Search Term Lists

WinHex & X-Ways


 

The search term list is displayed in the Case Data window when in search hit viewing mode (after clicking the button with the binoculars and the four horizontal lines). It contains all the search terms ever searched for in the case unless deleted by the user. Via the context menu of the search term list, the search terms can optionally be sorted alphabetically in ascending order or by the listed search hit count in descending order, which makes it easier to locate a certain search term in lengthy lists.

 

Selecting search terms in the search term list and then clicking the Enter button allows you to list all the search hits for these search terms in the currently selected path, subject to filters, in the search hit list. You can select multiple search terms by holding the Shift or Ctrl key while clicking them. You may press the Del key to delete selected search terms and all their search hits permanently.

 

To reduce a search hit list to a list of unique files that contain at least one search hit, check "List 1 hit per item only" and then click Enter. This can be very useful if you are going to review all such files manually, as it ensures that each such file is listed only once. Do not assume that the one search hit listed for each file is somehow "the most useful" one or, if multiple search terms are selected, that it is a hit for the search term that you consider more important. The reduction is non-destructive: to bring back the original, complete search hit list, simply uncheck the box and click the Enter button again.

 

The option to list 1 search hit per item only does not filter out search hits in slack space or in uninitialized parts of files (in the part exceeding the so-called valid data length). This is useful because the slack of a file is typically not related to the contents of that file, so any search hits in these special areas would likely have a totally different context than search hits in the logical portion of the file (and especially search hits in the uninitialized part of a file may reside in data from various different sources) and thus they need to be reviewed additionally. Please note that it is still necessary to unselect the "1 hit per item" option to separately check out search hits in conglomerates such as pagefile.sys and the virtual "Free space" file, which contain data from totally different sources. The "1 hit per item" option is most useful for documents, for which you can often tell after one quick look in Preview mode whether that particular file is relevant or not.
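The reduction described above can be pictured with a small conceptual model. This is only a sketch and not how X-Ways Forensics is implemented: the tuple layout (file, term, offset, region) and the region labels "logical", "slack" and "uninit" are assumptions made for illustration. The sketch keeps the first hit it encounters in the logical portion of each file, which is an arbitrary choice, in line with the warning that the listed hit is not necessarily the most useful one.

```python
def one_hit_per_item(hits):
    """Reduce a hit list to one hit per file (conceptual model only).

    hits: list of (file, term, offset, region) tuples, where region is a
    hypothetical label: "logical", "slack" or "uninit".
    """
    seen_files = set()
    reduced = []
    for hit in hits:
        file, _term, _offset, region = hit
        if region != "logical":
            # Hits in slack or uninitialized areas are not filtered out.
            reduced.append(hit)
        elif file not in seen_files:
            # Keep only the first logical hit encountered for this file.
            seen_files.add(file)
            reduced.append(hit)
    return reduced


hits = [
    ("a.doc", "invoice", 10, "logical"),
    ("a.doc", "payment", 20, "logical"),
    ("a.doc", "invoice", 5000, "slack"),
    ("b.doc", "invoice", 1, "logical"),
]
reduced = one_hit_per_item(hits)
```

The original list is left untouched, which mirrors the non-destructive nature of the option.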

 

It is possible to see (and, via the Export list command in the context menu, copy) the hit counts for selected search terms in the search term list. These hit counts are based on the current settings for the search hit list that is on the screen and take into account all filters, the explored path, any active AND combination, etc. They are the numbers of hits that are actually listed, not the numbers of hits that have been recorded/saved. To see the total numbers of hits, deactivate any filter and select all search terms. Note that the "List 1 hit per item only" option also functions like a filter for search hits.
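The difference between listed and recorded hit counts can be illustrated as follows. This is a conceptual sketch only: the (file, term, offset) tuples are an assumed representation, and the list comprehension merely stands in for whatever filters or explored path are active.

```python
from collections import Counter


def hit_counts(hits):
    """Count hits per search term in the given hit list."""
    return Counter(term for _file, term, _offset in hits)


# All hits that have been recorded/saved (hypothetical sample data).
recorded = [("f1", "A", 10), ("f1", "A", 90), ("f1", "B", 50), ("f2", "A", 5)]

# Stand-in for an active filter: only hits in file "f1" are listed.
listed = [hit for hit in recorded if hit[0] == "f1"]

total_counts = hit_counts(recorded)   # counts with all filters deactivated
listed_counts = hit_counts(listed)    # counts as they would appear on screen
```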

 

You can rename search terms with a command in the context menu of the search term list, for example so that lengthy GREP expressions are replaced with a more concise and easier-to-understand name such as "IP addresses", "Credit card numbers", "E-mail addresses" etc.

 

Hit count in search term lists

 

There are two ways to logically combine multiple search terms with Boolean operators:

 

1) By default, multiple selected search terms are combined with a logical OR. To force a search term, select it and press the "+" key. To exclude a search term, select it and press the "-" key. To return a search term to normal OR combination, press the Esc key. All of these actions are also available in the context menu of the search term list. The examples below describe the effect of selecting the search terms A and B depending on their "+" or "-" status.

 

A     B     = search hits for A and search hits for B that occur in any files (normal OR combination)

+A    B     = search hits for A and search hits for B that occur in files that contain A

+A    +B    = search hits for A and search hits for B that occur in files that contain both A and B (AND combination)

A     -B    = search hits for A that occur in files that do not contain B
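The four cases above can be reproduced with a small conceptual model. It is a sketch under assumptions, not the program's actual logic: hits are assumed to be (file, term, offset) tuples, and the selection is assumed to be a dictionary mapping each selected term to "+", "-" or "" (normal OR status).

```python
from collections import defaultdict


def combine(hits, selection):
    """List hits according to the "+"/"-" status of the selected terms.

    hits: list of (file, term, offset) tuples.
    selection: dict mapping each selected term to "+", "-" or "".
    """
    # Which terms occur in which file?
    terms_in_file = defaultdict(set)
    for file, term, _offset in hits:
        terms_in_file[file].add(term)

    forced = {t for t, status in selection.items() if status == "+"}
    excluded = {t for t, status in selection.items() if status == "-"}

    listed = []
    for file, term, offset in hits:
        if term not in selection or term in excluded:
            continue  # unselected or excluded terms are never listed
        present = terms_in_file[file]
        # The file must contain every forced term and no excluded term.
        if forced <= present and not (excluded & present):
            listed.append((file, term, offset))
    return listed


hits = [("f1", "A", 10), ("f1", "B", 50), ("f2", "A", 5), ("f3", "B", 7)]

or_result = combine(hits, {"A": "", "B": ""})
force_a = combine(hits, {"A": "+", "B": ""})
and_result = combine(hits, {"A": "+", "B": "+"})
exclude_b = combine(hits, {"A": "", "B": "-"})
```

With this sample data, file f1 contains both terms, f2 contains only A, and f3 contains only B, so each of the four combinations yields a different list.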

 

2) For a logical AND combination, if the search terms are not marked with "+" or "-", you may also use the small scrollbar that appears when you select multiple search terms. It allows you to see only search hits in files that contain all the selected search terms at the same time. You can combine up to 7 search terms that way. If you select more than 2 search terms, you also have the option to be less strict and only specify a minimum number of different search terms that must occur in the same file. For example, you can require that of the search terms A, B, C and D any combination of two of them in the same file is sufficient, such as A and B, or A and C, or B and D, etc. (fuzzy/flexible AND combination).

 

In addition to the "Min. x" option, the search term list also offers a "Max. 1" option when multiple search terms are selected that are not forced with a + or excluded with a -. "Max. 1" will list search hits only if they are contained in files that do not contain any of the other selected search terms. For example, to get the same results for 3 search terms without this option, you would have to list search hits for search term A while excluding B and C, then list search hits for B while excluding A and C, and then list search hits for C while excluding A and B, which of course is not as elegant and does not show you all such singular search hits at the same time.
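Both the "Min. x" and the "Max. 1" option can be thought of as conditions on the number of different selected search terms that occur in a file. The following sketch models that idea. It is an illustration under assumptions, not the program's implementation: the (file, term, offset) tuples and the parameter names min_terms and max_terms are hypothetical, and the limit of 7 search terms is not enforced.

```python
from collections import defaultdict


def filter_by_term_count(hits, selected, min_terms=1, max_terms=None):
    """List hits in files whose number of distinct selected terms is in range.

    hits: list of (file, term, offset) tuples.
    selected: the selected search terms.
    """
    selected = set(selected)
    terms_in_file = defaultdict(set)
    for file, term, _offset in hits:
        if term in selected:
            terms_in_file[file].add(term)

    listed = []
    for file, term, offset in hits:
        if term not in selected:
            continue
        count = len(terms_in_file[file])
        if count >= min_terms and (max_terms is None or count <= max_terms):
            listed.append((file, term, offset))
    return listed


hits = [
    ("f1", "A", 1), ("f1", "B", 2),
    ("f2", "A", 3),
    ("f3", "A", 4), ("f3", "B", 5), ("f3", "C", 6),
    ("f4", "C", 7),
]
terms = ["A", "B", "C"]

strict_and = filter_by_term_count(hits, terms, min_terms=len(terms))
min_two = filter_by_term_count(hits, terms, min_terms=2)
max_one = filter_by_term_count(hits, terms, max_terms=1)
```

Setting min_terms to the number of selected terms corresponds to the strict AND combination, a smaller value to the fuzzy/flexible AND combination, and max_terms=1 to "Max. 1".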

 

When 2 search terms are selected in the search term list and combined with a logical AND (using either of the two available methods), you can additionally require that search hits must be "NEAR" each other to be listed, to find more likely relevant combinations of both search terms in the same file, exactly like with a proximity search. The maximum distance between the search hits that constitutes "NEAR" can be defined by the user in bytes. A NEAR combination may also be applied to more than 2 selected search terms. The effect is that a search hit is listed only if *any* of the other selected search terms occurs nearby.
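A NEAR combination can be sketched as a distance check between hits in the same file. This is a conceptual model only: hits are assumed to be (file, term, offset) tuples, and the distance is assumed to be measured between the start offsets of two hits, which the text above does not specify.

```python
def near(hits, selected, max_distance):
    """List hits that have a hit for another selected term nearby.

    hits: list of (file, term, offset) tuples.
    max_distance: maximum distance in bytes (assumed: between start offsets).
    """
    selected = set(selected)
    relevant = [hit for hit in hits if hit[1] in selected]
    listed = []
    for file, term, offset in relevant:
        # Is there a hit for any *other* selected term close by in this file?
        if any(
            other_file == file
            and other_term != term
            and abs(other_offset - offset) <= max_distance
            for other_file, other_term, other_offset in relevant
        ):
            listed.append((file, term, offset))
    return listed


hits = [
    ("f1", "A", 100), ("f1", "B", 150), ("f1", "A", 9000),
    ("f2", "A", 10), ("f2", "C", 5000),
]
near_two = near(hits, ["A", "B"], max_distance=100)
near_three = near(hits, ["A", "B", "C"], max_distance=100)
```

In this sample, only the hits at offsets 100 and 150 in f1 are within 100 bytes of a hit for a different term, so both calls list just those two hits.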

 

This paragraph quoted from wikipedia.org: The basic, linguistic, assumption is that the proximity of the words in a document implies a relationship between the words. Given that authors of documents try to formulate sentences which contain a single idea, or cluster related ideas within neighboring sentences or organized into paragraphs, there is an inherent, relatively high, probability within the document structure that words used together are related. Whereas, when two words are on the opposite ends of a book, the probability there is a relationship between the words is relatively weak. By limiting search results to only include matches where the words are within the specified maximum proximity, or distance, the search results are assumed to be of higher relevance than the matches where the words are scattered.

 

What's more, the search term list offers a "NOT NEAR" option (abbreviated NTNR) in addition to "NEAR". With 2 selected search terms, NTNR will ensure that only search hits are listed that are not located in the vicinity of any search hits of the respective other search term. With more than 2 selected search terms, the results are currently undefined.
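For exactly 2 search terms, NTNR can be pictured as the inverse distance check. As before, this is only a sketch under assumptions: hits are (file, term, offset) tuples, distance is measured between start offsets, and the sketch does not model whether both terms must additionally occur in the same file.

```python
def not_near(hits, term_a, term_b, max_distance):
    """List hits for term_a/term_b with no hit for the other term nearby.

    hits: list of (file, term, offset) tuples.
    max_distance: vicinity in bytes (assumed: between start offsets).
    """
    pair = {term_a, term_b}
    relevant = [hit for hit in hits if hit[1] in pair]
    listed = []
    for file, term, offset in relevant:
        other_nearby = any(
            other_file == file
            and other_term != term
            and abs(other_offset - offset) <= max_distance
            for other_file, other_term, other_offset in relevant
        )
        if not other_nearby:
            listed.append((file, term, offset))
    return listed


hits = [
    ("f1", "A", 100), ("f1", "B", 150), ("f1", "A", 9000),
    ("f2", "A", 10), ("f2", "C", 5000),
]
isolated = not_near(hits, "A", "B", max_distance=100)
```

The function deliberately accepts only two terms, since the behavior for more than 2 selected search terms is described as undefined.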