Duplicate Detection

WinHex & X-Ways


If you would like to review files with identical contents only once, and if filenames, timestamps, deletion state and other file system metadata are of secondary importance at first, you can use the "Find duplicates in list" command in the directory browser context menu to identify duplicates among the currently listed (listed, not selected!) files, based on hash values (if computed) or other criteria. Optionally, duplicates can be excluded in the volume snapshot right away. In that case exactly one file in each group of identical files is not excluded. Each group of identical files can optionally be associated with a unique report table, so that you can easily view the whole group at once with a filter, even if its files are spread across several evidence objects.
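The grouping described above can be sketched roughly as follows. This is only an illustration of the idea, not how X-Ways Forensics is implemented: the `ListedFile` class, its fields and the report table naming scheme are hypothetical, and which file of a group is kept is simplified here to "the first one listed".

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListedFile:
    # Hypothetical stand-in for one row in the directory browser.
    name: str
    hash_value: Optional[str]           # None if no hash has been computed
    excluded: bool = False
    report_table: Optional[str] = None


def exclude_duplicates(listed: List[ListedFile]) -> List[List[ListedFile]]:
    """Group listed files by hash value and exclude all but one file per group."""
    groups = defaultdict(list)
    for f in listed:
        if f.hash_value is not None:    # files without a hash cannot be compared
            groups[f.hash_value].append(f)

    duplicate_groups = [g for g in groups.values() if len(g) > 1]
    for number, group in enumerate(duplicate_groups, start=1):
        for f in group:
            # One report table per group (the naming scheme is an assumption).
            f.report_table = f"Duplicate group {number}"
        for f in group[1:]:
            f.excluded = True           # exactly one file per group stays visible
    return duplicate_groups
```

Files without a computed hash value are simply left alone in this sketch, which mirrors the "if computed" condition in the text.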

When in doubt, this function keeps existing (not deleted) files when excluding duplicates, and among deleted files it prefers those that were found via file system data structures over those found by file header signature search. Also, when in doubt, the copy of a file whose owner is known is kept. Optional special rules: Identical e-mails with different attachments (child objects) are marked as duplicates, but not excluded. Identical attachments (child objects) are marked as duplicates, but are excluded only indirectly, namely if they are part of identical e-mails and those e-mails are excluded as well. This facilitates the review and avoids the situation that the parent (the e-mail) of one e-mail+attachment family and the child (the attachment) of another family are excluded.
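One way to picture the preference rules above is as a sort key over the files of one duplicate group. The sketch below is an assumption-laden illustration: the `Candidate` class and its fields are hypothetical, and the relative precedence of the three criteria is a guess, since the text does not say which rule wins when they conflict. The special rules for e-mails and attachments are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    # Hypothetical description of one file within a group of identical files.
    name: str
    deleted: bool       # True if the file is deleted
    carved: bool        # True if found by signature search, not via the file system
    owner_known: bool   # True if the owner of the file is known


def keep_priority(f: Candidate):
    # Tuples compare element by element, and False sorts before True, so the
    # smallest key is: existing, found via file system structures, owner known.
    # The order of the three criteria is an assumption made for this sketch.
    return (f.deleted, f.carved, not f.owner_known)


def choose_file_to_keep(group: List[Candidate]) -> Candidate:
    """Return the one file of a duplicate group that is not excluded."""
    return min(group, key=keep_priority)
```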

If you later find relevant files for which there were duplicates, and you are now interested in those duplicates as well (e.g. their filenames, paths or timestamps), you can create a hash set of the relevant file that you found, in order to identify all duplicates conveniently and automatically by matching the hash values of all files against that special hash set and then using the hash set filter. Alternatively, you can use the filter of the Hash column directly.
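Conceptually, matching against such a special hash set is a set membership test. The following minimal sketch assumes that files are represented as hypothetical `(path, hash_value)` tuples. It only illustrates the principle and does not use any X-Ways interface.

```python
from typing import Iterable, List, Optional, Set, Tuple

FileEntry = Tuple[str, Optional[str]]   # (path, hash value or None)


def build_hash_set(relevant_files: Iterable[FileEntry]) -> Set[str]:
    """Collect the hash values of the files that turned out to be relevant."""
    return {h for _, h in relevant_files if h}


def match_against_hash_set(all_files: Iterable[FileEntry],
                           hash_set: Set[str]) -> List[FileEntry]:
    """Return every file whose hash value is in the hash set, duplicates included."""
    return [(path, h) for path, h in all_files if h in hash_set]
```

The result contains the relevant file itself as well as all of its duplicates, so their paths can be compared side by side.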

Pairs of duplicates in the same volume snapshot can be optionally linked as so-called related items, so that it's easy to navigate from one such file to at least one duplicate. However, that does not work across evidence object boundaries. Marking the files as duplicates in the Description column is optional.

Alternatively, you may exclude files simply based on identical names instead of identical hash values. This is a case-insensitive comparison and should be used only if you know what you are doing, as it does not compare the file contents at all. It can be useful, for example, if you wish to get rid of multiple copies of the same files found in backups and do not need to keep different versions of these files. If, for example, you sort by last modification date in descending order prior to the comparison, this will ensure that the newest version of each file is kept and all older versions are excluded. Files with identical names are not marked as duplicates in the Attr. column.
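The effect of sorting before a name-based comparison can be illustrated with a short sketch. It assumes hypothetical `(name, last_modified)` tuples and uses `str.casefold()` for the case-insensitive comparison, which may differ in detail from the comparison the program performs. The first file seen under each name is kept, so the sort order decides which version survives.

```python
from datetime import datetime
from typing import List, Tuple

NamedFile = Tuple[str, datetime]    # (filename, last modification time)


def exclude_by_name(files: List[NamedFile]) -> Tuple[List[NamedFile], List[NamedFile]]:
    """Keep the first file per case-insensitive name; exclude later ones.

    The contents of the files are never compared.
    """
    seen = set()
    kept, excluded = [], []
    for name, modified in files:
        key = name.casefold()
        if key in seen:
            excluded.append((name, modified))
        else:
            seen.add(key)
            kept.append((name, modified))
    return kept, excluded


files = [
    ("Report.docx", datetime(2020, 1, 1)),
    ("report.DOCX", datetime(2021, 6, 1)),
    ("notes.txt", datetime(2019, 3, 3)),
]
# Newest first, so that the newest version of each name is the one kept.
files.sort(key=lambda f: f[1], reverse=True)
kept, excluded = exclude_by_name(files)
```

With this input, `kept` holds the 2021 version of the report plus `notes.txt`, and `excluded` holds the 2020 version.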

If you have access to PhotoDNA in X-Ways Forensics, you may also identify and exclude duplicate pictures using PhotoDNA. All duplicates will be marked as "duplicates found" in the Attr. column, and all except one will be excluded. When in doubt, deleted files and pictures with a lower resolution will be excluded, and existing files and pictures with a higher resolution will be kept. Please note that the comparison of PhotoDNA hash values is a potentially time-consuming operation if many pictures are listed in the directory browser, much more so than the comparison of conventional hash values. However, you can abort the comparison at any time. This operation requires that PhotoDNA hash values have been computed beforehand, using Specialist | Refine Volume Snapshot | Picture processing | Compute PhotoDNA hash values. It is useful, for example, for law enforcement agencies that wish to create PhotoDNA hash sets of unique pictures only and for that purpose maintain a lawful collection of incriminating pictures without duplicates. The strictness of the picture comparison is the same as set in the Specialist | Refine Volume Snapshot | Picture processing dialog window for matching against the PhotoDNA hash database.
