SOPIA Suite

SOPIA Suite Help Files


DuFF.py

Description

DuFF is a duplicate file finder script. It scans a directory and computes the MD5 and SHA-1 hashes of each file. It then looks for identical hashes in that list and reports the duplicate files and their locations.
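
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual DuFF source: the function names hash_file and find_duplicates are hypothetical, and the sketch assumes that subfolders are scanned as well and that two files count as duplicates only when both their MD5 and SHA-1 hashes match.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at path, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def find_duplicates(directory):
    """Group files under directory by hash pair and keep groups of two or more."""
    by_hash = defaultdict(list)
    # Assumption: the scan is recursive; the real script may only look at one level.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            by_hash[hash_file(path)].append(path)
    # Only hash pairs shared by more than one file indicate duplicates.
    return {key: sorted(paths) for key, paths in by_hash.items() if len(paths) > 1}
```

Calling find_duplicates on a folder returns a dictionary that maps each (MD5, SHA-1) pair to the list of files sharing it. Files with unique content do not appear in the result.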

Dependencies

  • Python 2.7 installation
  • Windows OS

How To Use




Step 1:

Start SopiaSuite and select DuFF from the menu.




Step 2:

Browse to the directory you want to scan for duplicate files.



Step 3:

Once DuFF finishes scanning, a pop-up appears.



Step 4:

DuFF creates a file called 'outPut.txt' in the same folder as duff.py. You can view this file in the terminal if you wish. If you click No, the text file is still created.
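
As a rough illustration of this step, the sketch below builds the path of outPut.txt next to a script and writes a simple report to it. The function names default_output_path and write_report are hypothetical, and the report layout (one hash per line, followed by the indented paths of the files that share it) is an assumption, not the format DuFF actually uses.

```python
import os


def default_output_path(script_file):
    """Return the path of outPut.txt in the same folder as the given script."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, "outPut.txt")


def write_report(duplicate_groups, out_path):
    """Write each hash on its own line, followed by the indented file paths."""
    with open(out_path, "w") as handle:
        # Assumed layout: sorted by hash so the report is stable between runs.
        for digest, paths in sorted(duplicate_groups.items()):
            handle.write("%s\n" % digest)
            for path in paths:
                handle.write("    %s\n" % path)
```

Inside a script, default_output_path(__file__) would give the location described in this step.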



Step 5:

If you browse to the directory where duff.py is located, you will see the newly created outPut.txt.



Step 6:

If you open outPut.txt, you'll see that DuFF has printed a list of MD5 hashes and the files that match each hash, highlighted in yellow. The duplicate files are listed below them, highlighted in red.



Step 7:

The screenshot below shows the three files that were in the directory we scanned earlier. As you can see, DuFF correctly recognised that File1.txt and File3.txt were duplicates.

About

DuFF was developed as a way of identifying duplicate files, because this functionality was missing from the standard EnCase and FTK software. The name DuFF is a play on "DUplicate File Finder", and the logo is a homage to The Simpsons, in which Homer drinks Duff beer.



© 2014 Simon McCabe, SopiaSuite Help Files.