7z Format

7-Zip 18


7z is a new archive format, providing a high compression ratio.

The main features of the 7z format:

Open architecture
High compression ratio
Strong AES-256 encryption
Ability to use any compression, conversion or encryption method
Supports files with sizes up to 16000000000 GB
Unicode file names
Solid compression
Archive headers compression

7z has an open architecture, so it can support any new compression methods.

The following methods are currently integrated into 7z:

Method	Description
LZMA	Improved and optimized version of LZ77 algorithm
LZMA2	LZMA-based compression method. It provides better multithreading support than LZMA
PPMD	Dmitry Shkarin's PPMdH with small changes
BCJ	Converter for 32-bit x86 executables
BCJ2	Converter for 32-bit x86 executables
BZip2	Standard BWT algorithm
Deflate	Standard LZ77-based algorithm

LZMA is the default, general-purpose compression method of the 7z format. The main features of the LZMA method:

High compression ratio
Variable dictionary size (up to 4 GB)
Compression speed: about 1 MB/s on 2 GHz CPU
Decompression speed: about 10-20 MB/s on 2 GHz CPU
Small memory requirement for decompression (depends on the dictionary size)
Small code size for decompression: about 5 KB
Supports multi-threading and P4's hyper-threading
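
The effect of the variable dictionary size can be tried with a short sketch. It uses the lzma module from Python's standard library, which is an assumption made for illustration: that module writes .xz streams rather than 7z archives, and the helper name roundtrip is hypothetical.

```python
import lzma

def roundtrip(data, dict_size):
    """Compress data with LZMA2 using the given dictionary size (in bytes),
    then decompress it again. Returns (compressed_size, restored_data)."""
    # A custom filter chain lets the caller pick the dictionary size.
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    packed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
    # The .xz container records the filter chain, so decompress() needs no options.
    return len(packed), lzma.decompress(packed)

if __name__ == "__main__":
    sample = b"7z is an archive format with a high compression ratio. " * 2000
    for size in (1 << 16, 1 << 20, 1 << 24):
        packed_len, restored = roundtrip(sample, size)
        assert restored == sample
        print(f"dict_size={size:>9} bytes -> {packed_len} bytes compressed")
```

A larger dictionary lets the compressor find matches further back in the data, at the cost of more memory for both compression and decompression.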

The LZMA compression algorithm is very suitable for embedded applications. If you want to use LZMA code, you can ask for consultation, custom code programming, and required developer licenses at

www.7-zip.org/support.html

AES encryption

7-Zip supports encryption with the AES-256 algorithm. This algorithm uses a cipher key with a length of 256 bits. To create the key, 7-Zip uses a derivation function based on the SHA-256 hash algorithm. A key derivation function produces a derived key from a text password defined by the user. To increase the cost of an exhaustive search for passwords, 7-Zip uses a large number of iterations to produce the cipher key from the text password.
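
The idea of an iterated SHA-256 key derivation can be illustrated with a minimal sketch. The details here are assumptions, not taken from this page: the function name derive_key, the UTF-16LE password encoding, the 8-byte little-endian counter, the optional salt, and the default of 2^19 iterations are chosen for illustration and may differ from what 7-Zip actually does.

```python
import hashlib

def derive_key(password, salt=b"", cycles_power=19):
    """Derive a 256-bit key by feeding the password into one SHA-256
    context 2**cycles_power times (illustrative layout, see assumptions)."""
    hasher = hashlib.sha256()
    encoded = password.encode("utf-16-le")  # assumed password encoding
    for counter in range(1 << cycles_power):
        hasher.update(salt)
        hasher.update(encoded)
        # An iteration counter makes every round's input distinct.
        hasher.update(counter.to_bytes(8, "little"))
    return hasher.digest()  # 32 bytes, usable as an AES-256 key

if __name__ == "__main__":
    # A small cycles_power keeps the demonstration fast.
    print(derive_key("secret", cycles_power=10).hex())
```

Each extra unit of cycles_power doubles the work per password guess, for the legitimate user and the attacker alike, which is why the iteration count raises the cost of an exhaustive search.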

Tips for selecting password length

Here is an estimate of the time required for an exhaustive password search attack, when the password is a random sequence of lowercase Latin letters.

The most expensive step in a password search attack is the SHA-256 calculation. Special SHA-256 hardware or a GPU can be used to accelerate the attack. A modern GPU can provide about 10 times the SHA-256 performance of a modern CPU, and special SHA-256 hardware can provide about 20 times the performance of a GPU.

We assume that a single user with a budget of about $2000 (for GPUs) can check 10000 passwords per second, and that an organization with a budget of about 10^9 USD (one thousand million US dollars) can check 3 * 10^12 passwords per second. We also assume that the processor in use doubles its performance every two years. Each additional Latin letter multiplies the number of candidate passwords by 26, which is about 4.7 doublings, so each additional letter of a long password adds about 9 years to an exhaustive key search attack.

The result is this estimate of the time to succeed in an attack:

Password Length	Single User Attack	Organization Attack
1	1 s	1 s
2	1 s	1 s
3	2 s	1 s
4	1 min	1 s
5	30 min	1 s
6	12 hours	1 s
7	14 days	1 s
8	1 year	1 s
9	10 years	2 s
10	19 years	1 min
11	28 years	30 min
12	37 years	12 hours
13	46 years	14 days
14	55 years	1 year
15	64 years	10 years
16	73 years	19 years
17	82 years	28 years
18	91 years	37 years
19	100 years	46 years
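
The arithmetic behind these assumptions can be checked with a short sketch. The function names are hypothetical, and the model is deliberately simple: it counts every password of exactly the given length and uses a fixed checking rate. The table's entries are rounded estimates, so the printed values need not match them exactly.

```python
import math

ALPHABET = 26  # lowercase Latin letters

def keyspace(length):
    """Number of passwords of exactly this length."""
    return ALPHABET ** length

def search_seconds(length, rate):
    """Seconds to try every password of this length at a fixed rate
    (passwords per second), ignoring hardware improvements."""
    return keyspace(length) / rate

def years_per_extra_letter(doubling_years=2.0, alphabet=ALPHABET):
    """Years that one more letter adds once hardware growth dominates:
    a factor of `alphabet` takes log2(alphabet) performance doublings."""
    return doubling_years * math.log2(alphabet)

if __name__ == "__main__":
    single_user = 10_000       # passwords per second, from the text
    organization = 3 * 10**12  # passwords per second, from the text
    for n in (3, 4, 8, 9):
        print(n, search_seconds(n, single_user), search_seconds(n, organization))
    print("years per extra letter:", round(years_per_extra_letter(), 1))
```

With a doubling period of two years, 2 * log2(26) is roughly 9.4, which is where the figure of about 9 years per additional letter comes from.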
