UseUnicodeAsNecessary Property

DotNetZip

Ionic Zip Library v1.9.1.6 UseUnicodeAsNecessary Property
Indicates whether to encode entry filenames and entry comments using Unicode (UTF-8).
Declaration Syntax
C#
[ObsoleteAttribute("Beginning with v1.9.1.6 of DotNetZip, this property is obsolete. It will be removed in a future version of the library. Use AlternateEncoding and AlternateEncodingUsage instead.")]
public bool UseUnicodeAsNecessary { get; set; }

Visual Basic
<ObsoleteAttribute("Beginning with v1.9.1.6 of DotNetZip, this property is obsolete. It will be removed in a future version of the library. Use AlternateEncoding and AlternateEncodingUsage instead.")> _
Public Property UseUnicodeAsNecessary As Boolean
	Get
	Set

Visual C++
[ObsoleteAttribute(L"Beginning with v1.9.1.6 of DotNetZip, this property is obsolete. It will be removed in a future version of the library. Use AlternateEncoding and AlternateEncodingUsage instead.")]
public:
property bool UseUnicodeAsNecessary {
	bool get ();
	void set (bool value);
}
Remarks

The PKWare zip specification provides for encoding file names and file comments in either the IBM437 code page, or in UTF-8. This flag selects the encoding according to that specification. By default, this flag is false, and filenames and comments are encoded into the zip file in the IBM437 codepage. Setting this flag to true will specify that filenames and comments that cannot be encoded with IBM437 will be encoded with UTF-8.
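Because this property is marked obsolete in favor of AlternateEncoding and AlternateEncodingUsage, new code would normally express the same choice through those two properties. The following is a minimal sketch, assuming that UTF-8 combined with ZipOption.AsNecessary corresponds to setting this flag to true; the class name, method name, and entry name are hypothetical.

```csharp
using System.IO;
using System.Text;
using Ionic.Zip;

public static class UnicodeNameSketch
{
    // Writes a zip with one entry whose name cannot be represented in IBM437.
    // The entry name is a made-up example.
    public static void WriteZip(string zipPath)
    {
        using (FileStream fs = File.Create(zipPath))
        using (var output = new ZipOutputStream(fs))
        {
            // Assumed replacement for "UseUnicodeAsNecessary = true":
            // use UTF-8 only for names and comments that IBM437 cannot encode.
            output.AlternateEncoding = Encoding.UTF8;
            output.AlternateEncodingUsage = ZipOption.AsNecessary;

            output.PutNextEntry("日本語.txt");
            byte[] content = Encoding.UTF8.GetBytes("hello");
            output.Write(content, 0, content.Length);
        }
    }
}
```

Leaving both properties at their defaults should correspond to the flag being false, in which case all names are written in IBM437.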

Zip files created in strict adherence to the UTF-8 provisions of the PKWare specification can contain entries whose filenames use any combination of Unicode characters, including the full range of characters from Chinese, Latin, Hebrew, Greek, Cyrillic, and many other alphabets. However, the UTF-8 portion of the PKWare specification is not yet broadly supported by other zip libraries and utilities, so such zip files may not be readable by your favorite zip tool or archiver. In other words, interoperability will decrease if you set this flag to true.

In particular, zip files created in strict adherence to the UTF-8 provisions of the PKWare specification will not work well with Explorer in Windows XP or Windows Vista, because Windows compressed folders, as far as I know, do not support UTF-8 in zip files. Vista can read the zip files, but shows the filenames incorrectly. Unpacking from Windows Vista Explorer will result in filenames that have rubbish characters in place of the high-order UTF-8 bytes.

Also, zip files that use UTF-8 encoding will not work well with Java applications that use the java.util.zip classes, as of v5.0 of the Java runtime. The Java runtime does not correctly implement the PKWare specification in this regard.

As a result, we have the unfortunate situation that "correct" behavior by the DotNetZip library with regard to Unicode encoding of filenames during zip creation will result in zip files that are readable by strictly compliant and current tools (for example the most recent release of the commercial WinZip tool); but these zip files will not be readable by various other tools or libraries, including Windows Explorer.

The DotNetZip library can read and write zip files with UTF-8-encoded entries, according to the PKWare specification. If you use DotNetZip for both creating and reading the zip file, and you use UTF-8, there will be no loss of information in the filenames. For example, using a self-extractor created by this library will allow you to unpack files correctly with no loss of information in the filenames.

If you do not set this flag, it will remain false. If this flag is false, the ZipOutputStream will encode all filenames and comments using the IBM437 code page. This can cause "loss of information" on some filenames, but the resulting zip file will be more interoperable with other utilities. As an example of the loss of information, diacritics can be lost. The o-tilde character will be down-coded to plain o. The c with a cedilla (Unicode 0xE7) used in Portuguese will be down-coded to a c. Likewise, the O-stroke character (Unicode 0xF8), used in Danish and Norwegian, will be down-coded to plain o. Chinese characters cannot be represented in code page IBM437; when using the default encoding, Chinese characters in filenames will be represented as ?. These are all examples of "information loss".

The loss of information associated with the IBM437 encoding is inconvenient, and can also lead to runtime errors. For example, using IBM437, any sequence of 4 Chinese characters will be encoded as ????. If your application creates a ZipOutputStream, does not set the encoding, then adds two files whose names each consist of four Chinese characters, this will result in a duplicate filename exception. In the case where you add a single file with a name containing four Chinese characters, the zip file will save properly, but extracting that file later, with any zip tool, will result in an error, because the question mark is not legal for use within filenames on Windows. These are just a few examples of the problems associated with loss of information.
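The name collision described above can be reproduced with the .NET encoding classes alone. This is a rough sketch of the effect, not the code path DotNetZip itself uses; it assumes a '?' replacement fallback for unrepresentable characters, and the class name, method name, and sample names are hypothetical.

```csharp
using System;
using System.Text;

public static class Ibm437LossSketch
{
    // Encodes a name to IBM437 and decodes it back, substituting '?'
    // for every character the code page cannot represent.
    public static string RoundTripIbm437(string name)
    {
        // Code page 437 is built in on .NET Framework. On .NET Core / .NET 5+,
        // call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
        Encoding ibm437 = Encoding.GetEncoding(
            437,
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return ibm437.GetString(ibm437.GetBytes(name));
    }

    public static void Main()
    {
        // Two different names, each made of four Chinese characters.
        string first = RoundTripIbm437("春夏秋冬");
        string second = RoundTripIbm437("東西南北");

        Console.WriteLine(first);            // ????
        Console.WriteLine(first == second);  // True
    }
}
```

Once both names have collapsed to the same string, the archive can no longer tell the two entries apart, which is why the second one is reported as a duplicate.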

This flag is independent of the encoding of the content within the entries in the zip file. Think of the zip file as a container - it supports an encoding. Within the container are other "containers" - the file entries themselves. The encoding within those entries is independent of the encoding of the zip archive container for those entries.
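As an illustration of that independence, the sketch below leaves the name encoding at its default while the application picks its own encoding for the entry content. The class name, method name, entry name, and sample text are hypothetical, and UTF-16 is an arbitrary choice.

```csharp
using System.IO;
using System.Text;
using Ionic.Zip;

public static class ContentEncodingSketch
{
    public static void WriteZip(string zipPath)
    {
        using (FileStream fs = File.Create(zipPath))
        using (var output = new ZipOutputStream(fs))
        {
            // Name encoding: left at the default. The name is plain ASCII,
            // so nothing is lost.
            output.PutNextEntry("notes.txt");

            // Content encoding: decided by the application, here UTF-16.
            // The zip library stores these bytes without interpreting them.
            byte[] content = Encoding.Unicode.GetBytes("Ærø, 北京, São Paulo");
            output.Write(content, 0, content.Length);
        }
    }
}
```

A reader of this archive needs to know that notes.txt holds UTF-16 text; nothing in the zip container records that choice.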

Rather than specify the encoding in a binary fashion using this flag, an application can specify an arbitrary encoding via the ProvisionalAlternateEncoding property. Setting the encoding explicitly when creating zip archives will result in non-compliant zip files that, curiously, are fairly interoperable. The challenge is, the PKWare specification does not provide for a way to specify that an entry in a zip archive uses a code page that is neither IBM437 nor UTF-8. Therefore if you set the encoding explicitly when creating a zip archive, you must take care upon reading the zip archive to use the same code page. If you get it wrong, the behavior is undefined and may result in incorrect filenames, exceptions, stomach upset, hair loss, and acne.
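A minimal sketch of that write-then-read discipline follows. It uses AlternateEncoding and AlternateEncodingUsage, the properties named in the obsolete notice, on the assumption that they fill the role this paragraph describes; Big5 is an arbitrary code page chosen for illustration, and the class, method, and entry names are hypothetical.

```csharp
using System.IO;
using System.Text;
using Ionic.Zip;

public static class ExplicitCodePageSketch
{
    // The archive does not record this choice, so writer and reader
    // must agree on it out of band. On .NET Core / .NET 5+, register
    // CodePagesEncodingProvider before asking for "big5".
    static readonly Encoding NameEncoding = Encoding.GetEncoding("big5");

    public static void WriteZip(string zipPath)
    {
        using (FileStream fs = File.Create(zipPath))
        using (var output = new ZipOutputStream(fs))
        {
            // Always use the explicit code page for names and comments.
            output.AlternateEncoding = NameEncoding;
            output.AlternateEncodingUsage = ZipOption.Always;

            output.PutNextEntry("報告.txt");
            byte[] content = Encoding.UTF8.GetBytes("report body");
            output.Write(content, 0, content.Length);
        }
    }

    public static string ReadFirstEntryName(string zipPath)
    {
        // Tell the reader which code page the names were written in.
        var options = new ReadOptions { Encoding = NameEncoding };
        using (ZipFile zip = ZipFile.Read(zipPath, options))
        {
            foreach (ZipEntry entry in zip)
            {
                return entry.FileName;
            }
        }
        return null;
    }
}
```

Reading the same file with a different code page would not fail loudly in this sketch; it would most likely just yield a wrong filename, which is the hazard the paragraph warns about.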

Assembly: Ionic.Zip (Module: Ionic.Zip) Version: 1.9.1.8 (1.9.1.8)