Microsoft AntiXSS Library
What's New in AntiXSS / Change History
What's new in AntiXSS 4.2
Minimum Requirements
The encoder libraries can once again be used with .NET 2.0. Separate libraries, each optimised for its version of the framework, are provided for .NET 2.0, 3.5 and 4.0.
.NET 4.0 Support
The .NET 4.0 version of AntiXSS comes with a class that can be used to set AntiXSS as the default encoder used by MVC, WebPages and WebForms applications.
Invalid Unicode is handled differently.
Invalid Unicode characters are now replaced with the Unicode replacement character, U+FFFD (�). Previously, when strings were encoded through HtmlEncode, HtmlAttributeEncode, XmlEncode, XmlAttributeEncode or CssEncode, invalid Unicode characters were detected and an exception was thrown.
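The replacement behavior can be illustrated with a minimal Python sketch. It is not the library's implementation: the function name is hypothetical, and the definition of "invalid" used here (unpaired surrogates plus U+FFFE and U+FFFF) is an assumption based on the checks described for AntiXSS 4.0.

```python
REPLACEMENT_CHARACTER = "\ufffd"

def replace_invalid_unicode(text: str) -> str:
    """Replace unpaired surrogates and U+FFFE/U+FFFF with U+FFFD.

    Illustrative only; which code points count as invalid is an assumption.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        cp = ord(text[i])
        is_high = 0xD800 <= cp <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= ord(text[i + 1]) <= 0xDFFF:
            # A well-formed surrogate pair is kept as-is.
            out.append(text[i])
            out.append(text[i + 1])
            i += 2
            continue
        if 0xD800 <= cp <= 0xDFFF or cp in (0xFFFE, 0xFFFF):
            # Unpaired surrogate or noncharacter: substitute, do not raise.
            out.append(REPLACEMENT_CHARACTER)
        else:
            out.append(text[i])
        i += 1
    return "".join(out)
```

The point of the sketch is the change in failure mode: bad input yields a visible placeholder in the output instead of an exception.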
UrlPathEncode added.
The encoding library now has Encoder.UrlPathEncode(String), which encodes a string for use as the path part of a URL.
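As a rough illustration of what path encoding means, the following Python sketch uses the standard library's `urllib.parse.quote`. The helper name is hypothetical, and the set of characters left unencoded is Python's, which may differ from the safe list AntiXSS applies.

```python
from urllib.parse import quote

def url_path_encode(path: str) -> str:
    """Percent-encode a string for use as a URL path.

    "/" is kept so that path segments stay separated; other reserved
    characters, spaces and non-ASCII characters are percent-encoded.
    """
    return quote(path, safe="/")

print(url_path_encode("/my files/résumé.html"))
```

Note that a question mark in the input is encoded as well, so it cannot start a query string by accident.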
The HTML Sanitizer handles CSS differently.
The HTML Sanitizer now removes all CSS from the <head> section of an HTML page. If a <style> tag is discovered in the body of an HTML page, or in an input fragment, the tag is removed but its contents are kept, as happens with other invalid tags. If the style attribute is discovered on an element, the attribute is removed.
What's new in AntiXSS 4.0
Minimum Requirements
The AntiXSS Library now requires .NET Framework 3.5.
Return Values
If you pass null as the value to be encoded, the encoder now returns null. The previous behavior was to return String.Empty.
Medium Trust Support
The HTML Sanitization methods, GetSafeHtml() and GetSafeHtmlFragment(), have been moved to a separate assembly. This enables the AntiXssLibrary assembly to run in medium trust environments, a common user request. If you wish to use the HTML Sanitization library you must now include the HtmlSanitizationLibrary assembly. This assembly requires full trust and the ability to run unsafe code.
Adjustable safe-listing for HTML/XML Encoding
The safe list for HTML and XML encoding is now adjustable. The MarkAsSafe(LowerCodeCharts, LowerMidCodeCharts, MidCodeCharts, UpperMidCodeCharts, UpperCodeCharts) method allows you to choose, from the Unicode Code Charts, which languages your web application normally accepts. Safe-listing a language code chart leaves the defined characters in their native form during encoding, which increases readability in the HTML/XML document and speeds up encoding. Certain dangerous characters are still encoded even when their code chart is safe-listed. The language code charts are defined in the Microsoft.Security.Application.LowerCodeCharts, Microsoft.Security.Application.LowerMidCodeCharts, Microsoft.Security.Application.MidCodeCharts, Microsoft.Security.Application.UpperMidCodeCharts and Microsoft.Security.Application.UpperCodeCharts enumerations.
It is suggested that you safe-list your acceptable languages during application initialization.
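The idea of safe-listing can be sketched in a few lines of Python. This is a conceptual illustration, not the MarkAsSafe API: the range constants, the function name, the set of always-encoded characters and the decimal `&#NNNN;` output form are all assumptions made for the example.

```python
# Hypothetical code chart ranges, given as (first, last) code points.
BASIC_LATIN = (0x0020, 0x007E)
CYRILLIC = (0x0400, 0x04FF)

# Assumed set of characters that are encoded even inside a safe range.
ALWAYS_ENCODE = frozenset("<>&\"'")

def html_encode_with_safe_list(text, safe_ranges=(BASIC_LATIN,)):
    """Leave safe-listed characters as-is and encode everything else."""
    out = []
    for ch in text:
        cp = ord(ch)
        in_safe_range = any(lo <= cp <= hi for lo, hi in safe_ranges)
        if in_safe_range and ch not in ALWAYS_ENCODE:
            out.append(ch)
        else:
            out.append(f"&#{cp};")
    return "".join(out)

# With only Basic Latin safe-listed, Cyrillic text is fully encoded.
print(html_encode_with_safe_list("<я>"))
# With Cyrillic added, the letter stays readable; the brackets do not.
print(html_encode_with_safe_list("<я>", (BASIC_LATIN, CYRILLIC)))
```

Passing the safe ranges once, at start-up, mirrors the suggestion to configure the safe list during application initialization.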
Invalid Unicode character detection
If any of the HTML, XML or CSS encoding methods encounters a character with a character code of 0xFFFE or 0xFFFF, the characters used to detect byte order at the beginning of files, an InvalidUnicodeValueException will be thrown.
Surrogate Character Support in HTML and XML encoding
Support for surrogate character pairs for Unicode characters outside the basic multilingual plane has been improved. Such character pairs are now combined and encoded as their &#xxxxx; value. If a high surrogate character is encountered which is not followed by a low surrogate character, or a low surrogate character is encountered which is not preceded by a high surrogate character, an InvalidSurrogatePairException is thrown.
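The pair-combining arithmetic described above is standard UTF-16 decoding and can be sketched in Python as follows. The function and exception names are hypothetical stand-ins, and the decimal output form is an assumption; only the surrogate handling is the point of the example.

```python
class InvalidSurrogatePairError(ValueError):
    """Hypothetical stand-in for InvalidSurrogatePairException."""

def encode_surrogate_pairs(text: str) -> str:
    """Combine each surrogate pair into one numeric character reference."""
    out = []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if 0xD800 <= cp <= 0xDBFF:  # high surrogate
            if i + 1 >= len(text) or not 0xDC00 <= ord(text[i + 1]) <= 0xDFFF:
                raise InvalidSurrogatePairError("high surrogate without low surrogate")
            low = ord(text[i + 1])
            # Standard UTF-16 decoding of a surrogate pair.
            combined = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
            out.append(f"&#{combined};")
            i += 2
        elif 0xDC00 <= cp <= 0xDFFF:  # low surrogate with no preceding high
            raise InvalidSurrogatePairError("low surrogate without high surrogate")
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

# U+D83D U+DE00 combine to U+1F600, which is 128512 in decimal.
print(encode_surrogate_pairs("\ud83d\ude00"))
```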
HTML 4.01 Named Entity Support
A new overload of the HtmlEncode method, Encoder.HtmlEncode(String, Boolean), allows you to specify whether the named entities from the HTML 4.01 specification should be used in preference to &#xxxx; encoding when a named entity exists. For example, if useNamedEntities is set to true, the copyright symbol (©) would be encoded as &copy;.
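The effect of the flag can be approximated in Python with the HTML 4.01 entity table that ships in `html.entities`. This is a sketch under assumptions: the function name is hypothetical, and the rule for which characters pass through unencoded (ASCII letters, digits and spaces) is a simplification, not the library's safe list.

```python
from html.entities import codepoint2name  # HTML 4.01 entity names

def html_encode_named(text: str, use_named_entities: bool = False) -> str:
    """Encode text, preferring named entities when requested and available."""
    out = []
    for ch in text:
        cp = ord(ch)
        if (ch.isascii() and ch.isalnum()) or ch == " ":
            out.append(ch)  # simplified pass-through rule (assumption)
        elif use_named_entities and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)

print(html_encode_named("©"))                           # numeric form
print(html_encode_named("©", use_named_entities=True))  # named form
```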
HtmlFormUrlEncode
A new encoding type suitable for encoding HTML POST form submissions is now available via Encoder.HtmlFormUrlEncode. This encodes according to the W3C specification for the application/x-www-form-urlencoded MIME type.
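For comparison, Python's standard library offers the same style of encoding through `urllib.parse.quote_plus`, where a space becomes `+` and reserved characters are percent-encoded. The helper below is a hypothetical sketch of building a form body that way; it says nothing about the exact character set AntiXSS leaves unencoded.

```python
from urllib.parse import quote_plus

def build_form_body(fields: dict) -> str:
    """Build an application/x-www-form-urlencoded request body."""
    return "&".join(
        f"{quote_plus(name)}={quote_plus(value)}"
        for name, value in fields.items()
    )

print(build_form_body({"name": "Ann Lee", "q": "a&b=c"}))
```

Encoding each name and value separately keeps a literal `&` or `=` in user input from being read as a field separator.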
LDAP Encoding changes
The LdapEncode function has been deprecated in favor of two new functions, Encoder.LdapFilterEncode(String) and Encoder.LdapDistinguishedNameEncode(String).
Encoder.LdapFilterEncode(String) encodes input according to RFC 4515, where unsafe values are converted to \XX and XX is the hexadecimal representation of the unsafe character. For example:
Input | Output |
---|---|
Parens R Us (for all your parenthetical needs) | Parens R Us \28for all your parenthetical needs\29 |
* | \2A |
C:\MyFile | C:\5CMyFile |
Lučić | Lu\C4\8Di\C4\87 |
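The rows in the table above can be reproduced with a short Python sketch of RFC 4515 escaping. It is an illustration rather than the library's code: the function name is hypothetical, and the escaped set here is only the RFC minimum (asterisk, parentheses, backslash, NUL) plus every non-ASCII UTF-8 byte, whereas the library's safe list may escape more.

```python
# Bytes that RFC 4515 requires to be escaped in a filter value.
FILTER_SPECIAL = frozenset(b"*()\\\x00")

def ldap_filter_encode(value: str) -> str:
    """Escape a string for use as a value inside an LDAP search filter."""
    out = []
    for byte in value.encode("utf-8"):
        if byte in FILTER_SPECIAL or byte >= 0x80:
            out.append(f"\\{byte:02X}")  # backslash plus two hex digits
        else:
            out.append(chr(byte))
    return "".join(out)

print(ldap_filter_encode("Parens R Us (for all your parenthetical needs)"))
print(ldap_filter_encode("Lučić"))
```

Working on the UTF-8 bytes rather than on characters is what turns each accented letter into two escaped bytes.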
Encoder.LdapDistinguishedNameEncode(String) encodes input according to RFC 2253. Unsafe characters are converted to #XX, where XX is the hexadecimal representation of the unsafe character, and the comma, plus, quote, backslash, less-than and greater-than signs are escaped using backslash notation (\X). In addition, a space or octothorpe (#) at the beginning of the input string is backslash-escaped, as is a space at the end of the string.
Input | Output |
---|---|
, + " \ < > | \, \+ \" \\ \< \> |
Hello (with a leading space) | \ Hello |
Hello (with a trailing space) | Hello\ (the backslash is followed by the space) |
#Hello | \#Hello |
Lučić | Lu#C4#8Di#C4#87 |
Encoder.LdapDistinguishedNameEncode(String, Boolean, Boolean) is also provided so that you can turn off the initial or final character escaping rules, for example if you are concatenating the escaped distinguished name fragment into the middle of a complete distinguished name. In addition to the RFC-mandated escaping, the safe list excludes the characters listed at http://projects.webappsec.org/LDAP-Injection.
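The distinguished name rules, including the two switches for the initial and final character rules, can be sketched in Python as below. The function and parameter names are hypothetical, the escaped set is limited to the characters named in the description above, and the additional characters excluded by the library's safe list are not modeled.

```python
# Characters escaped with a backslash wherever they appear.
DN_SPECIAL = frozenset(',+"\\<>')

def ldap_dn_encode(value: str,
                   use_initial_rule: bool = True,
                   use_final_rule: bool = True) -> str:
    """Escape a string for use as a value in an LDAP distinguished name."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in DN_SPECIAL:
            out.append("\\" + ch)
        elif i == 0 and use_initial_rule and ch in " #":
            out.append("\\" + ch)  # leading space or octothorpe
        elif i == last and use_final_rule and ch == " ":
            out.append("\\" + ch)  # trailing space
        elif ord(ch) > 0x7F:
            # Non-ASCII: one #XX group per UTF-8 byte.
            out.extend(f"#{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)

print(ldap_dn_encode("#Hello"))
print(ldap_dn_encode(" Hello", use_initial_rule=False))
```

Turning a rule off is only appropriate when the fragment will not end up at the corresponding edge of the final distinguished name.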
MarkOutput
The ability to mark output using an HtmlEncode overload and query string parameter has been removed.