14 lines
2.2 KiB
Plaintext
14 lines
2.2 KiB
Plaintext
charset [HTML5]
|
|
|
|
This attribute declares the character encoding used of the page. It can be locally overridden using the lang attribute on any element. This attribute is a literal string and must be one of the preferred MIME name for a character encoding as defined by the IANA. Though the standard doesn't request a specific character encoding, it gives some recommendations:
|
|
Authors are encouraged to use UTF-8.
|
|
Authors should not use ASCII-incompatible encodings (i.e. those that don't map the 8-bit code points 0x20 to 0x7E to the Unicode 0x0020 to 0x007E code points) as these represent a security risk: browsers not supporting them may interpret benign content as HTML Elements. This is the case of at least the following charsets: JIS_C6226-1983, JIS_X0212-1990, HZ-GB-2312, JOHAB, the ISO-2022 family, and the EBCDIC family.
|
|
Authors must not use CESU-8, UTF-7, BOCU-1 and SCSU, also falling in that category and not intended to be used on the web. Cross-scripting attacks with some of these encodings have been documented.
|
|
Authors should not use UTF-32 because not all HTML5 encoding algorithms can distinguish it from UTF-16.
|
|
|
|
Notes:
|
|
The declared character set must match the one of the page. There is no valid reason to declare an inaccurate character set.
|
|
This <meta> element must be inside the <head> element and within the 512 first bytes of the page, as some browsers only look at these first bytes before choosing a character set for the page.
|
|
This <meta> element is only a part of the algorithm to determine the character set of a page that browsers apply. The HTTP Content-Type header and any BOM elements have precedence over this element.
|
|
It is good practice, and strongly recommended, to define the character set using this attribute. If no character set is defined for a page, several cross-scripting techniques may be able to harm the page user, such as the UTF-7 fallback cross-scripting technique. Always setting this meta will protect against these risks.
|
|
This <meta> element is a synonym for the pre-HTML5 <meta http-equiv="Content-Type" content="text/html; charset=IANAcharset"> where IANAcharset corresponds of the value of the equivalent charset attribute. This syntax is still allowed, although obsolete and no more recommended. |