Entity escaping
Your Sitemap file must be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below.
Character : Symbol : Escape Code
Ampersand : & : &amp;
Single Quote : ' : &apos;
Double Quote : " : &quot;
Greater Than : > : &gt;
Less Than : < : &lt;
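The UTF-8 requirement stated at the start of this section can be checked before a value is written to the Sitemap file. The following is a minimal sketch, not part of the Sitemap protocol: the helper name is_valid_utf8() is an assumption made for illustration, and it relies on preg_match() rejecting a subject that is not valid UTF-8 when the /u modifier is set.

```php
<?php
// Sketch only: is_valid_utf8() is a hypothetical helper name.
// With the /u modifier, preg_match() treats the subject as UTF-8 and
// returns false (instead of 0 or 1) when the bytes are not valid UTF-8.
function is_valid_utf8($value) {
    return preg_match('//u', $value) === 1;
}

// Strings are written as explicit bytes so the result does not depend
// on how this source file is saved. "\u{...}" needs PHP 7 or later.
var_dump(is_valid_utf8("\u{00FC}")); // ü as UTF-8 (bytes C3 BC): bool(true)
var_dump(is_valid_utf8("\xFC"));     // ü as ISO-8859-1 (byte FC): bool(false)
```

A value that fails this check would need to be converted to UTF-8 before it is entity escaped.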
In addition, all URLs (including the URL of your Sitemap) must be URL-escaped and encoded for readability by the web server on which they are located. However, if you are using any sort of script, tool, or log file to generate your URLs (anything except typing them in by hand), this is usually already done for you. Please check to make sure that your URLs follow the RFC-3986 standard for URIs, the RFC-3987 standard for IRIs, and the XML standard.
Below is an example of a URL that uses a non-ASCII character (ü), as well as a character that requires entity escaping (&):
http://www.example.com/ümlat.php&q=name
Below is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding) and URL escaped:
http://www.example.com/%FCmlat.php&q=name
Below is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and URL escaped:
http://www.example.com/%C3%BCmlat.php&q=name
Below is that same URL, but also entity escaped:
http://www.example.com/%C3%BCmlat.php&amp;q=name
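The two steps in the examples above, URL escaping followed by entity escaping, can be sketched in PHP. This is an illustration under stated assumptions rather than a reference implementation: sitemap_escape_url() is a hypothetical helper name, only the file name part of the URL is passed through rawurlencode(), and the built-in htmlspecialchars() with the ENT_QUOTES and ENT_XML1 flags stands in for the entity escaping step.

```php
<?php
// Sketch only: sitemap_escape_url() is a hypothetical helper, not part of
// the Sitemap protocol.

// The file name from the example, written as explicit bytes so the result
// does not depend on how this source file is saved ("\u{...}" needs PHP 7+).
$utf8_name   = "\u{00FC}mlat.php"; // ü as UTF-8 (bytes C3 BC)
$latin1_name = "\xFCmlat.php";     // ü as ISO-8859-1 (byte FC)

// Step 1: URL-escape the non-ASCII bytes of the file name only.
// Passing the whole URL to rawurlencode() would also escape ":", "/" and "&".
$latin1_url = 'http://www.example.com/' . rawurlencode($latin1_name) . '&q=name';
$utf8_url   = 'http://www.example.com/' . rawurlencode($utf8_name) . '&q=name';

// Step 2: entity-escape the finished URL for use inside the XML file.
function sitemap_escape_url($url) {
    return htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

echo $latin1_url, "\n";                   // http://www.example.com/%FCmlat.php&q=name
echo $utf8_url, "\n";                     // http://www.example.com/%C3%BCmlat.php&q=name
echo sitemap_escape_url($utf8_url), "\n"; // http://www.example.com/%C3%BCmlat.php&amp;q=name
```

The order matters: URL escaping comes first, so that the only character left for the entity step in this example is the ampersand.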
[php]
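// Escape the five characters that XML reserves, for use in Sitemap data values.
function xml_entities($string) {
    // strtr() with an array never re-scans text it has already replaced,
    // so the "&" inside "&lt;" is not escaped a second time.
    return strtr(
        $string,
        array(
            "<" => "&lt;",
            ">" => "&gt;",
            '"' => "&quot;",
            "'" => "&apos;",
            "&" => "&amp;",
        )
    );
}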
[/php]