sitemap

Submitted by searley on Tue, 2006-03-28 21:41

Since the upgrade I am getting the following sitemap error from Google:

Parsing error (Line 1) in Sitemap http://www.shoppingchanneluk.com/sitemap.php?merchant=ActivityGifts.co.uk

Submitted by support on Tue, 2006-03-28 21:42

Hiya,

There's an open issue with the sitemaps at the moment that I think is down to character encoding. A few people have reported feeds that cause a problem, yet the feeds validate OK, so I don't yet know exactly what's wrong. I've got some examples though, so I'm on the case...

Submitted by support on Tue, 2006-03-28 21:54

Yes - the problem is that the script is now more generous about the "safe" characters it allows, so bytes in the range 0x80-0xFF are let through. Those bytes are not necessarily valid UTF-8 sequences, which means the sitemap may not be strictly UTF-8, and that is the only encoding Google accepts.
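To see whether a given product name would break the sitemap, you could test it for well-formed UTF-8. This is only a rough sketch: the function name isValidUtf8 is made up for illustration and is not part of the script, and the sample string assumes a feed encoded as ISO-8859-1.

```php
<?php
// Hypothetical helper (not part of the script): true if $text is well-formed UTF-8.
function isValidUtf8($text)
{
    // With the /u modifier the subject must be valid UTF-8; if it is not,
    // preg_match() returns false instead of 1.
    return preg_match('//u', $text) === 1;
}

// "Café" as it would appear in an ISO-8859-1 feed: é is the single byte 0xE9.
$latin1 = 'Caf' . chr(0xE9);
// The same word in UTF-8: é is the two-byte sequence 0xC3 0xA9.
$utf8 = 'Caf' . chr(0xC3) . chr(0xA9);

$latin1Ok = isValidUtf8($latin1); // false: a lone 0xE9 byte is not valid UTF-8
$utf8Ok   = isValidUtf8($utf8);   // true
```

Any name that fails this check and reaches the sitemap unchanged would be enough to trigger a parsing error.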

I'm looking into the best way to deal with this now. I think I know what to do, and it will only require a minor modification, so don't worry about making any changes.

Submitted by support on Tue, 2006-03-28 22:23

If you are just working with UK feeds, for the time being I would revert to the old normalisation function. In the file includes/tapestry.php, look for the following line...

$text = preg_replace('/[^A-Za-z0-9'.chr(0x80).'-'.chr(0xFF).$allow.' ]/e','',$text);

If you change that line back to the old version...

$text = preg_replace('/[^A-Za-z0-9'.$allow.' ]/e','',$text);

...and then re-import all feeds, your sitemap will then validate. I'll look into this further to work out a more long-term solution.
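To illustrate the difference between the two lines, here is a small side-by-side sketch. The function names are hypothetical, $allow is simply left empty, and the /e modifier is dropped because PHP 7 and later no longer support it (with an empty replacement it made no difference to the result anyway). The sample name assumes an ISO-8859-1 feed.

```php
<?php
// Old behaviour: keep only ASCII letters, digits, spaces and anything in $allow.
function normaliseAsciiOnly($text, $allow = '')
{
    return preg_replace('/[^A-Za-z0-9' . $allow . ' ]/', '', $text);
}

// Newer behaviour: additionally keep every byte from 0x80 to 0xFF.
function normaliseKeepHighBytes($text, $allow = '')
{
    return preg_replace('/[^A-Za-z0-9' . chr(0x80) . '-' . chr(0xFF) . $allow . ' ]/', '', $text);
}

// "Café & Bar" as ISO-8859-1: é is the single byte 0xE9.
$name = 'Caf' . chr(0xE9) . ' & Bar';

$old = normaliseAsciiOnly($name);     // "Caf  Bar": both é and & are stripped
$new = normaliseKeepHighBytes($name); // "Caf" . chr(0xE9) . "  Bar": only & is stripped

// Pure ASCII is always valid UTF-8; the lone 0xE9 byte is not.
$oldIsUtf8 = preg_match('//u', $old) === 1; // true
$newIsUtf8 = preg_match('//u', $new) === 1; // false
```

The old version loses the accent but always produces ASCII, which is why the sitemap validates again after a re-import.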

In the meantime, I've updated the distribution to make the character set optional. If you do specify a character set (which will be the advice to anybody having problems with accents etc.), the normalise function will switch to the less aggressive restriction and issue the charset header.
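Roughly, the idea looks like the sketch below. This is not the code from the distribution: the function names, the $charset parameter and the text/html content type are all assumptions made for illustration, and the /e modifier is left out.

```php
<?php
// Hypothetical sketch: an empty $charset keeps the strict ASCII-only rule,
// a non-empty $charset also lets bytes 0x80-0xFF through.
function normaliseWithCharset($text, $allow = '', $charset = '')
{
    $highBytes = ($charset !== '') ? chr(0x80) . '-' . chr(0xFF) : '';
    return preg_replace('/[^A-Za-z0-9' . $highBytes . $allow . ' ]/', '', $text);
}

// Hypothetical helper: builds the header line, or null when no charset is set.
// The text/html content type is an assumption.
function charsetHeaderLine($charset)
{
    return ($charset !== '') ? 'Content-Type: text/html; charset=' . $charset : null;
}

$name = 'Caf' . chr(0xE9) . ' & Bar';   // ISO-8859-1 sample, as an assumption

$strict  = normaliseWithCharset($name);                   // "Caf  Bar"
$relaxed = normaliseWithCharset($name, '', 'iso-8859-1'); // accent byte kept

$line = charsetHeaderLine('iso-8859-1');
if ($line !== null && !headers_sent()) {
    header($line); // only sent when a charset is configured
}
```

With no character set configured, the output stays pure ASCII, so the sitemap remains valid without any further changes.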