Character Set Configuration

Submitted by support on Thu, 2006-03-02 15:45

Hi everyone,

Following on from this thread, I have now checked in the required changes for handling different character sets within Price Tapestry. This release sets "UTF-8" as the default character encoding, making it suitable for use with most popular product feeds in English and in all major international languages.

The actual character set is now configurable with a new line in config.php:

<?php
  $config_charset = "utf-8";
?>

This configuration parameter is referred to in two places. Firstly, it is used to set an HTTP header within header.php, overriding any character set configured by your web server. Secondly, it is used within search.php as the character set parameter to the htmlentities() function (used to make the contents of the search query safe for display within the page). Extended characters now persist through the URL (even if the browser has entity encoded them) and appear correctly within the search box and "Search results for..." bar, for example:

http://www.webpricecheck.co.uk/search.php?q=caractères étrangers
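As a rough illustration of the two uses described above, here is a minimal sketch. It is not the actual Price Tapestry code: the helper names charset_header() and safe_query() are made up for this example, and only $config_charset, header.php and search.php come from the post above.

```php
<?php
// Assumed to mirror the config.php setting described above.
$config_charset = "utf-8";

// Hypothetical helper: builds the Content-Type header line for a charset.
function charset_header($charset)
{
    return "Content-Type: text/html; charset=" . $charset;
}

// Hypothetical helper: makes a search query safe to echo into the page.
// The third argument tells htmlentities() how to interpret the bytes.
function safe_query($q, $charset)
{
    return htmlentities($q, ENT_QUOTES, $charset);
}

// In a header.php-style file this has to run before any output is sent.
if (!headers_sent()) {
    header(charset_header($config_charset));
}

// In a search.php-style file: read the query and display it safely.
$q = isset($_GET["q"]) ? $_GET["q"] : "";
print "Search results for " . safe_query($q, $config_charset);
```

Because the header and the htmlentities() call both read the same variable, changing the one line in config.php keeps the declared encoding and the escaping in step.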

These changes are now in the current distribution.

Submitted by iman on Thu, 2006-03-02 16:29

great job, mañifico :)

Submitted by madstock on Thu, 2006-03-02 16:56

Thanks for this - although it doesn't solve the problems with my site (which I am sure are probably attributable to something else), I'm sure it will be of help to many.

Submitted by support on Thu, 2006-03-02 16:59

Hi madstock,

What are the specific problems that still exist on your site with this update?

If it's something that others using non-English character sets are likely to find it would be good to get it sorted...

Submitted by madstock on Thu, 2006-03-02 17:42

I don't think it is anything to do with the script, to be honest, more likely my shoddy site..

For example, without the changes in the new distribution,

http://www.france.madstock.com/search.php?q=caractères%20étrangers

note that as per the strict normalisation all special characters have been whipped out of the database output, but everything else looks hunky-dory - encoding is Western European (Windows), although Western European (ISO) or User Defined will also work...

Contrast with a new search page, using the new config.php and header.php definitions, plus the new tapestry page without the normalisation:

http://www.france.madstock.com/search2.php?q=caractères%20étrangers

'tis UTF-8, but everything appears "munged", and the navigation (which contains special characters) doesn't load at all.

If the javascript navigation is removed (still using the new definitions)

http://www.france.madstock.com/search3.php?q=caractères%20étrangers

everything is still belly-up...

But it surely must be something wrong at my end, as other examples are working fine.

Submitted by support on Thu, 2006-03-02 17:52

Hiya,

What seems to be going on is that your other internationalised content is being generated in ISO-8859-1.

If you view your page:
http://www.france.madstock.com/search2.php?q=caractères%20étrangers
...and then configure your browser to use character set Western (ISO-8859-1), your navigation appears correctly.

Situation:

You want to use data from feeds in UTF-8 encoding with other content that is being generated in ISO-8859-1. Therefore, one of them has to be converted into the other. Can you provide a little more detail about how you are generating those drop down menus?

For example, in one of your source files you are generating the strings that appear in the select box on your search form. It might just be a case of opening that file in a text editor that is able to change the encoding and making sure that your source file is saved in UTF-8. Alternatively, can you post a snippet of the code that is generating that menu?
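If resaving the source file is not an option, the strings could instead be converted at runtime. The following is only a sketch under that assumption: to_utf8() is a hypothetical helper, not part of Price Tapestry, and it relies on PHP's iconv extension being available.

```php
<?php
// Hypothetical helper: returns $s unchanged if it is already valid UTF-8,
// otherwise treats it as ISO-8859-1 and converts it to UTF-8.
function to_utf8($s)
{
    // With the /u modifier, preg_match() fails on input that is not valid UTF-8.
    if (preg_match('//u', $s) === 1) {
        return $s;
    }
    return iconv("ISO-8859-1", "UTF-8", $s);
}

// "étrangers" as ISO-8859-1 bytes (0xE9 is "é" in that encoding).
$latin1 = "\xE9trangers";

// Print the converted bytes as hex so the result is visible in any terminal.
print bin2hex(to_utf8($latin1)) . "\n";
```

The validity check is a heuristic: plain ASCII passes through untouched, but an ISO-8859-1 string whose bytes happen to form valid UTF-8 would not be converted, so saving the source file in UTF-8 is still the more robust fix.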

Submitted by madstock on Thu, 2006-03-02 19:25

Thanks - problem (nearly) solved! - downloaded everything, and it was saving in ANSI as opposed to UTF-8 - twas simply a case of resaving for the most part.

A couple of minor points that have now arisen...

1. I could see no difference in the new "search.php" page, and as such the title still appears munged, what do I have to change, please??

2. When attempting to register merchants with non-standard characters in their names (and/or categories), there are still restrictions in place - I assume that this is controlled by the register file as opposed to the config.php ??

Thanks so much for all of your patience - this was driving me absolutely round the twist, and instead of looking for the simplest solution I was going rather round the houses.

Thanks again!