Character encoding issue

Submitted by Tony on Thu, 2011-01-20 13:58

I have two feeds from two different merchants on the same affiliate network. Despite the special (French) characters, the product names display correctly. However, the product descriptions do not render properly on the search results page. What is most confusing is that, although both merchants supply the same product description, the description from one merchant renders better (though still not perfectly) than the description from the other.
Can you please help me out?

Submitted by support on Thu, 2011-01-20 14:03

Hi Tony,

It sounds like the description field (and probably the product name also) from the merchant that is not displaying properly is in a different character set, probably ISO-8859-1.

To fix this, assuming that you haven't changed the default encoding (UTF-8) in config.php, add a UTF8 Encode filter to the Product Name and Description fields for the feed that is causing the problem. To add filters, click Filters alongside the feed in your /admin/ area. Don't forget to re-import the feed after adding the filters.
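
For readers who want to see what this kind of conversion does to the bytes, here is a minimal sketch in Python. It is only an illustration, not Price Tapestry's own PHP code; the function names `is_valid_utf8` and `latin1_to_utf8` and the sample description are made up for the example. It assumes the problem feed really is ISO-8859-1, which should be checked against the feed itself.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (the direction a UTF8 Encode step takes)."""
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical description as it might arrive in an ISO-8859-1 feed.
latin1_description = "Crème au beurre de karité".encode("iso-8859-1")

# Read as UTF-8, the accented bytes are invalid, which is why they render badly.
feed_is_utf8 = is_valid_utf8(latin1_description)

# After conversion the same text is valid UTF-8.
utf8_description = latin1_to_utf8(latin1_description)
```

Note that the validity check only works in one direction: bytes that fail to decode as UTF-8 are certainly not UTF-8, but any byte sequence decodes as ISO-8859-1, so a successful ISO-8859-1 decode proves nothing.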

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by Tony on Thu, 2011-01-20 15:15

Hi David,
I added the filters as advised and then used the "slow import tool" to import the feeds. Now the result page and the description look much better than earlier. But when I click on "more information" or "compare price" the full description doesn't look good at all.

Submitted by support on Thu, 2011-01-20 15:27

Hi Tony,

There's no guarantee that the description in the search results and the description on the product page are taken from the same product record (as the search results are a "summary" SQL query and there's no way to determine which record the non-summary fields are pulled from).

So it sounds to me perhaps like both your feeds are actually in ISO-8859-1, but your configuration is still UTF-8. What I would suggest first is to remove any filters that you have added, and then edit your config.php and change line 4 to the following:

  $config_charset = "iso-8859-1";

Then re-import all feeds and that should do the trick...

In general, you want your configured character set to match the character set of the majority of your feeds, and then you can use the UTF8 Encode (ISO-8859-1 to UTF-8 conversion) or UTF8 Decode (UTF-8 to ISO-8859-1 conversion) filters on the name and description fields of the remaining feeds.
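
The rule of thumb above can be sketched as a small planning function. This is an illustrative Python sketch, not part of Price Tapestry; the function name `plan_filters` and the feed filenames are hypothetical, and it assumes you already know each feed's character set and that only UTF-8 and ISO-8859-1 are involved.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def plan_filters(feed_charsets: Dict[str, str]) -> Tuple[str, Dict[str, Optional[str]]]:
    """Pick the majority charset for the site and a filter for each remaining feed."""
    normalised = {name: cs.lower() for name, cs in feed_charsets.items()}
    # The configured charset should match the majority of the feeds.
    site_charset = Counter(normalised.values()).most_common(1)[0][0]

    filters: Dict[str, Optional[str]] = {}
    for name, charset in normalised.items():
        if charset == site_charset:
            filters[name] = None  # already matches, no filter needed
        elif charset == "iso-8859-1" and site_charset == "utf-8":
            filters[name] = "UTF8 Encode"  # ISO-8859-1 to UTF-8
        elif charset == "utf-8" and site_charset == "iso-8859-1":
            filters[name] = "UTF8 Decode"  # UTF-8 to ISO-8859-1
        else:
            raise ValueError(f"No filter covers {charset} -> {site_charset}")
    return site_charset, filters


# Hypothetical feeds: two in ISO-8859-1, one in UTF-8.
site_charset, filters = plan_filters({
    "merchant_a.xml": "ISO-8859-1",
    "merchant_b.xml": "ISO-8859-1",
    "merchant_c.xml": "UTF-8",
})
```

With these inputs the site charset comes out as iso-8859-1 and only merchant_c.xml needs a filter (UTF8 Decode), applied to its name and description fields.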

Cheers,
David.
--
PriceTapestry.com

Submitted by Tony on Thu, 2011-01-20 16:38

Hi David,
Still not working... But you're right, it seems that both descriptions are not taken from the same record.
Tony

Submitted by support on Thu, 2011-01-20 16:43

Hi Tony,

If you could email me a link to your installation and the filenames of the 2 feeds that are causing the problems, I'll download them from your /feeds/ folder to my test server and check them out for you!

Cheers,
David.
--
PriceTapestry.com

Submitted by Tony on Fri, 2011-01-21 12:17

Hi David,
Once again, thanks for your help. I really appreciate it. Is there a way to get a results page with all matching products, regardless of whether the user searches with or without the special characters?
E.g. currently, if you search for "beurre de karité" you get no results, but if you search for "beurre de karite" you get some products.
Thanks in advance

Submitted by support on Fri, 2011-01-21 12:26

Hi Tony,

What has worked best in the past is to perform the search against a special version of the product name in which any accented vowels have been replaced with their non-accented equivalents, with the same replacement applied to the query. I'll follow up your recent email regarding this, as there are a few changes involved, but it's quite straightforward to do...!
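
As a rough illustration of the idea (not the actual Price Tapestry changes, which are handled by email), accent folding can be sketched in Python using Unicode decomposition. The function names `fold_accents` and `matches` and the sample product names are hypothetical; a real installation would more likely store the folded name in its own database column and search against that.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Replace accented letters with their non-accented equivalents."""
    # NFD splits "é" into "e" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, product_name: str) -> bool:
    """Compare the folded query against the folded product name."""
    return fold_accents(query).lower() in fold_accents(product_name).lower()


folded_query = fold_accents("beurre de karité")

# Both spellings of the query now find both spellings of the product name.
found_with_accent = matches("beurre de karité", "Beurre de Karite bio 100 ml")
found_without_accent = matches("beurre de karite", "Beurre de karité pur")
```

Because the same folding is applied to both sides, it does not matter whether the accent appears in the query, in the feed, or in both.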

Cheers,
David.
--
PriceTapestry.com