Exclude certain characters from product names

Submitted by tbbd2007 on Tue, 2008-03-11 10:01

David,

Several feeds have the ™, ® and £ characters in their product names, which means the link URL doesn't work. How do I add them to the list of excluded characters so that they are removed from all feeds on import?

Thanks

Stephen

Submitted by support on Tue, 2008-03-11 11:56

Hello Stephen,

These characters should be removed by default; however, there are two ways that ™ and ® could be getting through. The first is if they are in the feed as HTML entities and you have modified includes/admin.php to permit "&" and ";", thus allowing the entities to be imported.

The second is if they are in the feed as extended characters (i.e. characters in their own right as part of the utf-8 character set).

Assuming the first method (most likely), check includes/admin.php for the following code:

    /* apply standard filters */
    $record[$admin_importFeed["field_name"]] = tapestry_normalise($record[$admin_importFeed["field_name"]]);

If this appears as follows:

    /* apply standard filters */
    $record[$admin_importFeed["field_name"]] = tapestry_normalise($record[$admin_importFeed["field_name"]],"&;");

...then reverting to the original version should block the characters. However, you will then be left with the remainder of the HTML entity in the product name, for example "trade" or "reg". The best approach would then be to remove those remnants with a Search and Replace filter on the Product Name field for any feeds that have these characters in the product name.
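
As an illustration of the entity problem described above, here is a minimal sketch that strips the whole entity rather than leaving "trade" or "reg" behind. The function name strip_symbol_entities is hypothetical and not part of the script; it would have to run on the product name before the standard filter is applied, and where best to hook it in is an assumption left open here.

```php
<?php
// Hypothetical helper (not part of the script): removes the named HTML
// entities for the trade mark, registered and pound signs in one pass.
function strip_symbol_entities(string $name): string
{
    $entities = ['&trade;', '&reg;', '&pound;'];
    // Case-insensitive, so "&TRADE;" is removed as well.
    $name = str_ireplace($entities, '', $name);
    // Collapse any runs of whitespace left behind and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $name));
}

echo strip_symbol_entities('Scotch&trade; Magic&reg; Tape'), "\n";
// prints "Scotch Magic Tape"
```

Because the entity is removed as a unit, nothing is left over for a Search and Replace filter to tidy up afterwards.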

Cheers,
David.

Submitted by tbbd2007 on Tue, 2008-03-11 13:42

David,

I understand the first part; that is exactly what I have at the moment. I really meant the second case, where the feeds contain the actual '™' etc. characters, as seen at http://www.online-shopping-for.co.uk/merchant/3M-Select/ for instance.

Kind regards

Stephen

Submitted by support on Tue, 2008-03-11 15:21

Hi Stephen,

The first thing to do would be to set the character encoding, as it looks like your site is currently configured as "iso-8859-1", whereas these characters are "utf-8". However, you may have intentionally set iso-8859-1 because of incorrect characters being displayed elsewhere.

If not, edit config.php and set:

    $config_charset = "utf-8";

...that should at least cause the characters to be displayed correctly. It may also resolve the URL problem if the browser was previously generating invalid characters, but we may need to look into that part of the problem further.

Cheers,
David.

Submitted by tbbd2007 on Wed, 2008-03-12 01:19

David,

Yes, my site is configured for iso-8859-1, but that is because the bulk of my feeds are in either iso-8859-1 or iso-8859-15. It's just that some of the utf-8 feeds cause problems. Changing the default to utf-8 probably wouldn't be ideal either: when The Big Business Directory was running on utf-8, the non-utf-8 feeds came up with odd characters. I assume that was because of the differences between the encodings, so the iso-8859-1 and iso-8859-15 feeds displayed invalid characters.

Kind regards,

Stephen

Submitted by support on Wed, 2008-03-12 09:33

Hi Stephen,

I thought that might be the case. You should then be able to use the "UTF8 Decode" filter against the product name and description fields in the feeds that have UTF-8 characters. This will convert them into iso-8859-1 so that they should display correctly under that character set.
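
To show the kind of conversion such a filter performs, here is a minimal sketch using PHP's mb_convert_encoding (which needs the mbstring extension). The function name utf8_to_latin1 is hypothetical, and the script's own "UTF8 Decode" filter may well be implemented differently. Note that £ and ® exist in iso-8859-1, whereas ™ has no equivalent there, so it cannot survive the conversion and will typically come out as a substitute character.

```php
<?php
// Hypothetical stand-in for a "UTF8 Decode" style filter; the script's
// own filter may be implemented differently.
function utf8_to_latin1(string $text): string
{
    // Requires the mbstring extension.
    return mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
}

// "\xC2\xA3" is the two-byte UTF-8 encoding of the pound sign.
$pound = utf8_to_latin1("\xC2\xA3");
echo bin2hex($pound), "\n"; // prints "a3", the single iso-8859-1 byte
```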

Cheers,
David.

Submitted by tbbd2007 on Wed, 2008-03-12 10:13

David,

Yes, I will do that, but I still would like to know where to add the actual entity characters to exclude them from the product names.

Kind regards

Stephen

Submitted by support on Wed, 2008-03-12 11:09

Hi,

The easiest thing to do, if your site is in English (and therefore doesn't require the extended character set), is to look for the following code starting at line 22 in includes/tapestry.php:

    if ($config_charset)
    {
      $allow = chr(0x80).'-'.chr(0xFF).$allow;
    }

...and simply delete or comment this out. It will take effect on each feed from the next import.
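
To illustrate why that allowance matters, here is a minimal sketch of a whitelist-style filter. It is not the actual tapestry_normalise() source; the function name normalise_sketch and its second parameter are assumptions made purely for the example. With the 0x80 to 0xFF byte range allowed, the bytes that make up ™ pass through; without it they are stripped, along with £, ® and any accented letters.

```php
<?php
// Hypothetical sketch of a whitelist filter; NOT the actual
// tapestry_normalise() source.
function normalise_sketch(string $text, bool $allowHighBytes): string
{
    // Optionally extend the whitelist with every byte from 0x80 to 0xFF.
    $allow = $allowHighBytes ? '\x80-\xFF' : '';
    // No /u modifier, so the pattern is applied byte by byte.
    $text = preg_replace('/[^A-Za-z0-9 ' . $allow . ']/', '', $text);
    // Tidy up any doubled spaces left by removed characters.
    return trim(preg_replace('/ +/', ' ', $text));
}

$name = "3M\xE2\x84\xA2 Tape"; // "3M™ Tape" encoded as UTF-8

var_dump(normalise_sketch($name, false) === '3M Tape'); // bool(true)
var_dump(normalise_sketch($name, true) === $name);      // bool(true)
```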

Cheers,
David.

Submitted by bloach on Fri, 2008-07-25 07:40

Hi David,

I was about to start a similar thread but fortunately was able to locate this existing one. Commenting out the code at line 22 worked perfectly for me.

Really appreciate your constant help and support.

Thanks
