Support Forum



special characters problem

Submitted by gunneradt on Mon, 2008-12-01 13:27

I'm getting a special characters problem and have read the post

http://www.pricetapestry.com/node/30

however, the code in my includes/tapestry.php file is different to the code in that post

i have this code in there too:

global $config_charset;

what should I change so that £ signs, apostrophes, speech marks render properly?

cheers

Submitted by support on Mon, 2008-12-01 14:35

Hi There,

The current version should handle extended character sets correctly; however, the most likely cause of the problem is that the ampersand and semi-colon characters are being stripped during import, which breaks the HTML entities in your description text. This is easily fixed: if you look in includes/admin.php, the description field is "normalised" by the following code on line 165:

$record[$admin_importFeed["field_description"]] = tapestry_normalise($record[$admin_importFeed["field_description"]],",'\'\.%!");

If you replace this with:

$record[$admin_importFeed["field_description"]] = tapestry_normalise($record[$admin_importFeed["field_description"]],"&;,'\'\.%!");

That should do the trick. If the problem characters are appearing elsewhere, let me know and I'll look into that for you!

Cheers,
David.

Submitted by gunneradt on Mon, 2008-12-01 15:06

hi

sorry to say that doesn't appear to have done anything

it's in the product name and descriptions that I'm getting the problem

see here

http://www.gift-mania.co.uk/product/Baby39s-Pink-Calf-Nursery-Set.html

http://www.gift-mania.co.uk/product/The-Slanket.html

regards

Submitted by support on Mon, 2008-12-01 15:16

Hi,

My apologies; it is the # character that is also required to be permitted, so instead of the change described above, use:

$record[$admin_importFeed["field_description"]] = tapestry_normalise($record[$admin_importFeed["field_description"]],"&#;,'\'\.%!");

(in other words just add # to the list of characters)
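To see why all three characters matter, here is a minimal sketch. It uses a hypothetical stand-in, sketch_normalise(), rather than the real tapestry_normalise() (whose internals may well differ), and simply assumes a normaliser that strips everything outside a whitelist plus the extra characters passed in:

```php
<?php
// Hypothetical stand-in for tapestry_normalise(); the real function may differ.
// It keeps letters, digits, spaces and hyphens, plus any characters in $allow.
function sketch_normalise(string $text, string $allow = ""): string
{
    $pattern = '/[^A-Za-z0-9 \-' . preg_quote($allow, '/') . ']/';
    return preg_replace($pattern, '', $text);
}

// Made-up sample value containing the numeric entity for an apostrophe.
$name = "Baby&#39;s Set";

$stripped = sketch_normalise($name);        // &, # and ; all removed
$partial  = sketch_normalise($name, "&;");  // # still removed, entity broken
$intact   = sketch_normalise($name, "&#;"); // entity survives

echo $stripped, "\n"; // Baby39s Set
echo $partial, "\n";  // Baby&39;s Set
echo $intact, "\n";   // Baby&#39;s Set
```

The first result matches the stray "39" visible in the product URL posted earlier in this thread; only the third call leaves an entity that a browser can still render as an apostrophe.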

Now, to permit similar characters in the product name (this is trickier because of search engine friendly URLs), also in includes/admin.php look for the following code on line 159:

    $record[$admin_importFeed["field_name"]] = tapestry_normalise($record[$admin_importFeed["field_name"]]);

...and replace this with:

    $record[$admin_importFeed["field_name"]] = tapestry_normalise($record[$admin_importFeed["field_name"]],"&#;");

A corresponding change must also then be made in the normalisation of the q parameter at the top of products.php. In that file, look for the following code on line 4:

$q = (isset($_GET["q"])?tapestry_normalise($_GET["q"],":\."):"");

...and replace that with:

$q = (isset($_GET["q"])?tapestry_normalise($_GET["q"],"&#;:\."):"");

Don't forget that any affected feeds will need to be imported again before any of these changes will take effect...

Hope this helps!

Cheers,
David.

Submitted by gunneradt on Mon, 2008-12-01 16:28

I haven't tried the product name fix yet, but this still hasn't worked in the descriptions, as you'll see by re-checking the two URLs I posted earlier

Submitted by support on Mon, 2008-12-01 16:32

Hi,

Did you re-import the feed after making the change?

If so, if you could email me your modified includes/admin.php I'll check it out for you!

Cheers,
David.

Submitted by gunneradt on Mon, 2008-12-01 16:50

do you mean I have to re-import all the feeds?

regards

Submitted by gunneradt on Mon, 2008-12-01 17:01

that's worked after a re-import - great stuff

I'll make the other change and then re-import all the feeds!!

I'll report back if there are any problems

regards

Submitted by gunneradt on Tue, 2008-12-02 00:55

That has worked for the most part

thanks very much

However, there are a couple of problems

If you click on the first couple of items in this merchants category search you will see that once you select the product, it doesn't then go to the item

http://www.gift-mania.co.uk/merchant/Drinkstuff/

regards

Submitted by gunneradt on Tue, 2008-12-02 00:58

it looks to me that the apostrophe in the url is causing the problem - is there a way to remove these?

Submitted by gunneradt on Tue, 2008-12-02 01:32

it still hasn't resolved all the special character problems as you'll see from this description

http://www.gift-mania.co.uk/product/Ben-10-Alien-Laboratory-Playset.html

Submitted by support on Tue, 2008-12-02 07:54

Hi,

I agree that it would be best to replace out the entities rather than permit them in the Product Name. It's still worth keeping the above mod in place, as this will allow you to add a Search and Replace filter to the Product Name field: click "Filters" alongside each feed that has products with the apostrophe in the title, enter the apostrophe's HTML entity exactly as it appears in the feed (for example "&#39;", without the quotes) in the "Search" box, and leave the "Replace" box empty. After importing the feed again these characters should not appear in the product name.

I couldn't see any specific problem with the Ben 10 description; but if that's still an issue let me know which characters are not being displayed correctly and I'll take a look...

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 08:59

I've applied that filter as a test to one particular feed, but the apostrophes still appear in the feed after I've re-imported it

any reason why?

Submitted by support on Tue, 2008-12-02 09:08

Hi,

Sorry; the value to enter in the Search box didn't appear correctly in my post above... I've corrected it now; it should be the apostrophe's HTML entity, for example "&#39;" (without the quotes), in the Search box, with the Replace box blank...

(Note that this applies to the drinkstuff.csv feed as per your example; it's possible that other feeds use a different HTML entity as there are several ways of indicating an apostrophe in HTML, in which case that entity is the search text to use!)

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 09:31

thanks David - I think that's working

I've noticed I still have special characters on the merchant category page

http://www.gift-mania.co.uk/category/

I'm going to remove some of those categories as a result but is there an easy way to replace 'pound' and 'amp' with '£' and '&' plus any others etc on this page and probably the brand page too

regards

Submitted by gunneradt on Tue, 2008-12-02 09:40

only odd consequence of removing the apostrophe correctly in the product name is that it now appears to pick the item up twice in the search results and produce two separate urls for it - one with semi-colons in now, and one without as it should be

http://www.gift-mania.co.uk/search.php?q=smellovision

Submitted by support on Tue, 2008-12-02 10:28

Hi,

The filter against the Getting Personal feed looked fine, and I checked the feed itself and there is no surplus ";", so I just imported it again and it looks OK now... So a fresh import should have been all it needed...

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 10:48

that's worked for me now too

many thanks again

Submitted by gunneradt on Tue, 2008-12-02 13:15

just a thought

is there a way to set global filters across all product names and descriptions or does it have to be per feed?

Submitted by support on Tue, 2008-12-02 13:35

Hi,

There's not, I'm afraid; but it is very easy to add "global" alterations to all feeds within the import record handler function, which begins at line 124 in includes/admin.php.

For example, to perform a global search and replace as per the filter described above, scroll down to around line 180 where you will see this comment:

/* apply user filters */

...at which point you could add code like this:

$record[$admin_importFeed["field_name"]] = str_replace("&#39;","",$record[$admin_importFeed["field_name"]]);

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 13:37

so that will perform a global search/replace for any items and field names I specify on all feeds as I import them?

Submitted by support on Tue, 2008-12-02 13:49

Correct...

To manipulate other fields, use these variables:

$record[$admin_importFeed["field_description"]]
$record[$admin_importFeed["field_image_url"]]
$record[$admin_importFeed["field_buy_url"]]
$record[$admin_importFeed["field_price"]]

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 13:49

there is this code at the moment at/after line 80

does it matter where i paste that line and add other lines? I'm assuming straight under the 'apply user filters'

/* apply user filters */
$filter_dropRecordFlag = false;

if ($admin_importFiltersExist)
{
  foreach($admin_importFilters as $filter)
  {
    $execFunction = "filter_".$filter["name"]."Exec";

    $record[$filter["field"]] = $execFunction($filter["data"],$record[$filter["field"]]);
  }
}

regards

Submitted by support on Tue, 2008-12-02 13:50

Hi,

That's correct also.

Don't miss my last reply immediately above your last post - I think we cross-posted!

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 13:51

is it correct that I use "field_name" for the product name?

or should I use "field_product_name"

regards

Submitted by support on Tue, 2008-12-02 13:54

Hi,

"field_name" is correct, yes.

Remember that changes won't be applied until after the next import (per feed)...

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 13:55

thanks, I'll try them all again

still, once it's all set up, it'll be much easier the next time

Submitted by gunneradt on Tue, 2008-12-02 14:16

that's worked brilliantly

much easier that way

Submitted by gunneradt on Tue, 2008-12-02 18:42

what term should I use for the 'category' to get rid of special characters?

cheers

Submitted by support on Tue, 2008-12-02 18:50

Hi,

What sort of special characters are you referring to?

Extended characters (i.e. accented letters etc. from UTF-8 and other character sets) are permitted by default - all dangerous characters should be removed by the default normalisation function...

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 19:18

What I meant was I have 'amp' 'quot' '39' appearing on my category page

I wanted to use the same global excludes as with product name and description if possible

regards

Submitted by support on Tue, 2008-12-02 19:26

Hi,

Sorry - I left them out of the list - category and brand are in the variables:

$record[$admin_importFeed["field_category"]]
$record[$admin_importFeed["field_brand"]]
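
As a rough sketch of how one global replacement could cover several of these fields at once: the helper name sketch_global_replace(), the sample row, and the search/replace map below are all made up for illustration and are not part of the script, so adjust the map to whatever entities your own feeds contain:

```php
<?php
// Hypothetical helper: apply one search/replace map to several feed fields.
function sketch_global_replace(array $record, array $importFeed, array $fields, array $map): array
{
    foreach ($fields as $field) {
        // The feed registration maps e.g. "field_name" to a column name in the row.
        $column = $importFeed[$field] ?? "";
        // Skip fields the feed does not register or the row does not contain.
        if ($column === "" || !isset($record[$column])) {
            continue;
        }
        $record[$column] = str_replace(array_keys($map), array_values($map), $record[$column]);
    }
    return $record;
}

// Made-up sample data standing in for $admin_importFeed and one $record.
$admin_importFeed = ["field_name" => "Name", "field_category" => "Category", "field_brand" => ""];
$record = ["Name" => "Baby&#39;s Pink Calf Nursery Set", "Category" => "Toys &amp; Games"];

// Assumed map: drop apostrophe and quote entities, spell out the ampersand.
$map = ["&#39;" => "", "&quot;" => "", "&amp;" => "and"];

$record = sketch_global_replace(
    $record,
    $admin_importFeed,
    ["field_name", "field_category", "field_brand"],
    $map
);

echo $record["Name"], "\n";     // Babys Pink Calf Nursery Set
echo $record["Category"], "\n"; // Toys and Games
```

In the import record handler the same idea would operate on the real $record and $admin_importFeed, and as before it only takes effect once each feed has been imported again.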

Cheers,
David.

Submitted by gunneradt on Tue, 2008-12-02 19:32

many thanks

Submitted by Mark on Thu, 2008-12-04 13:21

I think this answers my last question too
Mark

Comparison Shopping Sites http://cheapest-bargain.com/
http://cheapest-bargain.co.uk/
http://www-fancy-dress.com/

Submitted by babrees on Thu, 2009-02-05 07:54

I have implemented this, however, I have a problem within the description, as you can see at http://www.cashmereuk.co.uk/cashmereclothing/product/Cashmere-Fitted-Three-Quarter-Sleeve-Scoop-Neck.html

---------
Jill

Submitted by support on Thu, 2009-02-05 11:20

Hi Jill,

That particular instance looks like a character encoding error in the feed, as I have checked and the content is neither utf-8 (as it appears wrong as it stands) nor iso-8859-1.

If you could drop me an email with the filename of the feed containing that description, i'll download it from your /feeds/ folder to my test server and check it out for you...

Cheers,
David.

Submitted by DanielWestman on Mon, 2009-03-09 16:39

Hi David,

I have some products with # in the product name but that symbol got stripped away at import.
I followed your instructions and made changes in admin.php and products.php to this:

<?php
$record[$admin_importFeed["field_name"]] = tapestry_normalise($record[$admin_importFeed["field_name"]],"#");
?>

<?php
$q = (isset($_GET["q"])?tapestry_normalise($_GET["q"],"#:\."):"");
?>

The # shows in the product name, but now I come to a 404 page when I click on the product link.
I guess it has something to do with the # being in the URL?

Is there any way to keep the # in the product name but remove it from the URL so it works?

Submitted by support on Tue, 2009-03-10 12:00

Hi Daniel,

This is happening because the search engine friendly versions of the product URLs are not urlencoded - because they don't ordinarily contain any characters that need encoding. Now that # is part of your product names, look for the following code on line 166 of search.php:

$searchresults["products"][$k]["productHREF"] = "product/".tapestry_hyphenate($product["name"]).".html";

...and REPLACE it with:

$searchresults["products"][$k]["productHREF"] = "product/".urlencode(tapestry_hyphenate($product["name"])).".html";

Cheers,
David.

Submitted by DanielWestman on Tue, 2009-03-10 12:51

Hi David,

I tried replacing the code but I still get a 404. With the new code # gets replaced with %23 in the URL.
Is there any way to keep the # just in the product name and strip it away in the URL?

Regards,
Daniel.

Submitted by support on Tue, 2009-03-10 13:29

Hi Daniel,

There is a more complex mod described in the following thread:

http://www.pricetapestry.com/node/2634

This involves having 2 separate name fields in the database - one for the URL, and another that is actually displayed that can contain any characters you like.

Using # in a URL is difficult as it is used by browsers to link to an anchor within the page, but I'm not sure why it is not working after being encoded. If you'd like to email me a link to the site where I can see this I'll happily take a look for you...
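
To illustrate the anchor problem, here is a small sketch using PHP's own parse_url() and urlencode(). The host www.example.com and the product name are made up, and this only shows how the URL is split, not what the rewrite rules on any particular server do with it:

```php
<?php
// Made-up hyphenated product name containing a "#".
$name = "Widget-#5";

// Raw "#": everything after it is parsed as a fragment, not as part of the path.
$raw = parse_url("http://www.example.com/product/" . $name . ".html");

// Encoded: "#" becomes "%23" and stays inside the path.
$encoded = parse_url("http://www.example.com/product/" . urlencode($name) . ".html");

echo $raw["path"], "\n";     // /product/Widget-
echo $raw["fragment"], "\n"; // 5.html
echo $encoded["path"], "\n"; // /product/Widget-%235.html
```

With the raw form the "5.html" part is never sent to the server at all, since browsers keep the fragment on the client side.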

Cheers,
David.

Submitted by DanielWestman on Tue, 2009-03-10 15:26

Hi David,

I tried the mod in node 2634 and now it almost works except for the page title.
The title used the name field instead of name_display, so I tried changing:

<?php
$header["title"] = htmlentities($q,ENT_QUOTES,$config_charset).
?>

To:

<?php
$header["title"] = htmlentities($product["products"][0]["name_display"],ENT_QUOTES,$config_charset).
?>

And that worked!

Will that cause trouble somewhere else in the script or is it okay?

Regards,
Daniel.

Submitted by support on Tue, 2009-03-10 16:19

Hi Daniel,

That's fine!

Cheers,
David.