Special Characters in Product Mapping and Filters

Submitted by Mahony on Thu, 2012-01-12 14:44

Hi David,

I have a problem with special characters like ä, ö, ü...

These characters are in some of the product names. The first issue was the product URL.
For example, if there was an ö in the title, ptc also put the ö in the product link/URL.

I solved this problem with this code in tapestry.php:

<?php
$text = str_replace("-"," ",$text);

// Transliterate German umlauts and ß before the URL is built
$text = str_replace("ü","ue",$text);
$text = str_replace("ö","oe",$text);
$text = str_replace("ä","ae",$text);
$text = str_replace("ß","ss",$text);

/*if ($config_charset)
    {
      $allow = chr(0x80).'-'.chr(0xFF).$allow;
    }*/
?>

The next problems I'm struggling with now are the Product Mapping Tool and the Feed Registration Filters.

First, the filters: I can't "Search and Replace" a word in the title if it has a special character in it.
For example, I can "Search and Replace" the word Buch in the title "Buch (3 Stück)" but I can't replace "(3 Stück)".

Second issue: if there is any special character in the title, the Product Mapping Tool will not find the product. It finds neither "Buch" nor "Stück".

I hope you can help me with that. Thanks.

Tony

Submitted by support on Thu, 2012-01-12 14:56

Hi Tony,

Both issues imply that the database collation may not be a utf-8 variant. A couple of questions:

- In your configuration (config.php) are you still using the default value for $config_charset, which is "utf-8"?

- In /admin/ when you create a Search and Replace filter on the product name, for example Search: "Stück" and Replace: "Stuck", is the filter saved correctly, i.e. if you edit the filter again do the search and replace terms appear correctly, and it's just that the filter isn't being applied as expected?

I do have a comprehensive special characters modification, which many users of non-English language sites use, that makes the site "special character insensitive" by converting all accented vowels into non-accented equivalents at import time, and applying the same translation in search so that it wouldn't matter if your users searched "Stück" or "Stuck" - let me know if you would be interested in that patch...

Cheers,
David.
--
PriceTapestry.com
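
As a rough illustration of the "special character insensitive" idea described above, the sketch below applies one folding function both at import time and at search time. The function name fold_accents() and the small mapping are assumptions made for this example; it is not the actual Price Tapestry patch.

```php
<?php
// Hypothetical helper, not part of Price Tapestry: maps accented
// characters to unaccented equivalents. The mapping is a small sample.
function fold_accents(string $text): string
{
    $map = array(
        "ä" => "a", "ö" => "o", "ü" => "u",
        "Ä" => "A", "Ö" => "O", "Ü" => "U",
        "é" => "e", "è" => "e", "á" => "a",
    );
    // strtr() with an array replaces each key with its value in one pass.
    return strtr($text, $map);
}

// At import time the stored name is folded...
$storedName = fold_accents("Buch (3 Stück)");

// ...and the same folding is applied to whatever the user types.
$queryAccented = fold_accents("Stück");
$queryPlain = fold_accents("Stuck");

// Both queries can now be found in the stored name.
$matchesAccented = (strpos($storedName, $queryAccented) !== false);
$matchesPlain = (strpos($storedName, $queryPlain) !== false);
```

The point of the sketch is that import and search share a single mapping, so the two sides cannot drift apart.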

Submitted by Mahony on Thu, 2012-01-12 15:10

I'm using UTF-8 in my WordPress config and in the ptc config.

I am applying UTF8 Decode to the feed at import.

- Correct. The Search and Replace filter "Stück" is saved correctly, and I can also edit it, but it doesn't get applied at import.

- Are there any downsides if I use the patch? Will it overwrite the special characters in the titles?

Tony

Submitted by Mahony on Thu, 2012-01-12 15:18

I don't understand it, but take a look at this...

{link saved}

{link saved}

In WordPress the titles are correct. In the root directory of Price Tapestry they are not.

Submitted by support on Thu, 2012-01-12 15:20

Hi Tony,

Ah - that may be the problem as UTF8 Decode is UTF-8 to ISO-8859-1 conversion, so it sounds like you may need to be using UTF-8 Encode instead.

Bear in mind that filters are applied in the order that they are created, so in order to use a Search and Replace filter based on UTF-8 characters it would need to come after the conversion has been applied. I'm afraid the existing filters would need to be deleted and re-created, but that may be all it is.

The special characters mod does convert the name as used in the URL to the non-accented version so may not be suitable if you wish to maintain special characters in the URL...

Cheers,
David.
--
PriceTapestry.com
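
To make the encoding point above concrete, here is a small sketch of one way a Search and Replace term can be saved correctly and still fail to match: the term is stored as UTF-8 bytes while the data has been converted to ISO-8859-1. Using iconv() as a stand-in for the UTF8 Decode filter is an assumption for illustration; the filter's real implementation is not shown in this thread.

```php
<?php
// "Buch (3 Stück)" as UTF-8 bytes; ü is the two bytes 0xC3 0xBC. Escapes are
// used so the example does not depend on how this file itself is saved.
$utf8Title = "Buch (3 St\xC3\xBCck)";

// A UTF-8 to ISO-8859-1 conversion turns ü into the single byte 0xFC.
// iconv() stands in for the UTF8 Decode filter here (assumption).
$latin1Title = iconv("UTF-8", "ISO-8859-1", $utf8Title);

// A search term entered and saved as UTF-8.
$search = "St\xC3\xBCck";

// Applied to UTF-8 data: the bytes match and the word is replaced.
$onUtf8 = str_replace($search, "Stuck", $utf8Title);

// Applied after the conversion: no byte sequence matches, nothing changes.
$onLatin1 = str_replace($search, "Stuck", $latin1Title);

// Hex dumps make the one-byte difference visible.
echo bin2hex($utf8Title), "\n";
echo bin2hex($latin1Title), "\n";
```

str_replace() compares raw bytes, so the search term and the data have to be in the same encoding at the moment the filter runs.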

Submitted by Mahony on Thu, 2012-01-12 15:23

OK, I have now changed it to $config_charset = "iso-8859-1"; and both versions are displayed correctly (ptc and the WordPress plugin).
Filters and Product Mapping are doing their job now too.

Submitted by support on Thu, 2012-01-12 15:25

Hi Tony,

Yes - I saw when I went to the links that it appears to be working fine now - let me know if you're still not sure of course...

Cheers,
David.
--
PriceTapestry.com

Submitted by Mahony on Thu, 2012-01-12 15:29

I don't want special characters in the URL but I want to keep them in the title. Is there a mod for the search, because I can't search for stück :)

Submitted by support on Thu, 2012-01-12 15:41

Hi Tony,

Yes - URLs are generated using the normalised_name which by default permits all characters except those which are not URL safe. I'll email you an alternative version which will convert accented vowels in the URL (rather than strip them) but leave them in place in the title...

Cheers,
David.
--
PriceTapestry.com
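
A minimal sketch of converting accented characters for the URL while leaving the title untouched. The function url_safe_name() and its rules are assumptions for illustration only; the actual normalised_name logic in tapestry.php is not shown in this thread and may differ.

```php
<?php
// Hypothetical helper: derive a URL-friendly name from a product title.
function url_safe_name(string $title): string
{
    // Convert German umlauts and ß instead of stripping them.
    $search = array("ä", "ö", "ü", "Ä", "Ö", "Ü", "ß");
    $replace = array("ae", "oe", "ue", "Ae", "Oe", "Ue", "ss");
    $name = str_replace($search, $replace, $title);

    // Drop everything that is not an ASCII letter, digit, space or hyphen.
    $name = preg_replace('/[^A-Za-z0-9 \-]/', '', $name);

    // Collapse runs of spaces left behind by removed characters.
    return trim(preg_replace('/ +/', ' ', $name));
}

// The title keeps its special characters; only the derived name changes.
$title = "Buch (3 Stück)";
$urlName = url_safe_name($title);
```

Because the URL name is derived into a separate variable, the displayed title is never modified.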

Submitted by lecolet on Wed, 2012-05-30 20:34

Hello, David,

Could you please e-mail me this mod too? Thank you very much!

Submitted by support on Thu, 2012-05-31 09:29

On its way... (based on 12/10B distribution versions - if you have already made any mods to includes/admin.php, includes/tapestry.php or search.php let me know and I'll patch your versions...)

Cheers,
David.
--
PriceTapestry.com

Submitted by lecolet on Thu, 2012-05-31 11:56

Thank you for the files. I forgot to mention that I need the mod that you sent to Tony - the one that changes characters in the URLs, but not in the name itself. I want to use it to transliterate Cyrillic characters into English ones in the URL. I've tried those two files that you sent, and they change the search name (I don't need that, because users will be searching in Cyrillic), but I need the name for the URL to change (is it the normalised name?). And what would be the equivalent of the tapestry.php file for the WordPress installation? Thank you.

Submitted by support on Thu, 2012-05-31 14:27

Hi,

Ah, I understand - this should be straightforward (and doesn't need any changes in the WordPress plugin files, because normalised_name (which makes the URL) is generated at import time).

Would it be straightforward for you to post the translation that you require, e.g. Cyrillic Symbol = English (upper and lower case), then I will copy that into a modification in tapestry.php for you... I did search for translations but there seem to be various different versions so I want to make sure it's the one you need...

Thanks!
David.
--
PriceTapestry.com

Submitted by lecolet on Thu, 2012-05-31 16:08

Thank you! Here is the transliteration. A few characters are left blank - that is correct... And, just to make sure... is the normalised name used only to form the URL? If I change the characters, would it affect anything else? I need it just for the URL.

$tapestry_accentedVowels["A"] = array("А");
$tapestry_accentedVowels["B"] = array("Б");
$tapestry_accentedVowels["V"] = array("В");
$tapestry_accentedVowels["G"] = array("Г");
$tapestry_accentedVowels["D"] = array("Д");
$tapestry_accentedVowels["E"] = array("Е");
$tapestry_accentedVowels["E"] = array("Ё");
$tapestry_accentedVowels["ZH"] = array("Ж");
$tapestry_accentedVowels["Z"] = array("З");
$tapestry_accentedVowels["I"] = array("И");
$tapestry_accentedVowels["J"] = array("Й");
$tapestry_accentedVowels["K"] = array("К");
$tapestry_accentedVowels["L"] = array("Л");
$tapestry_accentedVowels["M"] = array("М");
$tapestry_accentedVowels["N"] = array("Н");
$tapestry_accentedVowels["O"] = array("О");
$tapestry_accentedVowels["P"] = array("П");
$tapestry_accentedVowels["R"] = array("Р");
$tapestry_accentedVowels["S"] = array("С");
$tapestry_accentedVowels["T"] = array("Т");
$tapestry_accentedVowels["U"] = array("У");
$tapestry_accentedVowels["F"] = array("Ф");
$tapestry_accentedVowels["KH"] = array("Х");
$tapestry_accentedVowels["TS"] = array("Ц");
$tapestry_accentedVowels["CH"] = array("Ч");
$tapestry_accentedVowels["SH"] = array("Ш");
$tapestry_accentedVowels["SHH"] = array("Щ");
$tapestry_accentedVowels[""] = array("Ь");
$tapestry_accentedVowels["Y"] = array("Ы");
$tapestry_accentedVowels[""] = array("Ъ");
$tapestry_accentedVowels["E"] = array("Э");
$tapestry_accentedVowels["YU"] = array("Ю");
$tapestry_accentedVowels["YA"] = array("Я");
$tapestry_accentedVowels["a"] = array("а");
$tapestry_accentedVowels["b"] = array("б");
$tapestry_accentedVowels["v"] = array("в");
$tapestry_accentedVowels["g"] = array("г");
$tapestry_accentedVowels["d"] = array("д");
$tapestry_accentedVowels["e"] = array("е");
$tapestry_accentedVowels["e"] = array("ё");
$tapestry_accentedVowels["zh"] = array("ж");
$tapestry_accentedVowels["z"] = array("з");
$tapestry_accentedVowels["i"] = array("и");
$tapestry_accentedVowels[""] = array("й");
$tapestry_accentedVowels["k"] = array("к");
$tapestry_accentedVowels["l"] = array("л");
$tapestry_accentedVowels["m"] = array("м");
$tapestry_accentedVowels["n"] = array("н");
$tapestry_accentedVowels["o"] = array("о");
$tapestry_accentedVowels["p"] = array("п");
$tapestry_accentedVowels["r"] = array("р");
$tapestry_accentedVowels["s"] = array("с");
$tapestry_accentedVowels["t"] = array("т");
$tapestry_accentedVowels["u"] = array("у");
$tapestry_accentedVowels["f"] = array("ф");
$tapestry_accentedVowels["kh"] = array("х");
$tapestry_accentedVowels["ts"] = array("ц");
$tapestry_accentedVowels["ch"] = array("ч");
$tapestry_accentedVowels["sh"] = array("ш");
$tapestry_accentedVowels["shh"] = array("щ");
$tapestry_accentedVowels[""] = array("ъ");
$tapestry_accentedVowels[""] = array("ь");
$tapestry_accentedVowels["y"] = array("ы");
$tapestry_accentedVowels["e"] = array("э");
$tapestry_accentedVowels["yu"] = array("ю");
$tapestry_accentedVowels["ya"] = array("я");

Thanks,
Lana

Submitted by support on Thu, 2012-05-31 16:49

Thanks Lana,

I've followed up by email with the above translation converted into single search and replace arrays for the most efficient use with PHP's str_replace() function and combined with the tapestry_normalise() function...

Cheers,
David.
--
PriceTapestry.com

Submitted by lecolet on Fri, 2012-06-01 10:28

Thank you, David, it works great! I put

$q = (isset($_GET["q"])?$_GET["q"]:"");

back into search.php (otherwise it was using the transliterated name on the "product not found" result page), and here is the corrected transliteration (I made a mistake in one of the characters, and also switched the English/Cyrillic characters, because they were the other way around).

$tapestry_cyrillicSearch[]="А";$tapestry_cyrillicReplace[] = "A";
$tapestry_cyrillicSearch[]="Б";$tapestry_cyrillicReplace[] = "B";
$tapestry_cyrillicSearch[]="В";$tapestry_cyrillicReplace[] = "V";
$tapestry_cyrillicSearch[]="Г";$tapestry_cyrillicReplace[] = "G";
$tapestry_cyrillicSearch[]="Д";$tapestry_cyrillicReplace[] = "D";
$tapestry_cyrillicSearch[]="Е";$tapestry_cyrillicReplace[] = "E";
$tapestry_cyrillicSearch[]="Ё";$tapestry_cyrillicReplace[] = "E";
$tapestry_cyrillicSearch[]="Ж";$tapestry_cyrillicReplace[] = "ZH";
$tapestry_cyrillicSearch[]="З";$tapestry_cyrillicReplace[] = "Z";
$tapestry_cyrillicSearch[]="И";$tapestry_cyrillicReplace[] = "I";
$tapestry_cyrillicSearch[]="Й";$tapestry_cyrillicReplace[] = "J";
$tapestry_cyrillicSearch[]="К";$tapestry_cyrillicReplace[] = "K";
$tapestry_cyrillicSearch[]="Л";$tapestry_cyrillicReplace[] = "L";
$tapestry_cyrillicSearch[]="М";$tapestry_cyrillicReplace[] = "M";
$tapestry_cyrillicSearch[]="Н";$tapestry_cyrillicReplace[] = "N";
$tapestry_cyrillicSearch[]="О";$tapestry_cyrillicReplace[] = "O";
$tapestry_cyrillicSearch[]="П";$tapestry_cyrillicReplace[] = "P";
$tapestry_cyrillicSearch[]="Р";$tapestry_cyrillicReplace[] = "R";
$tapestry_cyrillicSearch[]="С";$tapestry_cyrillicReplace[] = "S";
$tapestry_cyrillicSearch[]="Т";$tapestry_cyrillicReplace[] = "T";
$tapestry_cyrillicSearch[]="У";$tapestry_cyrillicReplace[] = "U";
$tapestry_cyrillicSearch[]="Ф";$tapestry_cyrillicReplace[] = "F";
$tapestry_cyrillicSearch[]="Х";$tapestry_cyrillicReplace[] = "KH";
$tapestry_cyrillicSearch[]="Ц";$tapestry_cyrillicReplace[] = "TS";
$tapestry_cyrillicSearch[]="Ч";$tapestry_cyrillicReplace[] = "CH";
$tapestry_cyrillicSearch[]="Ш";$tapestry_cyrillicReplace[] = "SH";
$tapestry_cyrillicSearch[]="Щ";$tapestry_cyrillicReplace[] = "SHH";
$tapestry_cyrillicSearch[]="Ь";$tapestry_cyrillicReplace[] = "";
$tapestry_cyrillicSearch[]="Ы";$tapestry_cyrillicReplace[] = "Y";
$tapestry_cyrillicSearch[]="Ъ";$tapestry_cyrillicReplace[] = "";
$tapestry_cyrillicSearch[]="Э";$tapestry_cyrillicReplace[] = "E";
$tapestry_cyrillicSearch[]="Ю";$tapestry_cyrillicReplace[] = "YU";
$tapestry_cyrillicSearch[]="Я";$tapestry_cyrillicReplace[] = "YA";
$tapestry_cyrillicSearch[]="а";$tapestry_cyrillicReplace[] = "a";
$tapestry_cyrillicSearch[]="б";$tapestry_cyrillicReplace[] = "b";
$tapestry_cyrillicSearch[]="в";$tapestry_cyrillicReplace[] = "v";
$tapestry_cyrillicSearch[]="г";$tapestry_cyrillicReplace[] = "g";
$tapestry_cyrillicSearch[]="д";$tapestry_cyrillicReplace[] = "d";
$tapestry_cyrillicSearch[]="е";$tapestry_cyrillicReplace[] = "e";
$tapestry_cyrillicSearch[]="ё";$tapestry_cyrillicReplace[] = "e";
$tapestry_cyrillicSearch[]="ж";$tapestry_cyrillicReplace[] = "zh";
$tapestry_cyrillicSearch[]="з";$tapestry_cyrillicReplace[] = "z";
$tapestry_cyrillicSearch[]="и";$tapestry_cyrillicReplace[] = "i";
$tapestry_cyrillicSearch[]="й";$tapestry_cyrillicReplace[] = "j";
$tapestry_cyrillicSearch[]="к";$tapestry_cyrillicReplace[] = "k";
$tapestry_cyrillicSearch[]="л";$tapestry_cyrillicReplace[] = "l";
$tapestry_cyrillicSearch[]="м";$tapestry_cyrillicReplace[] = "m";
$tapestry_cyrillicSearch[]="н";$tapestry_cyrillicReplace[] = "n";
$tapestry_cyrillicSearch[]="о";$tapestry_cyrillicReplace[] = "o";
$tapestry_cyrillicSearch[]="п";$tapestry_cyrillicReplace[] = "p";
$tapestry_cyrillicSearch[]="р";$tapestry_cyrillicReplace[] = "r";
$tapestry_cyrillicSearch[]="с";$tapestry_cyrillicReplace[] = "s";
$tapestry_cyrillicSearch[]="т";$tapestry_cyrillicReplace[] = "t";
$tapestry_cyrillicSearch[]="у";$tapestry_cyrillicReplace[] = "u";
$tapestry_cyrillicSearch[]="ф";$tapestry_cyrillicReplace[] = "f";
$tapestry_cyrillicSearch[]="х";$tapestry_cyrillicReplace[] = "kh";
$tapestry_cyrillicSearch[]="ц";$tapestry_cyrillicReplace[] = "ts";
$tapestry_cyrillicSearch[]="ч";$tapestry_cyrillicReplace[] = "ch";
$tapestry_cyrillicSearch[]="ш";$tapestry_cyrillicReplace[] = "sh";
$tapestry_cyrillicSearch[]="щ";$tapestry_cyrillicReplace[] = "shh";
$tapestry_cyrillicSearch[]="ь";$tapestry_cyrillicReplace[] = "";
$tapestry_cyrillicSearch[]="ъ";$tapestry_cyrillicReplace[] = "";
$tapestry_cyrillicSearch[]="ы";$tapestry_cyrillicReplace[] = "y";
$tapestry_cyrillicSearch[]="э";$tapestry_cyrillicReplace[] = "e";
$tapestry_cyrillicSearch[]="ю";$tapestry_cyrillicReplace[] = "yu";
$tapestry_cyrillicSearch[]="я";$tapestry_cyrillicReplace[] = "ya";

Thank you,
Lana
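
For readers who want to see how paired search and replace arrays like the ones above are consumed, here is a short sketch using a small subset of the mapping. The helper transliterate_name() and the variable names are assumptions for illustration; how the arrays are wired into tapestry_normalise() was sent by email and is not shown in this thread.

```php
<?php
// Small subset of the Cyrillic mapping, for illustration only.
$cyrillicSearch = array("С", "с", "т", "о", "л", "ь", "н", "ж");
$cyrillicReplace = array("S", "s", "t", "o", "l", "", "n", "zh");

// Hypothetical helper: transliterate a product name for use in a URL.
function transliterate_name(string $name, array $search, array $replace): string
{
    // str_replace() pairs entries by position: $search[0] -> $replace[0], etc.
    return str_replace($search, $replace, $name);
}

// The title itself is left alone; only the derived URL name is Latin.
$title = "Стол";
$urlName = transliterate_name($title, $cyrillicSearch, $cyrillicReplace);

// A soft sign maps to the empty string, so it simply disappears.
$withoutSoftSign = transliterate_name("соль", $cyrillicSearch, $cyrillicReplace);

// One Cyrillic letter can expand to two Latin letters.
$withDigraph = transliterate_name("нож", $cyrillicSearch, $cyrillicReplace);
```

Note that str_replace() skips an empty search entry, so a character that should be removed has to sit on the search side with an empty string on the replace side.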