How to convert ASCII characters to html

Submitted by Convergence on Tue, 2012-05-01 20:47

Greetings,

Many merchants have ASCII characters instead of html, for example:

&#8482 ; (space added between the number and the ; for display purposes)

instead of

™

I attempted to create a filter > Search and Replace; however, it was not effective.

Is the ASCII cleansing version of the parser an option (/node/4203)? Any other suggestions?

Thanking you in advance for your solution!

Submitted by support on Wed, 2012-05-02 07:26

Hello Convergence and welcome to the forum!

The ASCII cleansing version of Magic Parser is more to do with handling character encoding errors rather than HTML entities which is what sequences such as &#8482 ; are.
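To illustrate what such a sequence is: a numeric character reference is the decimal Unicode code point wrapped in &# and ;, and a browser renders it as the corresponding character. A minimal sketch using PHP's standard html_entity_decode() function; the helper name decode_entities_sketch is made up for this example and is not part of Price Tapestry:

```php
<?php
// Illustrative helper (hypothetical, not part of Price Tapestry): decode HTML
// entities, including numeric character references, into UTF-8 characters.
function decode_entities_sketch($text)
{
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, "UTF-8");
}

echo decode_entities_sketch("&#8482;"), "\n";              // trademark sign
echo decode_entities_sketch("UGG&#174; Australia"), "\n";  // registered sign
```

Nothing like this is needed in the script itself, since the browser does the decoding as long as the entity reaches the page intact.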

The way these would be handled depends on which field they appear in. If they appear in the product name field, then in fact & and ; would be stripped as part of the normalisation process, so you can permit them as follows. In includes/tapestry.php look for the following code at line 21:

    $text = str_replace("-"," ",$text);

...and REPLACE with:

    $text = str_replace("-"," ",$text);
    $allow .= "&#;";

With that in place, if you re-import, HTML entities should then be accepted and appear as expected...
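As a rough illustration of why the extra characters matter, here is a simplified allow-list normaliser. This is only a sketch under my own assumptions: normalise_sketch is a hypothetical stand-in, and the real function in includes/tapestry.php may well work differently.

```php
<?php
// Hypothetical stand-in for an allow-list normaliser; the actual code in
// includes/tapestry.php may differ.
function normalise_sketch($text, $allow = "")
{
    $text = str_replace("-", " ", $text);
    // Keep letters, digits and spaces, plus any extra permitted characters.
    $pattern = '/[^A-Za-z0-9 ' . preg_quote($allow, '/') . ']/';
    return preg_replace($pattern, '', $text);
}

echo normalise_sketch("Widget&#8482; Pro"), "\n";         // Widget8482 Pro
echo normalise_sketch("Widget&#8482; Pro", "&#;"), "\n";  // Widget&#8482; Pro
```

Without the three extra characters the entity collapses to bare digits; with them it survives and the browser can render it.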

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Wed, 2012-05-02 14:38

Hello David,

Thank you for the welcome message and your prompt reply.

Typically the ASCII characters are in the product description field, as in "and UGG&#174 ; Australia comfort". The manufacturer in this instance requires their brand to be displayed as UGG®. Is there a solution to convert these?

Thanks, again!

Submitted by support on Wed, 2012-05-02 15:16

Hi,

There's no alteration of the description field as imported, so I wonder if it might be the "double-encoding" issue where the ampersand ends up being encoded twice. If this is the case, try adding a Search and Replace filter against the Description field for this feed as follows:

Search:

&amp;

Replace:

&

If this doesn't help, please let me know the URL of your installation and the filename of the feed containing this markup (I'll remove the details before publishing your reply), and I'll download the feed from your /feeds/ folder to my test server and check it out for you...

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Wed, 2012-05-02 15:41

Hello David,

The install is on our production server and not yet live. We have also removed the admin password so you have access. Please note that we are in the process of playing with templates and so forth.

{link saved}

Also, we only have ONE feed for testing purposes, we chose this merchant because of the ASCII issue - Fun, huh? :)

Thanks!

Submitted by support on Wed, 2012-05-02 16:03

Hi,

Yes - it is the double-encoding scenario - the ampersand of the HTML entity is itself entity encoded. If you search for the product "Naya - Aloha (Black Leather)" (just search on "Naya Aloha") and then View > Source in your browser you will find this:

&amp;#174;

So the Search and Replace suggested above should fix it, but I don't see it currently added, so if you add a new Search and Replace filter to the Description field for your feed, using:

Search:

&amp;

Replace:

&

...and then re-import; that will do the trick!
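For anyone curious what that filter amounts to, it is a plain string replacement of the five characters &amp; with a single &. A minimal sketch of the effect; the function name undo_double_encoding_sketch is made up for this example, and the filter's internal implementation is not shown here:

```php
<?php
// Illustrative only: mimics the effect of a Search "&amp;" / Replace "&" filter.
function undo_double_encoding_sketch($text)
{
    return str_replace("&amp;", "&", $text);
}

// Double-encoded text as it would appear in the page source.
$stored = "and UGG&amp;#174; Australia comfort";
echo undo_double_encoding_sketch($stored), "\n";  // and UGG&#174; Australia comfort
```

Once the ampersand is a single & again, the browser sees an ordinary numeric entity and displays the registered sign.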

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Wed, 2012-05-02 16:56

Hello David,

Ah, I see the light!

Will give it a go...

Thanks!

Submitted by Convergence on Wed, 2012-05-02 17:14

Hello David,

This will be the first of many of the following comments - get used to it.

As we say on this side of the pond...

"You da man!"

and,

Thank you!