EAN Extraction

Submitted by stuball2 on Tue, 2014-04-01 12:55

Hello,

You've helped me out over email, but I thought it would be best to post here (as I have certainly found the forum to be very useful as a result of other people's questions).

Currently I run a Preg Match filter against a description feed using "EAN: ([0-9]*)" with a return value of 1, and this successfully extracts the EAN from the main description. I have since found that some EANs are stored in the description as "EAN/MPN/ISBN: 65161651651", so I created a new field called ean2, mapped it to description, and run a separate Preg Match against it.

I'm assuming there is a better way to do this with some kind of OR statement?

Additionally, once I have pulled the EANs into the ean and ean2 columns, I use an SQL UPDATE statement to copy the ean2 values across where ean is empty.
I am, however, having some issues with product mapping: two products with the same EAN are showing as two different products rather than as a price comparison. I have printed the EAN in the product description and it is definitely the same across the two.

My process was:
Import feed1 (with own EAN field)
Import feed 2 (pulling EAN from description to ean and ean2 fields)
Update ean with ean2 when ean is empty.

Run /scripts/uidmap.php which results in one phase 0 entry and then lots of phase 1 lines and then done. I have previously run the two mod php files as per the ean article on here.

So basically could you help me with:
Is it messing up because of my SQL tinkering?
Is there a nicer way to pull EAN?

Thanks!
Stu.

Submitted by support on Tue, 2014-04-01 14:02

Hello Stu, and welcome to the forum!

The RegExp used in your Preg Match filter can be extended to match either "EAN:" or "EAN/MPN/ISBN:". The expression to use would be:

(EAN\/MPN\/ISBN|EAN): ([0-9]*)

...and a Return Index of 2.

If you subsequently discover other combinations, they can be added to the list in the first bracketed part of the expression, where each possibility is separated by the "|" (pipe) character. If the value needs to contain "/" then it must be escaped, to become "\/".
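As a rough standalone sketch of how the expression behaves, the following runs it through PHP's preg_match() against two sample descriptions. The sample descriptions are made up for illustration, and wrapping the expression in "/" delimiters is an assumption about how the filter applies it (which would be why "/" needs escaping):

```php
<?php
// Expression from above, wrapped in "/" delimiters (assumed) for preg_match().
$pattern = '/(EAN\/MPN\/ISBN|EAN): ([0-9]*)/';

// Hypothetical descriptions, for illustration only.
$descriptions = array(
  'Cordless drill. EAN: 5012345678900',
  'Paperback edition. EAN/MPN/ISBN: 65161651651',
);

$results = array();
foreach ($descriptions as $description) {
  // $matches[1] holds the label that matched, $matches[2] holds the digits,
  // which is why the Return Index moves from 1 to 2.
  if (preg_match($pattern, $description, $matches)) {
    $results[] = $matches[2];
  }
}

print implode("\n", $results) . "\n";
```

The longer alternative is listed first so that the first bracketed group records the full "EAN/MPN/ISBN" label when it is present; the digits land in the second group either way.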

(there's a good RegExp "cheat sheet" here)

With that in place it can all be applied to your original `ean` field so there would be no need for the ean2 part.

Regarding the Automatic Product Mapping by UID, to check for identical values that would indicate that uidmap.php isn't quite working as expected, I often append the ean value to the product name displayed in the price comparison table, enclosed in [square brackets] which helps to check for any leading or trailing space. To do this, in html/prices.php search for:

print $product["original_name"];

...and REPLACE with:

print $product["original_name"]." [".$product["ean"]."]";

Check in particular for different versions of zero padding, as I've come across this as an issue with EAN mapping before; that can be treated easily at import time. However, if the values look absolutely identical with the above test, let me know and I'll check it out further for you...
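To illustrate the kind of difference to look for, here is a minimal sketch of normalising two values that differ only by stray spaces and leading zeros. The normalise_ean() helper and the sample values are hypothetical, and left-padding to 13 digits is an assumption; it is not how the script itself treats the field:

```php
<?php
// Hypothetical helper: trim whitespace, then left-pad numeric values to 13 digits.
function normalise_ean($ean) {
  $ean = trim($ean);
  if ($ean === '' || !ctype_digit($ean)) {
    return $ean; // leave empty or non-numeric values untouched
  }
  // str_pad() leaves values that are already 13 digits or longer unchanged.
  return str_pad($ean, 13, '0', STR_PAD_LEFT);
}

// Hypothetical values as they might appear in two different feeds.
$a = normalise_ean(' 0012345678905');
$b = normalise_ean('12345678905 ');

print "[" . $a . "] [" . $b . "]\n";
```

After normalising, both values compare as equal, whereas the raw strings would be treated as two different UIDs.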

Cheers,
David.
--
PriceTapestry.com

Submitted by stuball2 on Tue, 2014-04-01 14:23

Hi David,

Thanks so much, once again a quick and thorough response.
The failure in mapping by EAN must have been related to my SQL tinkering as it now works perfectly :)

Thanks
Stuart.