You are here:  » Map by EAN but keep certain fields


Map by EAN but keep certain fields

Submitted by affinity-group on Fri, 2018-11-16 12:58 in

Hi David,

We're currently using the EAN mod, but want to keep certain fields from the duplicates (such as the buy_url, size and so on).

Do you have any advice on how to achieve this? We've tried a couple of things, but since we're running quite a lot of feeds (over 2M records) we keep running into performance issues.

Best regards,

Affinity

Submitted by support on Fri, 2018-11-16 15:18

Hi,

Automatic Product Mapping by UID doesn't remove anything that would not ordinarily be deleted as duplicate products per merchant so I'm not sure what context the fields you want to keep are in - can you give a couple of examples perhaps with merchant / product names / EANs and the corresponding fields that you would like to keep?

Thanks,
David.
--
PriceTapestry.com

Submitted by affinity-group on Sun, 2018-11-18 11:47

Hi David,

My bad, I thought products were actually disappearing but I found that they are not.

I actually want to retrieve a list of (say 10) products and group them by EAN (or product name since they will now be equal thanks to the UID map).

I also want to retrieve the lowest price when the products are grouped.

Do you have any advice on how to query this?

I currently use this query, but as said, this sometimes becomes very slow and doesn't return the lowest price
{code saved}

Thanks in advance!

Submitted by support on Mon, 2018-11-19 08:48

Hi,

Since you are using GROUP BY you can select the lowest price using

MIN(`pt_products`.`price`) AS minPrice

...and use that summary field as your ORDER BY clause;

ORDER BY minPrice ASC

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by affinity-group on Mon, 2018-11-19 10:54

Hi David,

Thanks for your reply, the query is now really fast!

However one little issue remaining, (all) the results now sort by price, but I don't get the lowest price of the single product.

So to clearify, I currently run this query:

SELECT
`pt_products`.`merchant`,
`pt_products`.`name`,
`pt_products`.`ean`,
`pt_products`.`image_url`,
`pt_products`.`dupe_hash`,
`pt_products`.`price_old`,
MIN(`pt_products`.`price`) AS minPrice,
(SELECT COUNT(*) count FROM pt_products WHERE search_name = `pt_products`.`search_name`) AS `pt_products__shop_count`
FROM `dbname`.`pt_products` AS `pt_products`
GROUP BY `pt_products`.`search_name`
ORDER BY minPrice ASC LIMIT 10

I get 10 results returned really fast. However there is one product in this result set with a shop_count of 2, and it shows the highest price instead of the lowest.

Thank you again

Submitted by support on Mon, 2018-11-19 14:50

Hi,

Have a go with COUNT(id) AS `pt_products__shop_count` instead of the sub-query - something like;

SELECT
`pt_products`.`merchant`,
`pt_products`.`name`,
`pt_products`.`ean`,
`pt_products`.`image_url`,
`pt_products`.`dupe_hash`,
`pt_products`.`price_old`,
MIN(`pt_products`.`price`) AS minPrice,
COUNT(id) AS `pt_products__shop_count`
FROM `dbname`.`pt_products` AS `pt_products`
GROUP BY `pt_products`.`search_name`
ORDER BY minPrice ASC LIMIT 10

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by affinity-group on Mon, 2018-11-19 18:32

Hi David,

I think it's working now, I thought the query wasn't working but apparently the other product with the lower price didn't have a same categoryId assigned.

Is there an easy fix to also set the categoryId according to the duplicate 'parent' product? I'm planning on changing the uid mod, but if there's an easier fix please let me know.

Thank you!

Submitted by support on Tue, 2018-11-20 08:30

Hi,

The easiest way would be to add categoryid to the uid mod, so first in uiddbmod.php where you have the following code at line 26:

    `name` VARCHAR(255) NOT NULL,

...REPLACE with:

    `name` VARCHAR(255) NOT NULL,
    `categoryid` INT(11) NOT NULL,

And then in uidmap.php where you have the following code at line 38:

      $sql2 = "SELECT SQL_CALC_FOUND_ROWS name FROM `".$config_databaseTablePrefix."products` WHERE ".$config_uidField."='".database_safe($product[$config_uidField])."' ORDER BY name LIMIT 1";

...REPLACE with:

      $sql2 = "SELECT SQL_CALC_FOUND_ROWS name,categoryid FROM `".$config_databaseTablePrefix."products` WHERE ".$config_uidField."='".database_safe($product[$config_uidField])."' ORDER BY name LIMIT 1";

And then the following code at line 56:

        $sql2 = "INSERT INTO `".$config_databaseTablePrefix."uidfix` SET uid='".database_safe($product[$config_uidField])."',name='".database_safe($product["name"])."'";

...and REPLACE with:

        $sql2 = "INSERT INTO `".$config_databaseTablePrefix."uidfix` SET uid='".database_safe($product[$config_uidField])."',name='".database_safe($product["name"])."',categoryid='".database_safe($product2["categoryid"])."'";

And finally the following code at line 77:

      $sql2 = "UPDATE `".$config_databaseTablePrefix."products` SET name='".database_safe($product["name"])."',normalised_name='".database_safe($normalisedName)."',search_name='".database_safe($searchName)."' WHERE ".$config_uidField."='".database_safe($product["uid"])."'";

...REPLACE with:

      $sql2 = "UPDATE `".$config_databaseTablePrefix."products` SET name='".database_safe($product["name"])."',normalised_name='".database_safe($normalisedName)."',search_name='".database_safe($searchName)."',categoryid='".database_safe($product["categoryid"])."' WHERE ".$config_uidField."='".database_safe($product["uid"])."'";

Cheers,
David.
--
PriceTapestry.com

Submitted by affinity-group on Tue, 2018-11-20 12:48

Hi David,

Two slight changes I had to to:
1) There was an extra comma in your query at line 77 which had to be removed (before the WHERE)
2) The $product["categoryid"] wasn't set on line 56, it had to be $product2["categoryid"]

Nevertheless the uid mod now works perfectly, many thanks for your great support :)!

Submitted by support on Tue, 2018-11-20 13:08

Ooops - corrected above!

Glad you're up and running.

Cheers,
David.
--
PriceTapestry.com