Support Forum



Importing feeds without de-duplicating

Submitted by 4theuk ltd on Tue, 2009-11-03 11:51

I am using the script to import an XML feed of classified adverts. Because of the way the script is designed, it normalises the products and de-duplicates them.

Is there a way to stop it de-duplicating during import?

Thanks

Simon

Submitted by support on Tue, 2009-11-03 14:55

Hi Simon,

The de-duplication can be disabled, but other modifications may be required, as various parts of the script GROUP BY name in order to provide the comparison. Removing the de-duplication would be the first step, however.

In includes/admin.php, look for the following code beginning at around line 248:

    /* create dupe_hash value */
    $dupe_key = $admin_importFeed["merchant"];
    // uncomment any additional fields that you wish to filter duplicates on (description not recommended)
    $dupe_key .= $record[$admin_importFeed["field_name"]];
    // $dupe_key .= $record[$admin_importFeed["field_description"]];
    // $dupe_key .= $record[$admin_importFeed["field_image_url"]];
    // $dupe_key .= $record[$admin_importFeed["field_buy_url"]];
    // $dupe_key .= $record[$admin_importFeed["field_price"]];
    $dupe_hash = md5($dupe_key);

...and REPLACE this with:

    /* create dupe_hash value */
    $dupe_key = $admin_importFeed["merchant"];
    // uncomment any additional fields that you wish to filter duplicates on (description not recommended)
    // $dupe_key .= $record[$admin_importFeed["field_name"]];
    // $dupe_key .= $record[$admin_importFeed["field_description"]];
    // $dupe_key .= $record[$admin_importFeed["field_image_url"]];
    // $dupe_key .= $record[$admin_importFeed["field_buy_url"]];
    // $dupe_key .= $record[$admin_importFeed["field_price"]];
    // append a running counter instead, so that every record gets a unique hash
    global $dupe_counter;
    $dupe_counter++;
    $dupe_key .= $dupe_counter;
    $dupe_hash = md5($dupe_key);

That will remove the duplicate checking, which is the first step. Then have a look at merchant searches for your classified feed, as it may be necessary to create a special search case...
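To see why the counter defeats the duplicate check, here is a minimal stand-alone sketch. It is not part of the script: the merchant name, the $records array and the $by_name / $by_counter arrays are made up for illustration, and only the way the key is built mirrors the code above.

```php
<?php
// Hypothetical stand-ins for the values the import code works with.
$admin_importFeed = array("merchant" => "Example Classifieds");
$records = array("Red bicycle", "Red bicycle", "Blue sofa");

// Original approach: key = merchant + name, so identical names share a hash.
$by_name = array();
foreach ($records as $name) {
    $by_name[] = md5($admin_importFeed["merchant"] . $name);
}

// Modified approach: key = merchant + running counter, so no two records match.
$dupe_counter = 0;
$by_counter = array();
foreach ($records as $name) {
    $dupe_counter++;
    $by_counter[] = md5($admin_importFeed["merchant"] . $dupe_counter);
}

// Count the distinct hashes produced by each approach.
echo count(array_unique($by_name)), "\n";
echo count(array_unique($by_counter)), "\n";
```

With the name in the key, the two "Red bicycle" adverts produce the same hash and would be treated as one product; with the counter, each of the three records gets its own hash.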

Cheers,
David.