new "Drop Record If Not RegExp" filter

Submitted by RomanF on Mon, 2019-03-04 05:56

Hello,

I would like to drop all products that don't contain the string "Antikvariat.html" or "CD.html" in the product Buy URL. For example, a product with this URL {link saved} should be imported.

So I tried the filter "Drop Record If Not RegExp" with the text "/(Antikvariat.html|CD.html)/i", but the result of the import is always 0 products, even though I expect the product with {link saved} to be imported.

Can you please help? Am I doing something wrong in the regular expression?

Thank you

Roman

Submitted by support on Mon, 2019-03-04 09:28

Hello Roman and welcome to the forum!

Your RegExp is perfect. I just double-checked to make sure there wasn't an issue with the period, as an unescaped period matches any character. To match a literal period, it can be escaped using \ for example:

/(Antikvariat\.html|CD\.html)/i
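To illustrate the difference, here is a minimal PHP sketch using `preg_match`. The sample URLs are made up for the example and are not taken from any feed:

```php
<?php
// Hypothetical sample values, invented for illustration only.
$unescaped = '/(Antikvariat.html|CD.html)/i';
$escaped   = '/(Antikvariat\.html|CD\.html)/i';

$real    = 'http://www.example.com/kniha/Antikvariat.html';
$similar = 'http://www.example.com/kniha/AntikvariatXhtml';

// Both patterns match a genuine ".html" URL.
var_dump(preg_match($unescaped, $real));    // int(1)
var_dump(preg_match($escaped, $real));      // int(1)

// Only the unescaped pattern matches when another character
// stands in place of the period.
var_dump(preg_match($unescaped, $similar)); // int(1)
var_dump(preg_match($escaped, $similar));   // int(0)
```

Either way, a URL that really ends in Antikvariat.html or CD.html matches both versions, so the unescaped period alone would not explain a result of zero products.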

So I'm not sure why it would be causing all records to drop. If the feed is CSV, could you perhaps post a couple of example lines (I'll remove them before publishing your reply), one for a product that should be kept and one for a product that should be dropped, and I'll take a closer look for you...

Thanks,
David.
--
PriceTapestry.com

Submitted by RomanF on Mon, 2019-03-04 21:47

Hello David,

Thank you,
here is a link to the feed - {link saved}

You can see there are plenty of products with "Antikvariat.html" in it, for example:

{code saved}

but I still get 0 products imported. I have tried a lot of other filters and almost all of them either don't work or work strangely, so I am wondering whether something in my database settings could be wrong, or maybe something is wrong in the feed, like the encoding?

Thank you
Roman

Submitted by support on Tue, 2019-03-05 08:55

Hello Roman,

The filter has worked fine on my test server - without the filter, 20672 products imported, and then after adding the filter 10218 imported.

Please could you double check that the filter is applied to the Buy URL field and not accidentally associated with another field, e.g. Product Name, as that would result in zero products imported. If that all looks good, please could you go to Tools > Backup and Restore and, using the Backup form, uncheck everything except "feeds" and "filters" and then click Backup to download the backup XML. Open the file in your text editor and paste the contents into your reply (I'll remove them before publishing), and that will enable me to recreate exactly your configuration for that feed on my test server...

Cheers,
David.
--
PriceTapestry.com

Submitted by RomanF on Tue, 2019-03-05 17:00

Hello David,
I have checked that I am using Buy URL, so I am sending you the feed backup.
Thank you

{code saved}

Submitted by support on Tue, 2019-03-05 18:26

Hi,

I only got the `feeds` backup in that XML. Please can you try again, making sure both `feeds` and `filters` are selected - apologies for the inconvenience...

Thanks,
David.
--
PriceTapestry.com

Submitted by RomanF on Wed, 2019-03-06 10:56

Hi,

I am sorry, here it is:

{code saved}

Submitted by support on Wed, 2019-03-06 11:44

Thanks Roman,

That's strange - your configuration worked just fine on my test server! That would indicate a possible issue with the regular expression parser compiled into your PHP installation (although I have never come across this before!).
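One way to check this independently of the import is a small standalone script along these lines. This is only a sketch: `pcre_self_test` is a hypothetical helper name and the URL is a made-up sample, not part of Price Tapestry or the feed.

```php
<?php
// Diagnostic sketch: report whether PCRE is available and behaving.
// pcre_self_test is a hypothetical helper, not a Price Tapestry function.
function pcre_self_test($pattern, $subject)
{
  if (!function_exists('preg_match'))
  {
    return 'preg_match is not available in this PHP build';
  }
  // Suppress the warning PHP raises for a bad pattern so that the
  // failure can be reported here instead.
  $result = @preg_match($pattern, $subject);
  if ($result === false)
  {
    return 'preg_match failed, preg_last_error() = ' . preg_last_error();
  }
  return $result ? 'match' : 'no match';
}

// PCRE_VERSION is defined by PHP's PCRE extension when it is loaded.
echo 'PCRE version: ' . (defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown') . "\n";
echo pcre_self_test('/(Antikvariat\.html|CD\.html)/i', 'http://www.example.com/Antikvariat.html') . "\n";
```

On a healthy installation the second line of output should read `match`; any other result would point towards the PHP build rather than the filter configuration.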

To test this, I have made an alternative version using the normal string functions just like Drop Record If Not. If you could edit includes/filter.php and add the following new code to the end of the file, just before the closing PHP tag:

  /*************************************************/
  /* dropRecordIfNotMulti */
  /*************************************************/
  $filter_names["dropRecordIfNotMulti"] = "Drop Record If Not Multi";
  function filter_dropRecordIfNotMultiConfigure($filter_data)
  {
    widget_textBox("Drop record if field does not contain text(s) (comma separated)","text",FALSE,$filter_data["text"],"",6);
  }
  function filter_dropRecordIfNotMultiValidate($filter_data)
  {
  }
  function filter_dropRecordIfNotMultiExec($filter_data,$text)
  {
    global $filter_dropRecordFlag;
    // nothing to do if the record has already been flagged as dropped
    if($filter_dropRecordFlag)
    {
      return $text;
    }
    $needles = explode(",",$filter_data["text"]);
    $found = 0;
    foreach($needles as $needle)
    {
      // ignore surrounding spaces and empty entries (e.g. a trailing comma)
      $needle = trim($needle);
      if ($needle === "") continue;
      // case-insensitive substring test
      if (stristr($text,$needle) !== FALSE) $found++;
    }
    // drop the record if none of the texts was found
    if (!$found) $filter_dropRecordFlag = TRUE;
    return $text;
  }

Then delete the Drop Record If Not RegExp filter and add the new Drop Record If Not Multi filter to the Buy URL field. This version supports a comma-separated list of values to scan for, so in the text box on the configuration page for the filter use:

Antikvariat.html,CD.html

Save and re-import and fingers crossed that should give you the 10,218 required products...
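If you would like to try the matching on its own, outside of an import, the following is a rough standalone sketch along the lines of the explode / stristr approach, with trimming added as a precaution. `contains_any` is a hypothetical helper name and the URLs are made-up samples, not taken from the feed:

```php
<?php
// Hypothetical helper (not part of Price Tapestry): returns TRUE if
// $text contains any entry of the comma-separated $list,
// ignoring upper/lower case.
function contains_any($text, $list)
{
  foreach (explode(",", $list) as $needle)
  {
    // ignore surrounding spaces and empty entries
    $needle = trim($needle);
    if ($needle === "") continue;
    if (stristr($text, $needle) !== FALSE) return TRUE;
  }
  return FALSE;
}

// Made-up URLs for illustration only.
$list = "Antikvariat.html,CD.html";
var_dump(contains_any("http://www.example.com/kniha/Antikvariat.html", $list)); // bool(true)
var_dump(contains_any("http://www.example.com/hudba/cd.HTML", $list));          // bool(true)
var_dump(contains_any("http://www.example.com/kniha/Nova.html", $list));        // bool(false)
```

A record would be kept where the helper returns TRUE and dropped otherwise.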

Cheers,
David.
--
PriceTapestry.com

Submitted by RomanF on Wed, 2019-03-06 17:30

Hi David,

Unfortunately it didn't help, still 0 products.

Here is my backup:

{code saved}

Here is also my PHP info:

{code saved}

Thank you

Roman

Submitted by support on Thu, 2019-03-07 08:42

Thanks Roman,

I'll follow up by email.

Cheers,
David.
--
PriceTapestry.com