Scan and Set Filter

Submitted by support on Tue, 2013-07-16 09:04

Hi everyone,

Sometimes users wish to add custom fields to their sites in order to show specific details on the product page (e.g. colour, size etc.), but their feeds do not have that information separated out from the name or description of the product.

The following "Scan and Set" filter enables you to specify a comma separated list of values. Each value is checked for (case insensitively) in the name and description of a product, and the field is set to the first value found. Add the code below to your includes/filter.php file:

  /*************************************************/
  /* scanSet */
  /*************************************************/
  $filter_names["scanSet"] = "Scan and Set";
  function filter_scanSetConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      // trim first so that a list entered as "Red, Green" does not return " Green"
      $value = trim($value);
      if (stripos($nameDesc,$value)!==FALSE) return $value;
    }
  }

To use, first add the custom fields to your site as per the standard instructions in this thread.

Let's say for example that you have added a custom field `colour`. To populate `colour` with either Red, Green or Blue if any of those keywords exist in the product name or description, add a new Scan and Set filter to the Colour field, and in the text box on the configuration page for the filter, you would enter:

Red,Green,Blue
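
To make the matching behaviour concrete, here is a minimal standalone sketch of the same loop that can be run outside Price Tapestry. The function name `scanSetSketch` and the sample product text are made up for illustration; only `explode()`, `trim()` and `stripos()` are taken from the filter above, and the check for an empty value is an extra precaution added for the sketch.

```php
<?php
// Hypothetical helper (not part of Price Tapestry): mirrors the loop in
// filter_scanSetExec() so the matching rules can be tried in isolation.
function scanSetSketch($values, $nameDesc)
{
  foreach (explode(",", $values) as $value)
  {
    $value = trim($value);
    // stripos() is case insensitive and matches anywhere in the text;
    // empty entries (e.g. from a trailing comma) are skipped
    if ($value !== "" && stripos($nameDesc, $value) !== FALSE) return $value;
  }
  return NULL; // nothing matched, so the field would be left empty
}

// Sample text invented for the example
var_dump(scanSetSketch("Red,Green,Blue", "Ladies T-Shirt in dark green cotton")); // "Green"
var_dump(scanSetSketch("Red,Green,Blue", "Plain white mug"));                     // NULL
```

Note that the value returned is the spelling from the configured list ("Green"), not the spelling found in the feed ("green"), and that the first value in the list that matches wins.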

David
--
PriceTapestry.com

Submitted by stevewales20 on Tue, 2013-07-16 11:02

Excellent!

Something like this was next on my list. Thank you very much.

Cheers
Steve

Submitted by erv on Wed, 2013-07-17 12:32

thanks, really useful!

Submitted by stevewales20 on Tue, 2013-08-13 11:24

Hi David,

This seems to be working pretty well, but there are a few minor issues, mostly with the actual keywords. Is it possible to only match exact keywords? At the moment the material 'wood' is also matching 'hollywood'.

Just looks a little silly having a wooden hat lol.

Thanks,
Steve
:)

Submitted by support on Tue, 2013-08-13 11:47

Hi Steve,

The match can be made "whole word" only by using a regular expression instead of stripos(). Have a go with:

  /*************************************************/
  /* scanSet (whole word match) */
  /*************************************************/
  $filter_names["scanSet"] = "Scan and Set";
  function filter_scanSetConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
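      // trim and escape the value, and use word boundaries so that a keyword
      // at the very start or end of the text is also matched
      $value = trim($value);
      if (preg_match("/\b".preg_quote($value,"/")."\b/i",$nameDesc)) return $value;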
    }
  }
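
To see why a plain substring test picks up 'hollywood', here is a small standalone sketch comparing the two kinds of match. It is not the filter itself: the helper names and sample strings are invented, and the pattern treats a keyword as a whole word when it is not directly preceded or followed by a letter or digit, which is an assumption about what "whole word" should mean.

```php
<?php
// Hypothetical helpers for illustration only (not Price Tapestry functions)

// Substring match, as stripos() does: "wood" is found inside "Hollywood"
function matchesAnywhere($keyword, $text)
{
  return stripos($text, $keyword) !== FALSE;
}

// Whole word match: the keyword must not touch a letter or digit on either
// side. preg_quote() escapes regexp characters in the keyword, with "/"
// passed as the delimiter in use.
function matchesWholeWord($keyword, $text)
{
  $pattern = "/(?<![a-z0-9])".preg_quote($keyword, "/")."(?![a-z0-9])/i";
  return preg_match($pattern, $text) === 1;
}

var_dump(matchesAnywhere("wood", "Hollywood Star Baseball Cap"));  // bool(true)
var_dump(matchesWholeWord("wood", "Hollywood Star Baseball Cap")); // bool(false)
var_dump(matchesWholeWord("wood", "Wood effect picture frame"));   // bool(true)
```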

Cheers,
David.
--
PriceTapestry.com

Submitted by stevewales20 on Tue, 2013-08-13 13:28

Looks like it worked perfect!

Great stuff!

Thanks,
Steve.

Submitted by support on Tue, 2014-02-18 08:39

Hi everyone,

Here is a variation of this filter that scans the field itself (to which the filter has been applied) for any matches in the list and sets the field to that value:

  /*************************************************/
  /* scanSetField */
  /*************************************************/
  $filter_names["scanSetField"] = "Scan and Set Field";
  function filter_scanSetFieldConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetFieldValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetFieldExec($filter_data,$text)
  {
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      // trim first so that the value returned has no leading or trailing spaces
      $value = trim($value);
      if (stripos($text,$value)!==FALSE) return $value;
    }
  }

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 00:33

Hi David

I am just working through merchant categories and creating a category map.

What I would like to do is scan the subcategory field (from http://www.pricetapestry.com/node/4665) and if a certain value exists then replace the whole subcategory with a new keyword.

For example the Argos categories could be:
technologyipod and personal audioheadphones and earphones
technologyipod and personal audioipod accessories
technologyipod and personal audioipod skins cases and holders

If a category featured the substring "personal audio", as the above categories do, I would like to assign "Portable player" in the subcategory field.

This would save huge amounts of time on Argos categories, as their long descriptive category names are not very friendly for mapping.

I would appreciate your help on this please.

Many thanks

Regards

Richard

Submitted by support on Thu, 2014-02-20 08:53

Hi Richard,

The "preg Replace All If" filter from this comment will do the trick. After adding the code to your includes/filter.php, add a new "preg Replace All If" filter to your new `subcategory` field. In the preg Expression box, to match a single string you can simply enter it as normal:

personal audio

...and in the Replace box:

Portable player

Using a regular expression search means you can have multiple search terms for a single replacement value in the same filter. To specify multiple search terms, enter them in a pipe-separated list, all enclosed in brackets, taking care not to include any superfluous spaces, e.g.

(personal audio|mp3 player)

...and to make the match case insensitive, add delimiters and use the "i" flag, e.g.

/(personal audio|mp3 player)/i

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 12:39

Hi David

Many thanks.

I have spent the morning trying quite a few different scenarios and using other fields such as brand to see if I can get "preg Replace All If" to work.

So far I have had no success.

I have noticed this error when importing "Warning: preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash in"

Any thoughts?

Regards

Richard

Submitted by support on Thu, 2014-02-20 13:12

Hi Richard,

Could you have a go with the following version;

  /*************************************************/
  /* pregReplaceAllIf */
  /*************************************************/
  $filter_names["pregReplaceAllIf"] = "preg Replace All If";
  function filter_pregReplaceAllIfConfigure($filter_data)
  {
    print "preg Expression:<br />";
    print "<input type='text' name='search' value='".widget_safe($filter_data["search"])."' />";
    widget_errorGet("search");
    print "<br /><br />";
    print "Replace:<br />";
    print "<input type='text' name='replace' value='".widget_safe($filter_data["replace"])."' />";
    widget_errorGet("replace");
  }
  function filter_pregReplaceAllIfValidate($filter_data)
  {
    if (!$filter_data["search"])
    {
      widget_errorSet("search","required field");
    }
  }
  function filter_pregReplaceAllIfExec($filter_data,$text)
  {
    if (preg_match("/".preg_quote($filter_data["search"])."/i",$text))
    {
      return $filter_data["replace"];
    }
  }

...using a basic match e.g. simply

personal audio

...if still no joy, I'll create an alternative simple csv / explode() version...

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 13:43

Hi David

Unfortunately the amended version did not work either.

Regards

Richard

Submitted by support on Thu, 2014-02-20 14:06

Hi Richard,

I've just set up a version on my test server that accepts either a simple search string (without having to worry about regexp characters) or a full regular expression - here's the code:

  /*************************************************/
  /* pregReplaceAllIf */
  /*************************************************/
  $filter_names["pregReplaceAllIf"] = "preg Replace All If";
  function filter_pregReplaceAllIfConfigure($filter_data)
  {
    print "preg Expression:<br />";
    print "<input type='text' name='search' value='".widget_safe($filter_data["search"])."' />";
    widget_errorGet("search");
    print "<br /><br />";
    print "Replace:<br />";
    print "<input type='text' name='replace' value='".widget_safe($filter_data["replace"])."' />";
    widget_errorGet("replace");
  }
  function filter_pregReplaceAllIfValidate($filter_data)
  {
    if (!$filter_data["search"])
    {
      widget_errorSet("search","required field");
    }
  }
  function filter_pregReplaceAllIfExec($filter_data,$text)
  {
    if (substr($filter_data["search"],0,1)=="/")
    {
      $match = $filter_data["search"];
    }
    else
    {
      // pass the delimiter to preg_quote() so that any "/" in the search text is escaped too
      $match = "/".preg_quote($filter_data["search"],"/")."/i";
    }
    if (preg_match($match,$text))
    {
      return $filter_data["replace"];
    }
    else
    {
      return $text;
    }
  }

With this in place, try the basic example of

portable audio

...and in the Replace box:

Portable Player

Where the first character of the match expression is _not_ "/", the filter will create a case insensitive regexp for you, with all special characters escaped. To match a range of values, use, as above:

/(portable audio|mp3 player)/i
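
As a rough standalone illustration of that rule, the sketch below builds the pattern the same way and tries it against one of the Argos category names quoted earlier in the thread. The helper name `buildPattern` is invented; the logic mirrors the `substr()` check in the filter above.

```php
<?php
// Hypothetical helper mirroring the delimiter check in the filter above
function buildPattern($search)
{
  // Already a full regexp (starts with the "/" delimiter): use as entered
  if (substr($search, 0, 1) == "/") return $search;
  // Otherwise escape special characters and make it case insensitive
  return "/".preg_quote($search, "/")."/i";
}

$category = "technologyipod and personal audioheadphones and earphones";

echo buildPattern("Personal Audio"), "\n";                                        // /Personal Audio/i
var_dump(preg_match(buildPattern("Personal Audio"), $category));                  // int(1)
var_dump(preg_match(buildPattern("/(personal audio|mp3 player)/i"), $category));  // int(1)
var_dump(preg_match(buildPattern("(personal audio|mp3 player)"), $category));     // int(0)
```

The last call returns 0 because, without the leading "/", the brackets and pipe are escaped and treated as literal characters. It is also worth knowing that a bare string such as `personal audio` passed straight to `preg_match()` has no valid delimiter, which is what produces a "Delimiter must not be alphanumeric" warning of the kind quoted earlier in this thread.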

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 14:32

Hi David

I have tried replacing both "personal audio" and "personal" with "Portable Player" without quotes, and in both cases no records were replaced for argos feed.

Still no luck

Regards

Richard

Submitted by support on Thu, 2014-02-20 14:38

Hi Richard,

Ah, I see what's happened - as part of the index improvements the category field is actually limited to 32 characters, and this is truncating the lengthy Argos category names, meaning the match isn't found because the field has been cropped.

It is in fact safe to go up to 100-150 characters, so the following dbmod.php script will increase the size of the category field to 100 characters, which should be all that is required:

<?php
  require("includes/common.php");
  $sql = "ALTER TABLE `".$config_databaseTablePrefix."products`
            CHANGE `category` `category` VARCHAR(100) NOT NULL default ''";
  database_queryModify($sql,$result);
  print "Done.";
?>

(upload and run once from top level Price Tapestry installation folder)

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 14:42

Hi David

I had already increased the category field to 100 characters and subcategory to 255 characters, as yes, the longer descriptions were truncated on import.

So that is not the issue :(

Regards

Richard

Submitted by support on Thu, 2014-02-20 15:23

Hi Richard,

I have come across preg functions behaving peculiarly on some servers before, so here is a basic string comparison Search and Replace All filter with the option to specify multiple search values as a comma separated string:

  /*************************************************/
  /* searchReplaceAll */
  /*************************************************/
  $filter_names["searchReplaceAll"] = "Search and Replace All";
  function filter_searchReplaceAllConfigure($filter_data)
  {
    print "Search:<br />";
    print "<input type='text' name='search' value='".widget_safe($filter_data["search"])."' />";
    widget_errorGet("search");
    print "<br /><br />";
    print "Replace:<br />";
    print "<input type='text' name='replace' value='".widget_safe($filter_data["replace"])."' />";
    widget_errorGet("replace");
  }
  function filter_searchReplaceAllValidate($filter_data)
  {
    if (!$filter_data["search"])
    {
      widget_errorSet("search","required field");
    }
  }
  function filter_searchReplaceAllExec($filter_data,$text)
  {
    $words = explode(",",$filter_data["search"]);
    foreach($words as $word)
    {
      $word = trim($word);
      if (stripos($text,$word)!==FALSE)
      {
        return $filter_data["replace"];
      }
    }
    return $text;
  }

With this filter applied, have a go with a Search:

Portable Audio,MP3 Players

Replace:

Portable Players
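
As a quick standalone check of how the comparison behaves, here is a minimal sketch of the same loop applied to one of the Argos category names posted earlier in the thread. The function name `searchReplaceAllSketch` is invented for the example, and the empty-value check is an extra precaution not present in the filter.

```php
<?php
// Hypothetical standalone version of the comparison loop, for illustration
function searchReplaceAllSketch($search, $replace, $text)
{
  foreach (explode(",", $search) as $word)
  {
    $word = trim($word);
    // case insensitive substring test; the whole field is replaced on a match
    if ($word !== "" && stripos($text, $word) !== FALSE) return $replace;
  }
  return $text; // no match: leave the field unchanged
}

$category = "technologyipod and personal audioipod accessories";

// "personal audio" appears in the category, so the whole field is replaced
echo searchReplaceAllSketch("personal audio,mp3 player", "Portable player", $category), "\n";
// neither term appears in this category, so it is returned unchanged
echo searchReplaceAllSketch("Portable Audio,MP3 Players", "Portable Players", $category), "\n";
```

The search terms need to appear in the category text exactly as entered (apart from case), so it is worth copying them from the feed rather than typing them from memory.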

Hope this helps! If still no joy, I'll be more than happy to check things out on your server with you - I know you've been using the script for several years, so just drop me an email so that I've got your up to date email address and we can take it from there...

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2014-02-20 22:16

Hi David

I have gone and taken a sanity check and revisited issue tonight!

I have deleted a number of filters on data and have rechecked process. I have established that I can now alter fields using your code provided today at 2014-02-20 13:12. However, whenever I try to change data in the subcategory field that I created I cannot effect any changes. I have tried simple single strings to avoid silly errors (no guarantees though!)

I will take another look in the morning, but if you have any inspiration it would be welcome, as I am sure it is me having a brain block.

Please accept my apologies and thank you for your time

Regards

Richard

Submitted by support on Fri, 2014-02-21 08:45

Hi Richard,

I'll follow up by email with you and check it out on your server!

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Wed, 2015-01-07 23:45

Hi David

Happy New Year!

I'm looking forward to seeing your new release this month!

I am trying to modify this filter so that if the keyword is not found in either name or description then the content of the custom field is not overwritten. Currently the custom field is set to null if no match is made.

I would appreciate your assistance as my attempts have not been successful.

/*************************************************/
  /* scan name description and set field to value*/
  /*************************************************/
  $filter_names["scanSet"] = "Scan name and description and amend new field";
  function filter_scanSetConfigureNewField($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidateNewField($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExecNewField($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      if (stripos($nameDesc,trim($value))!==FALSE) return $value;
    }
  }

Regards

Richard

Submitted by support on Thu, 2015-01-08 09:02

Hello Richard,

A happy new year to you too!

It's just a case of adding a default return clause of the incoming $text value - have a go with:

/*************************************************/
  /* scan name description and set field to value*/
  /*************************************************/
  $filter_names["scanSet"] = "Scan name and description and amend new field";
  function filter_scanSetConfigureNewField($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidateNewField($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExecNewField($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      if (stripos($nameDesc,trim($value))!==FALSE) return $value;
    }
    return $text;
  }
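
One thing worth checking if the field is still being cleared: every other filter in this thread names its three functions `filter_<key>Configure`, `filter_<key>Validate` and `filter_<key>Exec`, where `<key>` is the index used in `$filter_names`. The sketch below assumes (and this is an assumption, not something confirmed in this thread) that the function to call is built from the key in that way. Under that assumption a function called `filter_scanSetExecNewField` would never be called for the key `scanSet`. The key `scanSetNewField` is a hypothetical choice for the example.

```php
<?php
// ASSUMPTION: the function to call is derived from the filter key as
// "filter_" . $key . "Exec", following the naming used throughout this thread.
function execFunctionName($key)
{
  return "filter_".$key."Exec";
}

// Name as posted above: suffix placed after "Exec"
function filter_scanSetExecNewField($filter_data, $text) { return $text; }

// Name following the convention, for a hypothetical key "scanSetNewField"
function filter_scanSetNewFieldExec($filter_data, $text) { return $text; }

var_dump(function_exists(execFunctionName("scanSet")));         // bool(false)
var_dump(function_exists(execFunctionName("scanSetNewField"))); // bool(true)
```

In a real includes/filter.php that still contains the original Scan and Set filter, `filter_scanSetExec` would exist, but it would be the original version without the default `return $text;`, which would be consistent with the field still being set to empty.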

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Thu, 2015-01-08 09:19

Hi David

Many thanks. I was including { } causing it to fail.

Regards

Richard

Submitted by richard on Fri, 2015-01-09 22:30

Hi David

Sorry for the delay in checking.

Unfortunately, if the name or description fields do not match the keywords, the original content of the custom field is still being overwritten with a null value.

I have retried using the else command without success.

Do you have any ideas what I am doing wrong?

Regards

Richard

Submitted by support on Sun, 2015-01-11 12:17

Hi Richard,

Sorry about that - please could you confirm quickly that the problem persists with the above modification in place, and if so, if you could email me your modified includes/filter.php I'll check it all out for you...

Thanks,
David.
--
PriceTapestry.com

Submitted by richard on Sun, 2015-01-11 21:16

Hi David

Yes, the problem still persists. Just sending an email.

Many thanks

Richard

Submitted by stevebi on Thu, 2016-02-18 09:00

Hello David,

May I ask what modification I should make in order to scan merchant categories as well as product name and product description?

Cheers

Steve

Submitted by support on Thu, 2016-02-18 09:41

Hi Steve,

The text to scan is constructed in the original code by these lines:

    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];

...so to include merchant (feed) categories in the scan, REPLACE with:

    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_category"]];

Cheers,
David.
--
PriceTapestry.com

Submitted by stevebi on Thu, 2016-02-18 12:34

Thank you very much David!

Submitted by richard on Fri, 2016-07-01 10:40

Hi David,

Previously I would use the scan set filter to extract the colour with 98% accuracy.

I see that some merchants do provide colour as a separate field now. So I would like to use merchant feeds to populate colour where possible and then run a global filter that scans name/description & sets colour if colour field is empty, ie. not overwriting any content populated via a feed.

I would appreciate your help again!

Best regards,

Richard

Submitted by support on Fri, 2016-07-01 11:06

Hello Richard,

A version can be made that will only apply the scan / set process if the field is empty. For example, here is a version of the original filter from the first post in this thread modified in this way - Scan and Set (If Empty):

  /*************************************************/
  /* scanSetIfEmpty */
  /*************************************************/
  $filter_names["scanSetIfEmpty"] = "Scan and Set (If Empty)";
  function filter_scanSetIfEmptyConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetIfEmptyValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetIfEmptyExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    if ($text) return $text;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      // trim first so that the value returned has no leading or trailing spaces
      $value = trim($value);
      if (stripos($nameDesc,$value)!==FALSE) return $value;
    }
  }

If you have a slightly modified version that you want to change to work in this way, it's just a case of inserting this line:

    if ($text) return $text;

...immediately after the global declarations in the ...Exec() function.
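
Here is a minimal standalone sketch of the effect of that guard. The function name `scanSetIfEmptySketch` and the sample values are invented for the example, and the name / description text is passed in as a parameter rather than read from the global feed record.

```php
<?php
// Hypothetical standalone sketch of the "only if empty" guard
function scanSetIfEmptySketch($values, $nameDesc, $text)
{
  if ($text) return $text; // field already populated by the feed: keep it
  foreach (explode(",", $values) as $value)
  {
    $value = trim($value);
    if ($value !== "" && stripos($nameDesc, $value) !== FALSE) return $value;
  }
  return $text; // nothing found: the field stays empty
}

// Colour supplied by the feed is kept, even though "Blue" is in the name
var_dump(scanSetIfEmptySketch("Red,Green,Blue", "Blue denim jacket", "Navy")); // "Navy"
// Empty colour field is populated from the name
var_dump(scanSetIfEmptySketch("Red,Green,Blue", "Blue denim jacket", ""));     // "Blue"
```

Bear in mind that `if ($text)` treats the string "0" as empty in PHP, which is unlikely to matter for a colour field but could for numeric fields.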

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by richard on Fri, 2016-07-01 11:47

Great, many thanks.

Submitted by smartprice24 on Fri, 2017-10-13 22:54

Hi David.

The filter:

"scan name description and set field to value"

does not import values that have symbols inside them, such as ( ) - _ . ! ? % etc.

Example description:

13-u001nl,i3-6100U,64-bit,Intel Core i3-6xxx 33.782 cm (13.3 "),(1920 x 1080),Intel Core i3-6100U, 4GB (DDR4), 500GB (HDD),Windows 10 Home (64 bit),802.11 a/b/g/n/ac,Bluetooth 4.2, 1 x USB 2.0, 2 x USB 3.0, 1.66 kg,

Is it possible to fix this?

Thank you
Giuseppe

Submitted by support on Sat, 2017-10-14 09:14

Hello Giuseppe,

Please could you post the filter code you are using as there are quite a few iterations above and also if you could confirm what you are using as the Values field in the filter configuration against the example description you posted I'll check it out further for you...

Thanks,
David.
--
PriceTapestry.com

Submitted by smartprice24 on Tue, 2017-10-17 12:40

Thanks David!

I solved using

  /*************************************************/
  /* scanSet (whole word match) */
  /*************************************************/
  $filter_names["scanSet"] = "Scan and Set";
  function filter_scanSetConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      if (preg_match("/\s".$value."\s/i",$nameDesc)) return $value;
    }
  }

I found that when adding filters at the end of filter.php, some of them disappear, as if there is a limit to the number of filters that can be added.

Many Thanks
Giuseppe

Submitted by smartprice24 on Fri, 2017-10-20 12:37

Hi David.

I use this filter.

Unfortunately, if you scan for 800x600,1024x768 etc., it does not import the values 800 x 600, 1024 x 768.

Is it possible to edit the filter so that it reads and imports the values even if they are separated by one or more spaces, ideally only spaces inside the set value?

  /*************************************************/
  /* scanSet */
  /*************************************************/
  $filter_names["scanSet"] = "Scan and Set";
  function filter_scanSetConfigure($filter_data)
  {
    print "Values:<br />";
    print "<input type='text' name='values' value='".widget_safe($filter_data["values"])."' />";
    widget_errorGet("values");
  }
  function filter_scanSetValidate($filter_data)
  {
    if (!$filter_data["values"])
    {
      widget_errorSet("values","required field");
    }
  }
  function filter_scanSetExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    $nameDesc = $filter_record[$admin_importFeed["field_name"]];
    $nameDesc .= " ".$filter_record[$admin_importFeed["field_description"]];
    $values = explode(",",$filter_data["values"]);
    foreach($values as $value)
    {
      if (stripos($nameDesc,trim($value))!==FALSE) return $value;
    }
  }

Every tip is appreciated. Thank you.
Giuseppe

Submitted by support on Fri, 2017-10-20 14:03

Hello Giuseppe,

What I would suggest would be to put a Search and Replace filter on the fields that you are scanning (name and / or description) to search for " x " and replace with "x" - that will ensure the values are consistent before the Scan and Set...
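
To show the idea outside the script, here is a small standalone sketch. The sample description is invented, `str_replace()` stands in for the Search and Replace filter, and the `preg_replace()` line is an optional stricter alternative (my assumption, not an existing filter) that only closes up the spaces when there is a digit on both sides of the "x".

```php
<?php
// Sample description invented for the example
$description = "15 inch monitor, native resolution 1024 x 768, VGA input";

// Equivalent of a Search and Replace filter: search " x ", replace "x"
$normalised = str_replace(" x ", "x", $description);
echo $normalised, "\n"; // 15 inch monitor, native resolution 1024x768, VGA input

var_dump(stripos($description, "1024x768") !== FALSE); // bool(false)
var_dump(stripos($normalised, "1024x768") !== FALSE);  // bool(true)

// Stricter alternative: only join when a digit sits on both sides of the "x",
// so that text such as "2 x USB 3.0" is left alone
$strict = preg_replace('/(\d)\s*x\s*(\d)/i', '$1x$2', "1024 x 768, 2 x USB 3.0");
echo $strict, "\n"; // 1024x768, 2 x USB 3.0
```

With the plain " x " replacement, a description containing "2 x USB 3.0" would become "2xUSB 3.0", which may or may not matter depending on what else is being scanned for.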

Hope this helps!

Cheers,
David.
--
PriceTapestry.com