Extending Category Data using keywords

Submitted by ChrisNBC on Thu, 2016-11-10 15:31

Hi David,

The site I’m working on currently uses a category hierarchy but unfortunately some merchants’ data lacks enough granularity in the category field to let me map products to categories accurately. I’m already extracting some keywords into newly created DB fields, which I then append to the category field using a ‘text after filter’. This works OK, but I would like to extend the category hierarchy further. I could handle this the way I do now, but I would end up with loads of extra fields in my DB just to hold the keywords, plus a whole set of ‘text after filters’ to append the data.

I wondered if you could suggest any existing filters which would allow me to ‘scan and set’ using a regular expression but then append the matches to another field (without deleting the existing content of that field)? I searched the forum but couldn’t spot anything.

Thanks in advance.

Best regards
Chris

Submitted by support on Fri, 2016-11-11 09:22

Hi Chris,

Here's a simple filter, "Scan All and Append RegExp", that scans (by RegExp) the entire import record (i.e. every field) and, if there is a match, appends the match to the field to which the filter is attached:

  /*************************************************/
  /* scanAllAppRegExp */
  /*************************************************/
  $filter_names["scanAllAppRegExp"] = "Scan All and Append RegExp";
  function filter_scanAllAppRegExpConfigure($filter_data)
  {
    print "RegExp:<br />";
    print "<input type='text' name='regexp' value='".widget_safe($filter_data["regexp"])."' />";
    widget_errorGet("regexp");
  }
  function filter_scanAllAppRegExpValidate($filter_data)
  {
    if (!$filter_data["regexp"])
    {
      widget_errorSet("regexp","required field");
    }
  }
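  function filter_scanAllAppRegExpExec($filter_data,$text)
  {
    global $importRecord;
    // join every field of the current import record into one string to scan
    $scan = implode(" ",$importRecord);
    // @ suppresses the warning from an invalid RegExp; no match leaves $text unchanged
    if (@preg_match($filter_data["regexp"],$scan,$matches))
    {
      // append the full match, separated by a space
      $text .= " ".$matches[0];
    }
    return $text;
  }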
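
As a quick illustration of the Exec logic, here is a minimal stand-alone sketch that runs outside Price Tapestry. The sample record, its field names and the helper name scanAllAppend are made up for the example; only the implode / preg_match / append steps mirror the filter.

```php
<?php
// Hypothetical import record; field names and values are invented for illustration.
$importRecord = array(
  "NAME" => "Acme Trail Runner",
  "DESCRIPTION" => "Lightweight waterproof running shoe",
  "CATEGORY" => "Footwear"
);

// Same steps as filter_scanAllAppRegExpExec, inlined so the example runs on its own.
function scanAllAppend($regexp, $text, $record)
{
  // scan every field of the record as one string
  $scan = implode(" ", $record);
  if (@preg_match($regexp, $scan, $matches))
  {
    // append the first full match only
    $text .= " ".$matches[0];
  }
  return $text;
}

print scanAllAppend('/waterproof/i', "Footwear", $importRecord)."\n"; // Footwear waterproof
```

If the RegExp matches nothing in the record, the field value is returned unchanged.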

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by ChrisNBC on Fri, 2016-11-11 15:44

Hi David,

Thanks for the quick response and for the filter above which is great. I wondered if it would be at all possible to add a 'set' field so I could specify a 'capture group' in the same way as the "Scan and Set RegExp" filter works?

Thanks in advance.

Best regards
Chris

Submitted by support on Sat, 2016-11-12 12:14

Hi Chris,

It would be no problem to make a "Scan and Append RegExp" based on your existing "Scan and Set RegExp", and including a new "All" scan option - have a go with the following;

  /*************************************************/
  /* scanAppendRegExp */
  /*************************************************/
  $filter_names["scanAppendRegExp"] = "Scan and Append RegExp";
  function filter_scanAppendRegExpConfigure($filter_data)
  {
    print "Scan:<br />";
    print "<select name='scan'>";
    print "<option value='all' ".($filter_data["scan"]=="all"?"selected='selected'":"").">All</option>";
    print "<option value='self' ".($filter_data["scan"]=="self"?"selected='selected'":"").">Self</option>";
    print "<option value='name' ".($filter_data["scan"]=="name"?"selected='selected'":"").">Name</option>";
    print "<option value='namedesc' ".($filter_data["scan"]=="namedesc"?"selected='selected'":"").">Name and Description</option>";
    print "</select>";
    widget_errorGet("scan");
    print "<br />";
    print "RegExp:<br />";
    print "<input type='text' name='regexp' value='".widget_safe($filter_data["regexp"])."' />";
    widget_errorGet("regexp");
    print "<br />";
    print "Set:<br />";
    print "<input type='text' name='set' value='".widget_safe($filter_data["set"])."' />";
    widget_errorGet("set");
  }
  function filter_scanAppendRegExpValidate($filter_data)
  {
    if (!$filter_data["regexp"])
    {
      widget_errorSet("regexp","required field");
    }
    if (!$filter_data["set"])
    {
      widget_errorSet("set","required field");
    }
  }
  function filter_scanAppendRegExpExec($filter_data,$text)
  {
    global $admin_importFeed;
    global $filter_record;
    global $importRecord;
    // default to an empty scan string so $scan is always defined
    $scan = "";
    switch($filter_data["scan"])
    {
      case "all":
        $scan = implode(" ",$importRecord);
        break;
      case "self":
        $scan = $text;
        break;
      case "name":
        $scan = $filter_record[$admin_importFeed["field_name"]];
        break;
      case "namedesc":
        $scan = $filter_record[$admin_importFeed["field_name"]];
        $scan .= " ".$filter_record[$admin_importFeed["field_description"]];
        break;
    }
    if (@preg_match($filter_data["regexp"],$scan,$matches))
    {
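      // substitute $1, $2, ... in the "Set" template with the captured groups
      $retval1 = $filter_data["set"];
      $retval2 = $filter_data["set"];
      foreach($matches as $k => $match)
      {
        if (!$k) continue; // skip $matches[0] (the whole match)
        $search = "\$".$k;
        $replace = $matches[$k];
        $retval2 = str_replace($search,$replace,$retval2);
      }
      // only append if the substitution actually changed the template
      if ($retval2 != $retval1)
      {
        $text .= " ".$retval2;
      }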
    }
    return $text;
  }
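
To show how the "Set" value interacts with capture groups, here is a minimal stand-alone sketch of the substitution step. The helper name appendFromTemplate, the sample RegExp and the sample text are assumptions for illustration only; the loop itself follows the Exec function above.

```php
<?php
// Stand-alone version of the substitution step from filter_scanAppendRegExpExec.
// The function name and all sample values are hypothetical.
function appendFromTemplate($regexp, $set, $scan, $text)
{
  if (@preg_match($regexp, $scan, $matches))
  {
    $result = $set;
    foreach($matches as $k => $match)
    {
      if (!$k) continue; // skip $matches[0], the whole match
      // replace the placeholder $1, $2, ... with the captured text
      $result = str_replace("\$".$k, $match, $result);
    }
    // append only if a placeholder was substituted
    if ($result != $set)
    {
      $text .= " ".$result;
    }
  }
  return $text;
}

print appendFromTemplate('/(mens|womens) (boots|trainers)/i', '$1 $2', "Acme Womens Boots size 6", "Footwear")."\n";
// Footwear Womens Boots
```

Note that, as the code is written, a "Set" value containing no $1-style placeholder never changes during substitution, so nothing is appended even when the RegExp matches.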

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by ChrisNBC on Mon, 2016-11-14 22:44

Thanks David,

It works beautifully.

Best regards
Chris

Submitted by ChrisNBC on Mon, 2016-12-12 13:19

Hi David,

Hope all is going well.

I wondered if you might be able to suggest a way to use the above filter to append every (matched) word in the regex to the chosen field (with spaces between each word). I think it currently matches and appends only the first word found?

Thanks in advance.

Best regards
Chris

Submitted by support on Mon, 2016-12-12 14:29

Hi Chris,

Please could you give an example configuration of an instance of this filter where only the first match is being appended - there is a replace functionality involved so it might be that it needs to be changed to ignore the "set" field...

Thanks,
David.
--
PriceTapestry.com

Submitted by support on Tue, 2016-12-13 09:26

Hi Chris,

Please could you forward your latest includes/filter.php, as I think it will be easier to modify an alternative filter, or I'll write a new one - the process should be straightforward...
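
For reference, one possible shape for the "append every match" behaviour is sketched below, assuming PHP's preg_match_all is used in place of preg_match. The helper name appendAllMatches and the sample values are hypothetical, and this is not the filter that was eventually supplied; it ignores the "Set" field entirely and drops exact duplicate matches.

```php
<?php
// Hypothetical helper, not part of Price Tapestry: append every match, not just the first.
function appendAllMatches($regexp, $scan, $text)
{
  // preg_match_all returns the number of matches (0 if none, false on error)
  if (@preg_match_all($regexp, $scan, $matches))
  {
    // $matches[0] holds every full-pattern match in order of appearance;
    // array_unique drops exact (case-sensitive) repeats
    $words = array_unique($matches[0]);
    $text .= " ".implode(" ", $words);
  }
  return $text;
}

print appendAllMatches('/\b(waterproof|leather|vegan)\b/i', "Vegan leather boots, fully waterproof", "Footwear")."\n";
// Footwear Vegan leather waterproof
```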

Thanks,
David.
--
PriceTapestry.com