Bulk Unmapped Category Export/Import

Submitted by ChrisNBC on Thu, 2016-10-06 13:42

Hi David,

I have started mapping categories on my new site and have realised quite what a massive task it’s going to be, even using the reverse mapping tool (which is great, by the way).

A long time ago you supplied me with a couple of scripts, pmimport and prexport, which I used to export product mappings so I could edit or create mappings in Excel and then re-import them. I wondered if you might have an equivalent of these scripts for categories, which would allow all unmapped merchant product categories to be exported so they can be worked on externally and then re-imported.

I searched the forum but couldn’t spot anything.

Thanks in advance.

Best regards
Chris

Submitted by support on Thu, 2016-10-06 13:55

Hi Chris,

Bear with me and I will look into that for you - import / export to / from a hierarchy is somewhat more complicated than a flat structure, but in the meantime, if you create the following script saved as, say, chunmapped.php:

<?php
  require("includes/common.php");
  header("Content-Type: text/plain;charset=".$config_charset);
  $sql = "SELECT DISTINCT(category) FROM `".$config_databaseTablePrefix."products` WHERE categoryid='0' ORDER BY category";
  database_querySelect($sql,$rows);
  foreach($rows as $row)
  {
    print $row["category"]."\n";
  }
?>

...and then browse to chunmapped.php - this will dump all feed categories that are not currently mapped into the Category Hierarchy. Lots of users work with a version of this and I will consider including unmapped helpers in the next distribution that I am currently preparing for release this month...

Cheers,
David.
--
PriceTapestry.com

Submitted by ChrisNBC on Mon, 2016-10-10 09:41

Hi David,

Hope you had a good weekend.

Thanks for the above, which has made things much quicker. Would it be feasible to add a regex mapping facility for categories (similar to the one used in product mapping)? I realise this is probably quite a major addition, but thought it might be worth mentioning in case there is any opportunity to include it in future releases.

Additionally, I notice in the category hierarchy mapping function it's possible to filter categories using a "%" between words. I have some categories which look like:

gender sports football clothing shorts

I have tried adding “gender%shorts” to the ‘Alternatives’ mapping box (without leading “=”) but this does not seem to work. I wondered if you could tell me if there is a way to add an alternative using a wildcard. I’m sure this must have come up before but I searched the forum and didn’t spot anything.

Thanks in advance.

Best regards
Chris

Submitted by support on Mon, 2016-10-10 11:52

Hello Chris,

You could add a regexp match option, for example using a prefix of "?", so no prefix is a normal keyword match, an "=" prefix is an exact match, and a "?" prefix is a regexp match. To add this (for Category Hierarchy Mapping), edit includes/admin.php and look for the following code at line 435 (15/09A), or line 298, I think, in your back-ported version:

        if (substr($k,0,1) !== "=")

...and REPLACE with:

        if (substr($k,0,1) == "?")
        {
          $regexp = substr($k,1);
          if (preg_match($regexp,$importRecord["category"]))
          {
            $importRecord["categoryid"] = $v;
            break;
          }
        }
        elseif (substr($k,0,1) !== "=")

And then in the Alternatives box, in place of "gender%shorts", have a go with:

?/gender.*shorts/

(the "/" characters are the regexp delimiters; including them in the alternative gives maximum flexibility, such as the use of flags if required)
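
The three prefix rules described above can be tried outside Price Tapestry with a small standalone sketch. The function name alternativeMatches is hypothetical and is not part of the distribution, and the no-prefix branch is only assumed to be a case-insensitive substring test, since the thread does not show how the normal keyword match is implemented in includes/admin.php.

```php
<?php
// Hypothetical helper, not from the Price Tapestry code base: decides whether
// one "Alternatives" entry matches a feed category name.
function alternativeMatches(string $alternative, string $category): bool
{
    $prefix = substr($alternative, 0, 1);
    if ($prefix === "?") {
        // "?" prefix: the remainder is a full regexp, delimiters included.
        // @ hides the warning raised by a malformed pattern; preg_match()
        // then returns false, which the === 1 test treats as "no match".
        return @preg_match(substr($alternative, 1), $category) === 1;
    }
    if ($prefix === "=") {
        // "=" prefix: exact match on the whole category name.
        return substr($alternative, 1) === $category;
    }
    // No prefix: ASSUMED here to be a case-insensitive substring match.
    return $alternative !== "" && stripos($category, $alternative) !== false;
}

$category = "gender sports football clothing shorts";
var_dump(alternativeMatches("?/gender.*shorts/", $category)); // bool(true)
var_dump(alternativeMatches("gender%shorts", $category));     // bool(false)
var_dump(alternativeMatches("=shorts", $category));           // bool(false)
var_dump(alternativeMatches("shorts", $category));            // bool(true)
```

Under these assumptions "gender%shorts" fails because "%" is treated as a literal character rather than a wildcard, whereas the "?" form hands the pattern to preg_match() unchanged.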

Cheers,
David.
--
PriceTapestry.com

Submitted by ChrisNBC on Mon, 2016-10-10 13:47

Wow, thanks David. I really didn’t expect that, but that’s fantastic; it works a treat!

Best regards
Chris

Submitted by ChrisNBC on Tue, 2016-12-20 10:25

Hi David,

Hope all is going well.

I wondered if you might be able to tell me if there is a way to use negative lookaheads or lookbehinds in the regex hierarchy mapping mod above. For example, I have the categories:

Boys Football Shorts >>should map to Category1
Boys Shorts >>should map to Category2

I would like to exclude all products containing the word “Football” from the second category. I have used lookaheads and lookbehinds (both negative and positive) elsewhere in filters and they work really nicely but I’m struggling to get them to work in category mapping.

Thanks in advance.

Best regards
Chris

Submitted by support on Tue, 2016-12-20 13:34

Hi Chris,

Negative lookbehind seemed to work OK on my test server - I tried for Category 2:

/Boys.*(?<!Football )Shorts/

And then for Category 1,

/Boys.*Football.*Shorts/

If still no joy, let me know which RegExp you are using and I'll check it out further with you...
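
To see how the two expressions behave, they can be run against a few sample names in a standalone script. This is only a sketch: the sample category names are made up for illustration, and the third one is included to show that the lookbehind inspects only the text immediately before "Shorts".

```php
<?php
// Patterns quoted from the reply above.
$category1 = '/Boys.*Football.*Shorts/';
$category2 = '/Boys.*(?<!Football )Shorts/';

// Hypothetical feed category names used only for this illustration.
$samples = array(
    "Boys Football Shorts",
    "Boys Shorts",
    "Boys Football Training Shorts",
);

$results = array();
foreach ($samples as $name) {
    $results[$name] = array(
        "category1" => preg_match($category1, $name),
        "category2" => preg_match($category2, $name),
    );
    printf(
        "%-30s category1=%d category2=%d\n",
        $name,
        $results[$name]["category1"],
        $results[$name]["category2"]
    );
}
// Boys Football Shorts           category1=1 category2=0
// Boys Shorts                    category1=0 category2=1
// Boys Football Training Shorts  category1=1 category2=1
```

The last sample matches both expressions because "Training " rather than "Football " sits directly before "Shorts", so the result depends on the word order in the feed.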

Cheers,
David.
--
PriceTapestry.com

Submitted by ChrisNBC on Wed, 2017-01-11 12:23

Hi David,

Apologies for the delay in replying; I wanted to experiment with the regex before I responded. You are right, the regex above does work, but if the word sequence is different the negative word is missed. I wondered if you could suggest a way to use a negative word match that would find a word anywhere within the field. I thought a conditional statement might do it, but it doesn’t seem to. Alternatively, I noticed a mod (http://www.pricetapestry.com/node/6181) on the forum which adds a negative keyword box to product mapping, and I wondered if this could be applied to category mapping so that the regex is ignored if certain keywords are present. Do you think this would work?

Thanks in advance.

Best regards
Chris

Submitted by support on Thu, 2017-01-12 12:30

Hi Chris,

The ? handler (for regexp matching) could be modified to support a single negative keyword, so where you have, after the above replacement:

  if (substr($k,0,1) == "?")
  {
    $regexp = substr($k,1);
    if (preg_match($regexp,$importRecord["category"]))
    {
      $importRecord["categoryid"] = $v;
      break;
    }
  }
  elseif (substr($k,0,1) !== "=")

...have a go with:

  if (substr($k,0,1) == "?")
  {
    $parts = explode("&!",$k);
    if (isset($parts[1]))
    {
      if (strpos($importRecord["category"],$parts[1]) !== FALSE) continue;
    }
    $regexp = substr($parts[0],1);
    if (preg_match($regexp,$importRecord["category"]))
    {
      $importRecord["categoryid"] = $v;
      break;
    }
  }
  elseif (substr($k,0,1) !== "=")

And then as your alternative expression, start as before with the ? followed by the regexp, and then append &! (standing for "and not") followed by the negative keyword, for example:

?/Boys.*Shorts/&!Football
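
The "&!" handling can be checked in isolation with a standalone sketch. The function name regexpAlternativeMatches is hypothetical, it returns a boolean instead of using continue / break inside the import loop, and it adds a guard for an empty negative keyword that the snippet above does not have. The final lines show a regexp-only variant using a negative lookahead anchored at the start, offered as an assumption about what may suit the "anywhere within the field" case rather than as something tested against the mod.

```php
<?php
// Hypothetical standalone version of the "?" handler with "&!" support.
function regexpAlternativeMatches(string $alternative, string $category): bool
{
    // Split "?/regexp/&!keyword" into the regexp part and the negative keyword.
    $parts = explode("&!", $alternative);
    if (isset($parts[1]) && $parts[1] !== "") {
        // strpos() is case-sensitive, as in the snippet above.
        if (strpos($category, $parts[1]) !== false) {
            return false; // negative keyword present: skip this alternative
        }
    }
    $regexp = substr($parts[0], 1); // drop the leading "?"
    return preg_match($regexp, $category) === 1;
}

$alternative = "?/Boys.*Shorts/&!Football";
var_dump(regexpAlternativeMatches($alternative, "Boys Shorts"));          // bool(true)
var_dump(regexpAlternativeMatches($alternative, "Boys Football Shorts")); // bool(false)
var_dump(regexpAlternativeMatches($alternative, "Football Boys Shorts")); // bool(false)
var_dump(regexpAlternativeMatches($alternative, "Boys football Shorts")); // bool(true)

// Regexp-only variant: the lookahead rejects "Football" anywhere in the name.
$lookahead = '/^(?!.*Football).*Boys.*Shorts/';
var_dump(preg_match($lookahead, "Boys Shorts"));          // int(1)
var_dump(preg_match($lookahead, "Football Boys Shorts")); // int(0)
var_dump(preg_match($lookahead, "Boys Shorts Football")); // int(0)
```

Because the keyword test is case-sensitive, "football" in lower case does not block the match; the keyword has to be written exactly as it appears in the feed.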

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by vias100 on Sat, 2020-01-25 18:48

Hi,
This doesn't seem to work for me. I am not using English but Greek in the regex. Do I also have to include double quotes? Is it something to do with encoding?
For example, "/^ΑΝΔΡΑΣ/" for categories starting with the word man (ΑΝΔΡΑΣ)?
I am using the code:

  if (substr($k,0,1) == "?")
  {
    $parts = explode("&!",$k);
    if (isset($parts[1]))
    {
      if (strpos($importRecord["category"],$parts[1]) !== FALSE) continue;
    }
    $regexp = substr($parts[0],1);
    if (preg_match($regexp,$importRecord["category"]))
    {
      $importRecord["categoryid"] = $v;
      break;
    }
  }
  elseif (substr($k,0,1) !== "=")

on line 432 of the mentioned file.

Submitted by vias100 on Sat, 2020-01-25 19:23

Update:
?/FLEECE$/
This Alternatives rule works great in Hierarchy Categories.
So I guess it is something to do with Greek?

Submitted by support on Mon, 2020-01-27 09:03

Hi and welcome to the forum!

I think this will require the u (UTF-8) modifier in the REGEXP - have a go with:

/^ΑΝΔΡΑΣ/u
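
A short standalone sketch of what the u modifier changes, assuming the script file itself is saved as UTF-8; the sample category name is made up. It also shows one thing worth checking if the expression still does not match: with u, preg_match() returns false rather than 0 when the subject is not valid UTF-8, which would be the case if the feed were stored in a legacy Greek encoding such as ISO-8859-7. In the Alternatives box the expression would presumably still need the "?" prefix introduced by the mod above, for example ?/^ΑΝΔΡΑΣ/u with no double quotes.

```php
<?php
// Assumes this file is saved as UTF-8.
$category = "ΑΝΔΡΑΣ > ΡΟΥΧΑ"; // hypothetical feed category name

// With u, pattern and subject are both treated as UTF-8 text.
$anchored = preg_match('/^ΑΝΔΡΑΣ/u', $category);
var_dump($anchored); // int(1)

// Without u, "." matches one byte; each of these Greek letters is two bytes.
$bytes = preg_match('/^.{6}$/', "ΑΝΔΡΑΣ");
$chars = preg_match('/^.{6}$/u', "ΑΝΔΡΑΣ");
var_dump($bytes); // int(0)
var_dump($chars); // int(1)

// The same word as ISO-8859-7 bytes, which is not valid UTF-8.
$legacy = "\xC1\xCD\xC4\xD1\xC1\xD3";
$legacyResult = @preg_match('/^ΑΝΔΡΑΣ/u', $legacy);
$badUtf8 = (preg_last_error() === PREG_BAD_UTF8_ERROR);
var_dump($legacyResult); // bool(false)
var_dump($badUtf8);      // bool(true)
```

If the last case applies, the category data would need to be in UTF-8 before a u-modified expression can match it.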

Hope this helps!

Cheers,
David.
--
PriceTapestry.com