

Removal of Categories & Associated Items

Submitted by EIF Media on Wed, 2014-01-01 12:39

Hi David,

After importing several feeds to my site, which is supposed to be specialist music, film and gaming, my categories have become convoluted and completely irrelevant items have made their way in due to their inclusion in the xml feeds imported.

I have looked at the filtering and this doesn't seem to be covered, but is there a way to completely ignore categories (i.e. not import items from this category and therefore prevent items within that category from being displayed in search results)?

Thanks,
Dan

Submitted by support on Thu, 2014-01-02 09:38

Hello Dan,

To exclude a single category, use a Drop Record filter against the Category field. You can add it on a per-feed basis if you know which feed contains the category you don't want on your site, or as a Global Filter. In the text box on the filter's configuration page, enter the category name to exclude.

To exclude multiple categories, use the Drop Record RegExp filter instead. It is applied in the same way, but on the configuration page you can specify multiple categories to exclude in the following format:

(Category 1|Category 2|Category 3)

i.e. a pipe-separated list of categories to exclude, all enclosed in brackets - that's the "RegExp" or regular expression.
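
If it helps to check a pattern before saving the filter, here is a minimal sketch you can run on its own. It assumes the filter behaves like an unanchored preg_match() against the Category value, which is an assumption rather than something confirmed above; the category names and the wouldDrop() helper are made up for the example.

```php
<?php
// Hypothetical pattern in the pipe-separated format described above.
$pattern = "(Books|Toys|Garden)";

// Hypothetical helper. ASSUMPTION: the filter acts like an unanchored
// preg_match() on the category name. Names containing "/" would need escaping.
function wouldDrop($pattern, $category)
{
  return preg_match("/".$pattern."/", $category) === 1;
}

var_dump(wouldDrop($pattern, "Toys"));         // bool(true)
var_dump(wouldDrop($pattern, "Music"));        // bool(false)
var_dump(wouldDrop($pattern, "Garden Tools")); // bool(true) - substring match

// Anchoring with ^ and $ restricts the match to exact category names.
var_dump(wouldDrop("^(Books|Toys|Garden)$", "Garden Tools")); // bool(false)
```

If the match is unanchored as assumed, a short name in the list would also catch longer category names that contain it, so the anchored form may be the safer one to try.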

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by EIF Media on Sun, 2014-01-05 16:40

Thanks again David, slightly ashamed I hadn't thought of that one!

Can you also help me with including search terms? For example if someone searches using the term '&' I could also include 'and' and '+' in the results, or if someone searches 'blueray' the results would also include 'bluray', 'blu-ray' and 'blu ray'.

Thanks,
Dan

Submitted by support on Mon, 2014-01-06 09:10

Hi Dan,

Quite a few users do something along these lines. Create a new file includes/searchextra.php containing an array of keywords and the additional keywords to expand out to, for example:

<?php
  $searchextra["and"] = array("&","+"); // note that "&" itself is stripped by normalisation
  $searchextra["blueray"] = array("bluray","blu ray");
?>

Then in search.php look for the following code at line 90:

  $parts = explode(":",$q);

...and REPLACE with:

  require("includes/searchextra.php");
  $allWords = array();
  $words = explode(" ",$q);
  foreach($words as $word)
  {
    $allWords[$word] = 1;
    if (isset($searchextra[$word]))
    {
      foreach($searchextra[$word] as $newWord)
      {
        $allWords[$newWord] = 1;
      }
    }
  }
  $newQ = implode(" ",array_keys($allWords));
  $parts = explode(":",$newQ);
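
To see what the replacement produces, the same loop can be run on its own. This is only a sketch: expandQuery() is a hypothetical wrapper that is not part of search.php, and the $searchextra entry stands in for the contents of includes/searchextra.php.

```php
<?php
// Stand-in for includes/searchextra.php (example entry only).
$searchextra = array();
$searchextra["blueray"] = array("bluray","blu ray");

// Hypothetical helper wrapping the replacement loop shown above.
function expandQuery($q, $searchextra)
{
  $allWords = array();
  foreach(explode(" ",$q) as $word)
  {
    // Array keys are used so that each word appears only once.
    $allWords[$word] = 1;
    if (isset($searchextra[$word]))
    {
      foreach($searchextra[$word] as $newWord)
      {
        $allWords[$newWord] = 1;
      }
    }
  }
  return implode(" ",array_keys($allWords));
}

print expandQuery("blueray player",$searchextra)."\n"; // blueray bluray blu ray player
print expandQuery("Blueray player",$searchextra)."\n"; // Blueray player
```

The second call shows that the isset() lookup is case-sensitive, so a key only matches a query word written in exactly the same case. Whether search.php has already normalised the query's case by that point is not shown here.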

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by EIF Media on Tue, 2014-01-07 23:18

As always, thanks for your help David.

I have implemented both suggestions and re-imported my feeds. My categories list is now considerably shorter!

Unfortunately though the search results returned are still as before. I have followed the steps above with the following contents of searchextra:

<?php
  $searchextra["and"] = array("&","+"); // note that "&" itself is stripped by normalisation
  $searchextra["bluray"] = array("blu ray","blu-ray");
  $searchextra["LP"] = array("vinyl","record");
  $searchextra["cassette"] = array("tape");
  $searchextra["vhs"] = array("video");
?>

however a search for 'bluray' returns only 16 results when a search for blu-ray returns 6,244.

Can you also tell me if these work in reverse (will search results for 'blu-ray' include 'bluray' or will I need another $searchextra line)?

Regards,
Dan

Submitted by support on Wed, 2014-01-08 09:44

Hello Dan,

Ah - that will be because the basic search method is being invoked, which uses a logical "AND" by default, so in fact the modification needs to be implemented differently, once for each search method.
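
To illustrate the logical "AND" point, here is a toy model. It is not Price Tapestry's actual query code; the product names and the matchAll() helper are made up for the example. When every word in the query has to match, adding more words can only narrow the results:

```php
<?php
// Hypothetical product names.
$names = array("Alien Blu-ray","Alien bluray","Heat Blu Ray");

// Toy model of logical AND: a name matches only if it contains EVERY word.
function matchAll($q, $names)
{
  $results = array();
  foreach($names as $name)
  {
    $ok = TRUE;
    foreach(explode(" ",$q) as $word)
    {
      if (stripos($name,$word) === FALSE) $ok = FALSE;
    }
    if ($ok) $results[] = $name;
  }
  return $results;
}

print count(matchAll("bluray",$names))."\n";                 // 1
print count(matchAll("bluray blu ray blu-ray",$names))."\n"; // 0
```

The expanded query returns fewer matches than the original word alone, which is why simply appending the extra keywords to the query does not widen the results.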

If you firstly revert search.php by removing the above modification, and then look for the following code at line 4:

  require("includes/stopwords.php");

...and REPLACE with:

  require("includes/stopwords.php");
  require("includes/searchextra.php");

Then look for the following code at line 210:

  if (strlen($word) <= 3 || in_array(strtolower($word),$stopwords))

...and REPLACE with:

  if (isset($searchextra[$word]))
  {
    foreach($searchextra[$word] as $newWord)
    {
      $newWordSplits = explode(" ",$newWord);
      foreach($newWordSplits as $newWordSplit)
      {
        if (strlen($newWordSplit) <= 3 || in_array(strtolower($newWordSplit),$stopwords))
        {
          $config_useFullText = FALSE;
        }
      }
    }
  }
  if (strlen($word) <= 3 || in_array(strtolower($word),$stopwords))

Then for the FULLTEXT case, search for the following code at line 227:

  if (substr($word,-1)=="s")

...and REPLACE with:

  if (isset($searchextra[$word]))
  {
    foreach($searchextra[$word] as $newWord)
    {
      $newWords[] = $newWord;
    }
  }
  if (substr($word,-1)=="s")

And for the basic search method (any keyword