Search: plural vs singular

Submitted by webman on Mon, 2006-10-30 13:17

Hi Guys,

Well, what can I say about PT that hasn't already been said! It's brilliant, very easy to configure and customise once you get over the 'new script' syndrome! After only a few days and some super efficient assistance from David I now have a price comparison script that is fully customised and ready to deploy! Wahoooo! :)

Just checking David's test site with singular and plural search words, and each gives a completely different set of results, e.g.:

Product search results for stove (showing 1 to 10 of 17)

Product search results for stoves (showing 1 to 10 of 469)

Generally, to my way of thinking (and I could be way off the planet), if I were looking for a stove and didn't know a brand or model number, I would enter "stoves" to compare prices and features. This gives a lot of results on the test site but omits the singular products. Just a rambling thought: is there an easy way of matching up singular and plural search terms?

Would be interesting to hear about any experiences from web site owners on the singular vs plural visitor searches.

Submitted by support on Mon, 2006-10-30 13:35

Hi,

Thank you for your comments!

The issue of singular vs plural search has come up before, and some users have modified search.php to take account of it. The trick is to scan the search words for any word ending in "s", and then add the singular version of the word to the query. In search.php, look for:

        if (strlen($parts[0]) > 3)
        {

This is the default search case where the query length is greater than 3 characters and therefore using the full text index. To implement stemming, add the following code after the opening curly brace:

          $words = explode(" ",$parts[0]);
          $newWords = array();
          foreach($words as $word)
          {
            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
          }
          $allWords = array_merge($words,$newWords);
          $parts[0] = implode(" ",$allWords);
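
For anyone who wants to see the effect before touching search.php, here is a minimal standalone sketch of the same logic. The function name expandPlurals() is made up for this example and does not exist in Price Tapestry; it simply wraps the loop above so it can be run on its own.

```php
<?php
// Standalone sketch of the mod above. expandPlurals() is a hypothetical
// name used for illustration only; it is not part of search.php.
function expandPlurals(string $query): string
{
    $words = explode(" ", $query);
    $newWords = array();
    foreach ($words as $word) {
        // Case-sensitive check for a trailing "s", as in the snippet above
        if (substr($word, -1) == "s") {
            $newWords[] = substr($word, 0, -1);
        }
    }
    // Original words first, then the singular forms that were added
    return implode(" ", array_merge($words, $newWords));
}

echo expandPlurals("stoves"), "\n";     // stoves stove
echo expandPlurals("gas stoves"), "\n"; // gas stoves ga stove
```

The second call shows a side effect worth knowing about: any word ending in "s" is shortened, so "gas" also contributes "ga" to the query.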

Hope this helps!
David.

Submitted by webman on Mon, 2006-10-30 15:36

Thanks David,

Works like a charm! :)

Submitted by pratikanay on Thu, 2007-07-12 17:18

Hi,

I used this code:

$words = explode(" ",$parts[0]);
$newWords = array();
foreach($words as $word)
{
if (substr($word,-1)=="s")
{
$newWords[] = substr($word,0,-1);
}
}

$allWords = array_merge($words,$newWords);
$parts[0] = implode(" ",$allWords);

and I get no results when I type "pantalons". Before this change I got results.

After adding the code above, my query looks like this:

SELECT *, MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH name AGAINST ('+pantalons +pantalon' IN BOOLEAN MODE) AS relevance FROM `products` WHERE MATCH name AGAINST ('+pantalons +pantalon' IN BOOLEAN MODE) GROUP BY name ORDER BY minPrice ASC LIMIT 0,20

Another major problem following this change: “adidas” returns no results; it only works with “ADIDAS”.

Same thing with “airness”: “airness” and “Airness” return no results, “AIRNESS” returns results.

Has the search become case sensitive?

Can you give me a suggestion?

Regards,
Pratik Patel

Submitted by support on Thu, 2007-07-12 17:31

Hello Pratik,

You cannot combine the plural search mod with the "AND" search mod. The reason you do not get any results is because you are now searching for "PANTALONS AND PANTALON". For this to work, you need to remove the BOOLEAN mode modifications, and you will then get results for "PANTALONS OR PANTALON".
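
To illustrate the difference, here is a small sketch of the two match strings. In MySQL boolean full text mode a leading "+" marks a word that must be present in every row returned, while a word with no operator is optional. The variable names are made up for the example; $terms stands in for $parts[0] after the plural mod has run.

```php
<?php
// $terms stands in for $parts[0] once the plural mod has added "pantalon".
$terms = "pantalons pantalon";

// With the "AND" mod every word gets a "+" prefix, so both forms are required.
$andMatch = "+" . str_replace(" ", " +", $terms);

// Without the "+" operators each word is optional, so either form can match.
$orMatch = $terms;

echo $andMatch, "\n"; // +pantalons +pantalon
echo $orMatch, "\n";  // pantalons pantalon
```

A product name containing only "pantalons" can never satisfy the first string, which is why the combined mods return nothing.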

Cheers,
David.

Submitted by italiano28 on Sun, 2008-02-10 17:08

Hello David, I want the same thing. The problem is that in Italian the plural just changes the last letter of the word, without adding another, for example: Pantalone (singular), Pantaloni (plural), or Casa (singular), Case (plural). The two plural endings are "E" and "I".
Can you help me with this? Thank you in advance.
Cheers,
Stefano

Submitted by support on Sun, 2008-02-10 19:25

Hi Stefano,

This version of the code above will look for any words ending in "e" and add the same word but ending in "i"...

          $words = explode(" ",$parts[0]);
          $newWords = array();
          foreach($words as $word)
          {
            if ((substr($word,-1)=="e") || (substr($word,-1)=="E"))
            {
              $newWords[] = substr($word,0,-1)."i";
            }
          }
          $allWords = array_merge($words,$newWords);
          $parts[0] = implode(" ",$allWords);

Cheers!
David.

Submitted by italiano28 on Sun, 2008-02-10 21:44

Thank you, works perfectly.

Submitted by clickspace on Tue, 2008-02-12 12:45

Hi David,

2 questions if I may:

1) I was looking at the various forum threads on refining search results. I am just getting to grips with the script and testing results on my development site and I noticed that searching for example:

digital compact camera

returns every matching result with any of those words. However, the default sorting option of relevance means that the less relevant results are 6-7 pages deep which is good. It's not an issue so long as the user doesn't change the sorting from Low to High Price for example, then it's not so good. First few pages aren't really relevant.

Q: Would you recommend anything here or should I leave this as it is? Is it possible to keep relevancy even after sorting by price? I don't understand the complications from a coding point of view so maybe this simply isn't possible.

I've also read here that tinkering with the search results would introduce some other problems that would come with changing the free text query.

2) I was reading this http://www.pricetapestry.com/node/617 (Search: plural vs singular) and wanted to do the same thing because in my example, digital compact camera returns different results than digital compact cameras. I couldn't find that specific line of code you mentioned i.e.

In search.php, look for:

if (strlen($parts[0]) > 3)
{

I'm assuming the search.php code has changed since your original post. Is it still possible to add this code somewhere?

Thanks,
Steven

Submitted by support on Tue, 2008-02-12 12:50

Hi Steven,

Yes - it changed slightly recently, the line you are looking for is now:

if ($useFullText)

The modification was to make sure that the full text SQL was only used if all words are > 3 characters; whereas before it was based entirely upon the length of the query, which means that "LCD TV" would use the full text query but never return any results under most normal MySQL configurations.
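
As a rough illustration of that check (not the actual code from the distribution), the sketch below returns false as soon as one word is too short. The helper name canUseFullText() and the minimum length of 4 are assumptions for the example; 4 is the usual MySQL default for ft_min_word_len.

```php
<?php
// Hypothetical helper for illustration; in search.php the check is inline.
// $minLength = 4 assumes the default MySQL ft_min_word_len setting.
function canUseFullText(string $query, int $minLength = 4): bool
{
    foreach (explode(" ", $query) as $word) {
        if (strlen($word) < $minLength) {
            // One short word is enough to fall back to the basic search
            return false;
        }
    }
    return true;
}

var_dump(canUseFullText("LCD TV"));                 // bool(false)
var_dump(canUseFullText("digital compact camera")); // bool(true)
```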

I see what you mean about losing relevance if sorting by price; I'm not sure if anything can really be done about that but I'll take a look and see if there are any options...

Cheers,
David.

Submitted by clickspace on Tue, 2008-02-12 13:08

Thanks David.

Submitted by support on Tue, 2008-02-12 13:13

Hi Steven,

It just looks like you've inserted the code before the { - my mistake, I didn't make it very clear above...

Instead of:

if ($useFullText)
HERE

...it should be:

if ($useFullText)
{
  HERE

If you just move the { bracket back up to after the if statement it should work out fine. If you're still in doubt feel free to email me your modified file and I'll check it out for you.

Cheers,
David.

Submitted by clickspace on Tue, 2008-02-12 13:31

Perfect thanks! I was thinking that the { bracket was part of the SQL query i.e.

{
$sql = "SELECT * , MIN( price )

and that the code had to be outside this.

Full code for anyone else doing this is:

// modified for plural and singular

if ($useFullText)
{

$words = explode(" ",$parts[0]);
$newWords = array();
foreach($words as $word)
{
if (substr($word,-1)=="s")
{
$newWords[] = substr($word,0,-1);
}
}
$allWords = array_merge($words,$newWords);
$parts[0] = implode(" ",$allWords);

// end modified

$sql = "SELECT * , MIN( price ) AS minPrice,

etc.

Cheers again,
Steven

Submitted by AD_Mega on Wed, 2008-02-13 16:17

I have something different in my search.php. This is what I have:

default:
$useFullText = FALSE;
$words = explode(" ",$parts[0]);
foreach($words as $word)
{
if (strlen($word) >= 4)
{
$useFullText = TRUE;
}
}
// if not using full text index, spaces must be removed from the query
if (!$useFullText) $parts[0] = str_replace(" ","",$parts[0]);
// the following line replaces the deleted line from the original code
if ($useFullText)
{
$match = "+".str_replace(" "," +",$parts[0]);
$sql = "SELECT *, MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE) AS relevance FROM `".$config_databaseTablePrefix."products` WHERE MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE) GROUP BY name";
$sqlResultCount = "SELECT COUNT(DISTINCT(name)) as resultcount FROM `".$config_databaseTablePrefix."products` WHERE MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE)";
$orderBySelection = $orderByFullText;
}
else
{
$sql = "SELECT * , MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants FROM `".$config_databaseTablePrefix."products` WHERE search_name LIKE '%".database_safe($parts[0])."%' GROUP BY name";

$sqlResultCount = "SELECT COUNT(DISTINCT(name)) as resultcount FROM `".$config_databaseTablePrefix."products` WHERE search_name LIKE '%".database_safe($parts[0])."%'";

$orderBySelection = $orderByDefault;
}
break;

Where would I insert the code?

Submitted by support on Wed, 2008-02-13 16:22

Hiya,

Insert it after this:

if ($useFullText)
{

Cheers,
David.

Submitted by AD_Mega on Wed, 2008-02-13 17:24

I inserted this code:

$words = explode(" ",$parts[0]);
$newWords = array();
foreach($words as $word)
{
if (substr($word,-1)=="s")
{
$newWords[] = substr($word,0,-1);
}
}
$allWords = array_merge($words,$newWords);
$parts[0] = implode(" ",$allWords);

I get the same results as before I inserted the code when I search for camera, but when I search for cameras I get fewer.

Submitted by support on Wed, 2008-02-13 17:36

Hi Adrian,

If you could email me a link to your site and a copy of your modified search.php and I'll take a look...

Cheers,
David.

Submitted by TWDesigns on Tue, 2008-07-29 17:28

I can't find (strlen($parts[0]) > 3) in search.php for some reason...

Submitted by support on Tue, 2008-07-29 18:27

Hi,

The instructions above refer to an older version of search.php. In the current distribution, look instead for (and insert the code after):

        if ($useFullText)

(line 77)

Cheers,
David.

Submitted by TWDesigns on Wed, 2008-07-30 01:50

Thanks!

Submitted by TWDesigns on Wed, 2008-07-30 20:14

My code looks like this but it's still not working, unless I'm understanding this post wrong. If I search for "Chicken" it finds both "Chicken" and "Chickens". But if I search for "Chickens" it only finds "Chickens".

Here is my code, with $useFullText = FALSE; on line 77:

Before:

            $useFullText = FALSE;
            break;
          }
        }
        if ($useFullText)

After:

            $useFullText = FALSE;
$words = explode(" ",$parts[0]);
          $newWords = array();
          foreach($words as $word)
          {
            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
          }
          $allWords = array_merge($words,$newWords);
          $parts[0] = implode(" ",$allWords);
            break;
          }
        }
        if ($useFullText)

Submitted by support on Thu, 2008-07-31 08:14

Hi,

It sounds like your site is currently using the basic search method for all queries. Could you email me your search.php and I'll make sure that you're using the latest version of the default search code with this plural mod being applied to both methods...

Cheers,
David.

Submitted by TWDesigns on Thu, 2008-07-31 17:56

Sent

Thanks again,
Tommy

Submitted by Hugo on Tue, 2008-10-28 13:10

Hi,

I made the modification but it seems it's only "half" working, unless there's something I missed.

When I search for "words" it's working, returning results for "word" and "words"; but when I search for "word", it only returns results for "word".

Is this normal?
If it is, can you modify the code so it also searches for the plural when a singular word is searched?

Thanks,
Hugo

Submitted by support on Tue, 2008-10-28 17:39

Hello Hugo,

This could be done, but remember that this is going to apply to almost every query, so check the performance. It's easy to do; where you have added this part of the code above:

            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }

...change this to include the opposite as follows:

            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
            else
            {
              $newWords[] = $word."s";
            }

In other words, if the word ends in "s", add a new word without the "s"; otherwise add a new word with the "s". That should do the trick!
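
To see why performance is worth checking, here is a standalone sketch of the two-way version. The function name expandBothWays() is invented for the example and is not part of search.php. Every word now contributes a second word, so the query always doubles in length.

```php
<?php
// Hypothetical wrapper around the two-way logic above, for illustration only.
function expandBothWays(string $query): string
{
    $words = explode(" ", $query);
    $newWords = array();
    foreach ($words as $word) {
        if (substr($word, -1) == "s") {
            $newWords[] = substr($word, 0, -1); // plural to singular
        } else {
            $newWords[] = $word . "s";          // singular to plural
        }
    }
    return implode(" ", array_merge($words, $newWords));
}

echo expandBothWays("word"), "\n";            // word words
echo expandBothWays("digital cameras"), "\n"; // digital cameras digitals camera
```

Note that non-nouns get an "s" as well ("digitals" in the second call), which is harmless for matching but adds to the work the query has to do.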

Cheers,
David.

Submitted by atman on Wed, 2009-04-01 13:48

   if ($useFullText)
        {
//i inserted the code above here.
    if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
            else
            {
              $newWords[] = $word."s";
            }
//// below is the rest of my original search.php codes
          $sql = "SELECT * , MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH (name,description) AGAINST ('".database_safe($parts[0])."') AS relevance FROM `".$config_databaseTablePrefix."products` WHERE MATCH (name,description) AGAINST ('".database_safe($parts[0])."') GROUP BY name";
          $sqlResultCount = "SELECT COUNT(DISTINCT(name)) as resultcount FROM `".$config_databaseTablePrefix."products` WHERE MATCH (name,description) AGAINST ('".database_safe($parts[0])."')";
   $orderBySelection = $orderByFullText;

I can't get the above mod to work on my search. Is the above mod compatible with the latest release? PT and magic downloaded March 2009.

Thanks David.

Submitted by support on Wed, 2009-04-01 14:00

Hi atman,

Your code is missing the part that breaks down and reconstructs $parts[0]. Try this:

       if ($useFullText)
       {
          //i inserted the code above here.
          $words = explode(" ",$parts[0]);
          $newWords = array();
          foreach($words as $word)
          {
            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
            else
            {
              $newWords[] = $word."s";
            }
          }
          $allWords = array_merge($words,$newWords);
          $parts[0] = implode(" ",$allWords);
          //// below is the rest of my original search.php codes
          $sql = "SELECT * , MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH (name,description) AGAINST ('".database_safe($parts[0])."') AS relevance FROM `".$config_databaseTablePrefix."products` WHERE MATCH (name,description) AGAINST ('".database_safe($parts[0])."') GROUP BY name";
          $sqlResultCount = "SELECT COUNT(DISTINCT(name)) as resultcount FROM `".$config_databaseTablePrefix."products` WHERE MATCH (name,description) AGAINST ('".database_safe($parts[0])."')";
   $orderBySelection = $orderByFullText;

Cheers,
David.

Submitted by Keeop on Fri, 2009-04-24 11:05

Hi all,

I've stumbled across a way, of sorts, to get around the 'plurals don't work with the Boolean method' problem. I say "of sorts" as this may help your search or it may make it worse! This is because it uses the wildcard operator.

So, some examples. These assume you are using the standard full text search with the Boolean method, i.e. MATCH name AGAINST ('".database_safe($parts[0])."' IN BOOLEAN MODE)

If a search for 'shoes' brings up 20 results but a search for 'shoe' brings up none, then you obviously want your search results to include both 'shoes' and 'shoe'. To do this using the above example, add the following code to your search.php:

if ($useFullText)
        {
          //*** Plural stuff ***
          foreach($words as $k => $word)
          {
            if (substr($word,-1)=="s")
            {
              $words[$k] = substr($word,0,-1)."*";
            }
            else
            {
              $words[$k] = $word."*";
            }
          }
          $parts[0] = implode(" ",$words);
          $match = "+".str_replace(" "," +",$parts[0]);
          $sql = "SELECT *, MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE) AS relevance FROM `".$config_databaseTablePrefix."products` WHERE MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE) ".$where." GROUP BY name";
          $sqlResultCount = "SELECT COUNT(DISTINCT(name)) as resultcount FROM `".$config_databaseTablePrefix."products` WHERE MATCH name AGAINST ('".database_safe($match)."' IN BOOLEAN MODE) ".$where;
          $orderBySelection = $orderByFullText;
        }

The resultant query, based on a user entering 'shoes' in to the search box is:

SELECT *, MIN( price ) AS minPrice, MAX( price ) AS maxPrice, COUNT( id ) AS numMerchants, MATCH name AGAINST ('+shoe*' IN BOOLEAN MODE) AS relevance FROM `products` WHERE MATCH name AGAINST ('+shoe*' IN BOOLEAN MODE) GROUP BY name

As you can see, because the word was a plural, the 's' has been removed and the wildcard '*' has been added, meaning that the query will return every result with a word beginning 'shoe'. This does bring us to a caveat: if you have a product called 'ShoeASaraus', or something else that's not an actual shoe, it will be returned in the results because it begins with 'shoe'. However, as the wildcard is at the end, a word has to actually begin with 'shoe' rather than merely contain it, so this should limit spurious results.
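
If you want to check what match string a given search produces without running the SQL, the following standalone sketch mirrors the loop above. The function name wildcardMatch() is made up for the example; in search.php the same steps happen inline on $words and $parts[0].

```php
<?php
// Hypothetical helper mirroring the wildcard loop above, for illustration only.
function wildcardMatch(string $query): string
{
    $words = explode(" ", $query);
    foreach ($words as $k => $word) {
        // Drop a trailing "s" so singular and plural share the same prefix
        if (substr($word, -1) == "s") {
            $word = substr($word, 0, -1);
        }
        // "+" makes the word required, "*" matches any word with this prefix
        $words[$k] = "+" . $word . "*";
    }
    return implode(" ", $words);
}

echo wildcardMatch("shoes"), "\n";         // +shoe*
echo wildcardMatch("running shoes"), "\n"; // +running* +shoe*
```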

Anyway, have a play and see if it makes things better or worse for your particular sites and products.

Cheers.
Keeop

Submitted by goldengirl on Tue, 2009-07-28 20:33

Hi there, I see this thread was started a long time ago and I can't seem to find any of the code snippets in search.php, so when I search for televisions as opposed to television nothing appears. In my search.php it says $useFullText = TRUE; not FALSE like the above examples. Some help would be appreciated :)

Submitted by goldengirl on Tue, 2009-07-28 20:41

I have done a few tests on this, e.g. searching for lcd and lcds, and the issue does not relate to singular versus plural but to the fact that results only show if a product has the search query in the title. Is there a way around this?

Submitted by support on Tue, 2009-07-28 20:48

Hi,

This modification is more complex with regards to both full text (all keywords > 4 characters) and normal search method, bear with me and I'll merge the plural search into both methods and document the changes relative to the latest version in this thread...

Cheers,
David.

Submitted by marco.saiu on Mon, 2018-01-15 18:12

Hello David,

have news about this topic?

Thanks,
Marco Saiu

Submitted by support on Tue, 2018-01-16 09:46

Hello Marco,

Plural to non-plural has been included in all later distributions; in other words, a search for "trainers" would also search "trainer". If you wanted to add support for the opposite, that's no problem, but keep an eye on search performance. To implement, edit search.php and look for the following code at line 263:

            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }

...and REPLACE with:

            if (substr($word,-1)=="s")
            {
              $newWords[] = substr($word,0,-1);
            }
            else
            {
              $newWords[] = $word."s";
            }

And then the following code at line 303:

            if (substr($word,-1)=="s")
            {
              $where .= " OR search_name LIKE '%".database_safe(substr($word,0,-1))."%'";
              if ($config_searchDescription)
              {
                $where .= " OR description LIKE '%".database_safe(substr($word,0,-1))."%'";
              }
            }

...and REPLACE with:

            if (substr($word,-1)=="s")
            {
              $where .= " OR search_name LIKE '%".database_safe(substr($word,0,-1))."%'";
              if ($config_searchDescription)
              {
                $where .= " OR description LIKE '%".database_safe(substr($word,0,-1))."%'";
              }
            }
            else
            {
              $where .= " OR search_name LIKE '%".database_safe($word."s")."%'";
              if ($config_searchDescription)
              {
                $where .= " OR description LIKE '%".database_safe($word."s")."%'";
              }
            }
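
As a rough way to see what the second replacement adds to the query, here is a standalone sketch. The names pluralLikeClause() and escapeForSql() are invented for the example; escapeForSql() is only a stand-in for Price Tapestry's database_safe(), and the boolean parameter stands in for $config_searchDescription.

```php
<?php
// escapeForSql() is a hypothetical stand-in for database_safe(); it is not
// how Price Tapestry escapes values and is here only so the sketch runs.
function escapeForSql(string $text): string
{
    return addslashes($text);
}

// Builds the extra OR conditions for one search word, in both directions.
function pluralLikeClause(string $word, bool $searchDescription): string
{
    // Ends in "s": use the singular; otherwise: use the plural
    $other = (substr($word, -1) == "s") ? substr($word, 0, -1) : $word . "s";
    $where = " OR search_name LIKE '%" . escapeForSql($other) . "%'";
    if ($searchDescription) {
        $where .= " OR description LIKE '%" . escapeForSql($other) . "%'";
    }
    return $where;
}

echo pluralLikeClause("trainers", false), "\n";
echo pluralLikeClause("trainer", true), "\n";
```

One thing to keep in mind: a pattern such as LIKE '%trainer%' already matches any name containing "trainers", so the added plural clause only changes the results if the existing condition for the original word is not itself a substring match. That depends on the surrounding code, which is not shown here.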

Cheers,
David.
--
PriceTapestry.com

Submitted by marco.saiu on Tue, 2018-01-16 10:17

Hello David,

I would like to add Italian support alongside the English singular/plural handling in the latest distribution.

Not to replace the current function, but to expand it with Italian rules, because in Italian some English words are used (for example smartphone, desktop, notebook, mixer, console, controller, laptop, smartwatch, videogame etc.).

But I think it is complex. The Italian rules are:

- words that end in "A" in the singular form the plural in "I" if they are masculine, and in "E" if they are feminine.

- masculine and feminine words that end in "O" in the singular form the plural in "I".

- masculine and feminine words that end in "E" in the singular form the plural in "I".
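
As a starting point, the three rules above could be sketched like this. It is only an illustration under a few assumptions: the function name italianPluralCandidates() is made up, gender cannot be told from the word alone so both endings are returned for words ending in "A", and English loan words ending in a vowel (smartphone, console) will get an Italian-style candidate that may not be wanted.

```php
<?php
// Hypothetical sketch of the rules listed above; not part of Price Tapestry.
function italianPluralCandidates(string $word): array
{
    // Singular ending => possible plural endings
    $endings = array(
        "a" => array("i", "e"), // masculine "i", feminine "e"; gender unknown
        "o" => array("i"),
        "e" => array("i"),
    );
    $last = strtolower(substr($word, -1));
    if (!isset($endings[$last])) {
        return array(); // e.g. "mixer": no Italian rule applies
    }
    $stem = substr($word, 0, -1);
    $candidates = array();
    foreach ($endings[$last] as $ending) {
        $candidates[] = $stem . $ending;
    }
    return $candidates;
}

print_r(italianPluralCandidates("pantalone")); // pantaloni
print_r(italianPluralCandidates("casa"));      // casi, case
print_r(italianPluralCandidates("mixer"));     // empty array
```

Words that return an empty array could then fall through to the existing English "s" handling.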

Is it possible to write new functions based on this information?

Thanks,
Marco Saiu

Submitted by support on Tue, 2018-01-16 10:30

Hello Marco,

Sure - could you perhaps give some actual examples using Italian words covering every case just so I'm absolutely sure what should be substituted...

Thanks,
David.

Submitted by marco.saiu on Tue, 2018-01-16 11:11

Hello David,

I'll work on it and update this post in the next few days.

Thanks,
Marco Saiu