You are here:  » Find & Replace Words in Titles & Descriptions


Find & Replace Words in Titles & Descriptions

Submitted by noodles on Sat, 2006-09-23 15:23 in

I have a list of common words & synonyms in a 'pt_synonyms' table,

#----------------------------
# Table structure for pt_synonyms
#----------------------------
CREATE TABLE `pt_synonyms` (
`original` text,
`alternate` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
#----------------------------
# Records for table pt_synonyms
#----------------------------

insert into pt_synonyms values
('Television', 'TV');

What i'd like to do is process all the titles & descriptions in the products table to find and replace the 'original' with the 'alternate' text.

I've looked at the PT scripts to find where I could do this but i can't get anywhere. Any ideas?

Thanks
David

Submitted by support on Sat, 2006-09-23 19:11

Hi David,

How many synonyms do you have?

There are 2 ways to do this.

i) If there are less than 1-200 or so, I would load all synonyms into an array and process them during import.

ii) If there are more than that, I would run a separate process after import to update all titles (which are indexed) against the synonym table.

Let me know how many you're working with and i'll roll some code - it should be straight forward enough...

Cheers,
David.

Submitted by noodles on Sat, 2006-09-23 19:59

Hi David,

I'm up to about 300 now and more are being added all the time, i would estimate 400-500.

Thanks
Dave

Submitted by noodles on Sat, 2006-09-23 20:00

Oh and forget to add, i would also like to reverse the words in the table as well, so that will double it.

Submitted by support on Sun, 2006-09-24 10:31

Hi Dave,

Assuming that you are using PHP > 4.0.5, you can do this most efficiently using str_replace() with arrays as the parameters. This lets you change all words in just one call; although it may take some time.

Here's a new filter called "Synonyms" that uses the database structure and table name "synonyms" that you've provided (i'm assuming that "pt_" is your table prefix in config.php). Simply paste this code (excluding the PHP tags, of course) into includes/filter.php:

<?php
  
/*************************************************/
  /* synonyms                                      */
  /*************************************************/
  
$filter_names["synonyms"] = "Synonyms";
  function 
filter_synonymsConfigure($filter_data)
  {
    print 
"<p>There are no additional configuration parameters for this filter.</p>";
  }
  function 
filter_synonymsValidate($filter_data)
  {
  }
  function 
filter_synonymsExec($filter_data,$text)
  {
    
// make the search and replace arrays global so we only have to load them once
    
global $synonyms_search;
    global 
$synonyms_replace;
    
// if this is the first call to the filter load the arrays from the database
    
if (!is_array($synonyms_search))
    {
      
$sql "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
      if (
database_querySelect($sql,$synonyms))
      {
        foreach(
$synonyms as $synonym)
        {
          
$synonyms_search[] = $synonym["original"];
          
$synonyms_replace[] = $synonym["alternate"];
        }
      }
    }
    return 
str_replace($synonyms_search,$synonyms_replace,$text);
  }
?>

What this does is load all the items from the synonyms table into two arrays which then form the parameters to str_replace, which does the whole lot in one go.

You will have to register the filter against the product name and description fields, and then import the feeds again for the filter to take effect. Please note that this doesn't include the reversal because that would require a mechanism to remember the swap state of every single word (to avoid double swaps) and i'm not sure how easily that can be done on a swap space of several hundred words...

Hope this helps,
David.

Submitted by noodles on Sun, 2006-09-24 11:12

Thanks David,

You're right in assuming the table prefix, i'm running PHP/4.3.11 but get the following error on clicking import...

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/gifts/public_html/includes/database.php on line 21

Warning: mysql_num_rows(): supplied argument is not a valid MySQL result resource in /home/gifts/public_html/includes/database.php on line 26

Any ideas?

Submitted by support on Sun, 2006-09-24 11:50

Hi,

I'm assuming that your prefix is "pt_"?

The only query involved is the select all from the synonyms table, so there's not a lot to go wrong! Can you print the SQL out and then exit:

$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
print $sql;exit();

On your system, that should print:

SELECT * FROM pt_synonyms

Then, copy the SQL and paste it into phpMyAdmin (i'm assuming you used something like that to create the table) and see what error MySQL gives...

(this code is also assuming that you created the pt_synonyms table in the same database as the main Price Tapestry tables)

Cheers,
David.

Submitted by noodles on Sun, 2006-09-24 11:59

Hi david,

Yes the prefix is "_pt" in config php, but it's not picking it up, the SQl prints...

SELECT * FROM synonyms

The table is in the same database, here's my config line $config_databaseTablePrefix = "pt_";

odd???

PS. It works a treat when i change the select to... $sql = "SELECT * FROM pt_synonyms";

Submitted by support on Sun, 2006-09-24 12:05

Ooops....

$config_databaseTablePrefix is not declared as global in filter_synonymsExec(). The code should be like this:

<?php
  
/*************************************************/
  /* synonyms                                      */
  /*************************************************/
  
$filter_names["synonyms"] = "Synonyms";
  function 
filter_synonymsConfigure($filter_data)
  {
    print 
"<p>There are no additional configuration parameters for this filter.</p>";
  }
  function 
filter_synonymsValidate($filter_data)
  {
  }
  function 
filter_synonymsExec($filter_data,$text)
  {
    
// make the search and replace arrays global so we only have to load them once
    
global $config_databaseTablePrefix;
    global 
$synonyms_search;
    global 
$synonyms_replace;
    
// if this is the first call to the filter load the arrays from the database
    
if (!is_array($synonyms_search))
    {
      
$sql "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
      if (
database_querySelect($sql,$synonyms))
      {
        foreach(
$synonyms as $synonym)
        {
          
$synonyms_search[] = $synonym["original"];
          
$synonyms_replace[] = $synonym["alternate"];
        }
      }
    }
    return 
str_replace($synonyms_search,$synonyms_replace,$text);
  }
?>

Submitted by noodles on Sun, 2006-09-24 12:08

That's it, thank you very much for your help, even on a Sunday!

Best regards
David

Submitted by noodles on Sun, 2006-09-24 13:56

Back again, that works great until I realised it's doing some funny things, so for instance...

Original 'blended' Alternate 'mixed'

It actually ended up as 'mixtureed' because earlier in the array is...

Original 'blend' Alternate 'mixture'.

So i can see 2 issues here, firstly it's matching twice & secondly it can't distinguish whole words, but a string (obviously). Is there any way round this? I tried playing with preg_replace but can't get it to work. On top of all that i have entries that contain 2 words to be replaced with 1. It's never simple!!

Thanks
Dave

Submitted by support on Sun, 2006-09-24 15:15

Hiya,

Yes - that is going to complicate matters (I had thought there would be a number of "gotchas" involved here). I'll have a play with the filter code above to see what can be done....

Cheers,
David.

Submitted by support on Mon, 2006-09-25 08:14

Hi Dave,

Here's "Take 2"! Regular expressions can get quite heavy, but here's the code first:

<?php
  
/*************************************************/
  /* synonyms                                      */
  /*************************************************/
  
$filter_names["synonyms"] = "Synonyms";
  function 
filter_synonymsConfigure($filter_data)
  {
    print 
"<p>There are no additional configuration parameters for this filter.</p>";
  }
  function 
filter_synonymsValidate($filter_data)
  {
  }
  function 
filter_synonymsExec($filter_data,$text)
  {
    
// make the search and replace arrays global so we only have to load them once
    
global $synonyms_search;
    global 
$synonyms_replace;
    
// if this is the first call to the filter load the arrays from the database
    
if (!is_array($synonyms_search))
    {
      
$sql "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
      if (
database_querySelect($sql,$synonyms))
      {
        foreach(
$synonyms as $synonym)
        {
          
$synonyms_search[] = "/(^|\\s)".$synonym["original"]."($|\\s)/i";
          
$synonyms_replace[] = "\$1".$synonym["alternate"]."\$2";
        }
      }
    }
    return 
preg_replace($synonyms_search,$synonyms_replace,$text);
  }
?>

Luckily, preg_replace() behaves in the same way as str_replace() in that you can pass it arrays of search and replacement expressions. You will see in the code that the search expression is as follows:

"/(^|\\s)".$synonym["original"]."($|\\s)/i";

The bit in the middle is obvious - that's the word we're looking for. However, the (^|\\s) means either the start of the line "^", OR "|" a white space character "\\s". Similarly, ($|\\s) means either the end of the line "$" OR "|" a white space character "\\s".

This stops the matching of a part word. The fact that these parts of the expressions are in brackets is crucial, because you will notice that in the replacement expression:

"\$1".$synonym["alternate"]."\$2";

...we include the $1 and $2 "back references". They will be be replaced with whatever characters matched the before or after expressions within the search expression - otherwise the spaces would be wiped out.

Hope this helps!
Cheers,
David.

Submitted by noodles on Mon, 2006-09-25 16:48

That works, the global $config_databaseTablePrefix; was missing but this is corrected in the below code.

This may be a bit much to ask but is there any way of matching a reversal of the synonyms list, say doing only one match per term?

Thanks again
Dave

<?php
  
/*************************************************/
  /* synonyms                                      */
  /*************************************************/
  
$filter_names["synonyms"] = "Synonyms";
  function 
filter_synonymsConfigure($filter_data)
  {
    print 
"<p>There are no additional configuration parameters for this filter.</p>";
  }
  function 
filter_synonymsValidate($filter_data)
  {
  }
  function 
filter_synonymsExec($filter_data,$text)
  {
    
// make the search and replace arrays global so we only have to load them once
    
global $config_databaseTablePrefix;
    global 
$synonyms_search;
    global 
$synonyms_replace;
    
// if this is the first call to the filter load the arrays from the database
    
if (!is_array($synonyms_search))
    {
      
$sql "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
      if (
database_querySelect($sql,$synonyms))
      {
        foreach(
$synonyms as $synonym)
        {
          
$synonyms_search[] = "/(^|\\s)".$synonym["original"]."($|\\s)/i";
          
$synonyms_replace[] = "\$1".$synonym["alternate"]."\$2";
        }
      }
    }
    return 
preg_replace($synonyms_search,$synonyms_replace,$text);
  }
?>

Submitted by Oisin on Sat, 2006-10-07 14:16

how do i implement this?

ive added the synonyms filter, where do i insert the synonyms words?

Submitted by support on Sat, 2006-10-07 14:44

Hi,

To use this filter you need to have another table in your database to hold the synonums. The table is described in the first message in this thread....

Do you have phpMyAdmin or a similar MySQL administration script installed on your server?

Cheers,
David.

Submitted by Oisin on Sun, 2006-10-08 02:51

hi david,

thanks for the reply.

i do have phpmyadmin on cpanel.

i am also interested in including this filter by default. is that possible?

thank you.

oisin

Submitted by support on Sun, 2006-10-08 04:29

Hi oisin,

The first thing you need to do is create the table using phpMyAdmin. You can do this using the SQL tool and pasting in the following code having selected your Price Tapestry database:

CREATE TABLE `synonyms` (
`original` text,
`alternate` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

If you are using a table prefix configured in $config_databaseTablePrefix you will need to change the table name "synonyms" to include the prefix.

Next, populate the table as required using phpMyAdmin. Let me know once you've got the filter working as required, and i'll have a look at how you can make it permanent... I presume you want it applied to all description fields by default?

Cheers,
David.