I have a list of common words & synonyms in a 'pt_synonyms' table,
#----------------------------
# Table structure for pt_synonyms
#----------------------------
CREATE TABLE `pt_synonyms` (
`original` text,
`alternate` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
#----------------------------
# Records for table pt_synonyms
#----------------------------
insert into pt_synonyms values
('Television', 'TV');
What i'd like to do is process all the titles & descriptions in the products table to find and replace the 'original' with the 'alternate' text.
I've looked at the PT scripts to find where I could do this but i can't get anywhere. Any ideas?
Thanks
David
Hi David,
I'm up to about 300 now and more are being added all the time, i would estimate 400-500.
Thanks
Dave
Oh and forget to add, i would also like to reverse the words in the table as well, so that will double it.
Hi Dave,
Assuming that you are using PHP > 4.0.5, you can do this most efficiently using str_replace() with arrays as the parameters. This lets you change all words in just one call; although it may take some time.
Here's a new filter called "Synonyms" that uses the database structure and table name "synonyms" that you've provided (i'm assuming that "pt_" is your table prefix in config.php). Simply paste this code (excluding the PHP tags, of course) into includes/filter.php:
<?php
/*************************************************/
/* synonyms */
/*************************************************/
$filter_names["synonyms"] = "Synonyms";
function filter_synonymsConfigure($filter_data)
{
print "<p>There are no additional configuration parameters for this filter.</p>";
}
function filter_synonymsValidate($filter_data)
{
}
function filter_synonymsExec($filter_data,$text)
{
// make the search and replace arrays global so we only have to load them once
global $synonyms_search;
global $synonyms_replace;
// if this is the first call to the filter load the arrays from the database
if (!is_array($synonyms_search))
{
$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
if (database_querySelect($sql,$synonyms))
{
foreach($synonyms as $synonym)
{
$synonyms_search[] = $synonym["original"];
$synonyms_replace[] = $synonym["alternate"];
}
}
}
return str_replace($synonyms_search,$synonyms_replace,$text);
}
?>
What this does is load all the items from the synonyms table into two arrays which then form the parameters to str_replace, which does the whole lot in one go.
You will have to register the filter against the product name and description fields, and then import the feeds again for the filter to take effect. Please note that this doesn't include the reversal because that would require a mechanism to remember the swap state of every single word (to avoid double swaps) and i'm not sure how easily that can be done on a swap space of several hundred words...
Hope this helps,
David.
Thanks David,
You're right in assuming the table prefix, i'm running PHP/4.3.11 but get the following error on clicking import...
Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/gifts/public_html/includes/database.php on line 21
Warning: mysql_num_rows(): supplied argument is not a valid MySQL result resource in /home/gifts/public_html/includes/database.php on line 26
Any ideas?
Hi,
I'm assuming that your prefix is "pt_"?
The only query involved is the select all from the synonyms table, so there's not a lot to go wrong! Can you print the SQL out and then exit:
$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
print $sql;exit();
On your system, that should print:
SELECT * FROM pt_synonyms
Then, copy the SQL and paste it into phpMyAdmin (i'm assuming you used something like that to create the table) and see what error MySQL gives...
(this code is also assuming that you created the pt_synonyms table in the same database as the main Price Tapestry tables)
Cheers,
David.
Hi david,
Yes the prefix is "_pt" in config php, but it's not picking it up, the SQl prints...
SELECT * FROM synonyms
The table is in the same database, here's my config line $config_databaseTablePrefix = "pt_";
odd???
PS. It works a treat when i change the select to... $sql = "SELECT * FROM pt_synonyms";
Ooops....
$config_databaseTablePrefix is not declared as global in filter_synonymsExec(). The code should be like this:
<?php
/*************************************************/
/* synonyms */
/*************************************************/
$filter_names["synonyms"] = "Synonyms";
function filter_synonymsConfigure($filter_data)
{
print "<p>There are no additional configuration parameters for this filter.</p>";
}
function filter_synonymsValidate($filter_data)
{
}
function filter_synonymsExec($filter_data,$text)
{
// make the search and replace arrays global so we only have to load them once
global $config_databaseTablePrefix;
global $synonyms_search;
global $synonyms_replace;
// if this is the first call to the filter load the arrays from the database
if (!is_array($synonyms_search))
{
$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
if (database_querySelect($sql,$synonyms))
{
foreach($synonyms as $synonym)
{
$synonyms_search[] = $synonym["original"];
$synonyms_replace[] = $synonym["alternate"];
}
}
}
return str_replace($synonyms_search,$synonyms_replace,$text);
}
?>
That's it, thank you very much for your help, even on a Sunday!
Best regards
David
Back again, that works great until I realised it's doing some funny things, so for instance...
Original 'blended' Alternate 'mixed'
It actually ended up as 'mixtureed' because earlier in the array is...
Original 'blend' Alternate 'mixture'.
So i can see 2 issues here, firstly it's matching twice & secondly it can't distinguish whole words, but a string (obviously). Is there any way round this? I tried playing with preg_replace but can't get it to work. On top of all that i have entries that contain 2 words to be replaced with 1. It's never simple!!
Thanks
Dave
Hiya,
Yes - that is going to complicate matters (I had thought there would be a number of "gotchas" involved here). I'll have a play with the filter code above to see what can be done....
Cheers,
David.
Hi Dave,
Here's "Take 2"! Regular expressions can get quite heavy, but here's the code first:
<?php
/*************************************************/
/* synonyms */
/*************************************************/
$filter_names["synonyms"] = "Synonyms";
function filter_synonymsConfigure($filter_data)
{
print "<p>There are no additional configuration parameters for this filter.</p>";
}
function filter_synonymsValidate($filter_data)
{
}
function filter_synonymsExec($filter_data,$text)
{
// make the search and replace arrays global so we only have to load them once
global $synonyms_search;
global $synonyms_replace;
// if this is the first call to the filter load the arrays from the database
if (!is_array($synonyms_search))
{
$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
if (database_querySelect($sql,$synonyms))
{
foreach($synonyms as $synonym)
{
$synonyms_search[] = "/(^|\\s)".$synonym["original"]."($|\\s)/i";
$synonyms_replace[] = "\$1".$synonym["alternate"]."\$2";
}
}
}
return preg_replace($synonyms_search,$synonyms_replace,$text);
}
?>
Luckily, preg_replace() behaves in the same way as str_replace() in that you can pass it arrays of search and replacement expressions. You will see in the code that the search expression is as follows:
"/(^|\\s)".$synonym["original"]."($|\\s)/i";
The bit in the middle is obvious - that's the word we're looking for. However, the (^|\\s) means either the start of the line "^", OR "|" a white space character "\\s". Similarly, ($|\\s) means either the end of the line "$" OR "|" a white space character "\\s".
This stops the matching of a part word. The fact that these parts of the expressions are in brackets is crucial, because you will notice that in the replacement expression:
"\$1".$synonym["alternate"]."\$2";
...we include the $1 and $2 "back references". They will be be replaced with whatever characters matched the before or after expressions within the search expression - otherwise the spaces would be wiped out.
Hope this helps!
Cheers,
David.
That works, the global $config_databaseTablePrefix; was missing but this is corrected in the below code.
This may be a bit much to ask but is there any way of matching a reversal of the synonyms list, say doing only one match per term?
Thanks again
Dave
<?php
/*************************************************/
/* synonyms */
/*************************************************/
$filter_names["synonyms"] = "Synonyms";
function filter_synonymsConfigure($filter_data)
{
print "<p>There are no additional configuration parameters for this filter.</p>";
}
function filter_synonymsValidate($filter_data)
{
}
function filter_synonymsExec($filter_data,$text)
{
// make the search and replace arrays global so we only have to load them once
global $config_databaseTablePrefix;
global $synonyms_search;
global $synonyms_replace;
// if this is the first call to the filter load the arrays from the database
if (!is_array($synonyms_search))
{
$sql = "SELECT * FROM ".$config_databaseTablePrefix."synonyms";
if (database_querySelect($sql,$synonyms))
{
foreach($synonyms as $synonym)
{
$synonyms_search[] = "/(^|\\s)".$synonym["original"]."($|\\s)/i";
$synonyms_replace[] = "\$1".$synonym["alternate"]."\$2";
}
}
}
return preg_replace($synonyms_search,$synonyms_replace,$text);
}
?>
how do i implement this?
ive added the synonyms filter, where do i insert the synonyms words?
Hi,
To use this filter you need to have another table in your database to hold the synonums. The table is described in the first message in this thread....
Do you have phpMyAdmin or a similar MySQL administration script installed on your server?
Cheers,
David.
hi david,
thanks for the reply.
i do have phpmyadmin on cpanel.
i am also interested in including this filter by default. is that possible?
thank you.
oisin
Hi oisin,
The first thing you need to do is create the table using phpMyAdmin. You can do this using the SQL tool and pasting in the following code having selected your Price Tapestry database:
CREATE TABLE `synonyms` (
`original` text,
`alternate` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
If you are using a table prefix configured in $config_databaseTablePrefix you will need to change the table name "synonyms" to include the prefix.
Next, populate the table as required using phpMyAdmin. Let me know once you've got the filter working as required, and i'll have a look at how you can make it permanent... I presume you want it applied to all description fields by default?
Cheers,
David.
Hi David,
How many synonyms do you have?
There are 2 ways to do this.
i) If there are less than 1-200 or so, I would load all synonyms into an array and process them during import.
ii) If there are more than that, I would run a separate process after import to update all titles (which are indexed) against the synonym table.
Let me know how many you're working with and i'll roll some code - it should be straight forward enough...
Cheers,
David.