Hello David
Is it possible to add the categories to the sitemap?
Thanks
Mally
Hi David
I used this code as well (thank-you!)
It doesn't show date modified on the new sitemap. Is it possible to include this?
Regards
Peter
Hi Peter,
Unlike the products sitemap, where lastmod can be created from the feed import data, there is no category-specific value that can be used. However, thinking about it logically, the date of the most recently imported feed would be a reasonable value to use.
In sitemap.php, look for this code around line 74:
print "<lastmod>".date("Y-m-d",$row["imported"])."</lastmod>";
...and INSERT the following code to make a copy of the latest imported date for use in the sitemap categories index entry later:
if ($row["imported"] > $lastImported) $lastImported = $row["imported"];
Then, lower down where you added the new code for the categories index entry using this line:
print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc></sitemap>";
...REPLACE this with:
print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc><lastmod>".date("Y-m-d",$lastImported)."</lastmod></sitemap>";
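As a rough standalone sketch of the two changes above (not the actual sitemap.php; the function names and sample rows are made up for illustration), the idea is to track the newest imported timestamp while looping over the feeds and then reuse it for the categories entry. It starts $lastImported at 0 so the first comparison is well defined, and uses gmdate() rather than date() only so that the output does not depend on the server timezone:

```php
<?php
// Hypothetical helper: newest "imported" timestamp across the feed rows.
function latestImported($rows)
{
  $lastImported = 0;
  foreach($rows as $row)
  {
    if ($row["imported"] > $lastImported) $lastImported = $row["imported"];
  }
  return $lastImported;
}
// Hypothetical helper: the categories entry for the sitemap index.
function categoriesIndexEntry($sitemapBaseHREF,$lastImported)
{
  return "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc><lastmod>".gmdate("Y-m-d",$lastImported)."</lastmod></sitemap>";
}
// Sample data only; the real values come from the feed rows in sitemap.php.
$rows = array(array("imported"=>1600000000),array("imported"=>1700000000));
print categoriesIndexEntry("http://www.example.com/",latestImported($rows));
```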
Cheers,
David.
Hello David
Is it possible to also add brands to the sitemap?
Thank you
Jordanus
Hi Jordanus,
Sure - here's a brand version of the above:
sitemapBrands.php
<?php
function xmlentities($text)
{
$search = array('&','<','>','"','\'');
$replace = array('&amp;','&lt;','&gt;','&quot;','&apos;');
$text = str_replace($search,$replace,$text);
$text = preg_replace('/[^-A-Za-z0-9:\/\. ]/','',$text);
return $text;
}
require("includes/common.php");
header("Content-Type: text/xml");
print "<?xml version='1.0' encoding='UTF-8'?>";
print "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd'>";
$sql = "SELECT DISTINCT(brand) AS brand FROM `".$config_databaseTablePrefix."products` WHERE brand <> '' LIMIT 50000";
if (database_querySelect($sql,$rows))
{
$sitemapBaseHREF = "http://".$_SERVER["HTTP_HOST"].$config_baseHREF;
foreach($rows as $row)
{
print "<url>";
if ($config_useRewrite)
{
$sitemapHREF = "brand/".tapestry_hyphenate($row["brand"])."/";
}
else
{
$sitemapHREF = "search.php?q=brand:".urlencode($row["brand"]);
}
print "<loc>".xmlentities($sitemapBaseHREF.$sitemapHREF)."</loc>";
print "</url>";
}
}
print "</urlset>";
?>
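Note that the preg_replace() in xmlentities() removes every character outside its whitelist, which includes the ? and = in the non-rewrite search.php URLs. If that matters on your site, one possible alternative is to let PHP's htmlspecialchars() do the escaping. This is a sketch only: xmlEscapeLoc is a made-up name and not part of the distribution, and it assumes PHP 5.4 or later for ENT_XML1:

```php
<?php
// Hypothetical alternative to xmlentities(): escape the XML special
// characters and leave the rest of the URL untouched.
function xmlEscapeLoc($url)
{
  return htmlspecialchars($url,ENT_QUOTES | ENT_XML1,"UTF-8");
}
// Sample URL only.
print "<loc>".xmlEscapeLoc("http://www.example.com/search.php?q=brand:Example&page=2")."</loc>";
```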
Cheers,
David.
--
PriceTapestry.com
Is it possible to add all brand/category pages to the sitemap?
Most of my brands have several pages. I want all of them in the sitemap.
For example:
www.example.com/brand/Example/
www.example.com/brand/Example/2.html
www.example.com/brand/Example/3.html
www.example.com/brand/Example/4.html
Hi,
Sure - since there is a limit of 50,000 URLs in a single sitemap, just to be on the safe side I've created alternative versions of the above scripts that work in a similar way to the main sitemap by returning a sitemap index with an entry for each category / brand. The individual category / brand sitemaps then contain all pages as required...
sitemapCategories.php
<?php
require("includes/common.php");
header("Content-Type: text/xml");
print "<?xml version='1.0' encoding='UTF-8'?>";
$sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
if (isset($_GET["category"]))
{
$category = tapestry_normalise($_GET["category"]);
print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE category='".database_safe($category)."'";
database_querySelect($sql,$rows);
$numResults = $rows[0]["numResults"];
$numPages = ceil($numResults / $config_resultsPerPage);
for($page=1;$page<=$numPages;$page++)
{
print "<url>";
$loc = $sitemapBaseHREF.tapestry_indexHREF("category",$category);
if ($page > 1) $loc .= $page.".html";
print "<loc><![CDATA[".$loc."]]></loc>";
print "</url>";
}
print "</urlset>";
}
else
{
print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT DISTINCT(category) FROM `".$config_databaseTablePrefix."products` WHERE category <> '' ORDER BY category";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
print "<sitemap>";
$sitemapHREF = $config_baseHREF."sitemapCategories.php?category=".urlencode($row["category"]);
print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
print "</sitemap>";
}
}
print "</sitemapindex>";
}
?>
sitemapBrands.php
<?php
require("includes/common.php");
header("Content-Type: text/xml");
print "<?xml version='1.0' encoding='UTF-8'?>";
$sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
if (isset($_GET["brand"]))
{
$brand = tapestry_normalise($_GET["brand"]);
print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE brand='".database_safe($brand)."'";
database_querySelect($sql,$rows);
$numResults = $rows[0]["numResults"];
$numPages = ceil($numResults / $config_resultsPerPage);
for($page=1;$page<=$numPages;$page++)
{
print "<url>";
$loc = $sitemapBaseHREF.tapestry_indexHREF("brand",$brand);
if ($page > 1) $loc .= $page.".html";
print "<loc><![CDATA[".$loc."]]></loc>";
print "</url>";
}
print "</urlset>";
}
else
{
print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT DISTINCT(brand) FROM `".$config_databaseTablePrefix."products` WHERE brand <> '' ORDER BY brand";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
print "<sitemap>";
$sitemapHREF = $config_baseHREF."sitemapBrands.php?brand=".urlencode($row["brand"]);
print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
print "</sitemap>";
}
}
print "</sitemapindex>";
}
?>
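To show the paging arithmetic on its own, here is a minimal sketch that does not need the database. pageLocs is a made-up helper name and the numbers are sample values; in the scripts above they come from the COUNT query and $config_resultsPerPage:

```php
<?php
// Hypothetical helper: every page URL for one brand or category index.
// $indexHREF is the URL of the first page, ending in "/" as in rewrite mode.
function pageLocs($indexHREF,$numResults,$resultsPerPage)
{
  $locs = array();
  $numPages = ceil($numResults / $resultsPerPage);
  for($page=1;$page<=$numPages;$page++)
  {
    $loc = $indexHREF;
    if ($page > 1) $loc .= $page.".html";
    $locs[] = $loc;
  }
  return $locs;
}
// Sample values only: 35 products at 10 per page gives 4 pages.
foreach(pageLocs("http://www.example.com/brand/Example/",35,10) as $loc)
{
  print "<url><loc><![CDATA[".$loc."]]></loc></url>\n";
}
```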
Cheers,
David.
Hello David,
The sitemapindex works just fine but the brand/category sitemaps do not work.
I am getting an error message when I try to open a brand/category sitemap.
This is the error message:
xml parse error : no main element found
Location : http://www.mydomain.com/sitemapBrands.php?brand=Example
Rule number 1 , column 104
This line is written in red under the error message
-------------------------------------------------------------------------------------------------------^
Can you fix this?
Hi,
I just checked in case it was a copy / paste or forum code formatting issue, but all looks OK. Could you try with CDATA tags around the location? I've applied the change to the scripts above, which is line 19 in each script:
print "<loc><![CDATA[".$loc."]]></loc>";
Hope this helps!
Cheers,
David.
I am getting the same error message, but I have looked in my error log. This is the message in the error log:
Got error 'PHP message: PHP Fatal error: Call to undefined function tapestry_indexHREF()
Hi,
Ah - sorry about that - tapestry_indexHREF() was added in 15/10A. For earlier distributions, in sitemapCategories.php look for the following code at line 17:
$loc = $sitemapBaseHREF.tapestry_indexHREF("category",$category);
...REPLACE with:
if ($config_useRewrite)
{
$loc = $sitemapBaseHREF.$config_baseHREF."category/".tapestry_hyphenate($category)."/";
if ($page > 1) $loc .= $page.".html";
}
else
{
$loc = $sitemapBaseHREF.$config_baseHREF."search.php?q=category:".urlencode($category);
if ($page > 1) $loc .= "&page=".$page;
}
And similarly in sitemapBrands.php look for the following code at line 17:
$loc = $sitemapBaseHREF.tapestry_indexHREF("brand",$brand);
...and REPLACE with:
if ($config_useRewrite)
{
$loc = $sitemapBaseHREF.$config_baseHREF."brand/".tapestry_hyphenate($brand)."/";
if ($page > 1) $loc .= $page.".html";
}
else
{
$loc = $sitemapBaseHREF.$config_baseHREF."search.php?q=brand:".urlencode($brand);
if ($page > 1) $loc .= "&page=".$page;
}
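The two replacements above follow the same pattern, which can be sketched as a single standalone function. indexLoc and exampleHyphenate are made-up names; exampleHyphenate is only a stand-in for tapestry_hyphenate(), which lives in the distribution and may well behave differently:

```php
<?php
// Stand-in for tapestry_hyphenate(); an assumption for this sketch only.
// It does nothing more than swap spaces for hyphens.
function exampleHyphenate($text)
{
  return str_replace(" ","-",$text);
}
// Hypothetical helper: build the page URL for a "brand" or "category" index.
// $base plays the role of $sitemapBaseHREF.$config_baseHREF.
function indexLoc($base,$type,$value,$page,$useRewrite)
{
  if ($useRewrite)
  {
    $loc = $base.$type."/".exampleHyphenate($value)."/";
    if ($page > 1) $loc .= $page.".html";
  }
  else
  {
    $loc = $base."search.php?q=".$type.":".urlencode($value);
    if ($page > 1) $loc .= "&page=".$page;
  }
  return $loc;
}
// Sample values only.
print indexLoc("http://www.example.com/","brand","Example Brand",2,true)."\n";
print indexLoc("http://www.example.com/","brand","Example Brand",2,false)."\n";
```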
Cheers,
David.
Hi,
I tried to get the Brands URL in lower case in the Brands sitemap.xml, but if I change:
$sitemapHREF = "brand/".tapestry_hyphenate($row["brand"])."/";
to
$sitemapHREF = "brand/".tapestry_hyphenate(strlower($row["brand"]))."/";
I get an error. I thought this would work. What am I doing wrong?
Thanks in advance!
Robert
Hi Robert,
The function name is strtolower (the "to" was missing) - have a go with:
$sitemapHREF = "brand/".tapestry_hyphenate(strtolower($row["brand"]))."/";
That should be all it is...
Cheers,
David.
Hi again David.
Is there a script available to generate the sitemap when $config_useCategoryHierarchy is enabled? It should only generate the leaves, of course, and only where there are products mapped to them.
Hi,
Sure - here's a version of sitemapCategories.php for Category Hierarchy;
<?php
require("includes/common.php");
$categories_hierarchy = array();
function getCategoryPath($id)
{
global $categories_hierarchy;
$categories = array();
do {
array_unshift($categories,$categories_hierarchy[$id]["name"]);
} while($id = $categories_hierarchy[$id]["parent"]);
return implode("/",$categories);
}
$sql = "SELECT id,name,parent FROM `".$config_databaseTablePrefix."categories_hierarchy` ORDER BY id";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
$categories_hierarchy[$row["id"]] = $row;
}
}
header("Content-Type: text/xml");
print "<?xml version='1.0' encoding='UTF-8'?>";
$sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
if (isset($_GET["path"]))
{
$nodeInfo = tapestry_categoryHierarchyNodeInfo($_GET["path"]);
print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE categoryid='".database_safe($nodeInfo["id"])."'";
database_querySelect($sql,$rows);
$numResults = $rows[0]["numResults"];
$numPages = ceil($numResults / $config_resultsPerPage);
for($page=1;$page<=$numPages;$page++)
{
print "<url>";
$loc = $sitemapBaseHREF.tapestry_indexHREF("category",$nodeInfo["path"]);
if ($page > 1) $loc .= $page.".html";
print "<loc><![CDATA[".$loc."]]></loc>";
print "</url>";
}
print "</urlset>";
}
else
{
print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
print "<sitemap>";
$sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
print "</sitemap>";
}
}
print "</sitemapindex>";
}
?>
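To illustrate how getCategoryPath() builds a path, here is the same walk as a standalone sketch with the hierarchy passed in rather than read from a global. The function name and the sample rows are made up, and it assumes top-level categories have a parent of 0, which is what ends the loop:

```php
<?php
// Hypothetical variant of getCategoryPath(): walk from the node up to the
// root, prepending each name, then join the names with "/".
function categoryPath($hierarchy,$id)
{
  $categories = array();
  do {
    array_unshift($categories,$hierarchy[$id]["name"]);
  } while($id = $hierarchy[$id]["parent"]);
  return implode("/",$categories);
}
// Sample rows only, keyed by id as in the script above.
$hierarchy = array(
  1 => array("id"=>1,"name"=>"electronic","parent"=>0),
  2 => array("id"=>2,"name"=>"tv-video","parent"=>1),
  3 => array("id"=>3,"name"=>"monitors","parent"=>2)
);
print categoryPath($hierarchy,3);
```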
Cheers,
David.
Hi David.
How can I modify the above script to include only the leaf categories in the sitemap?
I mean, for instance, I have, example.com/electronic/tv-video/monitors/
But I only want to include in sitemap monitors, not electronic and tv-video.
Thanks!
Hi,
If you only have products at leaf categories, that should already be the case, as the SELECT query in the <sitemapindex> part of the above script uses the products table to get the categoryid list.
However, if you wanted to force the index to be leaf categories only, then where you have the following code beginning at line 44:
print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
print "<sitemap>";
$sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
print "</sitemap>";
}
}
print "</sitemapindex>";
...REPLACE with:
print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
$sql = "SELECT id FROM `".$config_databaseTablePrefix."categories_hierarchy` WHERE id NOT IN (SELECT DISTINCT(parent) FROM `".$config_databaseTablePrefix."categories_hierarchy`)";
database_querySelect($sql,$rows);
$leafIds = array();
foreach($rows as $row)
{
$leafIds[] = $row["id"];
}
$sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
if (database_querySelect($sql,$rows))
{
foreach($rows as $row)
{
if (!in_array($row["categoryid"],$leafIds)) continue;
print "<sitemap>";
$sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
print "</sitemap>";
}
}
print "</sitemapindex>";
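The leaf test in the first query (an id that never appears as another row's parent) can also be sketched in plain PHP, which may help when checking the results against your own hierarchy. leafIds is a made-up helper name and the rows are sample data:

```php
<?php
// Hypothetical helper: ids that never appear as another row's parent.
function leafIds($hierarchy)
{
  // Collect every id that is used as a parent.
  $parents = array();
  foreach($hierarchy as $row)
  {
    $parents[$row["parent"]] = true;
  }
  // Keep only the ids that are not in that set.
  $leafIds = array();
  foreach($hierarchy as $row)
  {
    if (!isset($parents[$row["id"]])) $leafIds[] = $row["id"];
  }
  return $leafIds;
}
// Sample rows only: "monitors" and "cameras" have no children.
$hierarchy = array(
  array("id"=>1,"name"=>"electronic","parent"=>0),
  array("id"=>2,"name"=>"tv-video","parent"=>1),
  array("id"=>3,"name"=>"monitors","parent"=>2),
  array("id"=>4,"name"=>"cameras","parent"=>1)
);
print implode(",",leafIds($hierarchy));
```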
Hope this helps!
Cheers,
David.
Hi Mally,
Should be straightforward. Start with a new script to generate a sitemap for the categories:
sitemapCategories.php:
<?php
function xmlentities($text)
{
$search = array('&','<','>','"','\'');
$replace = array('&amp;','&lt;','&gt;','&quot;','&apos;');
$text = str_replace($search,$replace,$text);
$text = preg_replace('/[^-A-Za-z0-9:\/\. ]/','',$text);
return $text;
}
require("includes/common.php");
header("Content-Type: text/xml");
print "<?xml version='1.0' encoding='UTF-8'?>";
print "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd'>";
$sql = "SELECT DISTINCT(category) AS category FROM `".$config_databaseTablePrefix."products` LIMIT 50000";
if (database_querySelect($sql,$rows))
{
$sitemapBaseHREF = "http://".$_SERVER["HTTP_HOST"].$config_baseHREF;
foreach($rows as $row)
{
print "<url>";
if ($config_useRewrite)
{
$sitemapHREF = "category/".tapestry_hyphenate($row["category"])."/";
}
else
{
$sitemapHREF = "search.php?q=category:".urlencode($row["category"]);
}
print "<loc>".xmlentities($sitemapBaseHREF.$sitemapHREF)."</loc>";
print "</url>";
}
}
print "</urlset>";
?>
And then within sitemap.php, look for the following code on line 82:
print "</sitemapindex>";
...and REPLACE this with:
print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc></sitemap>";
print "</sitemapindex>";
Cheers,
David.