Is it possible to add the categories to the sitemap?

Submitted by mally on Mon, 2008-11-10 18:54

Hello David

Is it possible to add the categories to the sitemap?

Thanks

Mally

Submitted by support on Tue, 2008-11-11 09:55

Hi Mally,

This should be straightforward. Start with a new script to generate a sitemap for the categories:

sitemapCategories.php:

<?php
  function xmlentities($text)
  {
    $search = array('&','<','>','"','\'');
    $replace = array('&amp;','&lt;','&gt;','&quot;','&apos;');
    $text = str_replace($search,$replace,$text);
    $text = preg_replace('/[^-A-Za-z0-9:\/\. ]/','',$text);
    return $text;
  }
  require("includes/common.php");
  header("Content-Type: text/xml");
  print "<?xml version='1.0' encoding='UTF-8'?>";
  print "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd'>";
  $sql = "SELECT DISTINCT(category) AS category FROM `".$config_databaseTablePrefix."products` LIMIT 50000";
  if (database_querySelect($sql,$rows))
  {
    $sitemapBaseHREF = "http://".$_SERVER["HTTP_HOST"].$config_baseHREF;
    foreach($rows as $row)
    {
      print "<url>";
      if ($config_useRewrite)
      {
        $sitemapHREF = "category/".tapestry_hyphenate($row["category"])."/";
      }
      else
      {
        $sitemapHREF = "search.php?q=category:".urlencode($row["category"]);
      }
      print "<loc>".xmlentities($sitemapBaseHREF.$sitemapHREF)."</loc>";
      print "</url>";
    }
  }
  print "</urlset>";
?>

And then within sitemap.php, look for the following code on line 82:

print "</sitemapindex>";

...and REPLACE this with:

print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc></sitemap>";
print "</sitemapindex>";

Cheers,
David.

Submitted by mally on Tue, 2008-11-11 18:31

Hello David

Perfect!

Thanks

Mally

Submitted by Pep on Tue, 2008-11-18 01:01

Hi David

I used this code as well (thank you!)

It doesn't show date modified on the new sitemap. Is it possible to include this?

Regards

Peter

Submitted by support on Tue, 2008-11-18 08:19

Hi Peter,

Unlike the products sitemap, where lastmod can be created from the feed import data, for the categories there is no such specific value that can be used. However, thinking about it logically, the date of the most recently imported feed may be a useful value to use.

In sitemap.php, look for this code around line 74:

          print "<lastmod>".date("Y-m-d",$row["imported"])."</lastmod>";

...and INSERT the following code to make a copy of the latest imported date for use in the sitemap categories index entry later:

          if ($row["imported"] > $lastImported) $lastImported = $row["imported"];

Then, lower down where you added the new code for the categories index entry using this line:

print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc></sitemap>";

...REPLACE this with:

print "<sitemap><loc>".$sitemapBaseHREF."sitemapCategories.php</loc><lastmod>".date("Y-m-d",$lastImported)."</lastmod></sitemap>";
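
To illustrate the idea, here is a minimal standalone sketch of tracking the latest import time and formatting it for lastmod. The $rows array and its timestamps are made-up sample data, and latestImported() is a hypothetical helper rather than anything in sitemap.php. It starts $lastImported at 0 so the first comparison never reads an undefined variable, and uses gmdate() instead of date() only so the output does not depend on the server time zone:

```php
<?php
// Hypothetical helper: returns the largest "imported" Unix timestamp in $rows.
function latestImported(array $rows)
{
  $lastImported = 0; // start at 0 so the first comparison is well defined
  foreach ($rows as $row)
  {
    if ($row["imported"] > $lastImported) $lastImported = $row["imported"];
  }
  return $lastImported;
}

// Made-up sample rows standing in for the feed import data.
$rows = array(
  array("filename" => "a.xml", "imported" => 1226966400),
  array("filename" => "b.xml", "imported" => 1226880000),
);

// Format the newest timestamp as the Y-m-d date used by <lastmod>.
print "<lastmod>".gmdate("Y-m-d", latestImported($rows))."</lastmod>\n";
```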

Cheers,
David.

Submitted by Jordanus on Tue, 2011-12-20 01:30

Hello David

Is it possible to also add brands to the sitemap?

Thank you
Jordanus

Submitted by support on Tue, 2011-12-20 09:45

Hi Jordanus,

Sure - here's a brand version of the above:

sitemapBrands.php

<?php
  function xmlentities($text)
  {
    $search = array('&','<','>','"','\'');
    $replace = array('&amp;','&lt;','&gt;','&quot;','&apos;');
    $text = str_replace($search,$replace,$text);
    $text = preg_replace('/[^-A-Za-z0-9:\/\. ]/','',$text);
    return $text;
  }
  require("includes/common.php");
  header("Content-Type: text/xml");
  print "<?xml version='1.0' encoding='UTF-8'?>";
  print "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd'>";
  $sql = "SELECT DISTINCT(brand) AS brand FROM `".$config_databaseTablePrefix."products` WHERE brand <> '' LIMIT 50000";
  if (database_querySelect($sql,$rows))
  {
    $sitemapBaseHREF = "http://".$_SERVER["HTTP_HOST"].$config_baseHREF;
    foreach($rows as $row)
    {
      print "<url>";
      if ($config_useRewrite)
      {
        $sitemapHREF = "brand/".tapestry_hyphenate($row["brand"])."/";
      }
      else
      {
        $sitemapHREF = "search.php?q=brand:".urlencode($row["brand"]);
      }
      print "<loc>".xmlentities($sitemapBaseHREF.$sitemapHREF)."</loc>";
      print "</url>";
    }
  }
  print "</urlset>";
?>

Cheers,
David.
--
PriceTapestry.com

Submitted by Jordanus on Tue, 2011-12-20 18:00

Thank you David, it works perfectly!

Submitted by bihmaniak on Sun, 2017-03-26 09:25

Is it possible to add all brand/category pages to the sitemap?

Most of my brands have several pages. I want all of them in the sitemap.

For example:
www.example.com/brand/Example/
www.example.com/brand/Example/2.html
www.example.com/brand/Example/3.html
www.example.com/brand/Example/4.html

Submitted by support on Mon, 2017-03-27 12:31

Hi,

Sure - since there is a limit of 50,000 URLs in a single sitemap, just to be on the safe side I've created alternative versions of the above scripts that work in a similar way to the main sitemap by returning a sitemap index with an entry for each category / brand. The individual category / brand sitemaps then contain all pages as required...

sitemapCategories.php

<?php
  require("includes/common.php");
  header("Content-Type: text/xml");
  print "<?xml version='1.0' encoding='UTF-8'?>";
  $sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
  if (isset($_GET["category"]))
  {
    $category = tapestry_normalise($_GET["category"]);
    print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE category='".database_safe($category)."'";
    database_querySelect($sql,$rows);
    $numResults = $rows[0]["numResults"];
    $numPages = ceil($numResults / $config_resultsPerPage);
    for($page=1;$page<=$numPages;$page++)
    {
      print "<url>";
      $loc = $sitemapBaseHREF.tapestry_indexHREF("category",$category);
      if ($page > 1) $loc .= $page.".html";
      print "<loc><![CDATA[".$loc."]]></loc>";
      print "</url>";
    }
    print "</urlset>";
  }
  else
  {
    print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT DISTINCT(category) FROM `".$config_databaseTablePrefix."products` WHERE category <> '' ORDER BY category";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $row)
      {
        print "<sitemap>";
        $sitemapHREF = $config_baseHREF."sitemapCategories.php?category=".urlencode($row["category"]);
        print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
        print "</sitemap>";
      }
    }
    print "</sitemapindex>";
  }
?>

sitemapBrands.php

<?php
  require("includes/common.php");
  header("Content-Type: text/xml");
  print "<?xml version='1.0' encoding='UTF-8'?>";
  $sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
  if (isset($_GET["brand"]))
  {
    $brand = tapestry_normalise($_GET["brand"]);
    print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE brand='".database_safe($brand)."'";
    database_querySelect($sql,$rows);
    $numResults = $rows[0]["numResults"];
    $numPages = ceil($numResults / $config_resultsPerPage);
    for($page=1;$page<=$numPages;$page++)
    {
      print "<url>";
      $loc = $sitemapBaseHREF.tapestry_indexHREF("brand",$brand);
      if ($page > 1) $loc .= $page.".html";
      print "<loc><![CDATA[".$loc."]]></loc>";
      print "</url>";
    }
    print "</urlset>";
  }
  else
  {
    print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT DISTINCT(brand) FROM `".$config_databaseTablePrefix."products` WHERE brand <> '' ORDER BY brand";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $row)
      {
        print "<sitemap>";
        $sitemapHREF = $config_baseHREF."sitemapBrands.php?brand=".urlencode($row["brand"]);
        print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
        print "</sitemap>";
      }
    }
    print "</sitemapindex>";
  }
?>
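
As a rough illustration of the paging arithmetic both scripts rely on, here is a standalone sketch. sitemapPageLocs() is a hypothetical helper (not part of the distribution), the URL and counts are made up, and it assumes the rewrite-style URLs where page 1 is the bare index URL and later pages append N.html:

```php
<?php
// Hypothetical helper: lists the page URLs for one brand or category index.
// $indexHREF is the first-page URL and is assumed to end in "/".
function sitemapPageLocs($indexHREF, $numResults, $resultsPerPage)
{
  $locs = array();
  // Round up so a partly filled last page still gets its own URL.
  $numPages = (int)ceil($numResults / $resultsPerPage);
  for ($page = 1; $page <= $numPages; $page++)
  {
    $loc = $indexHREF;
    if ($page > 1) $loc .= $page.".html"; // page 1 is the bare index URL
    $locs[] = $loc;
  }
  return $locs;
}

// 35 products at 10 per page gives 4 pages.
foreach (sitemapPageLocs("http://www.example.com/brand/Example/", 35, 10) as $loc)
{
  print "<url><loc><![CDATA[".$loc."]]></loc></url>\n";
}
```

A brand or category with no matching products yields no pages at all, so its sitemap would contain an empty urlset.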

Cheers,
David.

Submitted by bihmaniak on Mon, 2017-03-27 21:06

Hello David,

The sitemapindex works just fine but the brand/category sitemaps do not work.
I am getting an error message when I try to open a brand/category sitemap.

This is the error message
xml parse error : no main element found
Location : http://www.mydomain.com/sitemapBrands.php?brand=Example
Rule number 1 , column 104

This line is written in red under the error message

-------------------------------------------------------------------------------------------------------^

Can you fix this?

Submitted by support on Tue, 2017-03-28 09:49

Hi,

I just checked in case it was a copy / paste or forum code formatting issue, but all looks OK. Could you try with CDATA tags around the location? I've applied the change to the scripts above, at line 19 in each:

      print "<loc><![CDATA[".$loc."]]></loc>";

Hope this helps!

Cheers,
David.

Submitted by bihmaniak on Tue, 2017-03-28 11:58

I am getting the same error message, but I have looked in my error log. This is the error message in the error log.

Got error 'PHP message: PHP Fatal error: Call to undefined function tapestry_indexHREF()

Submitted by support on Tue, 2017-03-28 12:38

Hi,

Ah - sorry about that - tapestry_indexHREF() was added in 15/10A. For earlier distributions, in sitemapCategories.php look for the following code at lines 17 and 18:

      $loc = $sitemapBaseHREF.tapestry_indexHREF("category",$category);
      if ($page > 1) $loc .= $page.".html";

...REPLACE with:

      if ($config_useRewrite)
      {
        $loc = $sitemapBaseHREF.$config_baseHREF."category/".tapestry_hyphenate($category)."/";
        if ($page > 1) $loc .= $page.".html";
      }
      else
      {
        $loc = $sitemapBaseHREF.$config_baseHREF."search.php?q=category:".urlencode($category);
        if ($page > 1) $loc .= "&page=".$page;
      }

And similarly in sitemapBrands.php look for the following code at lines 17 and 18:

      $loc = $sitemapBaseHREF.tapestry_indexHREF("brand",$brand);
      if ($page > 1) $loc .= $page.".html";

...and REPLACE with:

      if ($config_useRewrite)
      {
        $loc = $sitemapBaseHREF.$config_baseHREF."brand/".tapestry_hyphenate($brand)."/";
        if ($page > 1) $loc .= $page.".html";
      }
      else
      {
        $loc = $sitemapBaseHREF.$config_baseHREF."search.php?q=brand:".urlencode($brand);
        if ($page > 1) $loc .= "&page=".$page;
      }

Cheers,
David.

Submitted by bigshopper on Thu, 2018-10-18 12:46

Hi,

I tried to get the Brands URL in lower case in the Brands sitemap.xml, but if I change:
$sitemapHREF = "brand/".tapestry_hyphenate($row["brand"])."/";

to

$sitemapHREF = "brand/".tapestry_hyphenate(strlower($row["brand"]))."/";

I get an error. I thought this would work. What am I doing wrong?

Thanks in advance!

Robert

Submitted by support on Thu, 2018-10-18 12:54

Hi Robert,

You need to use strtolower ("to" missing) - have a go with:

 $sitemapHREF = "brand/".tapestry_hyphenate(strtolower($row["brand"]))."/";

That should be all it is...

Cheers,
David.

Submitted by bigshopper on Thu, 2018-10-18 14:40

Thanks! That works! :)

Submitted by sirmanu on Sun, 2018-12-30 12:02

Hi again David.
Is there a script available to generate the sitemap when $config_useCategoryHierarchy is enabled? The script should only generate the leaves, of course, and only where there is some product mapped into them.

Submitted by support on Mon, 2018-12-31 10:03

Hi,

Sure - here's a version of sitemapCategories.php for Category Hierarchy:

<?php
  require("includes/common.php");
  $categories_hierarchy = array();
  function getCategoryPath($id)
  {
    global $categories_hierarchy;
    $categories = array();
    do {
      array_unshift($categories,$categories_hierarchy[$id]["name"]);
    } while($id = $categories_hierarchy[$id]["parent"]);
    return implode("/",$categories);
  }
  $sql = "SELECT id,name,parent FROM `".$config_databaseTablePrefix."categories_hierarchy` ORDER BY id";
  if (database_querySelect($sql,$rows))
  {
    foreach($rows as $row)
    {
      $categories_hierarchy[$row["id"]] = $row;
    }
  }
  header("Content-Type: text/xml");
  print "<?xml version='1.0' encoding='UTF-8'?>";
  $sitemapBaseHREF = "http".(isset($_SERVER["HTTPS"])&&$_SERVER["HTTPS"]?"s":"")."://".$_SERVER["HTTP_HOST"];
  if (isset($_GET["path"]))
  {
    $nodeInfo = tapestry_categoryHierarchyNodeInfo($_GET["path"]);
    print "<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT COUNT(DISTINCT(name)) AS numResults FROM `".$config_databaseTablePrefix."products` WHERE categoryid='".database_safe($nodeInfo["id"])."'";
    database_querySelect($sql,$rows);
    $numResults = $rows[0]["numResults"];
    $numPages = ceil($numResults / $config_resultsPerPage);
    for($page=1;$page<=$numPages;$page++)
    {
      print "<url>";
      $loc = $sitemapBaseHREF.tapestry_indexHREF("category",$nodeInfo["path"]);
      if ($page > 1) $loc .= $page.".html";
      print "<loc><![CDATA[".$loc."]]></loc>";
      print "</url>";
    }
    print "</urlset>";
  }
  else
  {
    print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $row)
      {
        print "<sitemap>";
        $sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
        print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
        print "</sitemap>";
      }
    }
    print "</sitemapindex>";
  }
?>
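
To show how the path for each index entry is put together, here is a small standalone sketch of the same walk from a node up to its root. The three rows are made-up sample data standing in for the categories_hierarchy table, and categoryPath() is a hypothetical variant of getCategoryPath() that takes the hierarchy as a parameter instead of a global so it can be run on its own. It assumes top-level categories have parent 0:

```php
<?php
// Made-up rows keyed by id, standing in for the categories_hierarchy table.
$hierarchy = array(
  1 => array("id" => 1, "name" => "Electronic", "parent" => 0),
  2 => array("id" => 2, "name" => "TV Video",   "parent" => 1),
  3 => array("id" => 3, "name" => "Monitors",   "parent" => 2),
);

// Hypothetical helper: walks from $id up to the root, prepending each name,
// so the result reads root first.
function categoryPath(array $hierarchy, $id)
{
  $names = array();
  do {
    array_unshift($names, $hierarchy[$id]["name"]);
    $id = $hierarchy[$id]["parent"];
  } while ($id); // a parent of 0 marks a top-level category
  return implode("/", $names);
}

$path = categoryPath($hierarchy, 3);
print $path."\n";
// The index entry passes the path URL-encoded in the query string.
print "sitemapCategories.php?path=".urlencode($path)."\n";
```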

Cheers,
David.

Submitted by sirmanu on Mon, 2019-04-22 08:26

Hi David.

How can I modify the above script to only display the leaf categories in the sitemap?

I mean, for instance, I have, example.com/electronic/tv-video/monitors/

But I only want to include in sitemap monitors, not electronic and tv-video.

Thanks!

Submitted by support on Tue, 2019-04-23 09:57

Hi,

If you only have products at leaf categories, that should already be the case as it stands, because the SELECT query in the <sitemapindex> part of the above script uses the products table to get the categoryid list.

However, if you wanted to force the index to be leaf categories only, then where you have the following code beginning at line 44:

    print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $row)
      {
        print "<sitemap>";
        $sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
        print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
        print "</sitemap>";
      }
    }
    print "</sitemapindex>";

...REPLACE with:

    print "<sitemapindex xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>";
    $sql = "SELECT id FROM `".$config_databaseTablePrefix."categories_hierarchy` WHERE id NOT IN (SELECT DISTINCT(parent) FROM `".$config_databaseTablePrefix."categories_hierarchy`)";
    database_querySelect($sql,$rows);
    $leafIds = array();
    foreach($rows as $row)
    {
      $leafIds[] = $row["id"];
    }
    $sql = "SELECT DISTINCT(categoryid) FROM `".$config_databaseTablePrefix."products` WHERE categoryid > 0 ORDER BY categoryid";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $row)
      {
        if (!in_array($row["categoryid"],$leafIds)) continue;
        print "<sitemap>";
        $sitemapHREF = $config_baseHREF."sitemapCategories.php?path=".urlencode(getCategoryPath($row["categoryid"]));
        print "<loc>".$sitemapBaseHREF.$sitemapHREF."</loc>";
        print "</sitemap>";
      }
    }
    print "</sitemapindex>";
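
The leaf test itself can also be sketched in plain PHP, which may help when checking the result against your own data. This is a standalone illustration only: leafIds() is a hypothetical helper and the rows are made up. A category counts as a leaf when its id never appears as another row's parent:

```php
<?php
// Hypothetical helper: returns the ids that are not the parent of any row.
function leafIds(array $rows)
{
  // Collect every id that is used as a parent.
  $parents = array();
  foreach ($rows as $row) $parents[$row["parent"]] = true;
  // Keep only the ids that were never seen as a parent.
  $leaves = array();
  foreach ($rows as $row)
  {
    if (!isset($parents[$row["id"]])) $leaves[] = $row["id"];
  }
  return $leaves;
}

// Made-up hierarchy: 1 is the root, 2 and 4 sit under 1, and 3 sits under 2.
$rows = array(
  array("id" => 1, "parent" => 0),
  array("id" => 2, "parent" => 1),
  array("id" => 3, "parent" => 2),
  array("id" => 4, "parent" => 1),
);
print implode(",", leafIds($rows))."\n";
```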

Hope this helps!

Cheers,
David.