Tag Archives: Google

Sitemap & Sitemap Index xml files using php and mysql

For some time now I have been generating sitemap xml files for google submission using Microsoft Excel. Always knew that there would be a way to generate it on the fly/on request and never bother with that again. But how?

Documentation is not very clear because you need to use multiple programming languages, and also because the whole xml validation process seems very sensitive to errors. Since i am not an experienced programmer, stack exchange became my guide and i have finally managed to do it.

If you have multiple pages, you probably need to create intermediate xmlindex pages. These are basically specially marked index files that point to other index files. Warning: Nested xmlindex files are not allowed by Google (even though are at the end followed and indexed).

<?php Header('Content-type: text/xml');

require_once('../Connections/localhost.php'); 

if (!function_exists("GetSQLValueString")) {
function GetSQLValueString($theValue, $theType, $theDefinedValue = "", $theNotDefinedValue = "") 
{
 if (PHP_VERSION < 6) {
 $theValue = get_magic_quotes_gpc() ? stripslashes($theValue) : $theValue;
 }

 $theValue = function_exists("mysql_real_escape_string") ? mysql_real_escape_string($theValue) : mysql_escape_string($theValue);

 switch ($theType) {
 case "text":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break; 
 case "long":
 case "int":
 $theValue = ($theValue != "") ? intval($theValue) : "NULL";
 break;
 case "double":
 $theValue = ($theValue != "") ? doubleval($theValue) : "NULL";
 break;
 case "date":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break;
 case "defined":
 $theValue = ($theValue != "") ? $theDefinedValue : $theNotDefinedValue;
 break;
 }
 return $theValue;
}
}

mysql_select_db($database_localhost, $localhost);
$query_Recordset1 = "select * from sitemaps where `use`='sitemap'";
$Recordset1 = mysql_query($query_Recordset1, $localhost) or die(mysql_error());
$row_Recordset1 = mysql_fetch_assoc($Recordset1);
$totalRows_Recordset1 = mysql_num_rows($Recordset1);
$xml=new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
 .'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
 .'</sitemapindex>'); 
do { $url=$xml->addChild('sitemap');
 $website="http://www.tharsitis.gr/".strtolower(str_replace(' ','-',$row_Recordset1['url']));
 $url->addChild('loc',$website);
 "<br/>";
 //$url->addChild('lastmod','XXXXXXXX'); OPTIONAL
 } while ($row_Recordset1 = mysql_fetch_assoc($Recordset1));
mysql_free_result($Recordset1);
print($xml->asXML());
?>

This is the xml sitemap  index code that worked for me. I created the database recordset using Dreamweaver. It retrieves all the xml index urls from Mysql database.

At the second part of the project i needed to generate the sitemap xml files that the xml index would be point at.

This is the code that worked for me:

<?php Header('Content-type: text/xml');

require_once('../Connections/localhost.php'); 

if (!function_exists("GetSQLValueString")) {
function GetSQLValueString($theValue, $theType, $theDefinedValue = "", $theNotDefinedValue = "") 
{
 if (PHP_VERSION < 6) {
 $theValue = get_magic_quotes_gpc() ? stripslashes($theValue) : $theValue;
 }

 $theValue = function_exists("mysql_real_escape_string") ? mysql_real_escape_string($theValue) : mysql_escape_string($theValue);

 switch ($theType) {
 case "text":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break; 
 case "long":
 case "int":
 $theValue = ($theValue != "") ? intval($theValue) : "NULL";
 break;
 case "double":
 $theValue = ($theValue != "") ? doubleval($theValue) : "NULL";
 break;
 case "date":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break;
 case "defined":
 $theValue = ($theValue != "") ? $theDefinedValue : $theNotDefinedValue;
 break;
 }
 return $theValue;
}
}

$rs1_Recordset1 = "-1";
if (isset($_GET['id'])) {
 $rs_Recordset1 = strtolower(str_replace ( '-', ' ',$_GET['id']));
}

mysql_select_db($database_localhost, $localhost);
$query_Recordset1 = sprintf("SELECT packbrand,packsuuid,packcollection,packsname,folder FROM packs where collectionid=%s and visible='1' order by serial asc",GetSQLValueString($rs_Recordset1, "text"));
$Recordset1 = mysql_query($query_Recordset1, $localhost) or die(mysql_error());
$row_Recordset1 = mysql_fetch_assoc($Recordset1);
$totalRows_Recordset1 = mysql_num_rows($Recordset1);
$xml=new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
 .'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">'
 .'</urlset>'); 
do { 
 $brand=$row_Recordset1['packbrand'];
 $collection=$row_Recordset1['packcollection'];
 $url=$xml->addChild('url');
// $image="image:image";
 $website="http://www.tharsitis.gr/".strtolower(str_replace(' ','-',$brand."/".$collection."/".$row_Recordset1['packsuuid']));
 $img="http://www.tharsitis.gr/packagecache/".$row_Recordset1['folder']."/".$row_Recordset1['packsuuid'].".jpg";
 $url->addChild('loc',$website);
 $url->addChild('priority','0.7');
 $image = $url->addChild('image:image', null, 'http://www.google.com/schemas/sitemap-image/1.1');
 $image->addChild('image:loc',$img, 'http://www.google.com/schemas/sitemap-image/1.1');
 $image->addChild('image:caption',$row_Recordset1['packsname'], 'http://www.google.com/schemas/sitemap-image/1.1');
 } while ($row_Recordset1 = mysql_fetch_assoc($Recordset1));
mysql_free_result($Recordset1);
print($xml->asXML());
?>

In both files you can skip the initial part where i build the recordset. This was all done automatically by Dreamweaver. Save the files as php files.
Finally, i needed to modify my .htaccess file so that it translated every php  request matching the above files to xml. This is a requirement to get Google validate the files correctly.

Rewriterule ^xmlindex.xml$ /xml/xmlindex.php [L]
Rewriterule ^p(.*).xml$ /xml/xmlpacks.php?id=$1 [L]

As you can see in the second line of the below htaccess code, I use url variables to generate dynamic xml sitemaps for different sets of urls.  Vuala!

The importance of sitemap.xml

Google Webmaster Tools Menu
Google Webmaster Tools Menu

 

 

 

 

A colleague of mine recently announced to me that he and a couple of his friends  hired a software company to develop a web site for them. Curious as i am i visited the web site directly and started searching inside the code to find out which platform they used, if they have social integration, SEO tags etc. It turned out that none of these was present due the fact that it`s just a start up site, and the budget was not enough to cover for all of this. All ok but at least i think they should have added their newly created website to the webmaster tools of Google and Bing.

According to my opinion even if  it is a start up site, you need to use webmaster tools to let Google and Hotmail know that you are already running, and so that they can start indexing the individual web pages you have. For that reason you need to create a sitemap file and place it on the root of the www. I usually name these files sitemap.xml and either create them by hand or by application modules (like in wordpress for example). Keep in mind that google needs to have an xml file as a sitemap and not a php page that would generate xml formatted text, not being able to find a solution to this yet.

Google Webmaster Tools
Google Webmaster Tools
Bing Web Master Tools Menu
Bing Web Master Tools Menu

The benefits are numerous.  You can check your page in terms of structured data, linked pages, coding errors. You also can see different statistics of your web page such as inbound, outbound links as well as search queries. Google additionally also lets you see the indexing status of your submitted sitemaps as well as some keyword preference based on the submitted urls. Finally, on the relative submenu, you are able to monitor the google crawler as it crawls your web site for errors, and also be able to see how Google “reads” your web page, by using the “Fetch as Google” functionality.

Bing webmaster tools provides matching functionality, combining some “analytics” type data though. e.g SEO Analyzer, Link Explorer, Page Traffic etc.

Closing this, I strongly believe that creating a sitemap and adding it to Bing /Google webmaster tools is one of the first actions a webmaster should do, after going public. These are great tools that guarantee you that the big players in search engines can “see” you good.

Google Web Master Tools

Bing Web Master Tools

SEO Must haves that apply to any web page

Update: Yandex also has a Webmaster Tool for their search engine. Check it out