PHP- Mysql database encoding and question-marks in the text (and how to get rid of them)

For my developing project i use Mysql and Dreamweaver set up with Wamp. I would say that i very familiar with databases and how to handle data etc..

Prerequisites to this project was that i could not modify either my php.ini or mysql.ini, since i am on a shared hosting plan.

One issue I had and could not fix though was that i got question-marks  for special characters like quotes or Cyrillic or French characters using ascent. I did work on this many hours but with no luck. Today i finally have managed to solve it. Here is how:

Since all my databases already have data and are online, i created copies of them using the correct charset. So:

1. I Exported all the tables in my databases using mysqldump and recreated them using utf8  character set and utf8_unicode_ci table/column collation. This is essential to make sure all data are saved in the correct format

2. I made sure that all my pages had the following two meta tags in the head tag:

<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

3. Finally, and that is what was missing all this time, is to make sure that all communication with the database is done through UTF. Apparently this does not happen automatically when you set the php encoding and the database collation/encoding. To do so, every time you query your database,  you need to specify the communication  encoding format.

mysql_select_db($database_localhost, $localhost);

$query_collections = sprintf("Select  * from table WHERE packbrand=%s and serial='1' AND VISIBLE='1' order by collection asc", GetSQLValueString($colname_collections, "text"));

mysql_query("set names 'utf8'");

This  last part basically fixed it.

What did not work for me:

  • Trying to figure our whether your text editor (like notepad++) is wrongly converting your query to a different encoding.
  •  Pass the set names command in mysql workbench. While typing this i am just now realizing that this would not make any difference on way or an other, because it is the retrieval of the data in PHP that caused issues and not the displaying or the typing in..

One down many to go. Move to next project.

Sitemap & Sitemap Index xml files using php and mysql

For some time now I have been generating sitemap xml files for google submission using Microsoft Excel. Always knew that there would be a way to generate it on the fly/on request and never bother with that again. But how?

Documentation is not very clear because you need to use multiple programming languages, and also because the whole xml validation process seems very sensitive to errors. Since i am not an experienced programmer, stack exchange became my guide and i have finally managed to do it.

If you have multiple pages, you probably need to create intermediate xmlindex pages. These are basically specially marked index files that point to other index files. Warning: Nested xmlindex files are not allowed by Google (even though are at the end followed and indexed).

<?php Header('Content-type: text/xml');

require_once('../Connections/localhost.php'); 

if (!function_exists("GetSQLValueString")) {
function GetSQLValueString($theValue, $theType, $theDefinedValue = "", $theNotDefinedValue = "") 
{
 if (PHP_VERSION < 6) {
 $theValue = get_magic_quotes_gpc() ? stripslashes($theValue) : $theValue;
 }

 $theValue = function_exists("mysql_real_escape_string") ? mysql_real_escape_string($theValue) : mysql_escape_string($theValue);

 switch ($theType) {
 case "text":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break; 
 case "long":
 case "int":
 $theValue = ($theValue != "") ? intval($theValue) : "NULL";
 break;
 case "double":
 $theValue = ($theValue != "") ? doubleval($theValue) : "NULL";
 break;
 case "date":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break;
 case "defined":
 $theValue = ($theValue != "") ? $theDefinedValue : $theNotDefinedValue;
 break;
 }
 return $theValue;
}
}

mysql_select_db($database_localhost, $localhost);
$query_Recordset1 = "select * from sitemaps where `use`='sitemap'";
$Recordset1 = mysql_query($query_Recordset1, $localhost) or die(mysql_error());
$row_Recordset1 = mysql_fetch_assoc($Recordset1);
$totalRows_Recordset1 = mysql_num_rows($Recordset1);
$xml=new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
 .'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
 .'</sitemapindex>'); 
do { $url=$xml->addChild('sitemap');
 $website="http://www.tharsitis.gr/".strtolower(str_replace(' ','-',$row_Recordset1['url']));
 $url->addChild('loc',$website);
 "<br/>";
 //$url->addChild('lastmod','XXXXXXXX'); OPTIONAL
 } while ($row_Recordset1 = mysql_fetch_assoc($Recordset1));
mysql_free_result($Recordset1);
print($xml->asXML());
?>

This is the xml sitemap  index code that worked for me. I created the database recordset using Dreamweaver. It retrieves all the xml index urls from Mysql database.

At the second part of the project i needed to generate the sitemap xml files that the xml index would be point at.

This is the code that worked for me:

<?php Header('Content-type: text/xml');

require_once('../Connections/localhost.php'); 

if (!function_exists("GetSQLValueString")) {
function GetSQLValueString($theValue, $theType, $theDefinedValue = "", $theNotDefinedValue = "") 
{
 if (PHP_VERSION < 6) {
 $theValue = get_magic_quotes_gpc() ? stripslashes($theValue) : $theValue;
 }

 $theValue = function_exists("mysql_real_escape_string") ? mysql_real_escape_string($theValue) : mysql_escape_string($theValue);

 switch ($theType) {
 case "text":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break; 
 case "long":
 case "int":
 $theValue = ($theValue != "") ? intval($theValue) : "NULL";
 break;
 case "double":
 $theValue = ($theValue != "") ? doubleval($theValue) : "NULL";
 break;
 case "date":
 $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
 break;
 case "defined":
 $theValue = ($theValue != "") ? $theDefinedValue : $theNotDefinedValue;
 break;
 }
 return $theValue;
}
}

$rs1_Recordset1 = "-1";
if (isset($_GET['id'])) {
 $rs_Recordset1 = strtolower(str_replace ( '-', ' ',$_GET['id']));
}

mysql_select_db($database_localhost, $localhost);
$query_Recordset1 = sprintf("SELECT packbrand,packsuuid,packcollection,packsname,folder FROM packs where collectionid=%s and visible='1' order by serial asc",GetSQLValueString($rs_Recordset1, "text"));
$Recordset1 = mysql_query($query_Recordset1, $localhost) or die(mysql_error());
$row_Recordset1 = mysql_fetch_assoc($Recordset1);
$totalRows_Recordset1 = mysql_num_rows($Recordset1);
$xml=new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
 .'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">'
 .'</urlset>'); 
do { 
 $brand=$row_Recordset1['packbrand'];
 $collection=$row_Recordset1['packcollection'];
 $url=$xml->addChild('url');
// $image="image:image";
 $website="http://www.tharsitis.gr/".strtolower(str_replace(' ','-',$brand."/".$collection."/".$row_Recordset1['packsuuid']));
 $img="http://www.tharsitis.gr/packagecache/".$row_Recordset1['folder']."/".$row_Recordset1['packsuuid'].".jpg";
 $url->addChild('loc',$website);
 $url->addChild('priority','0.7');
 $image = $url->addChild('image:image', null, 'http://www.google.com/schemas/sitemap-image/1.1');
 $image->addChild('image:loc',$img, 'http://www.google.com/schemas/sitemap-image/1.1');
 $image->addChild('image:caption',$row_Recordset1['packsname'], 'http://www.google.com/schemas/sitemap-image/1.1');
 } while ($row_Recordset1 = mysql_fetch_assoc($Recordset1));
mysql_free_result($Recordset1);
print($xml->asXML());
?>

In both files you can skip the initial part where i build the recordset. This was all done automatically by Dreamweaver. Save the files as php files.
Finally, i needed to modify my .htaccess file so that it translated every php  request matching the above files to xml. This is a requirement to get Google validate the files correctly.

Rewriterule ^xmlindex.xml$ /xml/xmlindex.php [L]
Rewriterule ^p(.*).xml$ /xml/xmlpacks.php?id=$1 [L]

As you can see in the second line of the below htaccess code, I use url variables to generate dynamic xml sitemaps for different sets of urls.  Vuala!