Archive for the ‘PHP’ Category

There’s a ton of documentation available if you want to do template handling in PHP. This article only documents the simple approach I use myself. I’m not going to enter the arena by arguing that PHP is itself a template language and so on; that’s just plain boring.

So what’s the intention here? The objective is to have a plain HTML file and replace content at the places where you want PHP-driven output to show. In and of itself, though, the HTML file is just that: plain HTML that includes CSS and JS where necessary.

My standard HTML file is shown below:

<!DOCTYPE html>
<html>
<head>
	<title>{title}</title>
	<meta http-equiv="content-type" content="text/html; charset=utf-8" />
	<meta http-equiv="content-language" content="{language}" />
	<meta name="author" content="M.E. Post" />
	<meta name="copyright" content="Copyright (c) M.E. Post 2008" />
	<link rel="stylesheet" href="{includepath}/css/include.css" type="text/css" media="screen" />
	<script type="text/javascript">
		var path = '{includepath}';
	</script>
	<script type="text/javascript" src="{includepath}/js/jquery-1.3.2.min.js"></script>
	<script type="text/javascript" src="{includepath}/js/include.js"></script>
</head>

<body>
	<div id="rap">
	  <div id="headwrap">
		  <div id="header">
			  <a href="{path}/">{title}</a>
		  </div>
		  <div id="desc">
			  <a href="{path}/">{subtitle}</a>
		  </div>
	  </div>
	  <div id="content">
		  <div class="storycontent">
		    {replace_content}
		  </div>
    </div>
  </div>
</body>
</html>

As you can see it’s a very minimal file, and there are some elements in there like {includepath} and {replace_content} which are not regular HTML. These are the placeholders where content will be replaced.

The function below does the replacing. The content is passed in through the variable $content; if $content is empty the function returns FALSE and stops. It then checks the static variable $template to see whether the template has already been loaded: if it’s empty the template file is loaded, otherwise the previously loaded template is reused. All the template placeholders are replaced in a loop using mb_ereg_replace to keep the text Unicode compliant. The merged template is returned as the output of the function. Items like PATH et al. are constants that are defined beforehand; you can take them out or add them to the function call if you want.

/**
* Merge the page template with the content
*
* @param string $content
* @return string|false The merged page, or FALSE when $content is empty
*/
function mergeContentWithTemplate($content = '') {
	if (empty($content)) {
		return FALSE;
	}
	/* Static keyword is used to ensure the file is loaded only once */
	static $template = NULL;
	/* If the template has not been loaded yet, load the template file */
	if (is_null($template)) {
		$template_file = dirname(__FILE__) . '/../html/template.html';
		$template = file_get_contents($template_file);
	}
	/* Work on a copy so the cached template keeps its placeholders */
	$output = $template;
	mb_regex_encoding('utf-8');
	$pattern = array('{path}', '{includepath}', '{language}', '{title}', '{subtitle}', '{replace_content}');
	$replacement = array(PATH, INCLUDE_PATH, LANGUAGE, TITLE, SUBTITLE, $content);
	$pattern_size = sizeof($pattern);
	for ($i = 0; $i < $pattern_size; $i++) {
		$output = mb_ereg_replace($pattern[$i], $replacement[$i], $output);
	}
	return $output;
}
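
As a variation on the same idea, here is a minimal sketch that takes the template as a string and the values as an array, so it can be tried without any files or constants. The function name renderTemplate and the sample values are made up for this example, and it swaps mb_ereg_replace for strtr(), which compares bytes and is therefore also safe for UTF-8 text.

```php
<?php
/**
 * Hypothetical variant: replace {placeholder} tokens in a template string.
 * strtr() works on bytes, so UTF-8 text passes through untouched.
 */
function renderTemplate($template, array $values) {
    $map = array();
    foreach ($values as $name => $value) {
        // Turn 'title' into the placeholder '{title}'
        $map['{' . $name . '}'] = $value;
    }
    return strtr($template, $map);
}

// Sample template and values, invented for illustration
$template = '<title>{title}</title><a href="{path}/">{subtitle}</a>';
$html = renderTemplate($template, array(
    'title'    => 'Lilliput',
    'subtitle' => 'Grüße',
    'path'     => '/blog',
));
echo $html, "\n";
```

Because strtr() does all replacements in one pass, content that happens to contain something like {title} is not replaced a second time.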

So that’s my simple little template thingy, hope it is of some use to you.

Now that’s a big claim, but I can assure you it’s true for all three aspects. It doesn’t even require heavy customisation, and the approach is based on standard plugins available on the WordPress plugin site. However, like everything, there’s a trade-off with the approach, and in this case it’s the loss of flexibility and dynamic behaviour. This isn’t an issue with static websites, but if you’re running a blog then this solution isn’t for you (stuff like comments won’t work, as this requires connectivity and feedback from WordPress). It’s up to you to decide whether my approach has merits for your use case. I offer no guarantees other than that I have applied the approach below to my own systems and for me it works. It’s very rough around the edges: I have been hacking some files and I haven’t rolled my changes into a nice shrink-wrapped form. Enough with the disclaimers, let’s get going with an actual explanation of what I’m offering.

Security

WordPress suffers from the same problem as almost every Content Management System (CMS): it has a unified code base for both content publication and content management. With WordPress (and similar systems) it is therefore possible to hack the content management system through the content publication system. The content publication system is the part of the CMS that generates the pages when a visitor hits the site. By its very nature it is an open interface to the outside world and can therefore be hacked. Because it shares code with the content management side, an attack on the content publication system can inevitably compromise the CMS as well. These hacks occur time and again and are endemic to the shared code approach, so they will never go away.

The only way of ensuring your CMS is not hacked through your content publication system is by separating the two. Separation in a physical (code) sense is possible but requires a huge amount of effort and in effect means a different version of WordPress through a fork. This is not what I want to achieve: I have limited time and I can’t maintain my own version of WordPress and keep up with all the new functionality that the WordPress team cranks out all the time.

Therefore I mean separation in a logical sense, and I achieve this through the use of WP Super Cache. WP Super Cache turns your WordPress site/blog into a collection of static pages and uses a .htaccess mod_rewrite approach to serve visitors the static pages. It also has an option to serve page components like JS, CSS and images from a Content Delivery Network (CDN). My approach to separating the CMS from content publication is to turn the WP Super Cache cache (pardon the pun) into its own virtual host in Apache and serve content in its static form from that virtual host. My visitors don’t need to access the WordPress installation to get to the content; the CMS and the content publication are logically separated. There are a couple of tricks required for getting this up and running and I’ll explain these later in this post.

Speed

The approach of moving your page components into a CDN is well known and relatively straightforward to achieve with solutions like WP Super Cache or W3 Total Cache. Going one step further and moving your entire site, including your HTML, is a little less usual, but that is what I have achieved. My test site (not this one), based on the standard twentyten theme, now loads in 1.223 seconds, of which 0.252 seconds is spent on DNS lookups. The HTML and all other page components are served through Amazon CloudFront using Origin Pull (but any other CDN can do the same; there is no CloudFront-specific trickery involved).

How it works

There are a couple of code changes involved, plus some Apache and DNS configuration changes. What you need:

  • LAMP platform and WordPress. I used the most recent version of WordPress (3.1.2) at the time of writing. Hosting is done on Amazon EC2 with a CentOS 5.6 based system
  • WP Super Cache plugin installed
  • A CDN, I used Amazon Cloudfront
  • Access to DNS for setting CNAME records

I’m assuming you have a functioning LAMP server. The following steps need to be executed:

  • Create a virtual host in Apache for the WordPress site
  • Install WordPress and WP Super Cache plugin
  • Configure the WP Super Cache plugin
  • Code hacks to the WP Super Cache plugin
  • Set up your CDN
  • Configure your DNS
  • Test

We’re going to put the WordPress site in a directory called “wordpress” located in /var/www/html (CentOS/Fedora default) and create a special virtual host called cms.example.com:

<VirtualHost *:80>
ServerName cms.example.com
ServerAdmin admin@example.com
DocumentRoot /var/www/html/wordpress
LogLevel info
ErrorLog logs/error_log
TransferLog logs/access_log
</VirtualHost>

Install WordPress in the /var/www/html/wordpress directory and configure it with the cms.example.com home/site URL. Check that the installation completed successfully and that you can access the admin interface at http://cms.example.com/wp-admin/. Install the WP Super Cache plugin as explained in its documentation.

Configure the WP Super Cache plugin as follows:

  • Advanced settings:
    • Cache hits to this website for quick access
    • Use PHP to serve cache files
    • 304 Not Modified browser caching. Indicate when a page has not been modified since last requested
    • Cache rebuild. Serve a supercache file to anonymous users while a new file is being generated
  • CDN settings:
    • Enable CDN Support
    • Off-site URL: http://cdn.example.com (where example.com is your own domain)
  • Preload settings:
    • Preload mode (garbage collection only on legacy cache files)

Create a new directory in your webroot, e.g. “cache”:

mkdir /var/www/html/cache

Set this up as a new virtual host in Apache, let’s call this new site cache.example.com:

<VirtualHost *:80>
ServerName cache.example.com
ServerAdmin admin@example.com
DocumentRoot /var/www/html/cache/supercache/cms.example.com
ErrorLog logs/error_log
TransferLog logs/access_log
</VirtualHost>

Restart Apache to activate the new virtual hosts. Copy the wp-content/themes/[theme-name] folder over to your cache directory (/var/www/html/cache/supercache/cms.example.com), but only the CSS, JS and image files. You don’t need to copy the PHP files, as only the web page resources are required. The same applies to the wp-includes directory if your theme uses JavaScript files from its js subdirectory. Check whether the pages come up OK when you access http://cache.example.com. If they do you’re fine; if not, troubleshoot the issue, e.g. by looking at the Apache logs/error_log file.

After this we need to do some code wrangling; it’s going to be ugly but small, and we need the absolute path of the directory that we just created. Navigate to the plugin directory of your WordPress installation and enter the wp-super-cache directory. Open the file “wp-cache-phase1.php” and at the top of the file, just after the include( WPCACHEHOME . 'wp-cache-base.php'); instruction, add the $cache_path line so that it reads:

include( WPCACHEHOME . 'wp-cache-base.php');
$cache_path = "/var/www/html/cache/";

Save the file and open file “wp-cache-phase2.php”. At the top of the file, just after

$cache_path = "/var/www/html/cache/";

In the same file look for the function wp_cache_get_ob(&$buffer) and in this function look for this sequence (around line 504):

 } else {
                $buffer = apply_filters( 'wpsupercache_buffer', $buffer );
                // Append WP Super Cache or Live page comment tag
                wp_cache_append_tag($buffer);

After this sequence add:

$buffer = str_replace("http://cms.example.com", "http://www.example.com", $buffer);

The reason for this is that WP Super Cache generates pages based on its own site/home URL (cms.example.com) and we need to replace this URL with the actual site URL (www.example.com). Hence the clumsy find and replace whilst the pages are generated by the Preload section of the WP Super Cache plugin. I’m sure it can be done more nicely, but I’m just proving a concept, not winning prizes for clean code.
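
To see what that find and replace does to a generated page, here is a small stand-alone sketch. The helper name rewriteSiteUrl and the sample markup are invented for this example; the plugin hack itself is just the single str_replace line.

```php
<?php
/**
 * Hypothetical helper: swap the CMS host for the public host in a page buffer.
 */
function rewriteSiteUrl($buffer, $cmsUrl, $publicUrl) {
    // Plain string replacement: every occurrence is changed, wherever it appears
    return str_replace($cmsUrl, $publicUrl, $buffer);
}

// Sample markup, invented for illustration
$page = '<a href="http://cms.example.com/about/">About</a>'
      . '<img src="http://cms.example.com/wp-content/logo.png" />';

echo rewriteSiteUrl($page, 'http://cms.example.com', 'http://www.example.com'), "\n";
```

Keep in mind that this is a blunt instrument: it also rewrites the URL if it shows up in the visible text of a post, and it only matches the exact scheme and host you pass in.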

Set up your CDN so that it has two Distribution Points / Pull Zones, or whatever your CDN provider calls them. One should be listening to www.example.com and have cache.example.com as its origin server, and the other should be listening to cdn.example.com and also have cache.example.com as its origin server. Note the CNAME records the CDN generates for you; let’s assume the following:

  • xyz.cloudfront.net –> www.example.com
  • abc.cloudfront.net –> cdn.example.com

Go to your DNS setup and set up the following changes:

  • Have the www subdomain (I’m assuming you already have this set up otherwise create a www CNAME record) refer to xyz.cloudfront.net
  • Create a CNAME record for cdn.example.com and have this point at abc.cloudfront.net

Apply the DNS changes and wait for them to propagate. If a dig on www.example.com and cdn.example.com returns something like this, you should be OK:

www.example.com.         3044   IN CNAME  xyz.cloudfront.net.
xyz.cloudfront.net.      60     IN CNAME  xyz.ams1.cloudfront.net.
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.28
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.54
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.64
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.115
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.207
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.216
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.220
xyz.ams1.cloudfront.net. 60     IN A      216.137.59.254

Access your site at http://www.example.com/ and see if it’s working. If so, start doing your performance tests and do some investigation with HTTP analysis tooling like HTTP Fox.

After you’ve established that everything works fine you can make cms.example.com accessible only to yourself or your content editors. There is no real-time dependency on WordPress anymore, and the installation can be used purely for content management activities.

After I’ve been working on code for a while I get a bit blind to the nice things I’ve accomplished and tend to focus on what I’m not happy with. Let’s make this posting about something simple I’m happy with and which looks very nice: declaring and verifying constants. I’m a big fan of constants (not so much of magic constants, but that’s a different story) and I use them frequently in my code. One thing that’s always important is to check whether you’ve already set the constant, otherwise you get a warning/error depending on the strictness setting of your error reporting. So here’s a nice way to check whether a constant is already set and define it only if it isn’t:

defined('LANGUAGE') or define('LANGUAGE', 'en-us');

If that ain’t a thing of beauty I don’t know what is :-)
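
To show why this works, here is a tiny sketch with the line used twice. The value 'nl-nl' is just a sample; the point is that the `or` short-circuits, so the second define is never reached once the constant exists.

```php
<?php
// The first line defines the constant because it does not exist yet.
defined('LANGUAGE') or define('LANGUAGE', 'nl-nl');
// Here defined('LANGUAGE') is already true, so define() is skipped
// and no "constant already defined" message is raised.
defined('LANGUAGE') or define('LANGUAGE', 'en-us');

echo LANGUAGE, "\n"; // prints "nl-nl"
```

That makes the idiom handy for defaults: a configuration file that is included earlier can set its own value, and the fallback line further down simply does nothing.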

There’s quite a number of ways to approach localization (L10N) in PHP. Typing in “simple localization php” in Google yields an impressive amount of results. I won’t go into a heavy theoretical approach in what the best way is or that you need to use gettext or any such approach. I just want to present the function(s) I’m using and leave it up to you to decide whether you like the approach.

First some constants that I use. You can rip them out if you don’t like them, they’re not essential, I just like using constants for this type of work.

/**
* Define the url path for the resources
*/
defined('INCLUDE_PATH') or define('INCLUDE_PATH', '/include');

/**
* Define the language using language code based on BCP 47 + RFC 4646,
* http://www.rfc-editor.org/rfc/bcp/bcp47.txt
*
* The language files can be found in directory 'lang'
*/
defined('LANGUAGE') or define('LANGUAGE', 'en-us');

The constants are used in the code below, replace them with your own values or approach where necessary. What happens is that a language file is loaded based on the configured language. A static array $translations is used to ensure that the language file is loaded once and every other subsequent call is handled through the $translations array in memory rather than reloading the language file. The language file is constructed in JSON format and when the language file is loaded into the $translations array the PHP json_decode function is used to convert from JSON to PHP associative array format. When a call is made to the function a language phrase is passed to the function and the matching value is found in the $translations array by using the language phrase as a key for the associative array.

Please ensure the .txt files are utf-8 encoded (without BOM), otherwise the json PHP functions will not operate correctly (see comment DanyBoy below).

/**
* Load the proper language file and return the translated phrase
*
* The language file is JSON encoded and returns an associative array
* Language filename is determined by BCP 47 + RFC 4646
* http://www.rfc-editor.org/rfc/bcp/bcp47.txt
*
* @param string $phrase The phrase that needs to be translated
* @return string
*/
function localize($phrase) {
    /* Static keyword is used to ensure the file is loaded only once */
    static $translations = NULL;
    /* If the translations have not been loaded yet, load the language file */
    if (is_null($translations)) {
        $lang_file = INCLUDE_PATH . '/lang/' . LANGUAGE . '.txt';
        if (!file_exists($lang_file)) {
            $lang_file = INCLUDE_PATH . '/lang/' . 'en-us.txt';
        }
        $lang_file_content = file_get_contents($lang_file);
        /* Load the language file as a JSON object and transform it into an associative array */
        $translations = json_decode($lang_file_content, true);
        /* json_decode returns NULL on invalid JSON; fall back to an empty array */
        if (!is_array($translations)) {
            $translations = array();
        }
    }
    /* Fall back to the phrase itself when no translation is available */
    return isset($translations[$phrase]) ? $translations[$phrase] : $phrase;
}

An excerpt of the US English language file (in JSON format):

{
    "lang":"en-us",
    "No":"No",
    "Yes":"Yes",
    "or":"or",
    "Do you require help":"Do you require help"
}

An excerpt of the German language file:

{
    "lang":"de",
    "No":"Nein",
    "Yes":"Ja",
    "or":"oder",
    "Do you require help":"Brauchen Sie Hilfe"
}

Edit : Please make sure the language files are saved in UTF-8 format as this is the default encoding for the json_decode function used in the code above.
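
If you can’t be sure how a language file was saved, a defensive decode step helps. The sketch below is only an illustration: the helper name decodeLanguageJson is made up, and it works on a string so it can be tried without any files. It strips a leading UTF-8 BOM (which json_decode does not accept) and falls back to an empty array when decoding fails.

```php
<?php
/**
 * Hypothetical helper: decode the contents of a language file into an array.
 */
function decodeLanguageJson($json) {
    // Remove a leading UTF-8 byte order mark, if present
    if (substr($json, 0, 3) === "\xEF\xBB\xBF") {
        $json = substr($json, 3);
    }
    $translations = json_decode($json, true);
    if (!is_array($translations)) {
        // json_last_error() can be inspected here to find out why decoding failed
        return array();
    }
    return $translations;
}

// Sample content with a BOM in front, invented for illustration
$withBom = "\xEF\xBB\xBF" . '{"Yes":"Ja","No":"Nein"}';
$translations = decodeLanguageJson($withBom);
echo $translations['Yes'], "\n"; // prints "Ja"
```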

An example of its usage:

print localize('Do you require help') . ' ' . localize('Yes') . ' ' . localize('or') . ' ' . localize('No');

An example of usage in some of my own code:

$create_page_array = array(
    'status_message' => $status_message,
    'table_caption' => localize('Delete Operation'),
    'table_explanation' => localize('This command deletes an operation'),
    'table_content' => $table_content,
    'checkbox' => 'operation_name',
    'table_sort' => '[[1,0]]'
);

The advantages to me are:

  1. Very easy integration into your code
  2. The language phrases remain recognizable in your own code
  3. The language files can be easily customised because the original phrase is part of the translation
  4. Language file needs to be loaded only once
  5. The language file is coded in an open standard (JSON) and can be processed in other ways that you currently don’t foresee

A Front End Controller is part of an MVC pattern.

The controller receives input and initiates a response by making calls on model objects. An MVC application may be a collection of model/view/controller triplets, each responsible for a different UI element. MVC is often seen in web applications where the view is the HTML or XHTML generated by the app. The controller receives GET or POST input and decides what to do with it, handing over to domain objects (i.e. the model) that contain the business rules and know how to carry out specific tasks such as processing a new subscription.
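
As a rough sketch of that idea, the snippet below maps a request method and route to a handler. Everything in it (the dispatch function, the route names, the handlers and their return strings) is invented for illustration and is not taken from any particular framework.

```php
<?php
/**
 * Hypothetical front controller: pick a handler for the request and return its output.
 */
function dispatch($method, $route, array $input) {
    $handlers = array(
        'GET subscribe'  => function ($input) {
            return 'Show subscription form';
        },
        'POST subscribe' => function ($input) {
            // A real controller would hand over to a model object here
            return 'Subscribed ' . $input['email'];
        },
    );
    $key = strtoupper($method) . ' ' . $route;
    if (!isset($handlers[$key])) {
        return '404 Not Found';
    }
    return call_user_func($handlers[$key], $input);
}

echo dispatch('POST', 'subscribe', array('email' => 'reader@example.com')), "\n";
```

In a real application the method and input would come from $_SERVER['REQUEST_METHOD'], $_GET and $_POST rather than being passed in by hand.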

Continue Reading

A good friend of mine asked if it was possible to log out of a Basic Authentication session. My knee-jerk response was that Basic Authentication has no log out function and that you should close the browser to safely log out of the session. After a few days of silence he came back with a script he’d found on the php.net site. The script used sessions to break the Basic Authentication behavior of the browser. It wasn’t a very successful script because it only worked in a limited set of browsers, but it got me thinking about a better solution.

Continue Reading

I finally succumbed to ease of use and switched from my bespoke PivotLog installation to WordPress. I thoroughly enjoyed Pivot, but when switching from Textdrive to Amazon EC2 I had to change and migrate so many things that I settled for the easier solution: WordPress.

Continue Reading

Looking at the code in the previous entry wasn’t exactly a pleasant aesthetic experience (sorry for that, bit of a botched job) so for my new project, an implementation of the NIST RBAC model in PHP, I decided to code a nice generic PHP query engine. The Query Engine takes a number of arguments like the SQL query, the arguments for the query (to be passed into prepared statements), the types of the arguments and whether the query is part of an overall transaction. The nice thing is that the QueryEngine function returns the results as an associative array using the database column names as the key value.

Continue Reading

In the process of developing Lilliput CMS I had to think about how to do templating with PHP. There’s a lot of material available regarding PHP and templating and most of it is really weird. Having had a look at the Top 25 PHP template engines I can’t for the life of me understand why I would want to use something like Smarty, Savant or phptal. Obviously a lot of love and attention has been poured into these solutions but I can’t escape the feeling that these template engines are recreating PHP and its innate templating function. This feeling was confirmed when reading the “Templates and template engines” article on the php patterns website.

Continue Reading

My work for the Lilliput CMS has led to some interesting new findings about PHP 5’s new DOM functions, particularly the XPath part of it. Whilst working on the templating structure for Lilliput I couldn’t get the DOM XPath queries to work on the file at hand, a normal XHTML 1.0 Strict document. With an external tool called XPath Explorer (XPE) every query evaluated correctly, but as soon as I tried the same XPath expression in PHP it failed. After searching long and hard I came across a code snippet that contained the solution, namely to explicitly declare the XHTML namespace for the DOM document you’re working on:
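
The snippet itself is not part of this excerpt. As a minimal sketch of the general technique, assuming PHP’s DOM extension is available, the example below uses DOMXPath::registerNamespace to bind a prefix to the XHTML namespace. The prefix "x" and the sample document are arbitrary choices made for this illustration.

```php
<?php
// Sample XHTML document, invented for illustration
$xhtml = '<html xmlns="http://www.w3.org/1999/xhtml">'
       . '<head><title>Lilliput</title></head>'
       . '<body><div id="content"><p>Hello</p></div></body></html>';

$doc = new DOMDocument();
$doc->loadXML($xhtml);

$xpath = new DOMXPath($doc);

// Without a prefix this matches nothing: the elements live in the XHTML
// namespace and XPath 1.0 has no notion of a default namespace.
$plain = $xpath->query('//div[@id="content"]/p');

// Bind the prefix "x" to the XHTML namespace URI and use it in the query.
$xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query('//x:div[@id="content"]/x:p');

echo $plain->length, ' ', $prefixed->length, "\n"; // prints "0 1"
```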

Continue Reading