Category: Programming

What is the best PHP accelerator to use?

Posted by – February 1, 2009

Well let me go ahead and tell you. Of course, this is all just my opinion and your mileage may vary.

First, I will discuss the role of PHP accelerators (Opcode cache) in server tuning and scalability briefly. These tools will not enable your server to handle much more traffic in most scenarios. In some, the additional overhead of the PHP caching will even cause more load, others gain a marginal improvement by getting requests served a bit more quickly and having fewer concurrent connections. But they will not significantly raise your concurrent user limitations, as in a LAMP stack your bottleneck is usually at the database.

The way these PHP accelerators work is by caching the compiled bytecode of your human-readable PHP. Normally, your PHP code is compiled and then executed at runtime, but these tools cache the compiled code, saving the expense of compiling it again and thus generally saving you a bit of CPU at the cost of some increased memory usage.
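As a rough illustration of that idea, here is a toy analogy in JavaScript. It is not how APC or any real accelerator is implemented; it only shows the shape of the trade: keep the compiled form in memory, keyed by file path and modification time, and skip the compile step when nothing has changed. The names createCompileCache, load, and compileSource are made up for this sketch.

```javascript
// Toy analogy of an opcode cache. All names here are hypothetical.
function createCompileCache(compile) {
  var cache = {};                       // path -> { mtime, compiled }
  var stats = { hits: 0, misses: 0 };

  return {
    stats: stats,
    // Return the compiled form of a script, compiling only when the file
    // is new or its modification time has changed.
    load: function (path, source, mtime) {
      var entry = cache[path];
      if (entry && entry.mtime === mtime) {
        stats.hits++;
        return entry.compiled;
      }
      stats.misses++;
      var compiled = compile(source);   // the expensive step we want to skip
      cache[path] = { mtime: mtime, compiled: compiled };
      return compiled;
    }
  };
}

// Stand-in "compiler": turns source text into a callable function.
function compileSource(source) {
  return new Function(source);
}

var opcodeCache = createCompileCache(compileSource);
var first = opcodeCache.load('index.php', 'return 1 + 1;', 100);   // compiled
var second = opcodeCache.load('index.php', 'return 1 + 1;', 100);  // served from the cache
var third = opcodeCache.load('index.php', 'return 2 + 2;', 200);   // file changed, recompiled
```

Every entry stays in memory until it is replaced, which is the CPU-for-memory trade described above.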

So what PHP acceleration can do is make your PHP execute more quickly, often in roughly half the time. But it’s important to understand just what it’s accelerating, because your PHP execution is typically not what most influences the perception of speed to the user. To the user it’s a combination of page generation time, network latency, and page rendering time. These PHP caching tools may influence the page generation time but, as I’ve already said, the database is usually the key there: it’s both the bottleneck for concurrent users and the bulk of the page generation time in typical setups. To make the biggest difference there you need good database design and data object caching with something like memcached, but here we’ll go over the options to improve your PHP execution times.

Alternative PHP Cache – http://pecl.php.net/package/APC

eAccelerator – http://eaccelerator.net/

XCache – http://xcache.lighttpd.net/

Zend Platform – http://www.zend.com/products/platform

ionCube PHP Accelerator – http://www.php-accelerator.co.uk/

Turck MMCache – http://turck-mmcache.sourceforge.net/index_old.html

On the able2know Q&A site we have used Zend and Turck MMCache in the past, with favorable results, but I am on a scalability and performance crusade here, and wanted to pick out the best of the current crop. Turck MMCache has not been actively developed for a while now, and is not a viable option for us to use in production, so that one’s out. I did the research into a variety of benchmarks (see chart), and with few exceptions the main competitors perform close enough to each other to make the performance differences less of a deal-killer in a selection between them. Simply put, the marginal performance gains you may achieve by selecting one over the other may be outweighed by differences in things like price, or how actively the code is being developed.

So I narrowed the selection down to XCache, APC and Zend. Zend does much more than PHP caching, and may be the right choice for others but they are a proprietary option that you must pay for, and that doesn’t justify the cost difference through performance gain as a PHP accelerator. XCache and APC are developed by well known programmers in the open source world, being the developers of lighttpd and PHP itself respectively. But between the two I am opting to use APC on able2know, as it is a pecl extension maintained by the maintainers of PHP itself (including the creator of PHP) and is reportedly going to become a core part of PHP 6.

So at least for me:

The best PHP accelerator to use is APC.

PHP Namespaces

Posted by – October 26, 2008

I have been seeing a lot of complaining lately regarding PHP namespaces, and I thought I would chime in with my (often opposing) views. First off, let me explain the issue.

The new version of PHP will have a new feature, called namespaces. (I wrote about it in my post “PHP 5.3 Feature Preview”.) This is a great feature, and one that the community as a whole is excited about. So what is the problem, then?

Well initially PHP was going to use the standard “::” syntax to invoke the namespace. For example:


namespace Foo;
   function bar() {
      echo "Namespace Foo";
   }
   Foo::bar();

However, this became a problem for the parsing engine, as it is the same way to call a static function.

class Foo{
   static function bar(){
      echo "Class Foo";
   }
}
Foo::bar();

So instead, PHP changed the syntax to a backslash. So now, in the first example, you have ‘Foo\bar();’ while in the second, you still have ‘Foo::bar();’. Seems reasonable to me (and the PHP core members) but not to some.

For example, Ninh complains on his blog that if you put the invocation in double quotes, it will interpret things like “\t” as a tab, and that you have to use 2 backslashes. I really don’t see this being a problem. First off, why on earth would you use double quotes? There are no variables or single quotes being used in the string, so one should be using single quotes anyways. That is just good programming practice.

Even if you do use double quotes, there is a tried and true, standard solution to escape the backslash character. It is so ubiquitous that Ninh didn’t have to learn how to use it; he already knew about it, yet is still complaining. His problem with this method? “It looks like crap”. I’m sorry, I thought we were using logic to code web applications, not painting a picture. And even at that, why does “::” look awesome, but “\\” look like crap? I don’t get it.

My favorite quote is from the very end of his post. He claims “Last time I checked, the world wasn’t filled with scrawny developers that would come crying to their mommies after getting their first facepalm of ‘AmbiguousInvocationError’.” And yet he is crying to his mommy (or rather, the blogosphere) because if he uses double quotes (which he doesn’t have to) then he has to escape the backslash character (a standard practice) and that’s “ugly”.

So what are your thoughts? Maybe I’m crazy, and this is a huge deal. I just don’t see it.

Google adds ‘Suggest’ feature to homepage

Posted by – August 30, 2008

Google Suggest is a handy feature that displays a list of likely search terms as you are typing them. This feature is useful for several reasons:

  1. It often stops you from having to type your entire search phrase in
  2. It’s a quick way to check the spelling on a word
  3. It gives you an idea of a good search phrase related to what you’re after

This feature has been around in Google Labs for a while. A form of it is also used in the search bar of Firefox (although the search bar only displays phrases, and not the number of results each will return). So the feature itself isn’t new, but making it the default on the homepage for all users (even guests) is a big deal.

I think it is a good move. Of course they want their homepage light and responsive, but I believe the functionality is worth the extra page load. Besides, an Ajax search like this really doesn’t take much javascript when done right. In fact, I recently implemented a similar feature for searching tags on able2know.

I implemented it without the help of javascript libraries, although I do have a javascript file I’ve written that contains a small set of useful tools, such as targeting elements, sending ajax requests, or validating json data. Using these helper functions, my search code becomes this simple:


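var tagSearch = {
	// Cache the input and result elements, then listen for keystrokes.
	init : function(){
		tagSearch.e = $('tagSearch');
		tagSearch.resultBox = $('tagSearchResults');
		tagSearch.original = tagSearch.resultBox.innerHTML;
		tagSearch.e.onkeyup = function(e){
			// Ignore keys that do not change the value (arrows, shift, etc.)
			if(tagSearch.e.value != tagSearch.lastValue){
				tagSearch.lastValue = tagSearch.e.value;
				if(tagSearch.e.value.length > 1){
					// Encode the term so characters like & or # cannot break the query string
					Ajax.get('/rpc/tag/search/?s=' + encodeURIComponent(tagSearch.lastValue), null, tagSearch.update);
				}
				else if(tagSearch.e.value.length == 0){
					// Field cleared: restore the original contents
					tagSearch.resultBox.innerHTML = tagSearch.original;
				}
			}
		};
	},
	// Ajax callback: swap in the html returned by the server.
	update : function(r){
		var data = JSON.parse(r.responseText);
		if(data.html){
			tagSearch.resultBox.innerHTML = data.html;
		}
	}
};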

On the php side of things, I am returning html to keep things simple. Pretty easy, eh?

Yahoo launches Fire Eagle

Posted by – August 13, 2008

Yahoo launched Fire Eagle yesterday, a web service that lets users input their location for use by other web applications. Web developers can use its API to create applications that then use this information to provide location-based services to the user.

The user can input their location through a variety of methods, with the most antiquated being the entry of their location on the Fire Eagle website or even, in the hyper Web 2.0 world, SMS. However, it also allows phone-based applications to broadcast the user’s location to the web service, which allows for real-time uses of local data that open up a lot more application possibilities.

The user has control over what location details are broadcast, but privacy advocates are sure to cringe at the encroaching of the smart cloud that knows more about you, as the initial uses for this are largely related to commercial opportunities in your proximity.

Are you imagining an ad network that serves ads relevant to where your laptop or phone currently is? I am and it’s a frickin’ “Starbucks on the right” banner that I think could wring a few more bucks a day out of those caffeine junkies.

PHP 5.3 Feature Preview

Posted by – August 13, 2008

Earlier this month the PHP development team released an alpha version of the PHP platform (Read the Official Announcement). Of course there are many bug fixes, improvements, and new features. Here is a quick breakdown of the new features I am most excited for:

  • Namespaces – Finally! No longer do you have to worry about class, function, or constant names in the global scope colliding with those of other code libraries you may be using. A namespace essentially gives you your own personal global scope, which you get to define. This means if you create a class called ‘user’, you can use someone else’s codebase even if they have a ‘user’ class as well.  Other languages such as C++ have had this feature for years, so it’s great to finally have it available in PHP. This may convince some that PHP is a viable solution for larger, enterprise-level codebases.
  • Late Static Binding – this is for those hardcore OOP coders. What this allows you to do is reference the class a static method was actually called on, from within a method that the class only has via inheritance. It’s a confusing concept, but can prove useful in some situations. For a better example, see the Late Static Bindings Manual (procedural programmers need not apply).
  • Lambda Functions and Closures – Lambda functions are essentially anonymous, throw away functions. They can be useful if you want to use a simple function when you are already inside a function. Without Lambda functions, it would need to be defined elsewhere. However in some cases this can be hard to follow when reading the code, and seems wasteful when you only need to use the function once. A perfect use-case for Lambda functions is when they are defined for callbacks for other functions, such as array_walk(), or preg_replace_callback(). With Lambda functions, you can assign the function to a variable. Javascript programmers will recognize these as anonymous functions, as they are something many javascript programmers use heavily.
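
For anyone who has not met these in javascript, here is a small sketch of the two ideas from the last bullet. It is written in javascript rather than PHP 5.3 syntax, and makeTaxer and the sample numbers are made up purely for illustration.

```javascript
// An anonymous function passed directly as a callback.
var prices = [5, 12, 8];
var doubled = prices.map(function (price) {
  return price * 2;
});

// A closure: the returned function remembers `rate` after makeTaxer returns.
function makeTaxer(rate) {
  return function (amount) {
    return amount + amount * rate;
  };
}
var addTax = makeTaxer(0.5);
var total = addTax(10);
```

The callback handed to map never needs a name of its own, and addTax carries its rate around with it, which is what makes these handy for one-off callbacks.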

These are all very welcome additions, and I can’t wait till the stable version is out. The current roadmap estimates a stable version will be available by mid-October. Kudos, PHP Development team, keep up the good work!

Google releases Keyczar open source cryptographic toolkit

Posted by – August 11, 2008

Google has announced the release of their open source cryptographic toolkit Keyczar. It is a toolkit that Google claims (I have no experience with it yet) will make it easier to do cryptography right.

Keyczar’s key versioning system makes it easy to rotate and revoke keys, without worrying about backward compatibility or making any changes to source code.

Visualizations of Sorting Algorithms

Posted by – August 6, 2008

I’m traveling today, so here’s a quick post on an old site I found interesting. It’s a visualization of various sorting algorithms that helps illustrate the efficiencies and overall speed differences between them. To start the sorting simulations, click on each of them to activate the animation.

There aren’t a lot of surprises here and you’ll see the predictable results: the parallel sorting algorithms generally outperform the sequential sorting algorithms, but it’s a useful demonstration that can better hammer home the difference in your mind.

Check them out here and try even more here.

Yahoo launches the Yahoo Music API

Posted by – August 4, 2008

Yahoo announced their Yahoo Music API today, releasing another tool in their open strategy of providing services to web developers. Their music API allows other websites to tap into some of the Yahoo Music content. Their API can be used for catalog data, such as searching Yahoo Music by artist, or getting charts like the new releases or popular music or for user data, like recommendations for the user. The user data requires Yahoo’s browser-based authentication while the catalog data does not.

Right now, the rate limit is 5,000 queries a day, which is too low for me to consider it worth building on yet but hopefully they’ll announce a paid service for commercial sites that want to build something serious on it.

Check out the API and apply for an API key here or have a look at their example application (Facebook app, requiring Facebook login).

You can’t sort and scale

Posted by – July 31, 2008

If you really want to scale you are going to have to come to terms with a basic fact. You can’t sort and scale. Ok, now that I have your attention through overstatement let me apply the requisite nuance. Sure, you can sort. But you can’t sort deep into a stack efficiently.

Now if you are working on a smaller scale where you aren’t pushing the limits of a relational database you’ll never know this. Your expensive sorts will still be fast enough with your small datasets. And when they grow, the first pages will still be fast enough as long as you have decent database indexing. But there’s a specific pattern that will show you the wall: “The deep page in a long sort”.

I made that jewel of jargon up just now, and since it makes precious little sense give me a chance to elaborate. When you sort a large dataset, your database will perform well early in the result set. However further into the set the performance degrades because it needs to calculate all previous positions to give you the next results.

So say you are selecting from a table with 5 million records and sorting by date. To get the first 10 results is easy, and your server doesn’t break a sweat. Ok, now get the last 10 results in that sort. Your server has to sort 4,999,990 results before it gets to the offset and the result is a slow query or the inability to complete the query at all.

And that’s just the way it is. That’s one reason why Google doesn’t show you all the pages in your search. They limit you to about 1,000 results. Go ahead and try to find the last page ranked on Google for a term like, “computers” and you’ll see what I mean.

Now there are ways around it, and those of you in the Google-can-do-anything crowd can simmer down now. The real reason they don’t do so is because their results are of decreasing relevance and therefore of little utility for their average user. But if a mere genius ;-P like myself can figure it out I’m sure they know all the workarounds as well.

To work around it, you need to select a smaller subset of the data using WHERE and sort this smaller dataset with your ORDER BY and offset/limit. So for example, when developing custom community software for able2know (coming soon, I hope) where threads can get big (e.g. 75,000 posts) we amortized this sort expense over the writes. We did this by calculating the post position within a thread and storing it on the post table with the post information instead of relying on a sort off the date. Without it, the last pages of threads would have to sort all previous posts to know what to display, but this way we know that if we display 10 posts per page then we should query for positions 41-50 on page 5 (counting positions from 1), for example, instead of querying the whole dataset and sorting it all to find out what should be displayed.

When a user posts, we can do an inexpensive check for the last position and calculate the position of the new post, and by storing it then we prevent ourselves from doing expensive sorts. So if you want to sort a huge dataset, save the positions or identifiers and filter first.
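
The arithmetic behind that is small enough to sketch. This is only an illustration in javascript, not the able2know code: the posts table, the thread_id and position columns, and all of the function names are assumptions, and positions are counted from 1.

```javascript
// Hypothetical helpers; the table and column names are assumptions.

// Positions (counted from 1) that belong on a given page.
function positionRange(page, perPage) {
  var first = (page - 1) * perPage + 1;
  return { first: first, last: first + perPage - 1 };
}

// Rows an OFFSET-based query has to sort through and discard before it
// reaches the page.
function rowsSkippedByOffset(page, perPage) {
  return (page - 1) * perPage;
}

// Position for a new post, given the highest position already stored
// in the thread (0 for an empty thread).
function nextPosition(lastPosition) {
  return lastPosition + 1;
}

// Build a parameterized query that filters by stored position.
function pageQuery(threadId, page, perPage) {
  var range = positionRange(page, perPage);
  return {
    sql: 'SELECT * FROM posts WHERE thread_id = ? AND position BETWEEN ? AND ? ORDER BY position',
    params: [threadId, range.first, range.last]
  };
}

var lastPage = pageQuery(42, 7500, 10);   // last page of a 75,000-post thread
```

With an index covering thread and position, the database only has to touch the ten rows in the range, where the offset version of the same page would have to walk past 74,990 rows first.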

In a nutshell for the SQL crowd: use WHERE not ORDER BY to sort and scale.

Convert UTC dates to local timezone offset automatically

Posted by – September 11, 2007

So I was working on an application that printed out lots of dates to the user. However, I get traffic from all over the world, and as this is kind of time-sensitive information, I wanted to display the time in their own timezone. I have done this before on things like forums, where you allow the user to select their timezone when they register. But what if your users don’t register?

I decided to build a javascript function that would automatically convert from UTC time to the user’s local timezone. This is a pretty simple task, and should work very effectively. I already store the dates in my database in UTC, which is a good idea so that there is never any confusion later as to what timezone a date is in (What if you and/or your server moves?). Also, because this is a javascript function, we have the ability to determine the user’s local time – something that we normally wouldn’t have.

Without further ado, here is the code:


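var dateFunction = {
  // Format a date in the visitor's local timezone as M-D-YYYY.
  convertDate : function (gmtDate){
    var originalDate = new Date(gmtDate);
    var retStr = (originalDate.getMonth()+1) + "-" + originalDate.getDate() + "-" + originalDate.getFullYear();
    return retStr;
  },
  init : function(){
    var elems = YAHOO.util.Dom.getElementsByClassName('utcDate');
    // The listing was cut off after "for(var i=0;i"; the rest is reconstructed
    // from the description below and may differ from the original.
    for(var i=0;i<elems.length;i++){
      elems[i].innerHTML = dateFunction.convertDate(elems[i].innerHTML);
    }
  }
};
YAHOO.util.Event.onDOMReady(dateFunction.init);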

As you can see, I am using the YUI library to do some DOM parsing, and unobtrusively add some event listeners. The code should be pretty straightforward: when the page loads I look for all elements with a class of "utcDate" and pass the innerHTML into a separate function. The other function creates a new date object, using the original UTC date as the time. By doing this, javascript automatically will display this new date in the user's local time. I can then return the date in whatever format I want. Finally, I replace the innerHTML of the element with the new date string, and we are all set.

This provides a simple way to convert dates to local timezone offset, and all that is required is for you to wrap any date you want converted in an element with a class equal to "utcDate". It also degrades gracefully, because without javascript users will simply see the date in UTC format, which isn't really a bad thing, and probably what you would be showing them anyways, if you didn't have this nifty script!
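
One thing to watch: how new Date() interprets a string depends on the string's format and on the browser, so a value that is not clearly marked as UTC may be read as local time. Below is a minimal sketch of a stricter variant, assuming the dates are printed as "YYYY-MM-DD HH:MM:SS" (that format is my assumption, as are the names parseUtc, formatLocal, and utcTextToLocal). It has no YUI dependency, so it can be tried on its own.

```javascript
// Parse "YYYY-MM-DD HH:MM:SS" (assumed storage format) as a UTC time.
function parseUtc(text) {
  var m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(text);
  if (!m) {
    return null; // leave unrecognized text alone
  }
  return new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
}

// Format a Date in the visitor's local timezone as M-D-YYYY.
function formatLocal(date) {
  return (date.getMonth() + 1) + '-' + date.getDate() + '-' + date.getFullYear();
}

// Convert one stored value; fall back to the original text if it does not parse.
function utcTextToLocal(text) {
  var date = parseUtc(text);
  return date ? formatLocal(date) : text;
}
```

Falling back to the original text keeps the same graceful degradation: anything the script cannot parse is simply left showing the UTC value.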