Yahoo launches Fire Eagle

Posted by – August 13, 2008

Yahoo launched Fire Eagle yesterday, a web service that lets users input their location for use by other web applications. Web developers can use its API to create applications that then use this information to provide location-based services to the user.

The user can input their location through a variety of methods, with the most antiquated being manual entry on the Fire Eagle website or even, in this hyper Web 2.0 world, SMS. However, it also lets phone-based applications broadcast the user’s location to the web service, which enables real-time uses of location data and opens up a lot more application possibilities.
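
For developers, pushing a user’s location up to Fire Eagle is, at its core, an authorized HTTP call to the web service. Here’s a rough PHP sketch of the shape of that call; the endpoint URL and parameter names below are placeholders of mine rather than the documented API, and a real integration has to go through Fire Eagle’s OAuth authorization flow first, so treat this strictly as an illustration.

    <?php
    // Rough illustration only: the URL and parameter names are placeholders,
    // not the documented Fire Eagle API, and real requests must be OAuth-signed.
    $endpoint = 'https://fireeagle.example.com/api/update';
    $params   = array(
        'q'           => 'Seattle, WA',        // free-text location, as a user might type it
        'oauth_token' => 'your-access-token',  // obtained via the OAuth dance
    );

    $ch = curl_init($endpoint);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($params));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);

    // A real client would parse the response and handle failures here.
    echo $response;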

The user has control over which location details are broadcast, but privacy advocates are sure to cringe at the encroachment of a smart cloud that knows ever more about you, since the initial uses for this are largely related to commercial opportunities in your proximity.

Are you imagining an ad network that serves ads relevant to where your laptop or phone currently is? I am and it’s a frickin’ “Starbucks on the right” banner that I think could wring a few more bucks a day out of those caffeine junkies.

PHP 5.3 Feature Preview

Posted by – August 13, 2008

Earlier this month the PHP development team released an alpha version of PHP 5.3 (Read the Official Announcement). Of course there are many bug fixes, improvements, and new features. Here is a quick breakdown of the new features I am most excited about, with a short code sketch after the list:

  • Namespaces – Finally! No longer do you have to worry about class, function, or constant names in the global scope interfering with other code libraries you may be using. A namespace essentially gives you your own private global scope, which you get to define. This means that if you create a class called ‘user’, you can use someone else’s codebase even if it has a ‘user’ class as well. Other languages such as C++ have had this feature for years, so it’s great to finally have it available in PHP. This may convince some that PHP is a viable solution for larger, enterprise-level codebases.
  • Late Static Binding – this is for the hardcore OOP coders. What it lets you do is reference the class that was actually called at runtime (via the new static:: keyword) from a method that is only defined in a parent class. It’s a confusing concept, but it can prove useful in some situations. For a better example, see the Late Static Bindings Manual (procedural programmers need not apply).
  • Lambda Functions and Closures – Lambda functions are essentially anonymous, throwaway functions. They are useful when you want a simple function while you are already inside another function; without lambda functions, it would have to be defined elsewhere, which can be harder to follow when reading the code and feels wasteful when the function is only used once. A perfect use-case for lambda functions is as callbacks for other functions such as array_walk() or preg_replace_callback(). You can also assign a lambda function to a variable. JavaScript programmers will recognize these as anonymous functions, something many of them use heavily.
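
To make those a bit more concrete, here is a short sketch of all three features in one file. Keep in mind this is based on an alpha, so details (the namespace syntax in particular is still being debated) may shift before the stable release; the class and variable names are just mine for illustration.

    <?php
    namespace MyApp;

    // Namespaces: our own User class can no longer collide with
    // a 'User' class from some other library in the global scope.
    class User {
        public $name = 'app user';
    }

    // Late static binding: static:: (and "new static") resolve to the
    // class the call was made on, not the class where the method lives.
    class Model {
        public static function create() {
            return new static(); // returns a Post when called as Post::create()
        }
    }

    class Post extends Model {}

    $post = Post::create();
    echo get_class($post), "\n"; // "MyApp\Post", not "MyApp\Model"

    // Closures: anonymous, throwaway functions, handy as callbacks.
    $taxRate = 0.08;
    $prices  = array(10, 20, 30);

    // The closure captures $taxRate from the enclosing scope via "use".
    $withTax = array_map(function ($price) use ($taxRate) {
        return $price * (1 + $taxRate);
    }, $prices);

    print_r($withTax); // 10.8, 21.6, 32.4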

These are all very welcome additions, and I can’t wait till the stable version is out. The current roadmap estimates a stable version will be available by mid-October. Kudos, PHP Development team, keep up the good work!

Comscore announces explosive growth in social networking websites

Posted by – August 12, 2008

Not to be outdone by the other web metrics companies releasing their traffic analysis articles in the last few days, Comscore announced today that “Social Networking Explodes Worldwide”, citing 25% growth worldwide in the last year.

Naturally, the growth is largely coming from outside North America, whose 9% growth rate for the year is the lowest globally; emerging internet markets were responsible for most of it.

Additionally, they released information about specific sites, showing MySpace nearly stalled at 3% growth, with Facebook and Hi5 the fastest-growing social networks at the moment. Also of interest to the social media marketer: they show Facebook as having overtaken MySpace in monthly uniques. But then again, only a blind social media marketer wouldn’t have noticed Facebook’s meteoric rise.

Google crosses 70% search market share in US

Posted by – August 12, 2008

Research firms Nielsen Online and Hitwise have released traffic information that shows strong Google growth in the US market. Google’s audience across its various web properties grew by a million from June to July, to 129 million (Google has the largest audience in the US through properties like Google.com and YouTube.com). But even more damning for its rivals (namely Microsoft and Yahoo), and arguably for the webmaster community, is the information Hitwise released today about the most coveted traffic of all: search traffic. Hitwise announced that Google has crossed the 70% mark (they cite 70.77%), which is up 10% from July of last year and 2% from June of this year. Here’s the rest of the search landscape that Google dominates:

  • Google – 70.77%
  • Yahoo – 18.65%
  • MSN/Live – 5.36%
  • Ask – 3.53%

Hitwise’s announcement also includes a graph of the breakdown.

Google releases Keyczar open source cryptographic toolkit

Posted by – August 11, 2008

Google has announced the release of Keyczar, their open source cryptographic toolkit. Google claims (I have no experience with it yet) that it will make it easier to do cryptography right.

Keyczar’s key versioning system makes it easy to rotate and revoke keys, without worrying about backward compatibility or making any changes to source code.

Setting up an Agile Webmaster workstation

Posted by – August 10, 2008

My last Windows XP workstation died on me and I am replacing it with a new Vista box. I decided to take note of the programs I found myself needing so that in the future I can set a machine up more quickly, installing software before I need it rather than scrambling to install it when I do.

A few days into my use of this workstation here are the things I’ve already installed.

Browsers

  • Firefox 3. Nick found where my keyword search feature was hidden, so I’m happy with this as my default browser. I’ll cover the various extensions I needed in another post.
  • Opera – Latest version only
  • Safari – Latest version only

Together with IE 7, IE 6, and Firefox 2, these are the desktop browsers that an Agile Webmaster should support.

Version Control System

  • Code Co-op – “Abstract through a standard protocol” is a mantra of mine, and one I used frequently in the Relisoft forums when trying to convince them to use POP/SMTP as the transport for their p2p version control system, Code Co-op. There are a lot of options out there, but I like this one for its relatively modest network topology needs. The p2p approach brings some latency-related conflicts and a bit of complication, but I like how decentralized it is.

Whatever you choose to use is fine if it works. But don’t code without a Version Control System.

Text Editor

  • Edit Pad Pro – Text editors are very personal things for the agile webmaster. The hardcore geeks tend to be almost religious about their choice, and it’s no wonder: we spend much of our day in a text editor. Edit Pad Pro is my choice, thanks to its integration with Jan Goyvaerts’ other great tools, such as RegexBuddy and PowerGrep, which I also use.

Note: one of the Agile Webmasters, Steve Levithan, is writing a Regex book with Jan that I’d be all over if I didn’t know Steve and couldn’t just ask him my occasional regex question myself.

Graphics

  • Photoshop – Ok, being on the more textual end of the development spectrum, I got away with Paint for a few days but then installed an old copy of Photoshop CS. No link here because I honestly think most webmasters can use a host of other tools that are perfectly fine for web design without the Photoshop complexity and price tag. Back when I did more of this stuff I liked Macromedia Fireworks (now Adobe Fireworks), but you shouldn’t take too many graphics recommendations from me.

File Utilities, Networking, Sync and Backup

  • Syncback SE – A very useful tool for a variety of things ranging from deployment to backup. The on-change synchronization feature is the glue of our PHP IDE (I’ll tell you about it one day) around here.
  • 7-Zip – Because an agile webmaster cannot live by Windows’ built-in Zip support alone but needeth other compression algorithms and stuff.
  • SSH Secure Shell 3.2 – Yes, I know of PuTTY (which is what I use whenever I’m on a guest computer) and just about every other SSH client. But this one’s my choice for its file transfer GUI (much like an FTP client). It used to be available as a free download here, but now you have to search the web for it (it’s free for non-commercial use). Here, I’ll save you some time.

Last but not least, I installed Microsoft Office 2003. I use browser-based applications (Google’s, mostly) for most of my needs, but because many of the corporations I work with don’t, I often need this suite around to open their files. In addition, the PIM world is still far behind the Exchange/Outlook powerhouse. Apple’s MobileMe is promising, but I still use Outlook as my contacts backup and syncing center.

I haven’t even begun to scratch the surface, and I will continue to write up my agile development environment in future articles as I run into more cases for tools I need to install.

Live Search’s Webmaster Center comes out of BETA

Posted by – August 9, 2008

Microsoft’s Live Search Webmaster Center came out of BETA today, with new features showing backlinks and crawl errors. Most agile webmasters would not have had much use for the crawl error tool, as they often have all the data they need in their own server logs, but backlink metrics are very useful to SEO efforts, and the more data you can get about your backlinks the better.

Until now, Yahoo’s Site Explorer has been the most useful tool, with the most accurate backlink data, and it’s nice to see more transparency from the search engines.

Google’s Adsense Algorithm

Posted by – August 8, 2008

Google’s search algorithms get a lot of play, but not enough people are paying attention to the fact that Google’s contextual ad network still enjoys a technical superiority over its peers that it has long since lost in search relevancy. Simply put, some search competitors are doing a decent job with search relevancy but still seem to be nowhere when it comes to serving relevant ads.

I’d like to share some of my thoughts on the Adsense algorithm, which I will revisit in detail in the future. Given the secrecy of the sauce, I will not try to prove what is and what is not the Google Adsense algorithm; instead I will take the approach any SEO worth his salt should, and speculate as to what the algorithm should be and what current technology is capable of.

At its simplest level I believe the algorithm should, and likely does, work like this:

  1. Attempt to determine page context and serve a contextually relevant ad
  2. Use clickstream data to determine what the user might be interested in and serve an ad that may not be contextually relevant.
  3. Use basic demographic data (e.g. geolocation) to attempt to target ad relevance to the user.

The premise is simple: the context of the page is a strong indication of what the user will click on, and it is the first priority of the algorithm. You may know that the user was interested in other, potentially more profitable, subjects, but the fact that the user is on that page right now is a fairly good indication of what they are interested in at that particular moment.

But that isn’t always the case, and clickstream data can help identify what the user is really interested in. For example, the user’s previous searches can indicate what is really meant by the query “apple”, but even more immediately relevant is that Google often knows where you were right before you got to the page. And with increasing frequency, that was Google itself.

This is the single biggest reason that clickstream data must be a part of the Google algorithm. It’s much easier to determine context from a user-input query; that’s why other search engines are starting to compete with Google in relevance on most queries. If Google knows what the user searched for before clicking on this page, they have a variable that rivals page context in relevance to the user. If they know the user searched for “buy a green widget in San Diego” and landed on a general page about green widgets, they would be foolish not to use the additional context they know about the user (the location-specific subset they are looking for) in their attempt to serve the ad the user is most likely to click.

The “session context”, as I have decided (as of a moment ago) to call it, would be weighed heavily alongside page context in my algo, and historic clickstream data would follow at a distance. If you know the user is always looking for certain widgets and you don’t have a great ad for the page or session context, then an ad about what the user has expressed past interest in is the next best thing.

Google has a lot of clickstream data from their own web properties, and from other sites through Adsense itself and the free log analytics service they provide to webmasters in exchange for the data. For example, they could know what you searched for on Yahoo when you land on a page with their own ads or log tracking, and it’s precisely such examples that they can use to their benefit. Search history presents the easiest contextualization opportunities because the user has given the context in a string. Other clickstream data requires a lot more guesswork, and for these reasons I think that Google should, and does, focus mainly on search-related clickstream data. Given my read on their corporate culture, I’m not sure they are doing this outside of their own web properties, as in my Yahoo example, but they should, and I can’t imagine that they don’t for their own search engine.

Lastly, you can throw in anything else you know about the user. In a simple example, you have the IP address and can map it to geodata, say to show the user an ad for a local restaurant. And you can even get fancy and use aggregate trends (e.g. people searching in a certain area might be in town for a certain reason; come up with your own specifics) and other logical deductions (i.e. “wild guesses”, like assuming that someone searching in English from Mexico might be interested in a hotel). I think focusing your efforts is a big part of the end result of this kind of work, and I believe that if Google uses any of this fallback data, they do it simply. Why spend time on a complicated algorithm to generate poor guesses when you can spend more time nailing the real priorities like page context?
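
To pull those pieces together, here is a deliberately toy PHP sketch of the fallback order I’m describing. The function, the weights, and the data structures are all mine, purely for illustration; this is my speculation about the priorities, not a claim about Google’s actual code.

    <?php
    // Toy sketch of the fallback order described above. The function, the weights,
    // and the data structures are illustrative only (my speculation, not Google's code).

    function choose_ad(array $pageKeywords, $sessionQuery, $userGeo, array $inventory) {
        $best = null;
        $bestScore = 0;

        foreach ($inventory as $ad) {
            $score = 0;

            // 1. Page context: the strongest signal, weighted highest.
            $score += 3 * count(array_intersect($pageKeywords, $ad['keywords']));

            // 2. Session context: what the user searched for right before landing here.
            if ($sessionQuery !== null) {
                foreach ($ad['keywords'] as $keyword) {
                    if (stripos($sessionQuery, $keyword) !== false) {
                        $score += 2;
                    }
                }
            }

            // 3. Demographic / geo fallback: a weak tie-breaker, kept simple.
            if ($userGeo !== null && !empty($ad['geo']) && $ad['geo'] === $userGeo) {
                $score += 1;
            }

            if ($score > $bestScore) {
                $bestScore = $score;
                $best      = $ad;
            }
        }

        return $best; // null means: fall back to a run-of-network ad
    }

    // The "green widgets" example from above: a general page, reached via a local search.
    $inventory = array(
        array('name' => 'Widget Warehouse',      'keywords' => array('widget'), 'geo' => null),
        array('name' => 'San Diego Widget Shop', 'keywords' => array('widget', 'green widget'), 'geo' => 'San Diego'),
    );

    $ad = choose_ad(array('widget', 'green widget'), 'buy a green widget in San Diego', 'San Diego', $inventory);
    echo $ad['name']; // "San Diego Widget Shop"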

In another post, I’ll break down the on-page context algorithm possibilities but I’m out of time for today.

Google begins to integrate DoubleClick into their content network

Posted by – August 7, 2008

Google announced a number of new options for advertisers in the content network that will have a big impact on AdWords advertisers and AdSense publishers, as Google begins to integrate its DoubleClick acquisition into its existing ad networks.

Google completed the acquisition of the display advertising giant on March 11th, 2008, with the aim of bolstering its display advertising presence on the web. With the overwhelming majority of Google’s revenue coming from text advertising, DoubleClick’s multimedia strengths were deemed a good fit, to the tune of a $3.1 billion cash acquisition offer on April 14th, 2007. Earlier this year the regulatory hurdles were cleared, and Google’s advertisers are beginning to see the end result of the merging of these ad platforms. The additional options may compel companies who were wary of the Google content network, with its legion of AdSense publishers, to give it a new try. Here are the options Google announced:

  • Frequency Capping: Enables advertisers to control the number of times a user sees an ad. Users will have a better experience on Google content network sites because they will no longer see the same ad over and over again.
  • Frequency Reporting: Provides insight into the number of people who have seen an ad campaign, and how many times, on average, people are seeing these ads.
  • Improved Ads Quality: Brings performance improvements within the Google content network.
  • View-Through Conversions: Enables advertisers to gain insights on how many users visited their sites after seeing an ad. This helps advertisers determine the best places to advertise so users will see more relevant ads.

Visualizations of Sorting Algorithms

Posted by – August 6, 2008

I’m traveling today, so here’s a quick post on an old site I found interesting. It’s a visualization of various sorting algorithms that helps illustrate the efficiency and overall speed differences between them. To start the sorting simulations, click on each of them to activate the animation.

There aren’t a lot of surprises here, and you’ll see the predictable results: the parallel sorting algorithms generally outperform the sequential ones, but it’s a useful demonstration that can better hammer home the differences in your mind.

Check them out here and try even more here.