Setting up an Agile Webmaster workstation

Posted by – August 10, 2008

My last Windows XP workstation died on me and I am replacing it with a new Vista box. I decided to take note of the programs I found myself needing so that in the future I can set up a workstation more quickly, installing everything up front instead of stopping to install each program the first time I need it.

A few days into my use of this workstation here are the things I’ve already installed.


  • Firefox 3. Nick found where my keyword search feature was hidden so I’m happy with this as my default browser. I’ll post about the various extensions I needed in another post.
  • Opera – Latest version only
  • Safari – Latest version only

Together with IE 7, IE 6 and Firefox 2, these are the desktop browsers that an Agile Webmaster should support.

Version Control System

  • Code Co-op – “Abstract through a standard protocol” is a mantra of mine, and one I used frequently in the Relisoft forums when trying to convince them to use POP/SMTP as the transport for their p2p Version Control System, Code Co-op. There are a lot of options out there for this but I like this one for its relatively modest network topology needs. With the p2p system comes conflict from latency and some complication, but I like how decentralized this is.

Whatever you choose to use is fine if it works. But don’t code without a Version Control System.

Text Editor

  • EditPad Pro – Text editors are very personal things for the agile webmaster. The hardcore geeks tend to be almost religious about their choice, and it’s no wonder: we spend much of our day in a text editor. EditPad Pro is my choice because of its integration with Jan Goyvaerts’ other great tools, such as RegexBuddy and PowerGREP, that I also use.

Note: one of the Agile Webmasters, Steve Levithan, is writing a Regex book with Jan that I’d be all over if I didn’t know Steve and couldn’t just ask him my occasional regex question myself.


  • Photoshop – Ok, being on a more textual end of the development spectrum, I got away with Paint for a few days but then installed an old copy of Photoshop CS. No link here because I honestly think most webmasters can use a host of other tools that are perfectly fine for web design without the Photoshop complexity and price tag. Back when I did more of this stuff I liked Macromedia Fireworks (now Adobe Fireworks), but you shouldn’t take too many graphics recommendations from me.

File Utilities, Networking, Sync and Backup

  • Syncback SE – A very useful tool for a variety of things ranging from deployment to backup. The on-change synchronization feature is the glue to our PHP IDE (I’ll tell you about it one day) around here.
  • 7-Zip – Because the agile webmaster cannot live by Windows’ built-in Zip support alone but needeth other compression algorithms and stuff.
  • SSH Secure Shell 3.2 – Yes, I know of PuTTY (which I use whenever I’m on a guest computer) and just about any other SSH client. But this one’s my choice for the file transfer GUI (like FTP clients). It used to be available as a free download here but now you have to search the web for it (it’s free for non-commercial use). Here, I’ll save you some time.

Last but not least, I installed Microsoft Office 2003. I use browser-based applications (Google mostly) for most of my needs, but because many of the corporations I work with don’t, I often need this suite around to open their files. In addition, the PIM world is still far behind the Exchange/Outlook powerhouse. Apple’s MobileMe is promising but I still use Outlook as my contacts backup and syncing center.

I haven’t even begun to scratch the surface, and will continue to write up my agile development environment in future articles as I run into more cases for tools I need to install.

Live Search’s Webmaster Center comes out of BETA

Posted by – August 9, 2008

Microsoft’s Live Search Webmaster Center came out of BETA today, with new features showing backlinks and crawl errors. Most agile webmasters would not have had much use for the crawl error tool, as they often have all the data they need in their own server logs. Backlink metrics, however, are very useful to SEO efforts, and the more data you can get about your backlinks the better.

Till now, Yahoo’s Site Explorer has been the most useful tool, with the most accurate backlink data, and it’s nice to see more transparency from the search engines.

Google’s Adsense Algorithm

Posted by – August 8, 2008

Google’s search algorithms get a lot of play, and not enough people are paying attention to the fact that Google’s contextual ad network still enjoys a technical superiority over its peers that Google long ago lost in search relevancy. Simply put, some search competitors are doing a decent job with search relevancy but still seem to be nowhere when it comes to serving relevant ads.

I’d like to share some of my thoughts on the Adsense algorithm, which I will revisit in detail in the future. Given the secrecy of the sauce, I will not try to prove what is and what is not the Google Adsense algorithm. Instead I will take the approach that any SEO worth his salt should: speculate as to what the algorithm should be, and what current technology is capable of.

At its simplest level I believe the algorithm should, and likely does, work like this:

  1. Attempt to determine page context and serve a contextually relevant ad
  2. Use clickstream data to determine what the user might be interested in and serve an ad that may not be contextually relevant.
  3. Use basic demographic data (e.g. geolocation) to attempt to target ad relevance to the user.

The premise is simple: the context of the page is a strong indication of what the user will click on, so it is the first priority of the algorithm. You may know that the user was interested in other, potentially more profitable, subjects, but the fact that the user is on that page now is a fairly good indication of what the user is interested in at that particular moment.
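To make that ordering concrete, here is a minimal Python sketch of the three-step fallback. It is only my speculation put into code, not anything Google has published: the `Ad` class, the topic and region sets, and the `choose_ad` function are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ad:
    """Hypothetical ad record: what it is about and where it is targeted."""
    name: str
    topics: set = field(default_factory=set)
    regions: set = field(default_factory=set)


def first_match(ads, wanted, attribute):
    """Return the first ad whose `attribute` set overlaps `wanted`, else None."""
    for ad in ads:
        if getattr(ad, attribute) & wanted:
            return ad
    return None


def choose_ad(ads, page_topics, clickstream_topics, user_region) -> Optional[Ad]:
    # 1. Page context is the first priority.
    ad = first_match(ads, set(page_topics), "topics")
    if ad is not None:
        return ad
    # 2. Fall back to what the clickstream says the user cares about.
    ad = first_match(ads, set(clickstream_topics), "topics")
    if ad is not None:
        return ad
    # 3. Last resort: basic demographic data such as geolocation.
    return first_match(ads, {user_region}, "regions")
```

With an inventory of a widget ad, a travel ad and a San Diego diner ad, a page about widgets gets the widget ad; a page with no matching ad falls through to the user's travel interest, and only then to the local diner.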

But then again it isn’t always the case, and clickstream data can help identify what the user is really interested in. For example, the user’s previous searches can indicate what is really meant by the query “apple”, but even more immediately relevant is that Google often knows where you were right before you got to the page. And with increasing frequency, it was Google itself.

This is the single biggest reason that clickstream data must be a part of the Google algorithm. It’s much easier to determine context from a user-input query. That’s why other search engines are starting to compete with Google in relevance on most queries. If Google knows what the user searched for before clicking on this page they have a variable that rivals page context in relevance to the user. If they know the user searched for “buy a green widget in San Diego” and landed on a general page about green widgets they would be foolish not to use the additional context they know about the user (the location specific subset that they are looking for) in their attempt to serve an ad the user is most likely to click.

The “session context” as I, as of a moment ago, like to call it in the clickstream would be weighed heavily with page context in my algo, and historic clickstream data would follow at a distance. If you know they are always looking for certain widgets and you don’t have a great ad for the page or session context then an ad about what the user has expressed past interest in is the next best thing. Google has a lot of clickstream data from their own web properties as well as others, through sites running Adsense itself and through the free log analytics service they provide to webmasters in exchange for the data. For example, they could know what you searched for on Yahoo when you land on a page with their own ads or log tracking, and it’s precisely such examples that they can use to their benefit. Search history presents the easiest contextualization opportunities because the user has given the context in a string. Other clickstream data requires a lot more guesswork, and for these reasons I think that Google should, and does, focus mainly on search-related clickstream data. Given my read on their corporate culture, I’m not sure if they are doing this outside of their own web properties, as in my Yahoo example, but they should and I can’t imagine that they don’t for their own search engine.
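Here is a rough Python sketch of what I mean by weighing session context heavily alongside page context, with historic data following at a distance. The weights are numbers I made up to show the shape of the idea, and `score_ad` and `rank_ads` are hypothetical names; nothing here is a known part of Adsense.

```python
# Made-up weights: session context counts almost as much as page context,
# historic clickstream data follows at a distance.
WEIGHTS = {"page": 1.0, "session": 0.9, "history": 0.3}


def score_ad(ad_topics, page_topics, session_topics, history_topics):
    """Score an ad by how many of its topics each signal shares, times the weight."""
    ad_topics = set(ad_topics)
    return (
        WEIGHTS["page"] * len(ad_topics & set(page_topics))
        + WEIGHTS["session"] * len(ad_topics & set(session_topics))
        + WEIGHTS["history"] * len(ad_topics & set(history_topics))
    )


def rank_ads(ads, page_topics, session_topics, history_topics):
    """`ads` maps an ad name to its topics; returns ad names, best score first."""
    return sorted(
        ads,
        key=lambda name: score_ad(
            ads[name], page_topics, session_topics, history_topics
        ),
        reverse=True,
    )
```

For the “buy a green widget in San Diego” searcher landing on a general widget page, an ad tagged with both widgets and San Diego would outrank a generic widget ad, and an ad matching only an old interest would trail both.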

Lastly you can throw in anything else you know about the user. You have the IP and can map it to geodata in a simple example, like showing the user an ad for a local restaurant. And you can even get fancy and use aggregate trends (e.g. people searching in a certain area might be in town for a certain reason, come up with your own specifics) and other logical deductions (i.e. “wild guesses”, like searching in English from Mexico might mean you are interested in a hotel). I think focusing your efforts is a big part of the end result of this kind of work and believe that if Google uses any of this fallback data they do it simply. Why spend time on a complicated algorithm to generate poor guesses when you can spend more time nailing the real priorities like page context?

In another post, I’ll break down the on-page context algorithm possibilities but I’m out of time for today.

Google begins to integrate DoubleClick into their content network

Posted by – August 7, 2008

Google announced a number of new options for advertisers in the content network that will have a big impact on AdWords advertisers and AdSense publishers as they begin to integrate their acquisition DoubleClick into their existing ad networks.

Google completed the acquisition of the display advertising giant on March 11th, 2008 with the aim of bolstering its display advertising presence on the web. With the overwhelming majority of Google’s revenue coming from text advertising, DoubleClick’s multimedia strengths were deemed a good fit, to the tune of a $3.1 billion cash acquisition offer made on April 14th, 2007. Earlier this year the regulatory hurdles were cleared, and Google’s advertisers are beginning to see the end result of the merging of these ad platforms. The additional options may compel companies that were wary of the Google content network, with its legion of AdSense publishers, to give it a new try. Here are the options Google announced:

  • Frequency Capping: Enables advertisers to control the number of times a user sees an ad. Users will have a better experience on Google content network sites because they will no longer see the same ad over and over again.
  • Frequency Reporting: Provides insight into the number of people who have seen an ad campaign, and how many times, on average, people are seeing these ads.
  • Improved Ads Quality: Brings performance improvements within the Google content network.
  • View-Through Conversions: Enables advertisers to gain insights on how many users visited their sites after seeing an ad. This helps advertisers determine the best places to advertise so users will see more relevant ads.
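
The first two options are easy to picture in code. This is a minimal Python sketch of the general idea of a per-user frequency cap and the matching report; the `FrequencyCap` class and its methods are hypothetical names of mine, and Google has not said how its version is implemented.

```python
from collections import Counter


class FrequencyCap:
    """Hypothetical per-user cap on how often one ad is shown."""

    def __init__(self, max_views):
        self.max_views = max_views
        self.views = Counter()  # (user_id, ad_id) -> impressions served

    def try_show(self, user_id, ad_id):
        """Record an impression and return True, or return False once capped."""
        key = (user_id, ad_id)
        if self.views[key] >= self.max_views:
            return False
        self.views[key] += 1
        return True

    def report(self, ad_id):
        """Return (people reached, average impressions per person) for an ad."""
        counts = [n for (_, ad), n in self.views.items() if ad == ad_id]
        reach = len(counts)
        average = sum(counts) / reach if reach else 0.0
        return reach, average
```

With a cap of two, a third attempt to show the same ad to the same user is refused, while the report still tells the advertiser how many people saw the ad and how often on average.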

Visualizations of Sorting Algorithms

Posted by – August 6, 2008

I’m traveling today, so here’s a quick post on an old site I found interesting. It’s a visualization of various sorting algorithms that helps illustrate the differences in efficiency and overall speed between them. To start the sorting simulations, click on each of them to activate the animation.

There aren’t a lot of surprises here and you’ll see the predictable results: the parallel sorting algorithms generally outperform the sequential sorting algorithms. But it’s a useful demonstration that can better hammer home the difference in your mind.
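
If you can’t load the animations, a small Python sketch gets the same point across. I picked bubble sort and odd-even transposition sort as my own examples (the site may well show different algorithms), and the parallel one is only simulated: each round compares disjoint pairs, so a parallel machine could do a whole round in one step.

```python
def bubble_sort(items):
    """Sequential bubble sort; returns (sorted list, comparisons done one by one)."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


def odd_even_transposition_sort(items):
    """Odd-even transposition sort; returns (sorted list, rounds).

    Rounds alternate between pairs starting at even and odd indexes. The
    pairs within a round never overlap, so they could run at the same time.
    n rounds are enough to sort n items.
    """
    data = list(items)
    n = len(data)
    for phase in range(n):
        for i in range(phase % 2, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, n
```

On eight items in reverse order, the sequential sort makes 28 comparisons one after another, while the parallel-style sort needs 8 rounds.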

Check them out here and try even more here.

Google launches Google Insights for search marketers

Posted by – August 5, 2008

Today Google Insights launched, a tool developed for AdWords advertisers to better understand trends in search terms. You can use this tool to compare the traffic for a keyword or phrase and filter by vertical (Category) and region. This is useful for search marketing professionals both for their efforts in PPC and natural results marketing. In both cases knowing the search volume is one of the most important strategic variables. After all, why spend time and money on terms with less traffic than others you could be working on or bidding on?

In the past, Overture was the most reliable way to get free query volume information from one of the major search engines. But they have discontinued their tool and Google has been releasing more search volume data around their AdWords PPC product and now has several of the most important keyword research tools for your webmaster arsenal.

Read more about it on the AdWords blog.

SEO Friendly Titles – The first WordPress plugin you should install

Posted by – August 5, 2008

One of the very first things I do when installing a WordPress blog is to hack the titles. By default the page titles that WordPress generates are not SEO friendly, and the individual post page titles put the title of the post after the blog and archive wording.

The All in One SEO Pack plugin allows you to modify the blog’s meta tags through the WordPress admin panel as well as on the individual posts through the post editor. Its defaults are sensible and it represents a cleaner solution than hacking at the code to do it yourself because of the abstraction gained through the WordPress plugin architecture.
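
The title fix itself is just a matter of ordering. Here is a tiny Python illustration of putting the post title ahead of the blog name; it is not the plugin’s code or WordPress’s (both are PHP), and `seo_title`, the separator and the sample blog name are my own placeholders.

```python
def seo_title(post_title, blog_name, separator=" | "):
    """Build a page title with the post title first and the blog name last."""
    post_title = post_title.strip()
    blog_name = blog_name.strip()
    if not post_title:
        # Pages without a post title (e.g. the home page) show just the blog name.
        return blog_name
    return f"{post_title}{separator}{blog_name}"
```

So `seo_title("Yahoo search update for August 2008", "Agile Webmaster")` gives a title that leads with the words a searcher actually typed.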

It’s now part of my standard WordPress install and should really be a part of the core software.

Yahoo search update for August 2008

Posted by – August 4, 2008

Quick heads up for the SEO crowd: Yahoo search is being updated with new indexing and ranking algorithms today. You should see significant changes in their results as they are being rolled out.

Yahoo launches the Yahoo Music API

Posted by – August 4, 2008

Yahoo announced their Yahoo Music API today, releasing another tool in their open strategy of providing services to web developers. Their music API allows other websites to tap into some of the Yahoo Music content. The API can be used for catalog data, such as searching Yahoo Music by artist or getting charts like new releases or popular music, and for user data, like recommendations for the user. The user data requires Yahoo’s browser-based authentication while the catalog data does not.

Right now, the rate limit is 5,000 queries a day, which is too low for me to consider it worth building on yet but hopefully they’ll announce a paid service for commercial sites that want to build something serious on it.
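
If you do experiment with it, you’ll want to count your own calls so you stop before Yahoo stops you. Below is a minimal Python sketch of a client-side daily budget; the `DailyQuota` class is a hypothetical helper of mine and not part of Yahoo’s API, and only the 5,000 figure comes from their announcement.

```python
import datetime


class DailyQuota:
    """Hypothetical client-side counter for a per-day query limit."""

    def __init__(self, limit=5000):
        self.limit = limit
        self.day = None
        self.used = 0

    def allow(self, today=None):
        """Return True and count the query if today's budget is not used up."""
        today = today or datetime.date.today()
        if today != self.day:
            # A new day starts a fresh budget.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Call `allow()` before each API request and skip the request (or serve cached data) when it returns False.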

Check out the API and apply for an API key here or have a look at their example application (Facebook app, requiring Facebook login).

Yahoo claims that Yahoo Buzz is social bookmarking leader over Digg

Posted by – August 3, 2008

On Friday Yahoo held its annual shareholder meeting, and while the Yahoo board’s deal with Carl Icahn’s opposing slate of directors removed a lot of the drama from the meeting, it was still interesting. One of the things I’d like to highlight is a claim made by Yahoo president Susan Decker about Yahoo Buzz, Yahoo’s social media site that’s currently in BETA.

Steven Shankland blogs on CNET that at around 11:13 am Decker boasted that Yahoo Buzz has “displaced Digg as the top way to find content across the entire Internet”.

Without supporting information it’s hard to assess the validity of this claim, but given Yahoo’s size and audience it would be foolish for Social Media Marketers to ignore this service in their online marketing efforts.

Right now, it is a bit of a walled garden, and it seems that they will differ from Digg in that some editorial control comes from relying on trusted publishers. Whether relying on feeds from screened publishers is a BETA restriction or a long-term approach to Yahoo Buzz is unclear. But you can get a head start and apply to be a publisher with Yahoo Buzz here.