Last time I wrote about cleaning house and freeing up Google’s index for your best content. That’s a great antidote to Panda’s negative ranking factors, which punish cluttered websites. But sometimes it’s not that easy to retire pages. In large companies like IBM, there are several brands and business units that offer products or services related to high-value keywords such as analytics or cloud. We simply can’t ask these business units to retire fresh content related to business-critical offerings. In these cases, our only option is to help Google crawl and index our content so that the most important content takes its rightful place in Google’s index and, with luck, in Google’s search engine results pages (SERPs). I’m talking about creating a clear information architecture (IA) that shows Google (and your users) the relative importance of pages related to the same keywords.
How to build an IA using keyword research
One of my jobs at IBM entails getting agile web development teams started with what I call “outside-in marketing.” The key principle of this approach is to start with keyword research to learn the language of clients and prospects, then build digital experiences with that language. If users within your target audience type key phrases into Google, they will look for those words in your navigation when they land on your pages from Google. So we build the whole experience, from the URL through the navigation to the content, links and images, around those keywords.
When I first start to work with new teams, I often have the biggest difficulty selling outside-in marketing to the information architects (IAs). Some like to organize content by content type because that’s the way they’ve always done it. When I get wireframes from them organized in this way, I remind them that users do not look for content by content type. No one goes looking for white papers. They look for content about topics. They will choose a white paper within that topic area if it suits the information task they are trying to accomplish.
Another common IA mistake is to organize content the way product teams organize product portfolios, complete with all the internal language companies use to organize these portfolios. This is akin to surfacing your org chart, which most IAs hate because it only makes sense to people within the company, and not to the target audience. On the other hand, if we use the language of the audience to organize our sites, we have a much better chance of connecting with the ways they find content. The chief benefit is that it helps Google’s crawler find and index content related to those same keywords.
Case study: IBM SmartCloud
A while back, I got called in to help build a new site around IBM’s cloud computing initiative, called SmartCloud. The site at the time was organized by the business units that offered cloud-related products and services–infrastructure, software, services, etc. I did the keyword research and found that most of the audience looked for content related to Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS). So we built tab-based navigation around these topics, and adjusted our whole content strategy around this user language. Each topic had its own page, which was built specifically for the audience interested in it.
This was something of a breakthrough because it helped the brand and business unit marketing managers understand the way clients and prospects think about cloud computing, which in turn helped IBM reorganize its entire cloud portfolio around client needs. A year after we built the tab-based navigation, those tabs are some of the most important digital experiences in the whole company. The tabs link to the 30 other brand and business unit experiences related to cloud, where the audience can register for events or ask to be contacted about specific offerings.
The new site was also built on a human-readable URL structure that enabled it to reach the first page of Google’s results as soon as it was launched. The main keyword was placed in the part of the URL closest to the root. This effectively told Google that the page was IBM’s top page on cloud computing, and not one of the 30 other brand or business unit pages on the same topic. We also made sure it was the only cloud-related site with a link from the ibm.com home page. Again, this was a major cue to Google about the relative importance of the page for IBM.
We recently refreshed our keyword research and determined that some longer phrases (in the form of questions) actually had very high query volumes in Google. So we are building digital experiences to answer these questions right off the main SmartCloud page.
Tip: When URLs have long keyword phrases in them, we typically separate the individual words with hyphens. For example, …/software-as-a-service/index.html. Google treats a hyphen as a word separator, so it works much like the Boolean OR connector: Google can determine that the page is relevant to the keywords in any combination. There are times when you might want to use the underscore between words, chiefly with multiword brand names. Google does not treat the underscore as a separator, so the words are joined into a single term, which makes it the rough URL equivalent of the Boolean AND connector. But when in doubt, use hyphens.
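As a rough illustration of the hyphen rule, here is a minimal Python sketch that turns a keyword phrase into a hyphen-separated URL segment. The function name keyword_slug is my own invention for this example, and the rule it applies (lowercase everything, turn every run of non-alphanumeric characters into one hyphen) is only one reasonable convention, not a description of IBM’s publishing tools.

```python
import re


def keyword_slug(phrase: str) -> str:
    """Turn a keyword phrase into a hyphen-separated URL segment.

    Hypothetical helper for illustration only.
    """
    text = phrase.lower()
    # Replace every run of characters that is not a-z or 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text)
    # Drop any hyphen left over at the start or end of the segment.
    return slug.strip("-")


print(keyword_slug("Software as a Service"))  # software-as-a-service
```

A multiword brand name that you want treated as a single term would need separate handling, since this sketch always uses hyphens.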
Sitemaps, de-indexing and redirecting
Normally, Google will pay close attention to site IA when it attempts to rank apparently duplicate content within the same domain. But sometimes you have to help it out. One of the best ways to do this is to place a sitemap file in the root directory of your site or microsite and submit it through Google Webmaster Tools. See Google’s own advice on how to do this.
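For readers who have not seen one, a sitemap is a small XML file that lists the URLs you want crawled. The sketch below builds one with Python’s standard library. The example.com URLs and the priority values are placeholders I made up, and priority is an optional hint in the sitemaps.org protocol that search engines are free to ignore.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a list of (url, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # Priority is a relative hint between 0.0 and 1.0.
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: the main topic page gets the highest priority.
pages = [
    ("https://www.example.com/cloud/", 1.0),
    ("https://www.example.com/cloud/software-as-a-service/", 0.8),
]
print(build_sitemap(pages))
```

You would save the output as sitemap.xml at the root of the site or microsite before submitting it.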
Google claims it will take sitemaps as advisory documents, but it favors sites that are crawler-friendly to begin with. If you have a legacy page that is ranking well for a keyword and competing with a new experience, you can de-index the legacy page. It’s actually pretty easy to do. You just go into the metadata of the old page and set the content of the robots meta tag to “noindex, follow.” This is usually better than retiring the page altogether because it avoids broken links and other undesirable outcomes. And it tells Google that you no longer want the page in its index, while still letting the crawler follow the links on the page.
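Once you have changed the robots meta tag on a legacy page, it is worth checking that the change actually shipped. Here is a small Python sketch that reads the robots directives out of a page’s HTML using only the standard library. The class and function names are hypothetical, and the legacy_page string stands in for HTML you would fetch from your own site.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Stand-in for the head of a de-indexed legacy page.
legacy_page = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body></body></html>"
)
directives = robots_directives(legacy_page)
print("noindex" in directives, "follow" in directives)  # True True
```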
Sometimes the best thing to do is to take the old page down. If you do, chances are it has some vital external links pointing to it. Rather than simply spilling that link juice on the floor, you can capture it by redirecting the old page (with a 301 redirect) to the new experience. This was another key to our SmartCloud success. We had three top-level cloud experiences, which were competing for links and Google ranking. When we redirected all three to the new URL, we captured all the link juice for those experiences and put it into one bucket.
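In practice, 301 redirects are usually configured in the web server or content management system. To show the idea in a self-contained way, the sketch below maps three retired paths to one new path using Python’s built-in HTTP server classes. All of the paths are invented for illustration; they are not the actual SmartCloud URLs.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical retired paths, all consolidated into one new experience.
REDIRECTS = {
    "/legacy/cloud-infrastructure/": "/cloud/",
    "/legacy/cloud-software/": "/cloud/",
    "/legacy/cloud-services/": "/cloud/",
}


def resolve(path):
    """Return (status, target): 301 plus the new path for retired pages."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, target = resolve(self.path)
        if status == 301:
            # 301 means "moved permanently", which signals that the
            # old URL should be replaced by the new one.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"ok")


print(resolve("/legacy/cloud-software/"))  # (301, '/cloud/')
```

To try it locally, you could serve the handler with http.server.HTTPServer(("", 8000), RedirectHandler).serve_forever() and request one of the legacy paths.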
Panda might not treat content the same way as previous Google algorithms, but it does treat legitimate internal and external links in a very similar way. You can greatly improve your search effectiveness by using clear internal IA and consolidating link juice from apparently duplicate sites within your environment.
James Mathewson is the Global Search Strategy Lead for IBM and the Search Practice Lead for the IBM Design Lab. He is also co-author of Audience, Relevance and Search: Targeting Web Audiences with Relevant Content. Follow him on Twitter @James_Mathewson.