I like Wikipedia quite a lot. It is usually the first place I turn to find factual information about a topic. There are some people who don’t trust the site…notably my children’s elementary school teachers. Generally though, it is going to be about as accurate as any reasonably available alternative. The prestigious scientific journal Nature performed a study in 2005 comparing Wikipedia to the Encyclopedia Britannica. The article itself is behind a paywall, but a summary is available on CNET. My summary of the summary is:
- Wikipedia gets the big picture across just as well as Britannica.
- There are lots of mistakes in both.
- Britannica just edges out Wikipedia on the accuracy of the finer points in an article.
But I can’t do a quick web search and read the Encyclopedia Britannica. Wikipedia, on the other hand, has enormous quantities of information available in seconds, and for free!
Aside from the awesome content, the thing I find fascinating about Wikipedia is how high it ranks with the big search engines. It feels like every time I do a simple search on Google or Bing, Wikipedia is the first or second result. It’s gotten to the point that if I want to find a Wikipedia article, I don’t even go to the site to do my search…I just use the search bar in my browser and pick the Wikipedia link from the top of the search results.
Now, of course Wikipedia doesn’t hit the top of the search results every time. Thinking about it did make me curious to know just how well Wikipedia articles rank. I devised a (not scientifically rigorous) experiment: pick a number of random English nouns and see where Wikipedia ranks in the search results. Originally I was going to try to use the Google search API, but Google has a 100-search limit on their API before you have to pay a nominal fee. Being a cheap guy doing this for fun, I decided that Microsoft’s 100% free Bing search API would work perfectly for me.
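For the curious, the core of the experiment is just "find the first Wikipedia link in a list of results." Here is a minimal Python sketch of that step. It is not my original script: the function name `wikipedia_rank` and the sample URLs are made up for illustration, and it assumes you have already pulled the result URLs for a search term out of whatever search API you are using.

```python
from urllib.parse import urlparse


def wikipedia_rank(result_urls):
    """Return the 1-based position of the first Wikipedia result, or None."""
    for position, url in enumerate(result_urls, start=1):
        host = urlparse(url).hostname or ""
        # Match wikipedia.org and its subdomains (en.wikipedia.org, etc.),
        # but not look-alike hosts such as notwikipedia.org.
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return position
    return None


# Hypothetical results for the search term "sheep" (made up for illustration).
sample_results = [
    "https://www.example.com/sheep-facts",
    "https://en.wikipedia.org/wiki/Sheep",
    "https://www.example.org/farm/sheep",
]
print(wikipedia_rank(sample_results))
```

A result of `None` means Wikipedia did not show up at all in the results that were fetched.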
For single word searches on nouns, Wikipedia does about as well as I thought it would: #1 result more than half the time. Only 5% of the time is it not in the top 10. This makes sense as most physical things are going to have a detailed Wikipedia entry.
Then I decided to try adjectives and adverbs. Wikipedia didn’t do as well here as it did with nouns, but it was still quite respectable. Adjectives were first page results 75% of the time. I was surprised by Wikipedia’s rank for adverbs. It only makes the top ten about 30% of the time, and very, very rarely is it in the top 3.
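Turning a pile of per-word ranks into percentages like the ones above is simple arithmetic. The sketch below shows one way to do it. The name `summarize_ranks` and the sample data are hypothetical (these are not my actual results), and it assumes a missing result is recorded as `None`.

```python
def summarize_ranks(ranks, page_size=10):
    """Summarize a list of ranks (1-based, None = not found) as percentages."""
    total = len(ranks)
    if total == 0:
        raise ValueError("need at least one rank")
    first = sum(1 for r in ranks if r == 1)
    top_three = sum(1 for r in ranks if r is not None and r <= 3)
    first_page = sum(1 for r in ranks if r is not None and r <= page_size)
    return {
        "first": 100.0 * first / total,
        "top_three": 100.0 * top_three / total,
        "first_page": 100.0 * first_page / total,
    }


# Made-up ranks for ten search terms, purely for illustration.
made_up_ranks = [1, 1, 2, 4, None, 1, 7, 12, 1, 3]
print(summarize_ranks(made_up_ranks))
```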
Most sites have to consciously perform Search Engine Optimization (SEO) to hit the top of the Google charts. They do it because better search rankings mean more money. Wikipedia isn’t exactly driven by money, but they do want to make their treasure trove of information available to everyone. Here are some of the things they do, perhaps without conscious effort, to rank so highly.
Many, many sites link to Wikipedia. Not only to the site’s home page, but also to individual articles. Search engines assume that any page that is being linked to by lots of other sites should show up high in the rankings. The number and quality of incoming links is a proxy for the overall quality of a page. If lots of people link to it, lots of people must think the page is worth reading.
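To make the "links as votes" idea concrete, here is a toy Python sketch that counts how many distinct sites link to each page. This is only an illustration of the intuition. Real search engines also weigh the quality of the linking sites, which this ignores, and the function name and sample data are invented.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count distinct linking sites per target page.

    links is an iterable of (source_site, target_page) pairs.
    """
    unique_links = set(links)  # a site linking twice still counts once
    return Counter(target for _source, target in unique_links)


# Invented link data, purely for illustration.
sample_links = [
    ("blog-a.example", "wikipedia.org/wiki/Sheep"),
    ("blog-b.example", "wikipedia.org/wiki/Sheep"),
    ("blog-a.example", "wikipedia.org/wiki/Sheep"),  # duplicate, ignored
    ("blog-a.example", "small-site.example/sheep"),
]
counts = inbound_link_counts(sample_links)
for page, votes in counts.most_common():
    print(page, votes)
```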
High Quality Content
Most Wikipedia pages are bursting with content and distinctly lacking in fluff. Comparing a Wikipedia page to a typical web site, you will see that each Wikipedia page has significantly more text. Wikipedia also has essentially no advertisements and very few images. Browse a cnn.com page with images turned off and advertisements hidden and you will see just how little text there is on what you thought was a decent sized web page.
Most web sites don’t go back and edit their popular pages, but search engines want to serve up current information. If a page has been recently updated, that is probably a good thing and the search engines will give you a bump up for that. Wikipedia’s crowd sourcing nature encourages frequent updates to any and every page. Even pages that have been around for a long, long time will be updated quite often. Wikipedia’s Sheep page was updated 8 times in September…and sheep have been largely unchanged for a long, long time.
Great Semantic Markup
By their natures, encyclopedias will usually highlight the subject of a page. That same page will also have lots of words related to the subject present and highlighted as well. In practical terms, this means that the Wikipedia page on sheep will have the word “sheep” in an H1 heading tag. That page also has the following sheep-related words and phrases at various heading levels: “sheep compared to goats”, breeds, diet, reproduction, predation, economic importance, etc. Search engines know statistically what words are supposed to go together. If a page has a lot of words related to a search term, the search engine will assume that the term is the main topic of the page. Therefore, the page will rank higher in the search results. If the related words are also emphasized in the HTML coding of a page or the URL, all the better.
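If you want to see what a page is emphasizing in its headings, you can pull them out with nothing but Python's standard library. The sketch below is a rough illustration: `HeadingCollector` and `extract_headings` are names I made up, and the sample HTML is a simplified stand-in, not the real markup of the Sheep page.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (tag, text) pairs for every h1..h6 heading in a document."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current_tag = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current_tag = tag
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an open heading.
        if self._current_tag is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current_tag:
            text = "".join(self._text_parts).strip()
            self.headings.append((tag, text))
            self._current_tag = None


def extract_headings(html):
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Simplified stand-in markup, not the actual Wikipedia page source.
sample_html = (
    "<h1>Sheep</h1><p>Some text.</p>"
    "<h2><span>Breeds</span></h2><h2>Diet</h2>"
)
for tag, text in extract_headings(sample_html):
    print(tag, text)
```

Running this against your own pages is a quick way to check whether your key words actually show up in heading tags.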
I can’t say that all search engines do this, but Google has stated that they include load times in their ranking algorithm. The reasoning is that if a page loads quickly, users will be happier than when viewing a slow loading site. All things being equal, Google would rather show pages the user will be happy with.
If you have web content you want to rank high in web searches, see if you can do some of these things too. Write good content that people want to share and link on their own blogs and web sites. Keep your pages on topic. Use HTML markup properly and be sure to put your key words in header tags. Make sure your site is performing well. It sounds crazy when you write it out, but an Akamai and JupiterResearch study found that the average online shopper will give up on a web site if it takes more than 4 seconds to load.
I’m not saying that you are likely to rank higher than Wikipedia, but if you can rank just below them you are going to be doing really well.