Search Engines: Winning the War on Content Farms


This time last year, the search industry’s attention was fixed on the general quality of search results. A storm of protest eventually rose from all parts of the web about over-optimized websites that offered little of value to the user and carried too many ads. The trend taking us into 2012 is that low-quality websites have been falling off the top of the SERPs, largely as a result of Google’s Panda algorithm update.

University of Glasgow computer scientist Richard McCreadie, at the request of NewScientist magazine, examined 50 queries known to be content farm targets, first in March and again in August. The results, according to NewScientist, are “striking.”


For the purpose of this study, McCreadie defined low quality results as “uninformative sites whose primary function appears to be displaying adverts.” He hired people to review the results on Google and Bing during the two given time frames.

Keep in mind that the first Panda update rolled out on February 24th and affected 11.8 percent of results, so some of the test queries were most likely already affected by the beginning of the initial test period. The further progress seen throughout the year suggests that subsequent Panda updates did what they were meant to do, according to Google: “reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful.”

Amit Singhal and Matt Cutts explained further how Panda sniffs out low quality sites, in a March interview with Wired:

Singhal: We wanted to keep it strictly scientific, so we used our standard evaluation system that we’ve developed, where we basically sent out documents to outside testers. Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”
Cutts: There was an engineer who came up with a rigorous set of questions, everything from, “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines.
Singhal: And based on that, we basically formed some definition of what could be considered low quality. In addition, we launched the Chrome Site Blocker [allowing users to specify sites they wanted blocked from their search results] earlier, and we didn’t use that data in this change. However, we compared and it was 84 percent overlap [between sites downloaded by the Chrome blocker and downgraded by the update]. So that said that we were in the right direction.

One of the queries McCreadie identified as attractive to content farmers was “how to train for a marathon.” In that example, sites with generic lists of tips were present in the March test, but had disappeared from the top 10 by the August test, replaced with higher quality results from reputable publications such as Runner’s World magazine. McCreadie reported to NewScientist that they had found similar trends across the 50 test queries.

Between the March and August test periods, Panda was updated five times:
  • April 11th, Panda 2.0 introduced signals such as user-blocked websites
  • May 9th, Panda 2.1, minor changes
  • June 16th to 20th, Panda 2.2, more minor changes
  • July 26th, Panda 2.3 acknowledged by Google
  • August 12th, Panda 2.4 rolled out the algorithmic changes globally

Late in April, Forbes took a look at early results to determine how top content farms had been affected by the first two incarnations of Panda. At that time, Google referrals to Demand Media’s Answerbag were down 80 percent, and eHow, another Demand Media property, saw its Google search visibility drop 42 percent. Overall, Demand Media traffic fell 40 percent, according to Experian Hitwise. Here are some other content farm traffic results in the wake of Panda, from around the web:

[Image: Mahalo hit by Google Panda]
[Image: Hubpages Traffic According to Quantcast]

Since McCreadie’s study, Google has updated the algorithm a number of times, most notably with the September 28th Panda 2.5 update and the November 3rd Google Freshness update, which affected 35 percent of searches.

Over the course of the last year, we’ve heard loud cries of protest after each of the updates from smaller site owners who felt they’d been unfairly penalized by Panda. In retrospect, though, as we head into a new year, it does seem that Panda is accomplishing what it was meant to do.

Towards the end of 2011, on Webmaster Radio’s Webcology show, host Jim Hedger asked each of the Year in Review panelists what they felt the biggest search story of the year had been. Perhaps surprisingly, some of the more recognized names in search did not rank Panda among the bigger concerns of 2011. In the Webcology chatroom, industry vets including Jill Whalen generally agreed that sites hit by Panda, whether their owners realized it or not, were found time and again to have areas in need of improvement that could well have contributed to their being snagged in the updates: duplicate content, thin or shallow content, and overwhelming ad placement.
