Is Google’s RankBrain Susceptible to Human Bias?
Jonathan Hunt has been with PMG since 2013 and is a senior leader on the SEO team, guiding automation and technology strategy for organic search. His 17 years of experience in SEO have included leading programs for ecommerce, technology, entertainment, and B2B brands. Jonathan was recently named a finalist for AAF Austin’s 2023 Big Wigs Awards for Best Data Analyst.
Over the last year, we’ve seen a new rise in global populism. Online audiences have seemingly become disenchanted with the established voices of authority, and the value people assign to information appears to depend more on who is saying it than on the evidence backing it. Is it possible that this online shift to groupthink affects how we receive and interpret information from supposedly objective sources, like Google?
Back in March of 2016, Google made an almost unprecedented revelation and confirmed the three factors that are, by far, the most important in determining organic rankings on queries. Google’s Andrey Lipattsev confirmed that the three biggest factors were links, content, and RankBrain.
Based on a statement made the year before, we were all already aware that RankBrain had attained the number 3 spot. Andrey remained vague on which came first, content or links, stating only that they were number 1 and number 2. This revelation confirmed something many in the SEO community had believed for a long time: the top ranking factors in organic search rely heavily on the concept of authority, something that is not easily given a quantifiable, unbiased value.
In the era of #FakeNews, when it’s more important than ever to discern from whom and from where we get our information, we must ask the question: Is Google’s algorithm susceptible to human bias and populist groupthink?
Setting aside links and content for the time being, we come face to face with Google’s latest major ranking factor, RankBrain. RankBrain is the machine-learning part of the algorithm that evaluates site content against complex new queries entered into Google. In short, when Google is faced with a question it has never seen before (which is about 15% of all searches, or ~450 million/day), it turns to RankBrain to understand the question and determine the best sites to serve as results.
Now, there are a few items to keep in mind here. First, RankBrain isn’t working alone. It’s just one ranking factor. So, even if it’s a query that’s never been seen before, other factors in the algorithm are still hard at work, though RankBrain’s conclusions probably carry a lot more weight in the final search results. Second, RankBrain isn’t used exclusively for new searches. Helping to parse and contextualize new searches was a key reason RankBrain was developed, but that’s hardly the extent of its scope. As the #3 ranking factor across all of Google’s algorithm, it’s safe to say that RankBrain is utilized in nearly every search query on the site (approximately 3 billion/day). Finally, RankBrain’s influence will not be the same for all searches. Some queries are probably very straightforward and do not require a large lift from something as powerful as RankBrain. Others will be harder for the lighter parts of the algorithm to handle, and RankBrain’s influence will be felt much more.
So, how does RankBrain make its decision? How does it evaluate never-before-seen (or rarely seen) queries and contextualize them into something Google can better understand? RankBrain is a machine learning protocol, where the algorithm utilizes information from external observations to make judgments on the intent of complex queries. These external observations are where we consider the possibility that human bias can be introduced into Google’s machine learning system, allowing it to become an inherent part of the Google algorithm.
Over the last year, we have seen increasingly negative online sentiment toward institutions previously treated as pillars of authoritative information. Press outlets such as the Associated Press, the New York Times, the Wall Street Journal, and CNN have all been called “Fake News” by people on both sides of the aisle. The rise of populism has brought crackpot conspiracies and radical ideologies into mainstream media, along with distrust for anything remotely resembling the establishment. How does that bias reveal itself online?
Jessica Stillman, in a November 2016 piece for Inc. Magazine, recounts an interview in the Harvard Gazette, laying out a compelling case for how an algorithm can be “racist.” In the article, she focuses on the work of Cathy O’Neil, a Harvard-trained mathematician, Ph.D., and author of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. O’Neil posits that algorithms are still the work of human beings in their design. Even when built with the best of intentions, human bias can still find its way in. This may be even more dangerous, as the result is systemic bias or prejudice that is largely given a free pass due to its perceived apolitical, neutral appearance. As Stillman writes, “We accept them because they’re dressed up in a veneer of math.”
O’Neil, in her Gazette interview, goes on to explain: “The real misunderstanding that people have about algorithms is that they assume that they’re fair and objective and helpful. There’s no reason to think of them as objective because data itself is not objective and people who build these algorithms are not objective.”
In a way, machine learning, the kind utilized in Google’s RankBrain, is supposed to fix that problem. By making the algorithm self-learning and self-updating, we can remove the bias (inadvertent or not) of the individual programmer. It works on the principle of dilution: get enough data points, and the outliers will reveal themselves and be made irrelevant. But machine-learning algorithms still use crowd-sourced user data. By analyzing behavior patterns across millions of users, we may have eliminated the bias of the individual, only to replace it with the systemic bias of the society.
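The dilution argument can be illustrated with a toy simulation. This is only a sketch under invented assumptions and says nothing about how RankBrain actually works: the function name `average_signal` and every number below are hypothetical. Each simulated user reports a "true" value plus a skew shared by the whole population plus a personal quirk. Averaging more users washes out the personal quirks, but the shared skew survives no matter how large the crowd gets.

```python
import random
import statistics


def average_signal(n_users, true_value, shared_bias, noise_sd, seed=0):
    """Mean of n_users observations of true_value + shared_bias + personal noise."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    observations = [
        true_value + shared_bias + rng.gauss(0.0, noise_sd)
        for _ in range(n_users)
    ]
    return statistics.fmean(observations)


# All three values are hypothetical and chosen purely for illustration.
TRUE_VALUE = 0.50   # the "real" relevance of a page
SHARED_BIAS = 0.20  # skew common to every user in the population
NOISE_SD = 0.30     # spread of individual, idiosyncratic quirks

for n in (10, 1_000, 100_000):
    without_bias = average_signal(n, TRUE_VALUE, 0.0, NOISE_SD)
    with_bias = average_signal(n, TRUE_VALUE, SHARED_BIAS, NOISE_SD)
    print(f"n={n:>7}  no shared bias: {without_bias:.3f}  shared bias: {with_bias:.3f}")
```

As the number of users grows, the first column settles near the true value while the second settles near the true value plus the shared skew. More data dilutes the individual, not the society.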
Producing a truly objective, apolitical algorithm may be impossible. While that may be the ideal, it is not really an attainable goal. At least, not yet. And this is not to say that machine learning algorithms, such as RankBrain, aren’t valuable. In fact, it’d be hard to make any sort of case that Google (on the whole) has done anything but become more personalized, more accurate, and more perceptive in terms of search.
The important piece of the whole thing is that we know it’s not perfect. We have to remember that it’s not entirely objective. The real danger comes when we forget.
9 MINUTES READ | May 6, 2020