The realization struck me while holding the hand of my seven-year-old son, standing at the precipice of the highest cliff I had ever looked over. At this moment, his boundless freedom to explore his surroundings took a back seat to his safety. In that precarious and volatile moment, my natural intelligence as a human outweighed philosophical notions of parenting. Anything less would have been artificially stupid.
Machine Learning and Real-World Consequences
Assuming my parental judgment, described above, is sound, we could safely say that most parents, placed in a similar situation, would make a similar judgment call. Suppose it is true that we can make intelligent, rational decisions in the interest of posterity. Why, then, are we so sluggish about transferring this embedded natural intelligence to the machine learning algorithms we develop and deploy in arguably equally precarious business situations?
When AI is your lover — you extrapolate all over the place
Our infatuation with artificial intelligence leads to a mindless disregard for natural intelligence. Unsurprisingly, in the words of Vincent Warmerdam, this makes our machine learning algorithms artificially stupid.
Algorithms merely automate, approximate, and interpolate. Extrapolation is the dangerous part. Vincent Warmerdam, 2019.
The danger of getting emotionally involved
This post pays open homage to Vincent’s enlightening talk from 2019 entitled “How to Constrain Artificial Stupidity”, a topic increasingly deserving of a more watchful eye. What follows is part 1 of a series in which we will take a closer look at several of Vincent’s fixes for Artificial Stupidity in the field of machine learning.
Artificial Stupidity: the lack of real-world consensus (or natural intelligence) reflected in machine learning algorithms.
This complacency around natural intelligence, and around how to implement it in our machine learning models, dumbs down the output of our otherwise ingenious AI creations, with disastrous real-world consequences.
Example of Artificial Stupidity in the Wild
The Boston Housing Data Set is used broadly to run probability tests on the housing market. One of the data columns delineates the “number of black people in your town.” If left unquestioned, running probabilities against this data set will, ironically, reinforce a preexisting bias within the very data thought to provide a “fair” estimation of housing trends.
This example makes strikingly clear how important it is to remain curious about the sources and content of your data before reporting any algorithmic successes.
Artificial: Made or produced by human beings rather than occurring naturally, especially as a copy of something natural.
Stupidity: Behavior that shows a lack of good sense or judgment.
How wrong can an AI Model be?
There are usually two things that can go HorriblyWrong™ with models.
- Models don’t perform well on a metric people are attached to.
- Models do something that you don’t want them to.
My thesis is that the industry is too focused on the performance; we need to worry more about what happens when models fail. Vincent Warmerdam, 2019
Avoiding the AS (Artificial Stupidity) — “Love is Blind” Trap
If the above thesis is confirmed, a stronger focus on understanding why models fail and taking necessary steps to fix them is in order. It would better serve us if we began approaching machine learning like people in physics: study a system until it becomes clear which model will explain everything.
The following is the first in a set of four suggested fixes. The remaining three will follow in future posts.
Fix #1: Predict Less, and more carefully
We must be honest about what AI does. AI does not, in fact, deliver a probability. Honestly put, AI gives us an approximation of a proxy, given certain known factors.
AI cannot determine how unknown factors will influence what we do know. As a result, any missing data or data we are unaware of will dramatically affect our model’s output. Without all the data, we are unable to illustrate at which point the AI model will fail.
This wouldn’t be a problem if machine learning models weren’t always designed to return a result. We need to build safeguards that constrain when a model returns a result, and we need to determine the threshold at which those constraints prevent an artificially stupid prediction.
In short: If we don’t know, don’t predict!
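To make "don't predict" concrete, here is a minimal sketch of one possible safeguard, not the method from Vincent's talk. It assumes a single numeric feature and treats "we don't know" as "the input lies outside the range seen during training". The class name GuardedRegressor, the margin parameter, and the toy numbers are all hypothetical and exist only for illustration.

```python
from statistics import mean


class GuardedRegressor:
    """One-feature least-squares line that abstains outside its training range."""

    def __init__(self, margin=0.0):
        # margin is a hypothetical tolerance (in feature units) beyond the observed range
        self.margin = margin

    def fit(self, xs, ys):
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx
        self.intercept_ = y_bar - self.slope_ * x_bar
        # Remember where we actually have data, so we can refuse to extrapolate.
        self.low_, self.high_ = min(xs), max(xs)
        return self

    def predict(self, x):
        # Return None instead of a number when asked to extrapolate.
        if not (self.low_ - self.margin <= x <= self.high_ + self.margin):
            return None
        return self.intercept_ + self.slope_ * x


# Toy data, made up for this example.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
model = GuardedRegressor().fit(xs, ys)

inside = model.predict(2.5)    # within the training range: a number comes back
outside = model.predict(40.0)  # far outside the training range: the model abstains
```

Returning None forces whoever calls the model to decide what happens when it has no basis for an answer, which is exactly the moment where natural intelligence should step back in.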
Missing data or wrong data means unwittingly solving the wrong problem. In the real world, our model will fail. It’s okay to approach failure with humility, take a step back, and use natural human intelligence to evaluate whether we can come to a more valuable human solution. This humility will help us better articulate what we are solving for. Maybe it will lead us to realize that we missed something in the data or asked the wrong questions of it.
Algorithms merely automate, approximate, and interpolate. Extrapolation is the dangerous part.
Try not to report an AI miracle until we understand when the model will fail.
Fairness at the cost of privacy?
What are the practical implications? If I am looking to build a model that grants the highest possible fairness across my data set, I will need to calculate at what point the model becomes unfair. Having information like gender, race, and income within the data set provides more transparency into how fairness is defined within that specific data set. Baffling as it may be, when we are not honest about how this type of data influences our models and hide instead behind well-intentioned data-privacy conventions, businesses can legitimately refuse transparency into their algorithmic predictions on the grounds of anti-discrimination.
In this way, an algorithm whose original purpose was, for example, to generate greater fairness among demographics in the housing market could become the basis for intensified segregation and systemic racism.
This is ethically debased and demands a solution, something this post is far from providing. Suffice it to say: honest digital business looks different.
At the very least, we need to identify sensitive variables and do our best to correct for them. This means we must do everything we can to better understand the data going into our models.
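One way to start identifying whether a sensitive variable matters is to compare how often the model says "yes" for each group. The sketch below is only an illustration under stated assumptions: predictions are binary (1 means a favourable outcome), the sensitive attribute is available alongside them, and the gap in positive rates (often called a demographic parity difference) is used as one possible fairness signal among many. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions within each group of a sensitive attribute."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(predictions, groups):
    """Difference between the highest and lowest positive rate across groups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy predictions and group labels, made up for this example.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(preds, groups)
gap = parity_gap(preds, groups)
```

Note that this check is only possible if the sensitive column is present in the data. A gap of zero would not prove the model is fair, and a large gap does not say why it exists, but it tells us where to start asking questions.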
If the predictions coming out of your model are your responsibility, so too should be the data going into the model.
Rediscover a Whole new World — Design-Thinking
Having this knowledge raises the stakes of machine learning! Simultaneously, approaching machine learning and AI in this way redeems our whole world around design thinking (read Andreas Wagner’s interpretation of a findability score to get an idea of what I mean!). Suddenly, we are once again the creators of our own design, no longer blindly plugging data into models whose outcomes we are powerless to influence. Understanding and giving merit to the human intelligence behind the models we use positions us to ask critical questions of the data we plug into the model.
As a result, we can move away from a model.fit() and toward more meaningful bespoke models.
In this way, we increase how articulate a system is while at the same time answering questions about assumptions without resorting to basic metrics.
From this perspective, making a model is:
learning from data x whatever constraints I have.
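As a small illustration of "data x constraints", suppose domain knowledge tells us the prediction should never go down as the feature goes up. That rule is an assumption made up for this example. The sketch below fits the best non-decreasing sequence to the observations in the least-squares sense, using the pool-adjacent-violators idea behind isotonic regression. It is one hand-rolled way to encode a constraint, not the approach from the talk and not an API from any library; fit_monotone is a hypothetical name.

```python
def fit_monotone(ys):
    """Least-squares non-decreasing fit to ys, where ys is ordered by the feature.

    Pool-adjacent-violators: whenever a new value breaks the ordering,
    pool it with the block before it and replace both by their mean.
    """
    blocks = []  # each block is [sum of values, number of values]
    for y in ys:
        blocks.append([y, 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Toy observations, made up for this example; the dip at the third value violates the rule.
observed = [1.0, 3.0, 2.0, 4.0]
constrained = fit_monotone(observed)
```

The data still does the learning, but the constraint decides which answers are admissible. Here the dip is smoothed into a plateau instead of being faithfully reproduced.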
Maybe we should start learning to accept that model.fit() is incredibly naive. Perhaps we would be better served if we began approaching machine learning like people in physics: study a system until it becomes clear which model will explain everything. Vincent Warmerdam, 2019
Take a step back and consider for which use case your model should be a proxy. Does it mimic its real-world, naturally intelligent counterpart? Or is your model out to lunch concerning real-world application? Beware: you don’t want to be the person designing an algorithm responsible for quoting less than fair housing rates due to the number of black people in a neighborhood! Which natural thinking person would do that?
Natural Intelligence isn’t such a bad thing
Grant yourself the creative freedom to understand the problem. Your solution design will be better as a result.
Check out Vincent’s open-source project called scikit-lego (an environment to play around with these different types of strategies in real-world scenarios) and his YouTube video which inspired this blogpost.
Artificial Intelligence isn’t such a bad thing if we are willing to bestow credit on the beautiful, natural intelligence which is human. This approach is lacking in our Machine Learning models today. If we implement it intelligently, this natural-intelligence approach has excellent potential to deliver more meaningful results.
We’ll be talking more about the remaining three fixes for artificial stupidity in future posts. Stay with us!!