Harvard ALI Social Impact Review


Can AI Be Fair and Unbiased?

Forty-one states and the District of Columbia are suing Meta, alleging Instagram is addictive and harmful to children. Argentina recently held its first “artificial intelligence” presidential election. Searches for images of CEOs yield fewer pictures of women, and ads for high-paying jobs are less likely to be shown to women. Natural language processing algorithms often encode gender biases in language. In the U.S. healthcare system, commercial algorithms exhibit racial biases that distort health-related decisions.

Can we trust AI to be fair? There are no easy answers, and there is no single definition of fairness. What is considered fair can vary depending on the context. However, some general principles can guide the development and use of algorithms and the artificial intelligence systems built on them.

Fair algorithms should, at a minimum, produce results that are legally and morally acceptable. Responsibility for achieving fairness lies primarily with those who developed the algorithm, but it can also lie with those who apply the algorithm to a specific problem. The data used to train the algorithm also plays a critical role, since it can carry historical structures of bias, prejudice, and inequity.

Fairness is not always a mathematical concept, and there is no simple formula that produces fair algorithms. There are purely mathematical notions of fairness, such as in engineering, but this article focuses on classes of algorithms where fairness is a human value, judged by human beings and therefore involving a degree of subjectivity. This highlights the importance of combining technical and liberal arts skills to develop responsible and ethical uses of algorithms in artificial intelligence.

Fair algorithms should not discriminate based on characteristics that a subject cannot control. Case in point: an algorithm should not make employment decisions based on a person's race, gender, sexual orientation, or disability status. These are characteristics that people cannot choose, and it is unfair to discriminate against them on that basis. Moreover, these characteristics typically define protected classes under United States law.

Responsibility for achieving fairness lies with both the developers and the users of algorithms. Developers have a responsibility to design fair algorithms and to avoid introducing bias into the algorithm. However, algorithmic fairness is not a problem that can be solved by developers alone. Users of algorithms also play a role in ensuring that algorithms are used ethically. For instance, a company that develops a facial recognition algorithm has a responsibility to ensure that the algorithm is not biased against certain groups of people. However, the company that uses the algorithm to secure its premises also has a responsibility to ensure that the algorithm is used fairly. This means that the company should review the algorithm's decisions and make sure that they are not biased.

Technically speaking, there is an important distinction between statistical bias and fairness. An algorithm can be harmful and unfair even when it is statistically unbiased. In popular discussion, the two terms are often used interchangeably, which leads to confusion. It is therefore important to focus the conversation about the ethics of algorithms on the fact that problems can arise even when an algorithm is unbiased in a strictly technical sense.
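A minimal sketch can make this concrete. The example below uses hypothetical synthetic data and reads “statistically unbiased” informally as the average prediction matching the average outcome; the group labels, error rates, and loan framing are all assumptions for illustration, not drawn from any real system.

```python
# Sketch: a predictor can be "unbiased" in aggregate (mean prediction matches
# mean outcome) and still be unfair, because its errors fall mostly on one group.
# All data here is synthetic and hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = np.repeat(["A", "B"], n // 2)
actual = rng.binomial(1, 0.5, size=n)          # 1 = applicant repays; same base rate in both groups

# The predictor is accurate for group A but error-prone for group B.
# Because errors flip labels in both directions, the overall average is preserved.
error_rate = np.where(group == "A", 0.05, 0.30)
flipped = rng.random(n) < error_rate
pred = np.where(flipped, 1 - actual, actual)

print("mean actual:   ", actual.mean())        # ~0.5
print("mean predicted:", pred.mean())          # ~0.5 -> unbiased in aggregate
for g in ["A", "B"]:
    mask = (group == g) & (actual == 1)
    wrongly_denied = (pred[mask] == 0).mean()  # repayers wrongly predicted to default
    print(f"group {g}: share of repayers predicted to default = {wrongly_denied:.2f}")
```

In this toy setup the aggregate numbers look fine, yet roughly 30 percent of qualified applicants in group B are wrongly flagged, versus about 5 percent in group A. That gap is the unfairness the aggregate statistic hides.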

The literature on algorithmic fairness proposes many different definitions, and one common approach is to classify fairness criteria into three categories: independence, separation, and sufficiency.

  • Independence means that the relevant attribute (e.g., race or gender) is statistically independent of the prediction. For example, a loan algorithm should not be more likely to predict that a person will default on a loan simply because she is a woman or a person of color. This is also known as the statistical parity criterion.

  • Separation means that the relevant attribute is statistically independent of the prediction conditional on the actual outcome. In practice, among applicants who actually default (or actually repay), the loan algorithm should make predictions at the same rate for men and women. This is known as the equal opportunity criterion.

  • Sufficiency means that the relevant attribute is statistically independent of the actual outcome conditional on the prediction. People from different groups who receive the same predicted value should have the same distribution of actual outcomes. This means that, for example, if the loan algorithm predicts that two people are equally likely to default on a loan, then their actual probabilities of default should be the same. This is the calibration criterion.

These categories are not mutually exclusive, but it is often difficult to achieve all three simultaneously. By understanding these concepts, one can better develop and evaluate algorithms that are fair to everyone, as the short sketch below illustrates.
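The sketch below shows how each criterion can be checked empirically on a toy dataset. The protected attribute, the loan framing, the score model, and the threshold are all hypothetical assumptions; because the synthetic score here ignores the protected attribute, the three checks come out roughly equal, whereas real-world data would typically show gaps.

```python
# Sketch: empirical checks for independence, separation, and sufficiency
# on hypothetical synthetic loan data. 1 = defaults, pred 1 = predicted to default.
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
group = rng.choice(["men", "women"], size=n)             # hypothetical protected attribute
actual = rng.binomial(1, 0.2, size=n)                    # 1 = actually defaults
score = np.clip(actual * 0.3 + rng.normal(0.3, 0.15, n), 0, 1)
pred = (score > 0.45).astype(int)                        # 1 = predicted to default

def flag_rate(mask):
    return pred[mask].mean()

# Independence (statistical parity): P(pred = 1) should match across groups.
for g in ["men", "women"]:
    print(f"independence  P(pred=1 | {g}) = {flag_rate(group == g):.3f}")

# Separation (equal opportunity): among people who actually default,
# the prediction rate should match across groups.
for g in ["men", "women"]:
    mask = (group == g) & (actual == 1)
    print(f"separation    P(pred=1 | {g}, defaults) = {flag_rate(mask):.3f}")

# Sufficiency (calibration): among people given the same prediction,
# the actual default rate should match across groups.
for g in ["men", "women"]:
    mask = (group == g) & (pred == 1)
    print(f"sufficiency   P(defaults | {g}, pred=1) = {actual[mask].mean():.3f}")
```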

Another common dilemma is the tradeoff between fairness and accuracy. Typically, when a notion of fairness is enforced, the accuracy of the algorithm suffers; how much depends on the fairness metric and on the quality and size of the training dataset. Diversifying datasets to reduce bias may also conflict with privacy, as gathering more detailed data can intrude on individual confidentiality or include sensitive information that could be misused.
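One way to see the tradeoff is sketched below, again on hypothetical synthetic data: when groups have different base rates, forcing statistical parity by adjusting per-group decision thresholds typically lowers overall accuracy. The per-group threshold rule is just one illustrative intervention, not the only or recommended one.

```python
# Sketch of the fairness-accuracy tradeoff on hypothetical synthetic data:
# enforcing equal flag rates across groups lowers overall accuracy when
# the groups' actual base rates differ.
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
group = rng.choice(["A", "B"], size=n)
base = np.where(group == "A", 0.15, 0.35)        # different hypothetical default rates
actual = rng.binomial(1, base)                   # 1 = actually defaults
score = actual * 0.4 + rng.normal(0.3, 0.15, n)  # noisy risk score

def accuracy(pred):
    return (pred == actual).mean()

# Accuracy-oriented rule: one threshold for everyone.
pred_single = (score > 0.5).astype(int)

# Parity-oriented rule: per-group thresholds chosen so both groups are
# flagged at the same overall rate as the single-threshold rule.
target = pred_single.mean()
pred_parity = np.zeros(n, dtype=int)
for g in ["A", "B"]:
    mask = group == g
    thr = np.quantile(score[mask], 1 - target)
    pred_parity[mask] = (score[mask] > thr).astype(int)

for name, pred in [("single threshold", pred_single), ("parity thresholds", pred_parity)]:
    rates = {g: round(pred[group == g].mean(), 3) for g in ["A", "B"]}
    print(f"{name}: accuracy = {accuracy(pred):.3f}, flag rates = {rates}")
```

In this toy setting the parity-adjusted rule equalizes how often the two groups are flagged but misclassifies more people overall, which is the tension the paragraph above describes.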

Algorithms are increasingly used to make important decisions about our lives, and those decisions must be fair. Industry and academia continue to develop new ways to evaluate algorithmic fairness and to ensure that algorithms are used responsibly. By applying these techniques, it is possible to develop algorithms that are fair in their contexts, even if no single definition of fairness applies to all.

We may not have all the answers, but we are starting to ask the right questions.


About the Author:

Paulo Carvão is an accomplished Global Technology Executive with a record of leading large businesses at IBM, where he was a senior leadership team member until 2022. Since then, he has acted as a strategic advisor for technology and go-to-market issues and is a Venture Capital Limited Partner and investment committee member. As a Harvard Advanced Leadership Initiative Fellow, Paulo focuses on the critical intersection of technology and democracy, contributing impactful articles on this pivotal subject. He also previously wrote for the Social Impact Review, “The Supreme Court Has Spoken on Gonzalez v. Google – Now It’s Congress's Turn To Address Section 230.”