What Is Algorithmic Bias?

Algorithmic bias occurs when an algorithm systematically produces unfair outcomes, favoring certain groups or individuals over others. These outcomes often disproportionately impact marginalized populations and can arise from the data, design, or deployment of the algorithm.

Despite the perception of algorithms as objective or neutral, they are often trained on human-generated data, shaped by social systems, and embedded with implicit assumptions.

In short, algorithmic bias is when machines inherit human flaws — and automate them.

Why Algorithmic Bias Matters

Area | Impact
Hiring Systems | May favor one gender or ethnicity
Credit Scoring | Can penalize low-income applicants
Facial Recognition | Performs poorly on darker-skinned individuals
Predictive Policing | Over-targets communities already over-surveilled
Healthcare Algorithms | Can under-treat certain patient groups

In each case, biased algorithms amplify social inequalities — but now, at scale and with the veneer of scientific objectivity.

How Algorithmic Bias Arises

1. Biased Training Data

If your data reflects historical discrimination, the algorithm will learn and replicate it.

Example:
If past hiring records mostly show men in leadership roles, a hiring algorithm may favor male candidates by default.
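To see how a model can absorb this pattern, here is a minimal Python sketch. The hiring records and the naive per-group rate "model" are invented for illustration, not a real system:

```python
# Hypothetical historical hiring records: (gender, hired_into_leadership)
records = [
    ("male", True), ("male", True), ("male", True), ("male", True),
    ("female", False), ("female", False), ("male", True), ("female", False),
]

def hire_rate(group):
    """A naive 'model' that simply replays the historical hire rate per group."""
    outcomes = [hired for g, hired in records if g == group]
    return sum(outcomes) / len(outcomes)

print(hire_rate("male"))    # 1.0  (the pattern the model learns)
print(hire_rate("female"))  # 0.0  (historical exclusion becomes a rule)
```

Nothing in the code mentions discrimination, yet the learned rates encode it exactly, because the data does.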

2. Label Bias

Training labels (the “correct” outcomes) may themselves be biased.

Example:
If loan defaults were disproportionately attributed to a specific zip code, the model may assume residents from there are untrustworthy — regardless of individual behavior.

3. Feature Selection Bias

Choosing which inputs (features) to feed into the model can bake in bias.

Example:
Using zip code in a mortgage model might act as a proxy for race or class, even if race is not explicitly included.
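A quick way to check for a proxy is to ask how well the candidate feature predicts the protected attribute. Below is a small sketch with made-up zip codes and group labels; real proxy audits use the same idea with a proper classifier:

```python
from collections import defaultdict

# Hypothetical applicants: (zip_code, race). Race is never fed to the model,
# but zip code may still encode it.
applicants = [
    ("10001", "A"), ("10001", "A"), ("10001", "A"), ("10001", "B"),
    ("20002", "B"), ("20002", "B"), ("20002", "B"), ("20002", "A"),
]

def proxy_accuracy(pairs):
    """How often the majority race within each zip code matches the true race.
    Values well above chance suggest the feature acts as a proxy."""
    by_zip = defaultdict(list)
    for z, r in pairs:
        by_zip[z].append(r)
    correct = sum(
        max(sum(1 for r in races if r == c) for c in set(races))
        for races in by_zip.values()
    )
    return correct / len(pairs)

print(proxy_accuracy(applicants))  # 0.75, well above the 0.5 chance level
```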

4. Sample Imbalance

If one group is underrepresented in the training set, the model won’t generalize well to them.

Example:
Facial recognition systems trained mostly on datasets of white male faces perform poorly on women and people of color.
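Per-group error analysis makes this kind of imbalance visible. The sketch below uses invented accuracy numbers (loosely echoing reported disparities), not real benchmark results:

```python
def accuracy_by_group(results):
    """Compute per-group accuracy from (group, correct) pairs."""
    groups = {}
    for group, correct in results:
        hits, total = groups.get(group, (0, 0))
        groups[group] = (hits + correct, total + 1)
    return {g: hits / total for g, (hits, total) in groups.items()}

# Hypothetical evaluation results for two demographic groups
results = (
    [("lighter_male", True)] * 98 + [("lighter_male", False)] * 2 +
    [("darker_female", True)] * 65 + [("darker_female", False)] * 35
)
print(accuracy_by_group(results))
# {'lighter_male': 0.98, 'darker_female': 0.65}
```

An aggregate accuracy of 81.5% would hide this gap entirely, which is why disaggregated evaluation matters.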

5. Feedback Loops

Biased outputs reinforce biased inputs.

Example:
Predictive policing tools send officers to the same neighborhoods repeatedly → more arrests recorded there → algorithm assumes higher crime rate → continues targeting same area.
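The loop above can be simulated in a few lines. Everything here is a toy model: two neighborhoods with identical true crime rates, a slightly skewed starting patrol allocation, and a rule that shifts patrols toward wherever more arrests were recorded:

```python
# Two neighborhoods with the SAME underlying crime rate
true_crime_rate = {"A": 0.1, "B": 0.1}
patrols = {"A": 52, "B": 48}  # slightly skewed initial allocation of 100 patrols

for step in range(10):
    # Recorded arrests scale with patrols present, not with true crime
    arrests = {n: patrols[n] * true_crime_rate[n] for n in patrols}
    # The "model" shifts patrols toward the neighborhood with more arrests
    hot = max(arrests, key=arrests.get)
    cold = min(arrests, key=arrests.get)
    shift = min(5, patrols[cold])
    patrols[hot] += shift
    patrols[cold] -= shift

print(patrols)  # {'A': 100, 'B': 0}: the small initial skew has become total
```

Because the system only ever observes where it looks, a 52/48 split collapses into 100/0 even though the two neighborhoods were identical.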

Case Studies

1. Amazon’s Resume Screening AI (2018)

  • Trained on 10 years of hiring data.
  • Learned to downgrade resumes that included the word “women’s” (as in “women’s chess club captain”).
  • Favored resumes with male-dominant language.
  • Result: Gender bias at scale.

2. COMPAS Algorithm for Criminal Risk

  • Used in U.S. courts to predict recidivism.
  • A ProPublica analysis found Black defendants were nearly twice as likely as white defendants to be incorrectly labeled high risk (flagged, but did not reoffend).
  • White defendants who did reoffend were more likely to have been labeled low risk.

3. Apple Card Credit Limits

  • Reports showed women receiving significantly lower credit limits than men, even with equal or better financials.
  • Apple and Goldman Sachs denied that gender factored into decisions, but limited transparency into the training data and model made the claim difficult to verify.

Algorithmic Bias vs Human Bias

Aspect | Human Bias | Algorithmic Bias
Source | Conscious or unconscious behavior | Data, design, or deployment
Visibility | Sometimes overt | Often hidden or “black box”
Speed | Individual and slow | Instant and scalable
Accountability | Traceable to individuals | Often blamed on “the model”

Key danger: Algorithmic bias can appear objective, making it harder to challenge.

Identifying Algorithmic Bias

  1. Data Auditing
    • Check for imbalances, stereotypes, or missing groups.
  2. Fairness Metrics
    • Statistical parity
    • Equalized odds
    • Calibration
    • False positive/negative rates by subgroup
  3. Model Explainability
    • Use tools like SHAP, LIME, or Feature Importance to interpret decisions.
  4. Counterfactual Testing
    • Would changing race/gender/zip code change the outcome, holding all else constant?
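Several of the fairness metrics above can be computed directly from (group, actual, predicted) triples. The sketch below shows selection rate (for statistical parity) and per-group TPR/FPR (for equalized odds); the data is invented for illustration:

```python
def rates_by_group(rows):
    """rows: (group, y_true, y_pred) triples. Returns per-group selection
    rate, true-positive rate, and false-positive rate."""
    out = {}
    for g in {r[0] for r in rows}:
        sub = [(y, p) for grp, y, p in rows if grp == g]
        pos = [(y, p) for y, p in sub if y == 1]
        neg = [(y, p) for y, p in sub if y == 0]
        out[g] = {
            "selection_rate": sum(p for _, p in sub) / len(sub),
            "tpr": sum(p for _, p in pos) / len(pos) if pos else None,
            "fpr": sum(p for _, p in neg) / len(neg) if neg else None,
        }
    return out

# Hypothetical predictions: (group, actual, predicted)
rows = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 1),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
print(rates_by_group(rows))
```

Here group A is selected at rate 0.75 versus 0.25 for group B (a statistical parity gap), and the groups differ in both TPR and FPR (an equalized odds violation).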

Mitigating Algorithmic Bias

Strategy | Description
Bias-aware data collection | Include diverse, balanced data
Fair feature selection | Avoid proxies for protected attributes
Re-weighting or resampling | Balance datasets for underrepresented groups
Fairness constraints in training | Penalize unfair outcomes in the model loss function
Post-processing corrections | Adjust outputs to equalize fairness metrics
Human-in-the-loop | Use human judgment alongside model outputs
Model transparency & documentation | Provide model cards, data sheets, and audit trails
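The re-weighting strategy can be sketched in the style of Kamiran and Calders' reweighing technique: give each (group, label) cell a weight that makes group and label look statistically independent in the weighted data. The rows here are invented:

```python
from collections import Counter

def reweighing(rows):
    """Kamiran-Calders-style weights for (group, label) pairs:
    w = P(group) * P(label) / P(group, label)."""
    n = len(rows)
    g_cnt = Counter(g for g, _ in rows)
    y_cnt = Counter(y for _, y in rows)
    gy_cnt = Counter(rows)
    return {
        (g, y): (g_cnt[g] / n) * (y_cnt[y] / n) / (cnt / n)
        for (g, y), cnt in gy_cnt.items()
    }

# Group A gets the positive label 3x as often as group B
rows = [("A", 1)] * 6 + [("A", 0)] * 2 + [("B", 1)] * 2 + [("B", 0)] * 6
print(reweighing(rows))
```

Over-represented cells like ("A", 1) get weights below 1 and under-represented cells like ("A", 0) get weights above 1, so a model trained on the weighted data no longer sees group membership as predictive of the label.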

Fairness Trade-Offs

There’s no one-size-fits-all definition of fairness. Improving one type of fairness can reduce another.

Trade-off | Conflict
Accuracy vs Fairness | Enforcing fairness constraints may reduce overall accuracy
Group Fairness vs Individual Fairness | Treating groups equally may treat individuals unequally
Short-term vs Long-term Fairness | Correcting historical bias can appear unequal in the short term

This reflects formal impossibility results in fairness research: several common fairness criteria (such as calibration and equal error rates across groups) cannot all be satisfied simultaneously except in trivial cases, so you must choose which definition of fairness matters most in context.

Tools and Libraries for Bias Detection

  • IBM AI Fairness 360
  • Fairlearn (Microsoft)
  • Google What-If Tool
  • Aequitas
  • H2O.ai Explainability modules
  • SHAP / LIME for model interpretation

Regulation and Ethics

Many governments and organizations are now addressing algorithmic bias through:

  • AI Ethics Guidelines (OECD, UNESCO, EU)
  • Algorithmic Impact Assessments
  • Transparency Requirements in public sector tools
  • Right to Explanation (GDPR)
  • Audit requirements for sensitive algorithms (e.g., lending, employment)

Summary

  • Algorithmic bias refers to systematic unfairness in machine decision-making.
  • It originates from biased data, poor design choices, and feedback loops.
  • It can have serious consequences in hiring, finance, justice, healthcare, and more.
  • Bias must be audited, measured, and mitigated — not ignored.
  • There is no single metric of fairness — context and human judgment are essential.

“An algorithm is only as fair as the world it learns from — or the people who build it.”

Related Keywords

  • Data Bias
  • Fairness in Machine Learning
  • Ethical AI
  • Black Box Algorithms
  • Predictive Policing
  • Human-in-the-Loop
  • AI Regulation
  • Explainable AI
  • Model Transparency
  • Disparate Impact
  • Representation Bias
  • Confirmation Bias
  • Proxy Variables
  • Data Auditing
  • AI Fairness Tools
  • Social Impact of Algorithms
  • Model Accountability
  • Equity in AI
  • Discrimination by Design
  • Feedback Loop Bias