Bias and fairness - a rising issue in modern Artificial Intelligence systems

Introduction:

AI systems today enable us to see, understand and interact with the world in ways that were unimaginable a decade ago, and these systems are developing at an extraordinary pace. Yet despite these advancements, AI is not infallible (incapable of making mistakes). Developing responsible AI requires an understanding of the possible issues, limitations and unintended consequences.

Technology is a reflection of what exists in society, so without good practices, AI can replicate existing issues and biases and amplify them. There is no single definition of responsible AI, nor is there a simple checklist or formula that defines how responsible AI practices should be implemented. Instead, every organization has its own AI principles that reflect its organizational values, and these are unique to each organization. If we look for common themes, however, we find a consistent set of ideas:

  • Transparency

  • Fairness

  • Accountability

  • Privacy

From design to deployment, the decisions you make have an impact; therefore, we need a defined and repeatable process for using AI responsibly.

Why do we need Responsible AI?

A common misconception about AI is that the machines make the central decisions, but in reality, it is people who design these machines and decide how they are used.

People are involved in every aspect of AI development: they collect the data, control the deployment of these systems, and ultimately customers and other people use them to support their own decision-making.

Every time a person makes a decision, they base it on their own values. Whether it is the decision to use generative AI to solve a problem as opposed to other methods, or any choice made throughout the machine learning lifecycle, that person introduces their own set of values. This means that every decision point, from concept through deployment and maintenance, requires careful consideration and evaluation to ensure choices have been made responsibly. Because AI has the potential to impact many areas of society, not to mention people's daily lives, it is important to develop these technologies with ethics in mind.

Responsible AI does not mean focusing only on the obviously controversial use cases; there are responsible AI practices even for seemingly innocuous (apparently harmless) use cases.

Even projects built with good intent may still cause ethical issues and unintended outcomes, or not be as beneficial as they could be. Ethics matter not only because they represent the right thing to do, but also because they can guide AI designs to be more beneficial for people's lives.

We should also recognize that responsible AI means successful AI. Take, for example, the seven principles put forward by Google, which actively govern its research and product development and affect its business decisions. They not only help in developing responsible AI but also help teams make better decisions: even when a team member doesn't agree with a particular decision, they can still align with and trust the process.

  1. AI should be socially beneficial. (any project should take into account a broad range of social and economic factors, and should proceed only if the overall benefits substantially exceed the foreseeable risks)

  2. AI should avoid creating or reinforcing unfair bias. (avoid unjust effects on people, especially those related to sensitive characteristics such as race, ethnicity, gender, nationality, income, sexual orientation, ability, and political or religious belief)

  3. AI should be built and tested for safety. (apply strong safety practices to avoid unintended results)

  4. AI should be accountable to people. (AI systems should provide appropriate opportunities for feedback)

  5. AI should incorporate privacy design principles. (provide sufficient transparency and control over the use of data)

  6. AI should uphold high standards of scientific excellence. (work with a range of stakeholders to promote thoughtful leadership in this area, drawing on scientifically rigorous, multi-disciplinary approaches)

  7. AI should be made available for uses that accord (are consistent) with these principles. (many technologies have multiple use cases, so work to limit potentially harmful or abusive applications)

BIAS:

  • for example, take an image of watermelon slices:

  • what is in this image?

  • many will say it is a watermelon, slices of watermelon, juicy watermelon, and so on, but hardly anyone says "red watermelon"

  • shown a yellow watermelon, however, we would describe it as a "yellow watermelon"

  • why did we not say red watermelon initially? When we see an image, we tend to describe it in the most general sense, based on what we are habituated to seeing: just "watermelon" rather than "red watermelon". This is due to our own biases, for example geographical ones: if we have never seen a yellow watermelon, then the red watermelon is the only watermelon. For most of us, red is the prototypical colour of a watermelon, the colour we expect to see.

  • We label and categorize the world to reduce complex sensory inputs into simplified groups that are easier to work with.

  • prototypes are "typical" representations of a concept or object.

  • We tend to notice and talk about things that are atypical.

  • Bias and stereotypes arise when particular labels and features confound (confuse or distort) decisions, whether human or artificial.

  • for example, one of the most well-known biases has arisen in facial recognition

  • this form of algorithmic bias can arise in many different ways:

  • for example, consider an image of a bride that was passed into a CNN trained on the open-source large-scale image dataset "COCO":

  • but when an image of a bride from South Asia was passed into the same CNN, the results were completely different:

  • the predicted classes were totally unrelated, not even "human being". This is very concerning and not the behaviour we expect from deep learning on a problem we might think of as solved, in this case image classification.

  • now you may ask: what are the actual drivers, the causes, of such a misclassification?

  • an analysis was done of how this bias correlates with income and geography.

  • it was seen that the data was drawn largely from the upper end of the income range. What could the other sources be?

  • it was also seen that the data was collected mostly from North American and European countries, making the dataset biased toward those regions, even though the majority of the world's population lives in East and Southeast Asia

  • this example shows clearly how bias can exist at multiple levels of an AI or deep learning pipeline

Bias in stages of the AI pipeline:

  • Bias can exist in multiple stages of an AI pipeline and can poison every stage of the model:

    • Data: imbalances with respect to class labels, features, input structures

    • Model: lack of unified uncertainty, interpretability and performance metrics

    • Training and deployment: feedback loops can perpetuate biases

    • Evaluation: done in bulk, lack of systematic analysis with respect to data subgroups

    • Interpretation: human errors and biases distort meaning of results

  • The types of biases that occur in each stage of the AI pipeline are:

    • Data Biases

      1. Historical Bias

        • Explanation: Historical bias occurs when the data reflects historical inequalities and prejudices. This bias is embedded in the dataset due to past practices or societal norms.

        • Example: If historical hiring data shows a preference for male candidates, a model trained on this data will likely perpetuate this bias.

      2. Representation Bias

        • Explanation: Representation bias happens when certain groups are underrepresented or overrepresented in the data. This can lead to models that perform poorly for underrepresented groups.

        • Example: If a facial recognition dataset has more images of light-skinned individuals than dark-skinned individuals, the model will likely perform better on light-skinned individuals.

      3. Measurement Bias

        • Explanation: Measurement bias occurs when the data collected does not accurately reflect the real-world values. This can happen due to poor measurement tools or inconsistent data collection methods.

        • Example: If survey data on income is self-reported, it may not accurately reflect true income levels due to misreporting or misunderstanding of the survey questions.

      4. Temporal Bias

        • Explanation: Temporal bias occurs when data becomes outdated over time. Models trained on outdated data may fail to capture current trends or behaviors.

        • Example: A model predicting product demand based on data from before the COVID-19 pandemic may not perform well during or after the pandemic due to shifts in consumer behavior.

      5. Omitted Variable Bias

        • Explanation: Omitted variable bias happens when a relevant variable is left out of the data or model. This can lead to incorrect inferences about relationships between variables.

        • Example: If a model predicting student performance omits socioeconomic status, it might incorrectly attribute performance differences solely to other factors like school quality.

AI/ML Biases

  1. Algorithmic Bias

    • Explanation: Algorithmic bias arises from the design of the algorithm itself, including how it processes data and makes decisions. This can include biases in the training process, hyperparameter selection, and optimization criteria.

    • Example: An algorithm designed to maximize accuracy might disproportionately misclassify minority class members if the classes are imbalanced.

  2. Evaluation Bias

    • Explanation: Evaluation bias occurs when the metrics used to assess model performance do not capture the true performance across different groups. This can lead to misleading conclusions about model effectiveness.

    • Mathematical Example: Precision and recall are common evaluation metrics. If precision is high but recall is low for a minority group, the model may be considered effective overall but still be biased against that group.

  3. Aggregation Bias

    • Explanation: Aggregation bias arises when data from different groups are combined without considering differences between those groups. This can mask performance disparities.

    • Mathematical Example: If a dataset combines data from two regions with different income distributions, a model trained on this aggregated data may not accurately predict income for either region.

  4. Popularity Bias

    • Explanation: Popularity bias occurs when the model disproportionately favors popular items or behaviors present in the data. This can lead to a reinforcement of the popularity of these items or behaviors.

    • Example: A recommendation system might suggest popular products more frequently, ignoring niche items that might be of interest to specific users.

  5. Ranking Bias

    • Explanation: Ranking bias happens when the model's ranking mechanism disproportionately favors certain items or individuals based on biased criteria.

    • Example: Search engines might rank websites higher based on biased click-through data, perpetuating the visibility of certain types of content.

  6. Emergent Bias

    • Explanation: Emergent bias arises when the system's interactions with users lead to unexpected and biased outcomes. This can happen when user behavior evolves in response to the system.

    • Example: A social media algorithm that prioritizes engaging content might inadvertently promote sensational or misleading information if users interact with it more frequently.

  7. Linking Bias

    • Explanation: Linking bias occurs when the connections or relationships in the data lead to biased outcomes. This can happen in network-based models or when integrating multiple data sources.

    • Example: In a network analysis, certain nodes might be overrepresented due to biased sampling methods, leading to skewed insights about network structure.

Human Review Biases

  1. Behavioral Bias

    • Explanation: Behavioral bias arises from the actions and decisions of humans reviewing the AI/ML output. This can include cognitive biases such as confirmation bias or anchoring.

    • Example: A human reviewer might be more likely to accept model outputs that align with their pre-existing beliefs, disregarding outputs that contradict them.

  2. Presentation Bias

    • Explanation: Presentation bias occurs when the way information is presented influences the decisions made based on that information. This can include visualizations, report formats, or user interfaces.

    • Example: A dashboard that highlights certain metrics over others can lead to biased decision-making if critical information is downplayed or omitted.

  3. Content Production Bias

    • Explanation: Content production bias arises when the content created based on model outputs is biased. This can be due to biased writing, selection of examples, or framing.

    • Example: News articles generated from data might reflect the biases of the data and the journalists, perpetuating certain stereotypes or viewpoints.

  4. Social Bias

    • Explanation: Social bias occurs when societal norms and values influence the interpretation and use of model outputs. This can include cultural biases, prejudices, or institutional biases.

    • Example: A hiring algorithm might be used in a way that perpetuates existing gender biases in certain industries, even if the algorithm itself is unbiased.

  • Broad taxonomy(classification) of biases:

    • Data driven:

      • Selection Biases: data selection does not reflect randomization. ex: class imbalance

      • Sampling Biases: Particular data instances are more frequently sampled. ex: hair, skin tone

      • Reporting Biases: What is shared does not reflect real likelihood. ex: news coverage

    • Interpretation driven:

      • Correlation Fallacy: Correlation != Causation

      • Overgeneralization: "General" conclusions drawn from limited test data

      • Automation Biases: AI generated decisions are favored over human generated decisions

Detailing on the taxonomy of biases:

Correlation fallacy:

  • Here we have two plots. Suppose we want to predict the number of CS doctorates awarded next year, so we decide to use the trend of the other plot, since the two curves look nearly identical; but the other graph actually shows the revenue generated by arcades.

  • Even though the plots follow a similar trend, arcade revenue clearly has nothing to do with the underlying cause of the quantity we care about, computer science doctorates in the US. A model that takes arcade revenue as input and outputs CS doctorates could easily break down, because it cannot capture the fundamental driving force behind the trend in the variable we are trying to predict.

  • So the correlation fallacy is not just about Correlation != Causation: it can also generate and perpetuate biases when correlations are wrongly or carelessly used.

Overgeneralization:

  • For example, suppose we want to train a model to detect mugs, so we train it on a wide range of images containing mugs. But when we ran detection on real-life scenes, the model performed poorly, because we never accounted for such viewing angles when collecting the mug images.

  • So our model shows poor performance on mugs that were under-represented in the training data, even though we expected it to generalize.

  • This phenomenon is often referred to as "distribution shift", and it can bias networks toward worse performance on examples they have not encountered before.

  • One strategy proposed to address distribution shift is to start with a dataset and then construct an improved dataset that already accounts for potential shifts. This is done by specifying example sets of images for training and then shifting with respect to a particular variable to construct the test dataset:

  • in this case the distribution shift is with respect to time or geographic region

  • or, in the case of medical images, this means sourcing images from different hospitals for each of the train, validation and test sets

  • Now datasets like this help mitigate generalization biases because they inherently impose the necessity of testing your model on a distribution-shifted set of examples; a minimal sketch of such a split is shown below.
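As a rough illustration, here is a minimal sketch of a distribution-shifted split. It assumes the image metadata lives in a pandas DataFrame with a hypothetical `hospital` column as the shift variable; whole hospitals are held out so the test set comes only from sources never seen in training:

```python
# Sketch: build a distribution-shifted evaluation split by grouping on a shift
# variable (a hypothetical "hospital" column), so test hospitals never appear
# in training.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical metadata table: one row per image, with its source hospital.
df = pd.DataFrame({
    "image_path": [f"scan_{i}.png" for i in range(8)],
    "label":      [0, 1, 0, 1, 0, 1, 0, 1],
    "hospital":   ["A", "A", "B", "B", "C", "C", "D", "D"],
})

# Hold out whole hospitals instead of splitting individual rows at random.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(df, groups=df["hospital"]))

print("train hospitals:", sorted(df.loc[train_idx, "hospital"].unique()))
print("test hospitals: ", sorted(df.loc[test_idx, "hospital"].unique()))
```

The same pattern works for any shift variable (time period, geographic region, and so on): the point is that the evaluation set is deliberately drawn from groups the model never trained on.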

Data driven Biases - class imbalance:

  • Class imbalance is one of the most pervasive (widespread) biases present in data.

  • Imagine we have a distribution over three classes, where the first graph shows the distribution in the real population. Now suppose that in our dataset the class distribution is not the same: this is a class imbalance bias, and a model trained on that dataset will be biased toward the distribution of the dataset rather than the real population, leading to partial and poor decisions on real-life examples.

  • This leads to higher accuracy on the frequently occurring classes.

  • This is not what we desire; instead we want unbiased performance, with accuracy that is consistent across all the classes.

  • To see the root cause of this type of problem, suppose we build a classifier that classifies data points into blue and orange classes.

  • Here our dataset is imbalanced such that for every 1 orange point there are 20 blue points.

  • During learning with gradient descent, incremental updates are made to the classifier on the basis of the data it has observed.

  • As the data points arrive, and because our dataset is class imbalanced, more and more of them are blue, so the decision boundary keeps moving accordingly.

  • Now, when we occasionally see an orange point, the decision boundary adjusts itself to account for the new observation.

  • In the end the classification boundary ends up occupying far more of the space for blue than for orange, making the classifier biased toward blue. This shows how class imbalance can lead to classification that is biased toward the majority class, which is a very common problem.

  • One prominent example of such a bias is the classification of cancer patients:

  • our task is to train a CNN to detect glioblastoma from MRI scans of the brain, and suppose the class incidence in our dataset reflects the real-world incidence of this disease, meaning that in a dataset of 100,000 images, only 3 of them actually show a brain tumor

  • Remember that a classification model is ultimately trained to optimize its classification accuracy, so this model can simply fall back to predicting "healthy" all the time and still reach roughly 99.99 percent accuracy (99,997 correct predictions out of 100,000)

  • as that is the rate at which "healthy" occurs in the dataset

  • So how do we mitigate this?

  • there are two approaches to mitigate this: batch selection and example weighting

    • Batch Selection:

      • Select and feed the model batches that are class balanced (1:1)

      • Now the decision boundary sees balanced samples and keeps adjusting itself as an increasing number of batches is fed to the model.

      • Balanced batches give more information

    • Example Weighting:

      • if the dataset is imbalanced, the weight assigned to each example is the inverse of its class frequency, so that we reduce the influence of the majority class and increase the influence of the minority class. As a result, each class contributes equally to the model's learning process, effectively producing a class-balanced training signal (a sketch of both approaches follows below).

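Here is a minimal NumPy sketch of both ideas on a toy imbalanced label array; the class ratio and batch size are arbitrary choices for illustration:

```python
# Sketch of the two mitigation strategies above: (1) inverse-frequency example
# weights, (2) class-balanced batch selection.
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0] * 950 + [1] * 50)      # toy 19:1 imbalanced dataset

# --- Example weighting: weight each example by the inverse of its class frequency ---
classes, counts = np.unique(labels, return_counts=True)
class_weight = {c: len(labels) / (len(classes) * n) for c, n in zip(classes, counts)}
example_weights = np.array([class_weight[y] for y in labels])
# Majority-class examples get small weights, minority-class examples large ones.

# --- Batch selection: draw the same number of examples from every class ---
def balanced_batch(labels, per_class=16):
    picks = [rng.choice(np.flatnonzero(labels == c), size=per_class, replace=True)
             for c in classes]
    return np.concatenate(picks)

batch = balanced_batch(labels)
print(np.bincount(labels[batch]))            # [16 16]: a 1:1 batch
```

In a real training loop the example weights would be passed to the loss function (most frameworks accept per-example or per-class weights), and the balanced batches would be drawn fresh at every step.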

Bias within features:

  • What if our classes are balanced and there are still biases?

  • Can there be biases within a class?

  • Hidden biases within a class can be harder to identify and even more dangerous; they reflect a lack of diversity in the within-class feature space of the data.

  • For example: Hidden bias in Facial Detection

  • by training a CNN on such biased data, our model ends up being biased as well

  • These kinds of biases manifest quite strongly in commercial facial detection and classification systems

Mitigate biases and improve fairness:

A newer idea is to use machine learning algorithms themselves to improve the fairness of the data: a model is trained to remove the aspects of the signal that are contributing to unwanted bias.

Instead of removing signal, we can also add back signal for greater inclusion of underrepresented regions of the data or particular demographics (statistical characteristics of human populations).

Bias and fairness in Supervised Classification:

  • A classifier's output decision should be the same across sensitive characteristics, given what the correct decision should be.

  • A classifier f_θ(x) is biased if its decision changes after being exposed to additional sensitive feature inputs; it is fair with respect to a variable z if its decision does not depend on z.

  • For example, for a single binary variable z, fairness means the probability of a correct decision is the same whether z = 0 or z = 1.
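One common way to write these two conditions, with y the correct label and z the sensitive attribute, is:

```latex
% A classifier f_theta is fair with respect to z if its probability of making
% the correct decision does not depend on z:
\[
  P\big(f_\theta(x) = y \mid z\big) \;=\; P\big(f_\theta(x) = y\big) \quad \text{for all values of } z .
\]
% For a single binary variable z, this reduces to:
\[
  P\big(f_\theta(x) = y \mid z = 0\big) \;=\; P\big(f_\theta(x) = y \mid z = 1\big) .
\]
```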

Evaluating Bias and Fairness:

  • Evaluating performance separately with respect to different subgroups or demographics is called disaggregated evaluation.

  • For example, if we are working with coloured shapes, this can be done with respect to the colour feature keeping the shape constant, or with respect to the shape feature keeping the colour constant.

  • Evaluating performance with respect to subgroup interactions is called intersectional evaluation; a small sketch of both kinds of evaluation follows.
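A minimal pandas sketch of both kinds of evaluation on toy predictions; the column names (`skin_tone`, `gender`) are hypothetical stand-ins for whatever subgroup annotations a benchmark provides:

```python
# Sketch: disaggregated vs. intersectional evaluation of a classifier's accuracy.
import pandas as pd

results = pd.DataFrame({
    "y_true":    [1, 0, 1, 1, 0, 1, 0, 1],
    "y_pred":    [1, 0, 0, 1, 0, 1, 1, 0],
    "skin_tone": ["light", "light", "dark", "dark", "light", "dark", "dark", "light"],
    "gender":    ["F", "M", "F", "M", "F", "M", "F", "M"],
})
results["correct"] = results["y_true"] == results["y_pred"]

# Disaggregated evaluation: one metric per subgroup, one variable at a time.
print(results.groupby("skin_tone")["correct"].mean())

# Intersectional evaluation: one metric per combination of subgroups.
print(results.groupby(["skin_tone", "gender"])["correct"].mean())
```

Any metric (precision, recall, false-positive rate) can be substituted for accuracy; the key point is reporting it per subgroup and per subgroup combination rather than as a single aggregate number.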

Recent developments that use deep learning to mitigate bias(Supervised Classification):

Adversarial Multi-task Learning:

  • First, we as human users specify an attribute z for which we seek to mitigate bias, and the network jointly predicts the output y and z.

  • So a particular input x is passed into the network via embedding and hidden layers.

  • At the output, the network has two heads, one for each prediction task: predicting the target label y, and predicting the value of the sensitive attribute z that we are trying to debias against.

  • Our goal is to remove any confounding effect of this sensitive attribute on the outcome of the task prediction.

  • This removal is achieved by posing an adversarial (opposing) objective during training, specifically by negating the gradient from the attribute-prediction head during backpropagation; the effect is to remove the influence of the attribute on the prediction task (see the sketch after this list).

  • This approach was first applied to a language-modeling problem, where the specified sensitive attribute was gender and the task of interest was analogy completion, where the goal is to predict the word that is likely to complete an analogy (comparison).

  • One limiting factor of this approach is that the sensitive attribute is determined by humans, which is limiting in two ways:

    • There could be hidden and unknown biases that are not apparent from the outset, which we ultimately also want to remove.

    • By defining the bias ourselves, we humans may inadvertently (accidentally) propagate our own biases by telling the model what we think it is biased against.

  • So what we want is an automated system that can define and uncover potential biases in the data without any annotation or specification.
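A minimal PyTorch-style sketch of the adversarial two-head setup described above; the layer sizes, the lambda scaling factor, and the module names are illustrative assumptions, not the architecture from the original work:

```python
# Sketch: shared encoder with a task head (predicts y) and an attribute head
# (predicts z); the gradient from the attribute head is negated on its way
# back into the encoder, so the encoder learns features that are useful for y
# but uninformative about z.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Negate (and scale) the gradient flowing back from the attribute head.
        return -ctx.lam * grad_output, None

class DebiasedClassifier(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=2, n_attrs=2, lam=1.0):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.task_head = nn.Linear(hidden, n_classes)   # predicts the target label y
        self.attr_head = nn.Linear(hidden, n_attrs)     # predicts the sensitive attribute z

    def forward(self, x):
        h = self.encoder(x)
        y_logits = self.task_head(h)
        # The attribute head sees h through the gradient-reversal op, so during
        # backpropagation the encoder receives the negated gradient from this head.
        z_logits = self.attr_head(GradReverse.apply(h, self.lam))
        return y_logits, z_logits

# Training would minimize cross-entropy on both heads; because of the reversed
# gradient, the encoder is pushed to make z hard to predict from its features.
model = DebiasedClassifier()
y_logits, z_logits = model(torch.randn(8, 32))
```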

Adaptive Resampling for Automated Debiasing:

  • This is a perfect use case for generative models, specifically those that can learn and uncover the underlying latent structure of the dataset.

  • For example, in facial detection, if we are given a dataset with many different faces, we may not know the exact distribution of particular latent variables in this dataset. There could be imbalances with respect to these latent variables, for example face pose or skin tone, that could result in unwanted bias in our downstream model.

  • Using generative models, we can uncover underrepresented and overrepresented regions of the latent landscape and use this information to mitigate some of these biases.

Learn latent structure:

  • We can achieve this by using a variational autoencoder structure, which can learn the underlying latent structure of the dataset in a completely unbiased and unsupervised manner.

  • For example, in the case of face images, the model picked up on variables like orientation, which were never specified to it, and learnt them as particular latent variables simply by looking at a lot of different examples of faces and recognising that this was an important factor.

Estimate Distribution:

  • Once we know the structure of these latent variables, we can estimate their distribution, meaning how frequently the different values of these latent variables occur across the dataset.

  • For example, if our dataset has more images of a certain skin tone, those will be overrepresented, so the likelihood of selecting an image with that skin tone during training will be unfairly high, resulting in unwanted bias in favour of those types of faces.

  • Conversely, faces with rarer features, such as shadows, glasses, hats, or darker skin, may be underrepresented in the data, so the likelihood of selecting instances with these features to train the model will be low.

Adaptively Resample data:

  • The model then adjusts the sampling probabilities of individual data instances to reweight them during the training process itself.

  • This resampling produces a fairer, more representative dataset for training.

Math behind the resampling:

  • Approximate the distribution of the latent space with a joint histogram over the individual latent variables.

  • Based on the estimated joint distribution, we can then define an adjusted probability for sampling a particular datapoint x during training (a sketch of this computation follows the list).

  • Using this approach and applying it to facial detection, we can increase the probability of resampling faces that have underrepresented features.

  • The power of this approach is that the resampling is based on features that are learnt automatically, so no human annotation of the attributes or biases is needed, which makes it more generalizable; it also allows debiasing against multiple factors simultaneously.

  • To evaluate how well this algorithm mitigates bias, it was tested on a benchmark dataset for the evaluation of facial detection systems that is balanced with respect to male and female faces as well as skin tone.

  • To determine the degree of bias present, performance is evaluated across subgroups of the dataset, grouped on the basis of the male/female annotation and the skin tone annotation.

  • Performance without any debiasing was then compared against the debiased approach with different values of the debiasing parameter.
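A rough NumPy sketch of the resampling math: per-dimension histograms approximate the latent distribution, and one common formulation gives each datapoint a sampling weight proportional to the product over latent dimensions of 1 / (estimated density + α), so rare latent values are sampled more often. The latent codes below are random stand-ins for an encoder's output, and α plays the role of the debiasing parameter mentioned above:

```python
# Sketch: histogram-based adaptive resampling over learnt latent variables.
import numpy as np

rng = np.random.default_rng(0)
latents = rng.normal(size=(10_000, 4))       # z(x): latent code per training image
n_bins, alpha = 10, 1e-3                     # histogram resolution and debiasing parameter

weights = np.ones(len(latents))
for d in range(latents.shape[1]):
    # Estimate the density of latent dimension d with a histogram.
    hist, edges = np.histogram(latents[:, d], bins=n_bins, density=True)
    bin_idx = np.clip(np.digitize(latents[:, d], edges[1:-1]), 0, n_bins - 1)
    # Rare latent values (low density) receive larger weights.
    weights *= 1.0 / (hist[bin_idx] + alpha)

probs = weights / weights.sum()              # adjusted sampling probabilities
batch = rng.choice(len(latents), size=64, p=probs, replace=False)
```

Larger values of α flatten the weights back toward uniform sampling (no debiasing), while smaller values resample the rare regions of the latent space more aggressively.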
