# Learning fair rule lists with FairCORELS

Using Python to learn fair rule lists with FairCORELS

Ulrich Aïvodji https://aivodji.github.io/
03-29-2021

FairCORELS is an open-source library for learning fair rule lists. In this short tutorial, we will learn how to use it to train rule lists under group fairness constraints.

# Prerequisites

## Group fairness

Several notions of fairness have been proposed in the machine learning literature . This tutorial will focus on statistical notions of fairness , also known as group fairness definitions. Simply put, statistical notions of fairness aim at studying how the performances of a machine learning model, as evaluated by a statistical measure $$m$$ (e.g., false positive or false negative rates), differ depending on the group membership of individuals within a dataset. Group memberships are often determined by the values of particular attributes (e.g., gender, age, ethnicity) hereafter referred to as sensitive attributes. In this context, the word unfairness is usually used to characterize the gap between the value of $$m$$ for a majority group (e.g., historically advantaged individuals) and its value for a minority group (e.g., historically disadvantaged individuals). Table 1 summarizes some of the most frequently used statistical notions of fairness.

Table 1: Summary of statistical notions of fairness with the related statistical measure involved. The columns identifier and method represent the id and the method used for the fairness metric within the FairCORELS library.
name measure identifier method
statistical parity (SP) probability of receiving a positive outcome 1 statistical_parity()
predictive parity (PP) positive predictive value 2 predictive_parity()
predictive equality (PE) false positive rate 3 predictive_equality()
equal opportunity (EOpp) false negative rate 4 equal_opportunity()
equalized odds (EOdds) false negative rate and false positive rate 5 equalized_odds()
conditional use accuracy equality (CUAE) positive predictive value and negative predictive value 6 conditional_use_accuracy_equality()

## Rule lists

A rule list $$r= (d_p, \delta_p, q_0, K)$$ of length $$K \geq 0$$ is a $$(K+1)-$$tuple consisting of $$K$$ distinct association rules $$p_k \to q_k$$, in which $$p_k \in d_p$$ is the antecedent of the association rule and $$q_k \in \delta_p$$ its corresponding consequent, followed by a default prediction $$q_0$$. The rule list below predicts whether a person is likely to make at least 50k per year.

 if [capitalGain:>=5119] then [high]
else if [maritalStatus:single] then [low]
else if [capitalLoss:1882-1978] then [high]
else if [workclass:private && education:bachelors] then [high]
else if [education:masters_doctorate] then [high]
else [low]

To make a prediction using a rule list, the rules are applied sequentially until one rule applies, in which case the associated outcome is reported. If none of the rules applies, then the default prediction is reported.

## CORELS

CORELS is a supervised machine learning algorithm which takes as input a feature vector $$X$$ and its associated label vector $$Y$$, all assumed to be binary, and finds the rule list $$r$$ that minimize the regularized empirical risk: $$\mathsf{misc(r, X, y)} + \lambda K$$, where $$\mathsf{misc(r, X, y)}$$ is the misclassification error of the rule list and $$\lambda \geq 0$$ is a regularization parameter used to penalize longer rule lists. It represents the search space of the rule lists as a trie and uses an efficient branch-and-bound algorithm to prune it.

## FairCORELS

FairCORELS is a multi-objective formulation of CORELS that aims to learn fair rule lists. More precisely, FairCORELS solves the following optimization problem:

$\begin{eqnarray} \label{eq:faircorels} \underset{r \in \mathcal{R}}{\operatorname{argmin}} & \mathsf{misc(r, X, y)} + \lambda K \\ \text{s.t. } & \texttt{unf}(r,X,y, A) \leq \epsilon, \\ \end{eqnarray}$

where the objective function is the regularized empirical risk of CORELS and the constraint requires that the unfairness $$unf(r,X,y, A)$$ of the solution, for a particular sensitive attribute $$A$$, to be less than a value $$\epsilon$$.

# Learning a fair rule list

In this section, we will see how to use FairCORELS to learn a fair rule list.

## Installation

FairCORELS can be installed easily using pip install faircorels. Read the README for more details on the installation process.

## Dataset and Preprocessing

For this tutorial, we will use the Adult Income dataset. It contains demographic information of about $$48,842$$ individuals from the $$1994$$ U.S. census. The associated classification task consists of predicting whether a particular individual earns more than $$50,000\$$ per year. Table 2 shows the first six records of the dataset.

Table 2: First six rows of the Adult Income dataset.
age workclass education marital_status relationship occupation race gender capital_gain capital_loss hours_per_week income
28 Self-Employed HS-grad Single Not-in-family Blue-Collar White Male 0 0 55 0
65 Private School Widowed Unmarried Service Other Male 0 0 40 0
58 Private HS-grad Married Husband Blue-Collar White Male 0 0 40 0
44 Private Some-college Married Husband Sales White Male 0 0 40 0
50 Private Masters Married Husband White-Collar White Male 15024 0 65 1
18 Self-Employed School Single Own-child Blue-Collar White Male 0 0 10 0

Similar to CORELS, FairCORELS also need input data to be categorical. The dataset will be transformed as follows:

• First, all the numerical attributes are converted into categorical ones by using the Minimum Description Length Principle .

• Afterwards, one-hot encoding is applied to the resulting dataset.

• Finally, the set of rules (composed of single-clause, negative single-clause, and two-clauses rules) is constructed by applying FPGrowth to the one-hot encoded dataset.

Table 3 gives an overview of the preprocessed dataset.

Table 3: First six rows of the binarized dataset.
0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 0 0 1 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

## Training

To train the constrained rule list, we will need the following libraries:

• pandas: to load the dataset.

• sklearn: to split the dataset into train and test sets.

• faircorels: to train the rule list with fairness constraint and to evaluate the unfairness and the accuracy

import pandas as pd
from faircorels import FairCorelsClassifier, ConfusionMatrix, Metric
from sklearn.model_selection import train_test_split

Now, we will perform the following tasks:

• Separate the feature vector from the label vector.
• Remove the feature vector that characterizes the majority and the minority groups. In fact, the sensitive attribute is only used to enforce the fairness contraint at training time. The sensitive attribute is not used at inference time to avoid disparate treatment.
• Split the dataset into train and test sets.
# loading the preprocessed dataset

# getting the label vector
y = df["income"]
y = y.to_numpy()

# getting the majority and minority groups
maj_group = df["gender_Male"]
min_group = df["gender_Female"]

#getting the rest ot the features
X = df.drop(["income", "gender_Male", "gender_Female"], axis=1)
features = X.columns.tolist()
X = X.to_numpy()

# split the dataset into train/test sets
X_train, X_test, y_train, y_test, \
maj_group_train, maj_group_test, \
min_group_train, min_group_test =  train_test_split(X, y,
maj_group,
min_group,
test_size=0.33,
random_state=42)

Finally, we can train the rule list.

# create the model
clf = FairCorelsClassifier(
c = 1e-3,
mode = 3,
fairness = 4,
epsilon = 0.95,
max_card = 1,
maj_vect = maj_group_train,
min_vect = min_group_train,
n_iter = 300000,
policy = "bfs",
bfs_mode = 2,
verbosity = [])
# train the model
clf.fit(X_train, y_train, features=features, prediction_name="[income:>50K]")

The rule list model is instantiated using the FairCorelsClassifier class with the following parameters:

• c: is the weight of the rule list’s length regularizer.

• mode: determines the optimization methods. A value of 3 corresponds to the epsilon constraint method. That is, it will fix a value for the unfairness and then maximize the accuracy.

• fairness: identifier of the fairness metric used. See Table 1 to get the identifier of each fairness metric and their corresponding methods. For this example, we used the equal opportunity metric.

• epsilon: is the value of the fairness constraint. So, a value of 0.95 means that we want an unfairness gap of 0.05.

• max_card: maximum cardinality allowed when mining rules. In this example, we use max_card = 1 since we have already performed the mining with FPGrowth.

• maj_vect and min_vect: are used to specify the majority and minority groups

• n_iter: is used to specify the maximum number of nodes (rule lists) to search before exiting. Even if branch-and-bound is an exact method, we will need to stop the search algorithm after a certain number of iterations to avoid high computational overhead. However, whenever the algorithm stops, we have the guarantee that the best rule list given the computational constraint is returned.

• policy and bfs_mode: are used to specify, the exploration strategy. That is, how nodes are ordered within the priority queue of the branch-and-bound framework. Here, we use a breadth-first search and nodes are selected according to the value of the objective function of the rule lists that they represent.

The rule list is trained with the fit method.

## Evaluation of the rule list

To obtain the trained model’s description, we can use the rl_ attribute of the rule list object.

print(clf.rl_)
RULELIST:
if [not_capitalGain_-inf_to_5095.5]: [income:>50K] = True
else if [capitalLoss_1881.5_to_1978.5]: [income:>50K] = True
else if [education_hs_grad__AND__capitalLoss_-inf_to_1534.0]: [income:>50K] = False
else if [occupation_whiteCollar__AND__hoursPerWeek_40.5_to_inf]: [income:>50K] = True
else [income:>50K] = False

We can compute the accuracy of the model by using the score method.

accuracy_train = clf.score(X_train, y_train)
accuracy_test = clf.score(X_test, y_test)
print("Accuracy train: {}".format(accuracy_train))
Accuracy train: 0.8197500440063369
print("Accuracy test: {}".format(accuracy_test))
Accuracy test: 0.8166711337443044

To compute the unfairness, we will use the ConfusionMatrix and Metric classes.

def compute_unfairness(min_group, maj_group, X, y):
cm = ConfusionMatrix(min_group, maj_group, clf.predict(X), y)
cm_minority, cm_majority = cm.get_matrix()
fairness_metric = Metric(cm_minority, cm_majority)
return fairness_metric

fairness_metric_train = compute_unfairness(min_group_train, maj_group_train,
X_train, y_train)
fairness_metric_test  = compute_unfairness(min_group_test, maj_group_test,
X_test, y_test)
print("Unfairness train: {}".format(fairness_metric_train.equal_opportunity()))
Unfairness train: 0.02858023705705337
print("Unfairness test: {}".format(fairness_metric_test.equal_opportunity()))                                                         
Unfairness test: 0.0018119197364480089

To evaluate the unfairness, we first create a confusion matrix for both the minority and majority groups using the ConfusionMatrix class. Then, we create a metric object using Metric. Finally, we get the value of the unfairness using the corresponding method equal_opportunity().

In this particular example, we can see that the learned rule list effectively has an unfairness that is lower than 0.05 and an accuracy of 0.819 on the training set. It also generalizes well: it has an unfairness close to 0.00 and an accuracy of 0.816 on the test set.

# Conclusion

In this short tutorial, we learned how to train rule lists under group fairness constraints using FairCORELS. More details about FairCORELS can be found in our paper Learning Fair Rule Lists .

Aïvodji, Ulrich, Julien Ferry, Sébastien Gambs, Marie-José Huguet, and Mohamed Siala. 2019. “Learning Fair Rule Lists.” arXiv Preprint arXiv:1909.03977.
Angelino, Elaine, Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer, and Cynthia Rudin. 2018. “Learning Certifiably Optimal Rule Lists for Categorical Data.” Journal of Machine Learning Research 18 (234): 1–78.
Chouldechova, Alexandra, and Aaron Roth. 2018. “The Frontiers of Fairness in Machine Learning.” arXiv Preprint arXiv:1810.08810.
Fayyad, Usama M., and Keki B. Irani. 1993. “Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning.” In IJCAI.
Han, Jiawei, Jian Pei, and Yiwen Yin. 2000. “Mining Frequent Patterns Without Candidate Generation.” ACM Sigmod Record 29 (2): 1–12.
Narayanan, Arvind. 2018. “Translation Tutorial: 21 Fairness Definitions and Their Politics.” In Proc. Conf. Fairness Accountability Transp., New York, USA.
Rivest, Ronald L. 1987. “Learning Decision Lists.” Machine Learning 2 (3): 229–46.
Verma, Sahil, and Julia Rubin. 2018. “Fairness Definitions Explained.” In 2018 Ieee/Acm International Workshop on Software Fairness (Fairware), 1–7. IEEE.

### Citation

Aïvodji (2021, March 29). Ulrich Aïvodji: Learning fair rule lists with FairCORELS. Retrieved from https://aivodji.github.io/posts/2021-03-29-learning-fair-rulelists-with-faircorels/
@misc{aïvodji2021learning,
}