2. But why, Mr. Robot?

Alright, let me be honest. When I said at the end of the previous piece that this is where the interesting stuff would start, I did not quite have a piece like this in mind. Not that I find this uninteresting, but it is rather dry and matter-of-fact. See it as a fossil, maybe: pretty cool, but not moving. Something you can look at, say “huh, look at that”, and then walk on to the next part of the exhibit.

However, this piece of the exhibit is also necessary to understand the rest of the cool and interactive works. Simply put, we need to build a bit more of a common foundation before we can move on to wondering, asking questions, and generally “being creative”. We need a bit of a domain, a box so to speak, before we can start thinking outside of it.

After the previous introduction, I assume we agree on the basics of machine learning. Now we move on to the ways in which these machine learning models can be interpreted. Luckily, there is already quite an extensive body of research available to us, and it would be a shame to skip over the fruits of hard-working AI experts (and AIs). This is why it seems necessary to include one more of these foundational, review-like sections:

Let’s dive into the field of Interpretable Machine Learning.

 

This section will be based on Christoph Molnar’s book on interpretable machine learning. If you are looking to learn more about certain topics that I discuss, or about the field as a whole, I highly recommend checking out the whole book or the specific sections that interest you. My contribution will mostly be to summarise and make the content even more accessible, and to share my own point of view and interests along the way.

In my opinion, one of the main attractions of the book is Molnar’s pragmatic approach. Rather than merely stating the difficulties of defining a term like interpretability, he continues to work with the best notions of explanation that we do have. This mindset is key to the field of interpretable machine learning. The world of artificial intelligence is in constant motion, and to keep it from going off track we need people like Molnar who keep working on these difficult questions rather than sweeping them aside.

One of the definitions that Molnar has chosen to include is the one given in an interesting 2017 paper by T. Miller. This is a paper about explanations in artificial intelligence that draws inspiration from the social sciences, and the definition reads:

“Interpretability is the degree to which a human can understand the cause of a decision.”

To keep walking the fine line between narrowing down and moving forward, let us leave the ambiguity of this definition aside for now. Instead, we can start with the models that are regarded as fairly interpretable from the outset.


This category contains, but is not limited to, linear models. A linear model, as laid out in the last piece’s short introduction to machine learning, is pretty clearly interpretable. Even when very many factors contribute to the outcome, the calculation contains only as many terms as there are input variables, each tied to its own weight.

For example, consider a (very poor) linear model which outputs the number of people developing diabetes type II in the Netherlands in the coming year, based on three input variables:
- the number of people that developed diabetes type II in the past year, 
- the change in average weight over the past year,
- the change in average age over the past year.

If this model is based on the framework of linear regression (which I find a nice example of a linear method), the output is the sum of the inputs, each multiplied by its own weight. Concretely, if the average weight of the population goes up, this pushes the predicted number of people developing diabetes type II up: that input is connected to a positive weight in the model.

Any human can see exactly what happened in the calculation and understand the conclusion that the model reached. This is a very small and oversimplified example, but even with many more variables it would remain feasible to understand the outcome through the specific weights connected to each of them.
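To make this concrete, here is a minimal sketch of such a model in Python with scikit-learn. The numbers are entirely made up for illustration (they are not real Dutch health statistics), but the point stands: the whole explanation of a prediction is just one weight per input plus an intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical yearly data, purely made up for illustration. Each row:
# [new cases last year, change in avg. weight (kg), change in avg. age (years)]
X = np.array([
    [52_000, 0.4, 0.3],
    [55_000, 0.6, 0.2],
    [58_000, 0.2, 0.4],
    [60_000, 0.5, 0.3],
])
y = np.array([55_000, 58_000, 60_000, 63_000])  # new cases in the following year

model = LinearRegression().fit(X, y)

# The whole "explanation" is one weight per input plus an intercept:
# prediction = intercept + w1 * cases + w2 * weight_change + w3 * age_change
print("weights:  ", model.coef_)
print("intercept:", model.intercept_)
print("forecast: ", model.predict([[63_000, 0.3, 0.3]])[0])
```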


An example of a non-linear method that is still very interpretable is the decision tree. This is very much what it sounds like, and it is quite recognisable from our own way of coming to conclusions. When I decide whether or not to bring an umbrella with me for the day, my thought process normally proceeds along the following lines: 

  1. First, I check whether it is raining or if the forecast is that it will later;

  2. If this is the case, I think about if I mind having an umbrella with me;

  3. Lastly, I check whether it is crazy windy out, as this would ruin my umbrella. (Step 3 is where most of my decisions tend to end in “no” here in the Netherlands.)

This is basically what a decision tree model will do. Granted, there is a bit more to it when actually creating and implementing such a model, but the base consists of simple IF — THEN statements of a similar nature. IF it is raining AND the umbrella is not bothersome AND it is not crazy windy out, THEN I bring an umbrella. Sometimes, we are not that different from our computers.
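As a toy illustration (and not a trained model), the umbrella decision above can be written out in Python as exactly these IF-THEN rules:

```python
def bring_umbrella(rain_now_or_forecast: bool,
                   umbrella_is_bothersome: bool,
                   crazy_windy: bool) -> bool:
    """My umbrella decision written out as explicit IF-THEN rules,
    i.e. a tiny hand-made decision tree."""
    if not rain_now_or_forecast:
        return False   # no rain expected: leave the umbrella at home
    if umbrella_is_bothersome:
        return False   # rain, but carrying an umbrella is too annoying today
    if crazy_windy:
        return False   # rain, umbrella welcome, but the wind would wreck it
    return True        # rain, not bothersome, not too windy: bring it

print(bring_umbrella(True, False, True))   # typical Dutch outcome: False
```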


Other model frameworks are simply too complicated to understand; that is the bluntest way to put it. There is more to it of course, but the sheer scale these models need for their predictive performance does not combine well with interpretability. These frameworks are often called black box models, due to their lack of transparency.

However challenging it sounds to understand such models anyway, in the spirit of Molnar’s approach we can be inspired to try our best. For this, there exists a body of research aimed at model-agnostic methods. There are two main reasons to consider methods that do not depend on the model at all, rather than trying to make sense of one specific opaque type of model:

  1. The frameworks get so large and complicated that knowing the internal setup makes little difference in practice;

  2. As different models are in use (and switching between them can be very efficient), model-agnostic methods are friendlier to work with.

Stop and think for a moment about what this means. We are now acknowledging the complexity of machine learning models and putting it aside altogether. We could just as well use a model-agnostic method to study the decision procedure of an octopus predicting World Cup matches, as long as we can look at it as an input-output model of the right form. There does not have to be a specific intricate framework present; the input-output behaviour is all we want to research.

Two of these methods that have gained a lot of traction as the research field emerged are LIME and SHAP. LIME was proposed only six years ago and has already been cited in more than 9,500 papers. Its approach is that of a local surrogate model: a new, simpler model is created that approximates the behaviour of the original model around one specific instance to be explained. An example will make this clearer:

Consider another model, a black box model, that predicts the chance of a single person developing diabetes type II. This one is a lot more accurate than our previous linear model, as it uses many more variables and the more powerful and opaque approach of a convolutional neural network. When I input all my relevant data, the model predicts a certain chance of me getting diabetes type II. I would like to understand this prediction before I trust it, and so I set out to build a LIME explanation.

To do this, bring to mind many more versions of me. There’s one a kilo heavier or lighter, there’s one with a diabetic brother, there’s one that is a few months older, and so on. All these variations in input data produce variations in the output chance as well. This way, with enough alterations of the base case (the “normal” me), we slowly build a simple local model around this one explanation instance and its variations. We are approximating the impact that the different variables have on the output for this instance specifically: we are creating a local surrogate model.
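The sketch below is my own rough illustration of that idea rather than the actual LIME library. It perturbs one instance, asks a hypothetical black-box `predict` function what it thinks of each perturbed version, weights the versions by how close they stay to the original, and fits a small weighted linear model on top. The sampling scale and kernel settings are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict, x, n_samples=1000, scale=0.1, kernel_width=0.75, seed=0):
    """Rough LIME-style sketch: explain predict(x) with a weighted linear model.

    predict : the black box, mapping an (n, d) array to n predictions
    x       : the single instance (1-D array) we want explained
    """
    rng = np.random.default_rng(seed)
    # 1. Create many slightly different "versions of me" around x.
    Z = x + rng.normal(0.0, scale, size=(n_samples, len(x)))
    # 2. Ask the black box what it predicts for each perturbed version.
    y = predict(Z)
    # 3. Give more weight to versions that stay close to the original.
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a simple weighted linear model: the local surrogate.
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return surrogate.coef_   # local effect of each input, valid only near x
```

The returned coefficients read just like the weights of the simple linear model from earlier, except that they are only valid in the neighbourhood of this one instance.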

SHAP, as proposed in a paper from 2017, uses a (slightly) different approach. It is based on the contributions of the different factors to the output, and is best understood through the analogy of players working together in a coalition. We still focus on only one prediction at a time. But instead of considering small changes in single variables, as for a LIME explanation, we consider the presence and absence of inputs altogether.

Going back to the example of me and my chance of developing diabetes type II, we now alter the input in a different way. First we see what the model thinks of me without my age, then we see what the result would be if we took away the family health history input. Note that this does not mean we pretend I have no family for this instance; we simply do not let the model consider my family’s history at all. Slowly we create an explanation based on the impact that the different inputs (the players) have on the result (the payout of their coalition): this is the SHAP approach.
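Again as a rough sketch rather than the real SHAP library: one common way to approximate these contributions is to let the inputs join the “coalition” in many random orders, filling the absent inputs in from a background reference row, and to average each input’s marginal contribution. The `predict` function and `background` row below are hypothetical placeholders.

```python
import numpy as np

def shapley_estimate(predict, x, background, n_orders=200, seed=0):
    """Rough SHAP-style sketch: Monte Carlo estimate of Shapley values for x.

    "Absent" inputs are filled in from a background reference row, which is
    one common way of letting the model ignore a feature altogether.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(len(x))
    for _ in range(n_orders):
        order = rng.permutation(len(x))      # a random order to join the coalition
        z = background.copy()                # start with every input "absent"
        prev = predict(z.reshape(1, -1))[0]
        for j in order:
            z[j] = x[j]                      # input j joins the coalition
            curr = predict(z.reshape(1, -1))[0]
            phi[j] += curr - prev            # its marginal contribution this round
            prev = curr
    return phi / n_orders                    # average contribution per input
```

Averaged over enough random orders, each entry of `phi` is that input’s fair share of the difference between the model’s prediction for me and its prediction for the background reference.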


Most machines are simple enough that we only need a manual to understand them. Some machines with the capacity to learn are still so understandable that a bit of insight is enough to feel comfortable explaining their behaviour. That leaves the black box models, which are very difficult to understand. That difficulty goes hand in hand with their incredible performance, and, speaking from my own perspective, it is also what makes them the most interesting to research.

To me, black box models are not just a promising but impossible-to-understand prospect. They are a blank canvas with the first few lines of interpretation drawn on them. These models beat humans at tasks for which we have had thousands of years of training, and this alone makes them fascinating entities to research. Here is what we could consider alien intelligence, and we are in a prime position to start understanding it.


The existing body of research lays down a great foundation for the further exploration of this question:

What does it mean to understand artificial (or just alien) intelligence and how can we go about approaching this understanding?

Now that is the interesting part to me, and in the next piece we start dipping our toes in the water.

 

Note: I have edited this piece after my supervisor Peter pointed out that my example of linear regression was faulty. The example now works with counts rather than chances, which should make it a correct implementation.
