SAILSS: The Scouting AI Library for Social Sciences (2)

This is part 2 of a series of blog posts. Please read part 1 first.

The Python code for this library is available on my GitHub account.


We have seen in the previous blog post that the sentence

„She came home and painted her nails“

is (according to most current LLMs) much more likely than the sentence

„He came home and painted his nails“

These two likelihoods of course depend on the data the LLM was trained on. Imagine a language model trained solely (!) on the diaries of feminist women and passionate male crossdressers: in this case the second sentence would actually be more likely than the first one!

This is the first problem with the proposed method: in which sense does the world model within an LLM really represent the structures of our society? There are many factors to consider:

  • LLMs are trained on a very large text corpus containing texts from various cultures. The result is, in a sense, a global average (most probably with a strong dominance of English-speaking - and in particular US - culture).
  • The texts in the corpus describe how humans see the world, not necessarily how it is. For instance, the LLM may reproduce human biases (like racism).
  • The training corpus also contains a lot of fiction. These texts show how humans imagine the world and - again - not necessarily how it is. An LLM might, for instance, eagerly answer questions about the technical details of a warp drive for a faster-than-light spaceship, even though such a propulsion device does not exist.

It is important to note that for our purpose (!) an LLM should actually not be completely „bias free“:
An LLM intended for text generation should hide certain aspects of the truth to avoid reinforcing biases. It should, for instance, output texts with a race- and gender-neutral probability for professions (i.e. not only white male medical doctors and black cleaning ladies). To achieve this, LLMs must undergo some kind of alignment training. But for our purpose this kind of bias should still be present in the LLM: it is unfortunately still true that doctors are more likely to be male and white than female and black. And it is exactly this kind of thing we would like to track down with the method!
We will see later how we could deal with this problem.

Let’s have a look at how such likelihoods can be calculated. We start with a very simple example: the sentence „I am a woman“. The likelihood of this sentence can be calculated as follows:

L(„I am a woman“) =
P(„I“) * P(„am“ | “I“) * P(„a“ | “I am“) * P(„woman“ | “I am a“)

where, for instance, P(„a“ | “I am“) is the conditional probability that the word „a“ follows the phrase „I am“. Now all these probabilities are usually rather small and - of course - smaller than 1. This means that the longer the sentence is, the smaller its likelihood gets. It therefore makes sense (to get a value which is independent of the length of the sentence) to normalize the likelihood by the number of words in the sentence (i.e. calculate an „average likelihood per word“). Because we have a multiplication in the above formula, we have to use the geometric mean (i.e. the Nth root of the product, where N is the number of words/tokens \(x_i\)):

$$ \sqrt[N]{p(x_1) p(x_2|x_{<2}) \cdots p(x_N|x_{<N})} $$

And because this is usually still an inconveniently small number, it is another good idea to take its inverse to get numbers which are easier to compare. The resulting value is called the perplexity of the sentence:

$$ PPL = \frac{1}{\sqrt[N]{p(x_1) p(x_2|x_{<2}) \cdots p(x_N|x_{<N})}} = \left(\prod_{i=1}^N p(x_i|x_{<i})\right)^\frac{-1}{N} $$

It can be understood as a measure of how confused the LLM was when processing the sentence (higher value = more confused, lower value = less confused) [1].
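To make this concrete, here is a minimal sketch of how such a perplexity could be computed with the Hugging Face transformers library, using GPT-2 as an example model. This is an illustration, not the actual SAILSS code; it works in log space, i.e. it already uses the numerically stable form mentioned in [1]:

```python
# A minimal sketch, assuming GPT-2 via Hugging Face transformers.
# This illustrates the idea and is not the SAILSS implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(sentence: str) -> float:
    # Prepend the BOS token so that even the first word gets a
    # probability: p(x_1) is approximated by p(x_1 | BOS).
    ids = tokenizer(tokenizer.bos_token + sentence,
                    return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits          # (1, seq_len, vocab_size)
    # The logits at position i-1 predict token i, hence the shift:
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    # PPL = exp(-mean(log p(x_i | x_<i))), the log-space version
    # of the formula above.
    return torch.exp(-token_lp.mean()).item()

print(perplexity("She came home and painted her nails"))
print(perplexity("He came home and painted his nails"))
```

If the model encodes the bias discussed above, the first call should print the lower value. Note that the averaging runs over tokens (for GPT-2 these are subword units), not over words; the principle stays the same.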

Now let’s look again at the more complex statements from the beginning: If we want to compare the sentences „She came home and painted her nails“ and „He came home and painted his nails“, we expect the perplexity of the first sentence to be lower (i.e. less confusing) than that of the second one (men who paint their nails are confusing). Let’s assume that we are interested in this „nail painting gender difference“.

Now we have a problem (and this is also the reason why we can’t use a standard LLM library to do the work): We are actually only interested in the likelihood of the last part („she/he painted her/his nails“), conditioned on the first part. But the perplexity of the full sentence measures something different: it includes the likelihood of the first part as well. This could be very relevant, as the likelihoods of „she came home“ and „he came home“ could be quite different (and we are not interested in this difference). It would not even help to reduce the two sentences to „She painted her nails“ and „He painted his nails“, because there is already a difference in the frequency with which „she“ and „he“ appear at the beginning of sentences in the corpus. And maybe we really want to keep the leading „He/she came home and“ part to restrict the measurement of „She painted her nails“ to a „home context“.

Therefore we must calculate the probabilities using the full sentence, but the perplexity should only be calculated over the part of the sentence we want to measure.

For this purpose, SAILSS offers a special symbol („||“) which can be inserted into the prompt (= the text under investigation):

„||She came home and ||painted her nails“

The symbol || toggles the inclusion of sentence parts in the calculation of the perplexity. In the example, the calculation is switched off right at the beginning of the sentence and then switched on again before „painted“. In this example the library will calculate:

L(„painted her nails“ | “She came home and“) =
P(„painted“ | “She came home and“) * P(„her“ | “She came home and painted“) * P(„nails“ | “She came home and painted her“)

and from this:

$$ PPL = \frac{1}{\sqrt[3]{L}} $$
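To illustrate how such a toggle could work under the hood, here is a sketch building on the model and tokenizer loaded in the code above (again: an illustration with assumed details, not the actual SAILSS implementation). It computes all conditional probabilities over the full text, but averages only over the tokens of the switched-on parts:

```python
def masked_perplexity(prompt: str) -> float:
    # Scoring starts switched ON and flips at every "||",
    # matching the SAILSS example above.
    text, scored_chars, state = "", [], True
    for segment in prompt.split("||"):
        scored_chars += [state] * len(segment)
        text += segment
        state = not state

    # Tokenize the full text once; the offsets map each token back
    # to its character span (requires a "fast" tokenizer).
    enc = tokenizer(text, return_tensors="pt",
                    return_offsets_mapping=True)
    ids, offsets = enc.input_ids, enc.offset_mapping[0]

    # Conditional log-probs over the *full* text, so every token is
    # conditioned on everything before it, including the
    # switched-off prefix.
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]

    # A token counts as scored if its *last* character is switched
    # on (GPT-2 attaches the preceding space to a word, and that
    # space may still lie in the switched-off part). offsets[1:]
    # skips the first token, which has no left context here and is
    # therefore never scored.
    keep = torch.tensor([scored_chars[int(e) - 1]
                         for _, e in offsets[1:]])
    return torch.exp(-token_lp[keep].mean()).item()

print(masked_perplexity("||She came home and ||painted her nails"))
print(masked_perplexity("||He came home and ||painted his nails"))
```

The difference between the two printed values now isolates the „nail painting gender difference“ and is no longer polluted by the difference between „She came home“ and „He came home“.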

In the next blog post we will look at some examples of how SAILSS can be used and what kind of output it can produce.

Stay tuned!


[1] On a computer, the perplexity is usually computed using a different (but equivalent) formula to avoid numerical instabilities.

Image: DALL-E


Follow me on X to get informed about new content on this blog.

I don’t like paywalled content. Therefore I have made the content of my blog freely available for everyone. But I would still love to invest much more time into this blog, which means that I need some income from writing. So if you would like to read articles from me more often and if you can afford $2 once a month, please consider supporting me via Patreon. Every contribution motivates me!