The global rise of excessive fears, nationalism and racism is of widespread concern. I was interested in what role the media (both classical and social) could play in this phenomenon. I chose to create a (rather informal) analysis based on ideas from data science and machine learning. All the arguments are explained and visualized using simple examples. The first result will not be very surprising, but it serves to demonstrate the methods that are also used in the (more interesting) second example.
Let’s create an example for some ground truth (i.e. factual) data: the following chart shows all the events which happened in a large city block. The x- and y-coordinates are normal geographical coordinates. All events are, for the sake of simplicity, classified into either strictly positive (green) or strictly negative (red):
(the reason why I have chosen bigger dots for the negative events will be explained soon)
Normally there are many more positive events than negative ones. The general impression is therefore rather positive: most areas seem to be safe (visualized with the green background color)!
Please note that in some areas we have both negative and positive events. We have only very little data, but we still want to be careful and mark the whole area as „possibly dangerous“. This could be justified (but is maybe not!).
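The cautious labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the plots in this post: the function names, the event counts, the unit-square coordinates and the 5x5 grid are all assumptions made for the example.

```python
import random
from collections import defaultdict

def make_events(n_positive=200, n_negative=10, seed=0):
    """Return (x, y, is_negative) tuples in the unit square (made-up data)."""
    rng = random.Random(seed)
    events = [(rng.random(), rng.random(), False) for _ in range(n_positive)]
    events += [(rng.random(), rng.random(), True) for _ in range(n_negative)]
    return events

def classify_cells(events, grid=5):
    """Label every grid cell as 'safe', 'possibly dangerous' or 'unknown'."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [positives, negatives]
    for x, y, is_negative in events:
        cell = (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        counts[cell][1 if is_negative else 0] += 1
    labels = {}
    for i in range(grid):
        for j in range(grid):
            positives, negatives = counts.get((i, j), (0, 0))
            if negatives > 0:
                # Cautious rule: a single negative event flags the whole cell
                # (assumption: this also applies to cells without positives).
                labels[(i, j)] = "possibly dangerous"
            elif positives > 0:
                labels[(i, j)] = "safe"
            else:
                labels[(i, j)] = "unknown"
    return labels

labels = classify_cells(make_events())
```

Note that one negative event is enough to flag a cell no matter how many positive events it contains, which is exactly why the label may or may not be justified.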
Our knowledge about such poorly explored areas can be improved. Let’s call going to such a place (with status „unknown“ or „possibly dangerous“) an adventure. Now we all know that adventures are a dangerous thing. Therefore we don’t get ourselves into adventures all the time. It’s more a collective job (for which we usually use the more expendable males): once in a while some member of the community explores such a place and reports his experience. The important thing to note is that as soon as a region is marked „possibly dangerous“, it gets explored with much lower frequency (as going there is now an adventure) which also means that this region could remain marked „dangerous“ even if it actually is not. Or in other words: the error correction process is slow for areas falsely classified as negative.
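To make the slow error correction more tangible, here is a small simulation sketch. It assumes, purely for illustration, that an area which is actually safe gets relabeled once a fixed number of positive reports has come in, and that at each time step somebody visits the area with a given probability. The function name, the visit rates and the number of required reports are hypothetical.

```python
import random

def steps_until_corrected(visit_rate, reports_needed=3, seed=0, max_steps=100_000):
    """Count time steps until a falsely flagged (actually safe) area is cleared.

    visit_rate: assumed probability per time step that somebody goes there.
    reports_needed: assumed number of positive reports that clears the area.
    Returns None if the area is still flagged after max_steps.
    """
    rng = random.Random(seed)
    reports = 0
    for step in range(1, max_steps + 1):
        if rng.random() < visit_rate:   # somebody dares to go there
            reports += 1                # the area is safe, so the report is positive
            if reports == reports_needed:
                return step
    return None

# Same random seed, different visit rates: an area that counts as an
# "adventure" (low visit rate) stays flagged at least as long as a normal one.
normal_area = steps_until_corrected(visit_rate=0.5, seed=1)
adventure_area = steps_until_corrected(visit_rate=0.05, seed=1)
```

In this toy model the expected waiting time is simply `reports_needed / visit_rate`, so a tenfold drop in visits means a tenfold longer time until the wrong label is corrected.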
Now let’s see how the media would report about this data. We make the reasonable assumption that the media channel (be it a classic newspaper or a social media platform) does not sample the events randomly, because random sampling makes little sense: the range of possible emotional impact is substantially different for positive and negative events. This sounds complicated but is easy to understand: the most positive events we can experience in our life are probably an orgasm and the birth of a child. Even if the latter changes the life of the parents considerably, neither of these events has the extreme impact negative events can have: if you die, the world as you see it ends. Or if you have an accident with a heavy injury, you might suffer badly for the rest of your life. Also, even in a small village maybe hundreds of orgasms happen every night and many children are born every year, but the same village might see only one case of murder in 20 years. And, as media channels have to do some kind of sampling (there are simply too many events), they usually sort all the events by emotional impact (= expected interest from the audience) and then publish the top N (N = 5 to 20) entries. This is the reason why we never read headlines like „This night Mr. X from Y brought his wife Z to a fantastic orgasm!“.
Let’s do this for our sample data:
I have now chosen the 10 events with the largest emotional impact.
Now this looks much worse than the ground truth! Almost all the areas of the city now seem to be dangerous, and the general impression is suddenly clearly negative (visualized using red background color)! This corresponds to the world model the audience constructs in their mind based on this data.
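The „sort by emotional impact, publish the top N“ step can be sketched as follows. This is a minimal illustration, not the code behind the plots: the impact ranges (positive events between 0 and 1, negative events between 0 and 10) and the event counts are assumptions chosen only to reflect the idea that negative events can have a much larger impact.

```python
import random

def make_impact_events(n_positive=200, n_negative=20, seed=0):
    """Create made-up events; negative ones get a wider impact range."""
    rng = random.Random(seed)
    events = [{"negative": False, "impact": rng.uniform(0, 1)}
              for _ in range(n_positive)]
    events += [{"negative": True, "impact": rng.uniform(0, 10)}
               for _ in range(n_negative)]
    return events

def top_n_by_impact(events, n=10):
    """Media sampling: keep only the n events with the largest impact."""
    return sorted(events, key=lambda e: e["impact"], reverse=True)[:n]

def negative_share(events):
    """Fraction of events that are negative."""
    return sum(e["negative"] for e in events) / len(events)

ground_truth = make_impact_events()
published = top_n_by_impact(ground_truth, n=10)

print("negative share, ground truth:", negative_share(ground_truth))
print("negative share, published:   ", negative_share(published))
```

Comparing the two printed shares shows how far the published picture can drift from the ground truth even though no single report is false.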
We can conclude:
- Media channels with a sampling process favoring content with high emotional impact have a tendency to create irrational fears in their audience.
This result is important because such unreasonable fears lead to the implementation of excessive countermeasures (security infrastructure, legislation etc.). They make us build cages for ourselves!
This is, admittedly, a rather trivial result. Now it gets more interesting. Most newspapers have a regional, a national and an international section. These three sections are of about the same size. But of course the amount of ground truth data for the international section (in the following chart located in the south) is much larger than that for the regional section (located in the north):
If now the „sorting by emotional impact“ sampling method is applied separately to the data of all three sections, something very interesting happens: typically for the regional section there are only very few true „high impact“ events. Therefore this part is mostly filled with things like „Big market tomorrow in city center“ or „society X is collecting donations for the disabled“ etc. This is very different for the international section: here we easily have enough „high impact“ events to fill the whole section with them. And as we know, the „high impact“ events have a strong tendency to be negative. But this means that the model the audience creates of the countries abroad must be much worse than the model constructed from the data about local events:
We can conclude:
- Media channels with a sampling process favoring „high emotional impact“ applied to several locality sections with different numbers of events have a tendency to foster nationalism, racism and cultural chauvinism
- Surprisingly, this effect is even present when the media channel is making sure that the content describing the events does not foster these directions of thought (i.e. „liberal“/quality/responsible newspapers)!
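The section effect can be reproduced with a small sketch. It assumes, for illustration only, that every event is negative with the same probability in all three sections, that the sections differ only in the number of available events, and that each section publishes its top 10 by impact. The event counts, the 5% negative rate and the impact ranges are hypothetical values, not data from this post.

```python
import random

def published_negative_share(n_events, p_negative=0.05, n_published=10, seed=0):
    """Share of negative events among the top entries of one section.

    n_events must be at least 1. Impact ranges are assumptions:
    positive events 0..1, negative events 0..10.
    """
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        negative = rng.random() < p_negative
        impact = rng.uniform(0, 10) if negative else rng.uniform(0, 1)
        events.append((impact, negative))
    # Same sampling rule for every section: sort by impact, keep the top entries.
    published = sorted(events, reverse=True)[:n_published]
    return sum(negative for _, negative in published) / len(published)

# Hypothetical sizes of the ground truth data per section.
sections = {"regional": 50, "national": 500, "international": 5000}
shares = {name: published_negative_share(n) for name, n in sections.items()}
for name, share in shares.items():
    print(f"{name:>13}: {share:.0%} of published events are negative")
```

The underlying rate of negative events is identical everywhere in this model. The difference in the published picture comes only from the number of events each section can choose from.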
These effects are extremely significant, because the amount of information we get about events from media channels is now much larger than the number of events we experience ourselves. This has interesting implications too. Let me give you an example. I recently travelled to India. Beforehand quite a few of my friends asked me „Why do you want to go to this miserable country? So much poverty there and they mistreat their women“. When I came back from India and told them that I actually liked it very much there, they would not really believe me. The reason is that the amount of negative information about India outnumbers the positive information so massively that people rather think „This guy is probably a bit weird“. Therefore:
- Individual „ground truth“ experiences which contradict the predominant negative image the media distribute are necessarily rare compared to the amount of information received from the media. The reasons are that (1) such experiences require an adventure as soon as the public consensus is negative and (2) we receive a huge amount of information from the media. Consequently, such experiences are often considered outliers and are therefore ignored.
Given these effects, together with the deficiencies of classical media I discussed in my older blog post „Social media and democracy: Can we learn from machine learning?“, we need to ask whether the media we have today are doing more damage than good.
I believe that the advent of fake content will harm our current media system quickly and heavily. Media will have to transform completely and those channels which are not capable of transformation will probably perish. To understand if such a transformation will be beneficial we must try to predict its nature.
Fake content means that you cannot trust any content which does not come from a person who has created the content themselves (i.e. experienced the corresponding event themselves). This can probably be extended using a network of trust with a low number of hops in the social graph. This means that you would, for instance, trust your friends and the friends of your friends (but not further). I have not yet analyzed this kind of media architecture in detail, and only time will tell how such architectures really behave and how they will impact society. But it is quite possible that some harmful characteristics of the media we have today will be eliminated in these models.
Therefore, the imminent fundamental transformation of the media by AI-generated fake content could be a huge opportunity for humankind.
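The hop-limited network of trust mentioned above could look roughly like the following sketch. It is only an illustration of the idea, not an analyzed media architecture: the function name, the example graph and the limit of two hops (friends and friends of friends) are assumptions.

```python
from collections import deque

def trusted_sources(graph, me, max_hops=2):
    """Return {person: hops} for everyone within max_hops of `me`.

    graph maps each person to a list of friends (hypothetical example data).
    Uses a breadth-first search, so each person gets the shortest hop count.
    """
    hops = {me: 0}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not trust anyone beyond the hop limit
        for friend in graph.get(person, ()):
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    del hops[me]
    return hops

social_graph = {
    "me": ["ann", "bob"],
    "ann": ["me", "cat"],
    "bob": ["me"],
    "cat": ["ann", "dan"],
    "dan": ["cat"],
}
print(trusted_sources(social_graph, "me"))
# {'ann': 1, 'bob': 1, 'cat': 2}
```

In this example „dan“ is three hops away and would therefore not be trusted as a source.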
You can download the code I have used to generate the plots from my GitHub account.
Image: DALL-E 3