
Grouping Ethnicities into Ethnic Groups

As we mentioned earlier, the world is diverse. In fact, our dataset contains a total of 431 different ethnicities. Can you imagine? This is more than twice the number of countries in the world, and far too many classes to work with directly. Moreover, some ethnicities are represented by only a small number of people whereas others are represented by thousands. This is not balanced at all! Grouping them into larger ethnic groups is therefore a good idea. Not only does it guarantee a certain level of anonymity between ethnicities (see “A note on Ethical Risk” on the previous page), but it also significantly reduces the disparities between them.

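The ethnicities were manually grouped according to the UK’s list of ethnic groups. All 431 ethnicities were mapped to the following 4 ethnic groups: “White”; “Asian, Middle East and Tribes”; “Black, Caribbean or African”; and “Mixed or multiple ethnic group”.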
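A manual mapping of this kind can be sketched as a simple lookup table. The snippet below is only an illustration: the ethnicity labels, their assignments, and the names `ETHNICITY_TO_GROUP` and `to_ethnic_group` are hypothetical and are not taken from the project's actual mapping.

```python
# Hypothetical excerpt of a manual ethnicity -> ethnic group mapping.
# Labels and assignments are illustrative only, not the project's real table.
ETHNICITY_TO_GROUP = {
    "Irish people": "White",
    "Italians": "White",
    "Indian people": "Asian, Middle East and Tribes",
    "Japanese people": "Asian, Middle East and Tribes",
    "African Americans": "Black, Caribbean or African",
    "Multiracial people": "Mixed or multiple ethnic group",
}


def to_ethnic_group(ethnicity, mapping=ETHNICITY_TO_GROUP):
    """Return the ethnic group for an ethnicity, or None if it is unmapped."""
    return mapping.get(ethnicity)


# Toy input: one label per actor.
actors = ["Irish people", "Indian people", "Italians", "Unknown label"]
groups = [to_ethnic_group(e) for e in actors]

# Labels missing from the table are collected so they can be reviewed by hand.
unmapped = [e for e, g in zip(actors, groups) if g is None]
```

Returning `None` for unknown labels, rather than guessing a group, keeps the grouping fully manual and makes gaps in the table easy to spot.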

By looking at the proportion of ethnic groups below, we can see that the ethnic groups “White” and “Asian, Middle East and Tribes” account for roughly two thirds of the population:

Looking at the data

Before diving into any kind of analysis, let’s have a look at our data to get a feel for its content.

Actor diversity over time

Over the past century many things have changed: trends and fashions are no longer the same, and trade and technology have accelerated massively, leading to a world that is more diverse than ever. Let’s see whether this increase in diversity also shows up in the cinema industry.

As we can see, the evolution of diversity in the cinema industry can be split into three periods:

We can also separate men and women for a more detailed view of the diversity over time:

Here we can see three major trends:

Actor diversity between movie genres

Now let’s focus on the actor diversity among movie genres.

The first thing we can notice is that the cinema industry, across all periods combined, is dominated by four genres: Drama, Comedy, Romance Film and Action. We can also observe that among those four genres, the distribution of ethnic groups is roughly the same: the two dominant groups are White and Asian, Middle East and Tribes, with White being slightly larger than Asian, Middle East and Tribes (except for Romance movies, where the two are of equal size). Moreover, those two groups represent more than two thirds of the distribution in all four genres.

When looking at all the genres, however, World cinema and Musical have distinct distributions: both are dominated by the group Asian, Middle East and Tribes. This makes sense for World cinema, as it is defined as films produced outside the USA.

Actor diversity in the movies’ main characters

How can we differentiate between main characters and side characters in a movie? In this project, we simply define a character as a main character if its name is mentioned in the movie’s plot summary. Otherwise, the character is considered a side character.
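This rule can be sketched in a few lines. The text only says that the name must be "mentioned" in the summary, so the exact matching used here (the full name as a whole phrase, case-insensitive) is an assumption, and `is_main_character` is a hypothetical helper name. The names and summary are made up.

```python
import re


def is_main_character(character_name, plot_summary):
    """Return True if the character's name appears in the plot summary.

    Assumption: the full name must appear as a whole phrase, ignoring case.
    """
    name = (character_name or "").strip()
    if not name or not plot_summary:
        return False
    # Lookarounds prevent matches inside longer words ("Tom" in "Tomas").
    pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
    return re.search(pattern, plot_summary, flags=re.IGNORECASE) is not None


# Made-up summary and character names, for illustration only.
summary = "Maria Lopez returns home, where her brother Tom is waiting."
roles = {
    name: ("main" if is_main_character(name, summary) else "side")
    for name in ["Maria Lopez", "Tom", "Officer Reed"]
}
```

A stricter or looser rule (for example, matching on first names only) would change which characters count as main, so the choice is worth stating alongside any results.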

Now, and these are the last exploration plots, we promise, let’s look at actor diversity among main and side characters.

Let’s first analyze the bar plots representing the distribution of ethnic groups playing main and side roles. As we can see, there are many more people playing side roles than main roles. This first observation is to be expected because, in a movie, there are many more side characters than main characters. We can then observe that, in both bar plots, the ranking of the ethnic groups by number of actors is the same. However, the proportions vary between main roles and side roles. For main roles, the difference in the number of actors from one ethnic group to the next seems to be constant (~3000), whereas for side roles it is no longer constant. In fact, for side roles, the ethnic group Asian, Middle East and Tribes is almost as large as the ethnic group White, and those two categories dominate the distribution. Similarly, Mixed or multiple ethnic group and Black, Caribbean or African have a similar number of side actors, even though they represent a smaller part of the bar plot.

The same observations can be made from the two pie charts.
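The counts behind such bar plots and pie charts can be sketched as follows. This is a minimal example on made-up toy records, not the project's data or code; the helper names `group_counts_by_role`, `shares` and `ranking` are hypothetical.

```python
from collections import Counter, defaultdict


def group_counts_by_role(records):
    """Count actors per ethnic group, separately for main and side roles.

    `records` is an iterable of (ethnic_group, is_main) pairs.
    """
    counts = defaultdict(Counter)
    for group, is_main in records:
        role = "main" if is_main else "side"
        counts[role][group] += 1
    return counts


def shares(counter):
    """Convert raw counts into proportions (the pie chart view)."""
    total = sum(counter.values())
    return {g: n / total for g, n in counter.items()} if total else {}


def ranking(counter):
    """Groups ordered from most to least frequent (the bar plot order)."""
    return [g for g, _ in counter.most_common()]


# Made-up toy records, for illustration only.
records = (
    [("White", True)] * 3
    + [("Asian, Middle East and Tribes", True)] * 2
    + [("Black, Caribbean or African", True)] * 1
    + [("White", False)] * 4
    + [("Asian, Middle East and Tribes", False)] * 3
    + [("Black, Caribbean or African", False)] * 2
)

by_role = group_counts_by_role(records)
same_ranking = ranking(by_role["main"]) == ranking(by_role["side"])
main_shares = shares(by_role["main"])
side_shares = shares(by_role["side"])
```

Comparing `ranking` across roles checks whether the order of groups is preserved, while comparing `main_shares` with `side_shares` shows how the proportions shift between main and side roles.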
