Throughout this analysis, we explored the data to answer our question: “Does cast diversity impact a movie’s success?” Looking first at how diversity is represented in movies, we noticed that representation is not equal across ethnic groups or genders. Overall, White men are the most represented and play main roles more often. The gap is especially marked for Black, Caribbean actors, who are the least represented group. The difference in representation between White and Asian, Middle East characters, however, is comparatively small.
We also tried to determine which movie features seemed to impact a movie’s success, quantified by its box office revenue. We found that White men tend to play in more successful movies than actors of other genders or ethnicities. Ethnic diversity tends to increase over time and with the budget, so that, overall, more diverse casts tend to go with more lucrative movies. However, the most lucrative movies come from a rare, specific setting: a mid-diverse cast (diversity score in [0.3, 0.4]). Gender diversity also tends to increase over time, yet the most lucrative movies still seem to have poor gender diversity. To assess more precisely the relationship between movie characteristics and movie revenue, we fitted a linear regression and compared the resulting coefficients. Among the most impactful factors were the Budget, the Movie runtime and the Movie release year, so technical aspects were important. The genre of the movie also had an impact, sometimes positive (Adventure, Comedy, …) and sometimes negative (Indie, World cinema). The diversity features, on the other hand, such as the Ethnic Diversity Score and the Gender Diversity Score, had very small coefficients, indicating a very small impact.
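To make the coefficient comparison concrete, here is a minimal sketch of how standardized regression coefficients can be computed and ranked. It is not the code used in our analysis: the data is synthetic, the feature names (`budget`, `runtime`, `ethnic_diversity`) and the weights used to generate `revenue` are assumptions chosen only for illustration, and the fit is a plain least-squares solve written with the standard library.

```python
import random
import statistics


def standardize(values):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def standardized_coefficients(features, target):
    """Least-squares fit on standardized columns.

    Every column is centered, so no intercept is needed, and the
    coefficients are directly comparable across features.
    """
    names = list(features)
    cols = [standardize(features[name]) for name in names]
    y = standardize(target)
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return dict(zip(names, solve(xtx, xty)))


# Synthetic, hypothetical data: revenue is driven mostly by budget.
rng = random.Random(0)
n = 500
budget = [rng.uniform(1, 200) for _ in range(n)]
runtime = [rng.uniform(80, 180) for _ in range(n)]
ethnic_diversity = [rng.uniform(0, 1) for _ in range(n)]
revenue = [
    3.0 * b + 0.5 * r + 2.0 * d + rng.gauss(0, 20)
    for b, r, d in zip(budget, runtime, ethnic_diversity)
]

coefs = standardized_coefficients(
    {"budget": budget, "runtime": runtime, "ethnic_diversity": ethnic_diversity},
    revenue,
)
# Rank features by the absolute size of their standardized coefficient.
for name, value in sorted(coefs.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>18}: {value:+.3f}")
```

Standardizing the columns first is what makes the coefficients comparable: without it, a feature measured in large units (a budget in dollars) and one bounded in [0, 1] (a diversity score) would yield coefficients on very different scales.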
Finally, we trained machine learning models to predict the box office revenue from movie features and looked at which features mattered most to the models. Once again, diversity was not among the most important ones. The features that mattered most were the movie’s own characteristics (budget, genres, …).
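One model-agnostic way to inspect which features a predictive model relies on is permutation importance: shuffle one feature at a time in held-out data and measure how much the prediction error grows. The sketch below illustrates the idea and is not necessarily the importance measure used in our models. The k-nearest-neighbours regressor, the synthetic data, the feature names and the assumption that all features are already scaled to [0, 1] are hypothetical choices made to keep the example self-contained.

```python
import random
import statistics


def knn_predict(train_x, train_y, row, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], row)),
    )
    return statistics.fmean(train_y[i] for i in by_distance[:k])


def mse(predict, rows, targets):
    """Mean squared error of a prediction function on a dataset."""
    return statistics.fmean((predict(r) - t) ** 2 for r, t in zip(rows, targets))


def permutation_importance(predict, rows, targets, names, repeats=3, seed=0):
    """Average increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importance = {}
    for j, name in enumerate(names):
        increases = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(predict, shuffled, targets) - baseline)
        importance[name] = statistics.fmean(increases)
    return importance


# Synthetic, hypothetical data with features already scaled to [0, 1].
rng = random.Random(42)
names = ["budget", "runtime", "gender_diversity"]
rows = [[rng.random() for _ in names] for _ in range(400)]
targets = [10 * b + 5 * r + 0.2 * d + rng.gauss(0, 0.5) for b, r, d in rows]

# Hold out the last 100 rows to measure importance on unseen data.
train_x, test_x = rows[:300], rows[300:]
train_y, test_y = targets[:300], targets[300:]


def model(row):
    return knn_predict(train_x, train_y, row)


importance = permutation_importance(model, test_x, test_y, names)
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:>18}: {value:.3f}")
```

A feature whose shuffling barely changes the error, as with the weakly weighted `gender_diversity` column here, contributes little to the model’s predictions; its estimated importance can even come out slightly negative because of sampling noise.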
In conclusion, after analyzing our dataset, we found that, for the time range it covers, the diversity of the cast does not seem to be the most impactful factor in determining a movie’s success. Its role seems negligible compared to other characteristics, such as the language, the budget or the movie genre. That said, the fact that diversity had little impact on these movies does not make it an unimportant factor that can be ignored. The dataset contains no movies produced after 2013, and the question of diversity has gained importance in recent years, so it probably has more impact now. In addition, the diversity of a cast can have consequences beyond box office revenue: a cast with no diversity at all has deeper and more subtle effects than those on a movie’s revenue.