Blog posts

2025

A Reflection on Data Representativeness in Datasets

3 minute read

Published:

Perhaps the biggest difference between studying Artificial Intelligence and working with it lies in the data. In courses and small-scale projects, datasets usually arrive clean and ready to use. At most, we perform some exploratory analysis, convert strings to floats, and handle missing values or outliers. In practice, however, the story has proven to be very different. Developing a model that performs well and is genuinely useful to society is extremely difficult when the data is scarce or of low quality. T