When young children begin to count objects and then to draw or record that count, they are embarking on an exciting mathematical journey encompassing some intricately entwined ideas that we have traditionally labelled ‘statistics’, or (and we have long suspected this is in part due to the particular difficulties of pronouncing the latter in front of an audience/class) the education version, ‘data handling’. At many stages their activity isn’t even explicitly or clearly mathematics, never mind carefully delineated into ‘data handling’ versus ‘measurement’ or ‘number’. Counting objects, comparing them, making them into groups, classifying them by size or shape, making images to show what you have done – all these are valuable and important for mathematics learning, but not just ‘about’ mathematics.
One of the interesting ways we could examine this early work/play is to consider it ‘creating data’ – that is, translating between the world (I perceive information) and the reduction of it that we choose to represent (I record or communicate some aspect of that information). As you can see, this might equally be called ‘art’ or ‘design’, or might extend to map-making (geography) or telling stories (English or history) – in fact, there isn’t a school subject we can think of that isn’t implicated somehow.
If we think about life beyond school, ‘creating data’ is an action critical to commercial enterprise, in which the richness and variety of human experience are transmuted, by a process of algorithmic alchemy, into data points. To Facebook, each user is a collection of likes and views; to Tesco, every loyalty card customer is the sum total of their purchases. To Google, you search, therefore you are. To some extent, digital consumer existence is data.
What does it mean to ‘create data’? It is about making decisions about what to leave out and put in, considering what the purpose might be of what we are doing, and whether it is about us or about the information or about both. We could ask, is the activity replicable by someone else? Can we justify the decisions we made? And even, are we being ethical with our choices? This is a great example of why I often feel puzzled when people categorise mathematics as ‘black and white, right or wrong’ and don’t recognise the often delightfully human, nuanced, spectral stuff which isn’t just around the edges, but often right there in the middle. I call this thinking ‘maths-as-a-humanity’, and it fits nicely also with the artistic, graphic and design connotations of the phrase ‘creating data’.
‘“Creating data” may seem an odd phrasing. However, data are not lying around like melons on the ground to gather up and cart off to the table. Turning observations into data involves an explicit process of abstraction, a process more like an impressionist painting than snapshot photography.’
(Konold & Higgins, 2003).¹
The last fifty years of statistics research have thrown up a field of study known as ‘exploratory data analysis’, or EDA for short. Put simply, researchers such as Tukey (1977)² advocated that we should do more with data than simply confirm theories we already had about it – we should be detectives, searching for interesting and unexpected emergences. Can young children use the approach and principles of EDA as they ‘make’ data? Well, one principle – of staying ‘grounded in the data and attentive to what they have to say’ (ibid.) – seems pertinent here. While it is good practice in statistical education to start with a meaningful question, that certainly doesn’t mean always having to stick to that question at the expense of other interesting things that emerge. In fact, we might argue that very young children, when making data, are only really doing emergent work – in essence, exploratory data analysis – because they don’t yet have the tools to match a research question with methodology; i.e. to know what is possible to find out from what’s in front of them in the world.
Konold & Higgins also suggest that children must ‘learn to see data they have created as separate in many ways from the real-world event they observed.’ This is akin to perceiving the ‘fiveness’ of five, or the defining properties of a geometric shape – an abstraction. One of the most important developmental milestones in thinking about data is being able to see the data set as a thing in itself: a flexible, aggregate way of thinking that is a particular form of unitising. Equally, we should be careful not always to reduce rich and interesting subjects to a single attribute. In particular, an individual is a multi-variable tapestry of likes and dislikes, measurements, counts, and categories. If we ignore all this and focus on a single aspect, then the potential to view the aggregate of one attribute through the lens of another – an essential component of EDA – is hamstrung.
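To make the idea of viewing one attribute ‘through the lens of another’ a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the children, their attributes and the helper name `summarise_by` are all invented, and this is only one of many ways such a comparison might be set up.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical class records: every name and value is invented for illustration.
# Each child is kept as several attributes rather than reduced to one.
children = [
    {"name": "Asha",  "favourite_fruit": "apple",  "siblings": 1},
    {"name": "Ben",   "favourite_fruit": "banana", "siblings": 2},
    {"name": "Chloe", "favourite_fruit": "apple",  "siblings": 3},
    {"name": "Dev",   "favourite_fruit": "banana", "siblings": 0},
    {"name": "Ella",  "favourite_fruit": "apple",  "siblings": 2},
]

# Single-attribute view: a simple tally, like a class pictogram.
fruit_tally = Counter(child["favourite_fruit"] for child in children)


def summarise_by(records, group_attr, value_attr):
    """Average one attribute within each category of another attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_attr]].append(record[value_attr])
    return {category: mean(values) for category, values in groups.items()}


# Two-attribute view: number of siblings seen through the lens of favourite fruit.
siblings_by_fruit = summarise_by(children, "favourite_fruit", "siblings")

print(dict(fruit_tally))
print(siblings_by_fruit)
```

Had we recorded only each child’s favourite fruit, the tally would still be possible but the second view would not; it exists only because more than one attribute was kept for each child.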
Forbes reported last year that ‘Over the last two years alone 90 percent of the data in the world was generated.’³ One single source of data, the Large Hadron Collider, produces 30 petabytes of data per year.⁴ That’s enough to fill 1.2 million Blu-ray discs which, if stacked on top of each other, would be nearly twice the height of the Burj Khalifa tower in Dubai. Translating items from the real world to a created world of data, so effortless for computers, is also a human mathematical process that students need to understand deeply if they are to cope critically with the data flow of tomorrow.
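For anyone who wants to check the Blu-ray comparison, the arithmetic can be sketched in a few lines of Python. Only the 30 petabytes figure comes from the text; the disc capacity (25 GB, single layer), the disc thickness (1.2 mm) and the height of the Burj Khalifa (828 m) are our assumptions, as are the function names.

```python
# Figure taken from the post (CERN): data produced by the LHC per year.
LHC_DATA_PER_YEAR_PB = 30

# Assumptions, not stated in the post:
BLU_RAY_CAPACITY_GB = 25       # single-layer Blu-ray disc
DISC_THICKNESS_MM = 1.2        # standard optical disc thickness
BURJ_KHALIFA_HEIGHT_M = 828    # commonly quoted height of the tower

GB_PER_PB = 1_000_000          # decimal (SI) units


def discs_needed(petabytes, capacity_gb=BLU_RAY_CAPACITY_GB):
    """Number of discs required to hold the given amount of data."""
    return petabytes * GB_PER_PB / capacity_gb


def stack_height_m(discs, thickness_mm=DISC_THICKNESS_MM):
    """Height in metres of a stack of discs with no gaps or cases."""
    return discs * thickness_mm / 1000


discs = discs_needed(LHC_DATA_PER_YEAR_PB)
height = stack_height_m(discs)
ratio = height / BURJ_KHALIFA_HEIGHT_M

print(f"Discs needed: {discs:,.0f}")
print(f"Stack height: {height:,.0f} m")
print(f"Burj Khalifa heights: {ratio:.2f}")
```

Under these assumptions the stack comes to roughly 1,440 m, a little over 1.7 times the height of the tower. Choosing a different disc capacity or thickness would change the answer, which is itself a small example of the decisions involved in ‘creating data’.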
1. Konold, C., & Higgins, T. (2003). Reasoning about data. In A research companion to Principles and Standards for School Mathematics (pp. 193–215). Reston, VA: National Council of Teachers of Mathematics.
2. Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley Publishing Company.
3. https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#3c9926860ba9
4. https://home.cern/resources/faqs/facts-and-figures-about-lhc