SPIDAS case study – Primary school (Year 6)

As part of our pilot study, Year 6 (10-11 year old) students investigated whether weather variables (such as hourly temperature, hourly mean wind direction, hourly relative humidity and hourly global radiation) might be related to students’ attendance, health and injuries at school. The key skills were interpreting graphical data, interpreting averages and drawing conclusions from data.

Data sets were provided by the Met Office as well as the school. The students had already studied basic statistical concepts based on the UK national curriculum, such as appropriate graphical representations of discrete, continuous and grouped data, and appropriate measures of central tendency (mean, mode, median). Drawing on what we learnt from the first pilot study, we paid particular attention to the following aspects in this pilot:

  • Exploratory talk around graphical data is key
  • Pupils need to be extended by dealing with bigger data sets
  • Digital tools must be carefully selected (Excel was quite time consuming)

The students were confident in using ICT tools but had not used CODAP before, so there was a quick introduction to the tool. Although they had not yet learnt about lines of best fit, the students seemed to have no difficulty reading scatter graphs and best fit lines to make intuitive inferences. Their findings include, for example:

  • Most weather variables bear very weak correlations to attendance.
  • UV radiation seems to correlate with higher attendance. Possible link to Seasonal Affective Disorder.
  • Weather data around windspeed correlates closely to the number of slips, trips and falls in the playground.
  • Absence, regardless of weather, increases across a week.
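
As a rough illustration of what "correlation" means in findings like these, the following Python sketch computes a Pearson correlation coefficient between a weather variable and a school variable. The students explored these relationships visually in CODAP, not in code. The numbers below are invented purely for illustration (they are not the pilot data), and the function name `pearson_r` is our own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for seven school days, NOT the pilot data
wind_speed = [5, 12, 8, 20, 15, 3, 18]
playground_falls = [1, 3, 2, 5, 4, 0, 5]

r = pearson_r(wind_speed, playground_falls)
print(f"r = {r:.2f}")
```

A value of r close to 1 or -1 indicates a close linear relationship, while a value near 0 corresponds to the "very weak correlations" reported for most weather variables. A correlation alone does not show that one variable causes the other.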

At the end of their project, the students collectively produced a video in which they summarised their findings.

A post-project attitude survey suggests that the students had more positive attitudes towards learning statistical concepts and felt less stressed while learning, but their confidence in using ICT tools decreased slightly. This is interesting because ICT tools are heavily embedded in their daily lessons and they are in general very confident in using them. Perhaps the new tool (CODAP) made them realise that not all tools are easy to use, which is not a bad thing to realise!

SPIDAS Data analytics Framework

We define data analytics (DA) as a process of ‘engaging creatively in exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future courses of actions’.


At the heart of our conceptual framework for DA in schools (see the figure below) is this cyclic process of acts that we expect when students engage in DA.

For example, DA associated with data-driven decision making requires a good understanding of the data, which is essential to exploring them. This starts with understanding the system that the data come from and defining the problem (see the PPDAC investigative cycle). Engaging in data exploration then involves extracting and categorizing data and analyzing them through data visualization tools and appropriate calculations. This leads to drawing conclusions, using data as evidence for generalizations that go beyond describing the given data. These earlier steps guide making decisions and predictions with an articulation of uncertainty. Finally, courses of action are evaluated in connection with the problem defined earlier in the process.


The investigative cycle is complemented by various skills and competencies, such as statistical literacy and ICT literacy, that are in line with the “Framework for 21st Century Learning” (http://www.p21.org/about-us/p21-framework) by the Partnership for 21st Century Learning (P21) and with data analytics.


In 2018-19 we will start our teaching experiments with our partnership schools and hopefully, our students will experience this DA process with authentic data and real-life related DA questions!

The teaching of Data Analytics – Modelling approach

A modelling approach is popular in the teaching of mathematics, and we can take this approach to the teaching and learning of data analytics (and statistics). One of the latest special issues of the journal ZDM Mathematics Education contains papers on modelling approaches in statistics education (https://link.springer.com/journal/11858).


This approach can be used to foster students’ statistical inferences (e.g. Doerr et al., 2017), because “statistical modeling simultaneously exposes students to statistical and probability concepts and reasoning between real data distributions and simulated data distributions” (Patel and Pfannkuch, in press, p. 2).

A model is “a representation of structure in a given system” (Hestenes, 2010, p. 17), and curriculum design should consider providing the following opportunities (p. 33):

  • proficiency with conceptual modeling tools
  • qualitative reasoning with model presentations
  • procedures for quantitative measurement
  • comparing models to data

Pfannkuch et al. (2016) proposed a framework for probability/statistics modelling as the following cyclic process: ‘Problem situation – What to know – Assumptions – Build the stochastic model – Test the model – Use the model’. This cyclic process encourages students to see structure in given situations and to apply it, bridging the mathematical and real worlds and enriching their thinking and reasoning. English and Watson (2018) also describe a modelling approach with the following four components: “working in shared problem spaces between mathematics and statistics; interpreting and reinterpreting problem contexts and questions; interpreting, organising and operating on data in model construction; and drawing informal inferences” (p. 103). They consider that this approach has the potential to foster students’ statistical literacy because its problems are complex, unorganised and ill-structured. Students therefore have to make sense of problem contexts, organise data into a form they can manage and work with, and make decisions and inferences about data and their interpretations. For example, students can work with a problem such as:

You are to select Australia’s best 6 swimmers to compete in either the women’s or men’s 100 m freestyle event at the Rio Olympics. Your selection should ensure Australia has the best chance of winning gold. (p. 108).

We are currently trying to design such problems using weather/climate change data from the Met Office in the UK, and we hope to start classroom-based research with them from September 2018!



Doerr, H. M., delMas, R., & Makar, K. (2017). A modeling approach to the development of students’ informal inferential reasoning. Statistics Education Research Journal, 16(2).

English, L. D., & Watson, J. (2018). Modelling with authentic data in sixth grade. ZDM, 50(1-2), 103-115.

Hestenes, D. (2010). Modeling theory for math and science education. In R. Lesh, P. L. Galbraith, C. R. Haines, & A. Hurford (Eds.), Modeling students’ mathematical modeling competencies: ICTMA 13 (pp. 13-41). New York: Springer.  

Patel, A., & Pfannkuch, M. (in press). Developing a statistical modeling framework to characterize Year 7 students’ reasoning. ZDM, 1-16. 

Pfannkuch, M., Budgett, S., Fewster, R., Fitch, M., Pattenwise, S., Wild, C., & Ziedins, I. (2016). Probability modeling and thinking: What can we learn from practice? Statistics Education Research Journal, 15(2).


Technological tools for DA – CODAP

The use of technological tools enriches the experience of teaching and learning data analytics in schools, but which tools are useful? Of course, it depends on the context, the level of the content, and so on. For example, R is a powerful tool for exploring data, but its interface might not be very intuitive for some students. Amongst the many available, I am personally interested in using CODAP.

CODAP is “a free web-based data tool designed as a platform for developers and as an application for students in grades 6–14” according to its website.

I think the exploration of the following data set would be a good starting point.


What can we explore in the above? One possibility is decision making (about cats in this case :)).

Decision making is one of the important aspects of data analytics, and it is now discussed under the term informal statistical inference (Makar and Rubin, 2009). In particular, informal statistical inference, defined as “decision-making in relation to a statistical question for a population based on evidence from a sample and acknowledging a degree of uncertainty in that decision” (Watson and English, 2018, p. 36), can become a foundation for formal/advanced statistical inference.

In particular, as in Tinkerplots, it is possible to separate the data into ‘male’ and ‘female’, which makes the data more accessible and visual and will hopefully support students in their decision-making process. This kind of process is what I really want to see in the classroom pilot studies, which will hopefully start in September 2018…
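
For readers who want to see the idea of separating data by a category outside CODAP or Tinkerplots, here is a minimal Python sketch. The records, the attribute names (`sex`, `weight_kg`) and the helper `split_by` are all hypothetical and are not taken from the CODAP data set mentioned above. The sketch only shows how splitting a data set into ‘male’ and ‘female’ groups makes a comparison between the groups possible.

```python
from collections import defaultdict
from statistics import median


def split_by(records, key):
    """Group a list of dict records by the value stored under `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Hypothetical records, invented for illustration only
cats = [
    {"sex": "female", "weight_kg": 3.2},
    {"sex": "male", "weight_kg": 4.1},
    {"sex": "female", "weight_kg": 3.6},
    {"sex": "male", "weight_kg": 4.8},
    {"sex": "male", "weight_kg": 3.9},
    {"sex": "female", "weight_kg": 2.9},
    {"sex": "male", "weight_kg": 4.4},
]

groups = split_by(cats, "sex")

# Compare the groups using the median of the measured attribute
summary = {
    sex: median(cat["weight_kg"] for cat in members)
    for sex, members in groups.items()
}

for sex, typical_weight in summary.items():
    print(f"{sex}: n = {len(groups[sex])}, median weight = {typical_weight} kg")
```

With a small sample like this, any difference between the groups should be stated with an acknowledgement of uncertainty, which is exactly the point of informal statistical inference.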



Makar, K., & Rubin, A. (2009). A framework for thinking about informal statistical inference. Statistics Education Research Journal, 8(1), 82–105.

Watson, J., & English, L. (2018). Eye color and the practice of statistics in Grade 6: Comparing two groups. Journal of Mathematical Behavior, 49, 35-60.

Insights from research in the teaching of statistics: Six principles of instructional design

In this project we are seeking a better way to teach data analytics, and insights from research on the teaching of statistics are very useful. For example, the Guidelines for Assessment and Instruction in Statistics Education (Franklin et al., 2007) proposed the following components as a framework for the teaching and learning of statistics in schools:

  • Formulate questions, anticipating variability;
  • Collect data, acknowledging variability;
  • Analyze data, taking account of variability;
  • Interpret results, allowing for variability.

Garfield and Ben-Zvi (2009) describe how to implement a statistics course designed to develop students’ statistical inference/reasoning at the introductory secondary or tertiary level. The teaching approach they advocate differs from traditional lectures, i.e. the “teaching as telling” approach (p. 73), and is based on constructivist principles of learning. In this approach, the learning environment involves a “combination of text materials, class activities and culture, discussion, technology, teaching approach and assessment” (p. 73). The approach is guided by Cobb and McClain’s (2004) six principles of instructional design:

  1. Focus on developing central statistical ideas rather than on presenting set of tools and procedures.
  2. Use real and motivating data sets to engage students in making and testing conjectures.
  3. Use classroom activities to support the development of students’ reasoning.
  4. Integrate the use of appropriate technological tools that allow students to test their conjectures, explore and analyse data, and develop their statistical reasoning.
  5. Promote classroom discourse that includes statistical argument and sustained exchanges that focus on significant statistical ideas.
  6. Use assessment to learn what students know and to monitor the development of their statistical learning, as well as to evaluate instructional plans and progress. (Garfield & Ben-Zvi, 2009, p. 73)

According to these principles, it is important for students to develop a deep understanding of key statistical ideas, such as data, distribution, center and variability, and correlation. It is also emphasized that students need to experience various methods of collecting and producing data and to understand how these methods affect the quality of the data and the appropriate types of analysis. Data sets need to be interesting enough to motivate students to make conjectures and test them.

Two different models of class activities are described: 1) engaging students in making conjectures about a statistical problem or data set, and 2) group work on solving a problem. The use of technology allows students to spend more time on learning how to select appropriate analyses and how to interpret data instead of focusing on complicated calculations. Technology tools also help students visualize statistical concepts and understand abstract ideas through simulations…



Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., et al. (2007). Guidelines for assessment and instruction in statistics education (GAISE) report: A pre-K-12 curriculum framework. Alexandria, VA: American Statistical Association. Retrieved from http://www.amstat.org/education/gaise/.

Garfield, J., & Ben-Zvi, D. (2009). Helping Students Develop Statistical Reasoning: Implementing a Statistical Reasoning Learning Environment. Teaching Statistics, 31(3), 72-77.  

The use of technology in Data analytics

One effective approach might be the use of technological tools, in particular when students deal with complex or big data. Tinkerplots (Konold and Miller, 2011) is one such tool, which enables students to interact with data in intuitive and dynamic ways. Students can construct their own representations of data by ordering, stacking and separating data.

The Statstalk project, led by Sibel Kazak, investigated (1) young students’ conceptual understanding of statistical and probabilistic ideas within informal statistical inference, which seems to be neglected in the early grades of schooling, (2) the mediating roles of technological tools and children’s talk, and (3) recommendations for practical educational contexts. Technology-based teaching can support students’ conceptual understanding of mathematics: Tinkerplots becomes scaffolding for developing their statistical inferences and decision making (Kazak et al., 2015a; Kazak et al., 2015b).


But the use of technology is not the final solution, and indeed we need to consider many other factors, such as task design, the teacher, and the educational context (Drijvers, 2015)!


While Tinkerplots can offer powerful learning opportunities for students, the role of teachers should not be underestimated. For example, Watson and English (2018), who also used Tinkerplots, found that:

It was disappointing that some students did not score as well on Parts B or C of the Assessment that were based on TinkerPlots graphs. Although students had filled in their Workbooks during class individually, they worked in pairs using TinkerPlots. It may have happened that one of the pair took over the manipulation on the computer screen, with the other not paying attention, hence recalling less of the activity. (p. 18) 

This implies that teachers need to carefully monitor or manage the use of Tinkerplots during teaching. In fact, Panero and Aldon (2016) suggest that teachers’ roles are very important in technology-based learning environments.


In this project we hope to find effective ways to teach data analytics with technological tools such as Tinkerplots!



Drijvers, P. (2015). Digital technology in mathematics education: Why it works (or doesn’t). In Selected regular lectures from the 12th international congress on mathematical education (pp. 135-151). Springer, Cham.

Kazak, S., Wegerif, R., & Fujita, T. (2015a). The importance of dialogic processes to conceptual development in mathematics. Educational Studies in Mathematics, 90(2), 105-120.

Kazak, S., Wegerif, R., & Fujita, T. (2015b). Combining scaffolding for content and scaffolding for dialogue to support conceptual breakthroughs in understanding probability. ZDM, 47(7), 1269-1283. 

Panero, M., & Aldon, G. (2016). How teachers evolve their formative assessment practices when digital tools are involved in the classroom. Digital Experiences in Mathematics Education, 2(1), 70-86. 

Watson, J., & English, L. (2018). Eye color and the practice of statistics in Grade 6: Comparing two groups. Journal of Mathematical Behavior, 49, 35-60.

The teaching of Data analytics in schools: 4 points for teaching

Data analytics in schools might be defined as a process of exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future paths or courses of action.


The related competencies for productive data analytics are understanding of fundamental concepts, fluency in statistical techniques and procedures, statistical inference, communication and collaboration, and ethics and social impact. A question to be explored is which teaching approaches should be considered so that our students can engage in productive data analytics and develop the relevant competencies. Vidic (2006) summarised the approaches to statistics teaching recommended by many authors in the ICOTS proceedings on teaching statistics and the Journal of Statistics Education, each of which focused on one of the following features:

  • teaching statistics through concrete real-life cases,
  • teaching statistics with computer programs,
  • cooperative learning with group work,
  • active learning with experiments.

In this project, we will start by reviewing literature related to the above four points in order to gain ideas for innovative teaching approaches to data analytics, and we will post updates on this blog!


Vidic, A. (2006). A model for teaching basic engineering statistics in Slovenia. Metodološki zvezki, 3(1), 163-183.

Competency for Data analytics: Statistical literacy

In addition to skills in and understanding of statistical concepts and procedures, various other skills and competencies are necessary to examine data, to make decisions and predictions, and to critically evaluate present/future paths or courses of action, that is, to carry out data analytics in schools.

Statistical literacy, as proposed by Gal (2002), consists of the following two components:

“(a) people’s ability to interpret and critically evaluate statistical information, data-related arguments, or stochastic phenomena, which they may encounter in diverse contexts, and when relevant,

(b) their ability to discuss or communicate their reactions to such statistical information, such as their understanding of the meaning of the information, their opinions about the implications of this information, or their concerns regarding the acceptability of given conclusions” (pp. 2-3).

This statistical literacy, claimed to be ‘a pre-requisite for an informed democracy’ by Ridgway and Nicholson (2017, p. 13), relates to more general 21st century skills and competencies such as communication, collaboration, creativity, critical thinking and digital literacy (e.g. Ananiadou and Claro, 2009; Dede, 2010). Ananiadou and Claro (2009) propose three dimensions as a framework for 21st century skills and competencies, i.e. information, communication, and ethics and social impact, with a strong link to the use of ICT. These competencies are now recognised as essential for working in industry (examples from the Met Office are here!).

We are currently reviewing the existing literature in order to gain insights into which teaching approaches can be taken to develop the various competencies related to data analytics in schools. If you have good ideas, please let us know!



Ananiadou, K., & Claro, M. (2009). 21st century skills and competences for new millennium learners in OECD countries. 

Dede, C. (2010). Comparing frameworks for 21st century skills. 21st century skills: Rethinking how students learn, 20, 51-76.

Gal, I. (2002). Adults’ statistical literacy: Meanings, components, responsibilities. International Statistical Review, 70(1), 1-25.

Ridgway, J., & Nicholson, J. (2017). Editorial: The future of statistical literacy is the future of statistics. Statistics Education Research Journal, 16(1), 8-14.

Competency for Data analytics: Statistics education point of view

We defined data analytics in schools as a process of exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future paths or courses of action.

What skills and competencies, then, are necessary for the above process? In the context of teaching and learning data analytics in schools, knowledge and understanding of statistical concepts and techniques are perhaps essential. This includes understanding fundamental concepts such as frequency, mean/mode/median and standard deviation, as well as procedures such as reading, sorting and cleaning data. We also expect a certain degree of fluency in these procedures, which is very important in the learning of mathematics (e.g. Foster, 2017).
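
To make these concepts and procedures concrete, the short Python sketch below tidies a small list of readings (removing missing values and sorting) and then computes the mean, median, mode and standard deviation using Python's built-in `statistics` module. The temperature values and the helper names `clean` and `summarise` are made up for illustration. We use the population standard deviation (`pstdev`) here as an assumption; `stdev` would give the sample version.

```python
from statistics import mean, median, mode, pstdev


def clean(values):
    """Drop missing readings (None) and return the rest in ascending order."""
    return sorted(v for v in values if v is not None)


def summarise(values):
    """Return basic summary statistics for a list of numeric readings."""
    data = clean(values)
    return {
        "n": len(data),
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "sd": pstdev(data),  # population standard deviation
    }


# Hypothetical daily temperatures in degrees Celsius, with two missing readings
raw_temperatures = [14, None, 12, 17, 14, 22, None, 15, 18]

summary = summarise(raw_temperatures)
for name, value in summary.items():
    print(name, value)
```

In a classroom the same steps would more likely be carried out with a tool such as CODAP, but the order of operations (tidy first, then summarise) is the same.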

Statistical inference is another important competency. In particular, informal statistical inference, defined as “decision-making in relation to a statistical question for a population based on evidence from a sample and acknowledging a degree of uncertainty in that decision” (Watson and English, 2018, p. 36), can become a foundation for formal/advanced statistical inference.

In this SPIDAS project, we are seeking effective teaching approaches which would develop such competencies through investigating data related to weather and climate changes…



Foster, C. (2017). Developing mathematical fluency: comparing exercises and rich tasks. Educational Studies in Mathematics, 1-21. 

Watson, J., & English, L. (2018). Eye color and the practice of statistics in Grade 6: Comparing two groups. Journal of Mathematical Behavior, 49, 35-60.

Three challenges to teach informal statistical inferences

A key idea in the teaching and learning of data analytics is perhaps informal statistical inferences, defined as ‘a probabilistic generalization based on the evidence of data’ (Makar & Rubin, 2009).

But teaching statistical inference is not easy. Bakker and Derry (2011) summarised three challenges that we face when we consider teaching approaches and design sequences of activities for teaching informal statistical inference to students. These three challenges relate to:

  1. avoiding inert knowledge – “Even if students have learned the main statistical concepts and graphical displays, they often fail to use them to solve statistical problems.” (p. 7)

  2. avoiding atomistic approaches – approaches “found in many textbooks and to foster coherence from a student perspective.” (p. 6)

  3. sequencing topics from a student perspective – “If the scientific definition of concept D builds on concepts A, B, and C, should A, B, and C then be taught before D?” (p. 8) Is there an alternative way of sequencing? Bakker and Derry continue: “The empirical studies cited previously with young students seem to vote against such a view, because awareness of distribution is intricately connected to awareness of center, spread, skewness, and shape as represented in dot plots, histograms, or box plots.” (p. 8)

Are these challenges common in Spain, Turkey or the UK in the teaching of data analytics and statistics? Bakker and Derry (2011) proposed ideas for overcoming these challenges, but I would like to share various opinions before summarising Bakker and Derry’s ideas here!



Bakker, A., & Derry, J. (2011). Lessons from inferentialism for statistics education. Mathematical Thinking and Learning, 13(1-2), 5-26.

Makar, K., & Rubin, A. (2009). A framework for thinking about informal statistical inference. Statistics Education Research Journal, 8(1), 82–105.