## SPIDAS case study – Primary school (Year 6)

As part of our pilot study, Year 6 (10-11 year old) students investigated whether the weather (measured by variables such as hourly temperature, hourly mean wind direction, hourly relative humidity and hourly global radiation) might be related to students’ attendance, health and injuries in school. The key skills are interpreting graphical data, interpreting averages and drawing conclusions from data.

Data sets were provided by the Met Office and by the school. The students had already studied basic statistical concepts from the UK national curriculum, such as appropriate graphical representations of discrete, continuous and grouped data, and appropriate measures of central tendency (mean, mode, median). Drawing on what we learnt from the first pilot study, we paid particular attention to the following aspects in this pilot:

• Exploratory talk around graphical data is key
• Pupils need to be extended by dealing with bigger data sets
• Digital tools must be carefully selected (Excel was quite time consuming)

The students were confident in using ICT tools, but they had not used CODAP before, so we gave a quick introduction to the tool. Although they had not yet learnt about lines of best fit, the students seemed to have no difficulty reading scatter graphs and best fit lines and using them intuitively to make inferences. Their findings include, for example:

• Most weather variables bear very weak correlations to attendance.
• UV radiation seems to correlate with higher attendance. Possible link to Seasonal Affective Disorder.
• Weather data around wind speed correlates closely with the number of slips, trips and falls in the playground.
• Absence, regardless of weather, increases across a week.
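The students read correlations and best fit lines from CODAP graphs. For readers who want to see the calculation behind such a graph, here is a minimal sketch in Python. The numbers are hypothetical (they are not the pilot data), and the function names `pearson_r` and `best_fit_line` are our own choices for illustration.

```python
from statistics import mean


def _centred_sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    sxy, sxx, syy = _centred_sums(xs, ys)
    return sxy / (sxx * syy) ** 0.5


def best_fit_line(xs, ys):
    """Least-squares line of best fit, returned as (slope, intercept)."""
    sxy, sxx, _ = _centred_sums(xs, ys)
    slope = sxy / sxx
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# Hypothetical values for illustration only, NOT the pilot data.
wind_speed = [2, 4, 6, 8, 10]          # e.g. mean wind speed on five days
playground_falls = [1, 2, 2, 4, 5]     # e.g. recorded slips, trips and falls

r = pearson_r(wind_speed, playground_falls)
slope, intercept = best_fit_line(wind_speed, playground_falls)
print(f"r = {r:.2f}")
print(f"best fit line: falls = {slope:.2f} * wind_speed + {intercept:.2f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship, and a value close to 0 a very weak one, which is how the students’ phrases “correlates closely” and “very weak correlations” can be read.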

At the end of their project, the students collectively produced a video in which they summarised their findings.

A post-project attitude survey suggests that the students now have more positive attitudes towards learning statistical concepts and feel less stressed while learning, but that their confidence in using ICT tools decreased slightly. This is interesting, as ICT tools are heavily embedded in their daily lessons and they are in general very confident in using them. Perhaps the new tool (CODAP) made them realise that not all tools are easy to use, which is not a bad thing to realise!

## SPIDAS Data analytics Framework

We define data analytics (DA) as a process of ‘engaging creatively in exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future courses of actions’.

At the heart of our conceptual framework for DA in schools (see the figure below) is this cyclic process of acts that we expect when students engage in DA.

For example, DA associated with data-driven decision making also requires a good understanding of the data, which is essential to exploring them. This starts with understanding the system that the data come from and defining the problem (see the PPDAC investigative cycle). Engaging in data exploration then involves extracting and categorizing data and analyzing them through data visualization tools and appropriate calculations. This leads to drawing conclusions by using data as evidence for generalizations that go beyond describing the given data. These earlier steps guide the making of decisions and predictions with an articulation of uncertainty. Finally, courses of action are evaluated in connection with the problem defined earlier in the process.

The investigative cycle is complemented by various skills and competencies that are in line with the “Framework for 21st Century Learning” (http://www.p21.org/about-us/p21-framework) by the Partnership for 21st Century Learning (P21), or with data analytics more specifically, such as statistical literacy, ICT literacy, and so on.

In 2018-19 we will start our teaching experiments with our partnership schools, and we hope our students will experience this DA process with authentic data and real-life DA questions!

## The teaching of Data Analytics – Modelling approach

A modelling approach is popular in the teaching of mathematics, and we can also take this approach to the teaching and learning of data analytics (and statistics). One of the latest special issues of the journal ZDM Mathematics Education publishes papers on modelling approaches in statistics education (https://link.springer.com/journal/11858).

This approach can be used to foster students’ statistical inferences (e.g. Doerr et al., 2017), because “statistical modeling simultaneously exposes students to statistical and probability concepts and reasoning between real data distributions and simulated data distributions” (Patel and Pfannkuch, in press, p. 2).

A model is “a representation of structure in a given system” (Hestenes, 2010, p. 17), and curriculum design should consider providing the following opportunities (p. 33):

• proficiency with conceptual modeling tools
• qualitative reasoning with model representations
• procedures for quantitative measurement
• comparing models to data

Pfannkuch et al. (2016) proposed a framework for probability/statistics modelling as the following cyclic process: ‘Problem situation – What to know – Assumptions – Build the stochastic model – Test the model – Use the model’. This cyclic process encourages students to see structure in given situations and to apply it, bridging the mathematical and real worlds and enriching their thinking and reasoning.

English and Watson (2018) also describe a modelling approach with the following four components: “working in shared problem spaces between mathematics and statistics; interpreting and reinterpreting problem contexts and questions; interpreting, organising and operating on data in model construction; and drawing informal inferences.” (p. 103). They consider that this approach has the potential to foster students’ statistical literacy because the problems are complex, unorganised and ill-structured. Students therefore have to make sense of problem contexts, organise data so that they can manage and work with them, and make decisions and inferences about the data and their interpretations. For example, students can work with a problem such as:

You are to select Australia’s best 6 swimmers to compete in either the women’s or men’s 100 m freestyle event at the Rio Olympics. Your selection should ensure Australia has the best chance of winning gold. (p. 108).

We are currently trying to design such problems using weather/climate change data from the Met Office in the UK, and we hope to start classroom-based research with them from September 2018!

References

Doerr, H. M., Delmas, R., & Makar, K. (2017). A modeling approach to the development of students’ informal inferential reasoning. Statistics Education Research Journal, 16(2).

English, L. D., & Watson, J. (2018). Modelling with authentic data in sixth grade. ZDM, 50(1-2), 103-115.

Hestenes, D. (2010). Modeling theory for math and science education. In R. Lesh, P. L. Galbraith, C. R. Haines, & A. Hurford (Eds.), Modeling students’ mathematical modeling competencies: ICTMA 13 (pp. 13-41). New York: Springer.

Patel, A., & Pfannkuch, M. (in press). Developing a statistical modeling framework to characterize Year 7 students’ reasoning. ZDM, 1-16.

Pfannkuch, M., Budgett, S., Fewster, R., Fitch, M., Pattenwise, S., Wild, C., & Ziedins, I. (2016). Probability modeling and thinking: What can we learn from practice? Statistics Education Research Journal, 15(2).

## Insights from research in the teaching of statistics: Six principles of instructional design

In this project we are seeking a better way to teach data analytics, and insights from research on the teaching of statistics are very useful. For example, the Guidelines for Assessment and Instruction in Statistics Education (Franklin et al., 2007) proposed the following components as a framework for the teaching and learning of statistics in schools:

• Formulate questions, anticipating variability;
• Collect data, acknowledging variability;
• Analyze data, taking account of variability;
• Interpret results, allowing for variability.

Garfield and Ben-Zvi (2009) describe how to implement a statistics course designed to develop students’ statistical inference/reasoning at the introductory secondary or tertiary level. The teaching approach they advocate differs from traditional lectures, i.e. from a “teaching as telling” approach (p. 73), and is based on constructivist principles of learning. In this approach, the learning environment involves “combination of text materials, class activities and culture, discussion, technology, teaching approach and assessment” (p. 73). The approach is guided by Cobb and McClain’s (2004) six principles of instructional design:

1. Focus on developing central statistical ideas rather than on presenting set of tools and procedures.
2. Use real and motivating data sets to engage students in making and testing conjectures.
3. Use classroom activities to support the development of students’ reasoning.
4. Integrate the use of appropriate technological tools that allow students to test their conjectures, explore and analyse data, and develop their statistical reasoning.
5. Promote classroom discourse that includes statistical argument and sustained exchanges that focus on significant statistical ideas.
6. Use assessment to learn what students know and to monitor the development of their statistical learning, as well as to evaluate instructional plans and progress. (Garfield & Ben-Zvi, 2009, p. 73)

According to these principles, it is important for students to develop a deep understanding of key statistical ideas, such as data, distribution, center and variability, correlation, etc. It is also emphasized that students need to experience various methods of collecting and producing data and to understand how these methods affect the quality of the data and the appropriate types of analysis. Data sets need to be interesting enough to motivate students to make conjectures and test them.

Two different models of class activities are described: 1) engaging students in making conjectures about a statistical problem or data set, and 2) group work for solving a problem. The use of technology allows students to spend more time on learning how to select appropriate analyses and how to interpret data instead of focusing on complicated calculations. Technology tools also help students visualize statistical concepts and understand abstract ideas through simulations…

References

Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., et al. (2007). Guidelines for assessment and instruction in statistics education (GAISE) report: A pre-K-12 curriculum framework. Alexandria, VA: American Statistical Association. Retrieved from http://www.amstat.org/education/gaise/.

Garfield, J., & Ben-Zvi, D. (2009). Helping Students Develop Statistical Reasoning: Implementing a Statistical Reasoning Learning Environment. Teaching Statistics, 31(3), 72-77.

## The teaching of Data analytics in schools: 4 points for teaching

Data analytics in schools might be defined as a process of exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future paths or courses of action.

The related competencies for productive data analytics are understanding of fundamental concepts, fluency with statistical techniques and procedures, statistical inference, communication and collaboration, and awareness of ethics and social impact. A question to be explored is which teaching approaches should be considered so that our students can engage in productive data analytics and develop the relevant competencies. Vidic (2006) summarised the approaches to teaching statistics recommended by many authors in the proceedings of ICOTS (the International Conference on Teaching Statistics) and in the Journal of Statistics Education, each of which focused on one of the following features:

• teaching statistics through concrete real-life cases,
• teaching statistics with computer programs,
• cooperative learning with group work,
• active learning with experiments.

In this project, we will start by reviewing the literature related to the above four points in order to gain ideas for innovative approaches to teaching data analytics. We will post updates on this blog!

Reference

Vidic, A. (2006). A model for teaching basic engineering statistics in Slovenia. Metodološki zvezki, 3(1), 163-183.

## Competency for Data analytics: Statistical literacy

Data analytics in schools means examining data, making decisions and predictions, and critically evaluating present/future paths or courses of action. In addition to skills in and understanding of statistical concepts and procedures, this requires various other skills and competencies.

Statistical literacy, as described by Gal (2002), consists of the following two components:

“(a) people’s ability to interpret and critically evaluate statistical information, data-related arguments, or stochastic phenomena, which they may encounter in diverse contexts, and when relevant,

(b) their ability to discuss or communicate their reactions to such statistical information, such as their understanding of the meaning of the information, their opinions about the implications of this information, or their concerns regarding the acceptability of given conclusions” (pp. 2-3)

This statistical literacy, described as ‘a pre-requisite for an informed democracy’ by Ridgway and Nicholson (2017, p. 13), relates to the more general skills and competencies known as 21st century skills, such as communication, collaboration, creativity, critical thinking, digital literacy, etc. (e.g. Ananiadou & Claro, 2009; Dede, 2010). Ananiadou and Claro (2009) propose three dimensions as a framework for 21st century skills and competencies, i.e. information, communication, and ethics and social impact, with a strong link to the use of ICT. These competencies are now recognised as essential for working in industry (Examples from the Met Office are here!).

We are currently reviewing the existing literature in order to gain insights into which teaching approaches can be taken to develop the various competencies related to data analytics in schools. If you have good ideas, please let us know!

References

Ananiadou, K., & Claro, M. (2009). 21st century skills and competences for new millennium learners in OECD countries.

Dede, C. (2010). Comparing frameworks for 21st century skills. 21st century skills: Rethinking how students learn, 20, 51-76.

Gal, I. (2002). Adults’ statistical literacy: Meanings, components, responsibilities. International Statistical Review, 70(1), 1-25.

Ridgway, J., & Nicholson, J. (2017). Editorial: The future of statistical literacy is the future of statistics. Statistics Education Research Journal, 16(1), 8-14.

## Competency for Data analytics: Statistics education point of view

We defined data analytics in schools as a process of exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future paths or courses of action.

What skills and competencies, then, are necessary for this process? In the context of teaching and learning data analytics in schools, knowledge and understanding of statistical concepts and techniques are probably essential. This would include understanding fundamental concepts such as frequency, mean/mode/median and standard deviation, as well as procedures such as reading, sorting and cleaning data. We also expect a certain degree of fluency in these procedures, which is very important in the learning of mathematics (e.g. Foster, 2017).
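As a small illustration of the fundamental measures mentioned above, the sketch below uses Python’s standard `statistics` module. The data are hypothetical daily absence counts invented for this example, and the function name `summarise` is our own. We use the population standard deviation (`pstdev`) here; the sample version (`stdev`) would be an equally reasonable choice.

```python
from statistics import mean, median, mode, pstdev


def summarise(values):
    """Return the basic summary measures for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        # Population standard deviation; use statistics.stdev for a sample.
        "standard deviation": pstdev(values),
    }


# Hypothetical daily absence counts, for illustration only.
daily_absences = [3, 5, 5, 6, 8, 9]

summary = summarise(daily_absences)
for name, value in summary.items():
    print(f"{name}: {value}")
```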

Statistical inference is another important competency. In particular, informal statistical inference, defined as “decision-making in relation to a statistical question for a population based on evidence from a sample and acknowledging a degree of uncertainty in that decision” (Watson and English, 2018, p. 36), can become a foundation for formal/advanced statistical inference.
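One way to make “acknowledging a degree of uncertainty” concrete is to resample the sample itself and see how much the estimate varies. The sketch below uses a percentile bootstrap for the mean. This is our own choice of technique for illustration, not a method taken from Watson and English, and the temperature values, the function name `bootstrap_interval` and its parameters are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_interval(sample, n_resamples=2000, level=0.9, seed=0):
    """Percentile bootstrap interval for the mean of `sample`.

    Resamples the data with replacement, records the mean of each
    resample, and returns the range covering roughly the central
    `level` share of those means.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    means = sorted(
        mean(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    lo_index = int((1 - level) / 2 * n_resamples)
    hi_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_index], means[hi_index]


# Hypothetical sample of midday temperatures (degrees C), for illustration only.
temperatures = [11, 12, 12, 13, 14, 14, 15, 15, 16, 18]

low, high = bootstrap_interval(temperatures)
print(f"sample mean: {mean(temperatures):.1f}")
print(f"plausible range for the population mean: {low:.1f} to {high:.1f}")
```

An informal inference would then read something like “the typical midday temperature is probably somewhere in this range”, rather than quoting a single number as certain.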

In this SPIDAS project, we are seeking effective teaching approaches which would develop such competencies through investigating data related to weather and climate changes…

References

Foster, C. (2017). Developing mathematical fluency: comparing exercises and rich tasks. Educational Studies in Mathematics, 1-21.

Watson, J., & English, L. (2018). Eye color and the practice of statistics in Grade 6: Comparing two groups. Journal of Mathematical Behavior, 49, 35-60.

## Three challenges to teach informal statistical inferences

A key idea in the teaching and learning of data analytics is perhaps informal statistical inferences, defined as ‘a probabilistic generalization based on the evidence of data’ (Makar & Rubin, 2009).

But teaching statistical inference is not easy. Bakker and Derry (2011) summarised three challenges that we face when we consider teaching approaches and design sequences of activities for teaching informal statistical inference. These three challenges relate to:

1. avoiding inert knowledge – “Even if students have learned the main statistical concepts and graphical displays, they often fail to use them to solve statistical problems.” (p. 7)

2. avoiding atomistic approaches – approaches “found in many textbooks and to foster coherence from a student perspective.” (p. 6)

3. sequencing topics from a student perspective – “If the scientific definition of concept D builds on concepts A, B, and C, should A, B, and C then be taught before D?” (p. 8) Or is there an alternative way of sequencing? Bakker and Derry continue: “The empirical studies cited previously with young students seem to vote against such a view, because awareness of distribution is intricately connected to awareness of center, spread, skewness, and shape as represented in dot plots, histograms, or box plots.” (p. 8)

Are these challenges common in Spain, Turkey or the UK in the teaching of data analytics and statistics? Bakker and Derry (2011) proposed ideas for overcoming these challenges, but I would like to share various opinions before summarising their ideas here!

References

Bakker, A., & Derry, J. (2011). Lessons from inferentialism for statistics education. Mathematical Thinking and Learning, 13(1-2), 5-26.

Makar, K., & Rubin, A. (2009). A framework for thinking about informal statistical inference. Statistics Education Research Journal, 8(1), 82–105.

## Why do we have to teach Data Analytics in schools? Employers’ point of view

Why do we have to teach data analytics in schools?

Employers’ need for data analytics skills, in particular for big data, is growing rapidly. The UK government stated the following in its policy paper UK Digital Strategy:

Developments such as the rising use of social media and the increasing adoption of new technologies like the Internet of Things mean more data is being produced than ever before. At the same time, lower costs of collection, storage and processing – coupled with rising computing power – are making this data a rich raw material. This is creating new opportunities for business growth across all industry sectors, changing how we innovate, market, sell and consume services.

The Financial Times already pointed out in 2014 that “Britain is expected to create an average of 56,000 big data jobs a year until 2020”. It also stated that “The big data industry has thrived as sectors as diverse as weather forecasting and fraud investigation have realised the business benefits of using consumer data to predict future trends.” Similarly, in a 2014 summary report, the company Statistical Analysis System (SAS) reported that “With demand for big data specialists forecast to increase by 160 per cent between 2013 and 2020, adding 346,000 big data jobs, hiring and keeping skilled big data refiners could become a costly exercise.” (2014, p. 5)

It is also reported that employers are having difficulty finding potential employees who are well equipped with data analytics skills. For example, SAS reported that managerial skills, acumen, sector knowledge/understanding, and presentation and interpersonal skills are particularly difficult to find when filling positions (ibid., p. 30).

In our project, we will explore innovative teaching approaches to develop students’ skills and understanding related to Data analytics!

References

DfDCM&S. (2017). Policy Paper UK Digital Strategy (https://www.gov.uk/government/publications/uk-digital-strategy, accessed on 30/01/2018)

SAS. (2014). Big Data Analytics Assessment of Demand for Labour and Skills 2013–2020. SAS UK & Ireland.

Warrell, H. (2014). Demand for big data and skills shortages drives wages boom. Financial Times (https://www.ft.com/content/953ff95a-6045-11e4-88d1-00144feabdc0, accessed on 30/01/2018)

## What is Data Analytics in schools?

What is Data Analytics in schools? Picciano (2012) provides us with a perspective on analytics, which he defines as follows:

“The generic definition of analytics is similar to data-driven decision making. Essentially it is the science of examining data to draw conclusions and, when used in decision making, to present paths or courses of action. In recent years, the definition of analytics has gone further, however, to incorporate elements of operations research such as decision trees and strategy maps to establish predictive models and to determine probabilities for certain courses of action.” (p. 12)

We can also find many definitions of data analytics from Internet sources. For example:

“Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions and by scientists and researchers to verify or disprove scientific models, theories and hypotheses.” http://searchdatamanagement.techtarget.com/definition/data-analytics

In order to visualise what key ideas are used in the definition of data analytics, we created a wordcloud from 9 Internet sources as follows:

As we can see, the word ‘business’ is the most frequently used in these definitions and descriptions of data analytics, followed by other words such as ‘information’, ‘process’, ‘statistical’ and ‘predictive’. This echoes Picciano’s definition of analytics above.
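A word cloud is essentially a picture of word frequencies, with more frequent words drawn larger. The sketch below shows the counting step in Python. The two short texts stand in for the nine Internet sources and are invented for this example, and the stop-word list and the function name `word_frequencies` are our own assumptions rather than a record of how our cloud was actually produced.

```python
import re
from collections import Counter

# A small, hypothetical stop-word list; real tools ship much longer ones.
STOP_WORDS = {"a", "an", "and", "are", "by", "for", "in", "is", "of",
              "or", "the", "they", "to", "with"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


# Invented stand-ins for the definitions collected from Internet sources.
definitions = [
    "Data analytics is the process of examining data to make business decisions.",
    "Analytics helps a business turn information into predictive business insight.",
]

for word, count in word_frequencies(definitions).most_common(3):
    print(word, count)
```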

We also see the word ‘big’ in the above cloud, but data do not have to be big, although it is also stated, for example, that “data analytics is a general term for any type of processing that looks at historical data over time, but as the size of organizational data grows, the term data analytics is evolving to favor big data-capable systems” (https://www.informatica.com/services-and-training/glossary-of-terms/data-analytics-definition.html#fbid=XLAlnTWWkWy).

In summary, we would like to define data analytics as a process of ‘exploring data, including big data, to understand our world better, to draw conclusions, to make decisions and predictions, and to critically evaluate present/future paths or courses of action’.

Reference

Picciano, A. G. (2012). The evolution of big data and learning analytics in American higher education. Journal of Asynchronous Learning Networks, 16(3), 9-20.