What’s going on, what’s going on?

For many high school teachers here in New Zealand, the teaching year is over and it’s now a six-week summer break before school starts again next year. Despite the well-deserved break, some teachers are already thinking about ideas for next year. I’ve been amazed (and inspired) by the teachers who have signed up to spend a day with Liza and me on Friday 15th December to learn more about working with modern data (more details here). We are both really looking forward to the full-day workshop 🙂 One of the tools we’ll be working with at the workshop is the platform IFTTT (If This Then That). It’s basically a way to connect devices and online accounts using APIs (application programming interfaces) without writing code.

I used IFTTT recently to collect data on New York Times articles. One of the reasons I started collecting this data was their free online feature “What’s Going On in This Graph?”. On Tuesday, December 12 and every second Tuesday of the month through the US school year, The New York Times Learning Network, in partnership with the American Statistical Association, hosts a live online discussion about a timely graph like the one shown below.

One of the super interesting graphs featured

Students from around the world “read” the graph by posting comments about what they notice and wonder in an online forum. Teachers live-moderate by responding to the comments in real time and encouraging students to go deeper. All releases are archived so that teachers can use previous graphs anytime (read this introductory post to learn more). I used “What’s Going On in This Graph?” when I was teaching our Lies, Damned Lies and Statistics course, and it is such an awesome resource for helping build statistical literacy and thinking.

So, inspired by the New York Times graphs, about two months ago I created an “applet” on IFTTT that creates a new row in a Google spreadsheet every time a new article is posted to the New York Times website. It stopped working for some reason at the end of November – check out the “raw” data here: https://docs.google.com/spreadsheets/d/1PXGh0xBrJbmrfWq3nRylH5GBqzVd4SYWWiXQj3v9tdQ/edit?usp=sharing 

So what’s going on with the data I collected? Your first thought on viewing the data might be – huh? You call this data? The only variable that is “graph ready” is which section each of the nearly 6000 articles was published in. But there are so many variables in data sets just like this one waiting to be defined and explored. After our workshop on Friday, I’ll post an “after” version of this same data set 🙂
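As a rough illustration of what “graph ready” means here, the sketch below tallies articles by section. It assumes the spreadsheet has been exported as CSV text with a column named `section`; that column name and the sample rows are made up for illustration and are not taken from the actual sheet.

```python
import csv
import io
from collections import Counter


def count_by_section(csv_text, section_column="section"):
    """Tally how many rows (articles) fall in each section."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[section_column].strip() for row in reader)


# Made-up rows standing in for the real export (hypothetical column names).
sample = """title,section
Article A,World
Article B,Sports
Article C,World
"""

counts = count_by_section(sample)
for section, n in counts.most_common():
    print(section, n)
```

A tally like this is enough for a bar graph of sections; the other columns (titles, dates, links) would need new variables defined from them before they could be graphed.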

Mind the stats?


Have you noticed how Google sometimes gives the top page in your search results a little summary box? For example, if you Google “how to plan a honeymoon”, you get this:


Since I didn’t do number two on this list, my job for tonight was to check out trains for our travel in the UK leg of our honeymoon. After my first Google search, I got a little distracted and consequently typed up this short post 🙂  I realised part way through that “mind the gap” is more of a London underground thing than a UK train travel thing, but it’s late so hopefully the reference still makes sense.

My first (and only) search tonight was for a train from London to Cambridge. Before even clicking through to the website listed, I got to read this little “statistical report” 🙂


The first two sentences got me questioning what “fastest journey time” means, since how can the “average journey time” be lower than the shortest journey time? The third sentence made me shake my head at the misuse of our special stats word “average”, and I automatically re-worded that sentence in my head to “on weekdays there are, on average, 96 trains per day…..”

So not only because I actually needed to find out about trains from London to Cambridge, but also because I was curious to find out what “fastest journey time” means, I clicked through to https://www.thetrainline.com/train-times/london-kings-cross-to-cambridge-station

When you scroll down to the bottom you get this nice table:


This gives some immediate answers to my confusion about the Google search summary – I think. “Slowest route” actually means the minimum time, and “Fastest route” means the maximum time. At least now the average journey time of one hour sits between these two numbers, but did you notice when you scrolled down the page that there were some routes listed with times greater than 63 minutes, the supposed “fastest route”?

Me too, so I went through all routes for the next 24 hours (starting from 8:44am London time) and listed their times:


There are bound to be a few mistakes in there from when I was converting from hours to minutes 🙂 But to finish this short critique, let’s look at the data:


For this particular 24 hour period (from Monday 21st November 8:44am) there were 76 trains from London to Cambridge, with a mean journey time of around 64 minutes (based on the advertised times). If I wanted to check out the claims about the average number of trains per weekday and the average journey time, I’d need a better sampling method and more “weekdays” of data. But this sample does offer evidence to contradict the claims about “shortest” and “fastest” journey times.
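For anyone wanting to repeat this kind of check without converting hours to minutes by hand, here is a minimal sketch. It assumes the advertised durations have been typed out as text like `1h 3m` or `48m`; the format, the helper name `to_minutes` and the four example durations are all assumptions for illustration, not the times I actually recorded.

```python
import re
from statistics import mean


def to_minutes(text):
    """Convert a duration such as '1h 3m', '48m' or '1h' to whole minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


# Made-up advertised durations, for illustration only.
advertised = ["48m", "1h 3m", "1h 25m", "52m"]
minutes = [to_minutes(t) for t in advertised]

summary = {
    "n": len(minutes),
    "min": min(minutes),
    "mean": mean(minutes),
    "max": max(minutes),
}
print(summary)
```

Comparing the minimum and maximum from a listing like this against the “slowest” and “fastest” figures in the table is all the check above amounts to.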

Unless those terms still don’t mean what I think they mean, even when I reverse them 🙂

When is a statistical report statistical enough?


We are super lucky in NZ to have something as important as statistical literacy not just written into our curriculum at all levels, but also formally assessed as part of our national qualifications system (NCEA). All of our students have to deal with messages given to them that are based on data, whether it be in the news, online via their Facebook feed, or on TV through advertising (just to list a few examples), so even if we don’t actually assess AS91266 Evaluate a statistically based report or AS91584 Evaluate statistically based reports within a learning programme, it’s still important to include good examples of statistically based reports in our teaching of statistics.

This post will focus on AS91266 Evaluate a statistically based report, as this standard provides a really great opportunity to weave together many of the statistics achievement objectives at curriculum levels six and seven.  This focus will also give me an opportunity to provide some guidance on how to select good examples of statistically based reports that are strong enough statistically to be used for the assessment of this standard and also accessible to the students.

So what do you need to look for in a report?

The following questions are a good place to start when considering the suitability of a statistically based report.

  • Is the purpose of the report clear?
  • Is the report based on a survey?
  • Are the findings of the survey clear?

If you get a yes to all three of these questions, you might be heading in the right direction. For example, this article on Stuff regarding whether eating carbs helps you lose weight is kind of interesting (it even has a video), has lots of contextual information around diets and has a clear purpose, but it’s not a survey and there’s only one result reported (a weight loss of 10kg). In comparison, this article also on Stuff regarding whether children eating carbs can explain rising obesity levels is more on track.

Then the next round of questions to ask are:

  • Are the population measures and variables clear?
  • Are the sampling methods clear?
  • Are the survey methods clear?
  • Is the sample size clear?

If you also get a yes to most of these questions, then the report is probably statistical enough for students to use and will allow them to discuss the statistical methods and measures used and sampling and non-sampling errors (one of the requirements for this standard). According to these questions, the second article given above – whether children eating carbs can explain rising obesity levels – is not yet statistical enough (there are no details in the article about sampling or survey methods), so…..

How do we find a report that is statistical enough?

NZ media sites like nzherald.co.nz and stuff.co.nz, and the super awesome StatsChat statschat.org.nz, are good starting points to find reports (tip – search for key words like “survey”, “study”, “causes” etc.), but just like with the articles above, while the articles may be interesting, engaging and relevant for teenagers, there is still some work to do before they can be used in a meaningful way in the classroom. What follows is an example of making a report statistical enough 🙂

I found this article entitled “The unhealthy reason men go to the gym” on Stuff on October 18th [download pdf]. Read it carefully and see how many of those questions above you can tick yes to. The article itself is a good starting point, but on reading the article you’ll see that it takes a few findings from one study and one survey and blends them together to make the article. Also, none of the last four questions are met, and you could argue the first three are met insufficiently. This report needs more details to make it statistical enough. [Note: At this stage I am motivated enough to continue this process because I like the context and I believe it will resonate with students – this is not always the case!]

The article gives links to both of the original documents. The document for the recent study is only available if you pay for it (or have access through your library service like I do at the University) and is a very technical report that is beyond the reach of our students. The full report from the 2010 survey runs to 139 pages and is accessible to our students [download pdf]. However, since the link to this survey is broken, I had to Google it.


From here there are two options:

  1. Trim the original newspaper article to report only on the 2010 survey and include sufficient details from the original report so that all those questions posed above can be answered (so adapt the article, the approach that seems to be used for the AS91584 exams).
  2. Trim the 139 page report but retain enough information in the report so that all those questions posed above can be answered. I reckon about four pages of A4 is a good target, but it will depend on how the report is formatted, how many graphs are included etc.

I think I prefer the latter, although I have done both in my teaching. The key point here is that students are not expected to go out and find the original article themselves to complete their evaluation of the report.

How do we make a report that is statistical enough?

The 139 page report from Mission Australia is pitched at the right level for curriculum level seven, but is far too long for a class of Year 12 students. Most reports from surveys like this have an introduction and an executive summary. These two sections often provide nearly enough for the report to be used in the classroom. I’ve put these sections into a shorter pdf document, which is well set out and easy to read [download pdf].

In deciding whether the report is statistical enough, we need to consider those questions again:

  • Is the purpose of the report clear?
  • Is the report based on a survey?
  • Are the findings of the survey clear?
  • Are the population measures and variables clear?
  • Are the sampling methods clear?
  • Are the survey methods clear?
  • Is the sample size clear?

The foreword gives a straightforward summary with a reasonable hint of the purpose of the report. The opening of the introduction expands on the purpose, and the next three sections of the introduction (participation, areas of focus and methodology) provide sufficient information about sampling and survey methods. The demographics section of the executive summary provides some population measures, variables and the sample size, and the remaining sections provide the findings of the report. However, it would be easier for students to evaluate if each section contained a little more detail and the information was provided in tables and/or graphs. Half an hour editing the pdf and cutting and pasting some text and tables from the full document creates a report that is statistical enough for a good quality learning or assessment activity – YAY 🙂 [download pdf]

What about contextual knowledge?

Students should also be provided with enough contextual knowledge associated with the report so that they can integrate this within their evaluation. For the example discussed, this could include the original article from Stuff along with the edited version of the full report, making it clear to students they are evaluating that edited version of the survey report not the Stuff article. It would also be a great idea for the class to discuss the topic or issue being explored by the study/survey, and compare and contrast their own experiences. This discussion would help them to consider whether the results reported are meaningful, useful and/or relevant.

Want to read more about statistical literacy?

If you can get yourself a copy of “Seeing through Statistics” by Jessica Utts (the eBook version is available for around NZ$60), you will not be disappointed. This was the first book I read to help get my head around how to teach statistical literacy, and it is full of really great guiding principles, examples and explanations. Like I said at the beginning of the post, even if you aren’t teaching AS91266 or AS91584, it is our job to help our students to make sense of the world around them – a world where so many people are using data and statistics in so many, not always awesome, ways.

Spot the errors – final draft

In NZ, it's report comment writing time for teachers, which means for many statistics teachers not just the fun of writing reports but also the not-so-fun job of checking other teachers' comments for errors (I'm looking at you, English department!). One of the things we used to do every year as part of "report comment writing PD" was to look at different examples of report comments and identify as many errors as possible.

So in line with this kind of activity, for this post I've put together some examples of tasks and/or student responses that demonstrate some common misunderstandings for statistics, each followed by discussion partially informed by comments other teachers made on the earlier version of this post. Use the tabs at the left hand side of this post to move through each part.

For each of the examples:

  • Have a read of the task/student response
  • Identify the different misunderstandings demonstrated in the task/student response.
  • Try to prioritise the misunderstandings to decide on the ONE that is the most serious and needs addressing first.


The Coach of a soccer team ran a new training programme over the season. At the start of the season and at the end of the season the players in the soccer team had to complete different tests for their ball handling skills. One test was for how many times in a row each player could bounce a soccer ball on their head. You have been given the bounce data in the table below. Write a short report for the Coach of the soccer team about the effectiveness of their training programme.


Student report

Sorry Coach, but the new training programme did not improve each player’s ability to bounce balls on their head, as you can see in my graphs below. The median number of times in a row a ball is bounced on a head was 9 at the start of the season and 9 at the end of the season, so the players did not improve with this skill. The box for the end of the season is not shifted far enough to the right as the median of the end of season is not outside the box of the start of season. So you can't make a call that the numbers of times in a row a ball is bounced tends to be higher at the end of the season compared to the start of the season. There is no difference in how they performed in this ball handling skills test between the start of the season and the end of the season.


As part of a presentation last year, I tried to summarise what I think is important to consider when faced with anything requiring statistical thinking: What are our awesome messages?


I'm going to use these three principles for some of the discussion sections of this post:

  • It matters how much data you have and how you got that data
  • It matters what you are measuring and how you are measuring it
  • It matters that you are uncertain and that there is variation

When I initially published this post, I asked for teachers to submit anonymously what they thought was the biggest misunderstanding demonstrated by the example (the task or the student response). After a few days, I shared these comments unedited on this site so that teachers would be able to view and compare what had been written. This was motivated by a desire to demonstrate that we don't all see the same things in student writing or tasks and also to show the range of possible issues with the task or student response.

So what did teachers identify as the biggest misunderstandings?

  • That the nature of the data lends itself to a paired comparison, not a comparison of two independent groups
  • That the nature of the study was about an experiment and suggestive causality, not a sampling situation
  • That the student incorrectly applied sampling-to-population inference methods associated with box plots
  • That the design of the experiment was flawed as there was no random allocation or control groups used
  • That the words "no difference" were used rather than "I can't make a call" or "I can't tell"
  • That the student has made statements based on point estimates such as the medians without looking at the shape and variation of the data

These are all good points, and it is difficult to identify which one is the most serious or has the highest priority to address with a student. That there are a number of issues with what was written by the student highlights why it is so important that we always consider how we are building on and building up the key ideas that underpin statistical thinking. For the remainder of this discussion, I have given examples of the kinds of questions I would want to ask students and the kind of thinking I would like students to demonstrate when considering how to write their response to this task.

It matters how much data you have and how you got this data

Why does it matter that this data was collected from a single soccer team over one season? Why does it matter that the data was not obtained through a random sampling method? Why does it matter that the design is experimental in nature (there was an intervention) but had no control group for comparison and no controlling of related variables?

Desired student thinking: I can explore this data to uncover what it might suggest about the effectiveness of the training programme, but any suggestive inferences would be limited to this team only and would be weakened by the fact that the players may get better/worse over the season for other reasons. That is, I cannot say that the training programme was the only reason that the players improved/worsened with their ball handling skills. I also could not say that the training programme would work for other players in other soccer teams.

It matters what you are measuring and how you are measuring it

Why does it matter that only one of the tests for ball handling skills was used in the analysis? Why does it matter that each player was measured twice - once at the beginning of the season and again at the end of the season? Why does it matter that the response variable is a numerical variable?

Desired student thinking: I need to consider whether bouncing a ball on your head is a good/best measure of ball handling skills, and should really explore the data from the other tests to assess the performance of the players. I need to measure the change in performance for each player by taking the difference of their two test results. This is because the players would have different starting skills, and what I want to know is whether the training programme improved their performance from this starting point. Since the test data is numerical, I can use a dot plot and box plot to display the differences. [I could also use a link graph: two dot plots that show clearly how the test results are linked for each player between the before and after.]
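To make the paired idea concrete, here is a small sketch of taking the end-minus-start difference for each player. The player labels and bounce counts are made up for illustration and are not the data from the task's table.

```python
from statistics import median

# Made-up (start, end) bounce counts for illustration; not the task's data.
bounces = {
    "Player 1": (5, 8),
    "Player 2": (12, 11),
    "Player 3": (9, 13),
    "Player 4": (7, 7),
    "Player 5": (3, 6),
}


def paired_differences(results):
    """Return end-minus-start for each player, keyed by player."""
    return {player: end - start for player, (start, end) in results.items()}


differences = paired_differences(bounces)
median_difference = median(differences.values())
improved = sum(1 for d in differences.values() if d > 0)

print("Differences:", differences)
print("Median difference:", median_difference)
print("Players who improved:", improved, "of", len(differences))
```

The differences are the single numerical variable that would go on the dot plot and box plot. Comparing the two group medians instead throws away the link between each player's two results.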

It matters that you are uncertain and there is variation

Why does it matter that each player has a different ability? Why does it matter that you are using a summary measure like the mean or median? Why does it matter that you refer to statistics calculated from experimental data as estimates?

Desired student thinking: I need to think about the different sources of variation and how they could affect the data I am using. There is natural variation because each player has a different ability for the task, and when I use a summary statistic like a mean I am trying to capture an overall measure of ability, based on the average, for all the players. But a summary measure like the mean won't capture how different each player is from each other in terms of ability. I need to clearly communicate that I am uncertain and don't know the true value, and that is why I will use the word estimate.

Student report

My investigative question was "I wonder if boys who ran the Auckland kids marathon in 2015 are faster than the girls who ran the same event?" My graphs of the times for the boys and girls who ran the Auckland kids marathon in 2015 are shown below:


The shapes of the distribution of times to run the event are pretty similar, in that they are both positively skewed. The boxes are pretty similar in size as well. But anyway, the main thing is I can't make a call that the boys ran the Auckland kids marathon faster than the girls in 2015, because the boxes for both groups overlap, and the median time for the boys to run the event is inside the box (the middle 50% of times) for the girls to run the event. So the times are too similar - there's no difference between how fast the boys and girls were. Plus the three slowest times were the boys, including one who took ages to run the event!

So what did teachers identify as the biggest misunderstandings?

  • That you have been given population data, so you can identify who ran faster on average (boys), you don't need to make an inference from a sample to a population
  • That even if you considered this sample data, the sample size is huge and so you can make a call with a smaller shift between the two samples
  • That even if the two samples are similar, you can't say there is no difference, even if it is too close to make a call
  • That the student does not understand their investigative question, or that their investigative question is incorrect

It is important that we have a clear idea/understanding of whether we are working with sample data or population data, or in other words, whether we are wanting students to engage with inferential reasoning (going beyond the data in front of them) or exploratory data analysis (specifically describing the data in front of them).
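If the data is treated as the whole population of runners in that event, the comparison becomes a description rather than an inference. A minimal sketch of that kind of direct description follows; the finishing times are made up for illustration and are not the Auckland kids marathon results.

```python
from statistics import median

# Made-up finishing times in minutes, for illustration only.
times = {
    "boys": [9.5, 10.2, 11.0, 12.4, 15.8],
    "girls": [10.1, 10.9, 11.6, 12.0, 13.3],
}

# With population data we simply describe what is in front of us.
medians = {group: median(values) for group, values in times.items()}
lower_median_group = min(medians, key=medians.get)

print(medians)
print("Group with the lower median time:", lower_median_group)
```

No "making a call" rule is needed here, because nothing is being generalised beyond the runners in the data.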

At higher levels we tend to be a bit more flexible and loose with the idea of a sample but with younger students it is important to keep things simple and clear. If this set of data did represent a sample from a population, what would that population be? At secondary school level, this should be easy to define, not tricky or messy.

Reviewing student responses is a common feature of good PLD. The goal is to understand more about what misunderstandings students have, so we can be aware of these when teaching and check for them when using formative assessment. Coupled with this should be a review of what statistics education research has uncovered about the misunderstanding.

When I first posted these examples, I asked for teachers to anonymously submit their thoughts on what they thought was the BIGGEST misunderstanding. I was interested to see how similar the responses would be - would we all see the same big issue?

I think one of the cool things about doing an activity like this is realising that we can't assume that everyone sees the same thing we do when reading an assessment task or student work. Sometimes in group discussion you don’t get to hear these different perspectives because someone else says the “right” thing first.

So one way to use these activities as part of any PLD might be to use a similar approach. Ask teachers to choose what they think is the biggest problem and to submit this anonymously through an app like Socrative. Then share all the comments, compare them and discuss the similarities and differences in perspective.