What’s going on, what’s going on?

For many high school teachers here in New Zealand, the teaching year is over and it’s now a six-week summer break before school starts again next year. Despite the well-deserved break, some teachers are already thinking about ideas for next year. I’ve been amazed (and inspired) by the teachers who have signed up to spend a day with Liza and me on Friday 15th December to learn more about working with modern data (more details here). We are both really looking forward to the full-day workshop 🙂 One of the tools we’ll be working with at the workshop is the platform IFTTT (If This Then That). It’s basically a way to connect devices and online accounts using APIs (application programming interfaces) without using code.

I used IFTTT recently to collect data on New York Times articles. One reason I started collecting this data was their free online feature “What’s Going On in This Graph?”. On Tuesday, December 12 and every second Tuesday of the month through the US school year, The New York Times Learning Network, in partnership with the American Statistical Association, hosts a live online discussion about a timely graph like the one shown below.

One of the super interesting graphs featured

Students from around the world “read” the graph by posting comments about what they notice and wonder in an online forum. Teachers live-moderate by responding to the comments in real time and encouraging students to go deeper. All releases are archived so that teachers can use previous graphs anytime (read this introductory post to learn more). I used “What’s Going On in This Graph?” when I was teaching our Lies, Damned lies and Statistics course, and it is such an awesome resource for helping build statistical literacy and thinking.

So, inspired by the New York Times graphs, about two months ago I created an “applet” on IFTTT that creates a new row in a Google spreadsheet every time a new article is posted to the New York Times website. It stopped working for some reason at the end of November – check out the “raw” data here: https://docs.google.com/spreadsheets/d/1PXGh0xBrJbmrfWq3nRylH5GBqzVd4SYWWiXQj3v9tdQ/edit?usp=sharing 

So what’s going on with the data I collected? Your first thought on viewing the data might be – huh? You call this data? The only variable that is “graph ready” is which section each of the nearly 6000 articles was published in. But there are so many variables in data sets just like this one waiting to be defined and explored. After our workshop on Friday, I’ll post an “after” version of this same data set 🙂
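To give a feel for what “defining” a variable could look like, here is a minimal Python sketch. The rows, the column names (`title`, `section`) and the headlines are all invented for illustration and may not match the real spreadsheet; the two new variables are just examples of what could be derived from a title.

```python
from collections import Counter

# Invented example rows: the real spreadsheet's columns and values may differ.
rows = [
    {"title": "Invented headline about markets", "section": "Business"},
    {"title": "Is this an invented question?", "section": "Learning"},
    {"title": "Another invented headline?", "section": "Learning"},
]


def add_variables(row):
    """Return a copy of the row with two new variables defined from the title."""
    title = row["title"]
    return {
        **row,
        "title_word_count": len(title.split()),
        "title_is_question": title.strip().endswith("?"),
    }


articles = [add_variables(r) for r in rows]

# The one "graph ready" variable: how many articles per section.
section_counts = Counter(a["section"] for a in articles)

# A newly defined variable, summarised: how many titles are questions.
question_count = sum(a["title_is_question"] for a in articles)
```

Once a variable like `title_is_question` exists, it can be counted, graphed, or compared across sections in the same way as the section variable.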

Developing learning and formative assessment tasks for evaluating statistically-based reports

This post provides the notes and resources for a workshop I ran for the Auckland Mathematical Association (AMA) on developing learning and formative assessment tasks for evaluating statistically-based reports (specifically AS91584).

Notes for workshop

The starter task for this workshop was based around a marketing leaflet I received in my letterbox for a local school back in 2014. I was instantly skeptical about the claims being made by the school and went straight to sources of public data to check the claims. As was often the case, this personal experience turned into an activity I used with my Scholarship Statistics students to help them develop their critical evaluation skills. The task, public data I used, and my attempt at answers (from my past self in 2014) are provided at the bottom of this post. My overall conclusion was that most of the claims check out until around 2011, but not so much for 2012 – 2013, leading me to speculate that the school had not updated their marketing leaflet. The starter task is all about claims and data, and not so much about statistical processes, study design, or inferential reasoning – all of which are required for students to engage with the evaluation of statistically-based reports. However, I used this task to set the focus of the workshop, which was to focus on the claims that are being made, and whether they can be supported or not, and why.

The questions used for the external assessment tasks for AS91584 (available here) are designed to help scaffold students to critique the report in terms of the claims, statements or conclusions made within the report. Students need to draw on what has been described in the report and relevant contextual and statistical knowledge to write concise and clear discussion points that show statistical insight and answer the questions posed. This is hard for students. Students find it easy to write very creative, verbose and vague responses, but harder to write responses that are not based only on speculation or that are not rote learned. We see this difficulty with internally assessed tasks as well, so it’s not that surprising that students struggle to write concise, clear, and statistically insightful discussion points under exam pressure.

Teachers who I have spoken to who have taught this standard (which includes me) really enjoy teaching statistical reports to students. In reflections and conversations with teachers on how we could further improve the awesome teaching of statistical reports, a few ideas or suggestions emerged:

  • Perhaps we focus our teaching too much on content, keeping aspects such as margins of error and confidence intervals, observational studies vs experiments, and non-sampling errors too separate?
  • Perhaps we focus too much on “good answers” to questions about statistical reports, rather than “good questions” to ask of statistical reports?

Great ideas for teaching statistical reports can be sourced from Census at School NZ or from conversations with “statistical friends” (see the slides for more details). These include ideas such as: experiencing the study design first and then critiquing a statistical report that used a similar design, using matching cards to build confidence with different ideas, keeping a focus on the statistical inquiry cycle, teaching statistical reports through the whole year rather than in one block, and teaching statistical reports alongside other topics such as time series, bivariate analysis, and bootstrapping confidence intervals. I quite like the idea of the “seven deadly sins” of statistical reports, but didn’t quite have enough time to develop what these could be before the workshop – feel free to let me know if you come up with a good set! [Update: Maybe these work or could be modified?]

When I taught statistical reports in 2013 (the first year of the new achievement standard/exam), I was gutted when I got my students’ results back at the start of 2014. I reflected on my teaching and preparation of students for the exam and realised I had been too casual about teaching students how to respond to questions. In particular, I had expected my “good” students would gain excellence (the highest grade – showing statistical insight) because they had gained excellences for the internally-assessed standards or were strong contenders to get a Scholarship in Statistics. So, a bit later in 2014, when the assessment schedules came out, I looked carefully at what had been written as expected responses. To me, it seemed that a good discussion point had to address three questions: What? Why? How? Depending on the question being asked, the whats, whys and hows were a bit different, but at the time (only having one exam and schedule to go with!) it seemed to make sense. At least, in my teaching that year with students, I felt that using this simple structure allowed me to teach and mark discussion points more confidently. You can see more details for this “discussion point” structure in the slides.

The last part of the workshop involved providing teachers with one of three statistical reports (all around the theme of coffee of course!) and asking them, in groups, to develop a formative assessment task. After identifying one or two key claims made in the report, they had to select three or four questions from previous years’ exams that would be relevant for questioning the report in front of them (relevant to the conclusions made in the report). We didn’t quite get this finished in the workshop – the goal was to create three formative assessment tasks that could be shared! However, perhaps some of the teachers who attended the workshop will go on to develop formative assessment tasks and email these to me to share at a later date. I do feel strongly that all teachers of statistics should feel confident to write their own formative or practice assessment tasks for whatever they are teaching – if you’re not sure about what understanding you are trying to assess and what questions to ask to assess that understanding, how do you feel confident with what to teach? I’m hoping to launch a project next term to help support statistics teachers to feel more confident with writing formative assessment tasks, so watch this space 🙂

Resources for workshop (via Google Drive)

Mind the stats?

Have you noticed how Google sometimes gives the top page in your search results a little summary box? For example, if you Google “how to plan a honeymoon”, you get this:

trains6

Since I didn’t do number two on this list, my job for tonight was to check out trains for our travel in the UK leg of our honeymoon. After my first Google search, I got a little distracted and consequently typed up this short post 🙂 I realised part way through that “mind the gap” is more of a London Underground thing than a UK train travel thing, but it’s late so hopefully the reference still makes sense.

My first (and only) search tonight was for a train from London to Cambridge. Before even clicking through to the website listed, I got to read this little “statistical report” 🙂

trains1

The first two sentences got me questioning what “fastest journey time” means, since how can the “average journey time” be lower than the shortest journey time? The third sentence made me shake my head at the misuse of our special stats word “average” and I automatically re-worded that sentence in my head to “on weekdays there are, on average, 96 trains per day…..”

So not only because I actually needed to find out about trains from London to Cambridge, but also because I was curious to find out what “fastest journey time” means, I clicked through to https://www.thetrainline.com/train-times/london-kings-cross-to-cambridge-station

When you scroll down to the bottom you get this nice table:

trains2

This gives some immediate answers to my confusion about the Google search summary – I think. “Slowest route” actually means the minimum time, and “Fastest route” means the maximum time. At least now the average journey time of one hour sits between these two numbers, but did you notice when you scrolled down the page that there were some routes listed with times greater than 63 minutes, the supposed “fastest route”?

Me too, so I went through all routes for the next 24 hours (starting from 8:44am London time) and listed their times:

trains3

There’s bound to be a few mistakes in there when I was converting from hours to minutes 🙂 But to finish this short critique, let’s look at the data:

trains4

For this particular 24-hour period (from Monday 21st November 8:44am) there were 76 trains from London to Cambridge, with a mean journey time of around 64 minutes (based on the advertised times). If I wanted to check out the claims about the average number of trains per weekday and the average journey time, I’d need a better sampling method and more “weekdays” of data. But this sample does offer evidence to contradict the claims about “slowest” and “fastest” journey times.
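The hours-to-minutes conversion and the summary could also be done in a few lines of Python, which avoids slips in the arithmetic. This is only a sketch: the `advertised` list is made up and is not the 76 journeys recorded above, and the "1h 12m" text format is an assumption about how times are written. The 63-minute figure is the “fastest route” value from the table.

```python
import re
from statistics import mean


def to_minutes(text):
    """Convert a journey time such as '1h 12m' or '48m' to whole minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    if not match or not any(match.groups()):
        raise ValueError(f"Unrecognised journey time: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


# Made-up advertised times, for illustration only.
advertised = ["48m", "1h 3m", "1h 24m", "52m", "1h 31m"]
minutes = [to_minutes(t) for t in advertised]

summary = {
    "count": len(minutes),
    "shortest": min(minutes),
    "longest": max(minutes),
    "mean": round(mean(minutes), 1),
}

# Journeys longer than the 63 minutes given for the "fastest route".
claimed_maximum = 63
over_claim = [m for m in minutes if m > claimed_maximum]
```

Any value in `over_claim` is a journey that contradicts the table, which is the same check made by eye above.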

Unless those terms still don’t mean what I think they mean, even when I reverse them 🙂

When is a statistical report statistical enough?

We are super lucky in NZ to have something as important as statistical literacy not just written into our curriculum at all levels, but also formally assessed as part of our national qualifications system (NCEA). All of our students have to deal with messages given to them that are based on data, whether it be in the news, online via their Facebook feed, or on TV through advertising (just to list a few examples), so even if we don’t actually assess AS91266 Evaluate a statistically based report or AS91584 Evaluate statistically based reports within a learning programme, it’s still important to include good examples of statistically based reports in our teaching of statistics.

This post will focus on AS91266 Evaluate a statistically based report, as this standard provides a really great opportunity to weave together many of the statistics achievement objectives at curriculum levels six and seven.  This focus will also give me an opportunity to provide some guidance on how to select good examples of statistically based reports that are strong enough statistically to be used for the assessment of this standard and also accessible to the students.

So what do you need to look for in a report?

The following questions are a good place to start when considering the suitability of a statistically based report.

  • Is the purpose of the report clear?
  • Is the report based on a survey?
  • Are the findings of the survey clear?

If you get a yes to all three of these questions, you might be heading in the right direction. For example, this article on Stuff regarding whether eating carbs helps you lose weight is kind of interesting (it even has a video), has lots of contextual information around diets, has a clear purpose, but it’s not a survey and there’s only one result reported (a weight loss of 10kg). In comparison, this article also on Stuff regarding whether children eating carbs can explain rising obesity levels is more on track.

Then the next round of questions to ask are:

  • Are the population measures and variables clear?
  • Are the sampling methods clear?
  • Are the survey methods clear?
  • Is the sample size clear?

If you also get a yes to most of these questions, then the report is probably statistical enough for students to use and will allow them to discuss the statistical methods and measures used and sampling and non-sampling errors (one of the requirements for this standard). According to these questions, the second article given above – whether children eating carbs can explain rising obesity levels – is not yet statistical enough (there are no details in the article about sampling or survey methods), so…..

How do we find a report that is statistical enough?

NZ media sites like nzherald.co.nz and stuff.co.nz, and the super awesome StatsChat statschat.org.nz, are good starting points to find reports (tip – search for key words like “survey”, “study”, “causes” etc.), but just like with the articles above, while the articles may be interesting, engaging and relevant for teenagers, there is still some work to do before they can be used in a meaningful way in the classroom. What follows is an example of making a report statistical enough 🙂

I found this article entitled “The unhealthy reason men go to the gym” on Stuff on October 18th [download pdf]. Read it carefully and see how many of those questions above you can tick yes to. The article itself is a good starting point, but on reading the article you’ll see that it takes a few findings from one study and one survey and blends them together to make the article. Also, none of the last four questions are met, and you could argue the first three are met insufficiently. This report needs more details to make it statistical enough. [Note: At this stage I am motivated enough to continue this process because I like the context and I believe it will resonate with students – this is not always the case!]

The article gives links to both of the original documents. The document for the recent study is only available if you pay for it (or have access through your library service like I do at the University) and is a very technical report that is beyond the reach of our students. The full report from the 2010 survey runs to 139 pages and is accessible to our students [download pdf]. However, since the link to this survey is broken, I had to Google it.

From here there are two options:

  1. Trim the original newspaper article to report only on the 2010 survey and include sufficient details from the original report so that all those questions posed above can be answered (so adapt the article, the approach that seems to be used for the AS91584 exams).
  2. Trim the 139 page report but retain enough information in the report so that all those questions posed above can be answered. I reckon about four pages of A4 is a good target, but it will depend on how the report is formatted, how many graphs are included etc.

I think I prefer the latter, although I have done both in my teaching. The key point here is that students are not expected to go out and find the original article themselves to complete their evaluation of the report.

How do we make a report that is statistical enough?

The 139 page report from Mission Australia is pitched at the right level for curriculum level seven, but is far too long for a class of Year 12 students. Most reports from surveys like this have an introduction and an executive summary. These two sections often provide nearly enough for the report to be used in the classroom. I’ve put these sections into a shorter pdf document, which is well set out and easy to read [download pdf].

In deciding whether the report is statistical enough, we need to consider those questions again:

  • Is the purpose of the report clear?
  • Is the report based on a survey?
  • Are the findings of the survey clear?
  • Are the population measures and variables clear?
  • Are the sampling methods clear?
  • Are the survey methods clear?
  • Is the sample size clear?
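For anyone who likes a checklist they can run, the two rounds of questions could be sketched in Python as below. The function name, the `second_round_needed=3` threshold (one possible reading of “most of these questions”) and the example answers are all assumptions for illustration, not part of the standard.

```python
FIRST_ROUND = [
    "Is the purpose of the report clear?",
    "Is the report based on a survey?",
    "Are the findings of the survey clear?",
]
SECOND_ROUND = [
    "Are the population measures and variables clear?",
    "Are the sampling methods clear?",
    "Are the survey methods clear?",
    "Is the sample size clear?",
]


def statistical_enough(answers, second_round_needed=3):
    """Apply both rounds of questions to a dict of question -> True/False.

    Every first-round question must be a yes. second_round_needed is an
    assumed threshold for "most" of the second-round questions.
    Unanswered questions are treated as a no.
    """
    if not all(answers.get(q, False) for q in FIRST_ROUND):
        return False
    yes_count = sum(answers.get(q, False) for q in SECOND_ROUND)
    return yes_count >= second_round_needed


# Hypothetical answers for a report with no sampling or survey details.
example_answers = {q: True for q in FIRST_ROUND}
example_answers["Is the sample size clear?"] = True

result = statistical_enough(example_answers)
```

With these example answers only one of the four second-round questions is a yes, so `result` is `False` and the report would need more detail added before use.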

The foreword gives a straightforward summary with a reasonable hint of the purpose of the report. The opening of the introduction expands on the purpose, and the next three sections of the introduction (participation, areas of focus and methodology) provide sufficient information on sampling and survey methods. The demographics section of the executive summary provides some population measures, variables and the sample size, and the remaining sections provide the findings of the report. However, it would make it easier for students to evaluate if each section contained a little more detail and the information was provided in tables and/or graphs. Half an hour editing the pdf and cutting and pasting some text and tables from the full document creates a report that is statistical enough for a good quality learning or assessment activity – YAY 🙂 [download pdf]

What about contextual knowledge?

Students should also be provided with enough contextual knowledge associated with the report so that they can integrate this within their evaluation. For the example discussed, this could include the original article from Stuff along with the edited version of the full report, making it clear to students they are evaluating that edited version of the survey report not the Stuff article. It would also be a great idea for the class to discuss the topic or issue being explored by the study/survey, and compare and contrast their own experiences. This discussion would help them to consider whether the results reported are meaningful, useful and/or relevant.

Want to read more about statistical literacy?

If you can get yourself a copy of “Seeing through Statistics” by Jessica Utts (the eBook version is available for around $NZD60), you will not be disappointed. This was the first book I read to help get my head around how to teach statistical literacy, and it is full of really great guiding principles, examples and explanations. Like I said at the beginning of the post, even if you aren’t teaching AS91266 or AS91584, it is our job to help our students to make sense of the world around them – a world where so many people are using data and statistics in so many, not always awesome, ways.