Which one doesn’t belong …. for stats?

If you haven’t heard of the activity Which one doesn’t belong? (WODB), it involves showing students four “things” and asking them to describe/argue which one doesn’t belong. There are heaps of examples of Which one doesn’t belong? in action for math(s) on the web, Twitter, and even in a book. From what I’ve seen, for math(s) I think the activity is pretty cool. In terms of whether WODB works for stats, however, I’m not so sure. Perhaps for definitions, facts, static pieces of knowledge it could work (?), but in terms of making comparisons involving data and its various representations (including graphs/displays), I need more convincing. There’s something different between comparing properties of shapes (for example), which remain fixed, and comparing data about something/someone, which could vary.

For example, What cat doesn’t belong? for the four “stats cats” data cards shown below.

To make comparisons between the four cats means to reason with data, but if I am considering only the data provided in these four data cards then these comparisons are made without uncertainty. For example, I can say definitively, for these four cats, that:

• Elliot is the only cat with a name that has three syllables,
• Molly is the only female cat,
• Joey is the only cat that is both an inside and outside cat,
• Classic is the only cat that uses a cat door.

I could argue many different cases for which cat (or photo) does not belong. This is all cool, but doesn’t feel like statistics to me. Statistics is all about using data to make decisions in the face of uncertainty, by appreciating different sources of variation and considering how to deal with these. In particular, inferential reasoning involves going beyond the data at hand, thinking about generalisability, considering the quality and quantity of data available, and appreciating/communicating the possibility of being wrong no matter how “right” the methodology.

So while I appreciate that WODB allows for “not just one correct answer” and the development of argumentation skills, I’d be happier if this kind of activity within statistics teaching led to the posing of statistical investigative questions (SIQ): WODB->SIQ. Why? We need more data and more of an idea of where the data came from to really answer the really interesting questions that comparing these four cats might provoke us to consider. We need students to feel the uncertainty that comes from thinking and reasoning statistically and to help students find ways to deal with this uncertainty. We also need students to care about the questions being asked of the data – my worry here is that otherwise the question students might ask when using WODB is Who cares which one doesn’t belong? 🙂

The questions that interest me when I look at these stats cats data cards are: I wonder … How many syllables do cats’ names have? Do most cats have two syllable names? Is Elliot (my cat!) an unusual name for this reason? Do I spend too much on cat food (\$NZD30 per week)? Or maybe black cats are more expensive to feed? I won’t be able to get definitive answers to these questions, but by collecting more data and investigating these questions using statistical methods I can get a better understanding of what could be plausible answers.

PS Want some of these data cards? Head here –> It’s raining cats and dogs (hopefully)

Probability teaching ideas using simulation

This post provides some teaching examples for using an online probability simulation tool. It’s a supplement to the workshop I offered for the NZAMT 2015 conference.

Probability simulation tool

I recently developed a very basic online probability simulation tool. I wanted a simulation tool that would run online without using applets or Flash (tablet compatible). I also wanted to be able to animate repeated simulations in a loop – in the past to get this effect, I had to either make animated GIFs or set up slides in PowerPoint to transition automatically. I did a quick search for online simulation tools and couldn’t find what I wanted, so I adapted some code I had written previously.

An example of an animated looped simulation from the probability simulation tool

It’s very much designed “fit for a specific purpose” (more about that in part 2) so I know it has lots of limitations 🙂 But what I like about the feature being demonstrated above is that it will keep running automatically, freeing me up to ask the students questions about what they are seeing and why they are seeing this.

Small samples – lots of variation

One of the activities I presented in the workshop involved teachers trying to work out who my siblings were based on photos. I presented five sets of four photos. Within each set, one photo was of one of my siblings, the rest were photos of other non-related people. In the workshop there were around 30 teachers present. The basic idea (with lots of assumptions) is that the distribution for the number of correct selections IF teachers were guessing can be modelled by a binomial distribution with n = 5 and p = 0.25.
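As a rough sketch of what that guessing model implies, the binomial probabilities for 0 to 5 correct selections can be calculated directly. The function name `binomial_pmf` is just an illustrative choice, and the code assumes independent guesses with the same chance of success for every set of photos.

```python
from math import comb

def binomial_pmf(k, n=5, p=0.25):
    """Probability of exactly k correct selections out of n sets when guessing."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of each possible number of correct siblings (0 to 5)
pmf = [binomial_pmf(k) for k in range(6)]

for k, prob in enumerate(pmf):
    # Expected count assumes 30 guessers, as in the workshop
    print(f"{k} correct: probability {prob:.4f}, expected count out of 30: {30 * prob:.1f}")
```

Under this model most guessers get 0, 1 or 2 correct, and getting all five correct by guessing is very unlikely.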

After “marking” the teachers’ selections of my siblings, I created a dot plot of the 30 individual results. One of the questions put to the teachers at the workshop was “Do these results look like what we’d see if each of you was guessing which person was my sibling?”

To build up a simulated distribution based on guessing, each teacher then used five different hands-on simulations to make new sibling selections for each set of photos (see the resources link at the end of this post). I then created another dot plot from these simulated selections and asked teachers to compare the features of the two plots e.g. centre, spread, shape, unusual.

For this workshop, the two distributions actually came out to look pretty similar. But this won’t necessarily happen. To demonstrate the amount of variation between repeated simulations (of 30 students guessing across five sets of possible siblings), I set up the probability simulation tool with the options shown in the screen grab below:

So that the axis does not resize for each simulation, I fixed the axis between 0 and 5. To stop the dots from automatically resizing, I fixed the dot size to the smallest option. I then pressed “Start animation” and let the simulations run over and over again. This gives the following animation:

This animation could then be used to ask questions like:

• “What would be an unlikely number of correct siblings if someone was guessing?”
• “How many correct siblings would you expect to see if someone was guessing – between where and where?”
• “What looks similar for each animation?” – “What looks different?”
• “What variation are we seeing?” – “Why are we seeing it?”
• “What does one dot on the graph represent?”
• “How is the simulated data being generated?”
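For anyone wanting to explore the last question away from the online tool, here is a minimal sketch of the kind of repeated simulation the animation shows. It is not the tool’s actual code: the function names `simulate_class` and `text_dot_plot` are made up for this example, and it assumes 30 guessers, five sets of photos and a 0.25 chance of a correct guess for each set.

```python
import random
from collections import Counter

def simulate_class(num_guessers=30, num_sets=5, p_correct=0.25, rng=random):
    """Simulate the number of correct selections for each guesser."""
    return [
        sum(rng.random() < p_correct for _ in range(num_sets))
        for _ in range(num_guessers)
    ]

def text_dot_plot(results, num_sets=5):
    """Draw a sideways dot plot with a fixed axis from 0 to num_sets."""
    counts = Counter(results)
    return "\n".join(f"{k}: " + "o" * counts[k] for k in range(num_sets + 1))

# Seeded generator so the same three simulations appear each time this is run
rng = random.Random(2015)
for run in range(1, 4):
    print(f"Simulation {run}")
    print(text_dot_plot(simulate_class(rng=rng)))
    print()
```

Each “o” is one guesser’s number of correct siblings, so every plot has 30 dots, and comparing the plots from run to run gives a feel for how much small samples can vary.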

Wild, C. Animations of sampling variation

Wild, C. VIT – Visual inference tools

NZ Senior secondary guide – Lateness: Choice or chance

Resources

Workshop materials – stimulating simulations NZAMT 2015

Online probability simulation tool

Statistics lesson starter: Is this really surprising?

A supermarket is running a promotion. For every \$20 you spend, you will receive one domino. There are 50 dominoes to collect. I received 10 dominoes for my last shop and was surprised to find that all 10 dominoes were different. Should I have been surprised? Explain 🙂
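One way to start checking an explanation is to calculate the chance that 10 dominoes are all different. This sketch assumes each of the 50 dominoes is equally likely to be handed out and that each domino received is independent of the others, which may not be how the promotion actually works. The function name `prob_all_different` is just for illustration.

```python
from math import prod

def prob_all_different(num_collected=10, num_types=50):
    """Chance that every domino collected is different from the ones before it."""
    # The (i + 1)th domino must avoid the i different dominoes already held
    return prod((num_types - i) / num_types for i in range(num_collected))

p = prob_all_different()
print(f"P(all 10 dominoes different) = {p:.2f}")
```

Under these assumptions the probability works out to about 0.38, so 10 different dominoes would happen in more than one in three shops of this size.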

Update after some more shopping…