## A simple app that only does three things

Here’s a scenario. You buy a jumbo bag of marshmallows that contains a mix of pink and white colours. Of the 120 in the bag, 51 are pink, which makes you unhappy because you prefer the taste of pink marshmallows.

Time to write a letter of complaint to the company manufacturing the marshmallows?

The thing we work so hard to get our statistics students to believe is that there’s this crazy little thing called chance, and it’s something we’d like them to consider for situations where random sampling (or something like that) is involved.

For example, let’s assume the manufacturing process overall puts equal proportions of pink and white marshmallows in each jumbo bag. This is not a perfect process and there will be variation, so we wouldn’t expect exactly half pink and half white for any one jumbo bag. But how much variation could we expect? We could get students to flip coins, with each flip representing a marshmallow, heads representing white and tails representing pink. We can then collate the results for 120 marshmallows/flips – maybe the first time we get 55 pink – and discuss the need to repeat this process to build up a collection of results. Often we move to a computer-based tool to get more results, faster. Then we compare what we observed – 51 pink – to what we have simulated.
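The computer-based step can be sketched in a few lines of Python. This is only an illustration of the coin-flipping idea, not the code behind any particular tool: the function name, the number of simulated bags and the seed are all assumptions chosen for the example.

```python
import random

def simulate_pink_counts(n_marshmallows=120, p_pink=0.5, n_bags=10_000, seed=1):
    """Simulate n_bags bags; each marshmallow is pink with probability p_pink.

    Returns a list with the number of pink marshmallows in each simulated bag.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_pink for _ in range(n_marshmallows))
        for _ in range(n_bags)
    ]

counts = simulate_pink_counts()
observed = 51  # pink marshmallows in the bag from the scenario

# How often does the 50% model produce a bag with 51 or fewer pink?
share_as_low = sum(c <= observed for c in counts) / len(counts)
print(f"Smallest simulated count: {min(counts)}, largest: {max(counts)}")
print(f"Share of simulated bags with {observed} or fewer pink: {share_as_low:.3f}")
```

If that share is not tiny, 51 pink out of 120 is the kind of result the 50% model produces reasonably often by chance alone.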

I use these kinds of activities with my students, but I wanted something more, so I made a very simple app earlier this year. You can find it here: learning.statistics-is-awesome.org/threethings/. You can only do three things with it (in terms of user interactions) but in terms of learning, you can do way more than three things. Have a play!

In particular, you can show that models other than 50% (for the proportion of pink marshmallows) can also generate data (simulated proportions) consistent with the observed proportion. So, not being able to reject the model used for the test (50% pink) doesn’t mean the 50% model is the one true thing. There are others. Like I told my class – just because my husband and I are compatible (and I didn’t reject him), doesn’t mean I couldn’t find another husband similarly compatible.
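The same point can be checked with an exact binomial calculation. The sketch below assumes that "consistent" means the observed count of 51 out of 120 does not fall in either 2.5% tail of the model's distribution. That cutoff, and all the function names, are assumptions made for this illustration and are not taken from the app.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tail_probs(observed, n, p):
    """Return (P(X <= observed), P(X >= observed)) under a binomial(n, p) model."""
    lower = sum(binom_pmf(k, n, p) for k in range(0, observed + 1))
    upper = sum(binom_pmf(k, n, p) for k in range(observed, n + 1))
    return lower, upper

def compatible_models(observed=51, n=120, cutoff=0.025):
    """Whole-percentage models for which the observed count is in neither tail.

    The cutoff of 0.025 per tail is an illustrative assumption.
    """
    models = []
    for pct in range(1, 100):
        lower, upper = tail_probs(observed, n, pct / 100)
        if lower > cutoff and upper > cutoff:
            models.append(pct)
    return models

models = compatible_models()
print(f"Models from {models[0]}% to {models[-1]}% pink are compatible with 51 out of 120")
```

Under these assumptions the 50% model is on the list, but so is a whole run of neighbouring percentages, which is the "there are others" message.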

Note: The app is in terms of percentages, because that aligns to our approach with NZ high school students when using and interpreting survey/poll results. However, I first use counts for any introductory activities before moving to percentages, as demonstrated with this marshmallow example. The app rounds percentages to the closest 1% to keep the focus on key concepts rather than focusing on (misleading) notions of precision. I didn’t design it to be a tool for conducting formal tests or constructing confidence intervals, more to support the reasoning that goes with those approaches.

## A stats cat in a square?

On Twitter a couple of days ago, I saw a tweet suggesting that if you mark out a square on your floor, your cat will sit in it.

Since I happen to have a floor, a cat, and tape, I thought I’d give it a go. You can see the result at the top of this post. Amazing, right?

Well, no, not really. I marked out the square two days ago, and our cat Elliot only sat in the square today.

Given that:

• our cat often sits on the floor
• our cat often sits on different parts of said floor
• that we have a limited amount of floor
• I marked out the square in an area that he likes to sit
• that we were paying attention to where on the floor our cat sat

… and a whole lot of other conditions, it actually isn’t as amazing as Twitter thinks. Also, my hunch is that people who do witness their cat sitting in the square post this on Twitter more often than those who give up waiting for the cat to sit in the square.

Below is a little simulation based on our floor size and the square size we used, taking into account our cat’s disposition for lying down in places. It’s just a bit of fun, but the point is that with random moving and stopping within a fixed area, if you watch long enough the cat will sit in the square.
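For readers who want to try the idea in code, here is a rough Python sketch of random moving and stopping within a fixed area. It is not the simulation embedded in this post: the floor dimensions, the square's position and size, the step size and the chance of stopping are all made-up values, and the cat is treated as a single point.

```python
import random
from statistics import median

def stops_until_in_square(floor_w=4.0, floor_h=3.0, square=(1.5, 1.0, 0.5),
                          step=0.3, p_stop=0.1, max_steps=200_000, seed=None):
    """Random-walk a point 'cat' around the floor and count its stops.

    square is (x, y, side) for the taped square. Returns the number of stops
    made up to and including the first stop inside the square, or None if
    that never happens within max_steps. All defaults are hypothetical.
    """
    rng = random.Random(seed)
    sx, sy, side = square
    x, y = rng.uniform(0, floor_w), rng.uniform(0, floor_h)
    stops = 0
    for _ in range(max_steps):
        # Take a small random step, staying on the floor.
        x = min(max(x + rng.uniform(-step, step), 0.0), floor_w)
        y = min(max(y + rng.uniform(-step, step), 0.0), floor_h)
        # Occasionally the cat lies down where it is.
        if rng.random() < p_stop:
            stops += 1
            if sx <= x <= sx + side and sy <= y <= sy + side:
                return stops
    return None

results = [stops_until_in_square(seed=s) for s in range(200)]
print(f"Cats that eventually sat in the square: {sum(r is not None for r in results)} of {len(results)}")
print(f"Median number of stops needed: {median(r for r in results if r is not None)}")
```

Changing the square size or the floor size changes how long you have to wait, but not the basic conclusion.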

PS The cat image is by Lucie Parker. And yes, the cat only has to be partially in the square when it stops, but I figured that was close enough.

## Probability teaching ideas using simulation

This post provides some teaching examples for using an online probability simulation tool. It’s a supplement to the workshop I offered for the NZAMT 2015 conference.

Probability simulation tool

I recently developed a very basic online probability simulation tool. I wanted a simulation tool that would run online without using applets or Flash (tablet compatible). I also wanted to be able to animate repeated simulations in a loop – in the past, to get this effect, I had to either make animated GIFs or set up slides in PowerPoint to transition automatically. I did a quick search for online simulation tools and couldn’t find what I wanted, so I adapted some code I had written previously.

An example of an animated looped simulation from the probability simulation tool

It’s very much designed “fit for a specific purpose” (more about that in part 2) so I know it has lots of limitations. But what I like about the feature being demonstrated above is that it will keep running automatically, freeing me up to ask the students questions about what they are seeing and why they are seeing this.

Small samples – lots of variation

One of the activities I presented in the workshop involved teachers trying to work out who my siblings were based on photos. I presented five sets of four photos. Within each set, one photo was of one of my siblings and the rest were photos of unrelated people. There were around 30 teachers present in the workshop. The basic idea (with lots of assumptions) is that the distribution for the number of correct selections IF teachers were guessing can be modelled by a binomial distribution with n = 5 and p = 0.25.
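To make the guessing model concrete, the binomial probabilities for n = 5 and p = 0.25 can be calculated directly. This is a minimal Python sketch; the function name is just for illustration, and the "expected out of 30" column assumes exactly 30 teachers.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k correct selections out of n when guessing."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_sets, p_guess, n_teachers = 5, 0.25, 30
pmf = [binom_pmf(k, n_sets, p_guess) for k in range(n_sets + 1)]

print("correct  probability  expected out of 30")
for k, prob in enumerate(pmf):
    print(f"{k:>7}  {prob:>11.4f}  {n_teachers * prob:>18.1f}")

print(f"Mean number correct when guessing: {n_sets * p_guess}")
```

Under this model about 24% of guessers get none correct, about 40% get exactly one correct, and getting all five correct happens roughly once in a thousand attempts.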

After “marking” the teachers’ selections of my siblings, I created a dot plot of the 30 individual results. One of the questions put to the teachers at the workshop was “Do these results look like what we’d see if each of you was guessing which person was my sibling?”

To build up a simulated distribution based on guessing, each teacher then used five different hands-on simulations to make new sibling selections for each set of photos (see the resources link at the end of this post). I then created another dot plot from these simulated selections and asked teachers to compare the features of the two plots e.g. centre, spread, shape, anything unusual.

For this workshop, the two distributions actually came out looking pretty similar. But this won’t necessarily happen. To demonstrate the amount of variation between repeated simulations (of 30 students guessing across five sets of possible siblings), I set up the probability simulation tool with the options shown in the screen grab below:

So that the axis does not resize for each simulation, I fixed the axis between 0 and 5. To stop the dots from automatically resizing, I fixed the dot size to the smallest option. I then pressed “Start animation” and let the simulations run over and over again. This gives the following animation:

This animation could then be used to ask questions like:

• “What would be an unlikely number of correct siblings if someone was guessing?”
• “How many correct siblings would you expect to see if someone was guessing – between where and where?”
• “What looks similar for each animation?” – “What looks different?”
• “What variation are we seeing?” – “Why are we seeing it?”
• “What does one dot on the graph represent?”
• “How is the simulated data being generated?”
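A rough text-only version of the repeated simulation can be sketched in Python, for anyone wanting to see how the simulated data might be generated. This is not the code behind the online tool: the function names and the seed are assumptions, and each "o" stands for one simulated person guessing across the five sets of photos.

```python
import random

def simulate_class(n_teachers=30, n_sets=5, p_correct=0.25, rng=None):
    """Simulate one class of guessers.

    Returns a list of tallies where tallies[k] is the number of people who
    guessed exactly k siblings correctly.
    """
    rng = rng or random.Random()
    tallies = [0] * (n_sets + 1)
    for _ in range(n_teachers):
        n_correct = sum(rng.random() < p_correct for _ in range(n_sets))
        tallies[n_correct] += 1
    return tallies

def text_dot_plot(tallies):
    """Draw the tallies as rows of dots, one row per number correct."""
    return "\n".join(f"{k} | " + "o" * count for k, count in enumerate(tallies))

rng = random.Random(2015)  # fixed seed so the example is repeatable
runs = [simulate_class(rng=rng) for _ in range(3)]
for i, tallies in enumerate(runs, start=1):
    print(f"Simulation {i}")
    print(text_dot_plot(tallies))
    print()
```

Running it a few times with different seeds gives a feel for how much the plots change from one simulated class of 30 to the next.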

Wild, C. Animations of sampling variation

Wild, C. VIT – Visual inference tools

NZ Senior secondary guide – Lateness: Choice or chance

Resources

Workshop materials – stimulating simulations NZAMT 2015

Online probability simulation tool

## Statistics teaching ideas based on ….. the alphabet!

This post focuses on randomness, simulations and probability.

10 quick ideas……

1. Choose five letters (e.g. A, B, D, N, U) and display these together. For the rest of these ideas to work, choose letters that can go together to make three letter words (avoid certain words!). Ask students to randomly select one of the letters and write this down.
2. Ask students to share honestly how they selected their letter – you should find they do use a reason e.g. the first letter of their name, or they choose the one they think no one else will select. Discuss the difference between selecting something and randomly selecting something, and get students to come up with examples for each e.g. selecting which lolly to eat based on which one you like vs putting your hand into a bag and choosing a lolly without looking.
3. You could discuss more how humans are not that great at generating or accepting randomness. There are some great YouTube videos and websites with ideas for activities to explore this. A nice example is this decision by Spotify to change their algorithm for shuffling songs – their article includes some nice visualisations to support their discussion. You could also explore how the word or concept of random is used in everyday language, or in particular, in design (like my example below).
4. Display the class results as a dot plot (with the letters along the horizontal axis). So what are we looking for in the plot? Ask the students – are these results what you expect? Some students may discuss expecting to see an equal number of selections for the five letters, others may expect to see uneven results because “it’s random”, others may have other ideas based on not trusting that other students selected their letters randomly. Try to get as much out of your students as possible so you know what they are thinking.
5. We can’t use the results to prove that students selected their letters randomly or not, but we can see if the results look like what we’d get if a random process was used. Students may not know what they are looking for, and for small samples like a class, we actually expect quite a bit of variation. Use a simulation tool like this one to simulate randomly selecting n letters with replacement from the five letters you used (n being the size of your class). Discuss with the class whether their results look similar or different to the simulated results.
6. Make five large cards with each of the five letters on them. Select three students from the class (randomly or not!) and use a shuffling process to allocate each student one of the five cards. Get your students to stand in a line facing the class with their letters hidden. Ask the class how likely they think it is that the three letters will make a word when they are shown. Then get the students, one by one from left to right, to reveal their letters.
7. Get students to generate three letter “words” by randomly selecting three letters without replacement from the five letters you used (they could work in groups with their own set of five cards). This will require students to decide if a word is real or not. If you want to help students spot correct words, you could do a round of “Boggle” and get the class to create as many valid three letter words as possible from the five letters without repeating letters. Depending on previous learning, you may need to discuss the concept of probability estimates (AKA experimental probability), before getting students to generate 20 “words” (or more if you like!), counting how many of these “words” were real, and determining an estimate for the probability.
8. Discuss how a simulation could be set up using a computer to run thousands of trials to check randomly created words from the five letters against a list of three letter words that are “real” to determine a closer estimate of the model probability (AKA theoretical probability). This process of checking words against a list of “true words” could be compared to processes around checking whether an email address submitted to an online signup form is “real” or not. We need to keep linking what we do in the classroom with the real world.
9. You could explore the model probability by considering the total number of “words” that could be created by randomly selecting three letters from the five without replacement (e.g. 5 × 4 × 3 = 60) and the total number of real words found by systematically trying out all permutations or by using a Scrabble tool like this one (e.g. for my five letters it’s eleven real words).
10. You could finish by looking at the “Infinite Monkey Theorem”. This will require a bit more of a theoretical focus and understanding of complementary events and the usefulness of finding P(X = 0) when you need to find P(X ≥ 1). This kind of thinking can be referred to whenever a new animal is found to be awesome at predicting the results of sports games e.g. Paul the Octopus, Richie McCow.
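Several of these ideas (5, 7, 8, 9 and 10) can be tried out with a short Python sketch. Everything here is illustrative: the function names are made up, and in particular `REAL_WORDS` is a small hand-picked list that is deliberately incomplete, so the probability it gives is not the one you would get from a full Scrabble word list.

```python
import random
from collections import Counter
from itertools import permutations

LETTERS = ["A", "B", "D", "N", "U"]  # the example letters from idea 1

# Hypothetical, incomplete list of "real" three letter words (an assumption).
REAL_WORDS = {"AND", "BAD", "BAN", "BUD", "BUN", "DAB", "DUB", "NAB", "NUB"}

def class_selections(n_students, rng):
    """Idea 5: n_students random letter picks WITH replacement."""
    return Counter(rng.choices(LETTERS, k=n_students))

def all_three_letter_orders():
    """Idea 9: every ordered pick of three letters WITHOUT replacement."""
    return ["".join(p) for p in permutations(LETTERS, 3)]

def estimate_word_probability(n_trials, rng):
    """Ideas 7 and 8: share of randomly generated 'words' found in REAL_WORDS."""
    hits = 0
    for _ in range(n_trials):
        word = "".join(rng.sample(LETTERS, 3))
        hits += word in REAL_WORDS
    return hits / n_trials

def prob_at_least_one(p, n):
    """Idea 10: P(X >= 1) = 1 - P(X = 0) for n independent tries."""
    return 1 - (1 - p) ** n

rng = random.Random(26)
print("One simulated class of 25:", dict(class_selections(25, rng)))

orders = all_three_letter_orders()
model_p = sum(w in REAL_WORDS for w in orders) / len(orders)
estimate = estimate_word_probability(20_000, rng)
print(f"{len(orders)} possible 'words', model probability {model_p:.3f} for this word list")
print(f"Estimate from 20,000 simulated 'words': {estimate:.3f}")
print(f"Chance of at least one real word in 20 tries: {prob_at_least_one(model_p, 20):.3f}")
```

Swapping in your own five letters and a fuller word list is the obvious next step, and comparing the estimate from 20 "words" with the estimate from thousands shows why more trials give a closer estimate.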