Um … here’s a new tool for exploring probability distributions!

Actually, it’s not a new tool exactly, more a re-working of the existing modelling tool I’ve already shared on this blog, but with a new name and web location – the probability distribution explorer!

I developed the probability distribution explorer as part of my Masters research into teaching probability distribution modelling. The proposed teaching framework and the tool were developed in response to use of data for distribution modelling for AS91586, in particular the need for students to demonstrate use of methods related to the distribution of true probabilities versus distribution of model estimates of probabilities versus distribution of experimental estimates of probabilities.

The tool was developed primarily to support comparisons of the “distribution of experimental estimates of probabilities” and “distribution of model estimates of probabilities”. When reviewing research literature, I found limited examples of how to teach this comparison using an informal approach i.e. not using a Chi-square goodness-of-fit test. Consequently, I also found a lack of statistically sound criteria to enable drawing of conclusions in such resources as textbooks, workbooks and assessment exemplars.

This led to my research, which involved a small group of New Zealand high school statistics teachers. Focusing on the Poisson distribution, I investigated the criteria that ten Grade 12 teachers used for informally testing the fit of a probability distribution model. I found that the criteria the teachers were using were unreliable: they could not correctly assess model fit and, in particular, did not take sample size into account.

After exploring goodness-of-fit using my visual inference tool, teachers reported a deeper understanding of model fit. In particular, they reported that the tool had allowed them to take sample size into account when testing the fit of the probability distribution model, through the visualisation of expected distributional shape variation. I’ve re-developed the tool this year to support NZQA as they explore opportunities for assessment within a digital environment. A team of teachers are developing prototype assessment activities for AS91586 and these will be trialled with students in schools later in the year.
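To give a feel for why sample size matters here, below is a rough sketch (not the code behind the tool) that simulates samples from a Poisson model and looks at how much the experimental proportion for a single outcome varies. The mean of 2, the outcome of 2, the sample sizes and the function names are all made up for illustration.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson value using Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def proportion_band(lam, outcome, n, n_sims=500, seed=1):
    """Smallest and largest simulated sample proportion for one outcome."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sims):
        sample = [poisson_draw(rng, lam) for _ in range(n)]
        proportions.append(sample.count(outcome) / n)
    return min(proportions), max(proportions)

# Hypothetical model: Poisson with mean 2, looking at the outcome "2".
for n in (30, 300):
    low, high = proportion_band(lam=2, outcome=2, n=n)
    print(f"n = {n}: simulated proportions ranged from {low:.2f} to {high:.2f}")
```

The band of simulated proportions should be noticeably wider for the smaller sample, which is the kind of variation that a fixed "how close is close enough" rule ignores.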

The video below gives a general introduction to the tool, using data on how many times I say “um” when I’m teaching. The video itself provides another source of data because, um … well, you’ll see if you watch!

More videos, teaching notes and related resources can be found here:

mathstatic site issues

Just a quick post to let you know that the site is hopefully only temporarily down, and I am working with my hosting company to get it back online ASAP. This affects the random redirect tool, the BYOP sampler tool and the experiment lab page, which will not be available until this gets sorted. I’ll update this post soon with a progress update!


It seems the issue is that some dodgy overseas folk have been using the random redirect tool for fraudulent things like phishing scams. So, I’m going to restrict the URLs that can be used – which means analysis time to identify which sites/URL patterns to accept e.g. Google Forms, SurveyMonkey etc. 🙂

UPDATE TWO: The site is back up and running! It probably was a couple of hours ago, but I have been rewriting the code that processes the random redirect requests. Below are the main changes to the random redirect tool to better prevent issues in the future.

Due to abuse of this tool by dodgy folk, only links with domains on the approved list will now be accepted! Please complete this form to request a domain to be added to the approved list, but don’t expect any new additions to happen any time soon (this is a free tool remember and was created for simple classroom-based randomised experiments with Google forms).
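For anyone curious, an approved-list check along these lines can be sketched in a few lines. This is only an illustration, not the code running on my site: the domains in `APPROVED_DOMAINS` and the function name `is_approved` are assumptions made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical approved list, for illustration only.
APPROVED_DOMAINS = {"docs.google.com", "forms.gle", "surveymonkey.com"}

def is_approved(url):
    """Return True if the URL is http(s) and its host is on the approved list."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Accept an exact match, or a subdomain of an approved domain.
    return any(host == d or host.endswith("." + d) for d in APPROVED_DOMAINS)

print(is_approved("https://docs.google.com/forms/d/abc"))
print(is_approved("https://docs.google.com.evil.example/login"))
```

Checking the parsed host name, rather than searching the whole URL for an approved string, is what stops a link that merely contains "docs.google.com" somewhere in it from slipping through.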

Any random redirect URLs created using this tool can be disabled at any time. If this has happened to you and you are a legitimate teacher, educator or researcher, then send me an email and I might be able to help you.


After emailing me this morning to say everything was sorted, my web hosting company then decided to set my site to “maintenance” mode this afternoon and remove some crucial code used to redirect the URLs to the right locations on my website 🙁 I’m trying to get things reset back to what they were now.


Well, I had been meaning to retire the old website anyway! I’m not sure when mathstatic will be online again, so:

I think that’s everything. If there is something else not working, then please let me know!

Past and future talks and workshops

I’m pretty excited about the talks and workshops I’m doing over the next month or so! Below are the summaries or abstracts for each talk/workshop and when I get a chance I’ll write up some of the ideas presented in separate posts.

Keynote: Searching for meaningful sampling in apple orchards, YouTube videos, and many other places! (AMA, Auckland, September 14, 2019)

In this talk, I shared some of my ideas and adventures with developing more meaningful learning tasks for sampling. Using the “Apple orchard” exemplar task, I presented some ideas for “renovating” existing tasks and then introduced some new opportunities for teaching sample-to-population inference in the context of modern data and associated technologies. I shared a simple online version of the apple orchard and also talked about how my binge watching of DIY YouTube videos led to my personal (and meaningful) reason to sample and compare YouTube videos.

I made hexagon-shaped drink coasters!!!!

Workshop: Expanding your toolkit for teaching statistics (AMA, Auckland, September 14, 2019)

In this workshop, we explored some tools and apps that I’ve developed to support students’ statistical understanding. Examples were: an interactive dot plot for building understanding of mean and standard deviation, a modelling tool for building understanding of distributional variation, tools for carrying out experiments online and some new tools for collecting data through sampling.

The slides for both the keynote and workshop are embedded below:

Talk: Introducing high school statistics teachers to code-driven tools for statistical modelling (VUW/NZCER, Wellington, September 30, 2019)

Abstract: The advent of data science has led to statistics education researchers re-thinking and expanding their ideas about tools for teaching and learning statistical modelling. Algorithmic methods for statistical inference, such as the randomisation test, are typically taught within NZ high school classrooms using GUI-driven tools such as VIT. A teaching experiment was conducted over three five-hour workshops with six high school statistics teachers, using new tasks designed to blend the use of both GUI-driven and code-driven tools for learning statistical modelling. Our findings from this exploratory study indicate that teachers began to enrich and expand their ideas about statistical modelling through the complementary experiences of using both GUI-driven and code-driven tools.

Keynote: Follow the data (NZAMT, Wellington, October 3, 2019)

Abstract: Data science is transforming the statistics curriculum. The amount, availability, diversity and complexity of data that are now available in our modern world requires us to broaden our definitions and understandings of what data is, how we can get data, how data can be structured and what it means to teach students how to learn from data. In particular, students will need to integrate statistical and computational thinking and to develop a broader awareness of, and practical skills with, digital technologies. In this talk I will demonstrate how we can follow the data to develop new learning tasks for data science that are inclusive, engaging, effective, and build on existing statistics pedagogy.

Workshop: Just hit like! Data science for everyone, including cats (and maybe dogs) (NZAMT, Wellington, October 2, 2019)

Abstract: Data science is all about integrating statistical and computational thinking with data. In this hands-on workshop we will explore a collection of learning tasks I have designed to introduce students to the exciting world of image data, measures of popularity on the web, machine learning, algorithms, and APIs. We’ll explore questions such as “Are photos of cats or dogs more popular on the web?”, “What makes a good black and white photo?”, “How can we sort photos into a particular order?”, “How can I make a cat selfie?” and many more. We’ll use familiar statistics tools and approaches, such as data cards, collaborative group tasks and sampling activities, and also try out some new computational tools for learning from data. Statistical concepts covered include features of data distributions, informal inference, exploratory data analysis and predictive modelling. We’ll also discuss how each task can be extended or adapted to focus on specific aspects and levels of the statistics curriculum. Please bring along a laptop to the workshop.

I’m also presenting a workshop at NZAMT with Christine Franklin on what makes a good statistical task. I’ve been assisting Maxine Pfannkuch and members of the NZSA education committee to set up a new teaching journal, which we will be launching at the workshop!!

A simple app that only does three things

Here’s a scenario. You buy a jumbo bag of marshmallows that contains a mix of pink and white colours. Of the 120 in the bag, 51 are pink, which makes you unhappy because you prefer the taste of pink marshmallows.

Time to write a letter of complaint to the company manufacturing the marshmallows?

The thing we work so hard to get our statistics students to believe is that there’s this crazy little thing called chance, and it’s something we’d like them to consider for situations where random sampling (or something like that) is involved.

For example, let’s assume the manufacturing process overall puts equal proportions of pink and white marshmallows in each jumbo bag. This is not a perfect process and there will be variation, so we wouldn’t expect exactly half pink and half white for any one jumbo bag. But how much variation could we expect? We could get students to flip coins, with each flip representing a marshmallow, and heads representing white and tails representing pink. We can then collate the results for 120 marshmallows/flips – maybe the first time we get 55 pink – and discuss the need to do this process again to build up a collection of results. Often we move to a computer-based tool to get more results, faster. Then we compare what we observed – 51 pink – to what we have simulated.

Created using my, yes it should be two-tailed, no my tool doesn’t allow this 🙁
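If you’d rather see the coin-flipping idea as code, here is a minimal sketch of the same simulation, assuming the 50% pink model and 1000 simulated bags. The function names, the number of simulated bags and the seed are my own choices for illustration, not anything taken from a particular tool.

```python
import random

def simulate_pink_counts(n_marshmallows=120, p_pink=0.5, n_bags=1000, seed=1):
    """Simulate the number of pink marshmallows in each of n_bags jumbo bags."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_pink for _ in range(n_marshmallows))
        for _ in range(n_bags)
    ]

def lower_tail_proportion(counts, observed):
    """Proportion of simulated bags with a pink count at or below the observed count."""
    return sum(c <= observed for c in counts) / len(counts)

def two_tail_proportion(counts, observed, expected):
    """Proportion of simulated bags at least as far from the expected count as observed."""
    distance = abs(observed - expected)
    return sum(abs(c - expected) >= distance for c in counts) / len(counts)

counts = simulate_pink_counts()
print("Fewest and most pink in a simulated bag:", min(counts), max(counts))
print("Proportion with 51 or fewer pink:", lower_tail_proportion(counts, 51))
print("Proportion at least 9 away from 60:", two_tail_proportion(counts, 51, 60))
```

Each simulated bag plays the role of one set of 120 coin flips, so the list of counts is the collection of results that students would otherwise build up by hand.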

I use these kinds of activities with my students, but I wanted something more, so I made a very simple app earlier this year. You can find it here: You can only do three things with it (in terms of user interactions) but in terms of learning, you can do way more than three things. Have a play!

In particular, you can show that models other than 50% (for the proportion of pink marshmallows) can also generate data (simulated proportions) consistent with the observed proportion. So, not being able to reject the model used for the test (50% pink) doesn’t mean the 50% model is the one true thing. There are others. Like I told my class – just because my husband and I are compatible (and I didn’t reject him), doesn’t mean I couldn’t find another husband similarly compatible.
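Here is a rough sketch of that idea in code. It is not how the app works: the rule used for “consistent” (the observed count falls within the middle 95% of simulated counts) is just one arbitrary choice I’ve made for illustration, as are the candidate percentages, the function name and the seed.

```python
import random

def middle_95_range(p_pink, n_marshmallows=120, n_sims=1000, seed=1):
    """Approximate middle 95% of simulated pink counts under a given model."""
    rng = random.Random(seed)
    counts = sorted(
        sum(rng.random() < p_pink for _ in range(n_marshmallows))
        for _ in range(n_sims)
    )
    low = counts[int(0.025 * n_sims)]
    high = counts[int(0.975 * n_sims) - 1]
    return low, high

observed = 51  # pink marshmallows in the bag of 120

for percent in (30, 40, 45, 50, 60):
    low, high = middle_95_range(percent / 100)
    verdict = "consistent" if low <= observed <= high else "not consistent"
    print(f"{percent}% pink model: middle 95% is {low} to {high} -> {verdict}")
```

More than one of these models should come out as consistent with 51 pink, which is the point: failing to reject the 50% model does not single it out.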

Note: The app is in terms of percentages, because that aligns to our approach with NZ high school students when using and interpreting survey/poll results. However, I first use counts for any introductory activities before moving to percentages, as demonstrated with this marshmallow example. The app rounds percentages to the closest 1% to keep the focus on key concepts rather than focusing on (misleading) notions of precision. I didn’t design it to be a tool for conducting formal tests or constructing confidence intervals, more to support the reasoning that goes with those approaches.