While I continue to decide whether to quit Facebook, I’ve been trying to keep on top of my admin responsibilities for the Stats Teachers NZ Facebook group while keeping an eye on any stats-related posts on the NZ Maths teachers Facebook group. Since not everyone is on Facebook, I thought I’d do a quick post sharing some of the ideas for teaching stats I’ve recently shared within these groups.

How is the bootstrap confidence interval calculated?

The method of bootstrap confidence interval construction we use at high school level in NZ is to take the central 95% of the bootstrap distribution (the 1000 re-sampled means/medians/proportions/differences etc.), that is, the interval running from the 2.5th to the 97.5th percentile. There are other bootstrap methods (but we don’t cover these at high school level), and because of the percentile approach we use, you can get non-symmetrical confidence intervals.
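
If you’d like to see the percentile method written out as code, here’s a minimal sketch in Python (the sample values, the choice of the mean as the statistic, and the 1000 re-samples are just for illustration; VIT does the equivalent re-sampling for you):

```python
# A rough sketch of the percentile bootstrap confidence interval (illustrative data)
import numpy as np

rng = np.random.default_rng(2021)
sample = np.array([3.2, 4.1, 2.8, 5.0, 3.9, 4.4, 2.5, 3.7, 4.8, 3.3])  # made-up measurements

n_boot = 1000
boot_means = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(sample, size=sample.size, replace=True)  # re-sample with replacement
    boot_means[i] = resample.mean()

# take the central 95% of the bootstrap distribution (2.5th to 97.5th percentiles)
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"bootstrap 95% confidence interval for the mean: ({lower:.2f}, {upper:.2f})")
```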

Here are a couple of videos featuring Chris Wild talking about bootstrap confidence intervals: 

You can read more about the research project and development of VIT that informed the implementation of simulation-based inference for NZ statistics at the school level here: http://www.tlri.org.nz/sites/default/files/projects/9295_summary%20report_0.pdf

A quick but helpful article with more background about norm-based confidence intervals and bootstrap confidence intervals in terms of teaching: https://new.censusatschool.org.nz/wp-content/uploads/2012/08/Confidence-intervals-what-matters-most.pdf

A recent article by Mark Hooper for the SDSE (Statistics and Data Science Educator) provides an activity for introducing bootstrapping: https://sdse.online/posts/SDSE19-004/

Does the shape of the bootstrap distribution tell us anything about whether some values in a confidence interval are more likely to be the true value of the parameter?

All the values in the confidence interval are plausible values for the population parameter (well, except for impossible values, e.g. a negative value when estimating the mean length of a piece of string, or 0% when estimating a population proportion when your sample proportion was not 0%!). As an extra note, we often see skewness in the bootstrap distribution when using small samples whose distributions are skewed (since we re-sample from the original sample). Small samples are not that great for getting a feel for the shape of the underlying/population distribution.
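
To see this in action, here’s a small illustrative simulation (the exponential population is just a stand-in for “something right-skewed”, and the sample size of 12 is arbitrary): a small sample from a skewed population tends to give a bootstrap distribution, and hence a confidence interval, that is skewed too.

```python
# Illustrative: bootstrap distributions from small, skewed samples tend to be skewed as well
import numpy as np

rng = np.random.default_rng(7)
small_sample = rng.exponential(scale=10, size=12)  # small sample from a right-skewed population

boot_means = np.array([
    rng.choice(small_sample, size=small_sample.size, replace=True).mean()
    for _ in range(1000)
])

lower, upper = np.percentile(boot_means, [2.5, 97.5])
sample_mean = small_sample.mean()
print(f"sample mean: {sample_mean:.1f}")
print(f"central 95% of the bootstrap distribution: ({lower:.1f}, {upper:.1f})")
print(f"distance below the mean: {sample_mean - lower:.1f}, distance above: {upper - sample_mean:.1f}")
```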

Is a bootstrap confidence interval a 95% confidence interval?

Sample size is a key consideration here! With large sample sizes, the bootstrap method does “work” about 95% of the time, hence giving us 95% confidence. But, just like norm-based methods (e.g. using 1.96 x se), with small samples our confidence level will not be as high using the “central 95%” approach.
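
One way to convince yourself of this is a coverage simulation: repeatedly take samples from a population where we know the true mean, construct the “central 95%” bootstrap interval each time, and count how often the interval captures the truth. The sketch below uses a skewed (exponential) population and 500 repetitions per sample size; these choices are mine, purely for illustration.

```python
# Rough coverage check for the "central 95%" bootstrap interval (illustrative settings)
import numpy as np

rng = np.random.default_rng(42)
true_mean = 10.0  # the mean of the exponential population we sample from

def percentile_ci(sample, n_boot=1000):
    # re-sample with replacement n_boot times, all at once, and take the mean of each re-sample
    boot_means = rng.choice(sample, size=(n_boot, sample.size), replace=True).mean(axis=1)
    return np.percentile(boot_means, [2.5, 97.5])

reps = 500
for n in (10, 100):
    hits = 0
    for _ in range(reps):
        sample = rng.exponential(scale=true_mean, size=n)
        lower, upper = percentile_ci(sample)
        hits += (lower <= true_mean <= upper)
    print(f"n = {n:3d}: interval captured the true mean in {100 * hits / reps:.0f}% of {reps} samples")
```

Typically the capture rate for the small samples comes out noticeably below 95%, while for the larger samples it sits close to 95%.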

Can students use both the CL5 and CL6 rules when making a call? Why can’t the CL5 rule be used with sample sizes bigger than 40?

The rules are designed to scaffold student understanding of what needs to be taken into account when comparing samples to make inferences about populations. Once students learn and can use a higher level rule, they should use this rule by itself. The two rules use different features of the sample distributions and do not give the same “results”. If you use both at the same time, you are encouraging an approach where you select the method that will give you the result you want!

In terms of whether the rule “works”, we have to consider not just the cases of “making a call” when we should, but also the cases of “not making a call” when we should have. Yes, the CL5 rule still “works” when applied to data from samples bigger than 40, in the sense that making the call is still good evidence of a true difference. The problem is that for larger sample sizes, when using the CL5 rule, you become much more likely to conclude the data provides “no evidence of a true difference” when really it does. In this respect, the rule does not “get better” as you increase sample size: it’s too stringent, which is why we move to higher curriculum level “rules” or approaches, ones where we learn to take sample size (among other things) into account.
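
Here’s an illustrative simulation of this behaviour, taking the CL5 rule to be “make the call if the median of one sample sits outside the box (middle 50%) of the other sample”, and using two normal populations whose means really do differ by half a standard deviation (both the rule wording and the populations are my assumptions for the demo):

```python
# Illustrative simulation of a "median outside the other group's box" rule
import numpy as np

rng = np.random.default_rng(123)

def makes_call(a, b):
    """Call a difference if either sample's median lies outside the other sample's box (IQR)."""
    med_a, med_b = np.median(a), np.median(b)
    q1_a, q3_a = np.percentile(a, [25, 75])
    q1_b, q3_b = np.percentile(b, [25, 75])
    return (med_a < q1_b or med_a > q3_b) or (med_b < q1_a or med_b > q3_a)

true_shift = 0.5  # a real difference of half a standard deviation between the two populations
reps = 2000
for n in (20, 40, 100, 500):
    calls = sum(
        makes_call(rng.normal(0, 1, n), rng.normal(true_shift, 1, n))
        for _ in range(reps)
    )
    print(f"n = {n:3d} per group: made the call in {100 * calls / reps:.0f}% of simulated pairs of samples")
```

Because the boxes don’t get narrower as the sample size grows (the middle 50% of the data stays roughly the same width), the rule makes the call less often with bigger samples, even though the difference is real.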

If a sample size is larger, does that mean it is more representative of the population?

Let’s say you have access to 5000 people who voted for the National Party in the last election, ask them whether they support Judith Collins as the next PM, and obtain a sample proportion. If you used this sample proportion to construct a confidence interval, it would have a small margin of error (narrow interval, high precision), BUT the confidence interval would probably “miss the target” if you were wanting to infer about the proportion of all NZers who support Judith Collins as the next PM, because of high bias/inaccuracy.
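
A tiny simulation makes the point: sample only from the sub-group and the interval is narrow but centred in the wrong place. The two proportions below (30% support across all voters, 70% support among the sampled sub-group) are made up for the demo.

```python
# Sketch: a big sample from the wrong population gives a narrow interval around the wrong value
import numpy as np

rng = np.random.default_rng(11)
support_all_voters = 0.30   # made-up "target" proportion across all voters
support_subgroup = 0.70     # made-up proportion within the sub-group actually sampled

n = 5000
sample = rng.random(n) < support_subgroup  # every respondent comes from the sub-group
p_hat = sample.mean()
moe = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)  # norm-based margin of error, for comparison

print(f"sample proportion: {p_hat:.3f}, 95% CI: ({p_hat - moe:.3f}, {p_hat + moe:.3f})")
print(f"target (all voters): {support_all_voters} -> nowhere near the interval")
```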

It is important to know the “target” population for the inference you want to make using the sample, and to check whether the sample you are using was taken from this population. In terms of teaching sample-to-population inference, we need to use a random sample from this population. Our inference methods only model sampling error (how random samples from populations behave), not nonsampling error (everything else that can go wrong, including the method used to select the sample). If we can’t use a random sample (which in practical terms is pretty difficult to obtain when your sampling frame is not a supplied dataset), then we need to consider how the sample was obtained and also be prepared to assume/indicate even more uncertainty for our inference, in addition to what we are modelling based on sampling variation 🙂

Watch out for a common student misconception that larger populations require larger samples. The population size is not important or relevant (unless you want to get into finite population corrections); it’s the size of the sample that matters for quantifying sampling error. Hence why it was a question in my first stage stats test a couple of weeks ago!
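
If you want a quick demonstration for students (or yourself!), the sketch below draws samples of the same size n = 100 from a “small” and a “large” population with the same proportion and compares the spread of the sample proportions; the population sizes and proportion are arbitrary choices for the demo.

```python
# Quick check that the spread of sample proportions depends on n, not the population size
import numpy as np

rng = np.random.default_rng(5)
p, n, reps = 0.4, 100, 1000

for pop_size in (10_000, 1_000_000):
    n_yes = int(p * pop_size)
    population = np.repeat([True, False], [n_yes, pop_size - n_yes])  # finite population with proportion p
    p_hats = np.array([
        rng.choice(population, size=n, replace=False).mean() for _ in range(reps)
    ])
    print(f"population of {pop_size:>9,}: sd of sample proportions = {p_hats.std():.3f}")

print(f"sqrt(p(1 - p)/n) = {np.sqrt(p * (1 - p) / n):.3f}")  # what the sample size alone predicts
```

Both populations give essentially the same spread, matching what the formula based only on n predicts.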

A tool I developed that is handy for exploring confidence intervals for single proportions and the impact of sample size and the value of the sample proportion can be found here: https://learning.statistics-is-awesome.org/threethings/

How can you find good articles for 2.11 Evaluate a statistically based report?

I’ve written a little bit about finding and adapting statistical reports here. To summarise, I find newspaper articles are often not substantial enough, since 2.11 requires the report to be based on a survey and students need to be given enough info about how the survey-based study was carried out to be able to critique it. Often the executive summary from a national NZ-based survey works better (with some trimming and adapting). I like NZ on Air based surveys, and this recent one looks do-able with some adaptation: Children’s Media Use Survey 2020 – it even mentions TikTok!

Can you create links to iNZight Lite and VIT online with data pre-loaded?

Yes – I made a video about setting up data links to iNZight Lite here:

If you want to use the Time Series module with your data, just change the “land=visualize” part of the URL to “land=timeSeries”.
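
For example, a data link might look something like the hypothetical one below (the server address and data part of your link will be whatever was generated when you set it up; the only bit you edit is the land= parameter):

```
https://<inzight-lite-server>/?<your-data-parameters>&land=visualize    -> opens the default Visualize module
https://<inzight-lite-server>/?<your-data-parameters>&land=timeSeries   -> opens the Time Series module
```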

What should a student do if they get negative forecasts from their time series model, when the variable being modelled can’t take on negative values?

You want the student to go back and take a look at the data! And then the model. And ask themselves – what’s gone wrong here? Is it how I’m modelling the trend? Or is it how I’m modelling the seasonality? Or both? Is the trend even “nicely behaved” enough to model? Same with the seasonality?

Often the data shows why the model fitted will not do a good job, even before looking at the forecasts generated. We should be encouraging students to look at the data that was used to build the model, particularly for time series when we are focusing on modelling trend and seasonality. Students should be encouraged to ask – why is the model generating negative values for the forecast? How did it learn to do this from the data I used? Can I develop a better model?
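
As a concrete (and completely made-up) example, the sketch below fits a simple additive model, a straight-line trend plus average seasonal effects, to a monthly series that has been declining towards zero. The fitted trend keeps heading down, so the forecasts sail past zero into impossible territory, which is exactly the conversation we want students to have about whether the trend is “nicely behaved” enough for this kind of model.

```python
# Toy example: an additive trend + seasonal model forecasting impossible negative values
import numpy as np

rng = np.random.default_rng(3)
months = np.arange(48)
seasonal_swing = 8 * np.sin(2 * np.pi * months / 12)  # a fixed within-year pattern
series = np.maximum(60 - 1.4 * months + seasonal_swing + rng.normal(0, 3, 48), 0)  # data can't go below 0

# fit a straight-line trend, then average the detrended values by month for the seasonal effects
slope, intercept = np.polyfit(months, series, 1)
detrended = series - (intercept + slope * months)
seasonal_effects = np.array([detrended[months % 12 == m].mean() for m in range(12)])

# forecast the next 12 months: the fitted trend keeps heading down, so forecasts go negative
future = np.arange(48, 60)
forecasts = intercept + slope * future + seasonal_effects[future % 12]
print(np.round(forecasts, 1))  # several forecasts are below zero even though the data never can be
```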