Secret statistical snowflakes

Want to make some awesome gift tags/labels for Christmas or holiday-related presents? Here’s a fun little statistical art project. Write whatever words you want in the app below and create some secret snowflakes (the secret part being that no one else will know what words you used unless, of course, you choose to display them). Play around with colours if you want (uncheck the option to use random colours), freeze the snowflakes when you get something you like, then download your masterpiece and use it however you like.

Oh yeah, the snowflakes are made by rotating each letter in the words in a magical statistical way (i.e. randomness).

To make our gift labels, I made the first colour white (the background, #ffffff), made the other two colours black (#000000), and then printed them onto adhesive sticker paper I had left over from our wedding.

Enjoy and have a great holiday break!

Secret snowflakes app should be shown below (otherwise here is the link) – works best using a Chrome browser 🙂

A stats cat in a square?

On Twitter a couple of days ago, I saw a tweet suggesting that if you mark out a square on your floor, your cat will sit in it.


Since I happen to have a floor, a cat, and tape, I thought I’d give it a go. You can see the result at the top of this post 🙂 Amazing, right?

Well, no, not really. I marked out the square two days ago, and our cat Elliot only sat in the square today.

Given that:

  • our cat often sits on the floor
  • our cat often sits on different parts of said floor
  • we have a limited amount of floor
  • I marked out the square in an area where he likes to sit
  • we were paying attention to where on the floor our cat sat

… and a whole lot of other conditions, it actually isn’t as amazing as Twitter thinks. Also, my hunch is that people who do witness their cat sitting in the square post this on Twitter more often than those who give up waiting for the cat to sit in the square.

Below is a little simulation based on our floor size and the square size we used, taking into account our cat’s disposition for lying down in places. It’s just a bit of fun, but the point is that with random moving and stopping within a fixed area, if you watch long enough the cat will sit in the square 🙂
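For anyone who wants to play with the idea themselves, here is a rough Python sketch of this kind of simulation. It is not the code behind the app above. The floor and square dimensions are made-up placeholders, the cat is treated as a single point, and every stop is assumed to be uniformly random across the floor (so it ignores our cat's favourite spots).

```python
import random

# Hypothetical dimensions in metres (assumptions, not our real floor or square).
FLOOR_W, FLOOR_H = 4.0, 3.0
SQ_X, SQ_Y, SQ_SIDE = 1.0, 1.0, 0.5


def stops_until_in_square(rng, max_stops=100_000):
    """Count random stops until the cat (a point) lands inside the square.

    Returns None if the cat never lands in the square within max_stops.
    """
    for stop in range(1, max_stops + 1):
        x = rng.uniform(0, FLOOR_W)
        y = rng.uniform(0, FLOOR_H)
        if SQ_X <= x <= SQ_X + SQ_SIDE and SQ_Y <= y <= SQ_Y + SQ_SIDE:
            return stop
    return None


def simulate(n_runs=10_000, seed=1):
    """Repeat the experiment and return the number of stops needed in each run."""
    rng = random.Random(seed)
    runs = (stops_until_in_square(rng) for _ in range(n_runs))
    return [stops for stops in runs if stops is not None]


def chance_within(n_stops):
    """Exact chance of at least one stop in the square within n_stops stops."""
    p = (SQ_SIDE ** 2) / (FLOOR_W * FLOOR_H)
    return 1 - (1 - p) ** n_stops


if __name__ == "__main__":
    results = simulate()
    print("Mean stops until the cat sits in the square:", sum(results) / len(results))
    print("Chance within 100 stops:", round(chance_within(100), 3))
```

With these placeholder sizes the square covers 1/48 of the floor, so any single stop is unlikely to land in it, but the chance of at least one hit keeps climbing towards 1 as the number of stops grows. That is the whole point: wait long enough and the "amazing" result turns up.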

PS The cat image is by Lucie Parker. And yes, the cat only has to be partially in the square when it stops, but I figured that was close enough 🙂

Mind the stats?


Have you noticed how Google sometimes gives the top page in your search results a little summary box? For example, if you Google “how to plan a honeymoon”, you get this:


Since I didn’t do number two on this list, my job for tonight was to check out trains for our travel in the UK leg of our honeymoon. After my first Google search, I got a little distracted and consequently typed up this short post 🙂 I realised part way through that “mind the gap” is more of a London Underground thing than a UK train travel thing, but it’s late so hopefully the reference still makes sense.

My first (and only) search tonight was for a train from London to Cambridge. Before even clicking through to the website listed, I got to read this little “statistical report” 🙂


The first two sentences got me questioning what “fastest journey time” means, since how can the “average journey time” be lower than the shortest journey time? The third sentence made me shake my head at the misuse of our special stats word “average”, and I automatically re-worded that sentence in my head to “on weekdays there are, on average, 96 trains per day…..”

So not only because I actually needed to find out about trains from London to Cambridge, but also because I was curious to find out what “fastest journey time” means, I clicked through to

When you scroll down to the bottom you get this nice table:


This gives some immediate answers to my confusion about the Google search summary – I think. “Slowest route” actually means the minimum time, and “Fastest route” means the maximum time. At least now the average journey time of one hour sits between these two numbers, but did you notice when you scrolled down the page that there were some routes listed with times greater than 63 minutes, the supposed “fastest route”?

Me too, so I went through all routes for the next 24 hours (starting from 8:44am London time) and listed their times:


There are bound to be a few mistakes in there from when I was converting from hours to minutes 🙂 But to finish this short critique, let’s look at the data:


For this particular 24 hour period (from Monday 21st November 8:44am) there were 76 trains from London to Cambridge, with a mean journey time of around 64 minutes (based on the advertised times). If I wanted to check out the claims about the average number of trains per weekday and the average journey time, I’d need a better sampling method and more “weekdays” of data. But this sample does offer evidence to contradict the claims about “shortest” and “fastest” journey times.

Unless those terms still don’t mean what I think they mean, even when I reverse them 🙂

How many of my emails will get rolled up this week?

At the start of the year I started using a service called unroll me with my Gmail account. It allows you to wrap up regular or subscription emails into one daily email digest. It takes a number of months to set up the service to capture all your regular or subscription emails, but I have found it helpful in reducing the clutter in my email, so it’s worth the minimal effort.

I noticed – as you do when you’re a stats teacher – that the number of emails that are rolled up per day varies. I wondered if there was anything going on – any patterns, trends etc. – so I went back over the last couple of months and recorded how many emails were wrapped up per day.

So here’s a little challenge for your students 🙂

Using the data on the number of my emails wrapped per day for the last few months, can they predict how many of my emails will be wrapped up over the next four days (Tuesday, Wednesday, Thursday and Friday)?

Here’s the data…….

Jump with the data into iNZight lite

Download the data as a CSV

Link for data:

Raw data as ordered counts (first count is a Monday)


Not sure how to get the students started?

Here are some ideas you could give to students:

  • Graph the data in Excel or another spreadsheet and use “your eyes” and/or a sketch to make the prediction
  • Import the data into iNZight (or equivalent) and try to use a time series model to make the predictions
  • Find the mean number of emails rolled up for each day of the week and use these to make the predictions
  • Use a probability distribution to model the number of emails rolled up each day and generate four random outcomes from this model to make the predictions
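The last two ideas could be sketched in Python along these lines. This is only one possible approach, under some loud assumptions: the counts below are made-up placeholders (not my real data), the list is assumed to hold one count for every day of the week starting on a Monday, and a Poisson distribution is just one candidate model for the daily counts.

```python
import math
import random
import statistics

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def weekday_means(counts):
    """Mean count for each day of the week; counts[0] is assumed to be a Monday."""
    by_day = {day: [] for day in DAYS}
    for i, count in enumerate(counts):
        by_day[DAYS[i % 7]].append(count)
    return {day: statistics.mean(vals) for day, vals in by_day.items() if vals}


def poisson_draw(lam, rng):
    """One random draw from a Poisson(lam) model (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


if __name__ == "__main__":
    # PLACEHOLDER counts for illustration only, NOT the real email data.
    counts = [25, 28, 30, 27, 31, 12, 10,
              24, 26, 29, 30, 33, 11, 9]
    means = weekday_means(counts)
    rng = random.Random(2016)  # arbitrary seed so the run is repeatable
    for day in ["Tuesday", "Wednesday", "Thursday", "Friday"]:
        print(day,
              "| mean-based prediction:", means[day],
              "| one Poisson draw:", poisson_draw(means[day], rng))
```

The mean-based prediction gives the same answer every time, while the Poisson draw gives a different plausible outcome on each run, which makes a nice starting point for talking with students about the difference between a point prediction and the variation around it.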

So how many emails did I get?

Move your mouse over the grey box below to see 🙂

Tuesday: 22

Wednesday: 29

Thursday: 30

Friday: 33

Using statistics to plan a wedding


Sorry there have been no posts for a while. I have a whole stash of draft posts nearly ready to be published, but work, study, wedding planning and life in general have got in the way 🙂
One of the few posts I have made this year was about statistical modelling, so I thought I’d quickly share something related to this – an article about how an Australian couple used statistical modelling to predict how many guests would turn up to their wedding.

I love this article, not just because I am planning a wedding and I love statistics, but also because of how it discusses some of the key components of statistical modelling, for example:

  • the need for a model (including the risks of getting the model wrong which we don’t always talk about)
  • building the model (what factors were taken into account and why)
  • assumptions (including which assumptions turned out to not be so good)
  • acknowledging uncertainty (factors out of their control and other unknown information)
  • using the model (getting predictions, using a prediction interval)
  • evaluating and refining the model (considering how well the model performed, and how could it be improved for future applications)
…… and probably other aspects I’ve missed in this brief summary. I’m not sure how interesting this context would be for students but for me it was super interesting and inspiring even.
And the answer to the question you may be asking is ……… yes, I did create my own statistical model for our wedding 🙂 And this post may or may not be related to our RSVP date being very soon ….

Update: Seems like something we missed from our model was some invitations going missing in the post!