You say data, I say data cards …

This long weekend (in Auckland anyway!), I spent some time updating the Quick! Draw! sampling tool (read more about it here Cat and whisker plots: sampling from the Quick, Draw! dataset). You may need to clear your browser cache/data to see the most recent version of the sampling tool.

One of the motivations for doing so was a visit to my favourite kind of store – a stationery store – where I saw (and bought!) this lovely gadget:

It’s a circle punch with a 2″/5 cm diameter. When I saw it, my first thought was “oh cool I can make dot-shaped data cards”, like a normal person right?

Using data cards to make physical plots is not a new idea – see censusatschool.org.nz/resource/growing-scatterplots/ by Pip Arnold for one example:

But I haven’t seen dot-shaped ones yet, so this led me to re-develop the Quick! Draw! sampling tool to be able to create some 🙂

I was also motivated to work some more on the tool after the fantastic Wendy Gibbs asked me at the NZAMT (New Zealand Association of Mathematics Teachers) writing camp if I could include variables related to the times involved with each drawing. I suspect she has read this super cool post by Jim Vallandingham (while you’re at his site, check out some of his other cool posts and visualisations) which came out after I first released the sampling tool and compares strokes and drawing/pause times for different words/concepts – including cats and dogs!

So, with Quick! Draw! sampling tool you can now get the following variables for each drawing in the sample:

The drawing and pause times are in seconds. The drawing time captures the time taken for each stroke from beginning to end and the pause time captures all the time between strokes. If you add these two times together, you will get the total time the person spent drawing the word/concept before either the 20 seconds was up, or Google tried to identify the word/concept. Below the word/concept drawn is whether the drawing was correctly recognised (true) or not (false).

I also added three ways to use the data cards once they have been generated using the sampling tool (scroll down to below the data cards). You can now:

  1. download a PDF version of the data cards, with circles the same size as the circle punch shown above (2″/5cm)
  2. download the CSV file for the sample data
  3. show the sample data as a HTML table (which makes it easy to copy and paste into a Google sheet for example)

In terms of options (2) and (3) above, I had resisted making the data this accessible in the previous version of the sampling tool. One of the reasons for this is because I wanted the drawings themselves to be considered as data, and as human would be involved in developed this variable, there was a need to work with just a sample of all the millions of drawings. I still feel this way, so I encourage you to get students to develop at least one new variable for their sample data that is based on a feature of the drawing 🙂 For example, whether the drawing of a cat is the face only, or includes the body too.

There are other cool things possible to expand the variables provided. Students could create a new variable by adding drawing_time and pause_time together. They could also create a variable which compares the number_strokes to the drawing_time e.g. average time per stroke. Students could also use the day_sketched variable to classify sketches as weekday or weekend drawings. Students should soon find the hemisphere is not that useful for comparisons, so could explore another country-related classification like continent. More advanced manipulations could involve working with the time stamps, which are given for all drawings using UTC time. This has consequences for the variable day_sketched as many countries (and places within countries) will be behind or ahead of the UTC time.

If you’ve made it this far in the post…. why not play with a little R 🙂

I wonder which common household pet Quick! drawers tend to use the most strokes to draw? Cats, dogs, or fish?

Have a go at modifying the R code below, using the iNZightPlots package by Tom Elliott and my [very-much-in-its-initial-stages-of-development] iNZightR package, to see what we can learn from the data 🙂 If you’re feeling extra adventurous, why not try modifying the code to explore the relationship between number of strokes and drawing time!