Icebreakers! (not the gum)

To start off this post, it’s probably fitting to quote a Duran Duran song (1990): “The lasting first impression is what you’re looking for.”

Besides starting with the usual housekeeping on the first day of class, why not set the tone for the course by giving students a glimpse of the classroom environment as a community of learners, getting them to connect with one another, AND doing statistics? Look no further than an Icebreaker activity! We present two Icebreakers that can get your class (either in-person or online) off to a great start: Questions on the Back (a classic) and How Old? Visualization.

Questions on the Back Activity (Laura Le)

The purpose of the Questions on the Back activity is to allow students to experience statistics in an informal (and fun!) way. It can be implemented in both in-person (note: it does not adhere to physical distancing guidelines) and online (asynchronous) introductory statistics courses.

Activity for in-person courses

The activity starts by taping a question¹ to the back of each student. Students will not know which question is being taped on their back, but tell them that the goal of this activity is to collect data (numbers only!) from their fellow classmates to help them figure out what the question is on their back. Now, some of the questions may be easier to figure out, such as “What is your shoe size?”, and other questions may be harder to identify, such as “What is your lucky number?”.

¹ Instructor prep prior to class: (1) Create a list of questions where the answer is a number. Here is my running list as a starting point. (2) Print off the list and cut out the questions into little strips of paper. (3) Bring the strips and Scotch tape to class for the activity. (4) Create a slide (or a poster) of the activity’s instructions that can be displayed. Here is my slide for the instructions.

Note: For the remainder of this article, I’m going to refer to the students with the unknown questions on their back as Question Carriers and the students who read the questions and provide an answer as Responders.

At this point, students are asked to walk around the classroom with a writing utensil and something to write on and interact with their classmates. When a student finds a peer, I ask that they introduce themselves and possibly state their program or major. Then, the pair takes turns silently reading the question on their peer’s back and answering it with only a number and nothing else (e.g., no units). If the question asks “What is your…?”, it is a question about the Responder (and not about the Question Carrier). After collecting responses from all the students in the class (if the class size is less than 20) or after 15 responses (if the class size is greater than 20), students find a place at the board (if there is enough room) or use a drawing tool (paper, iPad) and graph the data to help them figure out what question was on their back.
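For instructors who want to show what a quick graph of the collected responses might look like (or for students who reach for code instead of the board), here is a minimal sketch in Python. The responses are invented for illustration, and `dot_plot` is a hypothetical helper name, not part of the original activity.

```python
from collections import Counter


def dot_plot(responses):
    """Return a text dot plot: one row per distinct value, one mark per response."""
    if not responses:
        return ""
    counts = Counter(responses)
    # Pad the value labels so the bars line up
    width = max(len(str(value)) for value in counts)
    lines = []
    for value in sorted(counts):
        lines.append(f"{str(value).rjust(width)} | {'*' * counts[value]}")
    return "\n".join(lines)


# Hypothetical responses a Question Carrier might have written down
responses = [7, 8, 8, 9, 9, 9, 10, 10, 11, 13, 8, 9, 10, 7, 9]
print(dot_plot(responses))
```

A tight cluster of small whole numbers like this one might nudge a student toward guessing something like shoe size, which is exactly the kind of reasoning the activity is after.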

Once everyone has graphed their data, I ask for volunteers to summarize their results (using the graph as a visual) and to try to guess their question. Those who volunteer are the first to see the question on their back. The number of volunteers I have depends on how many minutes are left before class is over. In total, this activity takes approximately 20-30 minutes of class (or longer, depending on how many students you ask to describe their graph and guess their question).

This activity is one of my favorites to kick off the first day of my in-person introductory statistics courses. Why, you may ask?

  1. Speed Meeting: The activity allows students to meet and interact with their peers on a one-to-one basis in a less intimidating, more personal setting than in a large group setting (e.g., going around the room), and they can meet most of their peers (if not all of their peers) in a relatively short period of time. It’s a great way to set the class’s tone as a community of learners.
  2. Element of Surprise: The students are interested and motivated in figuring out what the question is on their back. 
  3. DOING Statistics on Day 1: They are DOING statistics on the first day of class. They are collecting, and possibly organizing, real data, which is a GAISE recommendation and goal for introductory statistics courses. Some are using external cues (variables) beyond the number provided to help figure out their question, such as how long it takes Responders to answer the question or the Responder’s body language. They are also summarizing data with a graph of their choice, since they are not told what kind of graph to create.
  4. Informal Assessment: The kinds of plots that are created help me, as the instructor, understand where my students are at in their prior knowledge (specifically, on graphical representations). I have seen all kinds of plots, some more useful than others, from boxplots and histograms to line charts (with the index number on the x-axis and value on the y-axis) and pie charts. 

Activity for online courses

The online version has a similar goal of exploring real data while getting to know other students in the class, but the roles are flipped. Rather than one student trying to guess one question, one student gets a question, collects data on that question from information in their peers’ introduction posts², summarizes the data with a plot, and posts a one- to two-sentence description of the data (but not the question) for the rest of the class in a Q&A forum. Then, their classmates respond to the post by guessing which question they had from the description of the data. Students are provided a Word document for the Icebreaker activity that includes the instructions for how to complete it.

² In the Introduce Yourself discussion forum, students are asked to answer five questions (all have numerical answers) about themselves and are told that this information will be used in a learning activity for that week. Then these questions are placed into the activity and students are “randomly” assigned to one question.

Since there are only a few questions (approximately five) that are asked of students, the questions will have multiple students supplying a description of the data. However, it is still very insightful to see how each student decides to tell the story of their data. 
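One way the “random” assignment of students to questions could be carried out is sketched below. This is an assumption about the mechanics, not a description of how the course actually does it: the student names, question labels, and the function `assign_questions` are all hypothetical.

```python
import random


def assign_questions(students, questions, seed=None):
    """Shuffle the roster, then deal the questions out in turn.

    Dealing round-robin after a shuffle keeps the assignment random while
    making sure every question gets nearly the same number of students.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(students)
    rng.shuffle(shuffled)
    return {
        student: questions[i % len(questions)]
        for i, student in enumerate(shuffled)
    }


# Hypothetical roster and question labels
roster = ["Ana", "Ben", "Chen", "Dev", "Eli", "Fay", "Gus"]
labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]
for student, question in sorted(assign_questions(roster, labels, seed=2020).items()):
    print(student, question)
```

With more students than questions, several students end up describing the same data, which matches the situation described above.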


Note: I first learned about the Questions on the Back activity from Michelle Everson when I was a graduate student in the Statistics Education department at the University of Minnesota. To be honest, when I initially learned about these “first day” activities as a kick-off to the course, I was not 100% on board. I thought they sounded interesting, they might be fun, but they were a little cheesy (and not just because I’m from Wisconsin). This was before I tried it out in class, so I had not yet realized its potential for students and for instructors. So give it a go!

How Old? Visualization Activity (Steve Foti)

This activity was inspired by a conversation I had with Dennis Pearl at USCOTS 2019 about fun things to try in the classroom. He was showing me a Microsoft-powered website that will try to guess your age from a picture that you take or upload, and describing how he has used it before in the classroom. With a brand new data visualization course coming up in the spring semester, I was especially open to ideas that could be adapted to my course. After playing around with the website and reflecting on our conversation, I developed an activity and piloted it in two different courses, Biostatistical Literacy in the fall and Data Visualization in the Health Sciences in the spring.

The Biostatistical Literacy course is a graduate-level service course that typically contains between 20 and 30 students, most of whom are advanced-degree-seeking medical professionals. The data visualization course is a new MS elective offered by our department that is open to both majors and non-majors, and most recently contained a small handful of students studying biostatistics as well as other health professions. These are the courses I have tried this activity in so far, but I believe it could be used in classrooms of any skill level.

The basic idea of the activity is to collect data by having each student take multiple selfies (the website utilizes your default camera app) and record the age it guesses each time. With their individual data, students are asked to create a visualization to show some important feature of their data using any means they are comfortable with (e.g. pens, colored pencils, Excel, R). Then, in pairs or small groups, they are asked to think about and discuss ways they might be able to successfully manipulate the algorithm (e.g. putting on/taking off glasses, smiling, changing the angle). Using their idea, students collect more data and add it to their visualization in a way that distinguishes it from the original data. Finally, the data displays are shared with the class and we have brief discussions about them. I typically lead these discussions with questions like, what does this graph show? Does anything about the data stand out to you? Does it look like the attempt to manipulate the algorithm was successful? 
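For students who choose code over colored pencils, the before-and-after comparison could be sketched roughly as follows. This is only an illustration in Python: the age guesses are made up, and `compare_guesses` and `strip_plot` are hypothetical helper names rather than part of the activity instructions.

```python
from statistics import mean


def compare_guesses(baseline, manipulated):
    """Summarize two sets of age guesses and the shift between their means."""
    return {
        "baseline_mean": mean(baseline),
        "manipulated_mean": mean(manipulated),
        "shift": mean(manipulated) - mean(baseline),
    }


def strip_plot(label, guesses, low, high):
    """Draw one text row with an 'x' at each guessed age from low to high.

    Assumes every guess is a whole number between low and high.
    """
    row = ["."] * (high - low + 1)
    for guess in guesses:
        row[guess - low] = "x"
    return f"{label:>12} {low} {''.join(row)} {high}"


# Invented data: five selfies as-is, then five more with glasses on
baseline = [31, 33, 32, 34, 31]
with_glasses = [36, 38, 35, 37, 36]

print(strip_plot("baseline", baseline, 30, 40))
print(strip_plot("glasses", with_glasses, 30, 40))
summary = compare_guesses(baseline, with_glasses)
print(f"mean shift: {summary['shift']:.1f} years")
```

Putting both rows on the same axis is one simple way to add the new data "in a way that distinguishes it from the original data"; color or plotting symbols would do the same job in a hand-drawn graph.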

The full activity instructions are shared on our resources page.

I enjoy this activity because it is a little bit different than your standard icebreaker. The discussion between students has an element of fun and mystery, and the age-guessing tool is likely something that they have not worked with before. At the same time, the work they are completing allows them to showcase their creativity and their comfort with graphing and communicating about data. Below are a couple of examples that students in my classes have created through this activity. Not all of them exactly follow the instructions, but they are still generally on topic and are interesting to see.

So far, I have only tried this activity twice in a face-to-face setting. This fall, I will be testing it out in an asynchronous, online version of the biostatistical literacy course. Since we can no longer have a live discussion, I have changed the activity submission to a discussion board format. Students will upload their final graphic and post their conclusions about their data, their graphic, or the algorithm used by the age guessing software. I may also require them to respond to at least one post in an attempt to encourage interaction and full participation in the activity. 

I think this activity lends itself a little better to the face-to-face setting. Here, students are able to interact throughout the process and share their thoughts on how they might manipulate the algorithm. It is more fun, in my opinion, when students can share and laugh about the results of the age guesser, compare ideas for manipulating the algorithm, and be present for the concluding discussion. In the online setting, the students are no longer able to have the same level of interaction, so while the activity still offers some benefit as a statistical activity, it loses some of its credibility as an icebreaker. 

Slack for (A)synchronous Course Communication

Contributing author Albert Y. Kim is an assistant professor of statistical & data sciences. He is a co-author of the fivethirtyeight R package and ModernDive, an online textbook for introductory data science and statistics. His research interests include spatial epidemiology and model assessment and selection methods for forest ecology. Previously, Albert worked in the Search Ads Metrics Team at Google Inc. as well as at Reed, Middlebury and Amherst colleges. You can follow him on Twitter @rudeboybert.

Contributing author R. Jordan Crouser is an Assistant Professor of Computer Science at Smith College. He is published in the areas of visualization theory, human-computer interaction, educational technology, visual analytics systems and human computation. For more information, visit his faculty page.

Contributing author Benjamin S. Baumer is an assistant professor in the Statistical & Data Sciences program at Smith College. His research interests include sports analytics, data science, statistics and data science education, statistical computing, and network science. For more information, visit his faculty page.

You might have heard of Slack before. But what is it? Is it email? Is it a chat room? Slack describes their flagship product as a “collaboration hub that can replace email to help you and your team work together seamlessly.” In this blogpost, we’ll describe how we’ve been using Slack for asynchronous course communication, as opposed to the synchronous course communications afforded by Zoom and other remote conferencing platforms.

Why do we stress (a)synchronous? The brick-and-mortar constraint of having everyone working at the same time is unworkable under the unfolding COVID-19 pandemic. Across the world, support staff, faculty, and students have suddenly been forced to convert to a remote learning model of education. In order for this model to be successful, flexibility is needed to ensure equitable learning experiences with respect to differences in time zones, suitability of student learning environments, internet access, and many other factors. In order to ensure this flexibility, many instructors are recognizing that some portion of their courses must be delivered in an asynchronous fashion, on top of the synchronous nature of regular lecture and meeting times. 

Before we discuss how we’ve been using Slack, we must explain how Slack is organized.

How is Slack organized?

Slack is organized into workspaces, which loosely correspond to a “team” of individuals (such as a research or special interest group). In our case, this will be an individual course. When using Slack from the Desktop or Mobile app, a list of your workspaces appears in the left-hand vertical menu bar. For example, of the 8 workspaces highlighted in red, we are currently viewing the “220” course workspace:

Within each workspace are channels (identified with hashtags), highlighted here in blue. You can think of channels as forums corresponding to topics. In this example, we have #general (announcements), #questions, and several others. Different stakeholders can join each channel, and channels can be designated public or private as appropriate. Note how the #problem_sets channel has a lock icon, indicating that it is private (to just instructors and graders).

Additionally, within each workspace are direct messages (DMs), highlighted in green. You can think of DMs as group text messages. Unlike with channels, people cannot later “join” these conversations.

What are the benefits of Slack?

Slack’s primary benefit is centralization and organization of communications, which helps to minimize inefficient context switching:

For example, if we want to ignore messages related to the 220 course and focus our attention on the 293 course, we can do so easily. This inherent compartmentalization of communications relating to courses is especially helpful when managing asynchronous communication across multiple courses, the challenges of which have been amplified during the recent outbreak of COVID-19.

Second, Slack facilitates the posing and answering of student questions via channels dedicated to discussion boards. This is a welcome feature of Slack given the importance of (a)synchronous communications in light of COVID-19.

Note that Slack is certainly not the only platform that has such functionality; other platforms include Moodle, Piazza, and Discord.

Third, the benefits of Slack increase not only as the number of team members grows, but also as the number of distinct groups of team members grows. For example, this semester’s two sections of Smith College’s SDS/MTH 220 Introduction to Probability and Statistics have 79 students who form 31 term project groups, 2 instructors, 2 lab instructors, 2 graders, and 2 in-class teaching assistants. By carefully constructing both private and public channels and direct messages, we can localize communications in their appropriate destinations. This is critical at a time when we can’t meet in person, nor can we easily meet at the same time.

Fourth, the more casual nature of Slack interactions versus email reduces instructor/student barriers. For example, less time can be spent choosing appropriate email greetings and signoffs. Additionally, Slack’s use of newer modalities of communication like emojis and GIFs can further facilitate expression at a time when maintaining open communication is paramount.

Other benefits of Slack include (1) seamless transition between Desktop and Mobile interfaces; (2) a growing ecosystem of 3rd party applications to integrate with platforms such as Zoom, GitHub, PollEverywhere, Google Drive, and Dropbox; and (3) unlike Moodle or Piazza, Slack is widely used in industry. While we won’t argue that Slack is a skill, familiarity with it certainly won’t hurt students as they enter the workplace.

What are some pitfalls of Slack?

As with any communication platform, Slack has its share of potential pitfalls:

  1. There are cognitive costs associated with switching to Slack-based course communication, and student buy-in can vary depending on (1) general comfort with technology and (2) the use of Slack within other courses at your institution or department.
  2. Notification settings really matter: students who only use Slack via their browser often miss messages sent between lectures if their email notifications aren’t set. Students who use the Desktop or Mobile applications encounter this issue far less often, but this does require installation of these interfaces.
  3. Since Slack was designed for tech companies rather than for education, it is consequently not FERPA compliant. Thus, certain sensitive communications should not take place on Slack. 
  4. While Slack offers a “freemium” version, it caps access to the most recent 10,000 messages and 5GB of file storage. To exceed these caps, monthly per-user fees must be paid.

When to make the switch

Should you switch to Slack right now (during the COVID-19 pandemic)? Our answer: if you have an existing method that gets the job done, probably not. Switching your communication tool amid the stress currently facing staff, faculty, and students may cause more harm than good. However, you may want to consider the following questions when deciding whether to use Slack in future courses:

  • Do you prefer having your communications centralized and compartmentalized?
  • Are there multiple groups to coordinate within your team: instructors, teaching assistants, graders, students, and various groupings thereof?
  • Are you looking for ways to make communication between students and faculty feel more accessible?
  • Does your course involve collaborating on code, either directly or via GitHub?
  • Do other instructors in your department or institution use Slack?
  • Do you hate email?

As your answers to these questions tend toward yes, the case for Slack gets stronger. At our institution, we have been vocal advocates of using Slack in the classroom. The increased importance of (a)synchronous communication brought on by the COVID-19 pandemic has further reinforced our belief in the benefits that Slack can provide for course communication.


As with any large change in workflow, getting started is often the hardest part. To this end, R. Jordan Crouser has created the following quickstart guide for Slack: Getting Started with Slack for (a)synchronous course-based communication. 

Additionally, for a live demonstration of Slack and many of its useful features, check out this video. The content of much of this post is based on Albert Y. Kim’s 2019 Symposium on Data Science and Statistics talk Using Slack for Communication and Collaboration in the Classroom.

Teaching Programming vs. Training Programmers: Where the Means Justify the Means

Contributing author Jonathan Duggins is a Teaching Assistant Professor in the Department of Statistics at North Carolina State University.


Most of us statistics (and data science!) educators understand that knowing how to use statistical software is integral to the success of our statistics and data science majors, both in their coursework and in their careers. However, in many degree programs, software usage is seen as a means to an end – getting an analysis – rather than an end goal in its own right. How did this come about, why does it matter, and what can we do to change our software-related instruction? These are the questions I discuss below, first by looking at some history of programming in these contexts, then by presenting two current philosophies on how to incorporate programming.


Getting a bachelor’s degree in mathematics has long meant learning computer programming. As statistics degree offerings appeared, they adopted this convention, and the emerging field of data science, with its inherent computational needs, has followed suit. Whether it is Java, C, R (or S-PLUS back in my day!), Python, or SAS, students pursuing a degree in statistics or data science routinely program in at least one of these languages. Unfortunately, this is often enforced by adding a course to an existing degree program. In some cases, this course is merely borrowed from another department and does not meet the students’ discipline-specific needs! Even in relatively new degree programs, our field’s approach to programming seems anachronistic.

We commonly use software in the classroom to help teach a variety of topics – exploring data graphically, computing classical summary or inferential statistics, or conducting a simulation to study the properties of a resampling technique – and these advances in the inclusion of software are often touted when discussing how we have modernized our curricula. However, if students do not build the programming skills necessary to implement and understand these analyses, then software becomes a black box.

Why Does This Matter?

While there are several reasons to revisit how we teach programming, the one I’m focusing on here is that programming is different from most other skills we teach: if a program is inefficient, doesn’t follow good programming practices, or is otherwise sub-optimal, it can still produce correct results! Programming is not just something students should be doing to get an answer. We have an obligation to go beyond teaching students how to write functional code – we must train high-quality programmers. Statistics and data science careers that make extensive use of programming are exceedingly popular in their own right, and as data sets get larger and programming becomes a ubiquitous skill, there is immense value in students being able not only to write code that solves a problem, but also to use best practices when doing so.
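To make the point about sub-optimal programs still producing correct results concrete, here is a small sketch in Python (one of the languages mentioned above). Both functions are hypothetical examples written for this illustration; they return identical running totals, but the first repeats work that the second avoids.

```python
from itertools import accumulate


def running_totals_slow(values):
    """Correct but wasteful: re-adds the whole prefix for every position."""
    totals = []
    for i in range(len(values)):
        # sum() starts over from the first element each time (quadratic work)
        totals.append(sum(values[: i + 1]))
    return totals


def running_totals_fast(values):
    """Same answer in a single pass, using a standard-library tool."""
    return list(accumulate(values))


data = [3, 1, 4, 1, 5]
# Both versions agree, so checking the output alone cannot tell them apart
print(running_totals_slow(data) == running_totals_fast(data))
```

On a five-element list nobody would notice the difference; on a large data set the first version becomes painfully slow, which is why grading only the output misses part of what students need to learn.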

Two Philosophies

Degree programs have typically adopted one of two prevailing philosophies regarding programming instruction: integrated or standalone. Both approaches have advantages and disadvantages that are important for designing an optimal educational experience that prepares our students to write the end-to-end programs they will use in their careers. In this context, I’m defining end-to-end programming as the application of the following three components.

  1. Data cleaning and preparation
  2. Data summary, analysis, and modeling
  3. Reporting/presentation of results

Of course, most programs make use of general computing concepts (e.g. file types, paths, etc.) and not every program needs to employ all three components. However, students trained to write this style of end-to-end program can easily adapt to writing programs that only require one or two of the components.
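As a rough illustration of what an end-to-end program with all three components might look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the raw values are invented, "NA" is assumed to be the missing-value code, and the function names `clean`, `summarize`, and `report` are hypothetical.

```python
import math
from statistics import mean, median


def clean(raw_values):
    """Component 1: convert text to numbers, dropping blanks and bad entries."""
    cleaned = []
    for text in raw_values:
        try:
            number = float(text.strip())
        except ValueError:
            continue  # skips empty strings and codes such as "NA"
        if math.isfinite(number):
            cleaned.append(number)
    return cleaned


def summarize(values):
    """Component 2: compute a few basic summary statistics."""
    return {"n": len(values), "mean": mean(values), "median": median(values)}


def report(summary):
    """Component 3: format the results for a reader."""
    return (
        f"n = {summary['n']}, mean = {summary['mean']:.2f}, "
        f"median = {summary['median']:.2f}"
    )


# Invented raw data with the kinds of problems unsanitized files tend to have
raw = ["12.5", " 14.0", "NA", "", "13.2", "15.1 "]
print(report(summarize(clean(raw))))
```

Keeping each component in its own function also shows how a program that needs only one or two of the components can reuse the same pieces.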

Integrated Instruction

This approach typically focuses only on data summary, analysis, and modeling – concepts used as a means to an end for discipline-specific course content – e.g. a regression course teaching SAS modeling tools such as PROC REG but excluding any programming concepts not explicitly needed to complete the course. The most obvious pedagogical benefit to this approach is instructors can present a programming skill after students are familiar with the statistical concept. A second, but related, benefit of the integrated approach is logistical – students do not need to worry about when to take a programming course because programming is learned in concert with the discipline-specific content.

However, the drawbacks to solely using integrated instruction are substantial. The overarching issue is that students are less likely to gain an appreciation for, or even an understanding of, the general computing principles necessary to be a practicing statistician. One of the primary examples is that the classroom data sets to which students are exposed have already been sanitized, meaning a loss of opportunities to develop skills with reading, cleaning, and restructuring data. It also means a loss of perspective: students are better able to understand the requirements for developing good data collection methods when they are exposed to the results of poorly collected and/or maintained data sets. Additionally, more instructional time is required to teach programming along with discipline-specific content.

Integrated instruction also requires all instructors to teach at least some computing concepts in addition to the course-specific content. Depending on department size, this can be an unrealistic expectation if all faculty are not well-versed in the same language, because learners benefit from being exposed to the same language repeatedly. Exposing them to multiple languages is valuable, of course, but if done without proper structure, students cannot build on what they learned in an earlier course and instructors cannot assume prior knowledge.

Standalone Instruction

I’m defining standalone instruction to mean classes covering the language-specific concepts and any general programming concepts required to effectively use the language. For example, in a SAS course this would mean not only covering SAS concepts but also including path/directory structure, file types/attributes, image resolution, etc. There are two common “flavors” of the standalone course: applications-focused and whole-language. The applications-focused flavor – where material on data summary/analysis/modeling (Component 2) provides students with the skills necessary to carry out discipline-specific analyses needed in their other courses – is similar to the integrated approach above except these analysis tools are all in a single course and some time may be devoted to data cleaning/preparation and reporting/presentation of results (Components 1 and 3, respectively). The whole-language approach provides much less in the way of Component 2 skills and instead focuses on Components 1 and 3 by teaching the analysis software from the computer science perspective by covering syntax, compilation, and good programming practices while including a few basic Component 2 concepts so students get practice writing end-to-end programs.

The applications-focused course suffers from several significant logistical issues – when should students take the course and what should be included? If taken too early, students are unlikely to understand most of the analysis techniques but if taken too late they cannot apply any of the programming skills in their discipline-specific courses, severely limiting the programming course’s utility. To determine the course’s content, instructors need to agree on what skills will be useful throughout the degree program and deviation from that list in later courses also reduces the course’s utility. Additionally, concentrating the analysis in a single course still deprives students of a deeper understanding of the software’s capabilities and operation and is less likely to instill an understanding of good programming practices.

The whole-language approach should still include simple analysis techniques which is both a benefit and a drawback. It removes the logistical barriers because students can take the course earlier in their degree program, but then it derives its usefulness from how programming is emphasized in the remainder of a student’s coursework. If future courses never/rarely require students to use the skills obtained in this early-career course, then its benefits are severely blunted. However, when used properly, the whole-language approach provides a solid foundation onto which students can add skills presented in later classes while lowering the instructor’s burden in those courses.

What Should We Do?

To best educate our students, we should apply both approaches in a way that minimizes drawbacks and maximizes benefits to make sure we are truly training programmers and not just teaching our students to write a program as a means to an end. Because students need to be prepared to write high-quality end-to-end programs, we need to explain to students early in their career what that process looks like. To meet these goals, I propose the following as a starting point for degree programs looking to modernize their approach to teaching programming skills.

  1. Employ an early-career, standalone course using whole-language instruction. Use it to introduce the three components and establish good programming practices.
  2. Use integrated instruction in the same language in multiple future courses, each time assessing the students’ programming skills.
  3. Enforce a common set of good programming practices across all courses.
  4. Apply a common rubric for assessing programming skills across all courses.

Of course, most of us are not in a position to rewrite our department’s curricula or convince our colleagues to teach their courses differently. However, we can all take steps, such as collaborating with whoever teaches your programming course (or proposing a new course!) and choosing to assess programming in our own classes to help our students develop these crucial skills. By building a strong foundation, vertically integrating a programming language into our curriculum, and enforcing good programming practices we can not only produce high-quality data scientists and statisticians, we can also move beyond just teaching programming and start training programmers for the careers that are waiting for them.

Duggins, J. and Blum, J. (2020). The Past, Present, and Future of Training SAS Professionals in a University Program. SAS Global Forum, March 29 – April 1, 2020.

Online Strategies due to COVID-19, Part 2

In this series of posts, the StatTLC blog team describes how we are managing with the abrupt changes to our courses. In this post, we share some of our decisions (and the thinking that went into them), the tools we are using, and tips. We are teaching a diverse set of classes this semester at institutions with many different technology tools. We hope that you find this useful as you make some decisions for your classes moving forward in the time of COVID-19.

Adam’s Situation

Calendar: My institution is on trimesters, so instead of switching a class to an online-only format midstream, we are starting our spring term courses online. We hope to be in person for the second half of the spring, but I doubt that will happen.

Classes: Introduction to data science (undergrad, 34 students), Statistical consulting (undergrad elective, 15 students)

Switched to: Synchronous online classes on 3/11/2020

Technology: Institution uses Moodle, Zoom/Google Hangouts Meet, Panopto; I’ll also use GitHub and Slack for data science.

Adam’s Thoughts

With my “extended” spring break I am making changes to my two courses for online delivery. Both courses require students to use R extensively. In consulting, students work on a group project all term. In data science, students are learning a lot of fundamental ideas and solving problems for homework. I’m a bit worried about the tech requirements for these courses, but there is no way around it without devising completely new courses, which is out of the question. Luckily, my institution already has an RStudio server that students can use remotely. The IT department is also working to get all students internet access and equipment, but who knows how that will go in practice.

My consulting course was supposed to meet once a week for two hours. This allows students to meet with their groups and check in with me. I plan to schedule 30-minute weekly check-ins with each group. I’ll also be using Moodle heavily to guide project progress and have students submit weekly journals where they outline their progress and reflect on assigned readings.

To adapt data science for remote delivery, I plan to do the following:

  • I need to set student expectations from the start and make sure that we are all in this together. I plan to be very transparent, and openly admit there will be technical snafus and unforeseen struggles in this new format. I also really need to beef up my syllabus with a lot of new statements about these expectations.
  • I am going to shamelessly use existing content, recording new lectures via Panopto only when necessary. These videos will be viewed asynchronously. 
  • Like Laura Ziegler (Part I), I plan to make weekly videos with a recap of last week and a look ahead to this week.
  • I’ll use Slack for discussion outside of class and to answer questions during office hours for students who would prefer that type of Q&A platform.
  • “Class time” on Mondays and Fridays will be similar to office hours, where I answer questions and clarify concepts. On Wednesdays, I will have students work in groups. To allow asynchronous work, after the first week students don’t need to “attend” class, but the assignment will be due the next day.
  • I’ll use Zoom for office hours.
  • I’m abandoning tests in favor of case studies, where students will either write a blog post or record a presentation (using Panopto or similar) that they submit to me.

Steve’s Situation

Classes: Biostat Methods II (MS, about 20 students), Data Visualization in the Health Sciences (MS elective/service course, 6 students)

Switched to: Synchronous online classes on 3/11/2020

Technology: Institution uses Canvas and Zoom

Steve’s Thoughts

After teaching a total of 10 online sessions, here are my thoughts/comments:

My courses are all transitioning to synchronous online sessions at their regularly scheduled times through my university’s Zoom license. So far, Zoom has included enough functionality to allow a fairly painless switch to the online setting. Some of my comments may be specific to my university, so I apologize if not all functionality works at your institution. Main topic of each comment is in bold for easier scanning.

  • Scheduling all Zoom class sessions through my course Canvas pages (Zoom Conferences app) has made it simple to provide students with the appropriate meeting links. I set up one recurring meeting for Monday’s sessions and one recurring meeting for Wednesday’s sessions. I believe you can also set a static meeting ID, so in theory students could use the exact same link for all sessions. Scheduling through Canvas automatically notifies students of the meeting times and saves me the time of announcing the link before every class session.
  • When scheduling Zoom meetings, there is an option to automatically record the class session to either the local computer, or if your license allows it, to the cloud. Once we received permission to record to the cloud, this option was strongly preferred because you can set it to automatically transcribe the audio. While I expect most of my students to show up for the live session, I like the option to upload a recording just in case a student is without internet access or runs into technology issues during the scheduled class time. 
  • I set my Zoom meetings to automatically mute all students upon entering to avoid having a bunch of open microphones as people connect. Students are allowed to unmute their audio/video at any time if they have a question or comment (which works fine in my small classes, e.g. < 20 students). Otherwise, students can interact by using the “hand raise” button or the “clap” emoji which looks like a small hand raise. I can see these indicators by keeping my “manage participants” box visible at all times on one of my two monitors. Additionally, I keep the “chat” box open and visible at all times as well in case students are more comfortable typing than speaking. I am considering transitioning to requiring students to use their video, as sometimes I will ask a question and get no response at all. Two-way video may be important for engagement. 
  • Most of my class sessions involve a mix of verbal lecture, presentation slides, code examples, writing on the board, and exercises for students. Zoom’s presentation tools make it possible to continue using these methods. The screen sharing option in Zoom allows me to share my computer screen, which includes my slides and my R session. While sharing, I typically click the “Annotation” tab and use the “spotlight” feature to highlight my mouse. This makes it easier to track as I move it around the screen to “point” at certain things. 
  • Finally, for a virtual whiteboard, Zoom has a few different options. If you have an iPhone/iPad that you are comfortable writing on, you can share your device’s screen by clicking screen share in Zoom and choosing the iPhone/iPad option (instructions provided by Zoom from that point). Personally, I use the touchscreen on my Chromebook to create a digital whiteboard. To do this, I join the Zoom conference on my Chromebook and share my note taking app with the conference (I use Squid). This allows me to write on my Chromebook tablet with a nice stylus and have the result appear to my class in real time (the delay is very minimal).

Laura’s (L.) Thoughts

For anyone who knows me, I’m very much a people person. I say this because I was a little bummed when I realized that I was only teaching online for the 2019-2020 school year. However, in the wake of recent events (and especially because I’m taking on an additional in-person classroom: my kiddos… so is that 2 additional courses?? 🙂), I feel I can offer tips and tricks for delivering an online course.

Tip #1: Communication is key!

While this may come as no surprise, I feel it is even more key in the online environment. Communication includes:

  • Instructions on how to navigate the online course, if they aren’t used to doing so (e.g., course overview and orientation). For example, provide a course structure for the rhythm of each week/unit (see this example from my introductory course). 
  • Updating expectations and (possibly) grading for the alternative mode of instruction. Things to think about: Should students post questions in the Q&A or via email? How will you handle requests for extensions? Who should they contact first if they have questions: instructor or TAs?
  • As Laura Ziegler said, once a week (at the minimum, twice at the maximum) announcements about the week, upcoming assignments, and other important notes. 
  • Providing clear directions on all learning materials.
  • Offering timely, constructive, and frequent feedback on assessments. 
  • Responding to questions or posts in a timely manner (ideally within 12 hours, and no later than 24 hours).

For other tips, see the recent StatTLC post by John Haubrick on instructor presence in the online classroom.

Tip #2: Create collaborative keys via Google Docs for activities.

If you have activities in class, move the activity to a Google Doc and have the students create the answer key as a class (we call them collaborative keys). Then the teaching team (instructors and/or TAs) can monitor the key to make sure the responses are on the right track and pose any additional questions. This offers an asynchronous, but effective, method for delivering active learning materials. 

We have been using this method for a while in our flipped classrooms and for our online courses. It works better in the online environment than in person, and it’s actually a beautiful thing to see. There are discussions among the students, students helping other students out, questions being asked that go beyond the question that was posed, etc. We require students to post at least once (although many go above and beyond that). Here is a document that includes (1) assignment instructions (with a link to an example Google Doc collaborative key for our Week 1 activity) and (2) “How to contribute to the collaborative key” details on how to participate on the key. 

So, if you do have in-class activities, consider using Google Docs to create a community of learners.


This concludes our editor series on transitioning to the online environment in the time of COVID-19. We hope that some of our thoughts and experiences are useful as we all try new things to adapt to the current situation. Feel free to share your thoughts in the comment section below or contact us to contribute a post of your own.

We hope that you and your families remain safe and healthy.

Online Strategies due to COVID-19, Part 1

In this series of posts, the StatTLC blog team describes how we are managing with the abrupt changes to our courses. In this post, we share some of our decisions (and the thinking that went into them), the tools we are using, and tips. We are teaching a diverse set of classes this semester at institutions with many different technology tools. We hope that you find this useful as you make some decisions for your classes moving forward in the time of COVID-19.

Doug’s Situation

Classes: Introductory Statistics (algebra-based, about 80 students), Introductory Probability (calculus-based, about 20 students)

Switched to: Asynchronous classes (with synchronous office hours and review sessions)

Technology: Institution uses Moodle with Collaborate Ultra, videos made with a variety of free and open source software that are posted to YouTube

Doug’s Thoughts

  • The introductory statistics courses I teach have a course coordinator – this has added an unanticipated layer of planning to all changes and discussions. 
  • My campus uses Moodle, which has built-in integrations with Collaborate Ultra. There seem to be two paradigms emerging: 
    • Use Collaborate Ultra and teach synchronously with recordings made available.
    • Make videos for asynchronous learning and then use Collaborate Ultra for office hours. 
  • I’ve chosen to go with asynchronous videos and then use Collaborate Ultra for office hours because I’m already comfortable with making videos. For colleagues with less experience teaching online at my institution, the synchronous approach with recordings seems to feel more accessible from what I’ve been seeing in emails.
  • I make videos using OBS Studio and upload them to YouTube. It is very easy to embed them into Moodle – just drop in a link. Students are familiar with YouTube, and the process is basically painless. (If I need to edit videos, I use OpenShot Video Editor, but with the number of videos I’m making, I’m just going for quick right now.)
  • If I’m making a video of anything other than PowerPoint, I use PenAttention to highlight my cursor. Remember to zoom in on text and applications (e.g. I normally use a small font in RStudio but enlarge it for videos).
  • All of my YouTube videos are unlisted (anyone can watch, but only with a link). I don’t have a good reason for choosing this over making them all public. I make a playlist for each class and add every germane video to it; I share this playlist link with the class often.
  • Some students have logged into my office hours expecting content delivery – establishing norms for virtual office hours seems to be something I need to proactively do. 
  • In changing to online, we’ve also changed our grading schemes. We decided to use two different weighting systems (with students earning the higher grade) because of all the uncertainty surrounding this transition and the lost opportunities for improving grades by doing well on a heavily-weighted final.
  • For many of my students, this is their first online class. Explaining the different options for submitting assignments needs to be explicit. I find myself intentionally repeating the same information in multiple messages and on different platforms (email and Moodle).
  • I’m also teaching an introductory, calculus-based probability course this semester. Many of the students are not math or stats majors and don’t know LaTeX. Being very flexible in terms of how assignments are submitted is key – I’m fully expecting some students to email me photos of their homework, and I’m okay with that. 
  • I’ve decided to try to use Zulip with my probability course (about 20 students). This is similar to the Discord app that Christopher Engledowl talks about in his StatTLC post, but it also supports LaTeX and syntax highlighting for code, and it is open source. There are lots of pros, but the con is that it is less familiar to students than Discord. About ¼ of the class has signed up, but there has been very little use so far (it has only been a day or so). Students seem to be using it mostly for private messaging me rather than for interacting with each other.
  • We were supposed to have a project at the end of the probability course. This is still happening, but recognizing that I will be less able to support some students, I have developed a few “canned projects” for students to do that are essentially some readings and problems on new topics. (I would offer the same thing in a statistics class with a few pre-selected datasets rather than having students find their own data.) I am still encouraging students to pursue a more creative project, but I recognize that that is not likely to happen for everyone this semester.
  • This blog post has been circulating among my colleagues and raises some really interesting points. Since reading it, I’ve resisted calls to poll students about their technology availability (e.g. webcams, scanners, printers, etc.) because a) this was never an expectation for the course and b) I know some don’t have them and I will already have to accommodate that. I’m trying to meet students where they are and be even more flexible than usual right now.
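
The “two weighting systems, with students earning the higher grade” rule mentioned in the list above can be sketched in a few lines of Python. This is only an illustration: the category names, weights, and scores below are hypothetical, since the post does not state the actual schemes used in the course.

```python
def weighted_grade(scores, weights):
    """Return the weighted average of category scores (0-100 scale)."""
    total_weight = sum(weights.values())
    return sum(scores[cat] * w for cat, w in weights.items()) / total_weight


def final_grade(scores, schemes):
    """Grade a student under every scheme and keep the highest result."""
    return max(weighted_grade(scores, weights) for weights in schemes)


# Hypothetical weighting schemes (assumptions, not the course's real ones).
original_scheme = {"homework": 0.20, "midterms": 0.40, "final": 0.40}
adjusted_scheme = {"homework": 0.30, "midterms": 0.50, "final": 0.20}

# Hypothetical student whose final exam score is the weakest component.
student = {"homework": 90, "midterms": 80, "final": 70}

grade = final_grade(student, [original_scheme, adjusted_scheme])
print(round(grade, 1))
```

Because each student’s grade is computed under both schemes and the maximum is kept, no one is penalized by the change in weights; the same function works unchanged if a third scheme is added to the list.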

Laura’s (Z.) Situation

Classes: Introductory Statistics course (6 sections with about 60 students each, course coordinator), Advanced Regression course (about 28 students, half upper-level undergrad stat majors/half MS non-stat majors)

Switched to: online classes (asynchronous and synchronous)

Technology: StatKey and JMP (Introductory Statistics course), R (Advanced Regression course)

Laura’s Thoughts

I am currently teaching 2 courses: an introductory statistics course and an advanced regression course. The student audiences for these courses are very different, and therefore I will teach them online differently. I have been reading an overwhelming number of tips for teaching online, and I am sharing what I have decided will be my best approach to teaching “online in a hurry.”

These are my recommendations for any course, which I will use for both of my courses:

  • Keep it simple, not just for the students’ benefit, but for your sanity as well.
  • Make videos to share with students.
    • Videos should be short for two reasons. First, students will lose interest if they are too long. Second, if you make a mistake, you won’t have to redo as much.
    • Videos should be imperfect. No matter how much of a perfectionist you are, you need to focus on the bigger picture. Don’t worry about having perfect sound quality, saying “um” too much, or having your kids or cat run into the room. Just get it done and out there for students.
  • Try to keep things as similar to what we would do in class with the possible exception of being asynchronous. Students are stressed and if we can keep things similar to what they knew before, that may help relieve stress.
  • Send a detailed weekly checklist to students with recommended dates on when to have videos watched, upcoming due dates, etc…
  • Avoid sending too many emails. We are getting a lot of emails, so we should expect students are also getting a lot of emails. Try not to overwhelm them. Try to write one kind, empathetic, and encouraging email per week, ideally on Mondays, that provides an overview of what is to come along with the checklist.
  • Don’t forget about your TAs; they are nervous too! Have weekly online meetings with them to ask them how they are doing. Give them the opportunity to ask questions not just about the course but also about life in general.

For a large introductory statistics course, I have additional recommendations. For some background information, the introductory statistics course I work with has 6 sections, each with approximately 60 students. I am the coordinator for the course, and therefore have been in charge of getting it ready to be online.

  • Teach asynchronously with videos. Students are across the country, in different time zones, with different access to internet.
  • Provide software output on assignments in case students do not have access to software.

For my advanced regression course, I have 28 students. Approximately half are upper-level undergraduate students and the other half are Masters-level, non-statistics students.

  • I have spent a lot of time talking to myself while creating the online videos for my introductory statistics students, and I am missing the live aspect of teaching. I am planning to do synchronous teaching for students who want to attend. I will record the lecture during that time and will post it for students who choose not to attend. This is going against most of the recommendations I have seen, but I am going to give it a go anyways!

My plans are not perfect, and will likely change after the first week of online teaching, but that is OK! Be honest with your students and they will appreciate the effort you go through to help them through this challenging time.

Adapting Statistics Instruction for an Online Environment in the Wake of COVID-19

Contributing author Christopher Engledowl is an Assistant Professor of Mathematics Education and Quantitative Research Methods at New Mexico State University.

The world is currently experiencing unprecedented forced movement from face-to-face interaction to a completely virtual form of interaction. Higher education institutions have quickly made sweeping policy decisions that have, overnight, overhauled the classroom learning environment. These decisions have resulted in many people questioning the kinds of quality that can be expected, especially from instructors who have never taught an online course. Simultaneously, many organizations have expanded the capacity of their digital platforms to accommodate the surge of people making use of their products for teaching and learning.

For instance, Discord—an application with free voice and text chat originally designed for gamers to interact in real time with one another, read more here—recently increased the capacity for live streaming for up to 50 people for the sole purpose of making it more amenable to online instruction. They also published a blog post about how to use Discord for instructional purposes, including a special pre-organized setup to help streamline the interface for new users.

Like many others, my institution has recently dictated the movement of face-to-face courses to an online setting in order to practice social distancing and follow government recommendations designed to slow the spread of COVID-19. I am currently in the process of transitioning my face-to-face courses to an online format, and I am making use of Discord. In this article, I will showcase how I made use of Discord in a prior online course, what I observed about student interactions, and what students reported about their experiences. I believe Discord can be an effective, and easy to implement, tool for creating quality discourse.

Some Context: Advanced Statistics in College of Education

In the Summer of 2019, I taught a required doctoral level course for an online-only program in a College of Education called Advanced Statistics. This course is comparable to a typical undergraduate level introductory course in statistics. In the prerequisite course, students are exposed to basic descriptive statistics and visualizations, leading up to a two-sample t-test. Advanced Statistics extends this learning to include ANOVA, ANCOVA, and simple linear regression. The students range in age from 25 to 50, and their only experience with statistics is the prerequisite course; I have had students tell me that they had never seen a boxplot before! The course is application-based and ends with a small-scale project where students explore their own research question using either their own data or data from the 2012 PISA, which is used throughout the course.

Because it was an online course, in addition to including the kinds of instructor presence and interaction that have been discussed by John Haubrick on this blog, I also was asking myself: How do I emulate the important student-to-student interaction that would occur in a synchronous, face-to-face setting, as suggested by the Guidelines for Assessment and Instruction in Statistics Education (GAISE) College Report?

Using Discord to Promote Quality Interaction

What is Discord?

Discord is an application designed with both text and voice chat “channels” within a “server.” The server is the larger space that only invited members can interact within, and it is composed of channels where members chat. Discord text chat channels are free-flowing chat streams. This was highly appealing to me because threaded comments can have the side-effect of conversations only existing in small groups, without ever making their way into the larger classroom discussion. Moreover, because Discord has integrated tagging (using the @ symbol), it makes it easy for everyone to see who is having side conversations, while also allowing others to enter into that same conversation—thus promoting them to the whole group level.

How Can it Promote Discussion?

In the Advanced Statistics course, because the majority of students were teachers and administrators in K–12 schools, this course was managed largely asynchronously. I used Discord as the central place for managing the classroom environment. I regularly posted links to videos on my YouTube Channel and I also posted announcements for the few times I conducted livestreams using a free application called Twitch. Thus, Discord was used for nearly all student-student and student-instructor interaction. To encourage students to get to know one another more, and make discussions feel more authentic, I set up a #general chat channel for them to discuss anything they wanted to, where I would not be monitoring unless I was tagged. I also created a #current-music-jam and #current-reading-interest channel. Many students made use of these channels at different points, and I also shared my own reading and music interests.

The course-specific channels I created to align with course learning goals were #roller-coaster-tycoon (to discuss assignments related to this dataset), #spss-discussion, #research-interests, and #p-values. Channels are very easy to add at a moment’s notice, but these seem to have sufficed for the course and were meant to work toward the GAISE recommendations to “foster active learning” and to “use technology to explore concepts and analyze data” (p. 3). It also supported students as we worked toward the GAISE goals regarding the investigative process, understanding statistical models, and understanding inference (p. 6). Discord supported these goals by providing a space where students could engage in productive discourse with one another to deepen their knowledge. For instance, as can be seen in the two screenshots below, students frequently inserted screenshots of output they were trying to make sense of in order to crowd source whether their interpretations were correct. I largely stayed out of these conversations, and found that many students would enter into the conversations, resulting in dialogic interactions that everyone learned from. Sometimes these conversations would head in an unproductive direction, but because I could see the entire conversation—unlike if it was occurring outside of class or in a group discussion in a face-to-face class where it might be difficult to hear what occurred—I could point directly back to the conversation using tagging—or even just tagging @everyone—or a screenshot and help steer students in a productive direction. This cannot be overstated: Being able to see entire conversations in this way is an incredible advantage over face-to-face discussions. It allows, in a sense, an omniscient perspective—one that everyone in the entire class also enjoys the benefits of.

How Can it Improve Instruction?

On the administrative side of teaching, to encourage participation, I evaluated both the quantity and quality of contributions using a simple rubric. Searching for a student’s username produces a list of every contribution they have made, along with a timestamp and the place in the chat stream where the comment was made—faded in the background. You can also filter contributions by channel. To see the contributions in the context of where they occurred in the chat, a simple click on the background reveals them. This process was very efficient, taking about 2-3 minutes per student.

What Did Students Say About Discord?

In an anonymous poll, I asked students: How useful did you find Discord? On a scale of 1 (Not At All) to 5 (Very Useful), 54% rated it a 5 and 23% rated it a 4. No one rated it below a 3. A follow up item asked students what they liked about Discord, and many responses were things like, “it is very interactive and we’re free to ask any questions we need at any given time” and “nice interactions that you could follow.” When asked what they did not like about Discord, students described issues that would exist even if the courses were face-to-face (e.g., “I couldn’t work ahead because I had to be involved in discussions”). Other responses were simple complaints (e.g., “Another platform”).

Perhaps even more revealing were the comments students gave when asked to compare ways of interacting in the Canvas learning management platform vs. Discord. Many responses included statements such as “Discord is best for discussion in real time” or “Discord seemed easier to follow than discussions on Canvas (and I teach online using Canvas)….seems less formal and more able to operate like a texting stream.” I will leave with this last revealing comment. When discussion turns into “I’ll do it just for the grade,” we have lost a major opportunity to promote deep learning through social interaction:

Discord is best for discussion in real time. Canvas is ok, more for turning in work and such—Posting discussion board responses and getting feedback, it is more assignment based.

Hello, is anyone there? Instructor presence in an online statistics course

Contributing author John Haubrick is an instructional designer and assistant teaching professor for the Penn State Department of Statistics where he supports the teaching and design of the online statistics courses.

With the prevalence of online chat bots and robocalls, we sometimes find ourselves asking: “Are you a machine or a real person?” Students can also experience this when taking an online course with an “absent” instructor. Instructor presence in an online course has been cited in research as a major influence of student satisfaction and engagement, which may impact their ability to learn the course content (e.g., Ladyshewsky, 2013; Gray and DiLoreto, 2016). So what can we do to “show up” to class as an online statistics instructor?

The Community of Inquiry (CoI) framework (Garrison, Anderson, & Archer,  2000) is one model used to classify the types of instructor presence for a rich educational experience. The framework is based on three types of presence: Social, Teaching, and Cognitive. You can find a large collection of publications and resources related to the CoI framework on the CoI website. In the model, the entire educational experience is the result of the interrelationship (or overlap) of the social, teaching and cognitive presence. Let’s explore each presence and how they might apply to an online statistics course.

Social Presence

Social presence shows that you are a real human teaching the students. Examples of incorporating social presence in an online course include…

Start of the course

  • Post an introductory video to put your face and voice with a name. Share your interests, hobbies, research, and the keys to success in your course.
  • Use an introduction forum to allow everyone to share a thing or two about themselves. Make the prompts interesting and provide various format options. Educational social media platforms like Flipgrid and Yammer provide text, audio, and video options beyond the standard text based discussion boards.

Throughout the course

Teaching Presence

Teaching presence refers to the technical set-up and design of the learning management system and the design of the learning materials that the students engage with (e.g., content, activities, assessments). Examples of integrating Teaching Presence in an online course include…

  • Provide clear directions on how to get started the first time they enter the course.
  • Have contact information, resources, and links for finding help and support. This includes technical support, resources for statistics software, and who to contact (e.g., TAs, instructors, other) and when.
  • Create navigation through the course that is clear and optimized for efficiency.
  • Make expectations and directions clear, thorough, and concise on all learning materials.
  • Offer timely, constructive, and frequent feedback in a variety of formats (text, audio, video). Your LMS might offer built-in or integrated media tools, such as Zoom, VoiceThread, Kaltura or YouTube.  

Cognitive Presence

The cognitive presence determines how students create meaning of the course content. Through activities, assignments, and discussions, the instructor can challenge and lead students through the content. Examples of creating Cognitive Presence in an online course include…

  • Create a reflection journal where students can make their thinking visible. For example…
    • You could set up a 3-2-1 post, where they post: 3 key concepts of the lesson, 2 ways in which they can apply the concept to their life, work, or future career, and 1 challenge or difficulty they are still having.
  • Provide lesson overview videos connecting the new content to prior knowledge or previous lessons. Demonstrate how the new content fits into the big picture of the course (or program). 
  • Have students “make sense” of output from statistical software or results from a research article by asking questions about conceptual understanding rather than procedural knowledge.
  • Have students spot errors in worked examples that might include incorrect calculations, equations, software code, software output, or hypothesis testing conclusions. 

The examples provided are just a sample of the myriad of options available for creating presence in an online course. However, THE most important thing is to show up! Your presence is important. Presence can create a positive learning community that will not only motivate your learners, but you as well.



Ladyshewsky, Richard K. “Instructor Presence in Online Courses and Student Satisfaction.” International Journal for the Scholarship of Teaching and Learning, vol. 7, no. 1, Jan. 2013. (Crossref), doi:10.20429/ijsotl.2013.070113.

Garrison, D. Randy, et al. Critical Inquiry in a Text-Based Environment: Computer Conferencing in Higher Education. 1999. Semantic Scholar, doi:10.1016/S1096-7516(00)00016-6.

Gray, Julie A., and Melanie DiLoreto. “The Effects of Student Engagement, Student Satisfaction, and Perceived Learning in Online Learning Environments.” International Journal of Educational Leadership Preparation, vol. 11, no. 1, May 2016. ERIC,

Developing Statisticians in Intermediate Statistics Courses Through an Applied Project

Contributing author Krista Varanyak is a lecturer at the University of Virginia and an Ignite Scholar.

The field of statistics education tends to focus heavily on introductory courses: How can we engage students who typically struggle in math-based courses? How can we develop statistical consumers? How can we prepare students to be successful beyond introductory courses? However, little literature and few resources have been shared about teaching intermediate courses. In many cases, the intermediate courses are designed for students working towards a statistics degree who are learning to be statistical producers. Overall, the goal of these courses, and the statistics major as a whole, is to produce students who will enter the workforce as statisticians. Therefore, it is imperative that students in these intermediate courses develop the fundamental practical and interpersonal skills that are required to be a working statistician. Some of these skills include: comparing various analysis techniques to select the appropriate procedure, learning a new concept independently, applying the technique to data using statistical software, and communicating findings formally, either in a written report or in an oral presentation.

For the last three years, I have been responsible for teaching one of the required intermediate courses for statistics majors at the University of Virginia (UVA). Before then, my focus had been on the best teaching practices for introductory courses. I spent the majority of my time in graduate school studying the GAISE Report and reading literature on introductory statistics students’ understanding of various concepts. When I learned I would be teaching intermediate courses, I was concerned about how I would develop course materials since there were limited resources on teaching these courses. Thankfully, I was handed a syllabus and some content from the previous instructor, but then the semester quickly started and I did not have time to make the course my own. I didn’t know what the course goals should be, what my students were capable of doing, or how I should assess them. This began my three-year development of STAT 3220: Introduction to Regression Analysis. Through trial and error, studying student patterns, and reviewing the ASA curriculum guidelines, I have developed a course that meets students’ needs and encourages them to develop the fundamental practical and interpersonal skills that are required to be a working statistician. One way this goal is achieved and assessed is through a final group project.


At UVA, the only prerequisite for STAT 3220 is an introductory statistics course, so it is comparable to “Statistics II” at other universities. Neither linear algebra nor calculus is a required prerequisite. Therefore, the curriculum of this course focuses more on application than theory. The idea for this project began with the realization that there were too many topics to cover in one semester of a regression course and that there did not appear to be an adequate place in the curriculum to develop a new course. That concern, paired with the desire for students to learn and apply an analysis technique independently, became the foundation of this final project. For the project, students work in groups of 3-4 to learn a topic that was not covered in our syllabus. Then students find an appropriate data set that can be analyzed using the new technique. Finally, students analyze the data using the technique, present their findings to the class in an oral presentation, and submit a formal written report.


To select their topic, students are given a list of level-appropriate techniques, then have a few days to review the topics and select which they would like. Example topics include: Poisson Regression, Survival Analysis, Time Series Regression, and LASSO. Groups are assigned topics on a first-come, first-served basis, and most groups end up with their first or second choice. After their topic is selected, groups have approximately six weeks to complete the project. For about 3 of those weeks, class time is devoted primarily to continuing the syllabus content, with about 1-2 days where students can work exclusively on the project. The remaining class time is spent solely on the project, peer review, and presentations.

Before submitting a final report, students are required to submit a proposal. The purpose of the proposal is for students to demonstrate they understand their technique. They are asked to write about the advantages and disadvantages of the technique, compare the technique to something we have covered in class, and write why their data are appropriate for the technique. During this time, I allow groups to sign up to meet individually with me.

The final written report includes: a research question to be answered, methodology of the technique, applied analysis, and results with conclusion. To write their reports, students are required to cite at least three sources in the methodology section and at least one source to support their research question. In this course, students complete a project earlier in the semester, so they are somewhat comfortable with report writing. If this is the only project for the course, it may be wise to establish general requirements for these sections. 

Finally, students present their findings. In my course, the goal is for students to be able to present to an audience who is unfamiliar with their concept, not teach the concept. Students have about 10 minutes to give a PowerPoint presentation. To keep students focused on listening to the presentations, all students are required to evaluate two other groups. This semester, however, my class is much larger, so instead of PowerPoint presentations, there will be a poster session. Other students in the class will review posters, just as they would have done for in-class presentations.


One concern when assigning group work in any course is deciding how groups will be selected. I have tried many different ways to form groups and, without fail, no matter how groups are formed, there will be issues. However, I do not think it is appropriate to remove group assignments from a course. When students graduate, they will need the interpersonal skills of working in a group: communication, leadership, and conflict resolution. Helping them through the process is a better way to prepare them than removing group work completely. One way I have found to alleviate tension and unequal contributions from members is to require groups to fill out, sign, and submit a group contract at the start of the project. This allows students to establish expectations and have a clear plan in place if expectations are not met. It also gives the instructor a point of reference if conflict does arise.

This project can be adapted to any intermediate or advanced course where there is not enough time to cover all of the topics that are available, which most educators might agree is all of them. This project was extended in an advanced level course at UVA by another instructor, where the students not only presented their findings, but also taught a 30-minute lesson on their new topic and were required to create notes and worksheets for their peers. Finally, there is flexibility on how an instructor wants to assess communication/presentation skills: written reports, oral presentations, poster presentation, podcasts, recorded lessons, and infographics are all great ways to do so.

Visual Inference: Using Sesame Street Logic to Introduce Key Statistical Ideas

As outlined by Cobb (2007), most introductory statistics books teach classical hypothesis tests as

  1. formulating null and alternative hypotheses, 
  2. calculating a test statistic from the observed data, 
  3. comparing the test statistic to a reference (null) distribution, and 
  4. deriving a p-value on which a conclusion is based.

This is still true for the first course, even after the 2016 GAISE guidelines were adapted to include normal- and simulation-based methods. Further, most textbooks attempt to carefully talk through the logic of hypothesis testing, perhaps showing a static example of hypothetical samples that go into the reference distribution. Applets, such as StatKey and the Rossman Chance ISI applets, take this a step further, allowing students to gradually create these simulated reference distributions in an effort to build student intuition and understanding. While these are fantastic tools, I have found that many students still struggle to understand what the purpose of a reference distribution is and the overarching logic of testing. To remedy this, I have been using visual inference to introduce statistical testing, where “plots take on the role of test statistics, and human cognition the role of statistical tests” (Buja et al., 2009). In this process, I continually encourage students to apply Sesame Street logic: which one of these is not like the other? By using this alternative approach that focuses on visual displays over numerical summaries, I have been pleased with the improvement in student understanding, so I thought I would share the idea with the community.

Visual inference via the lineup protocol

In visual inference, the lineup protocol (named after “police lineup” for criminal investigations) provides a direct analog for each step of a hypothesis test. 

  1. Competing claims: Similar to a traditional hypothesis test, a visual test begins by clearly stating the competing claims about the model/population parameters. 
  2. Test statistic: A plot displaying the raw data or fitted model (we’ll call it the observed plot) serves as the “test statistic” under the visual inference framework. This plot must be chosen to highlight features of the data that are relevant to the hypotheses in mind. For example, a scatterplot is a natural choice to examine whether or not there is a correlation between two quantitative variables, but will be less useful in the examination of association between a categorical and a quantitative variable. In that situation, side-by-side boxplots or overlaid density plots are more useful.
  3. Reference (null) distribution: Null plots are generated consistently with the null hypothesis and the set of all null plots constitutes the reference (or null) distribution. To facilitate comparison of the observed plot to the null plots, the observed plot is randomly situated in the field of null plots, just like a suspect is randomly situated amongst decoys in a police lineup. This arrangement of plots is called a lineup.
  4. Assessing evidence: If the null hypothesis is true, then we expect the observed plot to be indistinguishable from the null plots. If you (the observer) are able to identify the observed plot in the lineup, then this provides evidence against the null hypothesis. If one wishes to calculate a visual p-value, then lineups need to be presented to a number of independent observers for evaluation. While this is possible, it is not a productive discussion in most intro stats classes that don’t do a deep dive into probability theory.
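
For readers who like to see the mechanics, the steps above can be sketched in a few lines of code. This is a minimal plain-Python sketch, not the nullabor implementation. It assumes the null data sets are generated by shuffling group labels, which is one way to stay consistent with a null hypothesis of no association. The function name `make_lineup` and the toy scores are hypothetical and invented purely for illustration.

```python
import random

def make_lineup(scores, labels, n_plots=20, seed=None):
    """Return (panels, observed_position) for a lineup.

    Each panel is a list of (label, score) pairs ready to be plotted.
    One panel holds the observed pairing; the other n_plots - 1 panels
    are null data sets made by shuffling the labels, which breaks any
    association between label and score.
    """
    rng = random.Random(seed)
    panels = []
    for _ in range(n_plots - 1):
        shuffled = list(labels)  # copy so the observed labels stay intact
        rng.shuffle(shuffled)
        panels.append(list(zip(shuffled, scores)))
    # Hide the observed data at a random (0-based) position among the nulls.
    observed_position = rng.randrange(n_plots)
    panels.insert(observed_position, list(zip(labels, scores)))
    return panels, observed_position

# Toy values invented for illustration (not data from any real study).
toy_scores = [12, 15, 19, 22, 24, 11, 13, 17, 18, 20]
toy_labels = ["intrinsic"] * 5 + ["extrinsic"] * 5

panels, position = make_lineup(toy_scores, toy_labels, n_plots=20)
print("Observed data are hidden in panel", position + 1, "of", len(panels))
```

Each panel would then be drawn with the same plot type, so that the only thing distinguishing the observed panel is the data itself.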


As a first example in class, I use the creative writing experiment discussed in The Statistical Sleuth. The experiment was designed to explore whether creativity scores were impacted by the type of motivation (intrinsic or extrinsic). To evaluate this, creative writers were randomly assigned to a questionnaire where they ranked reasons they write: one questionnaire listed intrinsic motivations and the other listed extrinsic motivations. After completing the questionnaire, all subjects wrote a Haiku about laughter, which was graded for creativity by a panel of poets. Below, I will give a brief overview of each part of the visual lineup activity.

Competing claims

First, I have my students discuss what competing claims are being investigated. I encourage them to write these in words before linking them with the mathematical notation they saw in the reading prior to class. The most common answer is: there is no difference in the average creative writing scores for the two groups vs. there is a difference in the average creative writing scores for the two groups. During the debrief, I make sure to link this to notation:


H_{0}:\mu_{intrinsic} - \mu_{extrinsic} = 0

H_{A}:\mu_{intrinsic} - \mu_{extrinsic}\neq 0

EDA review

Next, I have students discuss what plot types would be most useful to investigate this claim, reinforcing topics from EDA.

Lineup evaluation

Most students recognize that side-by-side boxplots, faceted histograms, or density plots are reasonable choices to display the relevant aspects of the distribution of creative writing scores for each group. I then give them a lineup of side-by-side boxplots to evaluate (note I place a dot at the sample mean for each group), such as the one shown below. Here, the null plots are generated by permuting the treatment labels, thus breaking any association present between the treatment and creativity scores. (I don’t give the students these details yet; I just tell them that one plot is the observed data while the other 19 agree with the null hypothesis.) I ask the students to

  1. choose which plot is the most different from the others, and
  2. explain why they chose that plot.

[Your turn! Try it out yourself! Which one of these is not like the other?]

Lineup discussion

Once all of the groups have evaluated their lineups and discussed their reasoning, we regroup for a class discussion. During this discussion, I reveal that the real data are shown in plot 10, and display these data on a slide so that we can point to particular features of the plot as necessary. After revealing the real data, I have students return to their groups to discuss whether they chose the real data, and whether their choices support either of the competing claims. Once the class regroups and thoughts are shared, I make sure that the class realizes that an identification of the raw data provides evidence against the null hypothesis (though I always hope students will be the ones saying this!).

Biased observers

When I first started this activity, I showed the students the real data prior to the lineup, which made them biased observers. Consequently, students had an easier time choosing the real data, and the initial discussion within their groups wasn’t as rich. However, I have seen little impact on the follow-up discussion focusing on whether the data are identifiably different and what that implies. 

Benefits of the lineup protocol

The strong parallels between visual inference and classical hypothesis testing make it a natural way to introduce the idea of statistical significance without getting bogged down in the minutiae/controversy of p-values, or the technical issues of describing a simulation procedure before students understand why that’s important. All of my students understand the question “which one of these is not like the others,” and this common understanding has generated fruitful discussion about the underlying inferential thought process without the need for a slew of definitions. In addition, after this activity I find it easier to discuss how we generate permutation resamples and conduct permutation tests, because students have seen permutations in lineups and have already thought about evaluating evidence.
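
When the class reaches the follow-up discussion of permutation tests, the same label shuffling used for the null plots can be turned into a numerical summary. The sketch below is one possible plain-Python version, assuming a two-sided test on the difference in group means. The function names and the toy scores are hypothetical, and the "+1" adjustment in the p-value is one common convention rather than the only option.

```python
import random
from statistics import mean

def diff_in_means(scores, labels, group_a, group_b):
    """Mean score of group_a minus mean score of group_b."""
    a = [s for s, g in zip(scores, labels) if g == group_a]
    b = [s for s, g in zip(scores, labels) if g == group_b]
    return mean(a) - mean(b)

def permutation_p_value(scores, labels, group_a, group_b,
                        n_resamples=5000, seed=None):
    """Two-sided permutation p-value for a difference in means.

    Labels are shuffled n_resamples times; each shuffle gives one draw
    from the reference (null) distribution of the statistic.
    """
    rng = random.Random(seed)
    observed = diff_in_means(scores, labels, group_a, group_b)
    shuffled = list(labels)  # work on a copy of the labels
    as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        resampled = diff_in_means(scores, shuffled, group_a, group_b)
        if abs(resampled) >= abs(observed):
            as_extreme += 1
    # Count the observed arrangement itself so the p-value is never 0.
    return observed, (as_extreme + 1) / (n_resamples + 1)

# Toy values invented for illustration (not data from any real study).
toy_scores = [12, 15, 19, 22, 24, 11, 13, 17, 18, 20]
toy_labels = ["intrinsic"] * 5 + ["extrinsic"] * 5

observed, p_value = permutation_p_value(
    toy_scores, toy_labels, "intrinsic", "extrinsic", seed=2024
)
print("Observed difference in means:", observed)
print("Approximate permutation p-value:", p_value)
```

The loop mirrors the lineup: each shuffle plays the role of one null plot, and the p-value asks how often a null data set looks at least as extreme as the observed one.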

Where would this fit into your course?

As you’ve seen, I use the lineup protocol in my intro stats course to introduce the logic behind hypothesis tests. 

In addition, I use visual inference to help students build intuition about new and unfamiliar plot types, such as Q-Q plots, mosaic plots, and residual plots. For example, when I introduce students to mosaic plots using the flying data set in the fivethirtyeight R package, I pick one pair of categorical variables, such as one’s opinion on whether it’s rude to bring a baby on a plane and their self-identified gender. Then, I have students create the mosaic plot and discuss what they see. Once they have recorded their thoughts, I provide a lineup consisting only of null plots (i.e., no association) and have them compare their observed plot to the null plots, discussing what this tells them about potential association.

How to create lineups for your classes

The nullabor R package makes creating lineups reasonably painless if you understand ggplot2 graphics. I’ve created a nullabor tutorial to help you create lineups for your classes, and am almost done with shiny apps to implement lineups in a variety of settings.

How Do We Encourage “Productive Struggle” in Large Classes?

Contributing author Catherine Case is a lecturer at the University of Georgia and the lesson plan editor for Statistics Teacher.

This post is really inspired by a plenary talk given by Jim Stigler at USCOTS 2015. He’s a psychologist at UCLA, and in his USCOTS talk, he emphasized the idea of productive struggle. He talked about different teaching cultures around the world, and how American classrooms often feature “quick and snappy” lessons as opposed to “slow and sticky” lessons, despite the fact that making the process of learning harder can actually lead to deeper, longer-lasting understanding.

His ideas really challenged me, because I often teach fairly large classes (120 – 140 students per section), and nowhere is “quick and snappy” more highly valued than in a large lecture. There’s definitely tension in large classes between efficiency and productive struggle. 

Efficiency | Productive Struggle
Statistical questions are clearly defined in the textbook. | Students carry out the full problem-solving process.
Teacher solves all problems (correctly and on the first try). | Students wrestle with concepts before strategies are directly taught.
Students use formulas and probability tables proficiently. | Students use appropriate data analysis tools.

At first, this tension was overwhelming to me. In the stat ed community, we’re surrounded with inspiring, innovative ideas, but the gap between where we are and where we want to be can be paralyzing. To counter that, let’s start small with a simple classroom activity that allows students to struggle through the statistical process. Along the way, I’ll mention tricks that make it easier to pull off, even with lots of students in the room.

Example: A Survey of the Class

Formulate Questions

This activity is great for the beginning of the semester, because it only requires knowledge of a few statistical terms – statistical vs. survey questions, explanatory vs. response variables, categorical vs. quantitative variables. It also challenges students’ expectations about what’s required of them in a large lecture class, because right off the bat, they’re being asked to collaborate and communicate their statistical ideas.  

  • First, students work in groups to write a statistical question about the relationship between two variables that can be answered based on a class survey. Then they pass their card to another group.
  • After receiving another group’s card, students break down the statistical question into variables. Which is the explanatory variable and which is the response? Are these variables categorical or quantitative? Then they pass their card to another group.
  • Students write appropriate survey questions that could be used to collect data – one survey question per variable. 

I’ll admit that in many of my lessons, I have a well-defined statistical question in mind before class even starts. This activity is different, because students experience the messy process of formulating a statistical question and operationalizing it for a survey. 

Collect Data

Before the next class period, I read their work (or at least a “random sample” of their work ☺) and I try to close the feedback loop by discussing common issues that I noticed. Do some questions go beyond the scope of a class survey? Are certain kinds of variables commonly misclassified? How can we improve ambiguous survey questions? Even though my class is too large to talk to every student individually, this gives me an opportunity to respond to and challenge student thinking. 

Later we can use student-written questions as the starting point for data collection and analysis. I usually choose 10-15 survey questions (ideally relevant to more than one statistical question), and collect their data via Google Forms. When students answer open-ended questions like, “How many hours do you spend studying in a typical week,” it generates data that’s messy but manageable. It feels more authentic than squeaky clean textbook data, plus the struggle of cleaning a few hundred observations by hand may help students understand the need for better data cleaning methods.
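
To give a flavor of what "better data cleaning methods" might look like once hand-cleaning gets tedious, here is a small sketch in Python. It is only an illustration under stated assumptions: the cleaning rules (a lone number is kept, a range such as "10-12" becomes its midpoint, anything else is flagged for a human to review) are my own hypothetical choices, and the example responses are made up rather than taken from a real class survey.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"                      # e.g. "7" or "7.5"
SINGLE = re.compile(NUMBER)
RANGE = re.compile(rf"({NUMBER})\s*(?:-|–|to)\s*({NUMBER})")

def parse_hours(response):
    """Convert a free-text answer to hours, or None if it needs review.

    Assumed rules (for illustration only):
      - exactly one number in the text -> use it
      - a range like "10-12" or "10 to 12" -> use the midpoint
      - anything else -> None, so a person can look at it
    """
    text = response.strip().lower()
    numbers = SINGLE.findall(text)
    range_match = RANGE.search(text)
    if range_match and len(numbers) == 2:
        low, high = map(float, range_match.groups())
        return (low + high) / 2
    if len(numbers) == 1:
        return float(numbers[0])
    return None

# Made-up responses of the kind an open-ended survey question can produce.
responses = ["10", "about 15 hrs", "10-12", "7.5", "a lot", "2 hours a day, 7 days"]
for answer in responses:
    print(repr(answer), "->", parse_hours(answer))
```

Answers that come back as None are exactly the ones worth discussing in class: how should an ambiguous response be handled, and who gets to decide?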

Analyzing Data Using Appropriate Tools

“Appropriate tools” certainly aren’t one-size-fits-all, but for this activity, I need a tool that…

  • Can handle large(ish) datasets 
  • Is accessible for students – preferably free!
  • Makes it easy to construct graphs and calculate summary statistics

At UGA, we have a site license that makes JMP free for students, and many regularly bring their laptops to class, so JMP works well for us with students working in pairs. If I didn’t have access to JMP, I might consider CODAP, which looks a lot like Fathom (friendly drag and drop interface!) except it’s free and runs in a web browser. 

Speaking of a friendly interface, another hurdle in a large class is how to troubleshoot technology for students, especially if you don’t have smaller “lab” sections or TA support during class. For me, it’s a delicate balance of scaffolding and classroom culture…

After demonstrating how to construct graphs and calculate summaries using software, I assign some straightforward data analysis questions with right/wrong answers. For this, I use an app called Socrative, which works similarly to clickers, except that it allows for both multiple choice and free response questions. Socrative allows me to give immediate feedback – for example, if they miss a question, I can provide them with the software instructions they need. In addition to feedback through Socrative, I try to normalize the process of struggling with new technology and encourage them to help each other. I remind them it’s impossible for me to help everyone individually, but I’m confident they can work together and solve most problems without me. Students generally rise to the challenge and accept that there are multiple sources of knowledge in the room.  

Once I’m confident students know how to use the necessary data analysis tools, we can try more challenging, open-ended questions. For example, I may choose a response variable and ask students to explore the data until they find a variable that’s a good predictor, then write a few sentences about that relationship. They need to use graphs and calculate statistics to answer this, but I’m not explicitly telling them which graphs and statistics to use, and I’m certainly not giving them “point here, click here” style instructions. There’s a little productive struggle involved!

Interpret Results in Context

In the following class, I present student analyses as a starting point for our interpretations. They already have a foundation for discussing effect sizes and strength of evidence, because they’ve considered the relationships among variables themselves. Students can offer deep insights about the limitations of the analysis (e.g., sampling issues, measurement issues, correlation vs. causation), because they’ve been involved with the investigation at every stage. 

Look Back and Ahead

The authors of the ISI curriculum (Tintle et al.) include “look back and ahead” as the final step of the statistical process. At this step, students consider limitations of the study and propose future work.

This concept is really helpful in my teaching too. Earlier I mentioned students’ expectations, but I’m also working on managing my own expectations. I can’t let the idea of a perfect active learning class keep me from taking steps in the right direction. I don’t have to change everything in one semester and I can’t expect every activity I try will work. The best I can do is to make a few small changes right now, keep a journal to learn from my experiences, and keep moving forward.