Developing Statisticians in Intermediate Statistics Courses Through an Applied Project

Contributing author Krista Varanyak is a lecturer at the University of Virginia and an Ignite Scholar.

The field of statistics education tends to focus heavily on introductory courses: How can we engage students who typically struggle in math-based courses? How can we develop statistical consumers? How can we prepare students to be successful beyond introductory courses? However, little literature and few resources are shared about the teaching of intermediate courses. In many cases, the intermediate courses are designed for students working towards a statistics degree who are learning to be statistical producers. Overall, the goal of these courses, and the statistics major as a whole, is to produce students who will enter the workforce as statisticians. Therefore, it is imperative that students in these intermediate courses develop the fundamental practical and interpersonal skills that are required to be a working statistician. Some of these skills include: comparing various analysis techniques to select the appropriate procedure, learning a new concept independently, applying the technique to data using statistical software, and communicating findings in a formal report, whether written or oral.

For the last three years, I have been responsible for teaching one of the required intermediate courses for statistics majors at the University of Virginia (UVA). Before then, my focus had been on the best teaching practices for introductory courses. I spent the majority of my time in graduate school studying the GAISE Report and reading literature on introductory statistics students’ understanding of various concepts. When I learned I would be teaching intermediate courses, I was concerned about how I would develop course materials since there were limited resources on teaching these courses. Thankfully, I was handed a syllabus and some content from the previous instructor, but then the semester quickly started and I did not have time to make the course my own. I didn’t know what the course goals should be, what my students were capable of doing, or how I should assess them. This began my three-year development of STAT 3220: Introduction to Regression Analysis. Through trial and error, studying student patterns, and review of the ASA curriculum guidelines, I have developed a course that meets students’ needs and encourages them to develop the fundamental practical and interpersonal skills that are required to be a working statistician. One way this goal is achieved and assessed is through a final group project.

Overview

At UVA, the only prerequisite for STAT 3220 is an introductory statistics course, so it is comparable to “Statistics II” at other universities. Neither linear algebra nor calculus is a prerequisite. Therefore, the curriculum of this course focuses more on application than theory. The idea for this project was initiated with the realization that there were too many topics to cover in one semester of a regression course and that there did not appear to be an adequate place in the curriculum to develop a new course. That concern, paired with the desire for students to learn and apply an analysis technique independently, became the foundation of this final project. For the project, students work in groups of 3-4 to learn a topic that was not covered in our syllabus. Then students find an appropriate data set that can be analyzed using the new technique. Finally, students analyze the data using the technique, present their findings to the class in an oral presentation, and submit a formal written report.

Logistics

To select their topic, students are given a list of level-appropriate techniques, then have a few days to review the topics and select which they would like. Example topics include: Poisson Regression, Survival Analysis, Time Series Regression, and LASSO. Groups are assigned topics on a first-come, first-served basis and most groups end up with their first or second choice. After their topic is selected, groups have approximately six weeks to complete the project. For about 3 of those weeks, class time is devoted primarily to continuing the syllabus content, with about 1-2 days where students can exclusively work on the project. The remaining class time is spent solely on the project, peer review, and presentations.

Before submitting a final report, students are required to submit a proposal. The purpose of the proposal is for students to demonstrate they understand their technique. They are asked to write about the advantages and disadvantages of the technique, compare the technique to something we have covered in class, and write why their data are appropriate for the technique. During this time, I allow groups to sign up to meet individually with me.

The final written report includes: a research question to be answered, methodology of the technique, applied analysis, and results with conclusion. To write their reports, students are required to cite at least three sources in the methodology section and at least one source to support their research question. In this course, students complete a project earlier in the semester, so they are somewhat comfortable with report writing. If this is the only project for the course, it may be wise to establish general requirements for these sections. 

Finally, students present their findings. In my course, the goal is for students to be able to present to an audience that is unfamiliar with their concept, not to teach the concept. Students have about 10 minutes to give a PowerPoint presentation. To keep students focused on listening to the presentations, all students are required to evaluate two other groups. This semester, however, my class is much larger, so instead of PowerPoint presentations, there will be a poster session. Other students in the class will review posters, just as they would have done for in-class presentations.

Adapting

One concern when assigning group work in any course is deciding how groups will be selected. I have tried many different ways to form groups and, without fail, no matter how groups are formed, there will be issues. However, I do not think it is appropriate to remove group assignments from a course. When students graduate, they will need the interpersonal skills of working in a group: communication, leadership, and conflict resolution. Helping them through the process is a better way to prepare them than removing group work completely. One way I have found to alleviate tension and unequal contributions from members is to require groups to fill out, sign, and submit a group contract at the start of the project. This allows students to establish expectations and have a clear plan in place if expectations are not met. It also gives the instructor a point of reference if conflict does arise.

This project can be adapted to any intermediate or advanced course where there is not enough time to cover all of the topics that are available, which most educators might agree is all of them. This project was extended in an advanced-level course at UVA by another instructor, where the students not only presented their findings, but also taught a 30-minute lesson on their new topic and were required to create notes and worksheets for their peers. Finally, there is flexibility in how an instructor assesses communication and presentation skills: written reports, oral presentations, poster presentations, podcasts, recorded lessons, and infographics are all great ways to do so.

How Do We Encourage “Productive Struggle” in Large Classes?

Contributing author Catherine Case is a lecturer at the University of Georgia and the lesson plan editor for Statistics Teacher.

This post is really inspired by a plenary talk given by Jim Stigler at USCOTS 2015. He’s a psychologist at UCLA, and in his USCOTS talk, he emphasized the idea of productive struggle. He talked about different teaching cultures around the world, and how American classrooms often feature “quick and snappy” lessons as opposed to “slow and sticky” lessons, despite the fact that making the process of learning harder can actually lead to deeper, longer-lasting understanding.

His ideas really challenged me, because I often teach fairly large classes (120 – 140 students per section), and nowhere is “quick and snappy” more highly valued than in a large lecture. There’s definitely tension in large classes between efficiency and productive struggle. 

Efficiency | Productive Struggle
Statistical questions are clearly defined in the textbook. | Students carry out the full problem-solving process.
Teacher solves all problems (correctly and on the first try). | Students wrestle with concepts before strategies are directly taught.
Students use formulas and probability tables proficiently. | Students use appropriate data analysis tools.

At first, this tension was overwhelming to me. In the stat ed community, we’re surrounded with inspiring, innovative ideas, but the gap between where we are and where we want to be can be paralyzing. To counter that, let’s start small with a simple classroom activity that allows students to struggle through the statistical process. Along the way, I’ll mention tricks that make it easier to pull off, even with lots of students in the room.

Example: A Survey of the Class

Formulate Questions

This activity is great for the beginning of the semester, because it only requires knowledge of a few statistical terms – statistical vs. survey questions, explanatory vs. response variables, categorical vs. quantitative variables. It also challenges students’ expectations about what’s required of them in a large lecture class, because right off the bat, they’re being asked to collaborate and communicate their statistical ideas.  

  • First, students work in groups to write a statistical question about the relationship between two variables that can be answered based on a class survey. Then they pass their card to another group.
  • After receiving another group’s card, students break down the statistical question into variables. Which is the explanatory variable and which is the response? Are these variables categorical or quantitative? Then they pass their card to another group.
  • Students write appropriate survey questions that could be used to collect data – one survey question per variable. 

I’ll admit that in many of my lessons, I have a well-defined statistical question in mind before class even starts. This activity is different, because students experience the messy process of formulating a statistical question and operationalizing it for a survey. 

Collect Data

Before the next class period, I read their work (or at least a “random sample” of their work ☺) and I try to close the feedback loop by discussing common issues that I noticed. Do some questions go beyond the scope of a class survey? Are certain kinds of variables commonly misclassified? How can we improve ambiguous survey questions? Even though my class is too large to talk to every student individually, this gives me an opportunity to respond to and challenge student thinking. 

Later we can use student-written questions as the starting point for data collection and analysis. I usually choose 10-15 survey questions (ideally relevant to more than one statistical question), and collect their data via Google Forms. When students answer open-ended questions like “How many hours do you spend studying in a typical week?”, it generates data that’s messy but manageable. It feels more authentic than squeaky clean textbook data, plus the struggle of cleaning a few hundred observations by hand may help students understand the need for better data cleaning methods.
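
For readers curious what "better data cleaning methods" might look like for that study-hours question, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample responses are invented, the function name parse_hours is hypothetical, and replacing a range such as "5-10" with its midpoint is just one possible convention. Answers the rule cannot handle are flagged for a human to review rather than guessed.

```python
import re
from typing import Optional

# Hypothetical free-text answers to the study-hours survey question.
RAW_RESPONSES = ["10", "about 15", "12 hrs", "5-10", "twenty", "", "8.5 hours", "10-12 hours"]

# Matches whole numbers and decimals such as "12" or "8.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_hours(response: str) -> Optional[float]:
    """Return a numeric hours value, or None if the answer needs manual review."""
    numbers = [float(n) for n in NUMBER.findall(response)]
    if len(numbers) == 1:
        return numbers[0]
    if len(numbers) == 2 and re.search(r"\d\s*(?:-|to)\s*\d", response):
        # Assumption: treat "5-10" or "5 to 10" as a range and use its midpoint.
        return sum(numbers) / 2
    # No digits (e.g. "twenty"), blank answers, or anything ambiguous.
    return None


cleaned = [parse_hours(r) for r in RAW_RESPONSES]
needs_review = [r for r, h in zip(RAW_RESPONSES, cleaned) if h is None]

print("cleaned:", cleaned)
print("needs manual review:", needs_review)
```

Even a rule this small raises the kind of questions worth discussing in class: Is the midpoint a fair summary of a range? What should happen to answers written out in words?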

Analyze Data Using Appropriate Tools

“Appropriate tools” certainly aren’t one-size-fits-all, but for this activity, I need a tool that…

  • Can handle large(ish) datasets 
  • Is accessible for students – preferably free!
  • Makes it easy to construct graphs and calculate summary statistics

At UGA, we have a site license that makes JMP free for students, and many regularly bring their laptops to class, so JMP works well for us with students working in pairs. If I didn’t have access to JMP, I might consider CODAP, which looks a lot like Fathom (friendly drag and drop interface!) except it’s free and runs in a web browser. 

Speaking of a friendly interface, another hurdle in a large class is how to trouble-shoot technology for students, especially if you don’t have smaller “lab” sections or TA support during class. For me, it’s a delicate balance of scaffolding and classroom culture…

After demonstrating how to construct graphs and calculate summaries using software, I assign some straightforward data analysis questions with right/wrong answers. For this, I use an app called Socrative, which works similarly to clickers, except that it allows for both multiple choice and free response questions. Socrative allows me to give immediate feedback – for example, if they miss a question, I can provide them with the software instructions they need. In addition to feedback through Socrative, I try to normalize the process of struggling with new technology and encourage them to help each other. I remind them it’s impossible for me to help everyone individually, but I’m confident they can work together and solve most problems without me. Students generally rise to the challenge and accept that there are multiple sources of knowledge in the room.  

Once I’m confident students know how to use the necessary data analysis tools, we can try more challenging, open-ended questions. For example, I may choose a response variable and ask students to explore the data until they find a variable that’s a good predictor, then write a few sentences about that relationship. They need to use graphs and calculate statistics to answer this, but I’m not explicitly telling them which graphs and statistics to use, and I’m certainly not giving them “point here, click here” style instructions. There’s a little productive struggle involved!
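
As one illustration of what such an exploration could look like outside a point-and-click tool, the Python sketch below ranks candidate quantitative variables by the strength of their linear correlation with a chosen response. This is not the activity as run in class, which uses JMP and leaves the choice of graphs and statistics to students. The survey variables and values are invented, the function names are hypothetical, and Pearson's r is only one of several reasonable summaries (it ignores categorical predictors and nonlinear relationships).

```python
from math import sqrt
from statistics import mean

# Hypothetical class-survey data; names and values are made up for illustration.
survey = {
    "study_hours": [5, 10, 12, 8, 15, 3, 9, 11],
    "sleep_hours": [8, 7, 6, 7, 5, 9, 7, 6],
    "commute_minutes": [10, 25, 5, 30, 15, 20, 10, 5],
    "credit_hours": [12, 15, 16, 14, 18, 12, 15, 16],
}


def pearson_r(x, y):
    """Pearson correlation coefficient; assumes neither variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_predictors(data, response):
    """Sort every other variable by |r| with the response, strongest first."""
    y = data[response]
    scores = {name: pearson_r(x, y) for name, x in data.items() if name != response}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


for name, r in rank_predictors(survey, "study_hours"):
    print(f"{name}: r = {r:+.2f}")
```

A ranking like this is only a starting point. Students would still need to look at a scatterplot for each relationship and describe it in context.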

Interpret Results in Context

In the following class, I present student analyses as a starting point for our interpretations. They already have a foundation for discussing effect sizes and strength of evidence, because they’ve considered the relationships among variables themselves. Students can offer deep insights about the limitations of the analysis (e.g., sampling issues, measurement issues, correlation vs. causation), because they’ve been involved with the investigation at every stage. 

Look Back and Ahead

The authors of the ISI curriculum (Tintle et al.) include “look back and ahead” as the final step of the statistical process. At this step, students consider limitations of the study and propose future work.

This concept is really helpful in my teaching too. Earlier I mentioned students’ expectations, but I’m also working on managing my own expectations. I can’t let the idea of a perfect active learning class keep me from taking steps in the right direction. I don’t have to change everything in one semester and I can’t expect every activity I try will work. The best I can do is to make a few small changes right now, keep a journal to learn from my experiences, and keep moving forward.