Friday, April 26, 2013

We have units!

First the time zone, now units of measurement ... it's an embarrassment of riches. Thank you, EdX engineers, for responding to our requests.

Now I won't have to say, "Calculate your answer as a percent, but please don't enter the percent sign," and other clunky stuff like that. 

It's a new platform, and we're one of the first math classes on it, so we're pioneers. At least that's what I tell myself when something unexpected happens, e.g. "The solution for Problem X of Set Y doesn't appear!!" That would be because I entered an = sign as the first entry in a line of the explanation, and the platform didn't expect that ... Who knew? Well, we know now.

That's the kind of thing students never see and shouldn't have to see. But every now and then there's a forum post that I have to walk away from or I'll say something I regret, like the 2.1X post that started with, "What do you have against a clock?" This student wanted a running clock next to each due date and time, so that the platform was keeping him/her informed about the amount of time remaining.

I have nothing against clocks. I have one next to my bed. I just don't have one in EdX.

Should I? Well, it's not on the request list I sent the engineers. Students have worked out when things are due, just as they've worked out (I hope) when to pay their rent or pick up their kids. The engineers are doing a superhuman job and must do what's really essential.

Such as maintaining the platform so it doesn't crash. That's been remarkably successful: with tens of thousands of students, many of whom access the materials right before assignments are due, we have very rarely had the system freeze.

And now the video people are fixing that wretched power of 3 that should be a power of 2 in about the third minute of Lec 2.2. Mercifully everything else on the slide is fine, including the audio, so I don't have to re-record. I'm surprised by the very small number of typos, actually. I'm often doing this work at 2 or 3 a.m., and you'd think I'd be pretty zonked by then.

But then there was that experiment that the Optometry department had us participate in when we were graduate students. To test the sharpness of our eyes at various times of day, they had us come in a few times each day and do some proof-reading. To their surprise, we did better in the afternoon than in the morning.

Why on earth were they surprised? Are grad students expected to be sharp in the morning? My friends and I didn't really get going till the sun was quite high in the sky and the coffee had done its work.

Maybe 2 a.m. is my peak productivity period. And if I mess up, I have our wonderful engineers who come in with mops ... so kudos and virtual chocolates (mostly dark) to the EdX engineering team, especially Robert without whom there would be no Stat 2X.

Saturday, April 20, 2013

What a difference a month makes

Stalwarts of 2.1X will remember the barely veiled sneering in the forum in the first week: there was a vocal group of students whose message was, "Seriously? This is it? And that's all you're requiring for the certificate?"

"I took statistics in high school," said one luminary, "and this course is below level."

So there I was, reading this stuff in Section 1 of Week 1 of Stat 2.1, knowing what lay ahead ... which was correlation and regression in Stat 2.1, and now Week 1 of Stat 2.2.

"My God that was hard" read a post today, joining a week-long chorus of "difficult," "very hard," and so on. Some of the students who say it's hard also say they did get the material in the end, but others remain all at sea.

Happens every time in intro stat on campus too, to some extent. Elementary probability theory, before the introduction of the binomial and other formulas, becomes a weapon of mass destruction.

The reason is clear: the combination of logic and the detailed attention to language and assumptions becomes too much for many people. That kind of use of the brain can be considerably harder than computation, which is minimal at this stage of 2.2X: there are only sums and products of a few fractions. But there's no "standard machine" into which you can throw things, crank a handle, and expect a correct answer. You have to develop your own little machine each time. I tried to say this in one of the lectures, and gave the closest thing I've got to a "step by step" approach to problem solving at this level. But then the students have to actually go through the process, slowly.

And that's one of the two key difficulties with this material. The examples look easy: who can't add and multiply a couple of fractions? It should be quick, right? But it isn't. It takes a while to come up with the perfect little sum and product of fractions. Students don't expect it will take that kind of time to get to answers that look so simple.

The other problem is sloppiness in reading and in the use of language. That's disastrous with this material: whether two events must both happen, or you're already given that one of them has happened, or you want either to happen, or you want just one of them to happen ... the list of variations is endless and every word matters. A single wrongly read word, and you can be led far astray. Students don't expect that, either.
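To see how much those small words change the answer, here is a minimal sketch in Python that counts outcomes for one card dealt from a standard 52-card deck. The two events ("the card is a club" and "the card is a jack") are chosen here purely for illustration and are not taken from the course exercises; the helper names are likewise hypothetical.

```python
from fractions import Fraction
from itertools import product

# A standard deck: 13 ranks in each of 4 suits (illustrative setup).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))


def prob(event):
    """Chance that one card dealt at random satisfies `event`."""
    favorable = [card for card in deck if event(card)]
    return Fraction(len(favorable), len(deck))


def is_club(card):
    return card[1] == "clubs"


def is_jack(card):
    return card[0] == "J"


# "Both happen": the card is a club AND a jack.
p_both = prob(lambda c: is_club(c) and is_jack(c))

# "Either happens": the card is a club OR a jack (or both).
p_either = prob(lambda c: is_club(c) or is_jack(c))

# "Just one happens": a club or a jack, but not the jack of clubs.
p_exactly_one = prob(lambda c: is_club(c) != is_jack(c))

# "Given that one has happened": club, given that the card is a jack.
p_club_given_jack = p_both / prob(is_jack)

print(p_both)             # 1/52
print(p_either)           # 4/13
print(p_exactly_one)      # 15/52
print(p_club_given_jack)  # 1/4
```

Same two events, four different questions, four different answers. The arithmetic in each case is a fraction or two; the work is in reading which question is being asked.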

The problem with language is increasingly visible among students whose normal discourse relies on, "He was like, 'OMG!'" or "I was all, 'Whatever.'" But I don't see a large fraction of these in 2.2X. What we do have is an international community: lots of people who hardly use English at all outside of this class. For them, the persnicketiness of having to watch every small word can be a torment.

I've learned this the hard way. I was teaching an upper division (calculus prerequisite) probability class once; we were well into the semester and the class and I had a great rapport. I was doing a simple little thing with one card, the "2 of clubs," and getting mightily annoyed that the class was being dull about something so easy. The glazed fish-eye look stared back at me from a hundred people. Eventually, the problem was identified. The majority of the class consisted of students who were not native speakers of English. They didn't think the word "of" mattered; it was some little piece of decoration, in their minds, while the essence of the information was "2 clubs". So they were thinking about two cards, and I was thinking about one particular card, and thus there was discord and a gnashing of teeth.

I only use the jack of clubs, now.

2.2X will get more "machine"y. There will be more instantly recognizable techniques. It's already happening: a couple of people have written in saying they've aced Week 2 exercises. Weeks 3-5 will bring to mind Weeks 3-5 of 2.1X, with estimates, measures of error, and the normal curve. Students will be fine.

But I know I'll lose some students before then. I'm telling them to hang on, but students tend not to believe faculty when we say things like that. If Week 1 is hard, then Week 4 will be 4 times as hard. Isn't everything linear?

Maybe the next Exercise Set should include this test:

Henry the Eighth had six wives. How many wives did Henry the Fourth have?

Thursday, April 18, 2013


It's liberating to be free of a textbook. Yes, the course has an online text, and it's a great resource, but I'm not deliberately following it. I do end up tracking it fairly closely anyway, because my usual sequence of topics and the sequence in the text are both heavily influenced by Statistics (fourth edition), by Freedman, Pisani, and Purves.

Mostly, though, I'm free to teach what I want, and the students are free to use any support materials they find helpful. So they are referring each other to assorted online sites, collecting materials on the course wiki, simulating on the computer, using algebra – whatever works.  In a group as large as this, the list of whatever works is long.

With the web as their library, Stat 2X students are more open in their approaches to learning and problem solving than I usually find in intro courses on campus. Undergraduates who have to spend lots of money buying assigned textbooks tend to get rather annoyed if the course strays too far from the text and its methods.

But then comes the Stat 2X clamor for, "More practice problems!" and I find myself wishing I could just refer students to the appropriate section of a text. Philip's text does have assignments, and we're working on how to make those available to tens of thousands of students; but as they're not exactly in sync with the lectures, I'll have to spend some time selecting the right ones. I can't use problems from other texts for obvious copyright reasons. So I have to make my own.

Which is a hugely time-intensive job ... the hardest part of creating course materials, by far, especially since I also have to write out the solutions.

Time is the resource in shortest supply. Students ask me to "please post more problems," as though I have them sitting around somewhere, all ready to be uploaded. But I don't. I solve lots of problems in class on campus, and the students and I solve them on the fly, and it's fresh and interesting and ... unrecorded. Somehow I'm going to have to create some hours to make sure students get more practice problems for their upcoming midterm exam. There better be eight days this week.

The next time I go through Stat 2X (students keep asking when, and I have no idea), I'll be able to focus largely on the exercises, and then I hope there'll be a large enough stash of them that I'll essentially have produced my own "book". Maybe I'll call it a mook.

Saturday, April 13, 2013

The vast kindness of students

Tens of thousands of them all over the world, hugely generous to each other and to the staff ... it's humbling. 

And I'm not given to gratuitous humility.

One student asks a question about an exercise, and almost immediately there are several responses: somebody points out another way to approach the problem, somebody suggests R code, somebody points to a useful reference section in the text, somebody provides a diagram ... all thoughtful and effective, and, above all, selfless.

Stat 2X is an extraordinarily respectful and productive community. It has none of the rant-a-thons that plague other networks. Hysterical posts are few and far between, and dealt with calmly by students. I've disabled anonymous posting. It has no place in an academic setting, in my view, and I won't respond. So it's pointless, and it's gone.

I wonder if students know how much they affect the quality of the instruction. My lectures aren't "canned" or produced far ahead of release. I'm creating them as we go along. I have to, or else they'll be dry as dust. I don't know how to teach the non-existent student.

When I teach in a lecture hall, I'm continually making adjustments depending on how the class is responding. I can't do that in Stat 2X. So I have to find some other way of keeping myself from feeling freeze-dried.

For this, I use the forum. I'll go quickly over some topics, and dwell on others, partly because I'm predicting student comprehension based on what I'm seeing in the forum. I know there's a huge silent majority out there, and they're likely quite different from those who are active on the forum, but the forum is what keeps the class alive.

As I record the lectures, I'm grinning to myself because I'm predicting that atopos will provide the R code for this, or that susanaust will be at hand with the necessary details about that, or that prasannasimha will notice that this connects deeply with that, or that pauljm will calm down any flurries of student anxiety, or that klionheart will explain this better than I've done ... the list is endless, and I will try to acknowledge them all as the weeks pass.

Thanks also to Lai Lai from Myanmar, for reminding us of things we take for granted. We should all be able to any line we want, and discuss its merits.

Jamie, thank you. The entire Stat dept office is thrilled. They arrived the day before my kid was due for major surgery (he's fine now, thanks); your timing couldn't have been better. 

Tuesday, April 2, 2013


Number of students enrolled on the last day of Stat 2.1X: 52,661
Active in the last week: 10,609
Earned certificates: 8,181

If you define "completion" as "earned certificate," the completion rate is 15.5%.
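For anyone who wants to check the arithmetic, here is a small sketch in Python using the three counts above. The variable names are just labels chosen for this example.

```python
# Counts reported for the last day of Stat 2.1X.
enrolled = 52661
active_last_week = 10609
certificates = 8181

# Completion rate, with "completion" defined as "earned certificate".
completion_rate = certificates / enrolled

# Share of enrolled students who were active in the last week.
active_share = active_last_week / enrolled

print(f"{completion_rate:.1%}")  # 15.5%
print(f"{active_share:.1%}")     # 20.1%
```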

For comparison, here are data from a blog by Katy Jordan. She has some beautifully presented data complete with sources; it's a pleasure to read what she writes. Her data points, provided in February 2013, consist of 27 MOOCs, mostly from Coursera. Here's a stem-and-leaf plot of the completion rates in her table.

Stat 2.1X students will know how to read this once I've told them that the first line reads 0.7%. For others, here is an expanded version of the first three lines; you can take it from there. The entries are percents.

0.7
1.7
2.3  2.3  2.6  2.7
and so on

COMPLETION RATES OF 27 MOOCS [source: February 2013 data summary by Katy Jordan]

0 | 7
1 | 7
2 | 3367
3 | 25
4 | 5678
5 | 24
6 | 056
7 | 036
8 |
9 |
10 | 118
11 |
12 | 56
13 | 8
14 |
15 |
16 |
17 |
18 |
19 | 2

The median is 5.4 by the conventional "half-way point of the data" definition. 5% is often quoted as a "typical" MOOC completion rate.
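The median can be checked by expanding the stem-and-leaf plot back into the 27 individual rates. Here is a minimal sketch in Python; it assumes each stem is the whole-number part of the percent and each leaf is the first decimal digit, as in the plot above.

```python
from statistics import median

# The stem-and-leaf plot, copied as text. Stems are whole percents,
# leaves are tenths of a percent.
plot = """
0 | 7
1 | 7
2 | 3367
3 | 25
4 | 5678
5 | 24
6 | 056
7 | 036
8 |
9 |
10 | 118
11 |
12 | 56
13 | 8
14 |
15 |
16 |
17 |
18 |
19 | 2
"""

rates = []
for line in plot.strip().splitlines():
    stem, _, leaves = line.partition("|")
    # Each leaf digit is one data point: stem "2", leaf "3" means 2.3.
    for leaf in leaves.strip():
        rates.append(float(f"{stem.strip()}.{leaf}"))

print(len(rates))     # 27
print(median(rates))  # 5.4
```

With 27 values, the median is the 14th in sorted order, which lands on the second leaf of the "5" stem.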

Coursera's Scala course is at 19.2%. It was taught by Martin Odersky, who designed Scala.

We're not at 19.2, but even so, 15.5% is exceptionally high on the scale of MOOC completion rates.

I've been thinking about what it means to "complete" a course like Stat 2.1X. "Earned certificate" is a measure that sticks with the usual conventions of exams, grades, and so on. But it's possible for a student to go through all the lectures and try the exercises without regard to due dates and grades, as long as the course materials are available. That's a form of completion too, and a perfectly reasonable one for students who want to learn the subject but don't need or want a certificate.

They're hard to keep track of. But I'm willing to bet they're a big group.

To them - indeed, to all those who made a determined effort in Stat 2.1X, especially those who got the certificate: well done, and thank you for exploring this new world with me.