Wednesday, May 1, 2013

"The equivalent of a Berkeley course"

That's the aim of BerkeleyX.

A laudable goal, and one that I should share.

Then why do I keep remembering Bertie Wooster's insight about Shakespeare?

 " ... sounds well, but doesn't mean anything."

[Stop. Stop. If you were just about to post an outraged comment about Bertie's opinion, stop, and go look him up. There's a magical world waiting for you.]

The medium is not equivalent; requirements aren't equivalent; the students most definitely aren't equivalent. So what's equivalent?

Well, it's my course, and MOOCs haven't been around long enough for there to be a historical perspective about this, so I'm simply going to decide.

Here's what I know to be equivalent.
  • I teach Stat 2X at the same level at which I or most of my colleagues would teach Stat 2 in Berkeley. I've asked for the same prerequisite (high school arithmetic) and assign the same level of exercises.
  • I am as attentive to pedagogy, clarity, and effectiveness of the lectures as I am in Stat 2.
  • The Stat 2X text, while not often used in Stat 2 (we use Statistics (4th ed) by Freedman, Pisani, and Purves; not available online), is used in Stat 21 and is written by a department colleague and co-instructor of Stat 2X. It is inspired by FPP and other texts by our colleagues, and provides a high quality introduction to statistics freely online, in the spirit of EdX.
  • My regard for students in the two classes is equivalent. The student populations are entirely different in their backgrounds and goals. But they're all my students, all tens of thousands plus three hundred of them, and my responsibility for all of them is the same.
  • The size of the class notwithstanding, I do my utmost to interact with Stat 2X students as I do with students on campus. My aim is to be accessible, relaxed, and helpful to students who are working and need the help. To others I am either neutral or flatly unavailable, depending on how they choose to conduct themselves. In university classes I've never been particularly indulgent of rudeness or the inability to deal with basic course logistics. In that respect too Stat 2X is the equivalent of Stat 2 and every other course I teach. 
  • I expect students taking a university course to know how to take a course. In both Stat 2 and 2X, I post the course policies and schedule on the website on the first day once and for all, and I follow them. I expect students to know what's on the course website, to have access to a device that tells the time correctly, and to manage their own schedules and calendars. Student reaction varies in the two courses but my approach is equivalent. Many students have a logistical glitch here and there during the term, so I always drop an assignment or two from the final grade in Stat 2 as well as Stat 2X.
This is almost equivalent:
  • There's a range of quantitative and logical reasoning skills among students in Stat 2 as there is in Stat 2X; the 2X discussion forum indicates that the range is wider in 2X than in 2, extending further in both directions. As in Stat 2, I reach most students but not all. In Stat 2, the proportion left unsatisfied is very small. My guess is that in Stat 2X there is a larger proportion who remain unhappy either because they don't understand or because they could go much further with the material if I would only give them the opportunity.
And these are not equivalent.
  • In Stat 2, during every lecture I connect in person with each student; attendance is high, and I've become quite skilled at conducting a conversation with hundreds of people.  I can't do that in 2X. In the forum I can connect with a vocal minority, but the majority are a black box. This affects everyone's perception of how the class is going.
  • In my campus Stat 2 class, every point the students earn is in an in-person closed-book fixed-time test with little to no partial credit. From the point of view of the work required of students, Stat 2X is a different universe.
  • Cheating: I knew this would happen as it has happened in other MOOCs, but it's still stunning to see how swift and organized people are about posting questions all over the internet and gathering answers. I'm trying hard not to let this kill my motivation, as it's unfair to the students who are honest. It's such a pleasure to look at my hundreds of students in Stat 2 and know that the work that I get from them is theirs alone, unaided. (As for the effect that cheating has on faculty interest in course certificates or grade distributions, maybe another time in another post ...)
  • Student support for each other, and the confidence to go further with the material or look at it from a variety of perspectives, are greater in 2X. In part this is due to many students being well past the undergrad stage and in occupations where they've had to look beyond what's right in front of them. There is much more of this than I expected, and it's a delight.

  • Student expectations of what can or should be provided to them are so different it's almost comic. Stat 2 students just expect to find stuff on the course website and figure out what they're supposed to do; there's no angst nor fuss; it's just how courses work. I believe this is very much influenced by them being in the same room as all their classmates three times a week; it's hard to approach the instructor with, "I myself personally would be happier if you further provided me with ..." when there are hundreds around you who are obviously fine without it. Not so in Stat 2X, where for some students isolation leads to a lack of recognition of the experience of others or at least a lack of inhibition about asking for extra service.

[Which I would actually have considered providing, had it not been prefaced with:
"I am a very busy person," (I know the feeling), "and am doing this course over and above all my other responsibilities," (you and me both), "and I love your lectures but really, you're pretty far down my list of priorities," (thanks for sharing, 'cause I really needed that motivation to make you a priority), "and I won't read what you've posted," (perish the thought), "so could you please take some extra steps to make things easier for me?" Can't. Too busy looking for a wall to hit my head against.]

On the whole, I find "equivalence" being measured almost entirely by what the instructor provides. That is dangerous. It assumes, tacitly, that a "class" is something an instructor can plop down in front of a student and step back. But it's no such thing. A class is what the instructor and students create together.

So I'm not making "equivalence" the goal; I'll never reach it, and I don't want to teach something that's "equivalent" to something else I'm already teaching. Why would I want to do more of the same? I'd rather make the two different and worthwhile in their different ways. If I had the choice, 2X would be open all the time, with no grades or certificates; just visits by Prof. A. to the forum at pre-announced times.

But that's for another day. Today I have to start putting together Stat 2.3X.

Friday, April 26, 2013

We have units!

First the time zone, now units of measurement ... it's an embarrassment of riches. Thank you, EdX engineers, for responding to our requests.

Now I won't have to say, "Calculate your answer as a percent, but please don't enter the percent sign," and other clunky stuff like that. 

It's a new platform, and we're one of the first math classes on it, so we're pioneers. At least that's what I tell myself when something unexpected happens, e.g. "The solution for Problem X of Set Y doesn't appear!!" That would be because I entered an = sign as the first entry in a line of the explanation, and the platform didn't expect that ... Who knew? Well, we know now.

That's the kind of thing students never see and shouldn't have to see. But every now and then there's a forum post that I have to walk away from or I'll say something I regret, like the 2.1X post that started with, "What do you have against a clock?" This student wanted a running clock next to each due date and time, so that the platform was keeping him/her informed about the amount of time remaining.

I have nothing against clocks. I have one next to my bed. I just don't have one in EdX.

Should I? Well, it's not on the request list I sent the engineers. Students have worked out when things are due, just as they've worked out (I hope) when to pay their rent or pick up their kids. The engineers are doing a superhuman job and must do what's really essential.

Such as maintaining the platform so it doesn't crash. That's been remarkably successful; tens of thousands of students, many of whom access the materials right before assignments are due, and very rarely have we had the system freeze.

And now the video people are fixing that wretched power of 3 that should be a power of 2 in about the third minute of Lec 2.2. Mercifully everything else on the slide is fine, including the audio, so I don't have to re-record. I'm surprised by the very small number of typos, actually. I'm often doing this work at 2 or 3 a.m., and you'd think I'd be pretty zonked by then.

But then there was that experiment that the Optometry department had us participate in when we were graduate students. To test the sharpness of our eyes at various times of day, they had us come in a few times each day and do some proof-reading. To their surprise, we did better in the afternoon than in the morning.

Why on earth were they surprised? Are grad students expected to be sharp in the morning? My friends and I didn't really get going till the sun was quite high in the sky and the coffee had done its work.

Maybe 2 a.m. is my peak productivity period. And if I mess up, I have our wonderful engineers who come in with mops ... so kudos and virtual chocolates (mostly dark) to the EdX engineering team, especially Robert without whom there would be no Stat 2X.

Saturday, April 20, 2013

What a difference a month makes

Stalwarts of 2.1X will remember the barely veiled sneering in the forum in the first week: there was a vocal group of students whose message was, "Seriously? This is it? And that's all you're requiring for the certificate?"

"I took statistics in high school," said one luminary, "and this course is below level."

So there I was, reading this stuff in Section 1 of Week 1 of Stat 2.1, knowing what lay ahead ... which was correlation and regression in Stat 2.1, and now Week 1 of Stat 2.2.

"My God that was hard," read a post today, joining a week-long chorus of "difficult," "very hard," and so on. Some of the students who say it's hard also say they did get the material in the end, but others remain all at sea.

Happens every time in intro stat on campus too, to some extent. Elementary probability theory, before the introduction of the binomial and other formulas, becomes a weapon of mass destruction.

The reason is clear: the combination of logic and the detailed attention to language and assumptions becomes too much for many people. That kind of use of the brain can be considerably harder than computation, which is minimal at this stage of 2.2X: there are only sums and products of a few fractions. But there's no "standard machine" into which you can throw things, crank a handle, and expect a correct answer. You have to develop your own little machine each time. I tried to say this in one of the lectures, and gave the closest thing I've got to a "step by step" approach to problem solving at this level. But then the students have to actually go through the process, slowly.

And that's one of the two key difficulties with this material. The examples look easy: who can't add and multiply a couple of fractions? It should be quick, right? But it isn't. It takes a while to come up with the perfect little sum and product of fractions. Students don't expect it will take that kind of time to get to answers that look so simple.

The other problem is sloppiness in reading and in the use of language. That's disastrous with this material: whether two events must both happen, or you're already given that one of them has happened, or you want either to happen, or you want just one of them to happen ... the list of variations is endless and every word matters. A single wrongly read word, and you can be led far astray. Students don't expect that, either.
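
To make the point about wording concrete, here is a minimal sketch that counts outcomes for a single card drawn from a standard 52-card deck. The two events ("the card is a club", "the card is a jack") and the helper name `prob` are illustrative assumptions, not taken from the 2.2X exercises. The same two events give four different answers depending on whether the question says "both", "given", "either", or "just one".

```python
from fractions import Fraction
from itertools import product

# A standard 52-card deck: every (rank, suit) pair.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))


def prob(event, given=lambda card: True):
    """Chance of `event` for one card drawn at random, restricted to
    the cards that satisfy `given` (hypothetical helper for this sketch)."""
    pool = [card for card in DECK if given(card)]
    hits = [card for card in pool if event(card)]
    return Fraction(len(hits), len(pool))


def is_club(card):
    return card[1] == "clubs"


def is_jack(card):
    return card[0] == "J"


# Same two events, four different questions.
both = prob(lambda c: is_club(c) and is_jack(c))    # club AND jack
club_given_jack = prob(is_club, given=is_jack)      # club, GIVEN it is a jack
either = prob(lambda c: is_club(c) or is_jack(c))   # club OR jack (or both)
exactly_one = prob(lambda c: is_club(c) != is_jack(c))  # just one of the two

print("both:", both)                    # 1/52
print("club given jack:", club_given_jack)  # 1/4
print("either:", either)                # 4/13
print("exactly one:", exactly_one)      # 15/52
```

Counting outcomes directly is only practical for small examples like this one, but it shows how a single word in the question changes which cards are counted.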

The problem with language is increasingly visible among students whose normal discourse relies on, "He was like, 'OMG!'" or "I was all, 'Whatever.'" But I don't see a large fraction of these in 2.2X. What we do have is an international community: lots of people who hardly use English at all outside of this class. For them, the persnicketiness of having to watch every small word can be a torment.

I've learned this the hard way. I was teaching an upper division (calculus prerequisite) probability class once; we were well into the semester and the class and I had a great rapport. I was doing a simple little thing with one card, the "2 of clubs," and getting mightily annoyed that the class was being dull about something so easy. The glazed fish-eye look stared back at me from a hundred people. Eventually, the problem was identified. The majority of the class consisted of students who were not native speakers of English. They didn't think the word "of" mattered; it was some little piece of decoration, in their minds, while the essence of the information was "2 clubs". So they were thinking about two cards, and I was thinking about one particular card, and thus there was discord and a gnashing of teeth.

I only use the jack of clubs, now.

2.2X will get more "machine"y. There will be more instantly recognizable techniques. It's already happening: a couple of people have written in saying they've aced Week 2 exercises. Weeks 3-5 will bring to mind Weeks 3-5 of 2.1X, with estimates, measures of error, and the normal curve. Students will be fine.

But I know I'll lose some students before then. I'm telling them to hang on, but students tend not to believe faculty when we say things like that. If Week 1 is hard, then Week 4 will be 4 times as hard. Isn't everything linear?

Maybe the next Exercise Set should include this test:

Henry the Eighth had six wives. How many wives did Henry the Fourth have?

Thursday, April 18, 2013


It's liberating to be free of a textbook. Yes, the course has an online text, and it's a great resource, but I'm not really following it. Where I do end up following it, that's because my usual sequence of topics and the sequence in the text are both heavily influenced by Statistics (fourth edition), by Freedman, Pisani, and Purves.

Mostly, though, I'm free to teach what I want, and the students are free to use any support materials they find helpful. So they are referring each other to assorted online sites, collecting materials on the course wiki, simulating on the computer, using algebra – whatever works.  In a group as large as this, the list of whatever works is long.

With the web as their library, Stat 2X students are more open in their approaches to learning and problem solving than I usually find in intro courses on campus. Undergraduates who have to spend lots of money buying assigned textbooks tend to get rather annoyed if the course strays too far from the text and its methods.

But then comes the Stat 2X clamor for, "More practice problems!" and I find myself wishing I could just refer students to the appropriate section of a text. Philip's text does have assignments, and we're working on how to make those available to tens of thousands of students; but as they're not exactly in sync with the lectures, I'll have to spend some time selecting the right ones. I can't use problems from other texts for obvious copyright reasons. So I have to make my own.

Which is a hugely time-intensive job ... the hardest part of creating course materials, by far, especially since I also have to write out the solutions.

Time is the resource in shortest supply. Students ask me to "please post more problems," as though I have them sitting around somewhere, all ready to be uploaded. But I don't. I solve lots of problems in class on campus, and the students and I solve them on the fly, and it's fresh and interesting and ... unrecorded. Somehow I'm going to have to create some hours to make sure students get more practice problems for their upcoming midterm exam. There better be eight days this week.

The next time I go through Stat 2X (students keep asking when, and I have no idea), I'll be able to focus largely on the exercises, and then I hope there'll be a large enough stash of them that I'll essentially have produced my own "book". Maybe I'll call it a mook.

Saturday, April 13, 2013

The vast kindness of students

Tens of thousands of them all over the world, hugely generous to each other and to the staff ... it's humbling. 

And I'm not given to gratuitous humility.

One student asks a question about an exercise, and almost immediately there are several responses: somebody points out another way to approach the problem, somebody suggests R code, somebody points to a useful reference section in the text, somebody provides a diagram ... all thoughtful and effective, and, above all, selfless.

Stat 2X is an extraordinarily respectful and productive community. It has none of the rant-a-thons that plague other networks. Hysterical posts are few and far between, and dealt with calmly by students. I've disabled anonymous posting. It has no place in an academic setting, in my view, and I won't respond. So it's pointless, and it's gone.

I wonder if students know how much they affect the quality of the instruction. My lectures aren't "canned" or produced far ahead of release. I'm creating them as we go along. I have to, or else they'll be dry as dust. I don't know how to teach the non-existent student.

When I teach in a lecture hall, I'm continually making adjustments depending on how the class is responding. I can't do that in Stat 2X. So I have to find some other way of keeping myself from feeling freeze-dried.

For this, I use the forum. I'll go quickly over some topics, and dwell on others, partly because I'm predicting student comprehension based on what I'm seeing in the forum. I know there's a huge silent majority out there, and they're likely quite different from those who are active on the forum, but the forum is what keeps the class alive.

As I record the lectures, I'm grinning to myself because I'm predicting that atopos will provide the R code for this, or that susanaust will be at hand with the necessary details about that, or that prasannasimha will notice that this connects deeply with that, or that pauljm will calm down any flurries of student anxiety, or that klionheart will explain this better than I've done ... the list is endless, and I will try to acknowledge them all as the weeks pass.

Thanks also to Lai Lai from Myanmar, for reminding us of things we take for granted. We should all be able to read any line we want, and discuss its merits.

Jamie, thank you. The entire Stat dept office is thrilled. They arrived the day before my kid was due for major surgery (he's fine now, thanks); your timing couldn't have been better. 

Tuesday, April 2, 2013


Number of students enrolled on the last day of Stat 2.1X: 52,661
Active in the last week: 10,609
Earned certificates: 8,181

If you define "completion" as "earned certificate," the completion rate is 15.5%.
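
As a quick check of that arithmetic, here is a minimal sketch using only the three counts quoted above. The helper name `percent` is an assumption made for illustration.

```python
# Counts quoted above for Stat 2.1X.
enrolled = 52_661
active_last_week = 10_609
certificates = 8_181


def percent(part, whole):
    """Return part/whole as a percent, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# "Completion" defined as "earned certificate", out of everyone enrolled.
completion_rate = percent(certificates, enrolled)

# Share of enrolled students who were still active in the last week.
active_share = percent(active_last_week, enrolled)

print(f"completion rate: {completion_rate}%")    # 15.5%
print(f"active in last week: {active_share}%")   # 20.1%
```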

For comparison, here are data from a blog by Katy Jordan. She has some beautifully presented data complete with sources; it's a pleasure to read what she writes. Her data points, provided in February 2013, consist of 27 MOOCs, mostly from Coursera. Here's a stem-and-leaf plot of the completion rates in her table.

Stat 2.1X students will know how to read this once I've told them that the first line reads 0.7%. For others, here is an expanded version of the first three lines; you can take it from there. The entries are percents.

0.7
1.7
2.3  2.3  2.6  2.7
and so on

COMPLETION RATES OF 27 MOOCS [source: February 2013 data summary by Katy Jordan]

0 | 7
1 | 7
2 | 3367
3 | 25
4 | 5678
5 | 24
6 | 056
7 | 036
8 |
9 |
10 | 118
11 |
12 | 56
13 | 8
14 |
15 |
16 |
17 |
18 |
19 | 2

The median is 5.4 by the conventional "half-way point of the data" definition. 5% is often quoted as a "typical" MOOC completion rate.
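
For readers who would rather let a computer read the plot, here is a minimal sketch that expands the stem-and-leaf display above into its 27 values and finds the half-way point. The plot is transcribed by hand from this post; the function name `expand` is an assumption made for illustration.

```python
from statistics import median

# The stem-and-leaf plot above, transcribed line for line.
# Stem = whole percent, each leaf = one tenth of a percent.
STEM_AND_LEAF = """
0 | 7
1 | 7
2 | 3367
3 | 25
4 | 5678
5 | 24
6 | 056
7 | 036
8 |
9 |
10 | 118
11 |
12 | 56
13 | 8
14 |
15 |
16 |
17 |
18 |
19 | 2
"""


def expand(plot):
    """Turn each 'stem | leaves' line into one value per leaf."""
    values = []
    for line in plot.strip().splitlines():
        stem, leaves = line.split("|")
        for leaf in leaves.strip():
            values.append(float(f"{int(stem)}.{leaf}"))
    return values


rates = expand(STEM_AND_LEAF)

print(len(rates))      # 27
print(rates[:6])       # [0.7, 1.7, 2.3, 2.3, 2.6, 2.7]
print(median(rates))   # 5.4
print(max(rates))      # 19.2
```

With 27 values, the median is the 14th in sorted order, which is the 5.4 quoted above.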

Coursera's Scala course is at 19.2%. It was taught by Martin Odersky, who designed Scala.

We're not at 19.2, but even so, 15.5% is exceptionally high on the scale of MOOC completion rates.

I've been thinking about what it means to "complete" a course like Stat 2.1X. "Earned certificate" is a measure that sticks with the usual conventions of exams, grades, and so on. But it's possible for a student to go through all the lectures and try the exercises without regard to due dates and grades, as long as the course materials are available. That's a form of completion too, and a perfectly reasonable one for students who want to learn the subject but don't need or want a certificate.

They're hard to keep track of. But I'm willing to bet they're a big group.

To them - indeed, to all those who made a determined effort in Stat 2.1X, especially those who got the certificate: well done, and thank you for exploring this new world with me.

Friday, March 29, 2013

On a turquoise cloud

The final is over with no glitches on our side. I won't know for a while how the students did or how many of them took it, but the forum seems generally happy.

So for this one day I'm taking a break. On a turquoise cloud.

No, that's not me finding my inner poet. It's Duke Ellington, and I'm celebrating my son's appearance on iTunes.

Just in case people think Stat professors have no other lives ...

He's in high school now, but he was 12 then and oh, how smoothly he sent those high notes soaring ... He sings with the Pacific Boychoir Academy and it was an honor for him to record with Marcus Shelby and his jazz orchestra, one of the finest jazz ensembles in the San Francisco Bay Area.

Students in Berkeley see me at cafes and stores and performances and so on. But students of the MOOC know me only as The Voice. It's very strange. So every now and then I'm going to throw something like this into the blog, because some students are reading it.

I'm on the Board of the Pacific Boychoir. It's a way of giving something back for the joys the PBA has brought to the life of my family.

Armando Fox, Director of BerkeleyX, is on the Board of the Altarena Playhouse, a local theatre.

We have to find the hours in the day for all this, but hours are there to be found if you're willing to forgo sleep. I've become very familiar with how the house feels at 3 a.m.

Tomorrow the whole routine will start again: making slides, recording the voice-over, putting together exercise sets, double checking the schedule, trying to figure out a way to make the discussion forum more organized, wishing I had the students in front of me ...

But not today. Today I'm a lady of leisure.

Tuesday, March 26, 2013

Does anybody really know what time it is?

That song was by a band called Chicago. In 1969. Way to show my age ...

The MOOC experiment is cultural as well as educational. Nowhere was this more starkly evident than in the horror generated by the moment at which the first assignment set came due.

It didn't matter that the course policies said Greenwich Mean Time (GMT). The horror split into three main categories:

1. "The deadlines should be in East Coast time, because EdX is in Boston."

2. "The deadlines should be in Pacific time, because Stat2X is in Berkeley."

3. "The deadlines should be in my time here, because I'm here."

All three are understandable, but each has its problems:

1-2. One of the beauties of an online course is that it's not in any fixed physical location. It's in The Cloud, wherever that is. Clouds drift. Where somebody's office is doesn't determine where the work is done. For example, Stat 2.3X won't start till the semester is over in Berkeley; I might handle that one from several different time zones.

3. Students move, even across time zones. We don't want to keep track of where they're working from, and we can't have each student setting his or her own time zone for each assignment, especially when many students aren't clear about time zones at all.

So we stayed with a single time zone - GMT - which is the platform default and is a standard across the world. Using GMT forces us as course developers to do what most of our students do, which is to arrange our working hours to meet deadlines set in a different time zone. It's an excellent way for us to stay aware of how students have to organize their time.

That said, every deadline that appears on the class website should have a time zone attached to it. And that's not happening. Why? Because it's not possible in the current state of the platform. However, EdX is responsive to concerns and is working on it. Let's hope we'll soon be able to display time zones along with times.

Until that happens, we're stating GMT (also spelled out as Greenwich Mean Time) in bold at the tops of assignments, since stating it once didn't do the job.

And that's as much as we're going to do. We will not have countdown clocks nor reminders sent by email shortly before deadlines. There are other priorities for our overworked engineers; most students finish their work long before the deadlines, anyway.

As for the song, a guy walks up to the singer and asks him for the time "that was on my watch," and the singer subjects him to a "does anybody know and does anybody care" philosophical non-response. You can't help feeling sorry for the poor guy who was probably just trying to catch a bus.

Good song, though, for when you're feeling grumpy about the rat race.

Sunday, March 24, 2013

Why do this?

The press asked Philip if he's getting paid for Stat 2X.

Members of my family asked me that too. Their question was more detailed: am I getting paid per student?

Family are always good for a laugh.

No, we're not getting paid. Statisticians who want to make money aren't at universities.

We both still have our regular responsibilities at Berkeley. Moreover, this year I happen to be Chair of a bunch of committees; Philip, may the saints preserve him, is Department Chair. Time does not hang heavy on our hands.

So why are we doing this?

I won't speak for Philip, but part of his motivation certainly derives from the fact that about 16 years ago he began writing his online text, long before Berkeley was ready to think seriously about online education.

For me, the new medium offers an irresistible opportunity to demystify my subject for the world. Statistics has a way of scaring the daylights out of people, or bewildering them, or making them suspicious. "Lies, damn lies ..." One way or another they end up hating it. Here's my chance to explain why I think it's sensible, useful, and fun. I do that every year in large intro classes on campus. Now I can reach the teenager who is irritated at her high school for making her use formulas she doesn't understand, the doctor who took some stats ages ago but wants a refresher, the people who just want to make sense of what "statistics have shown." And apparently I can also reach my friends' mothers. I hear some of them are taking the course.

If I need still more motivation, all I have to do is remember growing up in India. I've seen the inequity of the best education being the province of the wealthy. I know how people can hunger for knowledge that's just beyond their reach.  I was fortunate to learn from many superb teachers, but I know how it feels to have teachers who show no mastery of their subject. I know the frustration of studying from materials that are paltry and sloppy. I know what it's like to long for someone to just put some decent study materials into one's hands.

So that's what I'm trying to do. I'm putting what I know about my subject, as simply and clearly as I can, freely into the hands of anyone who has access to the internet.

Did I say I'm not getting paid? I take it back. I'm well rewarded, not with dollars but with the privilege of getting to do this.

Saturday, March 23, 2013

Welcome, reckless reader.

The MOOC (massive open online course) is the hot topic du jour in education. The Governor of California is a fan. The New York Times has hailed The Year of the MOOC. Harvard and MIT provided a number of MOOCs on their EdX platform a couple of years ago; last year they were joined by Berkeley; and now EdX is a consortium of universities all over the world. MOOCs, apparently, are the Thing To Do.

Are they going to work? Jerry Brown hopes so, but The Times has now gone all Eeyore.

Well, there's only one way to find out. So Philip Stark and I have stuck our necks out and are teaching Stat 2X, a MOOC on intro Stat.

Let no-one accuse us of exercising due caution.

This blog is intended as a sort of diary of observations and stories about Stat 2X. But I know myself. It's going to wander where my mind takes me. Read at your own risk.