
The study may or may not be flawed, but what's really interesting to me is the reaction. We need more science in our computer science, which means more experiments and more results like this. We should also be open to the truth that we use the tools we like because we like them rather than because they're technically superior, even though we promote them ad nauseam as though they are.

I once read an article about a technique Intel had developed for improving cooling of processors by changing the shape of the fan. I related this to some of my co-workers. One of them proceeded to tell me that this can't possibly work, backing up his argument with "reasoning" based on off-the-cuff remarks about the way air and physics "must" work. The fact that Intel had actually done this seemed to have no effect on his eagerness to continue the "debate" about this scientific fact.

I have a hard time believing that I get no benefit from using Haskell over Smalltalk, but if a body of science were to appear that cast doubt on that belief, the appropriate thing to do is change the belief, not stand around debating from imagined first principles why all the science is wrong and can't be so. Shut up, design an experiment and go prove it!

Perhaps there's little of this kind of actual science in our computer science because it would mean asking hard questions and accepting difficult truths. "The prisoner falls in love with his chains."



I've only scanned the comments, but the chief reaction seems to be exactly what you want for a scientific approach--people are arguing about the scope and methodology of the study. A paper like this is not some broad-reaching conclusion--it's very specific and based on some potentially flawed methodology. You want people to qualify exactly how specific it is and talk about potential flaws in its approach. That's how you improve the general knowledge.

Also, I suspect there are a couple of reasons such studies are uncommon in computer science. For one, CS isn't really a science; programmers and computer scientists are not trained in the scientific method or experimentation (beyond their general education); almost no CS papers I've read have contained empirical studies. If anything, they are closer to math papers than science papers!

Additionally, this sort of study is basically sociology. (Or something similar.) These sorts of fields are considered a little shady by hard scientists, and CS people tend to empathize more with the latter. I think this explains the immediate attack on methodology.

All that said, having more studies done about these questions would be great. I'm just not sure who's the best to do them. Maybe HCI researchers? I can't help thinking that the really intense PL people I know wouldn't be very interested in doing this.


"Almost no CS papers I've read have contained empirical studies."

You aren't reading the right papers then. At least in Software Engineering, you can't get into the main conferences (ICSE and FSE) without a pretty significant empirical study.


Yes, but that's the main difference between computer science and software engineering.


That's a broad brush, and though it's probably a good categorization, I don't find this distinction to be all that useful as a boundary. For instance, the field of artificial intelligence and cognitive psychology branched at one point, so much of my work in cognitive architecture and algorithmic modeling necessitates user studies. One would be hard-pressed to bucket AI into software engineering, though. Likewise, in machine learning, I've seen a push from classical data-driven to modern "data-informed" approaches to analyzing these results. Computational linguistics (NLP) and computational narrative are yet additional examples of fields that often require user studies or other empirical data.

More to the point, the distinction of what is and isn't computer science has become even more blurry in the research community because research in itself has become more inter-disciplinary. There seems to be little to gain from attempting to "bucket" research into distinct taxonomies.


I think your point is important, and well made.

The only problem is, people have been doing these experiments for 30 years, and do you know what net effect they've had on the world of programmers? None at all. Saying "no no, this time really listen to this study" seems to be having no effect.

There are a lot of causes for this, not least of all the things you mention (nobody cares about science, people like or dislike based on non-scientific evidence).

But it's also because people know that all these studies are flawed. As much as I hate to be the "nitpicker" who takes apart studies, EVERY SINGLE STUDY on programming doesn't even come close to real-life scenarios. In fact, one of the few studies that people actually believe is the "some programmers are 10x better" study, and that was actually fairly well conducted - many students were given identical tasks, and a fairly large amount of time to do them.

But take a look at the Dynamic vs. Static argument. For years, the Dynamic-fans have been saying "Quicker to program, so it's better", while the Static-fans have been saying "Quicker to program, but harder to maintain, hard to use with large teams". So now we have a study that doesn't even come close to addressing most of the issues that have been argued for years! Of course this isn't going to convince anyone.


"In fact, one of the few studies that people actually believe is the "some programmers are 10x better" study, and that was actually fairly well conducted - many students were given identical tasks, and a fairly large amount of time to do them."

Can you by chance point me to this paper? I'd like to add it to my paper collection, since most of the studies I've seen concerning programmer variability use members of the workforce. I'm not aware of the one involving students, but such replication studies are easy to miss.

Thanks!



I'm not sure how that link helps me. I've already seen many of those studies. Which one satisfies, "students were given identical tasks, and a fairly large amount of time to do them"?

That is the specific study I am searching for to add to my list of papers. Did you give me this link because you were referring to Humphrey (A Discipline for Software Engineering), or something else? I can track down Humphrey, but it will take me a few days, since it's a physical book.


My apologies. I misread what you were looking for. Mostly by not actually reading what you wrote. :(

Perhaps the previous author was thinking of the Prechelt "An Empirical Comparison of .." paper? http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprtTR.pdf . Section 5.7 has "work times" for Java and C/C++ programmers using well-observed times. However, that is not for a "fairly large amount of time."


There is no such study. It's a well-regarded and popular myth.

Laurent Bossavit does a masterful job of researching the origins of this myth (and others) in his new book The Leprechauns of Software Engineering, available on LeanPub [1].

[1] http://leanpub.com/leprechauns


Not at all. That study was not well conducted: it compared people just learning to program with people who already had experience, in a time when programming was much different from what it is today... Just go read the damn paper...


I don't know how science can help you here. In the case of the improved fan design, well that's easily testable.

However, I don't want to go down the rabbit hole and argue social vs natural sciences.

I find it dumbfounding that in this day and age, people can still create rather arbitrary social experiments with only 49 people and then think they can draw grandiose conclusions from their "data".


Unfortunately, user studies are difficult to conduct in computer science mainly because it is difficult to get users that you can study. Like it or not, 49 people is actually a pretty big study relative to what is out there. For better or for worse, I have seen much smaller studies accepted/published by top tier conferences.


Well, in this study there are no grandiose conclusions.


Emphatic upvote.

Software development is so complicated that there is always an endless supply of objections to fire at any study at odds with one's beliefs. And that is exactly how all these discussions go. All we're doing is repeating shibboleths.

The most interesting studies would be ones that changed somebody's beliefs. That doesn't happen very often in our field. Does it ever?


No study can control all the variables enough to convince a fanboy that his favorite language isn't the greatest ever. However, the current state of PL research is as close to astrology as you can get. Why are we working on type systems and modules and concurrency primitives et al. without a scrap of evidence that any of it contributes to programmer productivity? There's no science there.

In fact, everyone here should flip this around and ask: can you design a practical experiment to compare productivity of dynamic vs. static languages? Will others find it convincing? Probably not. PL advocates are no different than religious missionaries. They have no objective proof of any of their claims. And both will murder the natives if they don't convert.

edit: Fowler's view here http://martinfowler.com/bliki/CannotMeasureProductivity.html


PL research is not necessarily for "contributing to programmer productivity." Formalization of various programming language concepts into a type system[0] allows researchers to apply analysis and verification techniques. The simplest example I can think of is the Maybe type. Instead of having null pointers and, if you forget to check for NULL, getting a runtime error (segfault, NullPointerException, whatever), instead you would fail to compile since the "null-like" pointer would be a Maybe type, not the concrete value. You can't use it without unwrapping it. Concretely, the code:

    int foo(int *p) {
        return *p + 17; //well, probably something more complicated.
    }
If p is NULL, this code invokes undefined behavior. In a language with a more expressive type system, we would write:

    foo :: Maybe Int -> Maybe Int
    foo p = case p of
              Just x  -> Just (x + 17)
              Nothing -> Nothing
(Note: there are more succinct ways to write this example in Haskell, but I'm trying to illustrate the code.)

In this case, we have demonstrably covered every case. That's what an expressive type system gets you: you can completely preclude certain classes of bugs like null pointer dereferences by having a sufficiently expressive type system. It's the same in more relevant PL research: people are attempting to formalize systems so that whole classes of bugs can be removed at compile time. It's not about user case studies or anything like that: those can come later, when features look like they'd be useful to integrate into languages.
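As noted above, there are more succinct ways to write this in Haskell; one such spelling (keeping the hypothetical name foo from the example) uses the fact that Maybe is a Functor:

```haskell
-- Equivalent to the case-expression version: fmap applies the function
-- inside a Just and leaves Nothing untouched.
foo :: Maybe Int -> Maybe Int
foo = fmap (+ 17)

main :: IO ()
main = do
  print (foo (Just 5)) -- Just 22
  print (foo Nothing)  -- Nothing
```

Either way, the compiler forces the Nothing case to be handled (or explicitly propagated), which is the point of the example.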

It looks like the featured article gives a case study about a pretty bad type system. A sufficiently expressive type system doesn't get in the way--it aids the programmer rather than hindering her. Heck, in Haskell and ML, you don't even have to write down types--the compiler will infer them for you. (It is Haskell practice to type-annotate toplevel functions anyway.)

[0] When I say type system, I mean a static type system. For the purposes of this discussion, dynamically typed programs are statically typed, just with not-very-useful types.
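To make that footnote concrete, here is a toy sketch (all names are hypothetical, not from any real implementation) of the "unityped" view: every value in a dynamic language inhabits one big sum type, and every operation dispatches on the tag at runtime.

```haskell
-- A single "universal" type standing in for a dynamic language's values.
data Dyn = DInt Int | DStr String | DNil
  deriving (Eq, Show)

-- Addition in the modeled dynamic language: the type check happens at
-- runtime, and a mismatch yields a runtime error value, not a compile error.
plus :: Dyn -> Dyn -> Dyn
plus (DInt a) (DInt b) = DInt (a + b)
plus _        _        = DNil

main :: IO ()
main = do
  print (plus (DInt 1) (DInt 2))   -- DInt 3
  print (plus (DInt 1) (DStr "x")) -- DNil
```

From the static point of view, every expression here has the same not-very-useful type Dyn, which is exactly the footnote's claim.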


You say: "when features look like they'd be useful to integrate into languages". How do you know a feature would be "useful" without compelling evidence that this is really a problem for programmers? Now you're back to the original problem of figuring out what the most significant problems are for programmers. Wouldn't it be better to figure this out BEFORE PL people plunge into a particular topic?

Remember all the research done on typestates? The motivation section of those papers was usually a few paragraphs of total BS. AFAIK, there was no real data nor experiment that demonstrated this was a real problem for professional programmers.

FYI: I like static type systems and have used Haskell et al. But no one has demonstrated that they are better than even Visual Basic!


Not really. The research, again, isn't in programmer usability. It's in formal logic and analysis of the semantics of computer programs. Benefits for programmers are just a side-benefit. The goal is advancing humanity's understanding of computer science, not in helping programmers, although sometimes the two goals are somewhat linked and deeper understanding occasionally yields industry benefits. In my example, and in many instances of expressive type systems, research has yielded tangible benefits for industry.


Well, one thought that comes to mind is that the topic is essentially a subset of ergonomics, so perhaps experimental protocols should take a few more cues from ergonomics research.

For example, controlling variables has to be done by constructing artificial systems from the ground up. You can't just pull two commercial products off the shelf and then pretend you're examining the impact of only one of the hundred different things that differ between the two.

Similarly, if we wanted to compare the impact of dynamic vs. static typing, we'd have to make sure that that is the only variable. Which means you basically have to construct a new programming language from the ground up, so that you can easily create new dialects of it that differ in only one very specific characteristic.


Did you even read the linked paper? They did exactly what you described.


As the linked paper points out in the very first sentence of the section describing the language, this hasn't been a very popular approach so far.

I suppose I should have nodded to that. Got me there.


The problem is that there are so many soft factors affecting the outcome that it's very hard to tell whether even a great scientific study will apply to your case. The number of people on your team, their experience with the technology being used, their relationships with each other, the politics of the workplace, and thousands of other factors may have a much bigger effect on productivity and quality than the choice of a typing system, and it may be pretty impossible to control for them in the study. I agree that it's good to approach things scientifically, but at the same time it's also good to question how much science applies to the problems being solved.


Agreed with your overall premise, but this particular study doesn't meet the criteria to have any conclusions drawn from its results.

The number of programmers is too small and their skills not representative (49 undergrad programmers). The problem (writing a simple scanner/parser) isn't one that will really benefit from a decent type system. All you need are ints, strings, and arrays and you're good to go.


Reasoning from basic principles is a valuable tool for evaluating conclusions. Sure, it has strong weaknesses (hidden assumptions which are wrong, insufficient imagination about what could happen, rationalizing one's biases). But it is useful when you don't have complete trust in the quality or the scope of the experiment. For instance, claims about quantum computing solving NP-complete problems are legitimately held in doubt for theoretical reasons. Also, whenever there are short-term positives hiding a long-term negative, as in unsustainable financial or ecological behavior, the negatives might only be seen by a chain of reasoning and not by direct experiments.

I agree in the sense that we see so many cases in the other direction - reasoning full of holes being trusted over empirics. The interesting thing is in any given situation, how much trust to give to the different tools that we have to evaluate a claim.


> We need more science in our computer science, which means more experiments and more results like this.

There is an ongoing experiment wrt the usability of programming languages, libraries, frameworks, concepts, ... It's called the market.


Counterexample: JavaScript. It is the best demonstration I know of that our choice of languages and tools is mostly based on historical accident rather than any technical criteria.


The market is testing for many more features than just usability, and as such is rather useless if you want to test a single thing.


What are the market's conclusions regarding dynamic and static typing?


I second that. It's one of the reasons I quit my day job to do a Master's degree... though I chose a naive enough college and course that I am now back at a programming job where I am surrounded by the same type of crowd again.


I'm curious what your Master's is in and what "type" you're surrounded by.


Oh, sorry for that snarky reply (tired, I guess). I have a master's in cognitive science (from India) and am currently a Python programmer here, mainly because there aren't that many jobs (in cog. sci.) here and I was not focused enough to get published in a journal during my master's. As for types, well, one example is a situation where I was trying to defend my choice of a Dvorak keyboard on usability grounds and got a dismissive, snorty laugh as a response.



