Hacker News
An Adversarial Review of “Adversarial Generation of Natural Language” (medium.com/yoav.goldberg)
119 points by sebg on July 2, 2017 | hide | past | favorite | 29 comments


It's worth noting this post caused a very intense Twitter debate among the NLP and DL communities, especially after Yann LeCun replied to Yoav's comments. https://www.facebook.com/yann.lecun/posts/10154498539442143


What are some Twitter links? I'd be interested to see the kind of nuanced conversations that Twitter can spawn. Also looking to follow some NLP people.



None of those tweets have the phrase you searched for, so why do you conclude that search is broken?


Several of the tweets link to the post, and the title shows up in Twitter's preview box (although not in the tweet proper). If Twitter search "worked" it should have found those, via some magic string munging or whatever.

My definition of "works" is https://en.wikipedia.org/wiki/DWIM, others are free to disagree.


This is an instance of the general issue of conflict along what I like to call the salesperson-slacker spectrum.

Most researchers/academics lie somewhere on this spectrum. (Well I guess most human beings involved in any activity probably).

On the one end are the salespeople, who love to make a mountain out of a molehill they just discovered. On the other are the slackers: perfectionists who never get anything done because they can't resolve their analysis paralysis.

There are very few who are exactly in the middle of the spectrum. The middle is a point of unstable equilibrium. You have to work very hard to stay there and can easily fall off to one side or the other.


I wouldn't say perfectionism and sales-orientedness are on the same dimension. However, I have noticed that being given the hard-sell is more often than not associated with the product being sub par. A product or project that is doing very good work often doesn't have to sell quite as hard, because the work speaks for itself. (My experience on this issue mostly deals with internal teams. I don't know how much this applies to external teams.)

This is especially true when two products have teams of about equal size, experience, and expertise. All other things being equal, every hour spent selling is one less hour spent working on making your product better.


> A product or project that is doing very good work often doesn't have to sell quite as hard, because the work speaks for itself.

This doesn't work in general, though.

> All other things being equal, every hour spent selling is one less hour spent working on making your product better.

Which, I believe, is a key to understanding why many (most?) of the things you can buy are utter crap, barely fit for the purpose they were made for (if at all). It explains why so many successful SaaS businesses offer barely functional products: every hour spent selling is an hour not spent working on the product, and marketing has a much better ROI than actually building something useful.


Thought experiment at the extremes: if you wait for your product to be literally perfect you will never sell anything.

And of course you can sell a product that doesn't even exist.

The dimension is confidence or maybe approval.


That's not a thought experiment, it's just a statement.


And with exactly zero selling you have the best product you can make but one that nobody knows exists.


True at theoretical extremes. I think the world would be a better place if the building/selling split was more like 99:1. But in a competitive environment, selling has a much better ROI than building, so here we are.


Is this just the Dunning-Kruger effect under a different name, maybe? People discover a small portion of someone else's field, learn a bit, and consider themselves well on the way to being experts without a clue how shallow their understanding is?


The Dunning-Kruger effect (DKE) may be part of the salesperson mentality, but I see it as more than just DKE.

You can exhibit DKE without any pressure. With research/academia,

- there is a lot of pressure to excel and show that your work is the best thing since sliced bread. Another popular phrase is 'publish or perish'.

- the audience for such research is burdened with information overload, which adds to the difficulty of having your voice heard, leading to a habit of taking shortcuts and trying to wow the audience when you really didn't do much.


Your model fails to admit the possibility of researchers who do better or worse work.


Model fixed ('every' changed to 'most').


Years ago almost all researchers in CS were genuinely trying to advance the topic, and they published rarely. There were hyper stars, but almost all of the community thought very carefully before publishing anything. The problem was that the internal ranking of contributors that the community had (and it was pretty clear) was opaque to outsiders and obviously biased: there were few women or people of colour, and age had a lot to do with your rank. On the other hand the set-up was "useful" in that most people (who were white and male) seemed to be engaged in actual and useful work, and it was pretty clear where the state of the art was and what you needed to read and grasp to understand it.

Right now many people are engaged in creating blizzards of papers, the mechanism of choice is the construction of factories for writing papers - an army of graduate students and post-docs - and the construction of communities of publication that are plausible enough to enable the extraction of funds from funding sources.

Is this "useful" well - there is still bias and discrimination; there are not enough women (where enough is proportionate to the representation of women in the general population and their desire to do this kind of thing) and there are not enough people of colour - but there are more, which, thank god, I think most people think is a good thing. Research in some parts of CS is going very fast now as well, which is great!

But, in HCI, Enterprise IT, Software Methodologies, things do not seem so good, and where things are going well, like in AI I wonder if this is coincidence (as in look, a thousand cores have arrived, I can do many things while waiting for my paper factory to make more papers).

And it is expensive, very expensive. Whereas old CS involved a crowd of poorly paid sports-jacket wearers, new CS involves a horde of hoodies. But the expense that worries me is the cognitive one: how to sort through the morass of chatter that conceals (intentionally, often, so as to enable the next seven or eight six-page koans of review-passing flim-flam) the things I need (TM).

So, some things are better in CS land. People who wouldn't have got a shout before are in with a shout (some of them), but instead of building an inclusive community of people really trying to do research, we have built a thing I hesitate to call a community: it is more inclusive, but it contains many people who are doing things which are not research at all. In fact, they are kind of anti-research.

And you can't get funding to do a field study, or publish a paper, without waving about some silly maths that means almost nothing. And if you do publish, you'll have to cough up £2k one way or another, and no one will read the bloody thing.


If you want to read more discussion on this topic (and this article), see the article "A computational linguistic farce in 3 acts" and its HN discussion: https://news.ycombinator.com/item?id=14532306


I'm not going to comment on that paper being published on arXiv, and I don't generally care about the NLP vs DL debate; I just wanted to say that those generated examples did indeed all look like rubbish to me.

As do most of Google Translate's outputs, even though I get the feeling that automatic translation of texts is now seen almost as a solved problem (it's not): all that Google Translate does is change some text from the original language into a second one, which is not a real language, just a language that is sometimes very close (grammatically and lexically) to one which the agent/user knows.

The idea is that we should try to look harder and have fairer judgements about the actual results and not get stuck on the methodologies.


> This post is also an ideological action w.r.t arxiv publishing

Anyone else thought that this was very weird? The author appears to be complaining about the fact that reputable people/labs can post a PDF on arXiv and be taken seriously. How is this avoidable? Without arXiv, they could just post the PDF on their website or anywhere else.

The "risk" associated to publishing crap on arXiv is the same as always: have people notice it's crap and get a bad reputation. I'm not sure what ideology has to do with it.


No, this is not what this is about at all.

We have reached a point where reviewers of reputable conferences will ask why a paper is not referencing unreviewed work that has been uploaded on arxiv a day before the conference deadline.

This is not hyperbole, as anyone who is submitting to ICLR or NIPS can confirm. Work of certain labs is taken as established authority as soon as it hits arxiv.


Indeed it's possible in theory to do flag planting without arxiv, for example by posting a timestamped technical report to an institutional repository, but it has never been a common thing. Probably because you don't get much of an audience, so people won't cite you unless you bug them in reviews. And if someone publishes the idea some months later in a paper, they will probably get credit as there's the reasonable assumption they just hadn't read your report.

On the other hand, arxiv has a huge audience, lately maybe even more than DL/NLP conferences, making the flag planting really effective, especially if you are from a prestigious group. So there is a real problem now with large, prestigious groups posting half-assed preliminary results in arxiv, and deterring more modest groups from working on the problem or bogging them down because they now need to compare themselves to the well-known arxiv approach, which often has serious reproducibility issues because it has not been peer-reviewed.

I'm not an arxiv hater, in fact I check arxiv every day and for the last year or so I have posted most of my papers in it. But the problem is real and something must be done. Not about arxiv which is just the messenger (and a good one), but about the flag-planting culture using it that has emerged in the field.


Thanks a lot for bothering to explain the problem. I have never seen this in my field (theoretical computer science), so I wasn't really aware of the problem. (More accurately: reviewers may ask why you do not compare yourself to arXiv preprints, often with indulgence if they are recent, but I have not seen this culture emerge of posting half-baked results to arXiv to claim priority.)

It appears that we agree that the problem is not with arXiv. Part of the problem is unsolvable: if prestigious groups can announce what they are working on and discourage other groups from working on the same problem, this may just be reasonable self-interest from the smaller groups and can hardly be avoided. As for reviewers asking for comparison to these works, I guess the problem lies with the reviewers: if an arXiv preprint has not been refereed and/or is hard to reproduce, it should be OK to say so in another paper and not be blamed for the lack of comparison.

In any case, this is an interesting problem, thanks again for making me aware of it.


>But the problem is real and something must be done

I am working on a social network for papers that addresses this problem. arXiv is too vulnerable to fake-news articles because it is an archaic social network. I also think https://openreview.net can be improved. You can check my project here: http://www.startcrowd.club. It does not have an anti-fake-news feature yet, but it is on the product roadmap.


It's very avoidable. This whole thing of taking non-peer-reviewed work seriously is very field-dependent, and even then only a decade or two old. In physics and computer science it seems to be common, but in biomedical research, for example, it is basically unheard of. People have been trying to get bioRxiv going, or to post biology papers on arXiv itself, but they have had very little influence, because people think, rightly or wrongly, that a non-peer-reviewed work must be flawed in some way.


> Anyone else thought that this was very weird? The author appears to be complaining about the fact that reputable people/labs can post a PDF on arXiv and be taken seriously. How is this avoidable? Without arXiv, they could just post the PDF on their website or anywhere else.

The issue is "flag planting" : arXiv is one way to do it; the same arguments hold true for other sources but arXiv is used as a canonical example for "flag planting".


For another point of comparison, has there been an "us vs. them" dynamic in the computer vision community when it comes to deep learning? After all, it seems like deep convolutional networks did sort of railroad their way through a ton of topics in that field. I used to do image analysis research, but I moved on before much of this came into play, so I lack any sort of inside scoop.


> Communities will naturally recognize contributions and give credit when credit is due. It's always happen that way.

"Let the market decide!"


The market needs signals, and Goldberg's point was that those signals are missing with arXiv, and that lab reputation alone is a poor signal. I am working on it; contact me for details.



