Off topic

> 3 months later, reading a paper while on board a boring flight home, I have my answer.

I noticed people from hacker news routinely read scientific papers. This is a habit I envy but don't share.

Any tips or sites for someone interested in picking up more science papers to read.



For just getting started I recommend collections:

1. Ideas That Created The Future[1]. It's a collection of fiftyish classic CS papers, with some commentary.

2. Wikipedia's list[2].

3. Test of Time awards[3]. These are papers that have been around for a while and people still think are important.

4. Best paper awards[4]. Less useful than ToT as not every best paper is actually that good or important, and sometimes the award committees can't see past names or brands for novel research.

5. Survey Journals[5]. Students often get their research started with a literature review and some go the extra step to collect dozens of papers into a summary paper. I subscribe to the RSS feed for that one, and usually one or two are interesting enough to read.

6. Citation mining -- As you read all these, consider their citation list as potential new reading material, or if an old paper leaves you wanting more, use Google Scholar to find papers that cited what you just read.

[1]: https://www.amazon.com/Ideas-That-Created-Future-Computer/dp...

[2]: https://en.wikipedia.org/wiki/List_of_important_publications...

[3]: https://www.usenix.org/conferences/test-of-time-awards

[4]: https://jeffhuang.com/best_paper_awards/

[5]: https://dl.acm.org/journal/csur
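A rough sketch of the citation mining in item 6, assuming you have already noted down which papers each paper cites. The `citations` mapping, the paper names, and the `reading_queue` helper are all made up for illustration; nothing here talks to Google Scholar.

```python
from collections import deque


def reading_queue(citations, start, max_depth=2):
    """Breadth-first walk over a hand-built citation map.

    citations: dict mapping a paper title to the titles it cites (hypothetical data).
    Returns the papers reachable from `start`, nearest citations first.
    """
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        paper, depth = queue.popleft()
        if depth > 0:
            order.append(paper)  # the starting paper itself is already read
        if depth == max_depth:
            continue  # do not follow citations beyond the depth limit
        for cited in citations.get(paper, []):
            if cited not in seen:
                seen.add(cited)
                queue.append((cited, depth + 1))
    return order


# Hypothetical example: paper A cites B and C, B cites C and D, D cites E.
citations = {"A": ["B", "C"], "B": ["C", "D"], "D": ["E"]}
print(reading_queue(citations, "A"))
```

Breadth-first order with a depth limit keeps the queue close to what you just read, which is one way to avoid the queue growing faster than you can read it.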


I'd like to disagree with this. In particular, about [1]: It is a collection of papers in many different topics. There is little technical overlap between Alan Turing's Entscheidungsproblem paper, for instance, and Hoare's paper on axiomatic semantics. Also, the papers are all from the 70s. They're uniformly influential papers, and have shaped the field, but the fields and the vernacular used by working researchers are very different. At best, the papers approximate a four year undergrad curriculum in CS, and at worst, are a recipe to get distracted and overwhelmed. The link to Wikipedia [2] is somewhat better in that the papers appear to be more modern, but suffers even more from the problem of diversity.

A somewhat similar problem arises with test-of-time and best paper awards. To elaborate on my complaint, imagine the exaggerated case of someone trying to understand modern science by intensely focusing on the work of researchers who won the Nobel Prize. Clearly all very important work, but understanding the 1990 Physics Nobel Prize (on electron-proton scattering) is of no use to understanding the work for which 1991 Nobel was awarded (complex systems and polymers).

There are two things that (I'm assuming the OP's field of interest is computing) a CS education provides: At the undergrad and in the early stages of grad school, breadth of topics, and their modern synthesis. You don't spend much time reading papers (at least in an undergraduate education), but you understand the basics, and get a feel for the problems considered and the sensibilities of researchers. In an intermediate-level graduate seminar, you pick a narrow topic, and focus on papers in that topic. The first papers in the area (like Dijkstra's papers on distributed computing), the best / most important papers in the area, and the latest papers on topical interests (like Merkle trees and blockchains). There is thematic and technical continuity from one paper to the next, and you start to understand the story being told. Then, late in graduate school, and in the rest of one's professional career, one starts reviewing papers that haven't even been published. At this point, you see the story being written: the steps and the missteps, and the memorable and not-so-memorable papers in a field. To truly understand a field, one needs to read not just the great papers, but also the middling ones.

And one needs to concentrate on a topic. The thing about a forum such as HackerNews is that for every topic of interest, there's likely a person here who's an expert in the area, but it is easy to confuse that observation with the much stronger claim that there's a person here who's an expert on every topic. The last of those people died in the mid-20th century, if they ever existed.


I feel like you're giving advice on how to become a PhD student, and frankly, that's not the point of the question, and if it is: any grad student who can't read papers should ask their advisor for advice.

So I take OP's perspective to be from a practitioner (such as myself). Apart from my colleagues in R&D, we aren't called upon to write new papers that demand expertise in ever-increasing narrowness. Instead we are to solve the needs of the product, usually regardless of specific expertise. So we need to be more broadly equipped, as it's typically better to have a hammer and a screwdriver in the toolbox than ten different screwdriver bits of varying niche application.

As an example, the TF-IDF paper curated in [1] has been broadly useful as a log analysis tool to surface interesting log lines and remove the mundane common "error" logs. There have been many advancements since then, using Bayesian techniques or deep learning, but this one is simple enough and cheap enough to deploy.
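To make the log analysis idea concrete, here is a minimal sketch of TF-IDF scoring over log lines, treating each line as a "document". The tokenizer, the `score_lines` helper, and the sample log lines are illustrative assumptions, not the setup described above; a real deployment would probably need a smarter tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(line):
    # Keep lowercase alphabetic words only, so timestamps and ids do not dominate.
    return re.findall(r"[a-z]+", line.lower())


def score_lines(lines):
    """Return (score, line) pairs, most unusual line first."""
    docs = [tokenize(line) for line in lines]
    n = len(docs)
    # Document frequency: the number of lines each token appears in.
    df = Counter(tok for doc in docs for tok in set(doc))
    scored = []
    for line, doc in zip(lines, docs):
        if not doc:
            scored.append((0.0, line))
            continue
        tf = Counter(doc)
        # Sum of tf * idf over the line's tokens; a token present in every line has idf 0.
        score = sum((count / len(doc)) * math.log(n / df[tok]) for tok, count in tf.items())
        scored.append((score, line))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up sample: three copies of a routine error and one rare one.
logs = [
    "ERROR connection reset by peer",
    "ERROR connection reset by peer",
    "ERROR connection reset by peer",
    "ERROR disk quota exceeded on volume data",
]
for score, line in score_lines(logs):
    print(f"{score:.3f}  {line}")
```

In this sample the disk quota line ranks first: "error" appears in every line and so contributes nothing, while the rare words carry the score.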


Old ideas that were good but didn't become common/standard are something I run across a fair bit in papers and yeah, they're often way behind the state of the art but also a lot easier for me to understand/implement and far better than the relatively naive approach I'd've taken otherwise.


From there, just keep a reading queue. If you notice a particular journal is a good source of material, consider subscribing to it.


> I noticed people from hacker news routinely read scientific papers.

Do they? I suspect that most don't, and those that do are either in specialized careers or are engaged in some kind of scientific research.

Some interesting research gets disseminated via Twitter and chatrooms. Or maybe you follow a podcast that mentions new research. But you might also be following new publications from a handful of reputable journals, or following an Arxiv category, or looking through new conference papers. It's very easy to get overwhelmed with new research to read, and not knowing what's worth your time, unless you're already very familiar with the field and well-versed in the material.


Long time HN'er college dropout and I read a LOT of scientific papers. Probably an average of 4 a week over the past couple of decades, sometimes reading 40 in a week.

I probably averaged 20 a week back in March when open source AI was booming in the wake of Llama and on the heels of GPT-4.


> Long time HN'er college dropout and I read a LOT of scientific papers. Probably an average of 4 a week over the past couple of decades, sometimes reading 40 in a week.

I'm guessing that you don't actually dive into each paper to 100% understand it? I find it takes me at least 10 hours of reading/looking things up per paper before I could consider that I fully understand it. But that would mean, if I want to do 4 papers per week, I'd spend at least 40 hours/week, that's like a full-time job, so obviously I don't have time for that.

How much time would you estimate it takes you to read through one paper? And how much of the content would you estimate gets retained and can be recalled when you wish?


> I'm guessing that you don't actually dive into each paper to 100% understand it

Depends on the paper's content but there's often sections that you don't need to 100% understand to get value. For example, in survey papers, there's typically a section that is basically "what queries we typed in at the library." I skip those and I think you can too =)

For practical papers, sometimes the evaluation can be skimmed. Authors' benchmarks are usually designed to be the most favorable to the paper's novel approach, so I don't spend too much time thinking about them.

Similarly, Related Work sections can be skimmed. If you're well read in the field, you probably won't learn anything from it, and if you're entirely unread, "it's like X but different because Y" isn't helpful as you have no idea what X is, beyond the one dense sentence the paper just gave you.

> And how much of the content would you estimate gets retained and can be recalled when you wish?

If I really want to remember a paper, it goes into Anki flashcards. This is rare, personally. Usually only for tech I support in prod.


How much I understand, and how long it takes to get there, depends a lot on how well-read I already am in a field.

I can read and fully understand an ML paper in an hour or so. But 6 months ago it took me a day to get through a couple of ML papers and I did not fully grok the mechanics of things like attention heads.

I'm more read in material science, chemistry, pharmacology, and cognitive science. Computer science (especially quantum computing, networking, and cryptography), photonics, and pure math are also big areas of interest for me.

Anything outside of that wheelhouse will take longer and I'll initially understand less, depending on how distant it is from my stronger subjects.


That's quite a range. How do you manage the signal-to-noise ratio? Normally that requires significant familiarity with the field, or a very specific query in mind. For example I only read papers in medicine when I'm researching an actual medical issue that I or someone else is having.


I follow a lot of highly respected researchers and (research minded) operators in the fields I'm interested in. Very often they post about papers of interest on X/twitter or their personal blogs. I also follow a handful of science communicators on YouTube who do short summary videos of papers of interest (Two-Minute-Papers, Anton Petrov, Sabine Hossenfelder, to name a few).

Other times I notice a general trend (ex. increasing discussion of a new paradigm X, more startups raising to work on Y, or a large chunk of talks at an annual conference being variations of Z).

Then I ask the aforementioned academics and operators in my circle what papers I should read to get a handle on XYZ and/or simply follow the citations.

Given the amount of followers a lot of these researchers, operators, and science communicators have, I do not think I'm remotely unique in my efforts.


You don't typically need to pore over the paper and absorb every detail. Usually you can skim a little and backtrack if you missed something.


I strongly agree.

Once upon a time, I was in condensed matter physics. I was (and remain) interested in a very specific niche within that, and I read a small handful of the papers that were published each week. I’m not actively researching or publishing anymore so I cap this to one or two per month now, and mostly scan over them to see if anything piques my interest.

I was still interested in condensed matter as a whole, at the time, and attended group seminars once a month to see what other people were currently excited about - there wasn’t any hope of me reading a cross section of all condensed matter papers because there is far more published per week than I’d be physically able to even glimpse at, and most of it is stuff I don’t understand or particularly care about.

I was likewise interested in physics as a whole, and twice a year I’d attend a departmental seminar and see what people in the entire department were interested in. Most was far over my head, but it still directed me to a small handful of papers that I’d read for the hell of it. Of course, I couldn’t do this without first hearing people review the research. There’s far more published per day in physics as a whole than I could read in a year, and most of it I’d find unrelatable and uninteresting.

I guess where I’m going with this is that anyone with a specific interest is already reading papers. It’s their job. Anyone with a general interest would find actively pursuing paper hunting to be a waste of time with a ridiculously bad signal to noise ratio. Instead, they should use channels that align closely with their own interests, through which they can get recommendations to read papers from the aforementioned specialists who have already filtered out much of the noise themselves. At that point, they should actually read the resulting papers.

There is another trick, though, and that’s to find an individual who publishes two unrelated pieces of work that you find interesting, then read their work and maybe those of their coauthors. Be careful, though, because this is a slippery slope to specialising, after which you’ll find yourself back at the point where you aren’t following 99.9% of the stuff you wanted to follow in the first place.


I typically look up and read a paper when it's referenced in discussion or cited in something else I'm reading/watching, and the purported contents seem surprising to me. This normally happens 3 or 4 times a week.

Honestly many papers are written in a way that's hard to approach and difficult to understand unless you're prepared to reread them a few times.

You're better off just getting your science news from actual science communicators and not the raw source.


> I noticed people from hacker news routinely read scientific papers.

Highly doubt that. It’s very hard to actually read scientific papers when you are not actively doing research.

You can’t just read a research paper in isolation. It’s next to useless. You need to understand its context, where it stands with regard to its sources and what it brings which is actually new and valuable. It’s nearly impossible to do properly if you are not fully immersed in a research subject.

I don’t even know how you would skim introductions and sources to filter out articles which are immediately obviously useless without being immersed in a field.

I guess you can obviously go through lists of papers which have been deemed worthwhile by someone else or won prizes. That solves the filtering issue but then nearly every time you will be better served reading a text book presenting the ideas in said papers.

I fully expect the HN readership to contain a significant number of students and actual researchers, which explains why you encounter people reading papers, but these people aside I would be surprised if the habit is common.


You don't need to be doing research to read an ML paper. With some general knowledge in AI you should be able to understand most papers.

And even then, sometimes you don't understand or care about their procedures, and you just want to look at the pretty results (check out this song they generated using AI!). There's even a very popular YouTube channel that focuses on this (two minute papers).

Finally, you usually hear about these cool papers via Twitter / X


> You don't need to be doing research to read an ML paper. With some general knowledge in AI you should be able to understand most papers.

I have a degree which involved reading some ML papers and I seriously doubt that. The field is flooded with papers which look good when you quickly read them but are actually worthless because they misrepresent the state of the art or intentionally don’t compare their methods with other papers they should know.

> And even then, sometimes you don't understand or care about their procedures, and you just want to look at the pretty results

That’s fair but I wouldn’t call that reading a scientific paper.


Don't read them for the sake of reading them. Read them to solve your current problem or trying to keep up with advancements in a narrow field you love. Most papers (especially the ones in deep learning) seem to also have a mathematical fetish (to put it mildly) where needless representations are used where none are required and are self evident (for example inputs belong to Real number set). It ends up making the paper pseudo complex and unapproachable. Most papers are doing average/summation/series operations but instead of just saying so, use the symbols all over the place. So even if a few papers appear tough, keep reading them and digest your first paper thoroughly. You will find subsequent papers mostly are a rehash of existing work with similar fetish to make trial and error appear like mathematically sound research. Once in a while, you would find some paper which is fully theoretical and try to prove that either the inputs/outputs/components of models have certain well known mathematical properties and hence can be reasoned similarly. These are rare and would be difficult to parse through.

PS: Best papers I have seen are from deepmind where the approaches usually described are novel, varied and path breaking. Worst ones are - well no names but those that just use training and eval sets generated by GPT4 and try to prove things empirically


> Most papers (especially the ones in deep learning) seem to also have a mathematical fetish (to put it mildly) where needless representations are used where none are required and are self evident (for example inputs belong to Real number set). It ends up making the paper pseudo complex and unapproachable.

I completely disagree with that. Spelling out math is literally something out of 12th century. It just hinders understanding, if you have basic STEM-level math literacy, which anyone who reads an ML paper is implied to have (how could you seriously study linear algebra and calculus without it?).

Math may actually be the first thing you recognise in a paper, which can help you cross-reference the text to understand it.


Build the habit.

When google doesn't return a good result to a specific question, switch to scholar.google.com and start reading abstracts. Everything may seem like an opaque maze at first, but just keep reading and patterns start emerging quickly and become useful.


I don't mind reading research papers, but they're really annoying to read on a phone screen. I remember a few years ago, an HN comment shared a link to some tool that could convert a PDF to single column text and make it more readable on a phone screen, but I can't find it. Anyone remember this or have the link?


I use an android (and iOS I think) app called Xodo. The "reader mode" re-flows the PDF into a screen-width single column like an e-book. The latest update really buried the option in the menus, but it's there somewhere and works pretty well.


> but they're really annoying to read on a phone screen.

+1. I've already read probably 100 research papers this year in search of solutions to some technical problems, mostly while lying in bed with a tablet. I wouldn't read as much without it.


Once phones got relatively big (i.e. 'phablet' ceased to exist as a concept because that size was just 'phone' now) I switched to using a 7/8" tablet with my SIM in it as my primary portable device (Nexus 7 and now Galaxy Tab A6).

Means I have to carry it in my jacket pocket or a side pocket on my combats but the bigger phones weren't comfortable in my trousers' top pocket anyway so for me at least the trade-off is well worth it.


How big is your phone screen and what are you using to read it? A few inches makes a lot of difference. In landscape mode my phone is 6.5" wide, and I read PDFs with moonreader in full screen because it's wide enough to read without having to reformat anything. You can also click on figures to view only that figure.

If that isn't enough you might consider a tablet or e-reader instead of trying so hard to make existing options work.

You CAN convert to something like epub which is trivially reflowed and this is just fine for reading fiction but just isn't as pleasant and nicely formatted as a pdf.


The software KOReader [1] has a PDF reflow setting which you can try.

[1] http://koreader.rocks/


Check out the papers and talks from Papers We Love, a "repository of academic computer science papers and a community who loves reading them":

https://paperswelove.org/


It depends on why you want to read papers and what you want to get out of it.

https://news.ycombinator.com/item?id=37006967 suggested some avenues for finding some classic papers. The follow-up https://news.ycombinator.com/item?id=37007360 pointed out some circumstances where that's not ideal. But in the process, it implicitly assumes that you want to become familiar with current research, instead of just enjoying classic papers for some other motivation.

I mostly read papers in mathematics and computer science. For other disciplines I mostly rely on pop science, like Slate Star Codex or Money Stuff and blogs. There's also The Monad Reader (https://wiki.haskell.org/The_Monad.Reader) if you are interested in functional programming.

There's various blogs with interesting articles. Eg Vitalik Buterin has great stuff, like https://vitalik.ca/general/2017/11/09/starks_part_1.html and he links to the original papers. (I have no conclusive opinions on whether crypto-currencies are useful or good for the real world, but I do find the math behind some of them endlessly fascinating. Especially zero-knowledge proofs.)

Wikipedia is also often a good starting point. Whenever you read about a random topic, Wikipedia usually has an article that comes with plenty of references. Eg https://en.wikipedia.org/wiki/Forth_Bridge#References links to http://www.bath.ac.uk/ace/uploads/StudentProjects/Bridgeconf... and down the rabbit hole you go.

https://gwern.net/ also has great write-ups and links to original papers.


Honestly a lot are really hard to read. You start with the easy ones, learn the lingo, and then just keep going. Eventually you can enjoy reading the harder ones.

You learn pretty quickly that if you want answers, it's better to just go straight to the source, rather than have it filtered through someone else, where the message can (and often does) get twisted.

What are you interested in reading about? Maybe some people can recommend you some papers to start with.


There are certainly easier and harder papers. Though when you are struggling: keep in mind that there are also papers that are just badly written (and some papers that are well written).


> I noticed people from hacker news routinely read scientific papers. This is a habit I envy but don't share.

> Any tips or sites for someone interested in picking up more science papers to read.

Personally, the older I get, the more bored I've been getting with the level of information that "crosses my desk".

Eventually I basically stopped reading blogs et al and started getting my insights from books. Those books would often mention papers. Then I noticed a lot of books (and deep well-researched podcasts) mentioning the same papers. So I started reading those papers.

When you read a couple papers, you notice most of them reference a bunch of other papers. Now you have an exponentially growing queue of interesting papers that you'll never get to. Mission accomplished.

The main trick is to read stuff you're interested in knowing and understanding. Many papers can be quite difficult to read, but getting through a single paper will fuel your brain with more valuable information than 2 weeks of "the internet". In my experience at least.

Ultimately, life is short and papers give you a better information density return on your time than almost anything else. Even the bad ones.


For computer science, https://blog.acolyer.org/ is called The Morning Paper and talks about one interesting paper per post.

Edit: It seems to've gone on indefinite hiatus but there's a lot of backlog already there and some of it's really quite fascinating.


There are some materials about "how to read a scientific paper", like the PDF from U Waterloo [3] with some methodological advice. There's also some good advice in this old HN thread [1].

But I don't see the point of reading a scientific paper unless you're actually curious about a specific topic. They are often hard to read, dense, and have so much field-specific jargon that if you're new, you won't be able to read one paper and grasp everything. You would have to read references, or a book/blog that summarizes the core points.

So find a specific field you're interested in, find a good book/blog/homepage/tutorial/video to get your basics going so that when you start reading papers you won't be completely lost.

Then find a highly cited survey paper to understand what progress has been made beyond what is now basic. Then you can follow your curiosity along that survey and decide on a branch of research to read up on. You'll probably then realize that a few labs research/publish a lot in a specific direction. Now you can follow those professors (Twitter, Google Scholar email notifications) to keep up to date. By reading a lot you'll also start to notice papers that are "published just to get my PhD" and soon enough you can just read abstract + intro/result to judge if it is valuable or not.

If ML/LLM is your curiosity, Lilian Weng's blog [2] is probably a good start for tutorials / surveys.

[1] https://news.ycombinator.com/item?id=24986727

[2] https://lilianweng.github.io/

Edit: direct link [3] https://web.stanford.edu/class/ee384m/Handouts/HowtoReadPape...


For me it's very helpful to print out papers and read them with a pen in hand, away from my computer. Papers tend to be dense and require a level of focus that (I at least) cannot maintain when reading on a screen. It helps as well to be able to easily take notes and annotate the paper.


Pick ones that are easy to read. Some are written like a magazine article. Others are math dense, reference another paper you can’t get hold of every other sentence and are a kind of marketing material anyway.

Also youtube and code: Attention is all you need is not a nice paper to read for Joe programmer, but you can understand what it is doing by watching karpathy and reading his code (or someone else who has implemented it, Llama for example). But you need to do some basic torch training first (karpathy again!)


Anyone can read scientific papers. All you need to do is pierce the layer of jargon. It takes practice but you kind of just pick it up. Reading on a computer helps because you can get words defined by clicking on them. Reading on paper is good too, it’s easier to keep at it and it sticks better.

Some sense of urgency helps. Most people will have a medical ailment or physiological issue of some sort. I promise you that there exist useful papers on it.


Once you obtain subject mastery you just need to read the abstracts.

To get a cold start, look for “survey”, “literature review”, or “systematization of knowledge” papers. Those organize a lot of papers; check out the ones that look cool and read the abstracts.

Rinse and repeat for five years and you get a phd.


Build the habit.

When google doesn't return a good result to a specific question, switch to scholar.google.com and start reading abstracts. It'll seem like an opaque maze at first, but just keep reading and it'll start clearing up pretty quickly and become useful.


Don’t feel like you need to understand 100%. You can always give yourself an hour to read a paper and gloss over some notation. If you read 5 papers over the course of a month, you can go back to your favorite and dive into the notation.


Feedly with keywords for your favorite topics or researchers works decently.

I imagine this routine comes from people with research backgrounds, where browsing papers is the academic way of googling around for answers.


I usually just read the abstract and synthesize that with the comments on HN to get the gist (and legit-ness) of the research.


They read scientific papers in the same way that everyone "read" Capital in the 21st Century, when that was a thing.


Read textbooks instead. Most papers are obtuse and poorly written, even famous ones. You can find them in Wikipedia footnotes.


Step 1. Find papers you're interested in

Step 2. Open them

Step 3. Read them


Step 4: do a depth-first lookup of every citation, and read/finish that paper before continuing


Step 4. Get lost within a minute.


Step 3.5, see some other interesting paper is referenced in the related work, go to step 1.


Step 3.5-turbo, have ChatGPT summarize papers for you to speed up your reading


LlaMAo :)


Semantic Scholar for search. Scihub for any paywalled papers. Libgen for books. Zotero to organize.


Do you like Semantic Scholar more than Google Scholar, and if so, why?


Pick something you’re interested in and have a passing knowledge of.


just set up a desktop service to randomly open a paper once every few hours

if they're not too boring, and you're not doing anything important, you'll read it for fun
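Something like this could be a starting point, assuming your papers sit as PDFs in a single folder. The `~/papers` path, the function names, and the three hour interval are placeholders, and which application ends up opening the file depends on your platform.

```python
import random
import time
import webbrowser
from pathlib import Path


def pick_random_paper(folder, rng=random):
    """Return a random PDF path from folder, or None if the folder has no PDFs."""
    papers = sorted(Path(folder).expanduser().glob("*.pdf"))
    return rng.choice(papers) if papers else None


def open_papers_forever(folder, interval_hours=3.0):
    """Open one random paper, sleep for the interval, and repeat until interrupted."""
    while True:
        paper = pick_random_paper(folder)
        if paper is not None:
            # Hand the file to the system's default handler via a file:// URI.
            webbrowser.open(paper.resolve().as_uri())
        time.sleep(interval_hours * 3600)
```

Calling `open_papers_forever("~/papers")` from a script started at login would approximate the "desktop service"; a cron job or systemd timer that calls `pick_random_paper` once per run would be the more conventional route.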


I read the abstract and look at the pretty figures :)


I want to know what a non-boring flight would be like


High turbulence definitely makes it less boring. So will a crying baby, disruptive passenger, or someone getting sick. After a few of those, you'll prefer the boring flights.



Snakes on a plane


Air Force One


The Langoliers


Airplane!



