A README maturity model (github.com/lappleapple)
178 points by fanf2 on Oct 26, 2017 | 72 comments


> User testimonials and evidence of past performance in real development situations

Please, not in a README.

My problem with documentation remains: It's always out of date, and frequently contradictory. I'm trying to work with Kubernetes right now, and by this "maturity model", it's a Level 5. That doesn't actually provide any real help though, since k8s is moving so quickly that the tremendous volume of documentation, which makes it so attractive to managers and leaders, just can't keep up.

What arguments should I be providing to kubelet? Depends on whether you're reading the code, the admin guide, or the getting started from scratch guide. None of which, by the way, reference the version of k8s you're going back to.

The API docs generated from the code are a bit better, but they were written as part of the code, and the descriptions are incomplete and confusing without the context of the code (separate rant - that codebase is split between so many repos you'll go slightly mad trying to learn what is where if you're not already familiar with it).

In the end, we always end up going back to the code; even k8s documentation acknowledges this by constantly linking back to GitHub.


I just had an idea, but more in relation to documenting source code: what if there was a tool sort of like gofmt in that it has to be run in order for code to successfully compile, but instead of formatting, it actually checks if some or all of the variable names in functions or objects appear in some corresponding comment (like Python docstrings [0] or javadoc comments [1]).

I just came up with the idea and the first (bad?) heuristic now, and I didn't check to see if tools with these characteristics already exist. It would be bad initially [2] and it would be onerous for sure, but it would be a start.

[0] https://www.python.org/dev/peps/pep-0257/ [1] http://www.oracle.com/technetwork/java/javase/documentation/... [2] https://en.wikipedia.org/wiki/Colorless_green_ideas_sleep_fu...


In rustdoc, documentation examples are compiled and run whenever you run your tests. It's very helpful for this kind of thing.

https://doc.rust-lang.org/stable/rustdoc/documentation-tests...


Thanks for the example!

To be completely clear, this is what I meant:

Suppose this is the function we want to document (using an example modified from the documentation tests):

  fn foo() {
    let x = 5;
    let y = 6;
    println!("{}", x + y);
  }
Suppose we have a documentation requirement/heuristic that the comment must mention all the variables declared in the body of any given function. Then a comment like this would pass, because whatever compilation step or processor runs the check sees that `x` and `y` both appear in the comment.

  // Prints x + y
This would fail because `x` is not found:

  // Prints z + y
This would pass but is not useful/misleading/wrong.

  // y - x FTW!!!
This example is trivial and uses only the one simple heuristic, so it is both onerous and useless, but it would force the writer of the code to also write comments in accordance with said heuristic. In this particular case, if someone modifies a variable name, adds a new variable, or removes an existing one, then the compiler would force that individual to make the necessary change to that particular comment, such that the comment cannot become stale (but the heuristic is bad, so it could still be useless :) ).


I would argue that externally documenting declarations inside the function is not particularly helpful.


Just wanted to chime in and say that as a Go developer of almost 5 years now, running into Rust's compiled examples has been quite cool.

The cool thing about these is that you can actually (depending on the situation) make your examples your tests.


What makes them so different from godoc's examples?


Python has something similar:

https://docs.python.org/3/library/doctest.html

For software with a command-line interface, perhaps it would be useful to do some testing at that interface, perhaps using BATS:

https://github.com/sstephenson/bats

That is also automatically tested, but describes the actual user interface.


Nice! Python has doctests which serve a similar purpose:

https://docs.python.org/3.6/library/doctest.html
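
A minimal sketch of what that looks like, using only the standard `doctest` module. The `add` function is a made-up example; the point is that the `>>>` lines in the docstring are executed and compared against the output written below them.

```python
import doctest


def add(x, y):
    """Return the sum of x and y.

    >>> add(5, 6)
    11
    """
    return x + y


if __name__ == "__main__":
    # Runs every example found in this module's docstrings and reports
    # any whose actual output differs from the documented output.
    print(doctest.testmod())
```

If someone changes `add` without updating the example, the run reports a failure, so the documented example cannot silently drift from the code.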


In the Python world, about the best you can do is enforce the existence of API documentation. Most Python linters can check that every module/class/function/method has a docstring, though of course they can't enforce that it's up-to-date or useful to a human.

Enforcing useful up-to-date documentation requires cultural setup work rather than technological setup work.


What about documentation, even "how-to use this product" documentation, that's generated from the code's comments directly?

So say you have a function that essentially implements the "--file" flag: what if the developer put a comment above this function explaining the function's purpose but also the documentation for the admin's use case, which would be setting up the solution?

I believe that if such documentation existed it would be a lot easier to update it when the code is changed (provided the API changes) and have it propagate through to the docs. The comments could also be three-tiered: API docs, admin docs, user docs; each would explain to the relevant user how to make use of the code at that point.

Obviously you could omit user/admin comments for functions that aren't "public facing", such as internal helper functions. But anything that's directly interacted with via a CLI flag or HTTPS endpoint could have the documentation right there by the code handling that interaction... or not if it's not relevant.

See "godoc" as an example of this: https://blog.golang.org/godoc-documenting-go-code

The code truly does become the documentation at this point, and acts as a central point of truth.


This. There's an almost inevitable orchestration involved in maintaining good documentation for a fast-moving codebase. Tools like Swagger[1] are good for automatically generating API documentation, but beyond that there is little help for maintaining generic README-style documentation.

How do large, fast-moving projects like the Linux kernel manage this?

[1]: https://swagger.io/


Linux delegated its user-facing documentation to third parties pretty much from the beginning.

Even for those things which have documentation in Documentation/, there is no formal or informal rule that if you make a change which causes existing statements to become false you have to update them.


Which doesn't reduce the usefulness of the linked document at all.

If your project is bigger, README probably should not be a full documentation of the project but link to the actual documentation.


This reminds me of the worst readme I ever read: http://dpdk.org/browse/apps/pktgen-dpdk/tree/README.md

It's extremely long and almost impossible to read. It starts off as it means to go on:

  Pktgen is a traffic generator powered by Intel's DPDK at wire rate traffic with 64 byte frames.
For that to work as a sentence, it needs the phrase "capable of generating packets" putting in the middle.

A bit lower down it spends 200 lines showing what happens if you type "ls" in various folders on the developer's machine. Helpful.

It's fascinating to try to read because it's so mad.

If you are the author, I'm happy to provide some more constructive feedback...


The worst thing is that I could actually use something like this, but that 1500 line stream of consciousness README makes me seriously consider how much time I want to put into it.


I'll take this over nothing at all any day of the week.


This is one area that Perl's CPAN really got right. Pick just about any random module, and you're immediately greeted with a short synopsis showing the use-case. When you search for something and find 5 matching modules, you can quickly evaluate them on the basis of how easy they are to use relative to your codebase.

Random obscure module: http://search.cpan.org/~prasad/X12-0.80/lib/X12/Parser.pm

OK so in the first paragraph, I know how to parse an X12 transaction file and get the results into my code.

This boils down to knowing your audience.


I think this is a great example. I don't even know what an X12 transaction file is, but boy howdy, I know how to parse it now!


It is a double-edged sword though. It makes it easier for people to use a module without understanding it fully (or at all). The CGI.pm fiasco is one example, where people were copy-pasting the one-liner in the synopsis and opened themselves to an attack. A lot of things are complicated and can’t be usefully summarized in a few lines.


I wasn't around for this fiasco, what exactly was it?


Check the “Perl jam” series of talks given at CCC. They are quite inflammatory, but the author does have a few good points. Specifically, he shows that one of the examples in the documentation for the CGI.pm module had an exploitable vulnerability.


> Specifically, he shows that one of the examples in the documentation for the CGI.pm module had an exploitable vulnerability.

I think that's overstating things. He reported a "vulnerability" in Bugzilla which wasn't a security problem in Bugzilla because Bugzilla uses taint, which didn't do any database injection like he claimed, and which is unrelated to CGI.pm because Bugzilla doesn't use CGI.pm:

https://bugzilla.mozilla.org/show_bug.cgi?id=1230932

Furthermore, the examples in his presentation don't actually work, he relies on ignorance of lists and Perl data structures, and the one potentially interesting point he makes about calling functions in list context in hash initializers has been documented and well understood as a potential mishap in web applications since 2000:

https://events.ccc.de/congress/2014/Fahrplan/system/attachme...

His presentation may have some value to someone spending their first week with Perl in a web context, but that person would have to wade through a lot of nonsense to get at that value.


>Embedded visual aids like diagrams and demos.

>The build status identifies specific project aspects that are incomplete and/or causing instability.

>One or more badges showing code coverage or other quality metrics.

I really have to disagree with points like these. I always hate cloning a project and then opening the README to find a totally unreadable mess of Markdown links and images or, worse still, literal HTML. As an example, look at uBlock's README.md[0], which, even though uBlock is a fantastic project in its own right, is horribly unreadable. If you're already adding a README file within your code repository, you should assume that people will read it locally, without fancy HTML rendering. The whole point behind Markdown was to have a syntax which could be easily parsed by computers and humans alike (it was inspired by the plaintext email formatting style after all!).

Graphics and badges should be put on a website, in this case for example GitHub's github.io service. Specifics should be placed in a man page or something comparable. Links should either be autolinks[1] or reference links[2], to keep the document clean and structured. I'm sure others could come up with more and better recommendations on how to use "normal" Markdown to still create decent READMEs for GitHub. Maybe we don't even need Markdown and we can just use plain utf-8[3]...

[0]: https://raw.githubusercontent.com/gorhill/uBlock/master/READ...

[1]: http://spec.commonmark.org/0.28/#example-565

[2]: http://spec.commonmark.org/0.28/#reference-link

[3]: I recently downloaded a dwm statusbar manager called "dstat" (from https://www.umaxx.net/) and its README was a really nice surprise: https://sub.god.jp/f/JmiFHE9S.txt (extracted and uploaded, since there's no public version)


As a complete aside, some people (in an IRC channel) and I just spent 20+ minutes trying to figure out why https://sub.god.jp/f/JmiFHE9S.txt was displaying in a non-default font in Firefox. Turns out it was being treated as encoded as Shift JIS, because the encoding wasn't explicitly specified in the headers and the TLD is .jp .[1]

[1] https://dxr.mozilla.org/mozilla-central/source/dom/encoding/...


I've put this into a bash function to render markdown files in the terminal window

  function markdown()
  {
      pandoc -s -f markdown -t html "${1}" | sed 's/^<pre class/<p><\/p><pre class/' | lynx -stdin
  }
... which doesn't really address your main complaint, but can help in cases where you really don't want to exit your command-line.

Requires the pandoc package, which isn't installed by default in Ubuntu.


> Requires the pandoc package, which isn't installed by default in Ubuntu.

Neither is lynx, since you mentioned it.

Neat trick :).


While the badges can get to be overkill (like in uBlock's case), I think having a small number with useful information (dependency status, code coverage, etc.) is helpful when quickly assessing a project.


Nice project.

Shame that the history of the maturity model looks like this: https://github.com/LappleApple/feedmereadmes/commits/master/... Even when editing in the web UI you can - and should - set a commit message so one can actually understand how a document evolved without going through the diffs.


Next Up: Git Commit Maturity Model


A level five commit is product-oriented, contains a vision statement for the change, has its own slogan and is updated at weekly or even daily intervals, so inaccuracies are unlikely.


You surpassed my level of snark and reached limits I considered unachievable. Well done!


How to write a great git commit message: http://github.com/joelparkerhenderson/git_commit_message



Thanks! I added your links to the README.


The Wine project has been my go-to reference for what commits should look like in a large and complex project:

https://github.com/wine-mirror/wine/commits/master

During the short time I contributed to Wine, I really got to appreciate their high quality source control discipline.


I really like the "module: thing that changed" format for the title. We've been using that where I work. It helps keep commits focused and makes it much easier to read the log.
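
If you want to nudge people toward that format, a check along these lines could run in a commit hook or CI. This is only a sketch: the exact shape of the prefix (word characters, dots, slashes, hyphens, then a colon and a space) is my assumption, and `parse_title` is a made-up helper name.

```python
import re

# Assumed shape: "<module>: <summary>", e.g. "parser: handle empty input".
TITLE_RE = re.compile(r"^(?P<module>[\w./-]+): (?P<summary>\S.*)$")


def parse_title(title):
    """Return (module, summary) if the title matches the assumed format, else None."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("module"), match.group("summary")
```

A title such as "parser: handle empty input" splits into its module and summary, while something like "Update README.md" is rejected because it has no module prefix.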


This maturity measure seems related to blog post driven development, as outside-in development. Are there any other related principles, approaches, and measures?

https://en.wikipedia.org/wiki/Outside%E2%80%93in_software_de...

http://blog.estimote.com/post/119525082855/user-stories-on-s...

https://news.ycombinator.com/item?id=958480


Github-Star-Driven development?



I feel like GitHub is partially being overrun by cheap and easy repositories to farm for stars, why would you ever need a README maturity model?


Documentation is part of the craft, so I appreciate casual attempts at formalizing conversation about commonly used methods of creating it.


I'm curious, what does star farming get you? Is it just like farming karma on HN or is there a more practical reason to do that?

Also, while I'm sure there's a gigantic amount of rather useless repos on GitHub (I contributed my share...), that doesn't mean it's not a good idea to have guidelines for writing useful READMEs for projects that do matter.


Trust / credibility / reputation attacks, of various sorts.


It looks like the main purpose of this repo is for open source project developers to open issues to get someone to help edit their projects' READMEs. That's kinda nice!


If this repository additionally had badges that you could embed at the top of your GitHub README, I would expect it to be satire. I'm still not entirely convinced it isn't satire.


I believe it's a mistake to try and cram everything into the README. We've gone the opposite direction with Material-UI.

Compare 0.x (before): https://github.com/callemall/material-ui/blob/master/README....

With v1-beta (after): https://github.com/callemall/material-ui/blob/v1-beta/README...


Don't completely agree with the guidelines presented in the maturity model, but I do think that a well written README goes a long way in making your project discoverable, identifiable or even presentable. I have had old projects for which I am now really glad that I took the time to write good READMEs when I did, because I can refer back to them and still know what I am reading through. It's a good practice, like commenting your code.


Please also include a line about the license in your README.

The full license can go into a separate file, but a paragraph like "This project is licensed under the GPL v2 license. See the LICENSE file for details" can be extremely helpful, and is missing far too often.


These levels are known to SREs as Initial, Repeatable, Defined, Managed, and Optimizing: https://en.wikipedia.org/wiki/Capability_Maturity_Model#Leve...

These levels are known to Discordians as Chaos, Discord, Confusion, Bureaucracy, and The Aftermath: https://en.wikipedia.org/wiki/Principia_Discordia


> A line about the average response time to issues and/or pull requests.

Does someone know any projects that have this?


There should be a README-embeddable badge service for this. Anyone up for writing one?


I know that there is an embeddable image of issues/PRs closed/opened over the last weeks, that is used in some READMEs, but I can't remember any of the repos right now.


I'd start with a scan of https://shields.io/


Here's a bootstrap template for creating your own README.md:

https://github.com/jehna/readme-best-practices

You can use it to quickly go through most common information you should include in your own readme.


I like how the project name is formatted as code.

It's missing a TOC though; scrolling through it doesn't give an overview of what it contains.


Documents like README have a tendency of getting out of date really quickly — especially if you add dependencies between different content inside the file. I think anyone using Confluence (or similar wiki-style space) can agree.

Including table of contents in your README is pretty big overhead if you cannot generate it automatically. This template is meant to be a good starting point for any size of project, so it was a conscious decision to leave table of contents out of it.


I actually replied to the wrong top comment here, this was meant for https://news.ycombinator.com/item?id=15561356. This links to https://github.com/mcohen01/amazonica which is 25 pages printed - and then you definitely should have a TOC just to be able to learn what the content is all about.

If you are doing TOC manually in Confluence you are doing it wrong by the way: Both TOC or links to subpages can be done automatically.


Is there an easy way to generate a TOC from an existing Markdown file?


Depends on where you are rendering the Markdown.

In Jekyll with kramdown you can put {:toc} on the line after a placeholder list item to get an automatically generated table of contents.

On GitHub this doesn't exist, so you have to use an external tool like https://ecotrust-canada.github.io/markdown-toc/ (Google "markdown generate TOC" for alternatives)
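
If you'd rather not depend on an external tool, a small script can get you most of the way. The sketch below is an approximation under stated assumptions: it only handles ATX headings (lines starting with `#`), the `slugify` rule only roughly imitates how GitHub builds heading anchors, and it does not add the numeric suffixes GitHub uses for duplicate headings. `make_toc` and `slugify` are made-up names.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def slugify(title):
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, spaces to hyphens."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


def make_toc(markdown):
    """Return a nested Markdown list linking to every ATX heading."""
    lines = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' comments inside code fences
            continue
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            indent = "  " * (level - 1)
            lines.append(f"{indent}- [{title}](#{slugify(title)})")
    return "\n".join(lines)
```

Run it over the README text and paste the result near the top of the file; since it has to be regenerated whenever headings change, it fits best in a pre-commit hook or build step.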


It's a bad idea to use first-person voice in READMEs on GitHub. It breaks down when people give in to GitHub's insistence on forking. If you want to include a first-person account (motivation, etc), put it in a blog post and link to it from the README—then everyone will be able to tell who "I" refers to.


Speaking of, I wouldn't mind getting some feedback on my latest README.

https://github.com/billmalarky/react-native-image-cache-hoc/...

It's pretty much the first time I've put real effort into a README, since I'm now trying to make good documentation a serious development habit.

I'm still adding coveralls but I'd be interested to hear anyone's feedback.


https://github.com/mcohen01/amazonica is the best README I've seen. Relevant examples for every piece of the library, and some nice-to-have performance graphs. The 95%-use-case getting started is really the best thing you can do in a README in my opinion, and amazonica totally nails it.


I like how the project name is formatted as code.

It's missing a TOC though; scrolling through it doesn't give an overview of what it contains.


The "Supported Services" is the table of contents


If this is the intention, then it is a really terrible one. First link sends me 1/3 down the page. This actually only links to the examples, the 10 other (documentation) sections/headlines are missing.


Yeah, that could maybe be tweaked a bit. The examples are really where this one shines for me. :) I didn't feel more ToC was overly necessary.


I like this idea of a maturity model. It makes sense that READMEs grow in scope with their projects and can be evaluated accordingly.

To add to the README discussion, I made a small website a few months ago for README guidelines, though it's more geared towards beginners: https://www.makeareadme.com


One simple requirement is that a README file should include, near the top, a brief summary of what the project is.

I've seen too many README files that leave that out. Knowing what changed in the latest point release doesn't help me if I don't know what the whole thing is for.


Also applies to landing pages. Just explain briefly what the project is about.


Their own README is at most only level two. No badges.


Level 4 actually seems best to me.



