I'm reading the manifesto, and for some reason a story I once heard, possibly apocryphal, came floating up out of my memory: a person was using their PC when a coworker came up asking for a copy of a spreadsheet and handed them a floppy disk. The first person took the disk, inserted it into the PC, launched Lotus 1-2-3, loaded the file, and saved it to the floppy. The second person was incredulous that the first person didn't just copy the file directly from the hard drive to the floppy. The first person replied, "you can do that?"
I have a family member whose standard workflow involved doing the same with Word on Windows. Save As was the only way he knew how to copy things.
He's a baby boomer. I'd assume this is only more common now, with people's main interactions with computing being in an app-focused, hidden-filesystem world.
Yes, but this is not as bad as opening the Word doc, selecting everything (with the mouse), copying it, opening a new empty doc, pasting it in there, and saving. That's how some people copy Word documents. Word documents are the easy case; some other file formats are harder, like Excel sheets with multiple tabs.
Now you may be tempted to think, "people do that?!". Of course they do. If you can imagine it, someone is doing it.
The manifesto's idea that files should have a unique address which any machine can access reminds me of Brian Hauer's rant, which argues that each application should have a single instance with a unique address (for a given user) that any machine can access (http://tiamat.tsotech.com/pao). Put the two together and a person's entire digital life would seamlessly follow them between machines.
I like the proposal of making caching a central design element to work around today's bandwidth limitations. I work with large-ish (a few TB) scientific datasets, and it isn't pleasant to have to choose between (a) storing everything on network storage and suffering slow IO or (b) storing everything locally on every workstation and suffering the need to synchronize data.
> it isn't pleasant to have to choose between (a) storing everything on network storage and suffering slow IO or (b) storing everything locally on every workstation and suffering the need to synchronize data.
That's a solved problem: depending on the workload, it could involve a compute farm with a clustered or distributed file system, a copy data management solution, or CacheFS, for example. There are also solutions for shared storage between containers across multiple nodes.
> That's a solved problem ... compute farm with a clustered or distributed file system, a copy data management solution, or cachefs, for example.
Thank you for suggesting some potential solutions to my data management problems. From googling them, I get the impression that you are thinking primarily of enterprise-scale data management (i.e., organizations with server farms and an IT department), whereas I'm primarily thinking of organizations with < 30 employees (who mostly use desktop software) and a single file server. My particular situation is an academic lab. However, I think these solutions can still work with a little adaptation:
Copy data management seems to be the use of block-level data deduplication or virtual disks on a server in order to decrease the disk utilization per VM or per container. I'm not completely sure; I found mostly marketing documents, and there's no Wikipedia entry. This, as well as clustered/distributed file systems, would apply if we turned the file server into a VM server and had each employee work on a VM via remote desktop. In principle, this could work, and it would let temporary employees (e.g., summer students and visiting scholars) get started quickly with a standard OS environment. I will experiment with this when our last batch of desktop PCs hits end of life.
CacheFS looks useful if we start using NFS to connect to our file server instead of Samba. It looks like Windows 10 (Enterprise) supports NFS caching too (https://technet.microsoft.com/en-us/library/cc976862.aspx). I will try this.
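For reference, the Linux-client side of NFS caching runs through FS-Cache and the cachefilesd daemon; a minimal sketch (the server name and export path are placeholders for whatever the file server actually exports) looks like:

```shell
# Install and start the daemon that provides the on-disk cache
# backing FS-Cache (package name on Debian/Ubuntu-style systems).
sudo apt-get install cachefilesd
sudo systemctl enable --now cachefilesd

# Mount the NFS export with the "fsc" option so file reads are
# cached on the workstation's local disk between accesses.
sudo mount -t nfs -o fsc fileserver:/export/lab /mnt/lab
```

Repeated reads of the same large datasets then come from the local cache rather than the network, which is exactly the middle ground between options (a) and (b) above.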
Definitely a fan of this project, but I'm intensely curious as to why they separated from Camlistore, which seems like a very similar project and is also headed by a key member of the Go core team (Brad Fitzpatrick). Anybody from either of those two projects care to comment?
Motivation: There are hundreds of initiatives trying to solve similar problems, and they could be solved relatively quickly if engineers deigned to work together on a solution instead of splintering off into hundreds of fractured groups.
> The main difference I see is that Camlistore can model POSIX filesystems for backup and FUSE, but that's not its preferred view of the world.
This makes me want to throw things. I'm actually mentally discounting both projects now on the charge that core authors seem to care more about bickering over technical details than implementing working solutions to these society-breaking problems.
Andrew Gerrand worked on both and apparently didn't think Camlistore was the right basis for what they wanted in Upspin. But I'm sure you, who I'm not sure has used either project, know better than Andrew and Brad and Rob.
I am claiming I do, yes, and would happily make my case to any of them for why they should do the hard work of agreeing on minor technical details and merge the two projects. It is the easiest instinct for engineers to "split off and code their own version" over technical disagreements, and why we have a dizzying array of incompatible, half-completed decentralization projects while Facebook and Twitter continue to eat society.
Thank you again for the info/backstory, though. I am just a naysayer who has sat through 1000 pitches of Fitzpatrick's basic thesis back in 2010 and seen excruciatingly minimal progress in the space of "actually making these things work for normal people".
I'm happy to discuss this further -- my life-passion-project is to see decentralization through -- but fear I've overstepped my bounds in this thread and am taking away focus from the project at hand, which I am a supporter of.
I keep a ranking of decentralization projects in terms of how likely they are to succeed and catch on. Camlistore and Upspin have been near the top of my list for years now (Camlistore was the one that originally inspired me to quit my job at Twitch and do decentralization advocacy full-time). I am now slightly less excited about both projects, although they still have incredible potential and I would be overjoyed if either of them met with even minor success.
At this point, I get the sense that Upspin/Camlistore don't really _want_ to succeed in terms of catching mass-market success and disrupting the innovation-stifling tech giants. It seems like they're more interested in scratching their personal itch and being content with that. Totally fine, but I'm going to be slightly less excited about releases from both of these projects in the future unless I get indications that the core team members are willing to escape the same trap that plagues all standardization schemes (https://xkcd.com/927/).
Assuming they got similar adoption, I'm wondering why I should use Upspin rather than Keybase? It seems like Keybase's users and groups are more sophisticated, and its private git support is immediately useful.
I am a big fan of both Upspin and Keybase. In my view they are quite different.
1) Polish vs Openness:
- Keybase is a polished product and a closed identity platform;
- Upspin is open-source plumbing and an (almost) open platform.
2) Focus
- Keybase is a crypto identity platform which happens to have a file storage app;
- Upspin is a file storage platform which happens to have a crypto identity feature.
Note: I call Upspin "almost open" because it does not support running your own key server in a private namespace. All users must use the same public key server. In exchange for a slightly less open platform, Upspin gets a strong guarantee of a single global namespace, which is a really great feature for end users. I think it's great that the project is clear about its priorities and the tradeoffs it's willing to make, and communicates them upfront.
Of course you can trivially run your own key server and your own Upspin universe. But then, of course, you can't talk with other people.
The problem with Upspin is that its keyserver is a single point of control, not that there's a single namespace. You could trivially have a single global namespace with many delegated keyservers using DNS. You know, like e-mail. But the authors unfortunately don't want that.
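To make the email analogy concrete: just as a mail client resolves an MX record for the domain part of an address, a keyserver lookup could be delegated per domain via DNS. A minimal sketch in Go, assuming a hypothetical `_upspin-key._tcp` SRV naming convention (this is not part of Upspin today):

```go
package main

import (
	"fmt"
	"strings"
)

// keyserverSRVName builds the DNS SRV name a client would query to
// discover the keyserver responsible for a user's domain, mirroring
// how MX records delegate mail delivery per domain. The
// "_upspin-key._tcp" label is a made-up convention for illustration.
func keyserverSRVName(user string) (string, error) {
	at := strings.LastIndex(user, "@")
	if at < 0 {
		return "", fmt.Errorf("no domain in user %q", user)
	}
	domain := user[at+1:]
	return "_upspin-key._tcp." + domain, nil
}

func main() {
	name, err := keyserverSRVName("ann@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
	// A real client would then call
	// net.LookupSRV("upspin-key", "tcp", domain)
	// and fall back to the single public keyserver when the
	// domain publishes no record, preserving the global namespace.
}
```

The namespace stays global because the user name itself carries the domain; only the authority answering for that domain is delegated, exactly as with SMTP.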
The article was written in 2014. If, say, a letter from Mark Twain got published for the first time today, we'd put the year it was written in the title.
Count me among the confused. Works are usually referenced by their publication year and, on Hacker News, I always assume something with a (YEAR) in the title was published in that year.
That seems strange, and the example you give is less applicable than one might first think.
A letter is private correspondence; the date it is written is essentially the date it is published. A manifesto, on the other hand, could be considered a public document, and so its date of publishing may not necessarily be the date it was written.
I get that it was written a while ago and only made available for public consumption recently. However, appending a date to the article name signals to most users that the article is older and may have been previously posted. That's the confusing part, and I think it masks what is essentially a new document in the eyes of the public.