ActivityPub has a problem of laying all data out, nicely structured, just waitin...

ilyt · on July 12, 2023

shrug, not having an API didn't stop anyone before.

And "I want random people to see my social stuff (cos I yearn for attention) but not that particular person/corporation" is an unsolvable problem.

strogonoff · on July 12, 2023

Bug-free software is unsolvable, but it does not mean we should stop trying to avoid bugs, that’d be just silly.

If fully precluding public and private intelligence is infeasible, that does not mean we should be using a protocol that in many ways is optimised for public and private intelligence.

Privacy, like many things, is a spectrum.

xg15 · on July 12, 2023

I'm with you if you want to keep the APIs but put them behind stronger authorisation requirements, i.e. what "authorized fetch" seems to be for.

I absolutely disagree if you want to keep the data public but make it "harder to scrape", i.e. remove all APIs and bury it in some annoying HTML/JavaScript mess.

That would absolutely punish the wrong players: having an API which allows easy access to structured data enables all kinds of desirable use cases, such as being able to use whatever client you like.

In contrast, the big players who are interested in tracking the entire userbase already have enough experience in building robust scrapers - they won't be deterred by a closed-down API.

strogonoff · on July 13, 2023

Regarding “mess”, it doesn’t have to be. Upon some research, there actually already seem to be protocols that try to address these issues in a reasonable way in spec and implementations (e.g., LitePub[0]).

Regarding dedicated “big players”, I will just repeat my point: they may still be able to do something but perhaps we shouldn’t make it easier for them, especially if it provides no benefit to an ordinary person (that is: excluding users such as corporate or government bodies, OSS projects, and so on).

If it becomes sufficiently difficult for them to gather intelligence, the effort required may actually be useful evidence in case of a lawsuit—it would show that one side cares about privacy and took measures to avoid being identified, while the other side circumvented those measures.

[0] https://litepub.social/overview

dahwolf · on July 12, 2023

Indeed, and I extend this problem to any data of any value. The more semantically you describe it, the more pathways you create for abuse.