Correct me if I'm wrong, but isn't this problem the ideal use case for projects ...

dannyobrien · on Dec 21, 2021

It is -- and Brewster Kahle and the Archive have been thinking about this for a long while (see this talk from him five years ago: https://archive.org/details/LockingTheWebOpen_2016 ). The model you can think for this would be to have as the Archive as the "node of last resort" of content-addressable storage, making sure there's always one node up with the content you want.

The incentive challenges are making sure that the average number of nodes is more than one, because, as Brewster likes to say, "libraries burn; it's what they do", plus all the traditional challenges of maintaining a commons at high levels of resilience. Once you have data on a network like IPFS, we can use a number of incentive models to make sure it stays there, including charitable projects like the Archive, government support (archives are traditionally state projects -- if every country's archive was pinning this content, it would be far more resilient), and decentralized incentive frameworks like Filecoin.

(Disclosure: I work for the Filecoin Foundation; in our decentralized preservation work, we've funded the Internet Archive's work in this area, though I should emphasise that IA works with a lot of different decentralizing technologies through their https://getdweb.net/ community.)

Ericson2314 · on Dec 21, 2021

I was involved with planning https://nlnet.nl/project/SoftwareHeritage-P2P/ for just this reason --- hopefully we will finally be able to start work on it sometime too far off.

Indeed the real challenge of archival is not loosing the stuff, by making sure that people can still find the stuff. "Orphaned" information that no one knows exists, or is bothering to interact with, isn't that valuable compared to resources that are actively being used and still "live" in the culture.

Of course, the archive can never serve the same amount of bandwidth, but the goal is a) interested parties can mirror the stuff they care about in a higher bandwidth / item way after some huge disruption c) random viewers never notice something going down, nor who is serving the info, but just a temporary drop in connection quality.

Ultimately, location-based addressing is a stupid way to run society, needlessly fragile by baking in very property claims (IPs, DNS, etc.) that are incidental to the task at hand. Content-based addressing, with location based hints to avoid trying to solve really hard problems all at once, is the only way to make culture more robust.

btown · on Dec 21, 2021

The great thing about location-based addressing is that an archive of the set of known locations is not subject to the same ownership rules as the canonical live version of those addresses. A document listing all Geocities URLs can be placed in content-addressed storage without needing geocities.com to be owned by the party that emplaces that document. And a chain can be maintained such that people are incentivized to remember that document into the far future. Coupled with archival of the actual content, you bypass the exclusivity of domain ownership.

Of course, ensuring that there's persistence of attention as well is a tougher problem. But one only needs to look at sites like https://reddit.com/r/tumblr to realize that there is immense societal interest in "meme archaeology." Reducing the barriers to entry to would-be archaeologists, giving them a "chain" of breadcrumbs that lead to content, and building communities that will socially reward people for their archaeology work, is the best thing we can possibly do.

Ericson2314 · on Dec 21, 2021

> The great thing about location-based addressing is that an archive of the set of known locations is not subject to the same ownership rules as the canonical live version of those addresses.

Erm, to me this sounds like putting up with link rot as hack around bad IP law? There are already IP exceptions for preservation. And if content-addressing was the norm, geocities-type sites might bow to market pressure to not "own" the content, but merely have some some sort of license for being the exclusive pinning service and running the ads or whatever. This is like avoiding the problem where your the rent on your current apartment doesn't fall as much as the market writ large because your landlord knows moving is not free.

thesausageking · on Dec 21, 2021

IPFS only does addressability, it doesn't provide storage. You could use a decentralized storage network like Arweave, Filecoin, or Sia.

https://www.arweave.org/

https://www.filecoin.com/

http://sia.tech/

cle · on Dec 21, 2021

Or "centralized" ones like Fleek, Textile, Pinata, etc.

https://fleek.co/hosting/

https://docs.textile.io/buckets/

https://www.pinata.cloud/

jayd16 · on Dec 21, 2021

IPFS does the opposite, right? It doesn't guarantee the archive is available, which is what Carmack is asking for. Incentives to scale bandwidth with need already exist as long as you have he data at all.

That is to say, IPFS doesn't help if the desire blooms after the nodes dry up. Things could still be lost.

Ericson2314 · on Dec 21, 2021

The point of IPFS is not keep the data archived, but to allows users to no care who does the archiving.

Concretely, this would be to skip the "many people on encountering a dead URL don't bother to try the internet archive" problem.

ricardobayes · on Dec 21, 2021

There are many a projects who go open source when they fail. The extra step here is IA would become the A record for the project (at least temporarily?)