The Tardigrade Network (built on Storj) recently announced a new IPFS integration at Devcon 5. @jorge, @sohkai, and I spoke with their team and believe adopting this solution for AA's IPFS infrastructure is a smart investment: it provides data resiliency through a distributed storage network (Storj), and it outsources tricky IPFS DevOps issues that can consume significant time and resources, at an affordable price.
How it works
IPFS stores data as blocks inside a configurable datastore. By default, blocks are stored locally (i.e. on disk or in browser storage). In a Storj-backed IPFS node, the datastore is instead the Tardigrade Network, so the file contents themselves are replicated across a distributed network rather than sitting in a single place.
Aragon's IPFS node could explode without jeopardizing any of the files stored there: we could spin up a new machine and continue resolving content that was pinned before the explosion. According to @sohkai, something similar actually happened once to Aragon's IPFS node, and data was lost forever; Aragon has been backing up its IPFS content ever since. We believe Tardigrade offers a solid solution to this problem.
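To make the datastore swap concrete, here is a rough TypeScript sketch of the layer the integration plugs into. The interface and class names below are hypothetical and purely illustrative; they are not the actual Storj plugin's API.

```ts
// Minimal sketch of the idea: IPFS reads and writes blocks through a simple
// key/value "datastore" interface. The Storj integration swaps the default
// on-disk backend for one that talks to the Tardigrade network instead.
// All names here are hypothetical, not the real plugin API.

interface Datastore {
  put(key: string, value: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array>;
  has(key: string): Promise<boolean>;
  delete(key: string): Promise<void>;
}

// Hypothetical remote object-store client standing in for a Tardigrade bucket.
interface RemoteObjectStore {
  upload(path: string, data: Uint8Array): Promise<void>;
  download(path: string): Promise<Uint8Array | null>;
  remove(path: string): Promise<void>;
}

// A Storj-backed datastore: every block lives on the distributed network,
// so losing the local machine does not lose the block data.
class StorjDatastore implements Datastore {
  constructor(private store: RemoteObjectStore, private prefix = "blocks/") {}

  async put(key: string, value: Uint8Array): Promise<void> {
    await this.store.upload(this.prefix + key, value);
  }

  async get(key: string): Promise<Uint8Array> {
    const data = await this.store.download(this.prefix + key);
    if (data === null) throw new Error(`block not found: ${key}`);
    return data;
  }

  async has(key: string): Promise<boolean> {
    return (await this.store.download(this.prefix + key)) !== null;
  }

  async delete(key: string): Promise<void> {
    await this.store.remove(this.prefix + key);
  }
}
```

The key point is that `put`/`get` hit the distributed network, so block data outlives any single machine.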
Pricing
$349/mo plus a $1,000 setup fee for:
- 15 GB of RAM
- 4 CPUs
- 2 TB of cloud storage and 0.5 TB of egress bandwidth
Currently we pay $119/mo for:
- 8 GB of RAM
- 4 CPUs
- 120 GB of storage
Storj offers flat-fee pricing for clients, so we don't need to worry about prices changing depending on network conditions.
Handling an explosion - back up our pinset
If our Storj-backed IPFS node exploded, we'd lose our pinset. Any file pinned to AA's IPFS node would no longer appear as "pinned" when a new node is spun up to resolve content. That could cause a problem if the new node started garbage-collecting unpinned files: the data would get erased from the Storj network.
To defend against this, we plan to find a safe way to back up our pinset. When a new node is initialized, we can simply re-pin everything that was in the exploded node's pinset. And since AA's node is not open to the public, we would have no reason to run garbage collection at all.
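A minimal sketch of what that backup/restore flow could look like, assuming the async-iterable API of a recent ipfs-http-client release; the node URL and file path are placeholders:

```ts
// Sketch of a pinset backup/restore flow via ipfs-http-client.
import { create } from "ipfs-http-client";
import { promises as fs } from "fs";

const ipfs = create({ url: "http://127.0.0.1:5001/api/v0" });

// Periodically dump the list of recursively pinned CIDs to a file kept
// outside the node (e.g. object storage), so it survives the node "exploding".
async function backupPinset(path: string): Promise<void> {
  const cids: string[] = [];
  for await (const pin of ipfs.pin.ls({ type: "recursive" })) {
    cids.push(pin.cid.toString());
  }
  await fs.writeFile(path, JSON.stringify(cids, null, 2));
}

// On a fresh node pointed at the same Storj-backed datastore, re-pin
// everything from the backup so nothing becomes eligible for garbage collection.
async function restorePinset(path: string): Promise<void> {
  const cids: string[] = JSON.parse(await fs.readFile(path, "utf8"));
  for (const cid of cids) {
    await ipfs.pin.add(cid);
  }
}
```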
Risk factors
- Moving from local block storage to distributed block storage comes with some risks. Will the Storj network stay alive? Will files get lost? According to the Storj team, 2 PB of data have been stored on the network with zero files lost so far. We think it's a relatively safe bet to rely on Tardigrade (running on Storj) to stay up and maintain our data.
- File-retrieval performance could suffer. Fetching files locally is faster than fetching files that are chunked and split across a distributed network. That said, the Storj team was bullish on their retrieval performance.
- Satellites. Satellites are the consensus and coordination mechanism within the Storj network. At the moment they seemingly pose a central point of failure, because Storj itself is the sole operator of a satellite. The Storj team said they plan to add more "trusted" satellites to the network; will a network with more than a single satellite be as reliable and performant? (Note: the Aragon Network could eventually run its own satellite to reduce this reliance on Storj, but wouldn't do so immediately.)
IPFS Clusters mitigate risk factors
In the near future, we also think we should run a cluster of IPFS nodes: one or more backed by Storj, and one or more backed by other datastores. That way we don't become overly reliant on Storj performing well and staying alive. It would also give us a more resilient pinset, since an IPFS cluster replicates a shared pinset on each node.
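IPFS Cluster handles this replication itself; purely as an illustration of the effect (not the Cluster API), here is a sketch that pins the same CID on several independent nodes through ipfs-http-client. The node URLs are placeholders.

```ts
// Illustration only: IPFS Cluster replicates the pinset across peers for us.
// This sketch just shows the effect by pinning the same CID on several
// independent IPFS nodes with different datastore backends.
import { create } from "ipfs-http-client";

const nodes = [
  create({ url: "http://storj-backed-node:5001/api/v0" }),
  create({ url: "http://local-datastore-node:5001/api/v0" }),
];

// Pin a CID everywhere so losing any single node (or datastore backend)
// does not lose the content or its pinned status.
async function pinEverywhere(cid: string): Promise<void> {
  await Promise.all(nodes.map((node) => node.pin.add(cid)));
}
```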
Recommendation
We want to move forward with the Storj <> IPFS integration, but we will leave some time for the community to share opinions, research, questions, and concerns before we do. Allons-y!