Adopting a Storj network backed IPFS node

The Tardigrade Network (built on Storj) recently announced a new IPFS integration at Devcon 5. @jorge, @sohkai, and I spoke with their team and believe adopting this solution for AA's IPFS infrastructure is a smart investment: it provides data resiliency through a distributed network (Storj), and it outsources tricky IPFS devops issues that can consume significant time and resources, at an affordable price.

How it works

IPFS stores data in blocks inside its configurable datastore. By default, blocks are stored locally (i.e. on the machine or in browser storage). In a Storj-backed IPFS node, the IPFS datastore is the Tardigrade network instead. This means the actual file contents are replicated across a distributed network rather than stored locally in a single place. Aragon's IPFS node could explode without jeopardizing any of the files stored there: we could spin up a new machine and continue to resolve content that was pinned before the explosion. According to @sohkai, something similar actually happened once to Aragon's IPFS node, and as a result data was lost forever; Aragon has been backing up its IPFS content ever since. We believe Tardigrade offers a solid solution to this problem.
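To make the mechanism concrete, here is a rough sketch of what a Storj-backed datastore looks like to go-ipfs. This is not the actual plugin code: the `Client` interface and the bucket field are hypothetical stand-ins for the real Storj library, and newer versions of go-datastore also thread a `context.Context` through these methods.

```go
package storjds

import (
	ds "github.com/ipfs/go-datastore"
)

// Client is a hypothetical, minimal view of a Storj/Tardigrade client;
// the real plugin wires the actual Storj library in here.
type Client interface {
	Upload(bucket, key string, data []byte) error
	Download(bucket, key string) ([]byte, error)
	Exists(bucket, key string) (bool, error)
}

// StorjDatastore sketches the core read/write methods of go-datastore's
// Datastore interface (a full implementation also needs Delete, GetSize,
// Query, Sync, and Close). With this plugged in, go-ipfs treats the Storj
// network as its block store: every block it writes lands in a Storj bucket.
type StorjDatastore struct {
	client Client
	bucket string
}

// Put uploads a block to the Storj network under its datastore key.
func (s *StorjDatastore) Put(key ds.Key, value []byte) error {
	return s.client.Upload(s.bucket, key.String(), value)
}

// Get downloads a block back from the Storj network.
func (s *StorjDatastore) Get(key ds.Key) ([]byte, error) {
	return s.client.Download(s.bucket, key.String())
}

// Has reports whether a block exists without downloading it.
func (s *StorjDatastore) Has(key ds.Key) (bool, error) {
	return s.client.Exists(s.bucket, key.String())
}
```

Because only the datastore layer changes, everything above it (pinning, the DAG, the gateway) behaves exactly like a normal IPFS node.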

Pricing

$349/mo plus $1,000 setup fee for:

  • 15 GB of RAM
  • 4 CPUs
  • 2 TB of cloud storage and 0.5 TB of egress bandwidth

Currently we pay $119/mo for:

  • 8 GB of RAM
  • 4 CPUs
  • 120 GB of storage

Storj offers flat-fee pricing for clients, so we don't need to worry about prices changing with network conditions.

Handling an explosion - back up our pinset

In the case where our Storj-backed IPFS node exploded, we'd lose our pinset. Any file pinned to AA's IPFS node would no longer appear as "pinned" when a new node is spun up to resolve content. This could cause a problem if the new node started garbage collecting unpinned files: all of the data would be erased from the Storj network.

To defend against this, we plan to find a safe way to back up our pinset. When a new node is initiated, we can simply repin all the data from the exploded node's pinset. And since AA's node is not open to the public, we have no reason to run garbage collection anyway.
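As a sketch of what that backup/restore could look like (not a final design; the file path is arbitrary and error handling is simplified), using the go-ipfs-api client against our node's API port:

```go
package main

import (
	"encoding/json"
	"log"
	"os"

	shell "github.com/ipfs/go-ipfs-api"
)

// backupPins writes the node's current pinset to a JSON file so it can
// be replayed onto a fresh node after a failure.
func backupPins(sh *shell.Shell, path string) error {
	pins, err := sh.Pins() // map of CID -> pin info
	if err != nil {
		return err
	}
	cids := make([]string, 0, len(pins))
	for cid, info := range pins {
		// Only roots need repinning; indirect pins are re-created from them.
		if info.Type == "recursive" || info.Type == "direct" {
			cids = append(cids, cid)
		}
	}
	data, err := json.Marshal(cids)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

// restorePins re-pins every CID from the backup onto a new node. The blocks
// themselves still live on the Storj network, so this only rebuilds the pinset.
func restorePins(sh *shell.Shell, path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var cids []string
	if err := json.Unmarshal(data, &cids); err != nil {
		return err
	}
	for _, cid := range cids {
		if err := sh.Pin(cid); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	sh := shell.NewShell("localhost:5001") // the node's HTTP API
	if err := backupPins(sh, "pinset-backup.json"); err != nil {
		log.Fatal(err)
	}
}
```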

Risk factors

  • Moving from local block storage to distributed block storage comes with some risks. Will the Storj network stay alive? Will files get lost? According to the Storj team, 2 PB of files have been stored on the network with zero files lost so far. We think it's a relatively safe bet to rely on Tardigrade (running on Storj) to stay up and maintain our data.
  • File retrieval performance could suffer. Fetching files locally is more performant than fetching files that are chunked and split across a distributed network. However, the Storj team was bullish on their file retrieval performance.
  • Satellites. Satellites are the consensus and coordination mechanism within the Storj network. At the moment they seemingly pose a central point of failure, because Storj itself is the sole operator of a satellite. The Storj team said they plan on adding more "trusted" satellite nodes to the network. Will a network with more than a single satellite be as reliable and performant? (Note: the Aragon network could eventually run a satellite itself to alleviate this reliance on Storj, but wouldn't do so immediately.)

IPFS Clusters mitigate risk factors

In the near future, we also think we should run a cluster of IPFS nodes: one or more backed by Storj, and one or more backed by other datastores. That way we don't become overly reliant on Storj performing and staying alive. It would also provide a more resilient pinset, since an IPFS Cluster shares a pinset replicated on each node.
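ipfs-cluster replicates the pinset among cluster peers automatically. As a rough illustration of the effect only (this is not how ipfs-cluster itself is implemented), the outcome can be approximated by pinning each CID on every node's API endpoint; the addresses below are placeholders:

```go
package main

import (
	"log"

	shell "github.com/ipfs/go-ipfs-api"
)

// pinEverywhere pins one CID on several independent IPFS nodes -- e.g. a
// Storj-backed node plus nodes with ordinary local datastores -- so both
// the data and the pinset survive the loss of any single node or backend.
func pinEverywhere(cid string, apiAddrs []string) {
	for _, addr := range apiAddrs {
		sh := shell.NewShell(addr)
		if err := sh.Pin(cid); err != nil {
			log.Printf("pin on %s failed: %v", addr, err)
			continue
		}
		log.Printf("pinned %s on %s", cid, addr)
	}
}

func main() {
	nodes := []string{
		"storj-node.internal:5001",   // Storj/Tardigrade-backed datastore
		"local-node-1.internal:5001", // ordinary local (e.g. flatfs) datastore
	}
	pinEverywhere("QmExampleCid", nodes) // placeholder CID
}
```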

Recommendation

We want to move forward with the Storj <> IPFS integration, but we will leave some time for the community to share their opinions, research, questions, and concerns before doing so. Allons-y!


Hello! Molly here from the IPFS Project. Thanks for sharing your plans and ideas for backing up IPFS data in multiple locations for Aragon. What a lot of other projects do is use pinning services like Pinata, Infura, etc., when they don't want to manage persisting their own data. Is the benefit you're seeing from Storj the flat fee for storage/bandwidth vs. Infura/Pinata?

Very supportive of the plan to mitigate risk factors by replicating across multiple different service providers. Not sure if groups like Storj, etc., are able to all act as peers within a single IPFS Cluster setup yet, but that is a great idea and we should definitely invest in making that work if it doesn't already!

Feel free to come ping me or any core devs on IPFS if you have other questions or want to dive deeper on options you’re considering! Cheers,
~Molly

There's also Temporal, which is vastly cheaper (50-60%) than Pinata and Storj. It's also open source, and we consistently contribute back to upstream IPFS, so you won't have any kind of vendor lock-in like you would with Storj's IPFS node or Pinata.

Also, if you want to replicate Storj's IPFS node without vendor lock-in or a closed-source product: https://github.com/RTradeLtd/storj-ipfs-ds-plugin

> Is the benefit you're seeing from Storj the flat fee for storage/bandwidth vs. Infura/Pinata?

No, in fact this is most likely nominally more expensive.

The real boon in my eyes is node-level reliability: if Storj's promises are true, our risk of losing all our information to node (or cluster) failure dramatically decreases. As Jon noted, we have lost all of our IPFS data before, because we were only running a single node and our cloud provider decided to randomly decommission it without warning.


Temporal sounds perfect for your needs. We distribute our data across two different locations (the west and east coasts of Canada). We run our own datacenter and have full ownership of our hardware, so a cloud provider randomly deciding to decommission your data without warning is impossible.

Each piece of data is mirrored on three different nodes: two on the west coast, one on the east coast. The nodes on the west coast are physically separate servers, with physically separate SANs that store the data using hardware RAID 6.

To give an example of our redundancy: we could lose an entire SAN+server in our datacenter on the west coast, then have two complete hard drive failures on the remaining SAN+server on the west coast, and your data would still be available. Our datacenter could then be bombed out of existence, and your data would still be available.

Our costs are even cheaper than Storj IPFS: 2TB/month is $100. :) There's no setup fee and no bandwidth charges. So you could store your 2TB of data for an entire year, and you would still have paid less than the first month of storage with Storj IPFS (including the setup fee Storj charges).

Going off the Storj costs Jon mentioned in the post, 2TB of storage for the first year, including setup costs, is $5,188. With Temporal, it would be $1,200. You could store your data with us for 4.3 years and still have paid less than you would for your first year of storage with Storj IPFS.


Here's our thought on why a decentralized storage layer + IPFS is the right approach. Aragon is one of the very few projects in the space that is live and decentralized. Since IPFS was created, people have been using Amazon S3, Microsoft Azure, datacenters, and other centralized cloud storage providers to actually store the data. Then people wonder why their "decentralized" application goes down when Amazon S3, or a private datacenter in Canada, has an outage.

There are other solutions out there, but at 2 petabytes we store more data than all of them combined, and have never lost a file. Unlike others, Storj is also globally distributed, rather than confined to just two regions (although you will need to put an IPFS gateway in those regions for this integration). We are also open source (https://github.com/Storj/Storj), and work with some of the most notable independent open source projects in the industry (including MariaDB, NextCloud, Kafka, etc.).

What we are offering with this integration is the IPFS that everyone knows and loves, with an actual decentralized data store on the backend. I assume that a more decentralized, performant and globally available solution is more valuable to the Aragon community even if it is more expensive.

As far as TemporalX is concerned, I've already heard good things about the performance. I think it would be pretty easy to use Temporal at a higher layer in the stack, with Storj as the backend storage layer. Then you would get the benefits of both. We also have a massive list of potential customers that might be interested too.


Unfortunately, Storj IPFS is abysmally slow. When I wrote the Storj IPFS gateway that you guys are using, performance was abysmal. I took a look at the code you forked, and it still looks like basically the same thing, so performance will still be abysmal. Additionally, you'll suffer from the exact same issues that go-ipfs has, such as the pinning system, which causes performance problems. With 2TB of data, the pinning system would lead to a roughly 50% reduction in performance. This would happen with Storj IPFS just as it would with go-ipfs, js-ipfs, and every other implementation of IPFS; TemporalX avoids it thanks to our reference counter.

As far as Storj IPFS is concerned, it's literally a prototype, that's it. You are trying to sell prototype software that I wrote as a ready-to-go production solution; that's irresponsible and could lead to Aragon losing their data again.

Also, your link to the private datacenter going offline is a bit of a misnomer. That's precisely why we distribute our data across two locations in two completely separate regions of the country. The odds of both locations going offline at the exact same time are absurdly low.

Hypothetically speaking, if Jon and Aragon are concerned about that remote possibility, we also have ways of "exploiting" some of IPFS's built-in capabilities to mirror content in more locations without having to pay for it up front. I'm sure you know the IPFS capabilities I'm talking about, right?

> We also have a massive list of potential customers that might be interested too.

Your entire team has my email and knows exactly where to find me.


Thanks everyone for the discussion. I'm personally the instigator of this: as soon as I saw the announcement, I asked @super3 to meet at Devcon to chat about it.

The main selling point for me is having a decentralized, incentivized network responsible for our files' availability. At the moment we have a low enough volume of files on IPFS (just our deployment artifacts) that it is relatively easy to manually archive and pin them (pinning happens automatically on every push to the deployment repo's master branch). We will very soon have user-generated content on IPFS that will be critical for the operation of Aragon DAOs. We won't be able to manually pin this content, and any data loss would be a major issue.

Pinata and Temporal look good and I am sure they would work great (the Aragon Association pays for Pinata at the moment and it is solid). My concern with these services is that, at the end of the day, they are run by startups that can go out of business at any time. Even though I assume we would be given proper notice and a migration path, I don't want something this critical to rely on the operations of a small, experimental company.

> …you won't have any kind of vendor lock-in like you would with Storj's IPFS node or Pinata.

Not a concern, as we don't really have vendor lock-in with Storj either. We will still be accessing our files through IPFS APIs, and a backend migration is possible.

> To give an example of our redundancy: we could lose an entire SAN+server in our datacenter on the west coast…

Even though I am sure this is rock solid, it doesn't protect us from Temporal shutting down operations and our files being lost.

> Our costs are even cheaper than Storj IPFS: 2TB/month is $100…

This is great and probably a result of your lower overhead. The reliability and resilience of a decentralized network is something I am willing to pay for.

I haven't seen the performance numbers for this, and I'd love to see how fast the Storj IPFS gateway is (cc @super3). I imagine we would be able to have another IPFS node directly connected to our Storj IPFS node, so files we access regularly would be cached in this other node, making performance an issue only when we need to go fetch files from the Storj network. This is another tradeoff I'm personally willing to make.
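Roughly what I have in mind, as a sketch with placeholder addresses (using the go-ipfs-api client): peer the cache node directly with the Storj-backed node, and content fetched through the cache node once is then served from its local blockstore.

```go
package main

import (
	"context"
	"io"
	"log"

	shell "github.com/ipfs/go-ipfs-api"
)

func main() {
	// API endpoint of the caching node (placeholder address).
	cache := shell.NewShell("cache-node.internal:5001")

	// Peer the cache node directly with the Storj-backed node so block
	// requests go straight to it instead of through the wider DHT
	// (the multiaddr and peer ID are placeholders).
	storjPeer := "/ip4/10.0.0.2/tcp/4001/p2p/QmStorjNodePeerId"
	if err := cache.SwarmConnect(context.Background(), storjPeer); err != nil {
		log.Fatal(err)
	}

	// The first read pulls blocks from the Storj-backed node; the cache
	// node keeps them in its own blockstore, so later reads are local.
	rc, err := cache.Cat("/ipfs/QmSomePinnedContent") // placeholder CID
	if err != nil {
		log.Fatal(err)
	}
	defer rc.Close()
	if _, err := io.Copy(io.Discard, rc); err != nil {
		log.Fatal(err)
	}
}
```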

> Even though I am sure this is rock solid, it doesn't protect us from Temporal shutting down operations and our files being lost.

We're a publicly registered company based in Canada; Storj is registered in the Cayman Islands. So if Storj ever fails to meet the SLA, which is entirely possible, you're at a much greater level of difficulty trying to recoup your losses. Remember, you're not entering into a business agreement with a decentralized network: you're entering into a business agreement with a company to have them manually spin up infrastructure for you and maintain it. Nothing about that is decentralized; it is in fact centralized.

What if the Storj IPFS endpoint ever goes down? Do they have a cluster of nodes that can automatically resume providing the service? That's not possible, because Storj IPFS isn't even set up to run in a cluster environment. Storj IPFS is as resilient as your dedicated Pinata IPFS node is, except it's 50x the cost, and from a company that doesn't even really get how IPFS works.

> I haven't seen the performance numbers for this, and I'd love to see how fast the Storj IPFS gateway is (cc @super3).

You haven't seen the numbers because they don't have them. It should probably be a sign that something is up if they don't have publicly accessible benchmarks.

> This is great and probably a result of your lower overhead. The reliability and resilience of a decentralized network is something I am willing to pay for.

I can assure you that the Storj IPFS gateway isn't reliable in any sense of the word. It is an alpha-level prototype that I built. It is full of bugs and hasn't been thoroughly tested. Additionally, Storj has not been capable of maintaining this software on their own. If you really want to pay someone for a service they aren't qualified to sell, it's your dollar to spend how you want.

Storj IPFS is still using gx, and go-ipfs dropped gx over 6 months ago, so you're already 6 months behind on a significant number of updates containing bug fixes and performance improvements to go-ipfs. Also, if Storj needs to run a bounty every single time they make a change to the Storj IPFS plugin (which, as you can see on GitHub, is how they've made every single change since they forked off our [RTradeLtd's] code base), how can you trust that they can effectively maintain the software?

edit:

To give an example of Storj IPFS redundancy:

Data is distributed across a network of nodes (the Storj network) but accessible through one node (the IPFS node connected to the Storj network). That means if that one IPFS node goes down, your entire dataset is "offline" and inaccessible. So you have a single point of failure, versus three nodes that would all have to fail with Temporal.


Thank you to everyone who has chimed in on this topic. It’s been a great learning experience to see the different ways we can make IPFS more reliable. We’re really impressed with the work that both teams (Storj Network and Temporal) have accomplished.

TLDR
We will not be moving forward with a Tardigrade-backed IPFS node or TemporalX. Instead, we are going to bump our current setup up to a vanilla IPFS Cluster. In the future, we might consider swapping each cluster node's datastore out for highly available block storage (off the actual IPFS machine itself). For now, this solution suits our needs. More information below.

We like the idea of a Storj-backed IPFS node, but we don't feel the integration has been adequately tested yet.

TemporalX looks powerful, but we’d prefer to avoid using any forks of IPFS at the moment.

An IPFS Cluster where each node replicates all the data and the entire pinset is a suitable solution for our needs right now. If a single node goes down, the data and pinset exist in another location. It's also super easy to spin up new nodes as part of the cluster, which makes it possible for community members to replicate Aragon data. The costs are ~$350.

Feel free to continue posting here with questions or new updates to either Temporal or Storj <> IPFS, and we will consider them.