Aragon Network IPFS Pinning

Hey @Schwartz10

Thanks a lot for taking the time to provide such a comprehensive answer! That’s super interesting!

I’m in the middle of an Apiary rush these days but I will definitely take the time to answer more deeply in the coming days. In the meanwhile i’m super open for a call whenever you can.

Talk to you soon [here or in a chat!].

3 Likes

I hope that DAOs will easily host and back up their data, otherwise they would have to trust a 3rd party! :open_mouth:

Here are my thoughts on this issue, please let me know if my assumptions are correct:

  • As far as app developers are concerned, we can and we must (for security reasons, so apps cannot read each other’s data) abstract away the data storage component. We should expose a Storage API from the wrapper side so that apps don’t worry about it.
  • We are forced to use the blockchain to reference these IPFS hashes; otherwise you cannot prove the date of a discussion, who has permission to read/write which files, etc. (Apps built entirely on IPFS will always have to assume a high-trust environment.)
  • There should be an app where users can easily inspect this data
  • Apps should be able to read each other’s data, if permitted

I think we already have a solution to this problem in Aragon Drive and Aragon Datastore, although I’m not sure (they might need some more iterations).

I kind of see it as:

  • Aragon Drive -> File explorer/Finder/Nautilus (application layer)
  • Aragon Datastore -> the base for the OS layer, which the Wrapper can use to create a Filesystem API
  • IPFS -> Hardware layer

Kinda off-topic, but I made some diagrams in the process:

  • Low-trust environment, using the Aragon Network

  • High-trust environment

A high-trust environment could be a Family DAO or a “Personal DAO”. In this scenario I would store sensitive information that I would not want backed up anywhere else. The nodes could even be disconnected from the internet and only sync when they are in the same location.

A low-trust environment would be a Business DAO where I need a 3rd-party provider like the Aragon Network to bail me out in case I get “cheated” when money is at stake.
In this scenario the Aragon Network would need to be a validator of the “private” eth node and also a node in the IPFS cluster, because without them the jurors cannot know for sure what has been said and when.

Since the Network will be “forced” to host some of its customers’ data, I guess we need a way to measure storage, bandwidth, etc. I think it would also make sense to provide a general hosting service, since the Network needs the architecture to do so anyway (plus there’s an extra revenue source).

3 Likes

Hey! Thanks for joining the conversation.

We’re going to be speaking about this on Wednesday the 24th at 10:30 AM ET. Would you be able to join?

2 Likes

My 2c (as you’re asking :slight_smile:):
‘b’ please.

2 Likes

Yes!
I will try to reach out to the espresso team as well.

2 Likes

Hey @Schwartz10,
I’ve been skimming through the thread and it is a very interesting conversation. While I don’t have the expertise to add to the debate, I’m curious to know more and would be happy to join the call as well if possible. Also, if we can find a way to record it and post it in this thread afterwards, that would be awesome :relaxed:

3 Likes

Hey all! IPFS Cluster dev here.

From a quick read of the above, it seems worth mentioning that IPFS Cluster is soon going to launch what we call “collaborative clusters”. This is essentially a way to run a Cluster where some peers are “trusted peers” (they can control what’s on the pinset) and the others are “followers” (they pin things but cannot tell others to pin/unpin anything). This also comes with the flexibility of having peers/followers join or depart at any time without affecting the cluster or requiring manual intervention (as is currently the case with Raft), and the potential to scale to hundreds of peers. This is actually the final crystallization of the pinning rings use case linked before.
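For the curious, in recent Cluster builds the trusted peers of a collaborative (CRDT-based) cluster are declared in the `service.json` configuration. A minimal fragment might look like this (field names are from the crdt consensus section and the peer ID is a made-up placeholder; details may change before the stable release):

```json
{
  "consensus": {
    "crdt": {
      "cluster_name": "example-collab-cluster",
      "trusted_peers": [
        "12D3KooWExamplePeerIdOfATrustedPeer"
      ]
    }
  }
}
```

Followers would run with the same `cluster_name` but without being listed in `trusted_peers`, so they replicate the pinset without being able to modify it.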

I don’t have a lot of time to dig into Aragon’s architecture right now but I’ll hang around here to answer any questions about Cluster and to note down any feedback as to how it can be useful. Cheers!

10 Likes

Happy Monday everyone! Looking forward to our call this Wednesday at 10:30AM ET - anyone is welcome! We can use this link: https://meet.jit.si/Autark, and I will record the call so others can watch. I set up a preliminary agenda here, but we can obviously adjust and add things if people want. It would be ideal to leave this meeting with concrete next steps to take, so we can keep things moving.

@danielconstantin - Aragon Drive and Aragon Datastore look awesome!! These seem very usable for storing documents about a DAO, like the manifesto, code of conduct, etc., but I still want to dig a little deeper. A couple of points we could discuss in the call:

we must (for security reasons, so apps cannot read each other’s data) abstract away the data storage component.

Do you feel this way about inherently public information? Does all information need to be private?

We are forced to use the blockchain to reference these IPFS hashes, otherwise you cannot prove the date of a discussion, who has permissions to see read/write which files, etc

I’m wondering if there are other ways to do this - because in an extreme example (like a chat app), users shouldn’t have to continuously pay for transactions to post. Have you looked into solutions like Textile threads and/or 3box’s p2p communication protocol? Where do these solutions fall short on their own? How could we utilize a blockchain at a minimum to achieve the same desired security (for example, every half hour we could log the HEAD of a thread or orbitDB database in an Ethereum event)? I think there are opportunities to get creative, achieve a favorable level of security, and provide great user experiences.
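To make the half-hour idea concrete, here is a minimal Python sketch of interval-based checkpointing (all names are hypothetical; `log_event` stands in for a contract call that emits an Ethereum event):

```python
import time

class ThreadCheckpointer:
    """Sketch: anchor the latest HEAD of an off-chain thread on-chain at a
    fixed interval, instead of paying for one transaction per message."""

    def __init__(self, interval_seconds=1800, log_event=print, clock=time.time):
        self.interval = interval_seconds
        self.log_event = log_event        # stand-in for emitting CheckpointLogged(head)
        self.clock = clock                # injectable clock, handy for testing
        self.head = None                  # latest CID/hash of the thread
        self.last_checkpoint = clock()

    def on_new_message(self, head_cid):
        """Called for every new message; only updates local state until the
        interval has elapsed, then logs a single checkpoint on-chain."""
        self.head = head_cid
        if self.clock() - self.last_checkpoint >= self.interval:
            self.checkpoint()

    def checkpoint(self):
        if self.head is not None:
            self.log_event(self.head)     # one on-chain write covers every message since last time
        self.last_checkpoint = self.clock()
```

The trade-off is that messages posted between checkpoints are only provable once the next checkpoint lands, which is the “favorable level of security” knob a DAO could tune.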

Please reach out to the espresso team as well to join the call!

@Julian - thank you for weighing in. I think it would be a great idea to take a temperature check or “informal vote” on the upcoming call to see where people stand on this. If the community prefers (b), I would feel really good about using 3box’s p2p communication protocol to start. Michael Sena, one of their team members, told me he feels 3box makes up for a lot of orbitDB’s deficiencies and thinks the combination of the two is stable and secure. They seem eager to help out on this initiative too.

@hector, appreciate you getting involved! Creating a collaborative cluster sounds like exactly what we need. Do you have a timeline on this feature launch? I would vote for us to get started on this asap, and it’s (as of now) the first topic of our call on Wednesday. You are more than welcome to join, but if not, we will hopefully produce some questions for you and can start a new forum thread to discuss. Super excited to see this in action!

2 Likes

I’m tempted to say 2 weeks, but I’ll say, realistically 4 weeks, until this is part of a tagged release. We will consider this experimental at the beginning and will have to figure out some UX, but the bigger parts of it are merged already.

I’ll try to come to your meeting!

1 Like

Sounds good - 3box certainly seems a popular solution to many current conundrums :grinning:
I’d take the number of likes for the possible solutions posted here as a solid straw poll.

Here, take one for the above :yum:

1 Like

Nice! Textile currently has nicely fleshed out IPFS nodes plus added app utilities for collaborative pinning, developer tokens, encryption, REST-based decrypting gateways, and more goodies that could make MVP dev really fast. A basic overview here: https://docs.textile.io/concepts/cafes/

I’d be happy to join the call if you think there will be any questions about how Textile cafes work or what they solve.

1 Like

Hi, Mathew here from Espresso.

Thanks @Schwartz10 for this fantastic analysis! I’m not fully aware of all the details and requirements Autark has expressed so please correct me if any of my assumptions are wrong.

IPFS pinning and data querying

I tend to agree with Daniel and @osarrouy that an IPFS pinning solution should be usable on its own, for 2 reasons:

  1. Implementing a robust querying layer (regardless of whether it is fully decentralized) is no simple task. Handling concurrency, schema validation, and data consistency while maintaining decent performance and high availability all takes time. Yet the need for a simple IPFS pinning solution is increasing as more projects are being built on Aragon. Implementing a standalone IPFS service would quickly provide a solution to those needs.

  2. Adding a querying API significantly increases the attack surface of the service. Considering that one of the requirements is to “Limit security vulnerabilities and expenses”, separating the query layer would probably be the best solution to meet this requirement.

Above, I assumed that building the querying part as a second layer would require no major extra work, but again, correct me if I’m wrong.

Querying

I think we all agree that a completely decentralized solution would be ideal. However, I’m skeptical that it could be realistically achievable within the given timeframe.

Furthermore, I would argue that MongoDB is actually a perfectly fine solution for this in the near term, as it is not as centralized as it may sound. We could easily imagine, for example, a replica-set architecture where each of Aragon One, Aragon Black and Autark runs its own MongoDB node. In combination with @Schwartz10’s clever idea to store db snapshots on IPFS, this would be a fairly (albeit not completely) decentralized solution.

IPFS API and authentication

I feel like the Aragon IPFS service should be 100% compatible with the IPFS API

I completely agree with Olivier. It would be in the best interests of users as well as the devs who maintain the service to have a compatible solution. IPFS supports basic authentication but the client library also supports custom headers if you want to implement JWT authentication (or any complex authentication) without breaking the API.

Regarding JSON web tokens, I think they can be a great choice for APIs as they are lightweight, supported pretty much everywhere, and can technically be stateless. However, in practice they are quite often stateful, as we probably want a revocation mechanism of some kind, e.g. if the token gets lost/compromised or if you want to disable access for a user who previously had it.
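To illustrate why revocation makes tokens stateful, a minimal Python sketch (HMAC over a base64 payload stands in for a real JWT library such as PyJWT; the `revoked` set is the server-side state that a purely stateless scheme would not have):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"            # hypothetical; keep out of source in practice

def issue_token(account):
    """Simplified stand-in for a JWT: base64 payload + HMAC signature."""
    payload = base64.urlsafe_b64encode(json.dumps({"sub": account}).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

revoked = set()                           # the "stateful" part: tokens disabled server-side

def verify_token(token):
    """Returns the account if the token is valid and not revoked, else None."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None                       # tampered or malformed
    if token in revoked:
        return None                       # explicitly disabled
    return json.loads(base64.urlsafe_b64decode(payload))["sub"]
```

The signature check alone would be stateless; it is the revocation lookup that forces the server to keep state.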

Have you considered directly using a smart contract to validate authentication? What I have in mind would be something like this:

  1. Before sending the “pin request”, the library signs the CID with the user’s account.
  2. It then attaches the signature and user account to the request as an additional header, keeping 100% compatibility with IPFS.
  3. On the server side, a small service verifies that the signature is valid and that the smart contract contains the permission for this specific account. If yes, it forwards the request to IPFS. If not, it returns a 403 error.

The smart contract could be a simple DAO where the IPFS access is managed by the ACL. This solution has its drawbacks, however: the user would have to sign the CID every time he/she wants to pin data, and it would probably be harder to implement than JWT authentication.
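A rough Python sketch of the server-side check in step 3 (everything here is hypothetical: `recover_signer` stands in for ECDSA public-key recovery, and the `acl` set stands in for the on-chain permission lookup):

```python
# Accounts the DAO's ACL allows to pin (stand-in for a contract read).
acl = {"0xAlice"}

def recover_signer(cid, signature):
    """Stand-in for ECDSA recovery of the signing address from a signature
    over the CID. Here the fake "signature" is just "<account>:<cid>"."""
    account, _, signed_cid = signature.partition(":")
    return account if signed_cid == cid else None

def handle_pin_request(cid, account, signature):
    """Returns an HTTP-style status: 200 = forward to IPFS, 403 = reject."""
    signer = recover_signer(cid, signature)
    if signer != account:
        return 403                        # signature invalid or for a different CID
    if account not in acl:
        return 403                        # account lacks permission in the smart contract
    return 200                            # forward the request to the IPFS node
```

The CID and account travel as extra HTTP headers (step 2), so the underlying IPFS API stays untouched.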

Espresso

Finally, regarding the Espresso Datastore, it solves slightly different problems than what this IPFS and data querying service is intending to solve so I’m not sure it would be an optimal solution for you. However there may be a few concepts and ideas that could be useful. I won’t be available tomorrow unfortunately but you can contact me on Telegram at mcormier and I will most certainly keep an eye on this thread :slight_smile:

3 Likes

Oh…I guess I don’t care as much about privacy as I do about having to trust each other.
That being said, it might still make sense to abstract it away under a generic interface so switching between IPFS/Swarm and others is seamless.

I haven’t checked them out :grimacing: .
I think transaction cost is a different problem that we could tackle separately and that it only applies to the public blockchains. If the DAO is running on a plasma chain or a side-chain it could, like you described, periodically log a merkle root to a set of public blockchains (one of which could be the Aragon chain). The time frame of this could be set by the DAO.
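As an illustration of the periodic merkle-root logging, a minimal Python sketch (SHA-256 stands in for the keccak256 an Ethereum contract would use; only the 32-byte root would go on-chain per interval, while individual messages stay off-chain yet remain provable via Merkle proofs):

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 as a stand-in; an on-chain verifier would use keccak256.
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the Merkle root of a batch of messages accumulated during
    one interval. Odd levels duplicate the last node, a common convention."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])       # pad odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

A DAO could tune the logging interval to trade transaction cost against how quickly new messages become provable.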

I guess my biggest concern is about the ipfs “admin” nodes that could choose to go rogue.

Hey everyone! Very productive call. Here’s a short summary, recordings are at the bottom:

  • Breaking this large forum discussion into two smaller “challenges” is helpful - pinning + querying.
  • Pinning: @hector jumped on the call to answer questions about IPFS-Cluster and collaborative pinning. TLDR - we should wait (2-4 weeks) until their latest changes for peer permissions within a cluster are merged. Longer notes can be found in the comments of the agenda doc.
  • Pinning: we’re going to be creating a separate chat channel for pinning discussion / implementation. Please like this post if you’d like to be added to this channel.
  • Querying: some research will be done on fluence, Origin Messaging, and a POC 3box discussions app to see if they are good fits. Sponnet is going to write up a post on peepeth, which pins IPFS hashes to Ethereum in one big DAG at regular intervals.

Check the agenda doc for more notes about the call

Recording of call (will upload to YouTube too):

https://www.dropbox.com/s/3i7wkyokzefj06g/autark%20on%202019-04-24%2015%3A55.mp4

5 Likes

We created a channel in aragon chat to coordinate IPFS efforts. https://aragon.chat/channel/ipfs-dev

@hector please keep us updated about the collaborative cluster functionality

Weighing in on some thoughts for IPFS clusters and authentication.

As discussed in a call today with @Schwartz10 and @stellarmagnet, I see two primary longer-term strategies for hosting IPFS data in the Network:

  • Association backed, potentially pinned by all Flock teams and/or service providers: host mission-critical data for the Network, e.g. Flock apps, radspec and token registries, etc.
    • Makes the most sense for collaborative clusters, where each Flock team may be expected to run a replication node, and pinning permissions could be concentrated primarily in the Association or delegated to a small technical group
    • Already begun this process with AGP28, where the Aragon client’s releases are becoming more decentralized from A1’s control (and ideally would be kept pinned on servers other than just A1’s)
  • Per-app / per-flock backed data stores: app-specific data, e.g. TPS’ Projects app’s markdown files, Pando-backed repos
    • Other teams could altruistically replicate these data sets on goodwill / reciprocation
    • Service providers could provide replication nodes for a fee

In the long run, I would hope that each organization eventually begins to run its own infrastructure (or rents it via service providers) to pin important information related to its operations (similar to how basically every 5+ person organization in modern countries has either self-hosted or paid cloud-backed solutions).

However, in both the short and long term, I get the impression there would be considerable value if the Association provided infrastructure for organizations to pin a reasonable amount of storage (e.g. 10-100MB) for free. Beyond this range, there could be paid service tiers provided by either the Association or other service providers.

This limited amount would ideally be large enough for small users to frictionlessly upload their organization profiles, cross-org customizations (layouts, local labels, etc), and app-specific data blobs.


Potential per-organization authentication strategy

I assume we are able to create a thin authentication layer on top of IPFS clusters, either through a reverse proxy or some other means of proxying authentication requests, that is able to track the amount of data pinned by each author. If not, some more research will need to be done on creating this type of authentication “shell” around the IPFS API.
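As a sketch of the accounting such an authentication shell could keep (all names and the quota value are hypothetical):

```python
FREE_QUOTA_BYTES = 100 * 1024 * 1024      # hypothetical 100 MB free tier

class PinQuotaTracker:
    """Sketch of per-organization accounting in an auth proxy sitting in
    front of the cluster: bytes pinned per org, checked before forwarding."""

    def __init__(self, quota=FREE_QUOTA_BYTES):
        self.quota = quota
        self.used = {}                    # org address -> bytes pinned so far

    def can_pin(self, org, size_bytes):
        """True if the pin fits within the org's remaining free quota."""
        return self.used.get(org, 0) + size_bytes <= self.quota

    def record_pin(self, org, size_bytes):
        """Account for a successful pin; reject if it would exceed the quota."""
        if not self.can_pin(org, size_bytes):
            raise PermissionError("quota exceeded; upgrade to a paid tier")
        self.used[org] = self.used.get(org, 0) + size_bytes
```

A paid tier would simply be the same tracker with a larger (or billed) quota per organization.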

Extending @mcormier’s message signature strategy, we could augment it to include some organization-based checks:

  1. Hardcode a role into the Aragon infrastructure, e.g. DATA_UPLOAD_ROLE, likely in the Kernel
  2. Allow organizations to assign certain apps or accounts a permission granting DATA_UPLOAD_ROLE
  3. Send requests to pin either through an app or EOA (see below)
  4. The authentication layer would check if the requester has permission to upload data on behalf of an organization (Kernel.hasPermission(<requester>, <kernel>, DATA_UPLOAD_ROLE))

Step 3 will differ based on the requester, as apps would not be able to sign messages and their contracts (obviously) cannot make HTTP calls:

  • If an EOA requests the pin, the user would only be required to provide the organization’s address and a signature of the CID as HTTP headers for a pinning request
  • If an app is assigned the permission:
    • If the contract can immediately invoke the action (see related issue), allow an EOA to send a pin request through HTTP, but require a “final forwarder” to be part of the request headers
      • The authentication layer also needs to check that the EOA is indeed able to forward to the “final forwarder” (see @aragon/wrapper), and that the app has the correct permissions
    • If not:
      • First, we assume a Network-wide “fake” contract (that is never deployed), e.g. "0xFFFF..FFF" - "DATA_UPLOAD_ROLE"
      • The Aragon client could create EVMScripts to this “fake” contract address with ABI-encoded IPFS hashes as its calldata
      • An EOA sends an HTTP request with “proof” that an app requested the pin, supplying enough information that allows the server to verify the EVMScript encoded in the app that calls this “fake” contract, and that the app has the correct permissions.

This last case may be hard to generalize (as the “proof” could be hard to generate; I see no easy way to standardize an interface for testing if a forwarder would actually execute an action), so the alternative is to actually deploy a contract that just emits an event with msg.sender and contentHash, and have a server watch that contract’s events (a “watch tower”).

This contract could alternatively have a mapping of keccak256(kernel, contentHash) -> bool (a bit more expensive than an event) to remove the “watch tower” and allow users to use the HTTP flow (as this contract would provide direct proof).
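A Python sketch of how that mapping-based contract removes the watch tower (SHA-256 stands in for the contract’s keccak256, and the dict stands in for a Solidity mapping; in reality `register` would be an on-chain write and `has_proof` an `eth_call`):

```python
import hashlib

def proof_key(kernel: str, content_hash: str) -> bytes:
    # Stand-in for keccak256(kernel, contentHash) used as the mapping key.
    return hashlib.sha256((kernel + content_hash).encode()).digest()

class ProofRegistry:
    """Sketch of the alternative contract: a mapping the pinning server can
    read directly, so no event-watching 'watch tower' is needed."""

    def __init__(self):
        self.requested = {}               # keccak256(kernel, contentHash) -> bool

    def register(self, kernel, content_hash):
        # On-chain side: called when an app requests a pin
        # (msg.sender / permission checks omitted in this sketch).
        self.requested[proof_key(kernel, content_hash)] = True

    def has_proof(self, kernel, content_hash):
        # Server side: direct read instead of replaying contract events.
        return self.requested.get(proof_key(kernel, content_hash), False)
```

The extra gas for the storage write buys a synchronous HTTP flow: the server can answer a pin request immediately instead of waiting on an event indexer.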

If the organization encodes an IPFS hash on-chain (either in the Kernel or a very simple app), it could also automatically call this contract for each storage update.

5 Likes

Hey @sohkai thank you for the in-depth write up. I think a good first step is getting the collaborative cluster up and running, and providing instructions to make it easy for other teams to run their own clusters as well. Then, we can create a new thread for authentication strategies, since that seems applicable to other features as well (Aragon One product scope for 0.8).

@hector - any updates about collaborative clusters?

Hey, the Cluster RPC authentication is PR’ed and will be merged this week. This is what allows running Cluster peers in “public” without fear of anyone other than the trusted peers controlling them.

Apart from that, we have a number of UX things in mind to make it slightly easier to run Clusters in this fashion (e.g. being able to fetch configuration templates from IPFS), but they are not blockers to running a collaborative cluster. I’ll drop around here and post instructions once RPC auth is merged.

4 Likes

Hi, just a heads up that we merged all the big things into ipfs-cluster master and it is now possible to run collaborative clusters. We are adjusting small details and improvements now.

If you want to start working on the integration, it should be possible; the only problem is that documentation will not be ready until a stable release comes out, and that will take a while since we’re swamped with IPFS Camp preparations. However, you can just ping me and I’ll tell you what you need to know (I’m usually on the IPFS IRC channels).

3 Likes