Aragon Network IPFS Pinning

Hi, Mathew here from Espresso.

Thanks @Schwartz10 for this fantastic analysis! I’m not fully aware of all the details and requirements Autark has expressed so please correct me if any of my assumptions are wrong.

IPFS pinning and data querying

I tend to agree with Daniel and @osarrouy that an IPFS pinning solution should be usable on its own, for 2 reasons:

  1. Implementing a robust querying layer (regardless of it being fully decentralized or not) is no simple task. Handling concurrency, schema validation, data consistency while maintaining decent performance and high availability, all of this takes time. Yet the need for a simple IPFS pinning solution is increasing as more projects are being built on Aragon. Implementing a standalone IPFS service would quickly provide a solution to those needs.

  2. Adding a querying API significantly increases the attack surface of the service. Considering that one of the requirements is to “Limit security vulnerabilities and expenses”, separating the query layer would probably be the best solution to meet this requirement.

Above, I assumed that building the querying part as a second layer would require no major extra work, but again, correct me if I’m wrong.

Querying

I think we all agree that a completely decentralized solution would be ideal. However, I’m skeptical that it could be realistically achievable within the given timeframe.

Furthermore, I would argue that MongoDB is actually a perfectly fine solution for this in the near term, as it is not as centralized as it may sound. We could easily imagine, for example, a replica-set architecture where each of Aragon One, Aragon Black and Autark runs its own MongoDB node. In combination with @Schwartz10’s clever idea to store db snapshots on IPFS, this would be a fairly (albeit not completely) decentralized solution.

IPFS API and authentication

I feel like the Aragon IPFS service should be 100% compatible with the IPFS API

I completely agree with Olivier. It would be in the best interests of users as well as the devs who maintain the service to have a compatible solution. IPFS supports basic authentication but the client library also supports custom headers if you want to implement JWT authentication (or any complex authentication) without breaking the API.
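To illustrate the custom-header idea, here is a minimal sketch that builds (but does not send) a pin request against the standard /api/v0/pin/add endpoint and carries the credential in a header. The host name, the bearer-token scheme and the function name build_pin_request are assumptions for illustration only; the point is that the path and query string stay plain IPFS.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_pin_request(api_base: str, cid: str, token: str) -> Request:
    """Build a POST to the IPFS HTTP API pin/add endpoint with an auth header.

    `api_base` and `token` are hypothetical values supplied by the caller.
    """
    url = f"{api_base}/api/v0/pin/add?{urlencode({'arg': cid})}"
    req = Request(url, method="POST")
    # The credential rides in a header, so the API surface itself is unchanged.
    req.add_header("Authorization", f"Bearer {token}")
    return req


if __name__ == "__main__":
    request = build_pin_request("https://ipfs.example.org", "QmExampleCid", "my-token")
    print(request.get_method(), request.full_url)
```

A reverse proxy in front of the IPFS node could then strip and check the header before forwarding the untouched request.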

Regarding JSON web tokens, I think they can be a great choice for APIs as they are lightweight, supported pretty much everywhere and can technically be stateless. In practice, however, they are quite often stateful, as we probably want a revocation mechanism of some kind, e.g. if a token gets lost or compromised, or if you want to disable access for a user who previously had it.
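The stateless/stateful distinction can be made concrete with a small sketch. It issues and verifies an HS256-signed token using only the standard library, and the only server-side state is the set of revoked token IDs. The claim layout (sub, jti, exp) follows common JWT usage; the function names and the in-memory revocation set are assumptions, not a proposal for the actual service.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(secret, subject, token_id, ttl_seconds, now=None):
    """Create a compact HS256 token with subject, token ID and expiry."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "jti": token_id, "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(secret, f'{header}.{payload}')}"


def verify_token(secret, token, revoked_ids, now=None):
    """Return the claims if the token is valid, otherwise None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
        if not hmac.compare_digest(signature, _sign(secret, f"{header}.{payload}")):
            return None
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return None
    if claims.get("exp", 0) <= now:
        return None  # stateless check: expiry is inside the token
    if claims.get("jti") in revoked_ids:
        return None  # stateful check: needs a server-side revocation list
    return claims
```

Everything up to the expiry check needs no storage at all; revocation is the part that forces the service to keep state.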

Have you considered directly using a smart contract to validate authentication? What I have in mind would be something like this:

  1. Before sending the “pin request”, the library signs the CID with the user’s account.
  2. It then attaches the signature and user account to the request as an additional header, keeping 100% compatibility with IPFS.
  3. On the server side, a small service verifies that the signature is valid and that the smart contract contains the permission for this specific account. If yes, it forwards the request to IPFS. If not, it returns an error 403.
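The server-side part of the three steps above could be sketched roughly as follows. The header names X-Account and X-Signature are made up for illustration, and both recover_signer (the ECDSA signature recovery) and has_permission (the smart contract lookup) are passed in as placeholders rather than implemented, since they depend on the Ethereum tooling chosen.

```python
from typing import Callable, Mapping, Optional


def authorize_pin(
    headers: Mapping[str, str],
    cid: str,
    recover_signer: Callable[[str, str], Optional[str]],
    has_permission: Callable[[str], bool],
) -> int:
    """Return 200 if the pin request may be forwarded to IPFS, else 403.

    recover_signer(cid, signature) is a hypothetical helper returning the
    account that signed the CID (or None); has_permission(account) is a
    hypothetical wrapper around the on-chain permission check.
    """
    account = headers.get("X-Account")
    signature = headers.get("X-Signature")
    if not account or not signature:
        return 403
    signer = recover_signer(cid, signature)
    # The signature must come from the account named in the request.
    if signer is None or signer.lower() != account.lower():
        return 403
    # The contract must grant this account the right to pin.
    if not has_permission(account):
        return 403
    return 200
```

Because the extra data lives only in headers, a request that passes can be forwarded to the IPFS API unchanged.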

The smart contract could be a simple DAO where the IPFS access is managed by the ACL. This solution has its drawbacks however: the user would have to sign the CID every time he/she wants to pin data, and it would probably be harder to implement than JWT authentication.

Espresso

Finally, regarding the Espresso Datastore, it solves slightly different problems than what this IPFS and data querying service is intending to solve so I’m not sure it would be an optimal solution for you. However there may be a few concepts and ideas that could be useful. I won’t be available tomorrow unfortunately but you can contact me on Telegram at mcormier and I will most certainly keep an eye on this thread :slight_smile:


Oh…I guess I don’t care as much about privacy as I do about having to trust each other.
That being said, it might still make sense to abstract it away under a generic interface so switching between IPFS/Swarm and others is seamless.

I haven’t checked them out :grimacing: .
I think transaction cost is a different problem that we could tackle separately and that it only applies to the public blockchains. If the DAO is running on a plasma chain or a side-chain it could, like you described, periodically log a merkle root to a set of public blockchains (one of which could be the Aragon chain). The time frame of this could be set by the DAO.
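As a rough illustration of the periodic merkle root idea, the sketch below folds a batch of records into a single 32-byte root that could then be logged on a public chain. SHA-256 and the duplicate-last-node rule for odd levels are assumptions made here for simplicity; an Ethereum-based implementation would more likely use keccak256 and its own tree convention.

```python
import hashlib
from typing import Sequence


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Compute a binary Merkle root over the given leaf payloads."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


if __name__ == "__main__":
    batch = [b"tx-1", b"tx-2", b"tx-3"]
    print(merkle_root(batch).hex())
```

Only the root needs to go on-chain per time frame, which is what keeps the cost independent of how many records the side-chain produced.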

I guess my biggest concern is about the ipfs “admin” nodes that could choose to go rogue.

Hey everyone! Very productive call. Here’s a short summary, recordings are at the bottom:

  • Breaking this large forum discussion into two smaller “challenges” is helpful - pinning + querying.
  • Pinning: @hector jumped on the call to answer questions about IPFS-Cluster and collaborative pinning. TLDR - we should wait until their latest changes for peer permissions within a cluster (2-4 weeks) are merged. Longer notes can be found in comments from the agenda doc.
  • Pinning: we’re going to be creating a separate chat channel for pinning discussion / implementation. Please like this post if you’d like to be added to this channel.
  • Querying: some research will be done on Fluence, Origin Messaging and a POC 3Box discussions app to see if they are good fits. Sponnet is going to write up a post on Peepeth, which pins IPFS hashes in one big DAG to Ethereum at a regular time interval.

Check the agenda doc for more notes about the call

Recording of call (will upload to YouTube too):

https://www.dropbox.com/s/3i7wkyokzefj06g/autark%20on%202019-04-24%2015%3A55.mp4


We created a channel in aragon chat to coordinate IPFS efforts. https://aragon.chat/channel/ipfs-dev

@hector please keep us updated about the collaborative cluster functionality

Weighing in on some thoughts for IPFS clusters and authentication.

As discussed in a call today with @Schwartz10 and @stellarmagnet, I see two primary longer-term strategies for hosting IPFS data in the Network:

  • Association backed, potentially pinned by all Flock teams and/or service providers: host mission critical data for the Network, e.g. Flock apps, radspec and token registries, etc.
    • Makes the most sense for collaborative clusters, where each Flock team may be expected to run a replication node, and pinning permissions could be concentrated primarily in the Association or delegated to a small technical group
    • Already begun this process with AGP28, where the Aragon client’s releases are becoming more decentralized from A1’s control (and ideally would be kept pinned in other servers than just A1’s)
  • Per-app / per-flock backed data stores: app-specific data, e.g. TPS’ Projects app’s markdown files, Pando-backed repos
    • Other teams could altruistically replicate these data sets on goodwill / reciprocation
    • Service providers could provide replication nodes for a fee

In the long term, I would hope that each organization eventually begins to run its own infrastructure (or rent it via service providers) to pin important information related to its operations (similar to how basically every 5+ person organization in modern countries will either have self-hosting or paid cloud-backed solutions).

However, in both the short and long term, I get the impression there would be considerable value if the Association provided infrastructure for organizations to pin a reasonable amount of storage (e.g. 10-100 MB) for free. Beyond this range, there could be paid service tiers provided by either the Association or other service providers.

This limited amount would ideally be large enough for small users to frictionlessly upload their organization profiles, cross-org customizations (layouts, local labels, etc), and app-specific data blobs.


Potential per-organization authentication strategy

I assume we are able to create a thin authentication layer on top of IPFS clusters, either through a reverse proxy or some other means of proxying authentication requests, that is able to track the amount of data pinned by each author. If not, some more research will need to be done on creating this type of authentication “shell” around the IPFS API.
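The bookkeeping part of such a layer could look something like the sketch below, which tracks bytes pinned per author against a free allowance. The class name, the 10 MiB default and the in-memory storage are all assumptions for illustration; a real proxy would persist this and would need to learn object sizes from IPFS itself.

```python
FREE_TIER_BYTES = 10 * 1024 * 1024  # assumed allowance, picked for illustration


class PinQuota:
    """Track how many bytes each author has pinned against a fixed limit."""

    def __init__(self, limit_bytes: int = FREE_TIER_BYTES):
        self.limit_bytes = limit_bytes
        self._used = {}      # author -> bytes pinned so far
        self._pins = set()   # (author, cid) pairs already counted

    def used(self, author: str) -> int:
        return self._used.get(author, 0)

    def try_pin(self, author: str, cid: str, size_bytes: int) -> bool:
        """Record the pin and return True, or return False if over quota."""
        key = (author, cid)
        if key in self._pins:
            return True  # re-pinning the same content costs nothing extra
        if self.used(author) + size_bytes > self.limit_bytes:
            return False
        self._pins.add(key)
        self._used[author] = self.used(author) + size_bytes
        return True
```

The proxy would call try_pin before forwarding a pin request and reject (or redirect to a paid tier) when it returns False.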

Extending @mcormier’s message signature strategy, we could augment it to include some organization-based checks:

  1. Hardcode a role into the Aragon infrastructure, e.g. DATA_UPLOAD_ROLE, likely in the Kernel
  2. Allow organizations to assign certain apps or accounts a permission granting DATA_UPLOAD_ROLE
  3. Send requests to pin either through an app or EOA (see below)
  4. The authentication layer would check if the requester has permission to upload data on behalf of an organization (Kernel.hasPermission(<requester>, <kernel>, DATA_UPLOAD_ROLE))

Step 3 will differ based on the requester, as apps would not be able to sign messages and their contracts (obviously) cannot make HTTP calls:

  • If an EOA requests the pin, the user would only be required to provide the organization’s address and a signature of the CID as HTTP headers for a pinning request
  • If an app is assigned the permission:
    • If the contract can immediately invoke the action (see related issue), allow an EOA to send a pin request through HTTP, but require a “final forwarder” to be part of the request headers
      • The authentication layer needs to also check that the EOA is indeed able to forward to the “final forwarder” (see @aragon/wrapper), and that the app has the correct permissions
    • If not:
      • First, we assume a Network-wide “fake” contract (that is never deployed), e.g. "0xFFFF..FFF" - "DATA_UPLOAD_ROLE"
      • The Aragon client could create EVMScripts to this “fake” contract address with ABI-encoded IPFS hashes as its calldata
      • An EOA sends an HTTP request with “proof” that an app requested the pin, supplying enough information that allows the server to verify the EVMScript encoded in the app that calls this “fake” contract, and that the app has the correct permissions.
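The first two cases above (a direct EOA request, and an EOA going through a “final forwarder”) could be checked along these lines. Both has_permission and can_forward are hypothetical callables standing in for the Kernel permission lookup and the forwarding-path check; the “fake” contract case is not covered, since its proof format is still open.

```python
from typing import Callable, Optional

DATA_UPLOAD_ROLE = "DATA_UPLOAD_ROLE"


def may_pin(
    requester: str,
    organization: str,
    final_forwarder: Optional[str],
    has_permission: Callable[[str, str, str], bool],
    can_forward: Callable[[str, str], bool],
) -> bool:
    """Decide whether a pin request on behalf of an organization is allowed.

    has_permission(entity, organization, role) and
    can_forward(sender, forwarder) are assumed helpers, not real APIs.
    """
    if final_forwarder is None:
        # Plain EOA request: the account itself must hold the role.
        return has_permission(requester, organization, DATA_UPLOAD_ROLE)
    # App-mediated request: the EOA must be able to reach the forwarder,
    # and the forwarder (the app) must hold the role.
    return can_forward(requester, final_forwarder) and has_permission(
        final_forwarder, organization, DATA_UPLOAD_ROLE
    )
```

Keeping the two checks separate makes it clear that the EOA never needs the role itself when it acts through an app.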

This last case may be hard to generalize (as the “proof” could be hard to generate; I see no easy way to standardize an interface for testing if a forwarder would actually execute an action), so the alternative is to actually deploy a contract that just emits an event with msg.sender and contentHash, and have a server watch that contract’s events (a “watch tower”).
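A “watch tower” of this kind could be as simple as the loop sketched below. It assumes the events have already been decoded into dictionaries with sender and contentHash keys (fetching and decoding logs is left out), and that has_permission and pin are supplied by the surrounding service; all of these names are illustrative.

```python
from typing import Callable, Iterable, List, Mapping


def handle_events(
    events: Iterable[Mapping[str, str]],
    has_permission: Callable[[str], bool],
    pin: Callable[[str], None],
) -> List[str]:
    """Pin each content hash emitted by a permitted sender, once.

    Each event is assumed to look like
    {"sender": <msg.sender>, "contentHash": <hash>}.
    """
    pinned: List[str] = []
    seen = set()
    for event in events:
        sender = event["sender"]
        content_hash = event["contentHash"]
        if content_hash in seen:
            continue  # already handled in this batch
        if not has_permission(sender):
            continue  # anyone can emit the event; only permitted senders count
        pin(content_hash)
        seen.add(content_hash)
        pinned.append(content_hash)
    return pinned
```

Since the contract itself accepts calls from anyone, the permission check has to happen here, at the point where the server decides what to pin.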

This contract could alternatively have a mapping of keccak256(kernel, contentHash) -> bool (a bit more expensive than an event) to remove the “watch tower” and allow users to use the HTTP flow (as this contract would provide direct proof).

If the organization encodes an IPFS hash on-chain (either in the Kernel or a very simple app), it could also automatically call this contract for each storage update.


Hey @sohkai thank you for the in-depth write up. I think a good first step is getting the collaborative cluster up and running, and providing instructions to make it easy for other teams to run their own clusters as well. Then, we can create a new thread for authentication strategies, since that seems applicable to other features as well (Aragon One product scope for 0.8).

@hector - any updates about collaborative clusters?

Hey, the Cluster RPC authentication is PR’ed and will be merged this week. This is what allows running Cluster peers in “public” without fear of anyone other than the trusted peers controlling them.

Apart from that, we have a number of UX things in mind to make it slightly easier to run Clusters in this fashion (i.e. by being able to fetch configuration templates from IPFS), but they are not blockers to run a collaborative cluster. I’ll drop by here and post instructions once RPC auth is merged.


Hi, just a heads-up that we merged all the big things into ipfs-cluster master and it is now possible to run collaborative clusters. We are now adjusting small details and making improvements.

If you want to start working on the integration, it should be possible. The only problem is that documentation will not be ready until a stable release comes out, and this will take a while since we’re swamped with IPFS Camp preparations. However, you can just ping me and I’ll tell you what you need to know (I’m usually on the IPFS IRC channels).
