You had to select "Disks: 4x800GB SSD" if you want SSDs; it's HDD by default.
Ah, yes. My bad, I didn’t see the last option, I only saw 2x480GB SSD
It should work, so I’ve adjusted the requested funding accordingly. It works out to around $215-$220 per month for the archive node, and coupled with the other resources mentioned above (but now on OVH) it should work out to around $315-$320 in total per month.
Wow, that was fast. It only took one day before someone else followed our path. LOL
I had it in my drafts for a few days. Creating a proposal like this has been my plan ever since AGP-10 was added to the initial voting round.
Tbf, immediately after the last AGP there was general consensus that something like this was required. Iirc someone turned up on the chat forum a few days later talking about it too…
I don’t think you are asking for nearly enough money. We at Scout have been running multiple Ethereum full nodes and archive nodes for the past 6 months, and here are some things we have learned along the way:
The cheaper the server you use, the more problematic the node gets and the more maintenance time you have to spend. We ended up upgrading our server to an m5.xlarge (4 vCPUs + 16GB memory + 2.7TB SSD) for an archive node, and saw far fewer crashes and blockchain syncing issues. The monthly cost on AWS is $407, and every extra 100GB added to the SSD costs an extra $10 a month.
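The arithmetic above can be sketched as a small cost function. The figures are the AWS numbers quoted in this post; treat them as a snapshot of that pricing, not a current quote:

```python
def monthly_cost(extra_storage_gb: int = 0) -> float:
    """Estimated monthly AWS cost for the archive-node setup described above.

    Base: m5.xlarge (4 vCPUs, 16GB RAM) plus 2.7TB SSD = $407/month.
    Each additional 100GB of SSD adds $10/month.
    """
    BASE_USD = 407
    USD_PER_100GB = 10
    return BASE_USD + (extra_storage_gb / 100) * USD_PER_100GB

# e.g. provisioning 500GB of headroom for chain growth:
print(monthly_cost(500))  # -> 457.0
```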
You will probably run your own code on those servers as well to parse data. For example, we run cron jobs to get the balances of all DAOs every 2000 blocks, all done through concurrent eth_calls. We also have a separate server running a full node to handle anything that does not require historical blockchain state or balances. I definitely don’t recommend using cheap servers.
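The balance-snapshot job described above boils down to sampling the chain at a fixed interval. A minimal sketch of the scheduling half (block heights only; the concurrent eth_call fan-out itself is omitted, and the 2000-block interval is the one quoted in this post):

```python
def snapshot_heights(start_block: int, latest_block: int, interval: int = 2000) -> list[int]:
    """Block heights at which to query DAO balances, one every `interval` blocks.

    Heights are aligned to multiples of `interval`, so repeated runs of the
    cron job sample the same blocks regardless of when they start.
    """
    first = ((start_block + interval - 1) // interval) * interval
    return list(range(first, latest_block + 1, interval))

# e.g. catching up from block 6,500,000 to 6,505,000:
print(snapshot_heights(6_500_000, 6_505_000))
# -> [6500000, 6502000, 6504000]
```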
For any data that involves aggregation or transformation, you need a modern database. A $40 database server is far from a production-quality database. I am almost certain you will upgrade that.
You need to back up the node’s data. Back last May, it took us more than 2 weeks to sync an archive node, and it will probably take you even longer today. So we back up the node data every 24 hours; if something happens to our server or hard drive, we only need to re-sync the last 24 hours’ worth of node data.
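The trade-off behind that 24-hour cadence is easy to quantify: a restore only has to replay roughly a day of blocks instead of weeks of full sync. A rough sketch (the ~13-second average block time is an assumption; mainnet averages vary):

```python
def blocks_to_resync(backup_interval_hours: float = 24, avg_block_time_s: float = 13) -> int:
    """Worst-case number of blocks to re-sync after restoring the latest backup.

    With daily backups and ~13s blocks, a restore replays roughly a day's
    worth of blocks rather than re-syncing the chain from genesis.
    """
    return int(backup_interval_hours * 3600 / avg_block_time_s)

print(blocks_to_resync())  # -> 6646
```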
The above is just hard cost every month to run a production quality service. I am not even counting the compensation for your time.
In case you are interested, here is a recent talk on running a production level eth node cluster.
Interesting. Would it make more sense to fund someone to manage such infrastructure and give access to it to projects that need it, such as Daolist? It could be the same person initially. Although renting the servers should probably be done by the Foundation, and managing them by whoever is elected to do so.
Like, if two projects have this requirement, is it the same price if they each host and maintain an archive node separately, compared to sharing one? I would think sharing one would be a lot more efficient, though one small downside is troubleshooting, which is harder in "colocation"-type environments.
In the same vein, we could have our own “Infura” for Aragon team members / projects. Having your own node is full decentralisation and is great; relying on Infura is the opposite and makes no sense. But a node shared between project collaborators might be a good, practical middle ground. At least until light nodes are rock solid, that is.
I believe any startup should deploy all its available engineering resources towards the core of its product. When I say “core”, I mean something that your users can NOT live without, something that DIRECTLY impacts the quality of your product. Otherwise, the startup is ignoring the opportunity cost of its engineering resources and scalability down the road.
Now to answer your question: I think it makes sense to consider managing the infrastructure for your core product. For non-core products, I would not do that at all. I would only bring it in-house when the cost of using a 3rd-party service exceeds what it would have cost you to maintain it internally. As a matter of fact, that rarely happens unless you are at Facebook/Google scale.
Disclaimer: I run Scout, which builds a self-service analytics platform for Ethereum blockchain teams. So my answer might sound biased.
That’s the part I am curious about, if you want to expand on the rationale. Your first point makes perfect sense to me, and precisely because engineering resources should be focused on the product, managed hosting services might be a good path. That said, I am aware that in some specific cases it is better to manage those services internally; I am just not seeing that case here yet.
Hey I just found this, that’s exactly what I have in mind: https://blog.slock.it/how-to-not-run-an-ethereum-archive-node-a-journey-d038b4da398b
At Scout, we started with Infura and quickly ran into limitations in how we could gather, aggregate and transform data. We consider that flexibility a significant improvement to the core of our product. That’s why we ended up rolling out our own node infrastructure.
For the database, we use MongoDB Atlas, a cloud-based managed service. We’re more than happy to pay extra to not worry about managing, fine-tuning, migrating and scaling our database servers.
I hope that’s helpful for evaluating your case.
I see, thanks for clarifying!
I would think that the archive node is just that: an archive node. So the admin of that node just has to make sure there’s enough Storage / IOPS / CPU / Memory in order for the service to work properly for its users. Then a separate server would indeed gather, aggregate and transform data; for that specific (much) smaller server it makes sense for it to be dedicated to the project.
When it comes to your usage of Infura, their interests are not aligned with yours, so although they might try to help a bit, they will probably dedicate few resources to adjusting their systems to your use case. An “Infura-like” infrastructure for the Aragon devs and projects, like the one Slock.it built, would probably be quite different, as Aragon has every incentive to maintain an archive node that works well for the devs and projects relying on it.
I might be wrong but this seems worth clarifying to help token holders best spend the project’s funds.
It is beyond my knowledge to suggest which approach you should take, since I don’t have enough insight into what has been or will be planned for Aragon.
If you are confident that there is enough upside to justify the risk and cost of building an Aragon infrastructure team upfront, having an “Infura-like” infrastructure for the Aragon devs and projects makes sense. There will almost certainly be overhead in maintaining, tuning and scaling that infrastructure, no matter how simple it might appear at the beginning.
If you are not confident, it is probably best to just wait and see whether there is enough demand as more projects build on top of Aragon. By then, you will probably have more data points to make a solid decision.
From my personal experience (having founded and exited two venture-backed startups), I have never made a significant investment in in-house tech infrastructure until I knew that:
- We have a very clear product market fit
- We have a solid 18-24 months runway without revenue.
Thank you for the concern; I’ve tried to reply to the best of my ability.
Sure, but I am not running 6 nodes and I don’t have any particular need in running 6 nodes. I only need to run 1 for now, maybe one or two more if I want to be really sure nothing funky happens (like missed events). I just don’t need that now.
This is not how Daolist works now, and it is not how Daolist is going to work. I’ve also stated in the funding proposal that I am going to run the code on a separate server - the architecture Daolist uses now is scalable enough and does not use cron jobs.
I am certain I will upgrade it at some point, but aggregation happens as soon as the data comes in and the result is cached. That is why I’ve requested funding for a Redis instance, used for caching.
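The pattern described here (aggregate on ingest, serve reads from the cache) is essentially write-through aggregation. A minimal sketch using a plain dict as a stand-in for Redis; the key names and counter logic are illustrative, not Daolist’s actual schema:

```python
# A plain dict standing in for a Redis instance; in production this would be
# a redis client with the same get/set shape.
cache: dict[str, int] = {}

def record_event(org: str, cache: dict[str, int]) -> None:
    """Aggregate as data comes in: bump the per-org event counter in the cache."""
    key = f"events:{org}"  # illustrative key name, not Daolist's actual schema
    cache[key] = cache.get(key, 0) + 1

def event_count(org: str, cache: dict[str, int]) -> int:
    """Reads hit the cache directly; no on-demand aggregation needed."""
    return cache.get(f"events:{org}", 0)

record_event("example-dao", cache)
record_event("example-dao", cache)
print(event_count("example-dao", cache))  # -> 2
```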
I am aware of this, but I am swallowing that cost myself.
I am not even counting the compensation for your time.
I am not seeking compensation for my time.
Final note, I am not asking for the full amount needed to host Daolist. I am well aware that it might cost more and there will certainly be unforeseen expenses, but I am willing to pay for these myself. This proposal has never been about getting full funding for Daolist. It’s about getting funding for the most expensive parts, because I can’t take on the full cost myself, otherwise I would have done that.
I am aware that the setup I’ve described might not seem “scalable” or “production-ready”, but I know what I am doing and I am sure that this is enough for now. If Daolist needs to scale further and I suddenly have to run a lot more stuff, that will be reflected in another funding proposal if needed.
One last note: I don’t think Scout and Daolist can be compared too closely. Yes, both collect and transform data from the blockchain, but Scout has far more scalability requirements, since Scout is general-purpose and needs to be able to handle more than one project. Daolist exclusively collects data on Aragon orgs and provides a more Aragon-centric experience.
My friend, that’s totally your call.
Thank you very much @huangkuan for taking the time to express your experience with this matter, that’s precious feedback.
The question for me now is whether or not Aragon should have an Ethereum archive node shared among devs and sponsored projects. The cost of such a node is high, higher still if configured in a fault-tolerant way, and it will keep growing as the chain size increases.
edit: I’d be curious to see how this would work out. I’m open to starting an Ethereum archive node now and funding it for 3 to 6 months. Out of this I’d get to improve my DevOps skills, and we get to experiment with the model I suggested. However, if we conclude that it’s useless to eventually have an archive node shared within the org, it’s probably best that nobody wastes time on this idea.
I like this proposal. I would vote yes on it. Good work on daolist so far @onbjerg and the redesign looks nice too
I think that having an archive node synced and available for Flock and Nest teams could be valuable. For example, generating EVM Storage Proofs requires an archive node, and we have only been testing against the latest block (which totally misses the point) or local development nodes, because we didn’t have access to one.
An archive node would also be helpful in case there is an issue that requires us to inspect the blockchain at a particular block height in order to debug.
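Both use cases above (storage proofs and debugging at a past height) come down to JSON-RPC calls with a historical block parameter, which only an archive node can answer for arbitrary heights. A sketch of the `eth_getProof` request a storage proof would need; the address, storage slot, and block height below are placeholders:

```python
import json

def get_proof_request(address: str, storage_keys: list[str],
                      block_number: int, request_id: int = 1) -> str:
    """Build an eth_getProof JSON-RPC request for a historical block.

    A default (pruned) node can only answer this near the chain head;
    an archive node can answer it for any past block.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getProof",
        "params": [address, storage_keys, hex(block_number)],
    })

# Placeholder address and storage slot, at an arbitrary past height:
req = get_proof_request("0x" + "00" * 20, ["0x0"], 6_500_000)
print(req)
```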
The fact that daolist.io also needs it, and Oliver is willing to set it up and maintain it, makes this a win-win situation in my opinion.
Regarding daolist vs Scout I think they solve different problems and I use them both every day for different reasons. I would really love for both proposals to pass so these amazing services can keep improving!
I’m curious to know, what are the different reasons that you use them?
Daolist I use when I just want to get the number of deployed DAOs and check the names of new DAOs. It is also what I always use to show how people are using Aragon.
Scout I use mainly for personal purposes a couple of times a week, to check usage stats that Daolist doesn’t currently have.