Building the components of a trustless distributed compute network

🌐 Overview

We've had a great week: we sketched out a Filecoin data onboarding use-case, connected to IPC for the first time, updated the prototype so that it can connect to IPC, researched autonomous agents as a future approach to auto-tuning collateral requirements, spoke at HackFS, and featured Waterlily in a Twitter Spaces appearance!

⚒️ Lilypad Engineering Update

Filecoin data onboarding use-case

We had a great brainstorming session with Anjor from the outercore team and put together a sketch for how Filecoin data prep from Lilypad can work in the future:

We assume that the user stages their data in S3, or behind an S3-compatible API, with object versioning enabled
Pinning a specific set of object versions (with client code we provide to make this trivial) will allow us to make onboarding reproducible
We will focus only on data prep, not deal making
The input to the system will be data in S3 or an S3-compatible store, and the output will be CAR files and metadata files served via HTTP URLs for a deal engine to provide directly to SPs
Since the input data can be pinned down by S3 object versioning (and optionally MD5 checksums), the data prep workflow can be made deterministic, and therefore we can use the same verification algorithm as in the Mechanisms paper (optimistically verifying via replication)
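The pinning idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Lilypad's client code: PinnedObject, manifest_digest and replicas_agree are hypothetical names, and the manifest layout is invented for the example. The point is that once every input is identified by key, version ID and (optionally) MD5, the whole input set reduces to a single digest, and independent runs of a deterministic data prep job can be compared by the digests of their outputs.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PinnedObject:
    """One S3 object pinned to an exact version (hypothetical manifest entry)."""
    key: str
    version_id: str
    md5: str = ""  # optional checksum; empty when not supplied


def manifest_digest(objects):
    """Return a SHA-256 digest that depends only on the pinned set, not its order."""
    ordered = sorted(objects, key=lambda o: (o.key, o.version_id))
    canonical = json.dumps(
        [asdict(o) for o in ordered], sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replicas_agree(output_digests):
    """Optimistic check: every replica must report the same output digest."""
    return len(output_digests) > 0 and len(set(output_digests)) == 1


# Illustrative values only; keys, version IDs and checksum are placeholders.
pinned = [
    PinnedObject("dataset/part-0001.bin", "v1"),
    PinnedObject("dataset/part-0000.bin", "v7", md5="placeholder-md5"),
]
print(manifest_digest(pinned))
```

Sorting before hashing means two clients that list the same object versions in a different order still get the same manifest digest, while changing any single version ID produces a different one.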

First contact with an IPC subnet

We successfully connected to IPC from hardhat and deployed a contract to it.

Updating prototype to connect to IPC

The prototype code we inherited unfortunately assumed it was running against a dev instance of, say, geth or hardhat, where the accounts are unlocked on the server. What's more, it did this by manually making JSON-RPC calls to the server with pycurl. Connecting to an IPC subnet requires the production-style setup instead: the client signs transactions with a local private key that is never shared with the server.
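To make the difference concrete, here is a rough sketch of the two JSON-RPC request shapes involved. eth_sendTransaction and eth_sendRawTransaction are standard Ethereum JSON-RPC methods; the helper names, the addresses and the placeholder bytes standing in for a signed transaction are assumptions made for illustration, and no real signing happens here.

```python
import json
from itertools import count

_request_ids = count(1)


def rpc_request(method, params):
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}


def unlocked_account_request(sender, recipient, value_wei):
    """Dev-node style: the server holds the key for `sender` and signs on its behalf."""
    tx = {"from": sender, "to": recipient, "value": hex(value_wei)}
    return rpc_request("eth_sendTransaction", [tx])


def locally_signed_request(signed_tx_bytes):
    """Production style: the client signs locally and sends only the raw signed bytes."""
    return rpc_request("eth_sendRawTransaction", ["0x" + signed_tx_bytes.hex()])


# Hypothetical addresses; the bytes below are a placeholder, not a valid signed transaction.
dev = unlocked_account_request("0x" + "11" * 20, "0x" + "22" * 20, 1000)
prod = locally_signed_request(b"\x01\x02\x03")
print(json.dumps(dev, indent=2))
print(json.dumps(prod, indent=2))
```

In the first shape the node must have the sender's account unlocked, which is only reasonable on a dev instance. In the second, the private key never leaves the client.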

So, we ripped out the hacky pycurl code in the EthereumClient in the prototype code and swapped in web3.py, a much more robust Ethereum client library which has middleware for automatically signing transactions with private keys.

This was quite a large engineering effort: we had to implement the interface, and lots of code elsewhere in the codebase made assumptions about the exact form of the return values, which differed between the pycurl code and the library. But we got it working, and just moments ago we were able to connect the prototype to IPC and sign a transaction!

Unfortunately, we are now fighting intermittent, varying errors from IPC about gas limits, even when we set sensible ones.

So work is ongoing to figure out whether this is something we need to tune on our end or an issue with IPC/FVM.

🎓 Lilypad Research Update

Levi Rybalov has been busy reviewing the research on Autonomous Agents and Reinforcement Learning this week. Here's his summary from the academic trenches!

On Autonomous Agents and Reinforcement Learning

Many papers that attempt to disincentivize cheating in distributed computing environments with verification via replication share a major shortcoming: the literature generally assumes that participants in the network are rational, utility-maximizing agents.

The problem is that clients and compute nodes are not necessarily perfect in executing their utility-maximizing strategies. Rather, they are operated by human beings who are designing their own strategies. Balancing collateral requirements, compute capabilities and the choice of which tasks to accept is a complex problem, and analytic solutions are unlikely to realize the full potential of the network's combined computing power.

Autonomous agents, representing compute nodes, can take capital availability and collateral requirements in collateral markets (and the compute capabilities of the machines they represent) into account when determining which tasks to bid on. Likewise, autonomous agents representing clients can balance the capital availability, compute and collateral requirements and the reliability of compute nodes when making asks on their tasks and choosing which compute nodes to select.
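As a toy illustration of the kind of decision such an agent automates, the sketch below has a compute node pick tasks to bid on under a capital budget. Everything here is an assumption made for the example: the Task and Node fields, the profit model, and the greedy profit-per-collateral rule are hypothetical stand-ins, not the reinforcement learning approach under review and not part of the Lilypad protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    payment: float       # reward for a completed job
    compute_cost: float  # what it costs this node to run the job
    collateral: float    # capital locked up while the job runs
    required_gpus: int


@dataclass(frozen=True)
class Node:
    capital: float  # capital available to post as collateral
    gpus: int       # compute capability of the machine


def profit(task):
    return task.payment - task.compute_cost


def is_feasible(node, task):
    """A task is worth considering only if the node can run it, fund it, and profit."""
    return (
        task.required_gpus <= node.gpus
        and task.collateral <= node.capital
        and profit(task) > 0
    )


def choose_bids(node, tasks):
    """Greedy heuristic: prefer tasks with the most profit per unit of locked collateral.

    Simplification: GPUs are checked per task and not shared across chosen tasks.
    """
    def profit_per_collateral(task):
        return float("inf") if task.collateral == 0 else profit(task) / task.collateral

    remaining = node.capital
    chosen = []
    feasible = [t for t in tasks if is_feasible(node, t)]
    for task in sorted(feasible, key=profit_per_collateral, reverse=True):
        if task.collateral <= remaining:
            chosen.append(task)
            remaining -= task.collateral
    return chosen


node = Node(capital=22.0, gpus=2)
tasks = [
    Task("render", payment=10.0, compute_cost=4.0, collateral=20.0, required_gpus=1),
    Task("resize", payment=8.0, compute_cost=7.0, collateral=5.0, required_gpus=1),
    Task("train", payment=50.0, compute_cost=10.0, collateral=100.0, required_gpus=4),
]
print([t.name for t in choose_bids(node, tasks)])
```

Even this tiny model shows how capital, collateral and compute interact: the node skips "train" because it lacks the GPUs and the capital, and once "render" has locked most of its capital it can no longer afford the collateral for "resize". A learned agent would replace the fixed greedy rule with a policy that adapts to market conditions.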

In this manner, autonomous agents can be used to turn all utility-maximizing agents into rational actors that are (near) perfect in executing their strategies, thereby eliminating this complicating factor and, additionally, automating many of the decision-making processes of network participants.

One of the long-term goals of modelling the problem in this manner is to design the protocol so that utility maximization and honest computation are aligned to the greatest extent possible, such that all utility-maximizing actors would prefer to delegate strategic decision-making to these autonomous agents. There is a vast literature touching on all of these topics.

Expect more to come in the weeks ahead as we dive into this fascinating (and currently exploding) field of autonomous agents, reinforcement learning, and cooperative artificial intelligence.

🌟 Lilypad Community Update

This week we were proud to support the hackers at HackFS, and we chatted on The Decentralists podcast about AI and the infrastructure needs of the future.

👉 HackFS: Build the Decentralized Web

HackFS, supported by ETHGlobal, is Filecoin's premier yearly hackathon and is now in its 4th year! Projects from prior hackers have gone on to become successful startups of their own in the Protocol Labs Network.

We're excited to be supporting HackFS this year with a $10,000 Bacalhau Bounty, as well as mentorship and guidance along the way.

Check out the live video workshop on Bacalhau & Lilypad here!

https://youtu.be/TO5IzOExYZw

We've also had several teams let us know about some of the projects they're building with Bacalhau x FVM. Here's a sample of the promising project ideas we're seeing (we're loving seeing DataDAOs, D-AI & DeSci front & centre!):

Science-Requestor: a DataDAO where scientists can pay for AI work or other computation requests with their own scripts and data to be used.
DatAgent DAO: a DAO marketplace where communities can share their datasets to train ML models & collectively earn from them.
FutureOfMedTech: aggregating open medical datasets via Filecoin & IPFS CID listings. Features include automated ML training with Bacalhau, inference on private patient data, and computation proofs via ZK proofs.
DMindDAO: a governed marketplace to trade voice NFTs that can be used for training TTS AI models on a decentralized compute network.
InferAI: a single-click deployment platform for ML models on decentralized compute and storage, powered by the Filecoin Virtual Machine and Bacalhau.

👉 Twitter Spaces - The Decentralists

Check out the latest episode of The Decentralists, where Ally Haire spoke about why decentralised AI technology matters from both an engineering and a social perspective, and how web3 can help solve some of the new challenges cropping up with widespread AI usage, like attribution, provenance & verification, and proof of humanity. They also chatted about Waterlily.ai. Built on top of FVM & Lilypad/Bacalhau, Waterlily.ai is a first-of-its-kind user-facing app that aims to provide a more equitable paradigm for generative AI art platforms and to tackle the issue of AI attribution by paying royalties to original artwork creators on every use of the platform.

🎙️ Listen to the replay here

🎨 Find out more about Waterlily here

https://twitter.com/DeveloperAlly/status/1666771951183077376?s=20

https://www.youtube.com/watch?v=oXNaCs0eyXI

🔮 What's Next?

This upcoming week we're working on defining our roadmap and building out more prototype work around the research as well as continuing with the technical prototyping.

☎️ Contact Us

🗞️ Subscribe to the newsletter: https://lilypad.tech/newsletter

💬 Chat with us on Slack: https://bit.ly/bacalhau-project-slack #bacalhau-lilypad

🍃 Lilypad Project Report: June 12, 2023
