Marlin’s Oyster CVM
Confidential Virtual Machines (CVMs) provide secure, private computation in untrusted environments. By leveraging hardware-based encryption and secure enclaves, CVMs ensure that sensitive data and applications remain protected even when running on third-party infrastructure. With Oyster CVM tooling, you can create secure execution environments that preserve data privacy while enabling complex computations.
AI Agents & The Challenge of On-Chain Execution
Running AI agents directly on-chain presents significant challenges:
- High computational demands
- Non-deterministic execution
- Dependence on general-purpose programming languages
Since on-chain execution isn’t feasible, the next-best solution is to run AI models off-chain while ensuring their execution is verifiable on-chain.
Ensuring Verifiable AI Execution
To guarantee trust in off-chain AI execution, we implemented an HTTP proxy server that lets clients verify that responses were generated within a trusted enclave.
- When the enclave boots, it generates a public-private key pair that remains inside the enclave and is lost upon reboot.
- The private key signs every model-generated response, so each response can be verified against the enclave's public key.
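To make the signing step concrete, here is a minimal sketch. The 65-byte hex signature shown later in this guide suggests an Ethereum-style secp256k1 scheme (via the eth_account library), but the exact curve and payload format used by the proxy are assumptions here, not confirmed internals:

from eth_account import Account
from eth_account.messages import encode_defunct

# Created at enclave boot; held only in enclave memory, so a reboot
# destroys it and generates a different key.
enclave_key = Account.create()

# Illustrative response body; the real payload is the model output.
response = b'{"model": "llama3.2", "response": "..."}'
signed = enclave_key.sign_message(encode_defunct(response))

# Anyone holding the response and signature can recover the signer
# address and check it against the enclave's attested address.
assert Account.recover_message(
    encode_defunct(response), signature=signed.signature
) == enclave_key.address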
Deploying Llama3.2 Securely in Oyster
We deployed Llama3.2 inside Oyster using the Ollama framework:
- An HTTP proxy forwards external prompts to the model running inside the enclave.
- The model processes the request and generates a response.
- Instead of streaming the response (as AI chatbots like ChatGPT do), the proxy collects it in full, signs and timestamps it, and only then sends it to the user.
This approach ensures that every response is both secure and cryptographically verifiable, maintaining the highest level of trust in AI-driven computations.
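For illustration, here is a minimal Python sketch of such a proxy. It assumes Ollama listens on its default port 11434 inside the enclave, that the proxy serves on port 5000 (matching the cURL example later), and that the signed payload is the timestamp concatenated with the raw response body; the deployed proxy presumably uses the key mounted in the Docker Compose file below, whereas this sketch generates one in-process for simplicity.

import json
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

from eth_account import Account
from eth_account.messages import encode_defunct

# Ephemeral signing key: created at startup, never persisted, so it
# is lost (and replaced) whenever the enclave reboots.
ENCLAVE_KEY = Account.create()

class SigningProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        raw = self.rfile.read(int(self.headers["Content-Length"]))
        # Forward the prompt to the Ollama server inside the enclave,
        # with streaming disabled so the response can be signed whole.
        payload = json.loads(raw)
        payload["stream"] = False
        upstream = urllib.request.Request(
            "http://127.0.0.1:11434" + self.path,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(upstream) as resp:
            body = resp.read()
        # Timestamp and sign the fully collected response
        # (assumed payload layout: timestamp || body).
        ts = str(int(time.time()))
        signed = ENCLAVE_KEY.sign_message(encode_defunct(ts.encode() + body))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("x-oyster-timestamp", ts)
        self.send_header("x-oyster-signature", bytes(signed.signature).hex())
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 5000), SigningProxy).serve_forever()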
Deploying Llama3.2 in Oyster with Docker Compose
The following Docker Compose configuration sets up Llama3.2 inside Oyster, ensuring secure execution within a Confidential Virtual Machine (CVM):
services:
  # Llama Proxy Service
  llama_proxy:
    image: kalpita888/ollama_arm64:0.0.1
    container_name: llama_proxy
    init: true
    network_mode: host
    volumes:
      # Mount the enclave-generated signing key into the proxy
      - /app/ecdsa.sec:/app/secp.sec

  # Ollama Server
  ollama_server:
    image: ollama/ollama:0.6.0
    container_name: ollama_server
    init: true
    network_mode: host
    healthcheck:
      test: ["CMD-SHELL", "ollama --version"]
      interval: 10s
      retries: 3

  # Ollama Model Instance
  ollama_model:
    image: ollama/ollama:0.6.0
    container_name: ollama_model_llama3.2
    command: pull llama3.2
    init: true
    network_mode: host
    depends_on:
      ollama_server:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "ollama show llama3.2"]
      start_period: 2m30s
      interval: 30s
      retries: 3
Service Breakdown
- Llama Proxy Service:
  - Forwards prompts to the model
  - Signs responses for cryptographic verification
  - Proxy code & Dockerfile are available for further customization
- Ollama Server:
  - Runs Ollama without a desktop application
  - Exposes a healthcheck that confirms the server is running
- Ollama Model Instance:
  - Pulls and initializes Llama3.2 (3B parameters, ~2 GB)
  - Ensures the correct model is set up for AI inference
Note: If using a different model, check the Ollama Model Library and update the Docker Compose file accordingly.
Deploying the Enclave
Set up a wallet with 0.001 ETH and 1 USDC on the Arbitrum One network.
Deploy the enclave using the command for your system architecture:
- For amd64 systems:
# Replace <key> with the private key of your wallet
oyster-cvm deploy --wallet-private-key <key> \
--docker-compose ./docker-compose.yml \
--instance-type c6a.4xlarge \
--region ap-south-1 \
--operator 0xe10Fa12f580e660Ecd593Ea4119ceBC90509D642 \
--duration-in-minutes 20 \
--pcr-preset base/blue/v1.0.0/amd64 \
--image-url https://artifacts.marlin.org/oyster/eifs/base-blue_v1.0.0_linux_amd64.eif
- For arm64 systems:
# Replace <key> with the private key of your wallet
oyster-cvm deploy --wallet-private-key <key> \
--docker-compose ./docker-compose.yml \
--instance-type c6g.4xlarge \
--region ap-south-1 \
--operator 0xe10Fa12f580e660Ecd593Ea4119ceBC90509D642 \
--duration-in-minutes 20 \
--pcr-preset base/blue/v1.0.0/arm64 \
--image-url https://artifacts.marlin.org/oyster/eifs/base-blue_v1.0.0_linux_arm64.eif
Deployment & Execution Time
- Enclave setup: ~3 minutes
- Model download & initialization: ~4 minutes (longer for larger models)
Testing the Setup
Once the deployment is complete, you can verify the model by running the following cURL command:
curl http://{{instance-ip}}:5000/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'
Since the HTTP proxy collects, signs, and timestamps the response before sending the final output, expect a wait time of ~2 minutes.
Verifying the Response
Running the cURL command with the -v flag will display two critical headers for verification:
x-oyster-timestamp: 1741620242
x-oyster-signature: 8781e472b0f8e3693c1c6cec60b1ae0f5fed4c574d24e3bfcc6cc23f02a918a8785709ceb8a464a7d1dbbb8809ba73047acaa3ff5f1918ba565d82d177e123801b
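As a sketch of client-side verification, assuming the signed payload is the timestamp concatenated with the raw response body (matching the illustrative proxy above; the actual layout may differ), you can recover the signer address and compare it against the enclave's attested public key:

import urllib.request

from eth_account import Account
from eth_account.messages import encode_defunct

INSTANCE_IP = "203.0.113.10"  # placeholder: replace with your instance IP

req = urllib.request.Request(
    f"http://{INSTANCE_IP}:5000/api/generate",
    data=b'{"model": "llama3.2", "prompt": "Why is the sky blue?"}',
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = resp.read()
    ts = resp.headers["x-oyster-timestamp"]
    sig = resp.headers["x-oyster-signature"]

# Recover the signer; if it matches the enclave's attested address,
# the response provably came from inside the enclave.
signer = Account.recover_message(
    encode_defunct(ts.encode() + body),
    signature=bytes.fromhex(sig),
)
print("response signed by:", signer)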
That’s it! You now have Llama3.2 securely running inside Oyster, with cryptographic verification ensuring trusted communication.