Isolated Instances’ Monitoring Protocol
The monitoring protocol establishes the fault tolerance guarantees for Oyster Isolated Instances and involves a network of Auditors who ensure compliance through the Isolated Instance protocol.
Protocol guarantees
When registering, enclave providers stake POND tokens as security for the enclaves they operate. If these enclaves are found to be non-operational during audits, the staked tokens are slashed. Similarly, auditors involved in the auditing process stake POND tokens to guarantee their active participation and the accuracy of their auditing. If auditors are found to be inactive or if they submit inaccurate audit information, their stake can be slashed.
Protocol trust assumptions
Secure Communication
The protocol assumes that communication between Auditors and the enclaves is secure, and that HTTPS or other secure channels (like TLS for the audit requests) are effectively concealing the nature of the request (audit or user) from the host. This assumes no vulnerability in the communication protocol or the implementation thereof that could be exploited to distinguish between user and auditor traffic or to intercept and manipulate data.
Endpoint Security and Reliability
It is assumed that each enclave exposes a standard endpoint through a secure channel and that this endpoint is reliably available for auditing requests. However, an enclave may be subject to a DoS attack or other network-level disruptions that prevent auditors from reaching it.
Randomness and Assignment Integrity
The assignment of Auditors to enclaves is based on a random seed (Sri), which is assumed to be generated fairly and securely, and to be resistant to manipulation. This seed influences the distribution of auditor subsets to enclaves, meaning that the integrity of this randomness is critical to the proper functioning of the auditing process. If this process is compromised, for example because the seed is predictable or unevenly distributed, it becomes an attack vector for collusion and targeted attacks.
High-level overview
Protocol structure
- Epochs: Time is organized into periods called Epochs.
- Slots: Each Epoch is further divided into Slots.
- Ages: Each Slot is subdivided into Ages.
- Auditors: A random number generated each Epoch assigns Auditors specific Jobs to audit.
Audit Process
Auditors send requests through secure channels established with Instances, which must respond within a set time limit. For each Job within a Slot, Auditors take a majority vote to determine the liveness of the assigned Instance.
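The majority vote described above can be sketched as follows. This is a minimal illustration, not the protocol's implementation: the function name isLive and the boolean vote encoding are hypothetical, and treating a tie as "not live" is an assumption, since the text does not specify tie handling.

```javascript
// Hypothetical helper: decides liveness of an Instance from auditor votes.
// votes: array of booleans, true = the auditor saw a response within the time limit.
function isLive(votes) {
  const up = votes.filter(Boolean).length;
  // Strict majority required; a tie (or no votes) counts as not live (assumption).
  return up * 2 > votes.length;
}

console.log(isLive([true, true, false])); // true
console.log(isLive([true, false]));       // false
```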
Technical Specifications
Enclaves (T): A set of enclaves, totaling t in size, is audited.
Auditors (A): A set of auditors, totaling a in size, conducts the audits.
Epochs (E): Run for a length of LE, divided into n Slots (EiS) each of length e.
Ages: Every Slot is broken into m ages, each lasting p seconds.
Formulas
- Total Epoch length, LE = n * e.
- Slot length, e = m * p.
- SlotId for the s-th slot of epoch Ei is calculated as: SlotId = i * n + s
- AgeId for the a-th age of a slot with SlotId is: AgeId = SlotId * m + a
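As a worked example, the formulas above translate directly into code. The function names are hypothetical, and the sketch assumes that epoch, slot, and age indices are zero-based, which the formulas suggest but the text does not state.

```javascript
// Hypothetical helpers; parameter names mirror the symbols used in the formulas.
function slotId(i, n, s) {
  // i: epoch index, n: slots per epoch, s: slot index within the epoch
  return i * n + s;
}

function ageId(slot, m, a) {
  // slot: SlotId, m: ages per slot, a: age index within the slot
  return slot * m + a;
}

function epochLength(n, m, p) {
  // LE = n * e, where e = m * p
  return n * (m * p);
}

// Epoch 2 with 10 slots per epoch, slot 3; 6 ages per slot, age 4; 30-second ages.
const sid = slotId(2, 10, 3);
const aid = ageId(sid, 6, 4);
console.log(sid, aid, epochLength(10, 6, 30)); // 23 142 1800
```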
Protocol stages
The auditing protocol, at a high level, consists of the following stages:
- Enclave setup
- Auditor assignment
- Enclave monitoring phase
- Aggregating audit data
- Challenging audit’s data
- Verifying the audit
Stage 1. Enclave setup
At the beginning of every Epoch, each Enclave internally generates a random seed Sei, which is not exposed to anyone until a seconds after the end of the Epoch. Once a seconds have passed since the end of the Epoch, the host machine on which the Enclave is running can query the random seed Sei, along with an attestation signed by the enclave. The host machine has to submit the random seed Sei on-chain within k seconds of the moment it can be queried (that is, within a+k seconds after the end of the Epoch).
Slashing condition(s)
- If the host machine fails to submit the random seed Sei, the Enclave is considered to be offline and POND tokens staked against the enclave are slashed.
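The seed reveal timing can be illustrated with a short sketch. The function names, the use of plain numeric timestamps in seconds, and the inclusive deadline are all assumptions made for illustration; a and k are the constants from the paragraph above.

```javascript
// Hypothetical helper: when the seed becomes queryable and when it is due on-chain.
function seedWindow(epochEnd, a, k) {
  return { queryableAt: epochEnd + a, deadline: epochEnd + a + k };
}

// Hypothetical helper: true when the enclave would be treated as offline.
// submittedAt is null when the host never submitted the seed.
function isSeedSlashable(epochEnd, a, k, submittedAt) {
  const { deadline } = seedWindow(epochEnd, a, k);
  // Inclusive deadline is an assumption of this sketch.
  return submittedAt === null || submittedAt > deadline;
}

console.log(seedWindow(1000, 60, 120));            // { queryableAt: 1060, deadline: 1180 }
console.log(isSeedSlashable(1000, 60, 120, 1100)); // false
console.log(isSeedSlashable(1000, 60, 120, null)); // true
```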
Stage 2. Auditor assignment
At this stage, each enclave Ti in the set T is assigned a subset of auditors TAi from the set A, such that |TAi| = k (k is a constant that determines the number of auditors assigned per enclave). The auditor subsets TAi largely consist of different subsets of Auditors for different Slots of an Epoch.
The assignment from A to T for epoch Ei is randomized using a seed Sri which is generated at the start of the epoch Ei-1. Sri is currently considered to be the blockhash of the block at which the epoch Ei-1 starts.
Auditors can only enter or leave the auditor set A at the end of an Epoch, which ensures that the auditor subset for a Slot does not change within an Epoch. The enclave set T can grow or shrink as new jobs are added or existing jobs are closed or run out of funds.
Each enclave, for a given Slot of an Epoch, is assigned an auditor subset TAi of length k using the following algorithm:
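// keccak256(...) is treated here as returning the hash as an unsigned integer.
// a is the size of the auditor set A, so every index points to an Auditor.
let AuditorSubsetIndices = [];
let count = 0;
let iter = 0;
while (count < k) {
  const index = keccak256(`${iter}-${SlotId}-${EnclaveJobId}`) % a;
  // Skip Auditors that were already selected so the subset stays unique
  if (!AuditorSubsetIndices.includes(index)) {
    AuditorSubsetIndices.push(index);
    count++;
  }
  iter++;
}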
This algorithm ensures that the auditor subset for every enclave in a Slot consists of unique Auditors. It also ensures that every new enclave created during the Epoch has Auditors assigned to it. Auditing of a new enclave takes effect only c slots after the Slot in which the Isolated Instances Job was created, where c is a constant that determines the delay for auditing new enclaves (JobStartup period).
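A runnable sketch of the selection loop is shown below, under stated assumptions. SHA-256 from Node's crypto module is used as a stand-in for keccak256, so the concrete indices will differ from those of the real protocol. The names hashToBigInt and auditorSubset are hypothetical, and the modulus is taken over the number of auditors so that each index points into the set A.

```javascript
const { createHash } = require("crypto");

// Stand-in for keccak256: SHA-256 digest interpreted as an unsigned integer.
function hashToBigInt(text) {
  return BigInt("0x" + createHash("sha256").update(text).digest("hex"));
}

// Returns k distinct auditor indices in [0, auditorCount) for one enclave and slot.
function auditorSubset(slotId, enclaveJobId, k, auditorCount) {
  // Without this guard the loop could never collect k distinct indices.
  if (k > auditorCount) throw new RangeError("k exceeds the number of auditors");
  const indices = [];
  for (let iter = 0; indices.length < k; iter++) {
    const hash = hashToBigInt(`${iter}-${slotId}-${enclaveJobId}`);
    const index = Number(hash % BigInt(auditorCount));
    // Duplicates are skipped; the next iteration hashes a new input.
    if (!indices.includes(index)) indices.push(index);
  }
  return indices;
}

// Example: 3 auditors out of 10 for a hypothetical job "job-1" in slot 23.
console.log(auditorSubset(23, "job-1", 3, 10));
```

Because the inputs are deterministic, anyone who knows the SlotId and the job identifier can recompute the same subset, which is what makes later verification possible.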
Slashing condition(s)
- If the auditor reports neither the response nor the offline status on-chain, the auditor is assumed to be offline and its staked POND tokens are slashed for inactivity.
Stage 3. Enclave monitoring phase
During each age, the auditor subset TAi assigned to the enclave Ti for the Slot sends audit requests to the enclave to ensure its availability. The audit requests are sent through the Tor network, which makes them practically anonymous, so that hosts cannot distinguish between an audit and a user request. The enclave Ti uses the random seed Sei to generate its response to an audit request. The response is a single bit, computed as follows:
const response = keccak256(AuditorAddress + AgeId + Sei) >> 255 // most significant bit of the 256-bit hash, i.e. floor(hash / 2^255)
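The 1-bit response can be sketched in runnable form as follows. SHA-256 from Node's crypto module stands in for keccak256 here, and joining the three inputs as one string is an assumption about how the concatenation is encoded; the function name auditResponseBit is hypothetical.

```javascript
const { createHash } = require("crypto");

// Returns 0 or 1: the most significant bit of a 256-bit digest.
function auditResponseBit(auditorAddress, ageId, seed) {
  const digest = createHash("sha256")
    .update(`${auditorAddress}${ageId}${seed}`) // assumed encoding of the inputs
    .digest();
  // The first byte holds the top 8 bits, so shifting by 7 leaves the top bit,
  // which equals floor(hash / 2^255) for a 256-bit hash.
  return digest[0] >> 7;
}

// Hypothetical auditor address, age, and seed values.
console.log(auditResponseBit("0xAuditor", 142, "seed-for-epoch-2"));
```

Since the bit depends on the seed, which stays hidden until after the Epoch, an auditor cannot produce the correct response without actually querying the enclave, other than by guessing.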
Slashing condition(s)
The above response by the enclave Ti to the audit request is required to be submitted on-chain by the Auditor within q seconds to prove that the audit was actually done.
- If the enclave Ti does not respond to the audit request, the auditor reports it on-chain as offline, and the POND tokens staked against the enclave are slashed.
- In case neither the response nor the offline status is reported on-chain by the auditor, then the auditor is assumed to be offline and the POND tokens staked are slashed.
Stage 4. Aggregating audit data
This stage involves submitting the audit data for the epoch on-chain to ensure that the monitoring is effective. The audit data consists of the following:
- The audit responses collected by the auditors during the epoch, for each enclave they were assigned in a slot, for each age in that slot. The audit responses are the 1-bit data generated by the enclaves. These are submitted on-chain by Auditors within q seconds of the Age for which they are generated.
- The random seeds Sei used by the enclaves to generate the audit responses, along with the attestations signed by the enclaves. The random seeds with attestation are submitted by the enclave hosts within a+k seconds of the end of Epoch.
Slashing condition(s)
- If the seeds are not submitted within the deadline, the enclave hosts are considered to be offline and the POND tokens staked by the enclave provider towards the enclave are slashed.
- If the audit responses are not submitted within the deadline, the corresponding auditors are considered to be offline and the POND tokens staked by the auditors are slashed.
Stage 5. Challenging audit’s data
Given the seed information and the auditor subsets assigned to each enclave during the epoch, anyone can verify the correctness of the audit responses that were submitted by the auditors. If any audit data is wrong or missing, the audit response can be challenged within f seconds after the end of the epoch. The challenger has to stake POND tokens to create a challenge, and the disputed response is then recomputed on-chain.
Slashing condition(s)
- If the challenge is valid, the POND tokens staked by the auditor are slashed and the challenger receives a portion of the penalty. The challenge is considered invalid if on-chain verification finds that the response submitted by the auditor was correct, in which case the POND tokens staked by the challenger are slashed.
Stage 6. Verifying the audit
Recall that in Stage 3 the auditors send multiple requests to the enclaves. After the challenge period for the audit data correctness is over, anyone can penalize an enclave provider if most of the auditors reported that the enclave was offline during an age. The penalty increases with the duration of the enclave's unavailability during the epoch. Any challenge against the enclave provider's stake has to be made within g seconds after the end of the audit data correctness phase (within g + f seconds after the end of the epoch). The challenger has to stake POND tokens to create a challenge and provide the age ID to be checked on-chain.
Slashing condition(s)
- If the challenge is valid, that is, the enclave was inactive according to the audit responses submitted by the auditors, the POND tokens staked by the enclave provider are slashed and the challenger receives a portion of the slashed POND tokens. If the challenge is invalid, the POND tokens staked by the challenger are slashed.
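The two challenge windows that follow an epoch can be sketched as below. The function name, the phase labels, and the inclusive boundaries are assumptions made for illustration; f and g are the constants from Stages 5 and 6, and timestamps are plain numbers in seconds.

```javascript
// Hypothetical helper: which challenge phase is open at time `now`.
function openPhase(now, epochEnd, f, g) {
  if (now < epochEnd) return "epoch-running";
  // Stage 5: audit data can be challenged for f seconds after the epoch ends.
  if (now <= epochEnd + f) return "audit-data-challenge";
  // Stage 6: enclave providers can be challenged for a further g seconds.
  if (now <= epochEnd + f + g) return "enclave-challenge";
  return "closed";
}

// Epoch ends at t = 1000, with f = 100 and g = 200.
console.log(openPhase(1050, 1000, 100, 200)); // audit-data-challenge
console.log(openPhase(1150, 1000, 100, 200)); // enclave-challenge
console.log(openPhase(1301, 1000, 100, 200)); // closed
```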