Merkle trees are commonly used to scale timestamping and clock synchronization protocols, with examples including Guardtime (trusted timestamping) and Roughtime (clock synchronization). However the standard approach of timestamping just the tip of the merkle tree suffers from poor precision, especially if signatures are recorded in a calendar for auditing. This creates a fundamental conflict between timestamp precision and scalability.

Here we propose a new construction, the Stamp/Beacon Tree, which greatly improves on the precision of standard merkle trees by including per-leaf timestamps that commit to exactly when a given leaf digest was both received, and published.

This new approach allows for a highly scalable trusted timestamping and random beacon service, with precision limited only by network latency. Additionally, we’ll show how to use stamp/beacon trees for clock synchronization, providing a secure Network Time Protocol (NTP) replacement, with better precision than Roughtime.

# Contents

## Cryptographic Construction and Attested Guarantees

Similar to other protocols such as Roughtime, a stamp/beacon tree is a merkle tree, timestamped with a signature from a trusted notary that signs a commitment to the merkle tree, an absolute time $$T$$, and a radius $$\epsilon$$ indicating the uncertainty of the time. The notary receives one or more client submitted digests, builds a tree, and then returns to each client a unique merkle path from the digest-specific leaf to tip.

Unlike existing protocols, each leaf $$i$$ in a stamp/beacon tree commits to the following:

• $$d_i$$ - Client-submitted digest.
• $$n_i$$ - Notary chosen nonce.
• $$s_i$$ - The delta between time $$T$$ and when the digest was received (timestamped).
• $$p_i$$ - The delta between time $$T$$ and when the nonce was sent (published).

The deltas are the key to the stamp/beacon tree’s timing precision. The notary is attesting that the following is true:

1. Digest $$d_i$$ was received at true time $$T_{s_i} = T - s_i \pm \epsilon$$
2. Nonce $$n_i$$ was sent at time $$T_{p_i} = T + p_i \pm \epsilon$$
3. Prior to being sent, the nonce was a secret.

As usual, the combination of merkle path and signature proves the validity of the attestation. Note that this means the merkle path must be unique, eg by committing to the total number of leaves in the tree - and thus the expected merkle path length - or ensuring that leaves can’t be confused with inner nodes via the hashed message length, tags, etc.

### Trusted Timestamping

This is the most straightforward use for a stamp/beacon tree. Since the notary attested to the fact that it received the digest at a specific true time, with uncertainty $$\pm \epsilon$$, the notary also attesting that the digest must have existed prior to that time. Specifically, prior to the later bound of the uncertainty, $$T - s_i + \epsilon$$.

The nonce does double duty, by preventing other clients with sibling leaves in the tree from learning about other digests via brute force search. In fact, the OpenTimestamps server already uses a per-leaf nonce.

Additionally the signature by itself is a valid timestamp: $$T + \epsilon$$ must be after the digest was submitted, thus the notary is attesting to the fact that the digest existed prior to $$T + \epsilon$$. This may be useful for simplified proof validation that does not need full precision.

### Random Beacon

Since the notary attests to the fact that the nonce was a secret prior to being sent - aka published to the world - the nonce acts as a random beacon. Specifically, the notary is attesting to the fact that the earliest the nonce value could have been known to the public was after the earlier bound of the uncertainty, $$T + p_i - \epsilon$$.

Again the tip digest of the tree can itself be used as a random beacon, because it’s a commitment to random beacons. As we’ll discuss later, if multicast message delivery is available, it’s a good option to broadcast the tip.

The main reason why per-leaf beacon timing is proposed is bandwidth: being able to spread out the bandwidth usage of the beacons, rather than having a sudden spike, reduces network congestion caused timing jitter.

## Secure, Precise, Clock Synchronization

Here we have a local clock, and a remote clock, and our goal is to synchronize the local clock to the remote clock. To do this we need to measure the delta time, $$\Delta$$, between the local clock and the remote clock; with $$\Delta$$ measured, the local clock can be moved forwards or backwards to match the remote clock. Repeated measurements can also measure the difference in clock frequency between local and remote.

The Network Time Protocol (NTP) achieves very precise clock synchronization. But, NTP is not resilient to Man-In-The-Middle (MITM) attack. The Roughtime protocol is resilient to MITM attack. But as the name suggests, Roughtime only achieves imprecise synchronization. Here we will combine the principles of both NTP and Roughtime, into a secure, and precise, clock synchronization protocol using stamp/beacon trees.

### Threat Model

Since we’re trying to achieve secure, high precision, time synchronization, our threat model (unusually!) has to include attacks on the timing of messages.

We assume the existence of a MITM attacker who is not capable of forging signatures, breaking hash functions, violating the laws of physics, etc. But the attacker is capable of causing messages to be:

1. Dropped
2. Delayed
3. Hastened

Dealing with dropped messages is fairly obvious, so we won’t go into detail about them here. Dropped messages of course includes corrupted packets, packets that are so delayed as to exceed retry timeouts, DDoS attacks, etc.

A delayed message is simply the MITM attacker introducing an artificial delay. A hastened message shows up as a negative delay. While unlikely, an attacker could in theory accomplish this by rerouting a message to a network link with a lower latency than the usual route, e.g via a more direct network path, or even a low-latency radio or laser link (light travelling through a fiber optic cable is about 30% slower than EM waves travelling through air/vacuum).

### Delta Time Measurement

As in Roughtime, each delta time measurement is performed by the client generating a nonce, and asking the notary to timestamp that nonce. Replay attacks are prevented by the uniqueness of the nonce.

As in NTP, this four step process results in four different time measurements, $$t_s$$ and $$t_p$$ with respect to the client’s local clock, and $$T_s$$ and $$T_p$$ with respect to the notary’s clock. We’ll refer to the network latency of the request and response as $$l_s$$ and $$l_p$$ respectfully.

Finally, we have a MITM attacker, who can artificially delay the request and response messages. We’ll refer to those delays as $$w_s$$ and $$w_p$$. That gives us the following timing diagram:

Client: t_s                                      t_p
↘                                   ↗
Latency:      l_s + w_s               l_p + w_p
↘           ↗
Notary:                   T_s → T_p


We can assume that the difference in frequency between client and notary clock is negligible: even the cheapest quartz oscillator will have better than 0.01% frequency accuracy. With that assumption, we can derive two equations for the relationship between each pair of time measurements:

\begin{align} t_s + \Delta + l_s + w_s &= T_s \pm \epsilon \\ T_p \pm \epsilon + l_p + w_p &= t_p + \Delta \end{align}

…isolating the delta time difference:

\begin{align} \Delta &= T_s \pm \epsilon - t_s - (l_s + w_s) \\ \Delta &= T_p \pm \epsilon - t_p + (l_p + w_p) \end{align}

Neither equation can be solved in isolation, because the latencies in each direction are unknown. But if we add them together we get:

$\Delta = \frac{T_s + T_p}{2} - \frac{t_s + t_p}{2} \pm \epsilon + \frac{(l_p + w_p) - (l_s + w_s)}{2}$

The first two terms are our true delta time difference; the third term $$\pm \epsilon$$ is the notary attested error bounds (which adds directly due to the delta encoding of the attestation). In a perfect world with no network delay, and no attacker, they would be our only terms.

The final term comes from the network latencies, both real, and any artificial latency introduced by the attacker (or maybe removed!). There’s two key things to understand about the impact of latency on the uncertainty of the measurement:

1. Symmetrical latency cancels out: if messages take the same time in both directions, latency has no impact on the measurement.

2. Asymmetric latency is undetectable: without additional knowledge it is fundamentally impossible to know the ratio of the two message latencies.

Usually networks have close to symmetrical latency; NTP relies on a symmetric latency assumption. Conversely an attacker who wants to skew our clock will introduce an asymmetric latency.

#### One-Shot Measurement

Suppose nothing is known about the communication channel between client is notary, and one delta time measurement is performed. What does the client learn?

We’ll call the sum of the latencies $$L$$. The client can measure the sum latency:

\begin{align} L &= l_s + w_s + l_p + w_p \\ &= (t_p - t_s) - (T_s - T_p) \end{align}

The worst possible scenario is if the latency is completely asymmetric, and entirely caused by a MITM attacker ($$l_s = l_p = 0$$). Since the attacker can choose which direction to delay, the worst case uncertainty from MITM attack is simply $$\pm \frac{L}{2}$$.

#### Repeated Measurements

Suppose the delta time measurements are repeated. What does the client learn?

Real world network latencies vary due to network congestion, routing changes, etc. Second, attacks are rarely continuous.

Using the above worst case analysis, each sample can be thought of as a constraint on what the present time could be. Due to the uncertainty in the local clock’s frequency relative to the notary, that constraint degrades over time. As calculating these constraints is relatively straightforward, we won’t go into more details. But it’s worth noting that constraint degradation happens relatively quickly with standard computer hardware: a frequency error of ±50ppm, typical for a cheap quartz oscillator, is a ±180ms/hour drift. Thus prior delta time samples become out of date relative to new ones in typical conditions in a matter of minutes.

#### Discontinuous Attacks/Measuring Minimum Latency

A special case of repeated measurements is to assume that:

1. Latency is symmetric.
2. The MITM attacker can only delay packets, not hasten their arrival.
3. At some point, the attack stops.

With these assumptions the minimum latency can be measured, and subsequently, subtracted from the measured latency to get a tighter uncertainty bound. With dedicated near light speed communication channels such as radio links, and careful link characterization, this approach could be used to achieve secure clock synchronization with accuracy almost as good as existing non-MITM-secure approaches.

## Scaling via Untrusted Aggregation Servers

The security requirements of trusted servers make achieving high performance expensive for many reasons such as the difficulty of removing heat from tamper-detecting enclosures, the desire to use simpler microprocessor architectures that are easier to audit, etc.

Thus, we want to scale our trusted notaries with untrusted aggregation servers. To recap, an aggregation server accepts $$n$$ digests from clients, creates a merkle tree, and requests that the notary timestamp the tip of that merkle tree. Effectively, we’re just adding more layers to the tree - the OpenTimestamps proof format and calendar software already supports aggregation. In fact, OpenTimestamps proofs even allow for an arbitrary number of aggregation layers: the proof format doesn’t care how many merkle trees were built.

Our question here is how does this impact timing uncertainty? Let’s looks at a timing diagram for an aggregation round:

Clients:   t_s_0   ...  t_s_n                         t_p_0 ...  t_p_n
↘            ↘                       ↗          ↗
Aggregator:                    t_s               t_p
↘           ↗
Notary:                              T_s → T_p


Since the aggregator is not trusted, an aggregation layer is simply an increase in overall end-to-end latency. The most fundamental limit here is how often merkle trees are built: if the trusted notary can accept $$f$$ requests per second, the worst-case interval between an aggregator receiving a request from a client, and the aggregator sending the merkle tip, must be at least $$\frac{1}{f}$$.

The logic also holds true for each additional aggregation layer. However which each aggregation layer able to scale total req/s by a factor of hundreds of thousands on modern hardware, it’s hard to imagine a scenario where you would need more than two or three layers.

The fundamental limit is unlikely to be the limiting factor in practice: even an original 700MHz Raspberry Pi can do well over 200,000 SHA256 operations per second, more than enough to saturate the ~5000reqs/s the 100BaseT ethernet interface is capable of. 5000req/s adds just 0.2ms of inherent max latency - other sources of latency are likely to dominate.

Aggregation does create an interesting fairness/incentives problem: repeated requests can get more or less lucky as to where a given request ends up relative to when the aggregated request is actually sent. Implementations may want to add artificial delays to ensure that repeated requests don’t get substantially better precision.

## Multicast Beacon Delivery

While the leaves of the merkle tree are unique to each client, the stamp/beacon tree tip is common to all, and itself constitutes a valid random beacon with attested time $$T \pm \epsilon$$. Thus if low-latency multicast messaging mechanisms are available such as radio, broadcast ethernet, multicast routing, etc, tighter timing bounds may be achievable by using them to broadcast the tip.

For clock synchronization, the tip can be used instead of the per-leaf beacon. Whether or not this actually reduces uncertainty depends on the multicast technology used, and the security assumptions. In particular, whether or not latency to/from the notary is assumed to be symmetric.