feat: add 005-aggregate-tunnel-metrics.md

# Aggregate Tunnel Metrics
* Author: bassosimone
* Reviewers: cyberta
* Status: under discussion
This specification describes how to format and submit aggregate
tunnel metrics to the OONI collector. Additionally, we include
sections with design and implementation considerations.
## Index
1. [Overview](#overview)
2. [Definitions](#definitions)
3. [Use Cases](#use-cases)
4. [Goals](#goals)
5. [Non-Goals](#non-goals)
6. [Related Work](#related-work)
7. [Threat Model](#threat-model)
8. [Tunnel Lifecycle](#tunnel-lifecycle)
9. [Data Format (Envelope)](#data-format-envelope)
10. [Data Format (Test Keys)](#data-format-test-keys)
11. [Implementation Requirements](#implementation-requirements)
12. [Privacy Considerations](#privacy-considerations)
13. [Security Considerations](#security-considerations)
14. [LEAP Implementation Details](#leap-implementation-details)
## Overview
The objective of this specification is to allow a VPN provider
to report aggregate tunnel metrics to the OONI collector. We submit
to the OONI collector for archival reasons. The submitted metrics
range from availability to tunnel performance metrics.
The aim is to allow researchers to evaluate the viability of
using a given protocol in a specific country and ASN. This
information will be provided with varying levels of granularity
depending on the VPN provider's needs.
## Definitions
**AS**: autonomous system.
**ASN**: autonomous system number.
**CC**: country code.
**bootstrap**: the process by which the VPN fetches all the
required information to create tunnels.
**tunnel**: communication between two network endpoints
encapsulating TCP/IP packets.
**protocol**: the protocol stack used by the tunnel to encapsulate
packets that the adversary can observe.
**(tunnel) endpoint**: either a bridge or a gateway.
**measurement**: JSON document submitted to the OONI collector.
## Use Cases
In all use cases, a VPN provider wants to make *statements* about
the availability and performance of accessing *something* they manage
from a set of clients located in a given ASN and CC in a given time
period. What sets use cases apart is the *something* about which
they are making the statement.
More specifically, we cover these use cases:
**endpoint**: the statement is about a specific tunnel endpoint. For
example, in the case of RiseupVPN, an endpoint could be:
- hostname: `vpn02-par.riseup.net`
- address: `51.159.197.108`
- port: `443`
- protocol: `openvpn+obfs4`
- asn: `12876`
- cc: `FR`
which is as detailed and transparent as possible.
**endpoint_pool**: the statement is about a homogeneous pool of
endpoints. RiseupVPN publishes its endpoints, so this use case does
not make sense for them. However, another VPN provider may want
to disclose less information than in the `endpoint` use case and
say something like:
- protocol: `openvpn+obfs4`
- cc: `FR`
which does not give away the exact endpoints, but still allows
making statements regarding the availability and performance of
using `openvpn+obfs4` endpoints located in France.
**global**: the statement is about the whole set of endpoints
and only includes information about the protocol:
- protocol: `openvpn+obfs4`
Each statement is a single *measurement* submitted to the OONI
collector for archival reasons. The `scope` field inside the
measurement `test_keys` identifies the specific use case.
In summary, all use cases allow a researcher to evaluate the
availability and performance of tunnels using specific protocols,
but the granularity of the information disclosed varies.
## Goals
1. *Extensible spec*: we aim to specify a baseline spec that is
simple to implement for LEAP *right now* while, at the same time,
allowing it to be extended to submit more information later on.
2. *Anonymity set*: the spec allows incrementally including more
metrics as the user population grows.
3. *~Standards compliance*: where possible, reuse concepts and
terminology from existing standards and related work done in
this space, including Network Error Logging (NEL), Ain Ghazal's
"tunnel telemetry", DPIDetector, and the OONI specifications.
## Non-Goals
1. Specifying how a VPN client should submit tunnel telemetry
data to the VPN provider itself. This scope has already been explored
by Ain Ghazal under the codename of "tunnel telemetry". We aim
to leverage the work done by Ain Ghazal with respect to NEL
error reporting. Yet, we believe that the mechanism with which
a VPN client reports to its own infrastructure should
directly use NEL without being constrained by the specific
details of the OONI measurement format.
2. Specifying how a VPN client should directly submit individual tunnel
establishment measurements to the OONI collector. Adding this
requirement would significantly complicate the design and the job
of the data analyst, because the same format would need to accommodate
both aggregate and individual measurements. That said, scoping
this use case seems best done by continuing to work on the Ain
Ghazal proposal on "tunnel telemetry".
3. Measuring the VPN bootstrap phase. This phase is not
tunnel-specific; rather, it is VPN-specific. We believe we
should not conflate this information into the same
measurements. We could write a separate bootstrap-related spec. Also,
the aggregation and reporting requirements of the
bootstrap may differ from those for tunnels.
## Related Work
1. [OONI spec](https://github.com/ooni/spec);
2. [Tunnel Telemetry repository](https://github.com/ainghazal/tunnel-telemetry);
3. [Tunnel Telemetry proposal](https://github.com/ooni/spec/pull/274);
4. [DPIDetector](https://dpidetector.org/);
5. [Network Error Logging](https://www.w3.org/TR/network-error-logging/).
## Threat Model
A tunnel comes to life with the establishment of a connection
between a client and an endpoint. The client establishes the tunnel
initiating the communication.
While the adversary will only be able to observe the most external
protocol layer (e.g., `obfs4+kcp`), we assume it can also use
other identification techniques, including conversation analysis,
statistical analysis on packet sizes, and active probing.
Additionally, the adversary can interfere with the communication
after it has been established. For example, it can mess with the
TCP state by injecting RST segments, or it can throttle the available
bandwidth down to ~zero, explicitly or through routing.
For these reasons:
1. we include the full protocol stack into the definition of the
protocol (i.e., `openvpn+obfs4+kcp` rather than `obfs4+kcp`),
to reflect the fact that the packet dynamics differ depending on
the inner protocols of the protocol stack;
2. we include the possibility of reporting performance and
latency metrics, to allow providers to make statements about
the overall quality of an established tunnel.
Additionally, as explained in the Use Cases section, this specification
allows VPN providers to selectively choose how much information
to disclose about their VPN architecture, thus ensuring they are
in control of how much information to expose to the adversary.
In the interest of facilitating research, VPN providers MAY choose to
publicly disclose more information about *some* endpoints, while
being more secretive about other endpoints.
## Tunnel Lifecycle
We model the tunnel lifecycle using this state machine:
```
                                                  .--> <<active_measurement>>
.---------.                      .-------------. /           /
| Initial | --> <<creation>> --> | Established |<-----------'
`---------'                      `-------------'
                                        |
                                        V
                                 .-------------.
                                 |    Final    |
                                 `-------------'
```
In the `Initial` state the tunnel has not been created yet. The
`creation` operation creates the tunnel and transitions the state
to `Established`. While in `Established`, the VPN app may run
active measurements to assess the tunnel performance. For example,
the app may `ping` the VPN gateway or well-known addresses,
or it could run network-performance tests, such as NDT. The
tunnel enters into a `Final` state when it is closed.
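The state machine above can be sketched in a few lines of Python. This is only an illustration: the `TunnelState` enum, the `next_state` helper, and the `close` event name are hypothetical (the diagram leaves the edge into `Final` unlabeled), and the sketch assumes that `creation` succeeds.

```python
from enum import Enum


class TunnelState(Enum):
    INITIAL = "Initial"
    ESTABLISHED = "Established"
    FINAL = "Final"


# (current state, event) -> next state, mirroring the diagram above.
# The "close" event name is an assumption made for this sketch.
TRANSITIONS = {
    (TunnelState.INITIAL, "creation"): TunnelState.ESTABLISHED,
    (TunnelState.ESTABLISHED, "active_measurement"): TunnelState.ESTABLISHED,
    (TunnelState.ESTABLISHED, "close"): TunnelState.FINAL,
}


def next_state(state: TunnelState, event: str) -> TunnelState:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {state.value} on {event}") from None
```

Active measurements keep the tunnel in `Established`, which is why they may occur any number of times before the tunnel is closed.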
We define the following *operations*:
**creation**: this operation creates the tunnel. It SHOULD NOT include
DNS lookup or other operations required to fetch the IP address and the
required certificates. For example, if we are using `openvpn+obfs4`, the
`creation` operation SHOULD be about performing the TCP three-way
handshake, the obfs4 handshake, and the OpenVPN handshake.
**tunnel_ping**: this operation is the active measurement using a
`ping`-like tool to ping well-known addresses.
**tunnel_ndt_download**: this operation is the active measurement using
NDT to measure the download speed over the tunnel.
**tunnel_ndt_upload**: like `tunnel_ndt_download`, but for the upload.
Note that both `tunnel_ping` and `tunnel_ndt_{down,up}load` are optional and
MAY occur multiple times during the tunnel lifecycle.
For each `tunnel_xx` operation, we also define the equivalent
`baseline_xx` operation optionally performed before creating the
tunnel to establish a baseline.
This specification allows a VPN provider to submit aggregate
reports about the `creation`, `tunnel_ping`, etc.,
operations. A future version of this specification may extend
the set of operations to include more active measurements,
or to include additional information about tunnels, such as
the aggregate average duration of tunnels, the number of bytes
transmitted and received, etc.
We use the `phase` keyword (borrowed from NEL terminology) to indicate
the *operation* associated with specific statements.
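As an illustration, the set of `phase` values implied by the operations above can be enumerated as follows. This is a sketch, not a normative list: the `PHASES` and `is_valid_phase` names are hypothetical, and we assume that `baseline_` variants exist only for the active measurements, not for `creation`.

```python
# Active measurements defined by this specification (without prefix).
ACTIVE_MEASUREMENTS = ("ping", "ndt_download", "ndt_upload")

# "creation", plus the tunnel_xx operations and their baseline_xx equivalents.
PHASES = frozenset(
    ["creation"]
    + [f"tunnel_{name}" for name in ACTIVE_MEASUREMENTS]
    + [f"baseline_{name}" for name in ACTIVE_MEASUREMENTS]
)


def is_valid_phase(phase: str) -> bool:
    """Return whether `phase` is one of the operations defined above."""
    return phase in PHASES
```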
## Data Format (Envelope)
The metaphor used by OONI measurements is that there is
a `test_name` describing a specific network testing methodology
where we make a statement about a specific resource (the
`input` field). This happens in the context of a given ASN
and CC. Each specific experiment type (i.e., `test_name`) has
its own specific data format, which is described by the
experiment-specific `test_keys` field.
Accordingly, we define the `aggregate_tunnel_metrics` experiment
name and sketch out the overall envelope as follows:
```JavaScript
{
  "annotations": {
    "upstream_collector": "riseup-par-01"
  },
  "data_format_version": "0.2.0",
  "input": "openvpn+obfs4+kcp://riseup.net/",
  "measurement_start_time": "2024-10-29 00:00:00",
  "probe_asn": "AS1234",
  "probe_cc": "IT",
  "test_keys": { /* ... */ },
  "test_name": "aggregate_tunnel_metrics",
  "test_runtime": 0.0,
  "test_start_time": "2024-10-29 00:00:00",
  "test_version": "0.1.0"
}
```
Here is the justification for setting the fields as such:
- `upstream_collector` (`string`): the name of the upstream
collector that collected and aggregated metrics before
submitting to the OONI collector. This is useful to know
which entity collected the data and submitted the aggregate.
- `data_format_version` (`string`): the version of the data
format used by OONI, which must be exactly equal to `0.2.0`.
- `input` (`URL`): the input URL format is consistent with
the OONI `openvpn` experiment and is discussed in more detail below.
- `measurement_start_time` (`Date`): the UTC (without
explicit indication!) moment in which this aggregate was produced.
- `probe_asn` (`^AS[0-9]+$`): the ASN of the set of
probes that this measurement is about.
- `probe_cc` (`^[A-Z]{2}$`): the country code of the set
of probes that this measurement is about.
- `test_name` (`string`): must be `aggregate_tunnel_metrics`.
- `test_runtime` (`float64`): runtime of this test in
seconds, which is set to zero because there is no
real runtime here.
- `test_start_time` is set equal to `measurement_start_time`,
since there is not really a test here, this is just an aggregate.
- `test_version` (`^[0-9]+\.[0-9]+\.[0-9]+$`): the version of the
test, which will evolve as we evolve this specification.
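The field rules above can be sketched as a small Python helper. This is a minimal sketch under stated assumptions: the `make_envelope` name and its parameters are hypothetical, `now` must be a timezone-aware `datetime`, and the helper only checks the `probe_asn` and `probe_cc` patterns listed above.

```python
import re
from datetime import datetime, timezone


def make_envelope(input_url, probe_asn, probe_cc, test_keys,
                  upstream_collector, now=None):
    """Build the measurement envelope; `now` must be timezone-aware."""
    if not re.fullmatch(r"AS[0-9]+", probe_asn):
        raise ValueError(f"invalid probe_asn: {probe_asn!r}")
    if not re.fullmatch(r"[A-Z]{2}", probe_cc):
        raise ValueError(f"invalid probe_cc: {probe_cc!r}")
    now = now or datetime.now(timezone.utc)
    # UTC timestamp without an explicit timezone indication.
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return {
        "annotations": {"upstream_collector": upstream_collector},
        "data_format_version": "0.2.0",
        "input": input_url,
        "measurement_start_time": stamp,
        "probe_asn": probe_asn,
        "probe_cc": probe_cc,
        "test_keys": test_keys,
        "test_name": "aggregate_tunnel_metrics",
        "test_runtime": 0.0,  # there is no real runtime for an aggregate
        "test_start_time": stamp,  # same as measurement_start_time
        "test_version": "0.1.0",
    }
```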
Regarding the `input`, its main purpose is to allow searching
for measurements through the OONI API. Whether the aggregate
tunnel metrics will be exposed by the OONI API is an orthogonal
topic, which requires coordination with the OONI team.
The `input` format is the same as the `openvpn` experiment,
with some additions that are specific to this spec:
```
{protocol}://{provider}/?{query_string}
```
More specifically:
- `{protocol}` (`string`): the VPN protocol stack being used,
therefore, `openvpn`, `openvpn+obfs4`, etc.
- `{provider}` (`string`): the entity that manages the endpoints, for
example `riseup.net`.
- the `{query_string}` contains the following parameters (we use the
`<type>|undefined` syntax to mark optional fields):
  - `address` (`string|undefined`): the endpoint IPv4/IPv6 address;
  - `asn` (`string|undefined`): the endpoint ASN;
  - `cc` (`string|undefined`): the endpoint country code;
  - `hostname` (`string|undefined`): the endpoint hostname;
  - `port` (`string|undefined`): the endpoint port.
For example:
```
openvpn+obfs4://riseup.net/?address=51.159.197.108&asn=AS12876&port=443
```
the above URL describes a currently existing RiseupVPN endpoint.
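For illustration, a Python sketch that assembles the `input` URL from its parts. The `build_input_url` name is hypothetical, and emitting the query parameters in a fixed order is our own choice, since the text above does not mandate one.

```python
from urllib.parse import urlencode

# Query parameters allowed by this specification, in the order we emit them.
QUERY_KEYS = ("address", "asn", "cc", "hostname", "port")


def build_input_url(protocol: str, provider: str, **params: str) -> str:
    """Return {protocol}://{provider}/?{query_string} for the given parts."""
    unknown = set(params) - set(QUERY_KEYS)
    if unknown:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    query = urlencode([(key, params[key]) for key in QUERY_KEYS if key in params])
    return f"{protocol}://{provider}/" + (f"?{query}" if query else "")
```

Calling `build_input_url("openvpn+obfs4", "riseup.net", address="51.159.197.108", asn="AS12876", port="443")` yields the example URL shown above, while omitting all optional parameters yields a `global`-style URL such as `openvpn+obfs4://riseup.net/`.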
## Data Format (Test Keys)
The `test_keys` format is specific to this experiment. Here is what they
look like in JSON format (with comments added to
explain what each part means):
```JavaScript
{
  // for this provider
  "provider": "riseup.net",

  // with this `endpoint` scope
  "scope": "endpoint",
  "endpoint_hostname": "vpn02-par.riseup.net",
  "endpoint_address": "51.159.197.108",
  "endpoint_port": 443,
  "protocol": "openvpn+obfs4",
  "asn": "AS12876",
  "cc": "FR",

  // alternatively, with this `endpoint_pool` scope
  "scope": "endpoint_pool",
  "protocol": "openvpn+obfs4",
  "cc": "FR",

  // alternatively, with this `global` scope
  "scope": "global",
  "protocol": "openvpn+obfs4",

  // in this time window
  "time_window": {
    "from": "2024-10-29T00:00:00Z",
    "to": "2024-10-30T00:00:00Z"
  },

  // we make the following statements
  "bodies": [
    {
      // during the tunnel creation phase
      "phase": "creation",
      // with this sample size
      "sample_size": 200,
      // we make a statement about network errors
      "type": "network-error",
      // and the statement is that we fail 66%
      // of the times with tcp.timed_out
      "failure_ratio": 0.66,
      "error": "tcp.timed_out"
    },
    {
      // during the tunnel_ping phase
      "phase": "tunnel_ping",
      // targeting the 8.8.8.8 IP address
      "target_address": "8.8.8.8",
      // with this sample size
      "sample_size": 500,
      // we make a statement about the latency distribution
      "type": "ping",
      // and the statement is that we see the following
      // latency distribution in milliseconds.
      "latency_ms": {
        "25p": 100,
        "50p": 150,
        "75p": 200,
        "99p": 1100
      }
    },
    {
      // during the tunnel_ndt_download phase
      "phase": "tunnel_ndt_download",
      // targeting this NDT server
      "target_hostname": "ndt-mlab2-mil07.mlab-oti.measurement-lab.org",
      "target_address": "162.213.100.88",
      "target_port": 443,
      // with this sample size
      "sample_size": 500,
      // we make a statement about the NDT download performance
      "type": "ndt_download",
      // and the statement is that we see the following
      // latency distribution in milliseconds and the
      // following download speed distribution in Mbit/s.
      "latency_ms": {
        "25p": 100,
        "50p": 150,
        "75p": 200,
        "99p": 1100
      },
      "speed_mbits": {
        "25p": 4,
        "50p": 7,
        "75p": 11,
        "99p": 200
      }
    },
    // ...
  ]
}
```
More formally, this is the meaning of the fields (where we indicate
optional fields using the `<type>|undefined` syntax):
- `provider` (`string`): the provider of the tunnel service, using
the same syntax as defined for the `input` field.
- `scope` (`enum`): the scope, as defined above.
- `endpoint_hostname` (`string|undefined`): the endpoint hostname.
- `endpoint_address` (`string|undefined`): the endpoint IPv4/IPv6 address.
- `endpoint_port` (`uint16|undefined`): the endpoint port.
- `asn` (`^AS[0-9]+$|undefined`): the ASN of the endpoint or endpoint pool.
- `cc` (`^[A-Z]{2}$|undefined`): the country code of the endpoint or endpoint pool.
- `protocol` (`enum`): the protocol as defined above.
- `time_window` (`object`): the time window in which the
statements we are making are valid.
- `bodies` (`array`): an array of NEL-like objects.
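To illustrate how `scope` limits what is disclosed, here is a Python sketch that builds the scope-related part of `test_keys`. The `SCOPE_FIELDS` table and the `scope_test_keys` name are hypothetical, and the mapping from scope to permitted fields is our reading of the use cases, not a normative rule.

```python
# Optional fields each scope may disclose (an assumption based on the use cases).
SCOPE_FIELDS = {
    "endpoint": ("endpoint_hostname", "endpoint_address", "endpoint_port",
                 "asn", "cc"),
    "endpoint_pool": ("asn", "cc"),
    "global": (),
}


def scope_test_keys(provider: str, scope: str, protocol: str, **details) -> dict:
    """Return the scope-related test keys, refusing fields the scope hides."""
    if scope not in SCOPE_FIELDS:
        raise ValueError(f"unknown scope: {scope!r}")
    extra = set(details) - set(SCOPE_FIELDS[scope])
    if extra:
        raise ValueError(f"fields not allowed in scope {scope!r}: {sorted(extra)}")
    return {"provider": provider, "scope": scope, "protocol": protocol, **details}
```

Rejecting unexpected fields, rather than silently dropping them, makes accidental disclosure of endpoint details visible to whoever operates the submitter.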
In turn, the common structure of each NEL-like object is the following:
```JSON
{
  "phase": "",
  "sample_size": 0,
  "type": ""
}
```
where:
- `phase` (`enum`): is the operation as defined above.
- `sample_size` (`int53|undefined`): the number of samples
that are being considered for making this statement, appropriately
rounded, or omitted entirely to preserve privacy. We RECOMMEND
rounding to the nearest multiple of 100 and omitting the field below 1000.
- `type` (`enum`): the type of statement that is being made.
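The `sample_size` recommendation can be sketched as follows. The `privacy_sample_size` name is hypothetical, and we assume that the threshold of 1000 applies to the raw count before rounding, which the text does not state explicitly.

```python
from typing import Optional


def privacy_sample_size(count: int, step: int = 100,
                        floor: int = 1000) -> Optional[int]:
    """Round `count` to the nearest multiple of `step`, or omit it if too small.

    Returns None when the field should be omitted from the statement.
    """
    if count < floor:
        return None
    # Integer arithmetic avoids the round-half-to-even behaviour of round().
    return (count + step // 2) // step * step
```

When the function returns `None`, the submitter would leave the `sample_size` key out of the NEL-like object instead of emitting a null value.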
The `network-error` NEL-like object is like:
```JavaScript
{
  // ... common NEL-like fields ...
  "type": "network-error",
  "failure_ratio": 0.0,
  "error": ""
}
```
where:
- `failure_ratio` (`float64`): the ratio of the number of
failures over the population sample size.
- `error` (`enum|undefined`): the network error as defined by NEL
or `undefined` if we don't know or don't want to report.
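As an illustration, `network-error` statements could be derived from raw outcomes as sketched below. The `network_error_bodies` name is hypothetical, and emitting one statement per distinct error is an assumption of this sketch. The `sample_size` here is the raw count, so any privacy rounding would still need to be applied before submission.

```python
from collections import Counter
from typing import List, Optional


def network_error_bodies(phase: str, outcomes: List[Optional[str]]) -> List[dict]:
    """Build one network-error statement per distinct error.

    `outcomes` holds one entry per sample: a NEL error string on failure,
    or None when the operation succeeded.
    """
    total = len(outcomes)
    if total == 0:
        return []
    failures = Counter(error for error in outcomes if error is not None)
    return [
        {
            "phase": phase,
            "sample_size": total,  # raw count, not yet rounded
            "type": "network-error",
            "failure_ratio": count / total,
            "error": error,
        }
        for error, count in sorted(failures.items())
    ]
```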
Regarding network errors, note that the `creation` phase
SHOULD NOT include DNS operations or other operations required
to obtain information useful for creating the tunnel. Rather,
this information is part of the VPN bootstrap process and
is out of the scope of this document.
The `ping` NEL-like object is like:
```JavaScript
{
  // ... common NEL-like fields ...
  "type": "ping",
  "target_address": "",
  "latency_ms": {}
}
```
where:
- `target_address` (`IPAddr`): the target IP address.
- `latency_ms` (`object`): the latency distribution in
milliseconds, containing the latency percentiles indicated
using `XXp` keys (e.g., `50p` is the median).
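A possible way to compute the `latency_ms` object from raw samples is sketched below. The `latency_percentiles` name is hypothetical, the nearest-rank method is our choice (this specification does not prescribe a percentile estimator), and the keys follow the spelling used in the `test_keys` example above (`25p`, `50p`, `75p`, `99p`).

```python
from typing import Dict, Sequence


def latency_percentiles(samples_ms: Sequence[float],
                        percentiles: Sequence[int] = (25, 50, 75, 99)) -> Dict[str, float]:
    """Return nearest-rank percentiles of `samples_ms`, keyed as in the example."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    result = {}
    for p in percentiles:
        # Nearest rank: ceil(p * n / 100), computed with integer arithmetic.
        rank = max(1, (p * len(ordered) + 99) // 100)
        result[f"{p}p"] = ordered[rank - 1]
    return result
```

The same helper would apply to `speed_mbits`, since that object has the same shape.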
The `ndt_download` NEL-like object is like:
```JavaScript
{
  // ... common NEL-like fields ...
  "type": "ndt_download",
  "target_hostname": "",
  "target_address": "",
  "target_port": 0,
  "latency_ms": {},
  "speed_mbits": {}
}
```
where:
- `target_hostname` (`string`): the target hostname.
- `target_address` (`IPAddr`): the target IP address.
- `target_port` (`uint16`): the target port.
- `latency_ms` (`object`): is exactly like in `ping`.
- `speed_mbits` (`object`): is like `latency_ms` but for
the download speed expressed in Mbit/s.
Note that a `phase` is not restricted to use a specific NEL-like
object `type`. For example:
```JSON
{
  "phase": "tunnel_ndt_download",
  "target_hostname": "ndt-mlab2-mil07.mlab-oti.measurement-lab.org",
  "sample_size": 500,
  "type": "network-error",
  "failure_ratio": 0.66,
  "error": "dns.name_not_resolved"
}
```
the previous JSON snippet contains a statement that in 66% of the
cases, out of ~500 samples, the DNS lookup failed.
## Implementation Requirements
This specification assumes that there is a *collector* for NEL-like
reports submitted by VPN apps. Defining how this happens is out of
the scope of this document, but the "tunnel telemetry" spec is a good
starting point. The privacy implications of submitting aggregated
measurements are in scope and are discussed in a dedicated section below.
As far as this specification is concerned, it is also important to note
that VPN apps SHOULD be allowed to cache unsent reports for up
to one week. This is to ensure that reports are not lost in case of
heavy censorship or widespread internet failure.
The collector will be responsible for storing the incoming reports
into a spool directory organised in daily buckets. After putting the
reports into the spool, the job of the collector is done.
A separate component, the *submitter*, will periodically process
the spool directory, aggregating existing reports, deleting the
already-processed reports, and sending the aggregated reports to
the OONI collector using the data format defined in this document.
The initial aggregation period is set to one week, anticipating
that, at the outset, there will be a low number of reports. We will
revise this decision based on actual numbers.
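For illustration, the `time_window` of a weekly aggregate could be computed as follows. The `weekly_time_window` name is hypothetical, and we assume windows start at midnight UTC, as in the `test_keys` example.

```python
from datetime import date, datetime, timedelta, timezone


def weekly_time_window(first_day: date, days: int = 7) -> dict:
    """Return the time_window object covering `days` days from `first_day`."""
    start = datetime(first_day.year, first_day.month, first_day.day,
                     tzinfo=timezone.utc)
    end = start + timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {"from": start.strftime(fmt), "to": end.strftime(fmt)}
```

Keeping the period as a parameter makes it straightforward to shorten the window once the number of reports grows.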
In principle, determining whether an existing report has already
been processed is a simple matter. It suffices to delete the files
that have already been processed and "close" old buckets. Yet,
since the VPN app is allowed to cache reports for up to one week, it
is possible that the submitter will receive reports for already
closed buckets. Additionally, a malfunction in the VPN app may cause
reports to be submitted multiple times. A future version of this
spec will articulate how to solve this problem.
Summarising, this discussion leads us to the following architecture:
```
.---------.                   .-----------.
| VPN App | --> <<push>> --> | collector | --> {spool}
`---------'                   `-----------'
                        .-----------.                    .------.
{spool} --> <<pop>> --> | submitter | --> <<submit>> --> | OONI |
                        `-----------'                    `------'
```
where it is intended that the semantics of `<<pop>>` includes both
processing and removing a report from the spool.
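The spool semantics can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the `push_report` and `pop_bucket` names, the one-JSON-file-per-report layout, and the `YYYY-MM-DD` bucket names are all hypothetical, `received` must be timezone-aware, and the sketch handles neither late arrivals for closed buckets nor duplicate reports.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def push_report(spool: Path, report: dict, received: datetime) -> Path:
    """Collector side: store `report` in the daily bucket for `received` (UTC)."""
    bucket = spool / received.astimezone(timezone.utc).strftime("%Y-%m-%d")
    bucket.mkdir(parents=True, exist_ok=True)
    path = bucket / f"{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(report))
    return path


def pop_bucket(spool: Path, day: str) -> List[dict]:
    """Submitter side: read and remove every report in the `day` bucket."""
    reports = []
    for path in sorted((spool / day).glob("*.json")):
        reports.append(json.loads(path.read_text()))
        path.unlink()  # <<pop>> includes removing the processed report
    return reports
```

A production collector would probably also write each file atomically (for example, write to a temporary name and rename), so that the submitter never reads a partially written report.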
## Privacy Considerations
Users of VPN apps that submit NEL-like reports that end up being
aggregated and resubmitted to the OONI collector MUST be asked for
their informed consent. The informed consent SHOULD clearly
specify the purpose of the data collection (i.e., collecting
data for evaluating the effectiveness of specific protocol stacks
in creating usable VPN tunnels). Additionally, users MUST be
able to opt-out of the process at any time.
Additionally, the aggregation period and the amount of information
disclosed in the aggregated measurements submitted to OONI MUST
take into account the anonymity set.
## Security Considerations
In principle, it is not possible to absolutely trust measurements
submitted by unknown parties. The attack from which we want to
defend is the injection of bogus aggregate measurements, which has
more impact than the injection of bogus individual OONI measurements,
since less information needs to be submitted to the OONI collector
to have a significant impact.
OONI is aware of the issue posed by the injection of bogus
measurements, and they are considering implementing an anonymous
probe ID mechanism to mitigate this issue.
A future version of this specification will consider integrating
this functionality into the submitter, to facilitate OONI's job
of identifying reliable data sources.
The related problem of how to evaluate the reliability of the VPN app
instances is out of the scope of this document.
## LEAP Implementation Details
(This section is non-normative and describes the architecture
of LEAP with respect to data collection and submission. It
mainly serves the purpose of explaining the original context
in which we implemented this specification in production.)
LEAP is currently collecting logs from docker-compose-based field
testing clients. There is a logs processing pipeline that transforms
these textual logs into CSV files, shown in a dashboard.
The initial implementation of this specification could be as
simple as a script that processes the CSV files, transforms
their rows into aggregated measurements, and submits them to the
OONI collector. This statement about simplicity is grounded
in the understanding that the existing logs processing pipeline
is already able to deduplicate incoming field testing reports.
To make the system compatible with the design described above, we
will also need to modify the logs pipeline to put the CSV files
into a spool directory organised in daily buckets.
This leads us to the following systems architecture:
```
.-----------.                  .---------------.
| FT Client | --> <<push>> --> | Logs pipeline | --> {spool}
`-----------'                  `---------------'
                        .-----------.                    .------.
{spool} --> <<pop>> --> | submitter | --> <<submit>> --> | OONI |
                        `-----------'                    `------'
```
where `FT Client` is the field testing client, and it is intended
that the `<<pop>>` operation includes both processing and removing
a given CSV file from the corresponding bucket.
A future version of this specification will address the problem of
extending this architecture to account for the submission of NEL-like
reports from the Bitmask-VPN app.