Add proposal for invite system with storage layer

# 004: invite system
* Author: atanarjuat, max-b
* Reviewers: ...
* Status: draft
* Related: 002-vpnweb-deprecation, 005-introducer
## Problem
By default, the `menshen` API is open to the public [^menshen23].
For low-profile deployments, we'd like to add a resource-allocation
mechanism that assigns pools of resources to distinct sets of
users. With this ability, menshen can further filter the general pool of
resources through a basic form of ACL.
[^menshen23]: By the end of 2023, `menshen` can optionally disable the all-gateways and all-bridges endpoint via a configuration variable.
## Properties
The ideal properties in a first invite system are:
* Revocation, if at all possible by the chosen implementation.
* Decoupled from the certificate system; i.e., after using the invite system, neither the API nor the tunnel nodes (gateways/bridges) should be able to identify the user.
* An invitation token can be used to create any number of VPN configurations, each with unique credentials.
* An invitation token can be shared by different users and is valid until the token is revoked by an administrator.
For a first iteration, we're not after a web UI. This means system admins can generate a batch of tokens and distribute them to coordinators.
## Proposal
* Tag all resources (bridges, gateways) with a bucket tag (`bucket1`, `bucket2`...). This tag is needed every time a new resource is added to the pool.
* Generate a random, unique string value that will be an access token.
  + This could be any of several random formats, including a UUID, a hex string of random bytes, or random alphanumeric characters.
  + We can prefix it with something to denote that it is an access token, e.g. `solitech:qVJ3rvRbE6`.
  + The requirement here is that tokens be universally unique. We can ensure that with a sufficiently large "key space".
* Create a storage layer where we can map arbitrary key strings to an ACL. This ACL is likely just a list of the buckets the token should have access to, e.g. `(bucket1,bucket2)`.
* This personal access token (`PAT`) is distributed to users (via a trust anchoring mechanism, out-of-band).
* Requests to menshen can present this PAT to optionally authenticate against as many menshen endpoints as needed.
* Admins can invalidate either buckets or individual `PAT`s.
  + Invalidating/revoking a `PAT` would be as simple as deleting that key from the database OR adding a `deleted_at` column and treating any row with a non-null `deleted_at` value as "deleted"/"missing"/etc.
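
As a rough illustration of the token and storage steps above, here is a minimal sketch using Python's standard `secrets` and `sqlite3` modules. It is not menshen code: the table layout, the function names (`new_token`, `open_store`, `add_token`, `buckets_for`, `revoke`) and the choice of 16 random bytes (128 bits of key space) are assumptions made for the example. The `solitech` prefix and the `deleted_at` soft-delete column come from the bullets above.

```python
import secrets
import sqlite3
from datetime import datetime, timezone

TOKEN_PREFIX = "solitech"  # prefix from the example in the proposal


def new_token(nbytes: int = 16) -> str:
    """Return a prefixed random token; 16 random bytes is 128 bits of key space."""
    return f"{TOKEN_PREFIX}:{secrets.token_urlsafe(nbytes)}"


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the token store. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        """
        CREATE TABLE IF NOT EXISTS tokens (
            token      TEXT PRIMARY KEY,
            buckets    TEXT NOT NULL,  -- comma-separated bucket tags
            deleted_at TEXT            -- NULL while the token is valid
        )
        """
    )
    return db


def add_token(db: sqlite3.Connection, token: str, buckets: list[str]) -> None:
    """Map a token to the list of buckets it may access."""
    db.execute(
        "INSERT INTO tokens (token, buckets) VALUES (?, ?)",
        (token, ",".join(buckets)),
    )
    db.commit()


def buckets_for(db: sqlite3.Connection, token: str) -> list[str]:
    """Return the ACL for a token; unknown or revoked tokens get an empty list."""
    row = db.execute(
        "SELECT buckets FROM tokens WHERE token = ? AND deleted_at IS NULL",
        (token,),
    ).fetchone()
    return [b for b in row[0].split(",") if b] if row else []


def revoke(db: sqlite3.Connection, token: str) -> None:
    """Soft-delete a token by setting deleted_at to the current UTC time."""
    db.execute(
        "UPDATE tokens SET deleted_at = ? WHERE token = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), token),
    )
    db.commit()
```

With this shape, a revoked token and a token that never existed look the same to the caller (both yield an empty ACL), which matches the "deleted"/"missing" reading of `deleted_at`.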
...
## Consequences/Discussion
* We might have to revisit the global authentication premise (for the tunnel layer). Under that premise, a user from bucket A can also authenticate to gateways from bucket B.
* Alternatively, we might want to maintain the global authentication premise (for the tunnel) but guard discovery of, and access to, the bridges per bucket.
* Our strategy of allowing invitation tokens to create any number of VPN configurations and to be shared complicates identifying/attributing abuse.
  + In theory, if we saw abuse patterns at a gateway/bridge, we could look at the associated cert, and we _could_ have the logs to go backwards in time and see which token generated those credentials.
* This requires that we add a storage layer, as opposed to a JWT-style approach. However, adding revocation to the JWT strategy would require a storage layer anyway, so 🤷
* SQLite might be a good match for this use case.
  + SQLite keeps its database in a single file and has a self-contained engine.
  + This could significantly cut down on infrastructure changes in lilypad, etc.
  + If we needed additional menshen hosts to access the database, we _could_ use a tool like [litestream](https://litestream.io/) to replicate the database across hosts.
  + Like any database, we *will* need to make sure that we're configuring/holding it correctly (see https://blog.turso.tech/something-you-probably-want-to-know-about-if-youre-using-sqlite-in-golang-72547ad625f1 for an example of a footgun).
  + [wig](https://git.autistici.org/ai3/tools/wig) is a good example of a project that uses SQLite in a similar manner.
* Redis is also a reasonable solution, though it _does_ require additional infrastructure changes.
* We _may_ want to consider whether it's appropriate to re-use this storage layer for other purposes like a dynamic inventory. Redis is a better candidate for that use case, whereas SQLite would likely be specifically just for this one purpose.
* We'll need to figure out how to actually tag resources and how to import that list of resources and their tags into menshen. See https://0xacab.org/leap/menshen/-/issues/26 for additional detail.
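
To make the tagging idea concrete, the following Python sketch shows one way a tagged inventory could be filtered by a token's buckets. Everything in it is hypothetical: the `Resource` fields, the `visible_resources` helper and the example hostnames are assumptions for illustration, and the actual tagging and import format is still open (see the issue linked above). It also assumes each resource carries exactly one bucket tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    host: str    # hypothetical hostname
    kind: str    # "gateway" or "bridge"
    bucket: str  # bucket tag, e.g. "bucket1"


def visible_resources(inventory: list[Resource], allowed_buckets: list[str]) -> list[Resource]:
    """Return only the resources whose bucket tag is in the caller's ACL."""
    allowed = set(allowed_buckets)
    return [r for r in inventory if r.bucket in allowed]


# Example inventory with made-up hosts.
INVENTORY = [
    Resource("gw1.example.org", "gateway", "bucket1"),
    Resource("gw2.example.org", "gateway", "bucket2"),
    Resource("br1.example.org", "bridge", "bucket2"),
]
```

A token whose ACL is `(bucket2)` would see only `gw2.example.org` and `br1.example.org` here; a token with an empty ACL would see nothing.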