As part of activity 1 in the roadmap https://0xacab.org/leap/soledad/wikis/2017-roadmap we mentioned exposing some live benchmarks for measuring the scalability of the server (to be used mainly in activities 4 and 5 of the roadmap).
Definition of done
I would consider this "done" with the discussion and creation of a new issue containing the proposed steps for benchmarking the behavior of a server when serving requests from a large number of clients.
If we define "scalability" as "the ability to serve X requests in a certain amount of time while maintaining low usage levels of memory/CPU", then we have to define:

- what the expected number of requests is.
- what a low usage level of memory/CPU is.
- what the request timeout for satisfying this is.
We can start by running benchmark tests and observing the current situation. I propose we start with the following test (a rough sketch of a driver for it is included below):
- spawn a server.
- watch the resources consumed by the server.
- launch 1/10/100/1000 clients in parallel.
- run different tasks:
  - nothing to sync.
  - sync of, say, 10MB of JSON.
  - sync of, say, 10MB of blobs.
- measure:
  - time taken to answer all requests.
  - CPU usage.
  - max and mean memory usage.
Does that make sense as an initial step? After that, we could iterate and improve the tests and the output to come up with more data, if needed.
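To make this concrete, here is a rough sketch of what such a test driver could look like. Everything in it is an assumption for illustration (the server PID, the sync endpoint URL, the payload, the use of psutil/requests in plain threads), not the real Soledad client or harness:

```python
# Sketch of a benchmark driver: spawn N concurrent clients against an already
# running server and sample the server process's CPU and memory while the
# requests are in flight. Hypothetical, not the real Soledad test harness.
import concurrent.futures
import threading
import time

import psutil    # third party: pip install psutil
import requests  # third party: pip install requests

SERVER_PID = 12345                       # hypothetical: PID of the server under test
SYNC_URL = "http://localhost:2323/sync"  # hypothetical sync endpoint
N_CLIENTS = 100                          # 1 / 10 / 100 / 1000
PAYLOAD = b"x" * (10 * 1024 * 1024)      # ~10MB of data to "sync"
SAMPLE_INTERVAL = 0.5                    # seconds between resource samples


def sample_server(proc, samples, stop):
    """Record (cpu_percent, rss_bytes) samples of the server until told to stop."""
    proc.cpu_percent(None)  # prime the CPU counter
    while not stop.is_set():
        samples.append((proc.cpu_percent(None), proc.memory_info().rss))
        time.sleep(SAMPLE_INTERVAL)


def one_client(i):
    """Simulate a single client doing one sync request; return (duration, status).
    Errors and timeouts are not handled in this sketch."""
    start = time.time()
    resp = requests.post(SYNC_URL, data=PAYLOAD, timeout=120)
    return time.time() - start, resp.status_code


def main():
    proc = psutil.Process(SERVER_PID)
    samples, stop = [], threading.Event()
    sampler = threading.Thread(target=sample_server, args=(proc, samples, stop))
    sampler.start()

    t0 = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=N_CLIENTS) as pool:
        results = list(pool.map(one_client, range(N_CLIENTS)))
    total = time.time() - t0

    stop.set()
    sampler.join()
    if not samples:  # guard against all requests finishing before the first sample
        samples.append((0.0, proc.memory_info().rss))

    durations = [d for d, _ in results]
    cpu = [c for c, _ in samples]
    mem = [m for _, m in samples]
    print("clients: %d  total time: %.1fs  slowest request: %.1fs"
          % (N_CLIENTS, total, max(durations)))
    print("cpu %%: max %.1f  mean %.1f" % (max(cpu), sum(cpu) / len(cpu)))
    print("rss MB: max %.1f  mean %.1f"
          % (max(mem) / 2 ** 20, sum(mem) / len(mem) / 2 ** 20))


if __name__ == "__main__":
    main()
```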
@drebs your overall approach sounds right. However, I don't think we really need to start by setting an expected number of requests, or even defining what a "low" usage is, but rather focus on measuring the current status, as in: what is the shape of the curve (linear? superlinear?) and what is the saturation point of the server when adding concurrent requests.
It boils down to answering the opposite questions, I think: with N GB of RAM and this many CPUs (one, hehe), what is the maximum number of clients the server can serve without timing out or crashing? From there, and knowing the slope of the response-to-concurrency curve, I think we have enough to start profiling better and measuring improvements.
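As a sketch of how that could be explored, a hypothetical sweep over concurrency levels, assuming a run_benchmark() helper (not real code) that wraps a driver like the one above and returns the total wall time needed to serve n concurrent clients:

```python
# Hypothetical concurrency sweep: print how per-request time grows with the
# number of clients, and stop when the server can no longer answer in time.
def sweep(levels=(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000), timeout=120):
    prev = None
    for n in levels:
        total = run_benchmark(n_clients=n)  # hypothetical helper, see above
        per_req = total / n
        growth = "n/a" if prev is None else "%.2fx" % (per_req / prev)
        print("n=%4d  total=%7.1fs  per-request=%6.2fs  vs-previous=%s"
              % (n, total, per_req, growth))
        prev = per_req
        if total > timeout:
            print("saturation: %d clients cannot be served within %ds" % (n, timeout))
            break
```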
- I think we'd need more steps in the number of clients, to get a nice scalability curve.
- I think it's important to isolate server and client on different machines.
- Define clearly what the "failure point" is (I would guess a percentage of timed-out requests beyond a certain cutoff, assuming a fixed standard timeout hardcoded in the clients, which we also control). We could investigate how other projects deal with this. A possible criterion is sketched below.
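For example, the criterion could be something along these lines (the timeout and cutoff values are placeholders for discussion, not a decided policy):

```python
# Possible "failure point" criterion: the run fails when more than CUTOFF of
# the requests hit the fixed client-side timeout that we control.
CLIENT_TIMEOUT = 60   # seconds, hardcoded in the test clients (placeholder value)
CUTOFF = 0.05         # 5% of requests timing out counts as failure (placeholder value)


def is_failure(durations, n_errors):
    """durations: per-request times of completed requests; n_errors: timed-out/errored ones."""
    total = len(durations) + n_errors
    timed_out = n_errors + sum(1 for d in durations if d > CLIENT_TIMEOUT)
    return total > 0 and float(timed_out) / total > CUTOFF
```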
steps I can think of to test client/server on different machines:

- decouple the Twisted server from TLS handling.
- get nginx/apache listening on some external port and proxying to another local port where the Twisted code runs on localhost. This makes deployment from a virtualenv easier: you just need to bind to any local port (no need for root, no need to access the global TLS key material). A minimal sketch of the server side behind such a proxy is included after this list.
- get a deploy-from-git running there (can be adapted from the current Soledad deploy-from-git script).
- add post-commit hooks to deploy each branch on a server X.
- add a post-commit hook to run stress tests on server Y (this can be the docker runner, maybe).
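For the nginx/apache point, the server side could be as small as the following sketch: a Twisted site bound to a local port only, with TLS terminated by the proxy. The resource, port, and interface here are hypothetical, this is not the actual Soledad server code:

```python
# Minimal sketch of the service once TLS is offloaded to nginx/apache: the
# Twisted code only listens on a local TCP port, so it can run unprivileged
# from a virtualenv.
from twisted.internet import reactor
from twisted.web import resource, server


class Ping(resource.Resource):
    """Trivial resource standing in for the real application."""
    isLeaf = True

    def render_GET(self, request):
        return b"ok"


# Bind to localhost only; the proxy terminates TLS on the public port and
# forwards the decrypted traffic to this local port.
reactor.listenTCP(2323, server.Site(Ping()), interface="127.0.0.1")
reactor.run()
```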
@drebs I think this is a good plan; for me we can close this task and point to this discussion from the implementation issue.
re. servers, I think we really don't need to run this on every commit. It makes more sense to trigger a benchmark between two releases, or between a branch and master if we're working on something perf-related.
In that sense, I think we can use cdev as the server side, and something with good bandwidth for the DDoS...