Type: Task
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: None
Component/s: None
Replication
Jepsen is written in Clojure and runs on its own infrastructure. Few engineers are familiar with either, so even minor changes to workloads (like DEVPROD-2763) or infrastructure (SERVER-79167) take weeks to implement.
We can get around this problem by rewriting the Jepsen workloads in JavaScript so that Resmoke can run them instead. Most Jepsen workloads are simple (but clever), so rewriting them in JavaScript shouldn't be very time-consuming.
Roughly, the steps involved to make this happen:
1. Rewrite Jepsen workloads in JavaScript.
2. Implement infrastructure to feed the Jepsen workload output into Elle, Jepsen's model checker. This could be as simple as prefixing all log lines from the workload and then filtering the test logs for that prefix.
3. Integrate Elle into Resmoke via hooks. For example, a hook could launch the Elle CLI to process the filtered output.
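As a rough illustration of step 2, the sketch below shows one way a workload could tag its history lines and how a later step could pull them back out of the mixed test log. Everything named here is an assumption made for illustration: the `JEPSEN_HISTORY ` prefix, the helper names, and the JSON shape of an operation are all hypothetical. The EDN rendering is modelled on the list-append history format Elle is understood to consume (`:invoke`/`:ok` operations whose `:value` is a list of `[:append k v]` and `[:r k v]` micro-operations) and should be checked against Elle's documentation before use.

```javascript
// Hypothetical marker that separates history lines from ordinary test logs.
const HISTORY_PREFIX = "JEPSEN_HISTORY ";

// Serialize one operation as a prefixed JSON log line.
// `type` is "invoke", "ok", "fail", or "info"; `value` is a list of
// micro-operations: ["append", key, element] or ["r", key, listOrNull].
function formatHistoryLine(op) {
    return HISTORY_PREFIX + JSON.stringify(op);
}

// Keep only the prefixed lines and parse them back into operation objects.
// The prefix may appear mid-line, after whatever decoration the logger adds.
function extractHistory(logText) {
    const history = [];
    for (const line of logText.split("\n")) {
        const at = line.indexOf(HISTORY_PREFIX);
        if (at !== -1) {
            history.push(JSON.parse(line.slice(at + HISTORY_PREFIX.length)));
        }
    }
    return history;
}

// Render one micro-operation as EDN, e.g. [:append 1 2] or [:r 1 nil].
function microOpToEdn([f, key, value]) {
    let v;
    if (value === null) {
        v = "nil";
    } else if (Array.isArray(value)) {
        v = `[${value.join(" ")}]`;
    } else {
        v = String(value);
    }
    return `[:${f} ${key} ${v}]`;
}

// Render one operation as an EDN map, using its position in the history as :index.
function opToEdn(op, index) {
    const value = op.value.map(microOpToEdn).join(" ");
    return `{:index ${index}, :type :${op.type}, :process ${op.process}, :f :txn, :value [${value}]}`;
}

// Example: two history lines interleaved with unrelated log output.
const log = [
    "[js_test] starting workload",
    "[js_test] " + formatHistoryLine(
        {type: "invoke", process: 0, value: [["append", 1, 2], ["r", 1, null]]}),
    "[js_test] unrelated noise",
    "[js_test] " + formatHistoryLine(
        {type: "ok", process: 0, value: [["append", 1, 2], ["r", 1, [2]]]}),
].join("\n");

const edn = extractHistory(log).map(opToEdn).join("\n");
console.log(edn);
```

The resulting text could then be written to a file for the hook in step 3 to hand to Elle. Logging JSON and converting afterwards keeps the workload itself free of EDN details, but emitting EDN directly from the workload would work just as well.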
A few benefits of doing this:
- Server engineers can write / debug in an environment they're familiar with
- We can extend Jepsen workloads through overrides. For example, we can pretty easily come up with a timeseries version of list-append
- We can run Jepsen workloads under the various fault injection modes / background processes by adding hooks. For example, we can run Jepsen + tenant migrations, Jepsen + initial sync, etc.
- We can run Jepsen workloads on Antithesis