I remember the "spin" network protocol simulator from Bell Labs in the early '90s. It was used at then-AT&T to predict and avoid bugs in telecom equipment systems. Sounds very similar in approach: simulate first, detect protocol problems (deadlocks, livelocks, bad states, etc.), and then implement once you get it right.
@outwithrealitytoo 10 days ago
It does not entirely check for "thread safety" - but testing for unexpectedly out-of-order events is certainly a starter for ten. In the olden days, working on radio comms s/w, state/event matrices forced devs to think about all eventualities - but those were simpler systems. When you have multiple processes speaking to each other, each with their own state machine, the number of states of the system as a whole grows exponentially as you add processes, and bad things can happen. Part of the argument for distributed systems is that each small part can be easily understood and tested fully. However, often by splitting a complex system you actually create a system with far, far more states, many of which are never considered, used or tested. Deterministic simulation will test some of these situations but not all of them. It will be helpful for debugging, but it is best not to have a bug in the first place. And all this is before people start inaccurately and incompatibly reflecting the same state in multiple processes. For this reason my advice has always been "do not create another process or another thread unless absolutely necessary - it may be complicated, but splitting it up unnecessarily will only make that worse, it just won't be obvious until it is too late".
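To put a rough number on the state growth described above, here is a minimal sketch. The three per-process states (`idle`, `sending`, `waiting`) are made up for illustration, and real processes would of course not all share the same state machine.

```python
from itertools import product

# Hypothetical per-process state machine with just three states.
PROCESS_STATES = ["idle", "sending", "waiting"]


def global_states(per_process_states):
    """Enumerate every combination of per-process states.

    The global state of the system is one state per process, so the
    count is the product of the individual state counts.
    """
    return list(product(*per_process_states))


if __name__ == "__main__":
    for n in range(1, 6):
        count = len(global_states([PROCESS_STATES] * n))
        print(f"{n} process(es): {count} global states")
```

With three states per process the count is 3^n, so five processes already give 243 combinations, and that is before counting messages in flight between them.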
@LewisCampbellTech 1 year ago
5 minutes into this talk and my mind is already blown.
@syzer3921 10 years ago
can u fix the sound?
@waynemokane 3 years ago
What if the invariant you want to test isn't so invariant after all, but actually depends on what happened during the simulation? Ex: if some message gets dropped, then I wouldn't expect the final state to have key X. Is this feasible? You would need to make the simulator dynamically update the expectation based on what random thing it breaks. It seems like doing that would likely require reimplementing a lot of the state management logic of the system under test itself.
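One way to picture the question above is a simulator that records which faults it injected and feeds only the delivered messages into a much simpler reference model. The sketch below is an assumption-laden toy, not anyone's actual framework: `KVStore`, `simulate` and `check` are hypothetical names, and the "system" is just a dictionary.

```python
import random


class KVStore:
    """Hypothetical system under test: applies every put it receives."""

    def __init__(self):
        self.data = {}

    def handle(self, key, value):
        self.data[key] = value


def simulate(seed, n_messages=20, drop_rate=0.3):
    """Run one deterministic simulation; the seed fixes every random choice."""
    rng = random.Random(seed)
    store = KVStore()
    expected = {}   # reference model, updated only for delivered messages
    dropped = []    # record of the faults the simulator injected
    for i in range(n_messages):
        key, value = f"k{i % 5}", i
        if rng.random() < drop_rate:
            dropped.append((key, value))  # fault injected: message is lost
            continue
        store.handle(key, value)
        expected[key] = value
    return store.data, expected, dropped


def check(seed):
    """Compare the final state with the fault-aware expectation."""
    actual, expected, dropped = simulate(seed)
    assert actual == expected, f"seed {seed}: {actual} != {expected}"
    return len(dropped)
```

Here the reference model is trivial because the system is trivial. For a real system the model would have to be a deliberately simplified version of the state logic, which is exactly the duplication cost the question is worried about.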
@mortenbrodersen8664 10 years ago
Also, tools like TLA+ are made to solve these problems. The difference being that TLA+ uses logic to verify all traces *exhaustively*, without taking millions of years to do so. Running a sim will only explore a tiny percentage of possible event traces.
@bgianf 10 years ago
TLA+ will only get you proof of your algorithms, you must still verify your implementation...
@09goral 4 years ago
It’s also not verifying everything exhaustively. It picks some subset of all the possibilities.