Deadlock Analysis in Stronghold
Stronghold uses RwLock
to enable concurrency.
This approach has the pros of being simple in theory and fast performance when done right. The downside is that, in practice, it is usually really difficult to use locks in concurrent programming, and you often encounter data races or deadlocks when using them.
One of the worst cases would be to have a deadlock in Stronghold, which could block Stronghold and render it unusable.
To avoid this situation, Stronghold uses deadlock analysis.
Requirements for a Multithreaded Stronghold
- Be able to clone the element type parts of the API and use them concurrently. The following types ar reachable through the API and may be cloned and shared between threads:
- Use
Mutex
orRWLock
Deadlocks Analysis
These are thoughts stemming from experience, but they do not have a peer-reviewed paper backing them.
Model
Locks are used for controlling access to shared memory. To simplify the problem, we only model mutex so that only a single thread can access shared memory.
Model of functions and locks
- , , represent locks that protect 3 different and distinct shared memory locations.
- is a function that sequentially takes hold of lock then then .
- releases the locks , , at the end of its computation.
Note that lock depends on lock , because in you can't take lock before taking lock . In the same way, lock depends on locks and .
Example: Deadlock with 2 functions
The following is an example of the most simple possible deadlock.
- The function wants to take the locks for shared memory , then .
- The function wants to take the locks for shared memory , then .
- On the right, you can see a circular lock dependency between and . This means that you can do a circle by following the arrows of and .
Analysis
The graph at the right represents the merged dependency graph of functions and . It shows all the dependencies of the locks in and . A circular dependency on the merged graph indicates the possibility of a deadlock when executing and concurrently.
Example run resulting in a deadlock
- starts and takes the lock .
- starts and takes the lock .
- is stuck, because is holding the lock that it requires.
- is stuck, because is holding the lock that it requires.
Example: Deadlock with 3 functions
In the previous example, we showed that there is a circular dependency of locks between , and .
Example run resulting in a deadlock
- starts and takes the lock .
- starts and takes the lock .
- starts and takes the lock .
- is stuck, because is holding the lock that it requires.
- is stuck, because is holding the lock that it requires.
- is stuck, because is holding the lock that it requires.
Proof backing "no circular dependency equals no deadlock"
This is a proof on an example for simplicity. You can apply the reasoning in the next part to any dependency graph of locks to prove the absence of a deadlock when there are no circular dependencies.
Example: no deadlock
There is no circular dependency in the right-side representation. We will try to argue that there is indeed no possible deadlock in this configuration. The main idea linking absence of deadlock with absence of circular dependency is that any thread will always eventually get the lock it's waiting for. So, it will eventually terminate its computation thus releasing the locks it acquired.
Definition: deadlock-free lock
A lock is deadlock-free when we can prove that it will never be part of a deadlock and will always eventually be released.
Prove the absence of deadlock using the merged graph
- Lock is deadlock-free. No locks depend on lock . When a thread acquires lock it will manage to terminate its computation and will always release it.
- Recursively lock is deadlock-free. Lock is the only lock that depends on lock . That means:
- If a thread acquires lock , it will eventually get lock because lock is deadlock-free (lock will always be available at some point).
- If a thread acquires lock , it will manage to terminate its computation and eventually be released because lock is deadlock-free.
- Lock is deadlock-free by recursivity.
- Recursively, lock is deadlock-free. All the locks that depend on lock are deadlock-free. You can apply the same reasoning for lock to lock .
Stronghold Deadlock Analysis
The goal is to list Stronghold's functions that access locks and build a merge graph. Then, we show that the merge graph does not contain circular dependencies.
Stronghold Share Memories
This list shows the different shared memory locations that are protected behind locks. Each shared memory location/lock is associated with a number that we will use for the dependency graph that follows.
Fields of the Stronghold
type which are shared memories
- snapshot:
Snapshot
- clients:
HashMap<ClientId, Client>
- store:
Store
- key_location:
Option<Location>
Fields of the Client
type which are shared memories
- keystore:
KeyStore
- db:
DbView
- store:
Store
Stronghold Functions
Creating one dependency graph per function would be too complicated. Instead, we list every function that accesses the shared memories. After the function name, we describe the locks that are acquired or the functions that they call.
Functions of the Stronghold
type
- reset:
- store:
- load_client_from_snapshot: 1 2
client
.restore() - load_client: 1 2
client
.restore() - get_client: 2
- unload_client: 2
- purge_client: 1 2
- load_snapshot: 1
- store_snapshot_key_at_location: 1 4
- create_client: 2
- commit_with_keyprovider: 1 2 5 6 7
- commit: 1 2 5 6 7
- write_client: 1 2 5 6 7
- clear: 1 2 3 4
client
.clear()
Functions of the Client
type
- store:
- vault:
- vault_exists: 5
- record_exists: 6
- sync_vaults: get_hierarchy() get_diff() export_entries() 5 6
- sync_with:
other_client
.get_hierarchy() get_diff()other_client
.export_entries()other_client
.5 5 6 - id:
- restore: 5 6 7
- clear: 5 6 7
- execute_procedure: execute_procedure_chained()
- execute_procedure_chained: exec_proc()
- get_db: 6
- get_key_provider: 5
Runner
trait- get_guards: 5 6
- exec_proc: 5 6
- write_to_vault: 5 6
- revoke_data: 5 6
- garbage_collect: 5 6
- resolve_locations: 5
- get_guard: 5 6
SyncClients
trait- get_hierarchy: get_key_provider() get_db()
- get_diff: get_key_provider() get_db()
- export_entries: get_db()
Stronghold Merged Graph
This was created by merging all the lock dependencies from the Stronghold functions.
You can easily see that there are no circular dependencies in this graph.
Conclusion
This document reassures us of the absence of deadlocks in our multithreaded implementation of Stronghold.
However, the theory presented has not been proven correct to our knowledge. Moreover, creating the merged graph is not exempt from human error either. This is not proof of the absence of deadlocks, but an argument and a view on how we designed Stronghold to avoid such a situation.