Direct Blocking Relation
A direct blocking relation is a relation between two tasks T1 and T2, and a lock L1. This type of relation is illustrated in Figure 1. The following properties are related to direct blocking relations.
- T1 is the owner of lock L1
- T2 is said to be blocked on T1 through L1
- T1 is blocked on T2
- T1 is blocked on L1
Figure 1: A Direct Blocking Relation
Indirect Blocking Relation
An indirect blocking relation is a relation between three or more tasks, and 2 or more locks. An inductive definition is used to define an indirect blocking relation because the number of tasks and locks is unbounded.
Base Inductive Case
The simplest indirect blocking relation is between three tasks T1, T2, and T3, and two locks L1 and L2. This relation is illustrated in Figure 2, where the following terminology is used:
- T2 is directly blocked on task T1 through lock L1
- T3 is directly blocked on task T2 through lock L2
- T3 is said to be indirectly blocked on task T1 through lock L1
- T3 is said to be indirectly blocked on task T1 through task T2
![]()
Figure 2: Indirect Blocking Relation (Base Case)
Inductive Case
An indirect blocking relation involving more than three tasks and two locks is illustrated in Figure 3, where the following terminology is used:
- T2 is directly blocked on task T1 through lock L1
- T3 is indirectly blocked on task T1 by the base case
- T4 is indirectly blocked on all tasks and locks that task T3 is blocked on either directly or indirectly.
![]()
Figure 3: Indirect Blocking Relation (Inductive Case)
Proxy Relation
A proxy relation is a blocking relation in which the owner of the lock associated with the relation is not blocked (i.e. it is in the running or ready state). For example, in Figure 1 a single, direct blocking relation exists, and this relation is also a proxy relation. But in Figures 2 and 3, proxy relations exist as both direct and indirect blocking relations.
- The proxy of a blocked task Ta, is a runnable task Tb, associated with the proxy relation that exists between Ta and Tb.
Notice that in Figures 1, 2, and 3 each blocked task is associated with exactly one proxy relation. Because a task can be blocked on exactly one lock at a time, and a lock can have exactly one owner at any moment, a linear path composed of individual blocking relations exists between a blocked task and its proxy. Consider for example the complex set of blocking relations shown in Figure 4. In this figure the cloud represents an arbitrary set of co-existing blocking relations that involve tasks T2 and T5. And finally, note that T1 exists as the proxy of all other tasks, including those tasks existing in the cloud.
Figure 4: Complex Lock Chain
A Proxy Tracker (tracker) is a Proxy Management data structure that associates a blocked task with a blocking relation, with the added property that the system always attempts to associate a tracker with a blocked task’s proxy relation.
A tracker cannot always be immediately associated with a blocked task’s proxy relation. For example, consider the following scenario starting with the lock chain shown in Figure 5. In this initial state task T3 is in the process of blocking on task T2 (shown in green). The faded area of the figure depicts part of a longer lock chain that will be discovered, but is not known immediately. Thus, Proxy Management associates T3‘s proxy tracker with the best choice given the available knowledge of the locking chain.
Figure 5: Proxy Tracker Associated With New Blocking Relation
As part of a iterative, secondary stage in the blocking process a complete locking chain is discovered. Returning to our example scenario, Figure 6 shows the state of the tracker once Proxy Management has knowledge of the entire locking chain, but has not yet updated the tracker associated blocking relation.
Figure 6: Proxy Tracker Needs Update
Finally, Proxy Management will update the association between the tracker and a proxy relation. In this scenario, the final step associates the tracker with a proxy relation, shown in Figure 7, fulfilling the policy goal of Proxy Management’s use of the tracker structure.
Figure 7: Tracker Associated With Proxy Relation
The struct pmgt_tracker structure implements a Proxy Management Tracker.
struct pmgt_waiter {
atomic_spinlock_t lock;
struct task_struct *proxy_task;
struct list_head proxy_ent;
struct list_head nodes;
struct sprx_waiter_node *tnode;
struct list_head gsched_members;
};
lock
An atomic (traditional) spinlock controlling access to the tracker.
proxy_task
The proxy_task field points to the owner of the lock involved in the blocking relation that the tracker is associated with. The proxy_task pointer is depicted by the dashed, red arrow in Figure 6 and Figure 7.
proxy_ent
A list entry structure used to place the tracker on a list of trackers maintained by the task pointed to by proxy_task. The list is named pmgt_trackers and is located in the struct task_struct. The per-task pmgt_trackers list is represented in Figure 8 as a dashed, green line.
nodes
An ordered list of proxy management nodes representing the sequence of blocking relations that compose a lock chain. A lock chain represented by the nodes list begins with a blocked task’s direct blocking relation, and ends with the blocking relation associated with a tracker. This list is represented in Figure 8 as a dashed, black line.
tnode
A pointer to the node representing the same blocking relation that the tracker is associated with. This node is referred to as the top node, and always points to the node at the head of the nodes list maintained by a tracker.
gsched_members
This field is specific to Proxy Execution, and stores a set of Group Scheduling Proxy Members representing the task proxy_task.
A proxy node is a Proxy Management data structure that represents a blocking relation. Unlike a tracker that tracks a single relation, multiple node are linked together to represent the composition of an entire locking chain that a task is blocked on, composed of one or more relations.
Lock Chain Representation
A locking chain is represented by a sequence of nodes arranged in a linked list and stored with a tracker. The nodes are ordered in the list consistent with the underlying blocking relations. Thus the first node in the linked list always represents a direct blocking relation, and the last node in the list is the relation tracked by the tracker associated with the blocked task. In Figure 8 the locking chain associated with each blocked task is shown as a dashed, black line attached to a node.
Figure 8: Linking Among Proxy Management Data Structures (empty lists are not shown)
Lock Chain Distinction
Proxy Management algorithms must be able to distinguish between waiters in distinct locking chains at positions where local, structrual information is not sufficient to make a distinction. For example, in Figure 8 tasks T4 and T2 are blocked on lock L1, and represent two distinct locking chains that merge at lock L1. The merging of the two chains effectively masks the locking chain membership of indirectly blocked tasks. For example, without additional information, given the nodes N3 and N5 associated with lock L1 it cannot be determined which of the two higher-level locking chains involve a specific blocked task. However, notice that the two blocked tasks are distinguishable at a given position (e.g. at L1) by the set of relations they are indirectly blocked on, which includes the direct blocking relations at that position. Thus, we can determine if two observed indirect blocking relations are part of the same chain if and only if they arrived via the same directly blocked task at a particular location. For example, the nodes N3 and N5, when observed at the position of lock L1, are distinguishable if it is known that one arrived via task T2 and the other arrived via task T4.
Figure 9: Proxy Node arrived_via Pointer
The arrived_via property of a blocking relation is implemented using a struct task_struct pointer in the node structure, named simply via_task. The pointer is illustrated in Figure 9 as a dashed black line. In the figure tasks T3 and T5 are indirectly blocked on lock L1, and arrived_via tasks T2 and T4, respectively. All indirect blocking relations by definition have an associated arrived_via property that points to a task. The arrived_via property of a direct blocking relation is set to the NULL value by convention to indentify it as a direct blocking relation.
struct pmgt_node {
raw_spinlock_t lock;
struct task_struct *owner_task;
struct list_head owner_ent;
struct pmgt_tracker *tracker;
struct list_head tracker_ent;
struct task_struct *via_task;
struct rt_mutex *via_lock;
struct list_head via_lock_ent;
};
lock
An atomic (traditional) spinlock controlling access to the structure.
owner_task
The owner of the lock associated with the blocking relation represented by a node. The blocked task associated with the relation is blocked on the owner_task either directly or indirectly. This field is used exclusively for assertions in Proxy Management algorithms. The owner_task pointer is illustrated in Figure 10 as the dashed black line.
owner_ent
A linked list entry structure used to place a node on the pmgt_nodes list of the task pointed to by owner_task. This list represents all direct and indirect waiters on a task. The pmgt_nodes list is shown in Figure 10 as the dashed red line.
Figure 10: Proxy Node owner_task, and pmgt_nodes list
tracker
Pointer to the tracker associated with the locking chain containing a node. Note that deferred actions are used in Proxy Management to clean-up unused nodes in some circumstances. An unused node is identified in Proxy Management by a NULL value assigned to the tracker field. The blacked dashed line in Figure 11 shows the use of the tracker pointer in an example lock chain.
Figure 11: Proxy Node tracker Field
tracker_ent
A linked list entry structure used to place a node on the ordered nodes list of a tracker that represents the sequence of blocking relations beginning with a blocked task, and ending with a proxy relation. The nodes list of the tracker structure is shown as a dashed blacked line in Figure 8.
via_task
The via_task pointer, as discussed in section Lock Chain Distinction, allows Proxy Management to distinguish between blocked tasks based on their inclusion in separate locking chains.
via_lock
A pointer to the RT-Mutex associated with the blocking relation represented by a node. The lock pointed to by via_lock contains a list of all tasks that are directly or indirectly blocked on the lock.
via_lock_ent
The linked list entry structure used to place a node on the list of waiters on the RT-Mutex. The referenced lock is pointed to by via_lock. The list is called pmgt_nodes, and is depicted in Figure 8 as a dashed blue line.
The Linux struct task_struct is modified to track direct and indirect waiters on a task.
struct task_struct {
/* ... omitted ... */
struct list_head pmgt_trackers;
struct list_head pmgt_nodes;
};
pmgt_trackers
A list of trackers associated with a node involving this task in a blocking relation. This list is shown as a dashed green line in Figure 8.
pmgt_nodes
A list of nodes representing all blocking relations involving this task. This list is shown as a dashed red line in Figure 8.
The RT-Mutex structure is modified to track all direct and indirect waiters on the lock.
struct rt_mutex {
/* ... omitted ... */
struct list_head sprx_nodes;
};
pmgt_nodes
A list of nodes representing blocking relations that involve this lock. The via_lock pointer of all nodes on this list point to this RT-Mutex. This list is shown as a dashed blue line in Figure 8.
The RT-Mutex Waiter is modified to point to the tracker of a blocked task in order to minimize the amount of changes to the Linux struct task_struct. The RT-Mutex Waiter and tracker structures serve the same purpose, and have the same life-time (they exist only while a task is blocked). Placing a pointer to a task’s tracker in the RT-Mutex Waiter structure avoids placing it in the struct task_struct (the struct task_struct already contains a pointer to struct rt_mutex_waiter).
struct rt_mutex_waiter {
/* ... omitted ... */
struct pmgt_tracker *pmgt_tracker;
};
pmgt_tracker
Pointer to the tracker structure associated with a blocked task.
Blocking on an RT-Mutex is an operation that resutls in the creation of at least one new proxy relation. When a task blocks on a mutex a direct blocking relation is always created between the blocking task and the owner of the mutex. In addition to the direct blocking relation zero or more indirect relations are also created. Indirect relations are created when the blocking task has waiters on one or more locks it owns, and/or when the owner of the mutex being blocked on is itself blocked on mutex. In the later case the lock chain is “walked” by iteratively examining blocking relations until an owner is found that is not blocked on a mutex. In both cases old proxy relations are destroyed, and new relations are created as locking chain representations in Proxy Management are updated to reflect the change in blocking relations.
The Group Scheduling framework must be informed of changes to blocking relations. During the blocking process Group Scheduling is notified of all new relations created, as well as those that are removed.
static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task,
int detect_deadlock, unsigned long flags,
struct pmgt_tracker *tracker)
{
struct list_head pmgt_trackers;
struct list_head pmgt_nodes;
atomic_spin_lock(&task->pi_lock);
waiter->task = task;
waiter->lock = lock;
list_add(&waiter->list_entry, &lock->wait_list);
task->pi_blocked_on = waiter;
17 pmgt_init_tracker(tracker);
19 INIT_LIST_HEAD(&tmp_trackers);
20 INIT_LIST_HEAD(&tmp_nodes);
22 pmgt_advance_tracker(tracker, NULL, lock, owner, &tmp_trackers,
&tmp_nodes);
25 waiter->pmgt_tracker = tracker;
Line 17
Initialize the tracker stored on the waiter’s stack.
Lines 19-20
Initialize temporary lists on the stack to store proxy management structures modified while holding task->pi_lock which will later be used when holding owner->pi_lock. These two locks cannot be held at the same time, but the spinlock lock->wait_lock protects these data structures during the entire blocking operation.
Line 22
Create a direct blocking relation between the blocking task and the owner of the lock. See Create Blocking Relation.
27 pmgt_advance_waiters(task, lock, owner, &tmp_trackers, &tmp_nodes);
atomic_spin_unlock(&task->pi_lock);
Line 27
Create indirect blocking relations between the waiters on the task blocking (if any), and the owner of the lock being blocked on. The data structures that are placed on the lock owner’s lists are stored on temporary lists until we are able to acquire owner->pi_lock. See Phase 1: Prepare New Relations for more information regarding the pmgt_advance_waiters routine.
30 atomic_spin_lock(&owner->pi_lock);
31 pmgt_advance_waiters_finish(owner, &tmp_trackers, &tmp_nodes);
atomic_spin_unlock(&owner->pi_lock);
get_task_struct(owner);
atomic_spin_unlock_irqrestore(&lock->wait_lock, flags);
37 pmgt_adjust_rt_mutex_chain(owner);
atomic_spin_lock_irq(&lock->wait_lock);
return 0;
}
Line 31
Finish the creation of indirect blocking relations by associating the waiters on the blocking task with the owner of the lock, as well as finalizing the new direct blocking relation. See Phase 2: Finalize/Install New Relations for more information on the pmgt_advance_waiters_finish() routine.
Line 37
In cases where the owner of the mutex being blocked on is itself blocked on a mutex, indirect blocking relations must be extended to represent the entire length of the lock chain. This is done by iteratively “walking” the chain. The routine pmgt_adjust_rt_mutex_chain is responsible for this, and is described in Walking a Lock Chain.
A blocking relation represents various relationships between a blocked task and other Proxy Management structures, as described in struct pmgt_tracker. Creating a new relation involves representing these relationships using pointers and linked lists.
static void pmgt_advance_tracker(struct pmgt_tracker *tracker,
struct task_struct *via_task, struct rt_mutex *lock,
struct task_struct *owner, struct list_head *trackers,
struct list_head *nodes)
{
6 struct pmgt_node *node = pmgt_alloc_node();
/* Setup blocking relation */
9 node->via_task = via_task;
10 node->via_lock = lock;
11 node->owner_task = owner;
12 list_add(&node->via_lock_ent, &lock->pmgt_nodes);
13 list_add(&node->owner_ent, nodes);
/* The tracker is accessible via nodes */
16 node->tracker = tracker;
/* List of nodes in same order as blocked tasks in chain */
19 list_add(&node->tracker_ent, &tracker->nodes);
/* Advance tracker to next position in the chain */
22 list_add(&tracker->proxy_ent, trackers);
/* Used for assertions */
tracker->proxy_task = owner;
tracker->tnode = node;
}
Line 6:
Allocates a node for the new blocking relation. Memory for nodes is created at compile time, and concurrent access to the cache of nodes is protected by a global spinlock. More advanced allocation schemes and concurrency control schemes are possible.
Lines 9-11:
Record information describing a blocking relation’s position in a lock chain.
- Line 9: As described in section Lock Chain Distinction, the via_task field points to the task, directly blocked on the lock, through which an indirect blocking relation is constructed. By convention the via_task field is set to NULL for direct blocking relations. For example, in Figure 9 task T5 records, using the via_task field, that it is blocked on task T1 via task T4.
- Line 10: via_lock is the lock associated with this relation. The task via_task is directly blocked on via_lock. For direct blocking (via_task is NULL) relations the waiter associated with this relation is directly blocked on via_lock.
- Line 11: The owner of via_lock.
Lines 12-13:
Inserts the blocking relation into sets of other relations occupying the same position in the locking chain. For example in Figure 9 the top waiter nodes N5, N4, N3, and N2 are one the same lists associated with lock L1, and task T1.
- Line 12: Inserts the node into a list of waiters on via_lock.
- Line 13: Inserts the node into the list of waiters blocked on the owner of via_lock. A pointer to the target list is passed in because the list is actually a temporary list allocated by the caller. This is done because concurrency control restrictions prevent us from immediately accessing the owner’s real list. The calling context resolves this descrepency.
Line 16
Save a pointer to the tracker. Other Proxy Management routines must be able to locate the tracker from a given reference to a node.
Line 19
Store the node on the tracker’s list of nodes. This is an ordered list representing the structure of the locking chain.
Line 22
The tracker goes onto the temporary list of trackers that will later be added to the per-task list of the lock’s owner.
This section outlines the three routines that are used to extend the blocking chains represented in Proxy Management. The first two routines are used together to move waiters from a blocked task to the owner of a lock being blocked on. The last operation is used create additional indirect blocking relations by walking a lock chain during the blocking process. The creation of new relations during a lock chain traversal uses the first two routines.
The routine pmgt_advance_waiters is the first phase in transferring waiters from one task to another. This routine results in the removal of one proxy relation, and the creation of a new proxy relation (when a tracker is advanced). Group Scheduling must be notified of the removal and creation operations in order to properly represent this new state within the scheduling framework.
static void pmgt_advance_waiters(struct task_struct *task,
struct rt_mutex *lock, struct task_struct *owner,
struct list_head *waiters, struct list_head *nodes)
{
struct pmgt_tracker *tracker, *t;
7 list_for_each_entry_safe(tracker, t, &task->pmgt_trackers, proxy_ent) {
atomic_spin_lock(&tracker->lock);
9 list_del(&tracker->proxy_ent);
10 pmgt_advance_tracker(tracker, task, lock, owner, waiters, nodes);
atomic_spin_unlock(&tracker->lock);
}
}
Line 7
Loop over task‘s set of tracker structures. Each tracker represents a blocking chain that must be extended further up-chain.
Line 9
The tracker is removed from task‘s set. It will be moved to the next task in the lock chain (the owner task).
Line 10
Create a new indirect blocking relation, extending each of task‘s waiters’ locking chains (see Create Blocking Relation). The associated proxy relation is setup at this step, and finalized in the second phase.
This routine is called following the first phase of extending proxy relations, described in Phase 1: Prepare New Relations. At this point during execution a new set of proxy relations is ready to be installed. To install the relation it must be added to the proper linked lists associated with the task the relation is being trasferred to, and Group Scheduling must be notified.
static void pmgt_advance_waiters_finish(struct task_struct *task,
struct list_head *waiters, struct list_head *nodes)
{
struct pmgt_tracker *tracker, *t;
6 list_for_each_entry_safe(tracker, t, waiters, proxy_ent) {
7 list_del(&tracker->proxy_ent);
8 list_add(&tracker->proxy_ent, &task->pmgt_trackers);
}
11 list_splice(nodes, &task->pmgt_nodes);
}
Line 6
Loop over each tracker, removing it from the temporary list we are looping over, and adding it to the new proxy’s list of trackers.
Line 11
Add the set of node on the temporary list to task‘s set of nodes. This list is not currently used for the correct operation of any algorithms, but may be useful in the future: it is the set of all relations involving a particular task.
This routine is called as the last step of a task blocking on an RT-Mutex (Blocking On RT-Mutex) and is used to extend the locking chains maintained in Proxy Management when the owner of the locking being blocked on is itself blocked on a lock. This is an iterative process that examines a sequence of blocking relations until a task is found that is not blocked on an RT-Mutex. This final, non-blocked task is at the head of the lock chain being traversed.
static void pmgt_adjust_rt_mutex_chain(struct task_struct *task)
{
struct rt_mutex *lock;
struct rt_mutex_waiter *waiter;
struct list_head tmp_trackers;
struct list_head tmp_nodes;
unsigned long flags;
static int maxdepth = 0;
int depth = 0;
11 again:
atomic_spin_lock_irqsave(&task->pi_lock, flags);
14 waiter = task->pi_blocked_on;
15 if (!waiter)
16 goto out_unlock_pi;
17 lock = waiter->lock;
18 if (!atomic_spin_trylock(&lock->wait_lock)) {
29 atomic_spin_unlock_irqrestore(&task->pi_lock, flags);
20 cpu_relax();
21 goto again;
}
Line 11
The beginning of a routine-long loop that, for each iteration, considers the next task in the lock chain.
Lines 14-16
Check if the task is blocked on an RT-Mutex. If not, we have found the front of the lock chain, and have completed the chain traversal.
Lines 17-21
Try to lock the RT-Mutex that the task is blocked on. Try locks are used because this ordering of locking is reversed from normal locking order. To prevent deadlocks, the entire process is repeated if we fail to acquire both spinlocks.
24 INIT_LIST_HEAD(&tmp_trackers);
25 INIT_LIST_HEAD(&tmp_nodes);
27 pmgt_advance_waiters(task, lock, rt_mutex_owner(lock),
&tmp_trackers, &tmp_nodes);
atomic_spin_unlock(&task->pi_lock);
put_task_struct(task);
Lines 24-25
Initialize the temporary lists used to store data structures allocated and initialized while holding task->pi_lock, and then transferred to the owner while holding owner->pi_lock. These two locks cannot be held at the same time.
Line 27
Execute the first phase in transferring waiters on task to the owner of the lock that task is blocked on.
33 task = rt_mutex_owner(lock);
get_task_struct(task);
atomic_spin_lock(&task->pi_lock);
38 pmgt_advance_waiters_finish(task, &tmp_trackers, &tmp_nodes);
atomic_spin_unlock(&task->pi_lock);
atomic_spin_unlock_irqrestore(&lock->wait_lock, flags);
44 goto again;
46 out_unlock_pi:
atomic_spin_unlock_irqrestore(&task->pi_lock, flags);
put_task_struct(task);
}
Line 33
Advance to the next task in the lock chain, the owner of the lock being considered. This also initializes the task variable for the next iteration of the loop.
Line 38
Execute the second phase of transferring the waiters.
Line 44
This returns to the beginning of the routine starting a new iteration of the traversal loop.
Line 56
The only exit point of the lock chain traversal is when we encounter a task not blocked on an RT-Mutex. When the head of the lock chain is found execution is transferred to this label.
An RT-Mutex is released when its owner no longer requires the resource that the RT-Mutex protects. After releasing an RT-Mutex a new task may acquire the resource. The RT-Mutex primitive is a blocking semaphore, thus upon releasing an RT-Mutex the owner must wake-up a waiter to ensure that a new task will make progress by running to acquire the resource it is blocked on.
The waiter on an RT-Mutex chosen to be awoken by the owner during release is called the pending owner. The name pending owner refers to the state that the task is in with respect to being the real owner of the RT-Mutex. Specifically, before a task can become the real owner it must run in its own context to fully acquire the lock. Since a delay exists between a task being woken up and being scheduled on a CPU, the state is called pending. We will see later that this state is used in an optimization called stealing in which this delay is exploited to allow high priority tasks to steal an RT-Mutex from a pending owner.
static void wakeup_next_waiter(struct rt_mutex *lock, int savestate)
{
struct rt_mutex_waiter *waiter;
struct task_struct *pendowner;
struct pmgt_tracker *tracker;
struct pmgt_node *node, *t;
atomic_spin_lock(¤t->pi_lock);
10 waiter = rt_mutex_top_waiter(lock);
11 list_del(&waiter->list_entry);
12 pendowner = waiter->task;
13 waiter->task = NULL;
15 if (!savestate)
16 wake_up_process(pendowner);
17 else {
18 smp_mb();
19 if (pendowner->state & ~TASK_RUNNING_MUTEX)
20 wake_up_process_mutex(pendowner);
21 }
22
23 rt_mutex_set_owner(lock, pendowner, RT_MUTEX_OWNER_PENDING);
Lines 10-11
Select a waiter that is directly blocked on this RT-Mutex to become the pending owner. This selection must be integrated with the system scheduling policy. The selected waiter structure is removed from the list of waiters on the RT-Mutex.
Lines 12-13
Get a reference to the pending owner’s task struct, and mark the waiter as not being blocked (blocked within the context of the RT-Mutex framework).
Lines 15-23
Wake-up the pending owner, and set the new owner of the lock to be the pending owner.
25 tracker = waiter->pmgt_tracker;
26 waiter->pmgt_tracker = NULL;
atomic_spin_lock(&tracker->lock);
30 list_del(&tracker->proxy_ent);
atomic_spin_lock(&tracker->tnode->lock);
33 tracker->tnode->tracker = NULL;
34 list_del(&tracker->tnode->tracker_ent);
atomic_spin_unlock(&tracker->tnode->lock);
35 list_del(&tracker->tnode->owner_ent);
36 list_del(&tracker->tnode->via_lock_ent);
37 pmgt_free_node(tracker->tnode);
atomic_spin_unlock(&tracker->lock);
Lines 25-26
Get a reference to the tracker for this pending owner. In this context the owner of the lock is releasing the lock, thus the owner cannot be blocked on any other RT-Mutex. As a result, the tracker for the pending owner should be associated with the owner.
Line 30
Remove the tracker from the owner’s list of trackers.
Lines 33-34
Remove the top-node from the owner’s list of waiters.
Lines 35-37
Finish removal of the top node. Return node to pool of free nodes.
41 list_for_each_entry(node, &lock->pmgt_nodes, via_lock_ent) {
tracker = node->tracker;
43 list_del(&tracker->proxy_ent);
44 list_del(&node->owner_ent);
}
atomic_spin_unlock(¤t->pi_lock);
Line 41
Loop over each relation involving this lock. All relations involving this lock currently will reflect that the owner is the current task.
Lines 43-44
Remove all trackers and nodes from the task releasing the lock.
atomic_spin_lock(&pendowner->pi_lock);
pendowner->pi_blocked_on = NULL;
53 list_for_each_entry_safe(node, t, &lock->pmgt_nodes, via_lock_ent) {
tracker = node->tracker;
atomic_spin_lock(&tracker->lock);
57 if (node->via_task == pendowner) {
58 pmgt_prune_node_chain(tracker, node, pendowner);
59 pmgt_free_node(node);
60 } else {
61 list_add(&node->owner_ent, &pendowner->pmgt_nodes);
62 node->owner_task = pendowner;
}
65 tracker->proxy_task = pendowner;
66 list_add(&tracker->proxy_ent, &pendowner->pmgt_trackers);
atomic_spin_unlock(&tracker->lock);
}
atomic_spin_unlock(&pendowner->pi_lock);
}
Line 53
Loop over all nodes on the lock to setup new relations between waiters on the lock and the pending owner that has been chosen by the releasing task.
Lines 57-59
When the node being considered arrived via the stealing task then a relationship between the waiter associated with the node and stealing task already exists for some other RT-Mutex that the stealing task owns. In this case the chain of nodes for the waiter is pruned of its top node, leaving the node representing the relation between the waiter and the stealing task.
Lines 60-62
If the waiter represented by the node is directly blocked on the RT-Mutex, or arrived via another task, then the node is reused.
Lines 65-66
Finalize the movement by adding the tracker to the pending owner.
When a lock is stolen or released the new owner (or pending owner) will already have relationships established with waiters that are blocked on the lock indirectly through the new owner. For these waiters their tracker is associated with the old owner of the lock being acquired, and thus their top node must be removed. This routine removes the top node from the tracker and update the tracker’s top node with the next node in the lock chain.
static void pmgt_prune_node_chain(struct pmgt_tracker *tracker,
struct pmgt_node *node, struct task_struct *next_owner)
{
atomic_spin_lock(&node->lock);
5 node->tracker = NULL;
6 list_del(&node->tracker_ent);
7 list_del(&node->via_lock_ent);
9 tracker->tnode = list_first_entry(&tracker->nodes, struct pmgt_node,
tracker_ent);
atomic_spin_unlock(&node->lock);
}
Lines 5-9
Mark the node as dead and remove from the linked lists that it is on.
Line 9
Update the new top node of the tracker with the next node in lock chain.
Stealing an RT-Mutex refers to an optimization by which a task may steal ownership of an RT-Mutex from a pending owner. The optimization takes advantage of the scheduling latency between a pending owner being awoken, and it running to acquire an RT-Mutex.
Figure 12: The dashed arrow in the left sub-figure depicts a pending owner. The right side shows the result of T4 stealing L1. On the right, T1 is running to re-block on the lock, depicted by the dashed arrow.
The dashed arrow in the left side of Figure 12 leaving L1 and entering T1 indicates a pending owner relation. There are three possible scenarios that can occur when the system is in a state such as that shown in the left side of Figure 12.
In the first scenario the pending owner T1 runs in its own context to become the full owner of the lock L1. The is the simplist, and most common case.
The second scenario is that a task unrelated to the locking chain such as T4 steals the lock from pending owner T1. The result of this scenario is shown on the right side of Figure 12. The dashed line indicates a blocking relationship between T1 and L1 that is in the process of completing. Recall that when a task is selected to be a pending owner it is woken up. Thus, T1 is running when T4 steals the lock, and T1 will re-block on lock L1.
Figure 13: The dashed arrow in the left sub-figure depicts a pending owner. The right side shows the result of T2 stealing L1. On the right, T1 is running to re-block on the lock, depicted by the dashed arrow.
The third scenario occurs when a waiter on the lock receives a non-mutex wake-up and runs to steal the lock from the pending owner T1. Figure 13 shows the result of this scenario when task T2 runs to steal the lock from task T1.
static inline int try_to_steal_lock(struct rt_mutex *lock,
struct task_struct *task, int mode)
{
struct task_struct *pendowner = rt_mutex_owner(lock);
struct pmgt_tracker *tracker;
struct pmgt_node *node, *t;
8 if (!rt_mutex_owner_pending(lock))
return 0;
11 if (pendowner == task)
return 1;
atomic_spin_lock(&pendowner->pi_lock);
16 if (!lock_is_stealable(task, pendowner, mode)) {
atomic_spin_unlock(&pendowner->pi_lock);
return 0;
}
21 if (likely(!rt_mutex_has_waiters(lock))) {
atomic_spin_unlock(&pendowner->pi_lock);
return 1;
}
Line 8
If this is the real owner then we cannot steal the lock.
Line 11
If the pending owner is the task trying to steal then let it succeed.
Line 16
Make a policy decision about allowing the lock be stolen from the pending owner. This decision must be integrated with the system scheduling policy.
Line 21
If there are no waiters on the RT-Mutex being stolen then there are no other required operations. The lock may be acquired.
26 list_for_each_entry(node, &lock->pmgt_nodes, via_lock_ent) {
27 if (!node->tracker)
continue;
tracker = node->tracker;
30 list_del(&tracker->proxy_ent);
31 list_del(&node->owner_ent);
}
atomic_spin_unlock(&pendowner->pi_lock);
Line 26
Loop over all waiters on the lock and remove them from the pending owner. These would have been originally setup by the owner of this lock when it was being released.
Line 27
Skip dead nodes that will be reclaimed during lock release.
Lines 30-31
Remove node and tracker from the pending owner.
atomic_spin_lock(¤t->pi_lock);
38 list_for_each_entry_safe(node, t, &lock->pmgt_nodes, via_lock_ent) {
39 if (node->tracker == NULL)
continue;
tracker = node->tracker;
atomic_spin_lock(&tracker->lock);
45 if (node->via_task == current) {
46 pmgt_prune_node_chain(tracker, node, current);
47 pmgt_free_node(node);
48 } else {
49 list_add(&node->owner_ent, ¤t->pmgt_nodes);
50 node->owner_task = current;
}
53 tracker->proxy_task = current;
54 list_add(&tracker->proxy_ent, ¤t->pmgt_trackers);
atomic_spin_unlock(&tracker->lock);
}
atomic_spin_unlock(¤t->pi_lock);
return 1;
}
Line 38
Loop over all nodes on the lock to setup new relations between waiters on the lock and the stealing task.
Line 39
Skip dead nodes that will be reclaimed during lock release.
Lines 45-47
When the node being considered arrived via the stealing task then a relationship between the waiter associated with the node and stealing task already exists for some other RT-Mutex that the stealing task owns. In this case the chain of nodes for the waiter is pruned of its top node, leaving the node representing the relation between the waiter and the stealing task.
It is important to node here that the stealing task can be either a waiter on the lock, or another task trying to acquire the lock for the first time. In the former case the stealing task is running even though it has not been chosen as the pending owner. This is a special case because the configuration of the stealing task’s data structures are for that of a blocked task.
To handle this case the stealing task treats its own ‘node’ as any other waiter, and the actual clean-up of this inconsistent state is handled in remove_waiter before returning from the RT-Mutex acquisition routine.
Lines 48-50
If the waiter represented by the node is directly blocked on the RT-Mutex, or arrived via another task, then the node is reused.
Lines 53-54
Finalize the movement by adding the tracker to the stealing task.
There are two instances where a task trying to acquire a lock must explicitly remove itself as a waiter on a lock. The first occurs when a waiter runs unexpectedly to steal a lock from a pending owner. Having not been chosen by the owner of a lock during normal release, the waiter is acquiring will acquire an RT-Mutex while still being a waiter. The second case occurs when a waiter receives a signal or a timeout occurs while acquiring an interruptible mutex. In this case the waiter does not acquire the lock, but must still remove itself as a waiter.
In the later case in which the lock acquisition is aborted the result is a severed lock chain which can occur at an arbitrary location. This requires special a special clean-up routine that updates the Proxy Management data structures to represent the split.
The routine remove_waiter() is called by a waiter in both the case that it steals the RT-Mutex it blocked on, and the case in which it receives a signal or a timeout occurs.
static void remove_waiter(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
unsigned long flags)
{
struct task_struct *owner = rt_mutex_owner(lock);
struct pmgt_tracker *tracker;
atomic_spin_lock(¤t->pi_lock);
9 list_del(&waiter->list_entry);
10 waiter->task = NULL;
11 current->pi_blocked_on = NULL;
tracker = waiter->pmgt_tracker;
waiter->pmgt_tracker = NULL;
atomic_spin_unlock(¤t->pi_lock);
18 pmgt_remove_tracker(tracker, lock);
20 if (owner != current)
21 pmgt_move_waiters(lock);
}
Line 9
The waiter structure is removed from the list of waiters on the lock.
Lines 10-11
Mark this waiter and task as being no longer blocked.
Line 18
Remove the tracker associated with the waiter.
Lines 20-21
If the current task (the waiter) is not the owner then the lock acquisition is occuring because of a signal or timeout. In this case the split can occur at an arbitrary location in a locking chain and requires special clean-up.
Note that when the waiter is the owner (as a result of stealing the lock) then it cannot be blocked on any other RT-Mutex, and thus such a situation will always occur at the head of a locking chain.
The removal of a waiter results in a change the blocking relations involving the waiter and the tasks ahead of the waiter in its lock chain. Since trackers represent proxy relations that affect scheduling state they must be removed. Nodes, however, can be marked as dead, in which case they are ignored by the other Proxy Management code. This is ability is important because it allows us to avoid acquiring a large amount of locks to clean-up all data structures. The routine pmgt_remove_tracker is used to remove a tracker from an arbitrary location, and mark all nodes associated with the waiter as dead.
static void pmgt_remove_tracker(struct pmgt_tracker *tracker,
struct rt_mutex *lock)
{
struct pmgt_node *node, *t;
struct rt_mutex *via_lock;
struct task_struct *proxy_task;
retry:
atomic_spin_lock(&tracker->lock);
11 proxy_task = tracker->proxy_task;
12 if (!atomic_spin_trylock(&proxy_task->pi_lock)) {
atomic_spin_unlock(&tracker->lock);
cpu_relax();
goto retry;
}
18 via_lock = tracker->tnode->via_lock;
19 if (via_lock != lock && !atomic_spin_trylock(&via_lock->wait_lock)) {
atomic_spin_unlock(&proxy_task->pi_lock);
atomic_spin_unlock(&tracker->lock);
goto retry;
}
25 list_del(&tracker->proxy_ent);
27 list_for_each_entry_safe(node, t, &tracker->nodes, tracker_ent) {
atomic_spin_lock(&node->lock);
BUG_ON(node->tracker == NULL);
node->tracker = NULL;
list_del(&node->tracker_ent);
atomic_spin_unlock(&node->lock);
}
if (via_lock != lock)
atomic_spin_unlock(&via_lock->wait_lock);
atomic_spin_unlock(&proxy_task->pi_lock);
atomic_spin_unlock(&tracker->lock);
}
Lines 11-12
Given the tracker being removed we attempt to lock proxy task that the tracker is associated with. Note that it is possible that the proxy task is the owner of the current lock (i.e. at the current location in the chain).
Lines 18-19
The RT-Mutex at the location of the tracker must also be locked. However, if via_lock is equal to the current lock, then we are already holding the relevant lock.
Line 25
Remove the tracker from the proxy it is associated with.
Line 27
Mark all of the nodes associated with the waiter as dead. These will be removed during the next release of the RT-Mutex, and allows the routine to avoid acquiring multiple locks along the lock chain to completely clean-up the nodes.
A waiter that receives a signal or timeout may abort acquisition of an RT-Mutex. In such a situation a locking chain can be ‘cut’ at an arbitrary location. Because the aborting task will become the new proxy of all its waiters (because it is now not blocked), special clean-up is required.
static void pmgt_move_waiters(struct rt_mutex *lock)
{
struct pmgt_node *node, *node2, *t;
struct pmgt_tracker *tracker, *tw;
struct list_head tmp_trackers;
struct rt_mutex *via_lock;
struct task_struct *proxy_task;
9 INIT_LIST_HEAD(&tmp_trackers);
retry:
list_for_each_entry(node, &lock->pmgt_nodes, via_lock_ent) {
atomic_spin_lock(&node->lock);
16 if (!node->tracker) {
atomic_spin_unlock(&node->lock);
continue;
}
21 if (node->via_task != current) {
atomic_spin_unlock(&node->lock);
continue;
}
Line 9
Initialize a list that is used to hold a set of up-chain trackers that are to be moved down-chain to the location of the aborting waiter.
Note that it is important to consider other concurrent operations that may be taking place. The most important being another task walking the lock chain and advancing trackers up-chain while we are trying to move them down-chain. This situation is made safe by two properties. The first is that a task walking the lock chain cannot proceed once it reaches a task that is not blocked. At this point in the execution of the lock abort code path RT-Mutex data structures have been set to make the waiter be considered as not blocked (task->pi_blocked_on = NULL).
The second situation addresses tasks that are walking the lock chain and are already at a location ahead of the current task. A problem will arise if a task tries to advance trackers forward which we are moving backwards. This case is made safe because this routine removes all trackers from up-chain tasks, thereby insuring that no up-chain task can advance a tracker (because they do not exist at up-chain locations).
Line 16
Ignore dead nodes or any concurrent down-chain lock breaks.
Line 21
Ignore waiters that are not on the same lock chain as the aborting task.
25 tracker = node->tracker;
if (!atomic_spin_trylock(&tracker->lock)) {
atomic_spin_unlock(&node->lock);
goto retry;
}
31 proxy_task = tracker->proxy_task;
if (!atomic_spin_trylock(&proxy_task->pi_lock)) {
atomic_spin_unlock(&tracker->lock);
atomic_spin_unlock(&node->lock);
goto retry;
}
38 via_lock = tracker->tnode->via_lock;
if (via_lock != lock && !atomic_spin_trylock(&via_lock->wait_lock)) {
atomic_spin_unlock(&proxy_task->pi_lock);
atomic_spin_unlock(&tracker->lock);
atomic_spin_unlock(&node->lock);
goto retry;
}
Line 25
Acquire the lock on the tracker associated with the node being considered.
Line 31
Acquire the lock of the proxy task at the tracker’s location in the lock chain. Note that the proxy task may be the owner of the current RT-Mutex.
Line 38
Acquire the lock on the RT-Mutex at the tracker’s location. Note that it might be the case that the RT-Mutex is the same as the one the current task is aborting acquisition of.
46 list_del(&tracker->proxy_ent);
48 list_for_each_entry_safe(node2, t, &tracker->nodes, tracker_ent) {
50 if (node2 != node)
atomic_spin_lock(&node2->lock);
53 node2->tracker = NULL;
54 list_del(&node2->tracker_ent);
56 if (node2 != node)
atomic_spin_unlock(&node2->lock);
else
break;
}
Line 46
Remove the tracker from the proxy task at its location.
Line 48
Prune the set of nodes such that all nodes up-chain from the current location are made dead, as these represent stale blocking relations. Note that we are traversing the list from head to tail. That is, starting with a proxy relation and moving backwards towards the waiter in a down-chain location.
Line 50
The node lock for the variable node is already held, thus we do not need to acquire it.
Lines 53-54
Remove the node from the tracker, and mark the node as dead.
Line 56
When the current node is reached then pruning is complete. The next node on the lock chain will be the relation between the waiter aborting acquisition, and another waiter at a down-chain location.
62 tracker->tnode = list_first_entry(&tracker->nodes,
struct pmgt_node, tracker_ent);
65 tracker->proxy_task = current;
66 list_add(&tracker->proxy_ent, &tmp_trackers);
if (via_lock != lock)
atomic_spin_unlock(&via_lock->wait_lock);
atomic_spin_unlock(&proxy_task->pi_lock);
atomic_spin_unlock(&tracker->lock);
atomic_spin_unlock(&node->lock);
}
atomic_spin_lock(¤t->pi_lock);
78 list_for_each_entry_safe(tracker, tw, &tmp_trackers, proxy_ent) {
79 list_del(&tracker->proxy_ent);
80 list_add(&tracker->proxy_ent, ¤t->pmgt_trackers);
}
atomic_spin_unlock(¤t->pi_lock);
}
Line 62
Setup the new top node.
Lines 65-66
Add the tracker to the temporary list.
Lines 78-80
Move all trackers to the current task.
Nodes that are maked dead must be reclaimed. This is performed when an owner releases a lock. All nodes that are marked dead have been removed from the relevant tracker. The only thing remaining is to remove them from the owner’s list and from the lock’s list of nodes. In the context of releasing an RT-Mutex all locks can saftely be acquired.
static void pmgt_clear_dead_nodes(struct rt_mutex *lock)
{
struct pmgt_node *node, *t;
atomic_spin_lock(¤t->pi_lock);
list_for_each_entry_safe(node, t, &lock->pmgt_nodes, via_lock_ent) {
if (node->tracker == NULL) {
list_del(&node->owner_ent);
list_del(&node->via_lock_ent);
pmgt_free_node(node);
}
}
atomic_spin_unlock(¤t->pi_lock);
}