The Proxy Management (PMGT) subsystem is responsible for all accounting required to track: (1) all owners of locks, (2) all waiters on locks, (3) all blocking relations among tasks and locks, and (4) all proxy relations among tasks. A detailed discussion of its design and implementation is provided in the Proxy Managment Section.
This section discusses how the PMGT Scheduling Layer interface is configured to use Group Scheduling (GS) as the Linux system Scheduling Layer (SL) for purposes of integrating concurrency control and and scheudling. The PMGT layer implements its interface to the SL as a set of function pointers that are filled in as appropriate to the SL being used. The KUSP system configuration associates with GS routines enabling the GS layer to make use of the PMGT information in controlling the behavior of the system. The system configuration demonstrating PMGT layer compatibility with default PREEMPT_RT patch semantics associates the pointers with routines that implement the Priority Inheritance semantics of RT-Mutexes under PREEMPT_RT using the PMGT software layer.
The set of routines implementing the PMGT-SL interface can be divided into two broad categories: notification and decision. Routines in both sets are called from the RT-Mutex code as the situations for which they are responsible arise. Notification routines are concerned with notifying the SL when the state or context of a task changes in a way that may affect the control of task execution by the SL. The decision making set are concerned with asking the scheduling layer for decisions related to lock management, including: which waiter to select when a lock is released, or whether a given task should be permitted to steal an RT-Mutex from a pending owner.
This enumeration defines a set of task states used by the GS software in managing tasks as members of groups. The state is stored in the GS member structure. Since tasks may be members of more than one group, state variables in more than one member structure may be involved. In general, the set of task memberships is managed as a group, and so the state variables of each member structure will generally have the same value, except during update periods when some in a set may have been changed, but others have not.
This set of states is a superset of those used by the vanilla Linux system since GS is a user of the PA information and is thus willing to select tasks which are blocked but have proxies, so that the proxy may be run and stop blocking the preferred task.
enum gsched_task_state {
GSCHED_TASK_RUNNABLE
GSCHED_TASK_BLOCKED
GSCHED_TASK_PROXY_SEARCH
GSCHED_TASK_PROXIED
}
GSCHED_TASK_RUNNABLE - A task is runnable. The member is on the task cpu.
is on the task cpu.
of being found.
GSCHED_TASK_PROXIED - A task is blocked and has a proxy.
A task A and its proxy B may be on the same CPU or on different CPUs. When they are on different CPUs, the member structure of A will be transferred to the per-CPU structure of B’s CPU so that if A is selected, B can be run. These state variables are thus important in controlling crucial aspects of GS behavior.
The gsched_member structure holds all information for the member of a group, and is discussed elsewhere. When a group member is a task the gsched member structure’s anonymous union members ‘task’ and ‘group’ are of structure types appropriate to representing tasks and groups. The gsched_member_task structure is the type used to represent tasks as members of a group. The structure gsched_member_task is defined in include/linux/kusp/sched_gsched.h.
struct gsched_member_task {
struct task_struct *ptr;
struct task_struct *proxy;
struct list_head waiter_ent;
struct list_head waiting_members_ent;
int task_cpu;
int proxy_cpu;
enum gsched_task_state state;
enum gsched_task_state blocked_state;
}
It holds proxy information and CPU assignment.
ptr - Pointer to the task struct of the task represented by this member.
member. Proxy must be on the same CPU. NULL when a process is blocked on a lock but we are still searching for the proxy.
field in the pmgt_waiter structure which represents all group memberships of this task. When a task is blocking on a lock the list is protected by the pi_lock of the task. After the proxy has been discovered, the list is protected by the pi_lock of the proxy.
gsched_waiting_members field in task structure of proxy. This list represents all of the tasks waiting on the proxy. Protected by the rq lock of proxy. Contains the default values indicating an empty list if the proxy field is NULL.
task_cpu - The cpu that the task represented by this member is on.
proxy_cpu - The cpu of the proxy. -1 if proxy is NULL
state - One of the values from gsched_task_state.
and then blocks. The blocked state can never be GSCHED_TASK_RUNNABLE. If the state of a task is not GSCHED_TASK_RUNNABLE then the state and blocked_state will have the same value. The reason for this is in the implementation of proxy accounting there are some periods during which the state of the thread is blocked but it must continue running to complete some of the accounting code.
The pmgt_waiter structure is used by the PAL layer to represent a task waiting on a lock. The primary discussion of its is in the PA documentation, here. The pmgt_waiter structure is created and associated with a proxy task when a task blocks on a lock. There can be several steps involved in identifying the Proxy of the blocking task, so the pmgt_waiter structure is associated with tasks along the chain from the point at which a new waiter joins a set to the end where it finds its proxy.
When GS is the scheduling layer above PMGT, it stores the group memberships of a task on a gsched_task_members list in the pmgt_waiter structure. When the PI PREEMPT_RT scheduling layer is using PMGT, then a pi_waiters_ent field is used to keep the waiter on a queue sorted by priority.
The elements of the gsched_task_members list are removed when the task is no longer waiting on a lock and thus waiting on a proxy. Elements of the list are also removed when the waiting task represented by the pmgt_waiter structure is removed from a group, and thus no longer a member of it. An element is added to the list if the waiting task is added to a group while it is a waiter.
The gsched_waiting_members field in the task_struct structure representing task T is the root of a list which represents the tasks for which T is a proxy. The task structure is defined in include/linux/sched.h. The elements of the list are instances of the structure gsched_member which represents members of a group under GS, and is defined in include/linux/kusp/sched_gsched.h. Since the waiters are necessarily tasks, the member structures use the gsched_member_task version of the union in the gsched_member struct. Within that union containing information specific to tasks, the waiting_members_ent field is used to keep entries on the gsched_waiting_members list.
List Root: task_struct->gsched_waiting_members
List Element: gsched_member->task->waiting_members_ent
Use: If a task T is a proxy for one or more tasks, then this list holds all of the gsched_member structures for the tasks waiting on T. Otherwise, it is empty.
CC: rq lock of CPU to which the task is assigned.
The group_memberships field in the task_struct structure represents the memberships of the task in one or more GS groups. The task structure is defined in include/linux/sched.h. The elements of the list are instances of the structure gsched_member which represents members of a group under GS, and is defined in include/linux/kusp/sched_gsched.h. The list uses the samecomputation_entry member of the structure to form the list.
Note
The samecomputation_entry field is also used for groups which are members of a group. When the entity is a group, the list is rooted in the gshed_groups structure, defined in include/linux/kusp/sched_gsched.h, using the memberships field to root the list.
Root: task_struct->gsched_memberships
List Element: gsched_member->samecomputation_entry
CC: rq lock of CPU to which the task is assigned.
This list holds the same information as gsched_memberships, but it rooted in the pmgt_waiter structure, and uses the waiter_ent element of the list to form the list of all of a waiting task’s group memberships. The reason we create a “duplicate” list is that the concurrency control can be differnt, and less restrictive. The gsched_memberships list rooted in the task structure is protected with the run queue lock for the CPU to which the task is assigned. This obviously restricts concurrency for all tasks on the CPU. In contrast, this list is protected by the task_struct->pi_lock of the waiting task at first, and then of the proxy task when it is identified. This obviously restricts concurrency only for tasks accessing the infomation associted with a single task, and is much less restrictive.
Root: pmgt_waiter->gsched_task_members
List Element: gsched_member_task->waiter_ent
CC: Initially protected by task_struct->pi_lock of the task represented by the pmgt_waiter. When a proxy has been found, then it is protected by pi_lock of the proxy, pmgt_waiter->proxy_task->pi_lock.
The Proxy Management (PMGT) layer implementation assumes that some scheduling layer will exist in the system and use the proxy data being managed. The interface between the scheduling layer (SL) and PMGT is implemented as a set of hooks, divided into two subsets, those notifying the SL of information it may care about, and those requiring a decision from the SL.
The notification set are managed using the pmgt_sched structure, while the decision set are managed using the pmgt_arbiter structure. Both structures are defined in include/linux/rtmutex.h. The pmgt_sched structure represents the full set of PMGT SL interface functions since it contains the pointer field arbiter which refers to an instance of the pmgt_arbiter structure.
Note
This is a bit of a tangle. The text here would be easier to follow if it included soem mention of where all these routines and structures are defined. The players include: kusp/gsched_core.c, kernel/rtmutex_proxy.c, kernel/pmgt_sched.c, include/linux/rtmutex.h, linux/kusp/sched_gsched.h
These hooks are called when the locking code must interact with the SL because the proxy code is in a situation where scheduling semantics influence behaviour. One of the obvious examples of such a hook is when a lock is released and a pending owner should be selected. Obviously the choice of pending owner is determined by the current scheduling semantics, and is beyond the jurisdiction of the PMGT layer. Other examples include when the lock code must ask the SL if a lock can be stolen from its pending owner by another task, or when a task has stopped being a waiter on a lock and the SL must be notified that the set of waiters for a given proxy task has changed.
These functions are prefixed with gsched_pmgt.
Note
FIXME.D: This appears to have changed into task_init and waiter_init. Check with Tyrian on this.
Called when proxy management is creating a new waiter because a task is about to block on a lock.
Walk gsched_memberships list in task structure of blocking task to find all its group memberships in order to build the parallel list of group memberships rooted at gsched_task_members in pmgt_waiter structure with each member of the list using waiter_ent of the gsched_member_task. The purpose of the parallel data structure is to shift the concurrency control domain from the rq lock required for gsched_memberships to the pi_lock required for pmgt_waiter.
Called when a task was a waiter on a lock but is becoming the owner or aborting the wait. We are thus destroying any record of the task as a waiter.
Usually, the task represented by a waiter will be GSCHED_TASK_RUNNABLE whenever a waiter is destroyed. However, it is possible for a GS SDF to modify these semantics. Therefore, GSCHED_TASK_PROXY_SEARCH and GSCHED_TASK_PROXIED are also valid states. For example, on lock release, guided execution does not permit wakeups for the pending owner because it waits for the releasing thread to reach the next waypoint in userspace. Therefore, the pending owner could have its waiter destroyed while it is GSCHED_TASK_PROXY_SEARCH or GSCHED_TASK_PROXIED.
The blocked state of the task having its waiter destroyed will be set to GSCHED_TASK_BLOCKED, regardless of which state the task is currently in. This indicates to GS that the task is no longer blocked on a lock. Additionally, if the task is not currently runnable then the state will be set to the blocked_state.
Under normal circumstances, the task having its waiter destroyed will be GSCHED_TASK_RUNNABLE. However, this is not always the case when the guided execution programming model is being used and its SDF is controlling a lock and the waiters on the lock. When a lock is released, guided execution does not allow the pending owner of the lock to be woken up immediately to prevent tasks controlled by the linux scheduling class from violating the guided execution schedule. Therefore, the pending owner will not immediately be runnable and it could be in the GSCHED_TASK_PROXIED state.
Walk gsched_task_members list in pmgt_waiter structure.
remove each member from gsched_task_members
waiter is going to be destroyedmove member back to the cpu of the task
task no longer has a proxyremove member from gsched_waiting_members of its former proxy
member is not waiting on another task anymore
Called when the proxy field for a waiter is being temporarily cleared because a new proxy is being found. The typical example of this is when chain 1 is joining chain 2, and thus the proxy of the already blocked tasks in chain 1 will be changing.
Requires old proxy->pi_lock held by calling context
Acquires rq lock for old proxy
To access gsched_waiting_members list of the old proxy in order to remove the memberships of a task T in chain 1 which will be getting a new proxy. We step through the old proxy’s list removing all memberships representing T.
Walk gsched_task_members list in pmgt_waiter protected by pi_lock of old proxy
Clear the proxy for the member(erase old proxy)
Do not move the member back to the cpu of the task - optimization
if task is blocked then member remains on the cpu of the old proxy
a new proxy will be found shortly and member will be moved to the cpu of the new proxyif task is runnable then members will be moved to the task cpu by separate code in gsched_enqueue_task when the task is placed on the runqueue
The blocked state of the waiting task is set to GSCHED_TASK_PROXY_SEARCH. If a task is not runnable then its state is set to its blocked state.
The scheduling class of the task the waiter is being removed from is not changed
Class switch occurs after all waiters have been prepared for movement
Invoked later by pmgt code
Optimization for chain walking
gsched_pmgt_move_prepare is called for all of the members waiting on the task T that is the proxy for chain 1. When T is blocking on a member of chain 2, all of T’s waiting members need to be moved to become waiters of chain 2’s proxy.
Called when a proxy is being set for a waiter because its proxy has been identified
Paired with gsched_pmgt_move_prepare. gsched_pmgt_move_prepare must be called previous to gsched_pmgt_move. Generally, this is done earlier in the process in a different concurrency context.
Move should be called on tasks that are either GSCHED_TASK_RUNNABLE or GSCHED_TASK_PROXY_SEARCH. In the case of GSCHED_TASK_RUNNABLE, the most common case is the running end of chain 1 that is joining chain 2. This task will become GSCHED_TASK_PROXIED eventually, but outside the context of the call to move. A less common possibility is a task that was previously GSCHED_TASK_PROXIED may receive a signal or timeout and thus be in GSCHED_TASK_RUNNABLE state. In these situations, the task will leave the rt mutex code and destroy its waiter structure. However, it may then immediately re-enter the locking code and try to reacquire the same lock. Rarely, a task will receive a wakeup that does not correspond to a signal or a timeout. In this case, a task will not leave the locking code and will immediately re-block.
If a task is GSCHED_TASK_PROXY_SEARCH then the search is over at the time move is called becase we have found its proxy. The task is blocked already and becomes GSCHED_TASK_PROXIED as part of what move does.
The blocked state of the waiting task is always set to GSCHED_TASK_PROXIED and its proxy task pointer is set because a proxy is being assigned. If a task is not runnable then its state is set to its blocked state.
than they are on Must acquire group->lock to change CPU assignment of members
call sdf->move_member
Add members to the gsched_waiting_members list on the new proxy so that group scheduling knows that the proxy has more waiting members.
Must acquire rq_lock for proxy to access gsched_waiting_members
Optimization for chain walking
gsched_pmgt_move is only called when all wwaiters have reached the end of the chain and the new proxy has been found
Place a task in the correct scheduling class. Tasks that are under group scheduling retain memory of wanting to be in the KUSP scheduling class even when they are moved to the linux class because they have linux waiters. This enables the code to note when a task wanting to be in the KUSP class no longer has linux waiters and can thus be moved back to the KUSP class.
If a task has linux waiters
if the task is in KUSP scheudling class
move it to CFS or RT based on its prioIf a task does not have linux waiters
If the task wants to be in the KUSP scheduling class
move it to the KUSP scheudilng class
Perform same operations as gsched_pmgt_move_prepare
Additionally, perform gsched_pmgt_switch_class.
Used when class switch operation cannot be performed at a later time.
Reset must be used if waiters are being moved from multiple tasks at the same time as opposed to moving waiters from a single task.
pmgt_remove_waiters
Less efficient than the gsched_pmgt_move_prepare and gsched_pmgt_switch_class pair
More class switch checks are performed
When scheduling code must influence locking data structures.
Members are added or removed from the pmgt_waiter’s gsched_task_members list when they are added and removed from the task represented by the waiter. This can occur when an executive task is influencing group structure as it runs in its own context.
These functions are prefixed with gsched_pmgt.
This is a wrapper function which unifies the common code required in several contexts. Acquiring the locks needed to access the waiter for a task is part of establishing the necessary context. A function pointer(add_member, rem_member) is one of the arguments of this function and the referenced function is called when the proper context has been established.
task->pi_lock
The task struct uses pi_blocked_on to refer to the rt_mutex_waiter structure representing that task as a waiter in the PI code. Access to this is controlled with the pi_lock. Proxy currently creates a sibling data structure pmgt_waiter to hold proxy specific code. The rt_mutex_waiter structure uses the pmgt_waiter field to refer to the pmgt_waiter structure.pmgt_waiter->lock
Controls read access required by transition_member find the proxy task by reading the proxy_task field in the pmgt_waiter structure.pmgt_waiter->tnode->via_lock->wait_lock
FIXME: Analysis of how tnode is used and some policy statements of what it is used for are pending.
Put member on gsched_task_members list in pmgt_waiter. An example of when this is called is when a task is given a new membership.
Acquire rq_lock for proxy to access gsched_waiting_members list to add the new member structure and to switch scheduling class as required.
Place member struct on gsched_waiting_members list of the proxy. This may require moving the member structure to the per-cpu area of the proxy’s cpu if the proxy is remote. To do this, it calls gsched_pmgt_move_member.
Switch proxy to appropriate scheduling class
gsched_pmgt_switch_class
Remove member from gsched_task_members list on pmgt_waiter. An example of this is when a task is removed from a group.
Acquire rq_lock for proxy to access gsched_waiting_members list to remove the member structure and to switch scheduling class
Remove member from gsched_waiting_members
Do not need to move member back to CPU of its task since we are removing the membership.Switch proxy to appriprate scheduling class
gsched_pmgt_switch_class
These functions are prefixed with gsched and use the proxy related information that is maintained by other parts of this subsystem.
Find the cpu of a group scheduling task member
if the task is GSCHED_TASK_RUNNABLE or GSCHED_TASK_BLOCKED
the cpu is the cpu of the taskif the task is GSCHED_TASK_PROXY_SEARCH or GSCHED_TASK_PROXIED
the cpu is the cpu of the proxy
Hierarchy is evaluated by a particular CPU to make a scheduling decision.
If a task is chosen then we need to consider its proxy, if it has one
task_cpu, proxy_cpu, proxy, state fields in gsched_member_task structure are protected by group->lock
check if we need to run the proxy
do not run proxy if task is runnable
Special case: Generally, a runnable task should not have a proxy but
A task with a proxy can wake up and start running. So for some set of instructions inside the locking code, it will still have a proxy noted but that will be removed in the course of emerging from the locking code.
a task may be awakened and be executing in the locking code while still having a proxy designated even though it may decide to reblock. In this case, the proxy remains the same.
do not run proxy if the executing cpu is not the same as proxy_cpu
do not run proxy if proxy is not runnable
if there is no proxy chosen
run task if the task is on this cpu and on the runqueue
Called when a task is placed on a runqueue.
This happens under several different circumstances. First, when a task blocks it is removed from a runqueue and it when it unblocks it is replaced. Second, when task affinity is changed it is moved from the runqueue of one cpu to the runqueue of another. Third, when a task is first created, it is added to a runqueue. Fourth, when scheduling class is changed, when scheduling class is changed the task must be moved from one section of the runqueue to another and removing and replacing the task on the same runqueue is part of this process.
if a task is waking up and has a proxy
move members from the proxy cpu to the cpu of the task
special case
task is being placed on the runqueue but not due to a wakeup
occurs when changing cpus or switching scheduling classes
thread is in atomic context when these oprations are being performed. In this situation, not only is preemption disabled as indicated by atomic context but itnerrupts are also disabled. For example, when changing scheduling classes the task spends a period in atomic context.
When changing scheduling classes, members should already be on the cpu of the task because when changing scheduling classes the task is being removed and replaced on the same runqueue.
When changing CPU affinity, the task doing the changing is also in atomic context but the memberships of the task whose affinity is being changed would have to move from one CPU to another.
if a task switches CPUs
move members from old task cpu to new task cpu
if a task is a proxy and switches cpus
move the members that are waiting on it to the new CPU of the proxy
waiting tasks are recorded in gsched_waiting_members of the proxy task which is the task being enqueued
step down gsched_waiting_members and move all elementsprotected by rq_lock which is held by calling context
Called when a task T is taken off of a runqueue.
Task T may dequeue itself in schedule() or when it changes its own scheduling class or CPU affinity. Task T may also be dequeued by task R if R is changing the CPU affinity or scheduling class of T.
if task T is going to sleep
Task T will always be the task executing the dequeue
set the state of all of the memberships of task T to be equal to its blocked state
if task T has a proxy
move task T’s memberships to the cpu of the proxy
Task T is being taken off of the runqueue but it is not going to sleep
occurs when changing cpus and switching scheduling classes
Task T or Task R may be executing
The thread executing the dequeue is in atomic context with interrupts disabled. It will continue to run on the current cpu until task T is placed back on the runqueue. Additionally, the cpu associated with the runqueue cannot make any scheduling decisions until T has been placed back on the runqueue because the lock for the runqueue is held for the duration of the operation.
If task T is executing the dequeue then task T will run for a short period of time while it is not on the runqueue.
If task R is executing the dequeue then Task T will be taken off of the runqueue temporarily. Task T may or may not be running during this period of time. However, GS does not need to change the state of task T’s memberships because the cpu of the runqueue(s) involved cannot make a scheduling decision until task R has finished changing the cpu or scheduling class of task T. Task T remains runnable because no scheduling decision can be made for this duration and T will be runnable afterwards.
Special case
If a task is GSCHED_TASK_RUNNABLE then we do not need to move its memberships to the proxy because there is no proxy or any indicated proxy relationship is about to be eliminated. This case can arise most commonly because R is simply changing the assignment of a running task or a task that did have a proxy has aborted or been spontaneously awakened.
Some scenarios requiring GS and/or proxy actions are not related to any lock actions. Tasks have CPU affinity and scheduling class properties that are of concern to the proxy accounting and scheduling subsystems, and which may change due to actions outside the context of a thread executing lock code.
It is important to remember that any Task A, with adequate priveledges, can change the CPU affinity or scheduling class of any Task B.
When a task is running and changes its own CPU affinity, its GS member structures must be moved from the per-CPU data structure associated with its current CPU to that associated with its destination CPU.
When a task is either running or blocked, and an executive Task A changes the CPU affinity of Task B, the GS member structures representing B must be moved from the data structures associated with its original CPU to those associated with its destination CPU.
When tasks are block on a lock, they have a designated Proxy process, except during short periods when the PRoxy is being determined or adjusted. The Proxy task is one not currently blocked on a lock, although it may block for other reasons in the normal course of execution.
When a Task B that is a Proxy for one or more tasks changes its own CPU affinity, or if another Task A changes the CPU affinity of Task B, then the GS memberships structures for all tasks for which Task B is proxy must be moved from data structures associated with the old CPU of Task B to the data structures associated with the new CPU affinity of Task B. Note that these group member structures are “remote” in the sense that they are on data structures associated with a CPU that can be different from that of the CPU affinity of the task they represent. This is permitted because while they can be chosen by the scheduler for a CPU on which they cannot execute, their proxy on the relevant CPU will be the task given execution context, not the blocked task.
By
Move all memberships waiting on the proxy to the new cpu of the proxy
only move memberships if the task represented by the membership is not running
special case for timeout, signal, stray wakeup
These scenarios involve tasks and locks in some way. The most obvious kind of situation is, of course, when a task is newly blocking on a lock and thus joining a set of waiting tasks. In this case, the proxy for the task must be found, and the data structures representing its direct and indirect blocking relations must be updated. This can be a farily complex process of the newly blocking task is itself a proxy for other blocked tasks. The other most obvious case is when a lock is being released by the current proxy and a new owner for the lock, and a new set of proxy relationships, must be established. More subtle scenarios can also occur. A blocked task may receive a signal or unblock for another reason, requiring the data structures representing the blocking relations and identity of the proxy tasks to be updated.
At the most basic level the accounting for these scenarios is handled by the PMGT layer, and is of limited concern to the SL of the system. However, the SL clearly needs to know about these status changes so it can take whatever measures are deemed appropriate. In the case of GS, we want to make every task in in the GSCHED_TASK_RUNNABLE and GSCHED_TASK_PROXIED states be selectable at scheduling time, because in both cases there is a clearly identified task to run.
Complicating factors relevant at the SL level which are not a concern at the PMGT level, however, are the CPU affinity and scheduling class of the tasks involved in a given scenario. Each task can only run on the CPU for which it currently has affinity, but a task and its proxy may have different CPU affinities. The scheduling classes of the set of tasks involved affect specific actions taken by the GS layer to perform its duties, and the waiting tasks and their proxy can be managed under an arbitrary mix of scheduling classes.
These complicating factors add configuration dimensions to set of locking scenarios used to evaluate the correctness and completeness of the PMGT implementation. By adding these dimensions, we effectively convert each scenario at the PMGT level into several scenarios at the GS level. In addition, concerns at the GS level which are not relevant at the PMGT level make it necessary to create some additional senarios.
The CPU affinity dimension has two values: same-cpu and different-cpu. In general, these values describe whether a task and its proxy have the saem CPU affinity or different affinity. This matters because it affects where the GS_MEMBER structure for the waiting task should be placed, since it must be selectable when the proxy’s scheduling code is executed. Remember that the scheduling code is shared among all CPUs in a system, but the data structures used by that code are segregated by CPU. This means that the GS_MEMBER structure for a waiting task must be in the per-cpu data structures for the CPU of the proxy in order for it to be selectable by the CPU which can run the proxy task.
The scheduling class of the tasks involved are relevant because the run queues are segregated by scheduling class, GS has its own class, and proxy relations across scheduling class boundaries are possible. In the context of these tests, we really only care about the “Linux” scheduling class, technically known as SCHED_OTHER, and the GS scheduling class. For purposes of discussing the various scenarios we will refer to the L and G scheduling classes of the tasks.
While the lock scenarios described here are closely related to the locking scenarios used to test PMGT, there are important differences arising from the fact that the PMGT layer is only concerned with the accounting required to track task blocking relations and to identify proxy tasks. In contrast the SL in the system, in this case GS, is responsible for controlling the execution of the tasks using the information provided by PMGT as well as other task state information. As a result, the set of testing scenarios must be understood in a more complex way when considering the GS scheduling layer than was sufficent for the tests concentrating on the actions of the PMGT layer.
We have identified four broad categories of locking scenarios from the GS perspective:
- First Waiter Arrives, Inducing Proxy
- Proxy Static, but Waiter Set Changes
- Proxy and Waiter Set Static, but Thread State Changes
- Proxy Releases Lock
We will still use the previous number scheme denoting the number of tasks, the number of locks, and the number of the scenario as the basic designation for a particular test. In addition, however, we will also not additional parameters that specificy aspects of the sceanrio that are significant for the GS layer. Each basic scenario identified under the basic numbering scheme used for the PMGT test will thus potentially become several tests at the GS layer. Most of the PMGT sceanrios are reelvant, but not all, so the relationship between PMGT and GS level tests is somewhat more complex than it may first appear.
This covers a variety of scenarios related to establishing the first waiting relationship for a task that thus becomes a proxy. The scheduling classes of the new waiter and the proxy are relevant because some sets of circumstances require a task in the G class to temporarily be added to the L class.
Dimensions
Proxy Scheduling Class: L/G
Waiter Scheduling Class: L/G
CPU Affinity: waiter and proxy have the Same (S) or Different (D) CPU affinity
Scenarios: 2.1.1, 2.1.4, 2.1.8
Combinations: LLS, LGS, GLS, GGS, LLD, LGD, GLD, GGD
This covers a variety of scenarios related to changes in the membership of the existing set of waiters on a proxy. Such a change can occur because a new waiter is added to the set due to a task blocking on a lock, or because an existing waiter is removed due to a waiter aborting its wait on a lock.
Dimensions
Scenarios:
Combinations: L->L, L->G, G->L, and G->G
This covers a variety of scenarios that fall into two broad categories: changes to proxy state and changes to waiter state.
Dimensions
Scheduling Class Change: L->G (LG) or G->L (GL)
Scheduling Parameter Change: should be irrelevant
Scenarios:
Combinations: LGS, GLS, LGD, GLD
This covers a variety of scenarios that are distinguished form each other in a number of ways. The releasing task was the proxy, so a new proxy will have to be chosen. The releasing task can be called the old proxy and the task chosen as its successor can be called the new proxy. The set of waiters is generally reduced by one, since one of the waiters is chosen as the new proxy. The set of waiters might also
Dimensions
Old Proxy: Real and effective domain LG, LL, GG, GL???
New Proxy: L/G
Scenarios:
Combinations:
Figure BS 2.1.1: Blocking Scenarios in the 2.1.1 Set
Two phases
setup
proxy relationship with T1 created
T2 is executing
block
T2 is no longer runnable
gg.s
setup
GS records that T1 is the proxy of T2
gg.d
setup
GS records that T1 is the proxy of T2
block
move T2 memberships to cpu of T1
gl.s
setup
GS records that T1 is the proxy of T2
gl.d
setup
GS records that T1 is the proxy of T2
block
move T2 memberships to cpu of T1
lg.s
setup
change T1 to be in linux class
lg.d
setup
change T1 to be in linux class
Figure BS 2.1.2: Blocking Scenarios in the 2.1.2 Set
T1 is executing
T2 is blocked
gg.s
GS clears the proxy of T2
gg.d
GS clears the proxy of T2
move T2 memberships to cpu of T2
gl.s
GS clears the proxy of T2
gl.d
GS clears the proxy of T2
move T2 memberships to cpu of T2
lg.s
change T1 to be in GS class
lg.d
change T1 to be in GS class
Figure BS 2.1.6: Blocking Scenarios in the 2.1.6 Set
Two phases
run
T1 starts to runremove
T1 removes its waiter
gg.s
remove
GS clears the proxy of T1
gg.d
run
move T1 memberships to cpu of T1
remove
GS clears the proxy of T1
gl.s
remove
GS clears the proxy of T1
gl.d
run
move T1 memberships to cpu of T1
remove
GS clears the proxy of T1
lg.s
remove
change T2 to be in linux class
lg.d
remove
change T2 to be in linux class
Figure BS 2.1.7: Blocking Scenarios in the 2.1.7 Set
Two phases
run
T1 starts to run
- steal
- T1 steals the lock
gg.s
steal
GS clears the proxy of T1
gg.d
steal
GS clears the proxy of T1
run
move T1 memberships to cpu of T1
gl.s
steal
GS clears the proxy of T1
gl.d
run
move T1 memberships to cpu of T1
steal
GS clears the proxy of T1
lg.s
steal
change T2 to be in GS class
lg.d
steal
change T2 to be in GS class
Figure BS 3.1.1: Blocking Scenarios in the 3.1.1 Set
Two phases
setup
set new proxy for waitersblock
locking task blocks on a lock
setup
if T3 in GS
GS records that T1 is the proxy for T3if T3 in linux, T2 in GS, and T1 in GS
put T1 in linux class
FIXME: The class of T2 is not represented.
block
if T3 in GS
move T3 memberships to cpu of T1
Figure BS 3.1.2: Blocking Scenarios in the 3.1.2 Set
if T2 in GS
GS clears the proxy of T2
move T2 memberships to cpu of T2
if T3 in GS
GS temporarily clear proxy for T3
GS set T2 to be proxy for T3
move T3 memberships to cpu of T2
FIXME: The cpu of T3 is not represented.
if T3 in Linux and T2 in GS
put T2 in Linux class
FIXME: The class of T3 is not represented.
if T2 or T3 is in linux and T1 wants to be in GS but is in linux because of T2 or T3
put T1 in GS class
Figure BS 3.1.4: Blocking Scenarios in the 3.1.4 Set
if T3 in GS
GS temporarily clears proxy of T3
GS records that T1 is now the proxy of T3
move T3 memberships from the cpu of T2 to the cpu of T1
FIXME: The cpu of T3 is not represented.
if T3 in linux and T2 wants to be in GS but was in linux because of T3
put T2 in GS class
if T3 in Linux and T1 in GS
put T1 in Linux Class
FIXME: The class of T3 is not represented.
Figure BS 3.1.5: Blocking Scenarios in the 3.1.5 Set
if T3 in GS
GS temporarily clears proxy for T3
GS records that T1 is now the proxy of T3
GS moves T3 memberships to cpu of T1
FIXME: The cpu of T3 is not represented
if T1 in GS
GS moves T1 memberships to cpu of T1
GS clears proxy of T1
if T3 or T1 in linux and T2 wants to be in GS but was in linux because of T3 or T1
put T2 in GS class
if T3 in Linux and T1 in GS
put T1 in Linux Class
FIXME: The class of T3 is not represented
Figure BS 3.2.1: Blocking Scenarios in the 3.2.1 Set
Three phases
reset
clear old proxy of waiterssetup
set new proxy for waitersblock
locking task blocks on a lock
reset
if T3 in GS
GS clears proxy for T3
setup
if T3 in GS
GS records that T1 is the proxy for T3
move T3 memberships from cpu of T2 to cpu of T1
if T3 in linux and T2 wants to be in GS but was in linux because of T3
move T2 from linux class to GS class
FIXME: Class of T3 is not represented
if T2 in GS
GS records that T1 is the proxy for T2if T2 or T3 in linux and T1 in GS
put T1 in linux class
FIXME: Class of T3 is not represented
block
if T2 in GS
move T2 memberships to cpu of T1
Figure BS 3.2.2: Blocking Scenarios in the 3.2.2 Set
if T2 in GS
GS clears proxy of T2
move T2 memberships to cpu of T2
if T3 in GS
GS temporarily clears proxy of T3
GS records that T2 is the proxy for T3
move T3 memberships from cpu of T1 to cpu of T2
FIXME: The cpu of T3 is not represented
if T3 in linux and T2 in GS
Put T2 in linux
FIXME: The class of T3 is not represented
if T1 wants to be in GS but is currently in linux because T3 or T2 was in linux
put T1 in GS
Figure BS 3.2.3: Blocking Scenarios in the 3.2.3 Set
Two phases
setup
set new proxy for waitersblock
locking task blocks on a lock
setup
if T1 in GS
GS records that T2 is the proxy for T1if T1 in linux, T3 in GS, and T2 in GS
put T2 in linux class
FIXME: The class of T3 is not represented.
block
if T1 in GS
move T1 memberships to cpu of T2
Figure BS 3.2.4: Blocking Scenarios in the 3.2.4 Set
FIXME: Class of T3 is not represented
if T3 in GS
clear proxy of T3
move memberships of T3 from the cpu of T2 to the cpu of T1
FIXME: CPU of T3 is not represented
if T3 in linux
if T2 wants to be in GS but is in linux because of T3
put T2 in GSif T1 in GS
put T1 in linux
Figure BS 3.2.7: Blocking Scenarios in the 3.2.7 Set
T1 becomes runnable, even though it is a waiter(signal, timeout, etc)
move T1 memberships to cpu of T1 because T1 is runnable
if T1 in GS
GS clears proxy of T1
if T3 in GS
GS clears proxy of T3
move T3 memberships from the cpu of T2 to the cpu of T1
FIXME: Cpu of T3 is not represented
if T1 or T3 in linux, and T2 wants to be in GS but is currently in linux due to T1 or T3
T1 is no longer waiting on T2 and T3 is now waiting on T1
move T2 to GS
FIXME: Class of T3 is not represented
if T3 in linux and T1 in GS
move T1 to linux
FIXME: Class of T3 is not represented
Figure BS 3.2.8: Blocking Scenarios in the 3.2.8 Set
if T1 in GS
GS clears proxy for T1
move T1 memberships to cpu of T1
if T1 in linux, T3 in GS, and T2 wants to be in GS but is currently in linux due to T1
put T2 in GS class
Figure BS 3.2.10: Blocking Scenarios in the 3.2.10 Set
if T2 in GS
GS records that T1 is the proxy for T2
move T2 memberships to cpu of T1
if T2 in linux, T3 in GS, T1 in GS
put T1 in linux
FIXME: Class of T3 is not represented
Figure BS 3.2.11: Blocking Scenarios in the 3.2.11 Set
Two phases
abort
T2 aborts its attempt to lock L1exit
T2 thread ends
abort
if T2 in GS
GS clears proxy for T2
move memberships of T2 from cpu of T1 to cpu of T2
if T3 in GS
GS temporarily clears proxy for T3
GS sets T2 to be the proxy for T3
move memberships of T3 from cpu of T1 to cpu of T2
FIXME: CPU and class of T3 is not represented
if T3 in linux and T2 in GS
put T2 in linux
exit
futex implementation unlocks L2
if T3 in GS
GS clears the proxy of T3
move memberships of T3 from cpu of T2 to cpu of T3
FIXME: CPU and class of T3 notn represented