Kernel Testing

Configurations

Add versions of these configurations for each of the patches listed below.

  • minimal_smp.config
  • minimal_up.config

Patches

P1. 2.6.31.6-rt19 - vanilla 2.6.31.6 with the rt19 patch applied

P2. 2.6.31.6-rt19-pm1 - Noah’s proxy patch 12/09 on top of rt19

P3. 2.6.31.6-rt19-pm2 - Noah’s proxy patch 2/10 on top of rt19

P4. 2.6.31.6-rt19-cd - rt19 with CCSM and Datastreams

P5. 2.6.31.6-rt19-cd-pm1 - proxy on top of 2.6.31.6-rt19-kusp-cd

P6. 2.6.31.6-rt19-gsched - Group scheduling on top of 2.6.31.6-rt19-cd-pm1 with no exclusive control. This branch exists as a common point of change for the gsched-se and gsched-ce branches, which makes it easier to merge changes not related to exclusive control. It is not intended to be a production branch.

P7. 2.6.31.6-rt19-gsched-se - simple exclusive control on top of 2.6.31.6-rt19-gsched

P8. 2.6.31.6-rt19-gsched-ce - complete exclusive control on top of 2.6.31.6-rt19-gsched

P9. 2.6.31.6-rt19-kusp - complete KUSP code for public consumption

Available Tests

T1: CCSM regression tests
The tests need to be increased in scale to use a larger number of sets and a larger number of members per set. We assume that we can write these large-scale tests in Python and that they do not have to mean anything; they just have to create a lot of sets with a lot of members. One constraint is that we have to keep track of what we are doing well enough to do sanity checks on set membership and know what the answer should be.
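A minimal sketch of such a large-scale test follows. The ccsm module and its create_set, add_member, and members calls are hypothetical placeholders for whatever the real user-level CCSM interface provides, not the actual API; the point is the bookkeeping that lets the final sanity check know what the correct answer should be.

    #!/usr/bin/env python
    # Large-scale CCSM regression sketch.  The "ccsm" module and its
    # create_set()/add_member()/members() calls are hypothetical placeholders
    # for the real user-level CCSM API; the point here is the bookkeeping.
    import sys

    NUM_SETS = 1000          # scale knobs: many sets ...
    MEMBERS_PER_SET = 100    # ... each with many members

    def run(ccsm):
        expected = {}                             # our own record of what we did
        for s in range(NUM_SETS):
            name = "regress_set_%d" % s
            ccsm.create_set(name)                 # hypothetical call
            expected[name] = set()
            for m in range(MEMBERS_PER_SET):
                member = "member_%d_%d" % (s, m)
                ccsm.add_member(name, member)     # hypothetical call
                expected[name].add(member)

        # Sanity check: CCSM's view of every set must match our bookkeeping.
        failures = 0
        for name, members in expected.items():
            actual = set(ccsm.members(name))      # hypothetical call
            if actual != members:
                failures += 1
                sys.stderr.write("set %s: expected %d members, got %d\n"
                                 % (name, len(members), len(actual)))
        return failures

    if __name__ == "__main__":
        import ccsm                               # hypothetical Python binding
        sys.exit(1 if run(ccsm) else 0)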
T2: Discovery regression tests (Bala)
Need automation to ease execution. The examples we currently have of running traceme on Totem and Transmission serve as good tests.
T3: Proxy testing under load

  • Bash script to do kernel compiles, to be provided by Jared.
  • Port the counters from 2.6.29 (Jared). The goal is to count various aspects of proxy use, including the total number of lock, unlock, and steal operations and, if that is easy to do, the total number of waiting chain length instances.

We have a version that may be able to do this with or without stealing, but we put it in 2.6.31.6-rt19-cd-pm1. The rtmutex debugging code does not compile with proxy, but we may be able to adapt it for use with proxy.

  • DEBUG_RT_MUTEXES - RT Mutex debugging, deadlock detection
T4: vanilla Linux lock tests

These tests run at boot time in the kernel. They were successful under the composite kernel, but should also be run under P2 and P5.

  • DEBUG_LOCKING_API_SELFTESTS - Locking API boot-time self-tests

Other vanilla lock debugging

  • DEBUG_SPINLOCK_SLEEP - Spinlock debugging: sleep-inside-spinlock checking
  • DEBUG_SPINLOCK - Spinlock and rw-lock debugging: basic checks
  • DEBUG_LOCK_ALLOC - Lock debugging: detect incorrect freeing of live locks
  • PROVE_LOCKING - Lock debugging: prove locking correctness
  • LOCK_STAT - Lock usage statistics
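Whether a given configuration actually enables these options can be checked mechanically before booting a test kernel. The sketch below simply parses a .config file and reports any of the lock debugging options listed above (in their CONFIG_ form) that are missing or disabled; nothing beyond the option names from this section is assumed.

    #!/usr/bin/env python
    # Check that a kernel .config enables the lock debugging options above.
    # Usage: check_lock_debug_config.py /path/to/.config
    import sys

    REQUIRED = [
        "CONFIG_DEBUG_RT_MUTEXES",
        "CONFIG_DEBUG_LOCKING_API_SELFTESTS",
        "CONFIG_DEBUG_SPINLOCK_SLEEP",
        "CONFIG_DEBUG_SPINLOCK",
        "CONFIG_DEBUG_LOCK_ALLOC",
        "CONFIG_PROVE_LOCKING",
        "CONFIG_LOCK_STAT",
    ]

    def main(path):
        enabled = set()
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                name, _, value = line.partition("=")
                if value == "y":
                    enabled.add(name)
        missing = [opt for opt in REQUIRED if opt not in enabled]
        for opt in missing:
            sys.stderr.write("missing or disabled: %s\n" % opt)
        return 1 if missing else 0

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1]))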
T5: Proxy scenario testing

These could possibly be implemented with the rt-mutex testing code:

  • RT_MUTEX_TESTER - Built-in scriptable tester for rt-mutexes

These tests still need to be created.

T6: Child_Parent_Exclusive_Control test for group scheduling

T7: signal pipeline and socket pipeline examples

T8: Noah’s balanced pipeline
Uses only group scheduling.
T9: multi-thread example
CCSM <-> gsched interaction test.
T10: Tyrian’s process-based balanced pipeline
Uses CCSM and group scheduling.

T11: Datastreams

Basic DS functions are tested in the course of any use of the subsystem. These tests are more ambitious: they open multiple data streams and use CCSM to track sets of kernel and user threads, as well as threads in SCHED_OTHER, SCHED_FIFO, and SCHED_RR. These sets are not used for anything in particular at this stage, but they do represent a stress test of the CCSM and DS subsystems in the presence of a load that creates a significant number of new processes.
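The load-generation side of such a stress test could look like the sketch below, which forks a stream of short-lived workers under SCHED_OTHER, SCHED_FIFO, and SCHED_RR. It assumes a Python with os.sched_setscheduler (3.3 or later; older systems could use chrt instead) and must run as root for the real-time policies. The CCSM set tracking and DS instrumentation that observe this churn are not shown here.

    #!/usr/bin/env python3
    # Load generator for the DS/CCSM stress test: repeatedly fork short-lived
    # workers under SCHED_OTHER, SCHED_FIFO and SCHED_RR so that CCSM set
    # tracking and DS instrumentation see a steady stream of new processes.
    import os
    import sys
    import time

    POLICIES = [
        (os.SCHED_OTHER, 0),
        (os.SCHED_FIFO, 10),
        (os.SCHED_RR, 10),
    ]

    def worker(policy, prio):
        try:
            os.sched_setscheduler(0, policy, os.sched_param(prio))
        except PermissionError:
            sys.stderr.write("cannot set policy %d (need root?)\n" % policy)
            os._exit(1)
        start = time.time()
        while time.time() - start < 0.01:    # ~10 ms of busy work, then exit
            pass
        os._exit(0)

    def main(total=10000):
        failures = 0
        for i in range(total):
            policy, prio = POLICIES[i % len(POLICIES)]
            pid = os.fork()
            if pid == 0:
                worker(policy, prio)
            _, status = os.waitpid(pid, 0)
            if os.WEXITSTATUS(status) != 0:
                failures += 1
        return 1 if failures else 0

    if __name__ == "__main__":
        sys.exit(main())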

T12: guided execution proxy tests under group scheduling

Relevant Tests

Patch: P1. Tests: T4, T5

Probably redundant, but could be useful in developing tests and in providing a baseline of results against which the results for the KUSP code can be compared.

Patch: P2. Tests: T3, T4, T5

Patch: P3. Tests: T3, T4, T5

Patch: P4. Tests: T1, T2, T11

Patch: P5. Tests: T1, T2, T3, T4, T5, T11

Patch: P6. Skipped.

Patch: P7. Tests: T1, T2, T3, T4, T5, T11, T6, T7, T8, T9, T10, T12

Patch: P8. Tests: T1, T2, T3, T4, T5, T11, T6, T7, T8, T9, T10, T12

Patch: P9. Tests: T1, T2, T3, T4, T5, T11, T6, T7, T8, T9, T10, T12

Known Problems

  1. When a child thread is forked, group scheduling does not add the child to the parent’s groups.

    To be implemented using CCSM. This should occur once the fork is known to be successful and the task is ready to be scheduled. Otherwise, group scheduling may interact with a task that will disappear (without calling do_exit) or that is not yet fully initialized.

  2. Group scheduling changes the semantics of wake_up_new_task and set_cpus_allowed_ptr.

    The scheduling class callbacks associated with these functions are no longer called for all tasks on the system, even for tasks that have nothing to do with group scheduling.

  3. gsched_mutex

    It may be obsolete; it is no longer used except in conjunction with task_rq_lock, which is a more restrictive lock.

Testing Strategy

This section describes the strategy we think will be most efficient in testing the various KUSP subsystems and producing a series of patches that can be tested individually. One anomaly is that we will not test the patches from “the bottom, up”, as is most common, due to complicating factors that will be explained in context.

The KUSP components that need to be tested individually and in some cases as they interact or collaborate with each other are: Data Streams (DS), Computation Component Set Manager (CCSM), Proxy Accounting (Proxy), and Group Scheduling (GS). As reflected above in the list of patches, DS, CCSM, and Proxy are the most “primitive”, in that they do not interact with each other. GS depends on Proxy to operate correctly, and depends on CCSM for many of its more interesting and relevant operating modes.

In principle, we can and should add the patch for each component to the development base, linux-2.6.31.6-rt19, and perform a set of tests focused on that subsystem. Further, each subsystem S should, in principle, be tested before a patch for a subsystem T depending on S is added.

In the interests of minimizing time and effort, this plan chooses to violate both of those principles, in ways we believe are justified and well controlled. Further, if any problems arise we can and will revert to a slower and more deliberate strategy.

First Testing Stage

The DS and CCSM subsystems are completely separate and simple. There is no reason to believe that any interactions exist, and combining them in a single patch provides the opportunity to test DS by using it to instrument CCSM, as well as to use DS to help analyze CCSM behavior. We will thus start with patch P4, which combines DS and CCSM.

The tests at this stage are those exercising CCSM (T1); the basic Discovery regression tests, which use both DS active filters and CCSM (T2); and the DS and CCSM stress tests (T11), which track the scheduling class of all processes in the system using CCSM sets, DS active filters, and instrumentation points in the appropriate Linux kernel code.

Second Testing Stage

This stage is the one which violates the “from the bottom, up” patch application and testing principle. Under that principle we should next add the Proxy patch, which is independent of both DS and CCSM (P5). The problem with this approach is that creating an independent environment in which all relevant Proxy Accounting scenarios can be tested is difficult. It can, in principle, be done, perhaps by using and extending the existing framework for testing kernel preemptable mutexes under Priority Inheritance semantics. However, this would require non-trivial development of the testing framework and would not, in our view, be able to test some of the more obscure Proxy scenarios.

Instead, we will leap forward to the patch containing DS, CCSM, Proxy, and GS-SE (P7). The GS-SE patch provides all of the Group Scheduling capabilities, but uses a “simple” implementation mechanism for placing threads exclusively under GS control. The more complex (CE) approach to implementing exclusive GS control of a set of threads is a later patch. The SE approach, in theory, is subject to some inaccuracies, but we believe these are unlikely to affect this testing stage, and we can ensure that we detect any cases where the inaccuracies are relevant.

The reason we use this patch for the second testing stage is that it enables us to use the Guided Execution (GE) programming model to implement the full set of Proxy Accounting scenarios directly, clearly, and efficiently, without additional testing framework implementation. It does depend on exclusive GS control of all threads involved in testing, so one complicating factor would be a testing thread ever being run without being explicitly chosen by GS. However, we believe this is extremely unlikely, and we can guarantee to detect it if it occurs.

If the GE programming model performs as expected, then we will be able to implement a complete set of Proxy Accounting tests for which we can prove coverage of all listed Proxy scenarios. This set of tests will use DS instrumentation to fully document the execution paths through the Proxy code represented by each scenario. An important point about this approach is that this DS instrumentation would not be viable for stress testing the kernel preemptable mutexes, due to the data volume produced and the associated instrumentation effect. This will, however, present no problem under the GE programming model used to implement and control these tests.

Further, we believe this stage will also permit us to implement stress tests of the system under which we can ensure that the set of Proxy scenarios is complete. Under this approach we will alter the instrumentation in the preemptable mutex code to use an extremely lightweight accounting approach which records a unique execution path “fingerprint” for each use of the preemptable mutex in a large kernel buffer. The overhead of this approach will present no major problems under stress testing. By recording and later analyzing the millions of preemptable mutex use fingerprints, we can guarantee the completeness of the set of scenarios tested using the GE programming model, either by showing that no untested scenarios exist or by identifying the scenarios that need to be added.
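A sketch of the analysis step is shown below. It assumes only that the kernel buffer can be dumped to a file as one hexadecimal fingerprint per line, and that the GE-driven scenario tests record the fingerprint each scenario exercises in the same format; both formats are assumptions, since the instrumentation does not exist yet. The script then reports any observed fingerprints that no tested scenario covers.

    #!/usr/bin/env python
    # Compare preemptable mutex execution path fingerprints gathered under
    # stress against the fingerprints exercised by the GE-driven Proxy
    # scenario tests.  Assumed input format: one hex fingerprint per line.
    import sys

    def load_fingerprints(path):
        fingerprints = set()
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line:
                    fingerprints.add(int(line, 16))
        return fingerprints

    def main(stress_dump, scenario_dump):
        observed = load_fingerprints(stress_dump)    # from the kernel buffer
        tested = load_fingerprints(scenario_dump)    # from the GE scenario tests
        untested = observed - tested
        sys.stderr.write("observed %d distinct fingerprints, %d tested, %d untested\n"
                         % (len(observed), len(tested), len(untested)))
        for fp in sorted(untested):
            sys.stderr.write("untested fingerprint: %#x\n" % fp)
        return 1 if untested else 0

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1], sys.argv[2]))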

Third Testing Stage

The results of the second stage testing may make this stage irrelevant. If we have fully correct behavior under a provably complete set of Proxy scenarios, then this stage will not be required. However, if we cannot meet that stringent level of correctness, this stage will drop back to the “skipped” P5 patch. At this point, however, we can focus the testing and development on specific problems, rather than trying to provide and prove completeness of Proxy testing without having GS-SE to use.

Fourth Testing Stage

This stage will test all aspects of GS-SE support for a wide range of GS use cases and regression tests, specifically test sets T1 through T12 as defined above. This stage will also be used to determine whether the GS-SE approach to implementing exclusive GS control of a set of threads is sufficient. This approach “enforces” exclusive GS control over a thread by placing it in the “SCHED_IDLE” class. If we can prove that these threads will, in fact, never be chosen by the Linux framework, then the GS-SE approach is sufficient.
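As supporting evidence (not the proof itself), the SCHED_IDLE placement can be checked mechanically. The sketch below verifies that a list of PIDs supplied by the test harness are in fact in the SCHED_IDLE class; how those PIDs are obtained, and the GS side of the test, are outside its scope.

    #!/usr/bin/env python3
    # Verify that threads placed under exclusive GS control (SE approach) are
    # in the SCHED_IDLE class.  PIDs come from whatever harness created them.
    import os
    import sys

    def check(pids):
        wrong = 0
        for pid in pids:
            policy = os.sched_getscheduler(pid)
            if policy != os.SCHED_IDLE:
                wrong += 1
                sys.stderr.write("pid %d has policy %d, expected SCHED_IDLE\n"
                                 % (pid, policy))
        return wrong

    if __name__ == "__main__":
        pids = [int(arg) for arg in sys.argv[1:]]
        sys.exit(1 if check(pids) else 0)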

If that proof is not possible, then we may be able to impose exclusivity through creation of a new Linux scheduling class, which we would call “GS-Exclusive”, and which we would ensure never chooses a thread for execution.

Fifth Testing Stage

This stage may involve the patch for the “Complex” (CE) approach to exclusive control, if the SE approach proves insufficient. If the SE approach is sufficient, then this stage will involve more complex, and particularly multiprocessor, group scheduling tests under a variety of loads. The character of this stage will be more precisely described as earlier stages are completed.

Test Development

All tests are located in:

$KUSPROOT/tests

The tests are driven by Makefiles. All tests can be executed using:

bash$ make -f kusp_regression.mk

The top level Makefile calls the Makefiles in the subdirectories, which are the subsystem Makefiles. These Makefiles create a build directory for the subsystem, build the tests, and then execute the tests.

The success or failure of a test is determined by the exit code of the test. A test should return zero if it is successful and non-zero if it fails.

If it fails, the test should provide output explaining why it failed. Output should be provided using stderr. The stdout stream should NOT be used to output error messages because the Makefiles disregard its output to make the testing output easier to read.
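For reference, a minimal test skeleton following these conventions might look like the following; the check itself is a placeholder.

    #!/usr/bin/env python
    # Minimal KUSP regression test skeleton: exit 0 on success, non-zero on
    # failure, and explain failures on stderr only (stdout is discarded by
    # the Makefiles).
    import sys

    def run_test():
        # Placeholder for the actual check performed by a real test.
        result = 42
        expected = 42
        if result != expected:
            sys.stderr.write("FAIL: expected %r, got %r\n" % (expected, result))
            return 1
        return 0

    if __name__ == "__main__":
        sys.exit(run_test())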