I would like to express my sincere gratitude to Dr. Douglas Niehaus, my advisor and committee chairman, for his invaluable guidance and advice throughout the course of this research work. I would also like to thank Dr. John Gauch and Dr. Susan Gauch for accepting to be in my committee.
I would like to take this opportunity to thank my group members Robert Hill, Matthew Peters, Shyam Pather, Sean House, Anil Gopinath, Sandeep Bhatt, and Shyam Murthy for making it a joy to come to work and for helping me whenever humanly possible. I would also like to thank Anil Gopinath and the other users of KURT for their important suggestions on how to improve the system. I would like to thank Shyam Pather for helping me complete some of the work that I started.
I am indebted to Raghavan Menon for clarifying all my doubts (even the really silly ones) about the Linux kernel. I hope at some point in my life I will be able to achieve his level of maturity of the thought process.
I would like to thank Linus Torvalds and the thousands of other volunteers around the world who contributed to the Linux kernel and made it the best operating system in the world. I would also like to thank the creators of LaTEXfor making it so simple to write a technical document.
A million thanks goes to the encouragement provided during these years by my friends Preeti, Prashant (NBD), JP, Vinai, and Jaisri. Without their motivation, I would never have completed my second Masters. I would also like to thank Vishal, Vrushali, Dilip, and Praveen for making my last year in Lawrence a memorable one.
This acknowledgement would be incomplete if I forgot to mention my whole family for their love, support, and encouragement throughout this course of study.

Until the fairly recent past, most real-time applications fit into two broad categories [10]. The first consists of those applications with soft real-time constraints which require timely execution of tasks at coarse temporal resolution, but do not produce catastrophic consequences if their deadlines are violated. The second group consists of hard real-time applications. These applications impose stringent timing demands on their operating systems, with disastrous, and sometimes fatal, consequences resulting from temporal errors.
Many commercial operating systems have been extended in a straightforward manner to meet the modest timing needs of soft real-time applications. Typically, when equipped with a comparatively fine-grain timing mechanism (usually with resolution no finer than milliseconds) and some type of priority scheduling algorithm, most conventional operating systems can support such applications. Providing this slightly finer than normal temporal resolution seldom precludes these extended operating systems from providing the services such as network, graphic display, and disk access that they typically offer. Hard real-time applications, in contrast, typically require specialized operating systems that run on very specific hardware, and come at a significant cost. In order to meet the strict timing requirements of hard real-time applications, these operating systems can provide only a very austere execution environment, offering few services such as network access, graphics display or file-system interfaces.
Recent developments in many areas, including multimedia and ATM networking, have spawned several applications that defy the binary hard/soft classification system. Multimedia video applications, for example, exhibit comparatively strict timing requirements typical of hard real-time applications since they need to maintain very precise display refresh rates. However, most hard real-time operating systems cannot, or are only beginning to, support such video applications because they require comparatively sophisticated operating system services such as access to video display adapters and sound cards. Our own research in the development and performance evaluation of ATM networks has included several applications with similar combinations of fine grain timing requirements, typical of hard real-time systems, with system service requirements typical of soft real-time systems. These applications, which fall neither into the hard nor soft real-time classes, might reasonably be called firm real-time applications. Also, since most existing real-time systems are priority based, there is no explicit support for QoS guarantees (in terms of CPU time) required by multimedia applications [30]. Furthermore, these applications have stringent cost constraints. At present, there is no system which adequately satisfies the combination of temporal, service, and cost constraints exhibited by this class of applications.
The KU Real-Time (KURT) system satisfies the constraints of many of the applications in the firm real-time category through innovative modification of free software running on commercial off-the-shelf (COTS) hardware. KURT is based on the Linux operating system. The choice of Linux, a freely available, popular Unix clone, as the base operating system was motivated by two important design goals.
Linux meets both of these requirements since it runs on commercial off-the-shelf hardware, and is distributed freely.
Linux had to be modified in several important ways to support firm real-time applications. First, using the scheme described in Chapter 3, the system's temporal resolution has been increased, without significantly increasing the overhead of the software clock. This UTIME patch to Linux increases the utility of Linux for soft real-time applications.
The KURT patch builds on top of UTIME, adding a number of features. KURT enables the system to switch between normal mode, in which it functions as a normal Linux system, and real-time mode, in which it executes only designated real-time processes according to an explicit schedule. When in real-time mode, real-time processes can still access any of the system services that are normally available to non-real-time processes. While many approaches to scheduling can be easily implemented, KURT employs explicit scheduling, as it was found to be the most appropriate for our current set of firm real-time applications.
KURT and closely related approaches are also being applied in our lab to a range of problems, including synchronized distributed real-time computation, the implementation of a software ATM switch with precise rate control, the implementation of a real-time Object Request Broker (ORB), and support for multimedia applications. One such firm real-time application considered in this work is the ATM Reference Traffic System (ARTS), described in detail in Chapter 4. This application can record and generate packet-level ATM traffic streams with microsecond accuracy. ARTS uses KURT's scheduling abilities in its traffic generation module by placing packet transmission events in an explicit KURT schedule. In order to use inexpensive COTS hardware for such an application, two key simplifying assumptions were made. First, traffic is controlled at the packet, and not at the cell level. Also, ARTS only records and reproduces the load, and not the data content of the traffic. These restrictions are perfectly acceptable in a wide range of performance evaluation applications and ARTS has become an important part of our ATM performance evaluation toolkit.
KURT fills an important gap in the existing spectrum of real-time systems by providing a low cost system with fine-grain temporal resolution, capable of satisfying firm real-time performance constraints. However, the system is new, still under development, and is as yet not appropriate for all applications. It schedules tasks with extremely high, yet not perfect, accuracy and reliability. The generic Linux services have not been adapted to real-time execution, and are thus the source of some scheduling distortion. However, in spite of this limitation KURT has provided excellent support to our firm real-time applications for which no other system was appropriate. KURT also provides a well-structured environment within which to investigate the source and reduction of the remaining scheduling distortion.
The next chapter discusses related work on real-time operating systems. Chapter 3 describes the implementation of the UTIME Linux patch and provides a performance analysis of the UTIME system. Chapter 4 describes the design and implementation of the KURT system on top of UTIME. This chapter also provides results comparing the scheduling efficiency of KURT with the existing First-In-First-Out scheduler of Linux. Chapter 5 discusses our conclusions and possible future work.
In this chapter the characteristics of existing firm real-time applications will be presented first. Several scheduling paradigms will then be analyzed to find the best scheduling algorithm for these applications. Finally, several available commercial and research real-time operating systems will be discussed and the constraints on their application to the class of emerging real-time applications will be presented.
Owing to the periodicity of continuous-media data, the processing of audio and video data must occur at specific times. Moreover, the execution of these operations must be finished within certain deadlines to serve the real-time characteristics of these media. Owing to these characteristics of audio and video data, multimedia systems have to provide mechanisms to support time-related quality-of-service (QoS) guarantees [30].
Multimedia systems for single-user, and especially single-task, machines provide only simple mechanisms to provide time-based operations (e.g. for delaying program execution) but no real-time support. It is often argued that this approach is sufficient for these systems since the CPU is used mostly for the multimedia application during its run time. In those situations where the user has another time-consuming application running, it is easy for him to abandon that application. For multi-user and server systems, such as video-on-demand servers, this assumption is not valid. Other user applications, including other multimedia sessions, can disturb multimedia applications in such a way that the perceived QoS is not acceptable.
In addition to providing time-related QoS guarantees, the operating system should also provide some programming constructs for supporting multimedia applications. Since continuous-media applications are inherently periodic, it will be instructive to look at approaches taken by application programmers for writing periodic programs.
Loop/Delay constructs rely on mechanisms provided by most operating systems to delay the execution of a program for a certain amount of time. One such example is the UNIX sleep function. They are used as shown in Program 2.1 where f denotes some function for processing the data (e.g. displaying a frame on the screen).
While such constructs avoid the influence of the actual execution time
of the operations, in a preemptive multi-tasking environment this
approach is incorrect since the statements are not executed
atomically. Any preemption between the calculation of the length of
the delay period and its use in the delay operations leads to time
drifts. In this case the computed value is too large. Using a function
(e.g. delay_until) which delays the execution until a
specified point in time instead of delaying a specified amount of time
can solve this problem. This approach is shown in
Program 2.2.
![\begin{Program}
% latex2html id marker 339
\begin{verbatim}
1 get_time(t);
2 wh...
...iodic Programs]{A Periodic Program using the Delay-Until Construct}\end{Program}](img3.gif)
Although this approach leads to correct timing, it places the burden of time handling and temporal state information management on the application programmer. It is better, and safer, to have the underlying operating system do this work [30].
![\begin{Program}
% latex2html id marker 348
\begin{verbatim}
1 p = create_new_pro...
...Programs]{A Periodic Program using the Periodic Processes Approach}\end{Program}](img4.gif)
Note that the function schedule_me can be easily implemented using the delay_until primitive. However, with schedule_me the programmer does not have to worry about the calculation of times as this is done by the system software.
The weakest point of the delay, delay-until, and other user-level scheduling paradigms in general, is that they do not provide any information about timing and requested resource usage to the system scheduler. Hence, no schedulability tests can be performed to check whether the system is able to support all requests. This may lead to overload situations which may degrade the QoS to unacceptable levels. Therefore, these models can be seen as simple ad hoc methods that are not acceptable in general multimedia systems where mechanisms for reliable QoS provision have to be available. As will be shown in Chapter 4, KURT provides the schedule_me construct to application programs.
Aside from research operating systems, most commercially available systems do not offer the needed periodic process model, but only provide basic real-time mechanisms to implement these models. For example, the POSIX threads extension provides the possibility of associating scheduling attributes with each thread defining the scheduling policy and the thread's fixed priority [11]. The timing of the thread's operation, e.g. suspending the thread until the next period, has to be implemented by the programmer. The timer operations offered by the POSIX real-time extension can be used for that purpose, but due to the soft real-time nature of the POSIX real-time extension, this is usually insufficient for handling multimedia tasks [11]. Chapter 4 presents data supporting this argument.
The static table-driven approach is highly predictable. But it suffers from inflexibility, since any change to the tasks or their characteristics may require a complete overhaul of the table. Though this method is suitable to many hard real-time systems, it is not suited for the emerging firm real-time applications, because of the unpredictable arrival rate of real-time tasks in present day systems (ie. the user may start one or more than one multimedia application at arbitrary times).
Most next generation firm real-time applications like multimedia are periodic and need explicit guarantees from the operating system. Dynamic planning-based schedulers can have admission control mechanisms that prevent the overload of the CPU. Hence schedulers belonging to this class hold the most promise for scheduling the next generation firm real-time applications.
As noted in Section 2.1, continuous media applications are periodic and have explicit timing requirements. Priority driven approaches are either insufficient, in case there is no admission control mechanism, or unnecessarily complicated, when it takes into account priority inversions and priority inheritance. Hence a planning based scheduling approach might be necessary for supporting multimedia and other present day firm real-time applications.
The major design challenge for a operating system supporting firm real-time applications is to strike an appropriate balance between system services and predictability. Many real-time UNIX operating systems, have been developed under the first approach [10]. There are also a large number of proprietary real-time systems, over 70 extant commercial real-time kernels, that take a similar design approach, including QNX [4], OS-9 [3], and LynxOS [1]. Research projects have also taken this approach, the most obvious example being Real-Time Mach, which is an extension of Mach [29]. There are other research operating systems like Spring [28], Maruti [24], and YARTOS [14] which take the second approach and design an operating system with predictability as a first class design feature.
LynxOS uses a patented interrupt handling technology which makes use of kernel threads to decrease scheduling latency and distortion caused by heavy interrupt load [2]. Without this feature, several back-to-back interrupts running at higher priority than existing real-time tasks can cause significant scheduling distortion. The claims made for LynxOS' combination of conventional operating system features with the ability to meet stringent real-time performance constraints is impressive. As a set of modifications to the Linux system, the KURT approach is similar in large measure to that of LynxOS. However, use of free software and low cost COTS hardware provides a significant cost advantage over proprietary LynxOS. Also, the UTIME approach to event scheduling provides an important advantage in temporal granularity, since it appears, from references obtained so far, that LynxOS provides only the conventional 10 millisecond temporal resolution. Further, KURT's support for explicit event scheduling provides an alternative to the conventional priority scheduling of LynxOS and systems like it.
VxWorks [5] is a commercial real-time operating system that is not compatible with UNIX. Though VxWorks provides some of the POSIX interface functions, most of the VxWorks API is proprietary. In VxWorks the kernel and tasks run in one address space. This allows task switching to be very fast and eliminates the need for system call traps. A run-time linker allows dynamic loading of both tasks and system modules. An interactive shell with C-like syntax can be used to examine and modify variables, evaluate expressions, call functions and perform simple debugging. These features make development of applications for VxWorks somewhat easier. But since the same address space is shared by both the kernel and the real-time tasks, errors in one real-time task can easily affect other tasks as well as the whole system. Hence, VxWorks is not suitable for many present day firm real-time applications due to the lack of protection and other support common to conventional systems.
There is a current trend to use Windows NT for real-time processing. This would be the most viable scenario for present day firm real-time applications as Windows NT is the most popular desktop operating system and as it has a multitude of available applications. But Windows NT kernel is incapable of real-time processing [6]. The Win32 API is not designed for real-time since interrupt notification can be delayed for an unpredictable amount of time and memory preallocation is a problem. Several companies provide solutions for some of these problems. The system provided by LP-Elektronik uses non-maskable interrupts (NMI) generated by a special hardware device [7]. This approach guarantees timing independence of Windows since NMI interrupt requests are never disabled. Since multimedia applications are targeted towards systems that are inexpensive and widely used, and since this approach requires specialized hardware which is expensive, it is not suitable for running multimedia applications. The INtime product of Radisys Inc. modifies NT's Hardware Abstraction Layer (HAL) to trap Windows' attempt to disable interrupts or reset the clock [8]. This is an approach similar to that taken in RT-Linux [31]. As shown in the next section, this is not a viable approach to providing desktop computers with firm real-time capabilities.
While in some respects this implementation is able to combine hard real-time performance with a wide range of services, it is applicable only to a limited set of applications that can be conveniently split into co-operating sets of real-time tasks requiring almost no services, and tasks that make use of operating system services but have almost no timing constraints. There are many important applications, including the ARTS system described in Chapter 4, that cannot be implemented this way. For example, if ARTS were implemented under RT-Linux, then its real-time module would have to be able to send an ATM packet into the network. This is not possible, however, because access to the networking subsystem of Linux is not supported by RT-Linux. Another example of an application that does not fit the RT-Linux model is a multimedia viewer. Usually a multimedia viewer reads a multimedia file (ie. mpeg, avi etc.) and displays it. The display of the frames in the multimedia stream have to occur at specific rates. Using the RT-Linux model it is not clear how this could be accomplished, as RT-Linux does not provide any access to video adapters, filesystems etc.
Other research systems take a substantially different approach. The Spring system developed at University of Massachusetts-Amherst took predictability as a first class design criterion, providing system services in a bottom-up manner only as implementations consistent with the predictability goals could be created [19,27,28]. One purpose of the project was to investigate the limits on predictability imposed by conventional hardware, but maximizing predictability clearly minimized services. Further, the only hardware supported by Spring is too costly and not always appropriate for the applications currently being addressed.
The Maruti system developed at the University of Maryland is not constrained by the same hardware dependencies as Spring [24]. In its most basic form, Maruti consists of a library of core routines needed to support the real-time demands of its applications. These core library routines were originally implemented on top of Unix, and were thus subject to the temporal resolution of the underlying system. Subsequent targeting on Mach is subject to the temporal granularity of Mach, and is similar to KURT in the sense that both help add real-time semantics to an existing system. Maruti, like KURT, represents the execution of its applications with an explicit schedule. Maruti is more sophisticated than KURT in many respects, and thus imposes more stringent constraints on the semantics of the applications it supports. Similar considerations of restrictions on supported hardware, cost, availability or the tradeoff between services and predictability apply to other well known research systems, such as MARS [22] and HARTS [26].
YARTOS is an experimental real-time operating system kernel that provides guaranteed response times to tasks [13,14,15]. The YARTOS scheduling model is composed of two basic abstractions: tasks and resources. A task is an independent thread of control that is invoked at sporadic intervals. Each invocation of a task must complete execution before a well defined deadline. The invocation interval and deadline of a task is derived from constructs in the higher level programming model. During the course of execution, a task may require access to a number of resources. A resource is a software object that encapsulates shared data and exports a set of procedures for accessing and manipulating the data. The scheduling algorithm used is an integration of a synchronization scheme for access to shared resources with the earliest deadline first processor scheduling algorithm. In YARTOS, the deadlines of the task are manipulated so that there is no contention for a shared resource and hence it ensures mutual exclusion on resource operations. This dynamic manipulation of a task's deadline is similar to the concept of priority ceiling in priority inheritance protocols [14]. Jeffay et. al. implemented a desktop video-conferencing utility using YARTOS [15]. They state that the YARTOS programming model has not supported the conferencing application seamlessly. Writing multimedia applications for YARTOS seems to increase the development complexity of the application and hence YARTOS is probably not suited for supporting present day firm real-time applications.
In generic Linux, a hardware timer chip is programmed at boot time to interrupt the CPU at a fixed periodic rate. Each time the CPU is interrupted, the kernel updates its software clock to indicate the passing of another tick. Thus the software clock is nothing more than a running count of the number of clock ticks, or hardware timer interrupts, that have passed since the kernel was started. The length of time between each software clock tick is referred to as a jiffy. A simplified version of the timer interrupt service routine (ISR) is shown in Program 3.2. In the timer ISR, after updating the jiffies counter, the kernel calls run_timer_list (Program 3.3). The run_timer_list routine is the timer dispatch function which checks if there are any scheduled timers that are due for processing, and calls the appropriate function if necessary. Finally, the kernel performs other activities such as updating process times, scheduling a new process if necessary in the do_other_stuff routine.
Since the kernel only checks its list of pending scheduled events at each timer interrupt, the length of time between interrupts (or the length of each jiffy) is the smallest meaningful unit of time with which events can be scheduled. Thus the length of a jiffy determines the kernel's timing resolution.
In order to increase the resolution of the Linux software clock, the basic mechanism by which it is implemented had to be altered. A naive approach to increasing the timing resolution would be to simply program the timer chip to interrupt the CPU at a higher frequency. This would reduce the length of a jiffy, and therefore increase the software clock resolution. However, since the timer chip would regularly interrupt the CPU irrespective of whether or not any events were scheduled to occur at the time of each interrupt, this approach leads to excessive, unsupportable overhead.
A better solution to the problem of increasing the kernel's timer resolution is based on the following observation: there is a crucial difference between the temporal resolution and the frequency of events. In other words, even though firm real-time applications schedule events with microsecond resolution deadlines, events are rarely scheduled to occur every microsecond. To support such applications, what is needed is a mechanism by which timer interrupts are allowed to occur at any microsecond, not necessarily every microsecond. In light of this observation, UTIME departs from the common practice of using the timer chip to interrupt the CPU at a fixed rate. Instead, the timer chip is programmed to interrupt the CPU in time to process the earliest scheduled event.
In order to achieve microsecond accuracy, a new member had to be added to the timer structure. This member (usec) specifies the microsecond within the jiffy at which the timer will expire. Two global variables were also added: jiffies_u, which maintains the microsecond within the current jiffy, and jiffies_intr, the use of which will be explained later in this section.
Since interrupts no longer happen periodically, a function
(update_time) that keeps track of elapsed time was
added. The update_time routine makes use of the time stamp
counter (TSC) that is available on most modern CPUs to maintain
time
. This
function calculates the time elapsed since the last time this function
was called. It then updates the global variables jiffies
and jiffies_u to reflect the elapsed time. The pseudo-code
for update_time is shown in Program 3.4.
The pseudo-code for the modified timer ISR is shown in Figure 3.5. When an interrupt is serviced, the timer ISR first calls update_time to update the software clock. Then the time till the next scheduled event is found, and the timer chip is programmed to interrupt the CPU at that time. Finally, the modified run_timer_list (Program 3.6) is called to service all the events that have expired.
Thus the CPU is interrupted only when it needs to be, and not at regular intervals. In this simple form, our scheme eliminates the timer interrupts at which no events are due for processing. Since the timer chip can be programmed with microsecond accuracy, we effectively achieve microsecond resolution.
In this simple version UTIME, the jiffy counter is no longer
incremented regularly, as is the case in standard Linux. Indeed the
timer interrupts, at which jiffy counter updates can be made, often
occur with little or no regularity. This presents a problem since
there are several kernel subsystems that assume both implicitly and
explicitly that the jiffy counter is being updated at regular
intervals. To keep these systems working, a way to reproduce this
periodic heartbeat
in the
absence of regular, periodic interrupts had to be devised. Hence, when
reprogramming the timer chip to trigger the next interrupt in time to
process the next event, a check is made to see if that event occurs
after the next jiffy counter update would have occurred. If this is
found to be the case, then an interrupt is scheduled to occur just in
time to increment the jiffy counter
(Program 3.7). This introduces some
interrupts at which no events are due for processing, but keeps all
kernel subsystems, including the software clock, working as normal.
If during any call to update_time (Program 3.4),
the accumulated jiffies_u crosses the number of
microseconds in one jiffy, the global variable
jiffies_intr is incremented. As can be seen in
Program 3.5, the timer ISR performs its normal
functions only when a jiffy boundary is crossed. Though this increases
the jitter of jiffy events, this jitter is still less than
the resolution required by the systems assuming the periodic
heartbeat.
The implications of UTIME, in terms of system overhead and scheduling effectiveness, are discussed in the next section.
Most of the UTIME implementation centers around changes to the
processing of timer interrupts. For this reason, comparing the
processing time of a timer interrupt under UTIME with that of a timer
interrupt in standard Linux, gives a quantitative measure of the
overhead resulting from UTIME. To measure the processing time of a
timer interrupt, we inserted code into the timer interrupt handler
that would take time-stamps upon entry and upon exit from the ISR.
Time-stamps were obtained by reading the time-stamp counter (TSC),
which increments at the clock rate of the Intel Pentium Pro Processor
used in the test system
.
Timer interrupt processing times were measured for three cases: standard Linux with no timer modifications, Linux with the UTIME modifications but without any events scheduled at microsecond resolution (i.e. behaving like a normal Linux system), and Linux with the UTIME system with several events scheduled at microsecond resolution. The processing times for around 8000 timer interrupts were measured for each of these three cases and Figure 3.1 shows the processing time distributions. Note that this figure illustrates the percentage of interrupts with overhead less than or equal to a given value on the horizontal axis. As can be seen from the figure, without UTIME, most timer interrupts take approximately 1 microsecond to process, and virtually all take less than 5 microseconds. With UTIME in place, most timer interrupts take approximately 15 microseconds to process, regardless of whether or not any microsecond events are scheduled. It should be noted that the timer interrupt processing time, as described here, does not include the time required to process any scheduled events: the timer interrupt handler simply checks which events are due for processing, and marks them as such.
The exact value of the timer overhead is less important than the fact that it is greater under UTIME than under standard Linux for two main reasons. Since timer interrupts do not occur on any regular interval, there is some amount of extra computation required to determine the new value to load into the timer chip to trigger the next interrupt. This is not the case in standard Linux because the timer chip is simply programmed, during system boot-up, to interrupt the CPU at a fixed rate. A large portion of the processing time increase, however, is because the timer chip on the systems we used required approximately 4 microseconds of latency before its inputs are latched. A more appropriately designed timer chip would therefore decrease the amount of overhead represented by UTIME. Even though the actual time required to process timer interrupts using UTIME increases, this penalty is incurred at most once every 10 milliseconds when no microsecond events are scheduled. Thus, the overhead presented by the UTIME modification is only about 0.15% , with an additional 15 microseconds per event explicitly scheduled with microsecond resolution.
To test the effect of UTIME on soft real-time applications, we wrote a
dummy application that would try to run once every 10
milliseconds. The pseudo-code for this application is shown in
Program 3.8. The application first sets its
scheduler to SCHED_FIFO so that it gets preferentially scheduled over
other processes
. Then it enters a loop, collects a time-stamp, and
finds the difference between the current timestamp and the previous
one. It then sleeps for the remainder of 10 ms. The time difference
between consecutive loops is plotted in the UTIME case in
Figure 3.2(a). As can be seen from
Figure 3.2(a), the process actually gets scheduled every
10 ms. Whereas in the non-UTIME case
(Figure 3.2(b)) the process ends
up getting scheduled every 20 ms. Since the POSIX standard for
select states that the process should sleep for
at least the time stated in the select call, the Linux kernel
adds 1 jiffy (10 ms) to the time provided in the select call. As can
be seen from these two figures, UTIME alone increases the utility of
Linux for soft real-time applications.
Finally Figure 3.3 shows the amount of time elapsed
between when a microsecond resolution event was scheduled, and when
the function associated with the timer (fn in
Program 3.1) was called. As can be seen from the figure,
the timer handling function for most events are called within
50
sec of the scheduled time. Hence this is the actual minimum
resolution with which events in UTIME can be scheduled. This result
has some implications on the scheduling of firm real-time events and
will be discussed further in Chapter 4.
Since Linux is a time sharing operating system, many of its components
are not tuned to perform well under real-time constraints. As shown in
Section 4.4.3, the filesystem code and the device driver
code for the disk drive may block interrupts for up to
250
sec. Hence, it might be advantageous to turn off these
components when performing real-time tasks. KURT provides a way in
which this can be achieved. The KURT kernel can run in one of two
modes. In the normal mode, all processes are allowed to
run. Since all processes are allowed to run in the normal mode, some
process may be using kernel services (like the disk) that might block
interrupts for an extended period of time. To avoid this, the kernel
can be switched to a real-time mode in which only processes
that are marked as real-time processes are allowed to run. Since the
scheduling occurs within a focussed set of processes, it is possible
for the developer to make sure that those processes do not use any
kernel service that might block interrupts and distort the performance
of the KURT scheduler beyond acceptable limits. The KURT core provides
system calls that allow a user process to toggle the kernel between
real-time mode and normal mode (Section 4.1.3). The
steps required to mark processes as real-time is explained in
Section 4.3.
The Process RTMod is a built-in real-time module that switches context to a specific user mode process when invoked. Through the Process RTMod, KURT provides the functionality of writing real-time processes that can be explicitly scheduled. The use of real-time processes increase the context-switch and system call overhead and latency, but simplify the problem of application development in some cases. There is less risk associated with using real-time processes, as a bug in one real-time process cannot affect the rest of the system. Further details on the interface between the KURT core and RTMods is explained in Section 4.2.
Figure 4.1 shows the relationship between the KURT core, the RTMods, and user programs.
int switch_to_rt(int timer_chip_mode, unsigned long period,
int rt_mode, char *cmdline, int length);
In the above prototype:
int switch_to_normal(int force);
The force parameter specifies whether the kernel has to be immediately switched into the normal mode. If this parameter is set to 1, then all existing scheduled events are canceled, and the kernel is immediately switched to the normal mode. If this parameter is set to 0, then the system call blocks till all the scheduled real-time events have been serviced.
int rtmod_cmd(int rtmod_id, int command, void *buffer,
int size);
In the above prototype:
int rt_schedule_events(struct timeval *time, int type,
int num_times, char *filename);
In the above prototype:
As can be seen from Program 4.1, each event has a module identifier module_id associated with it. This specifies which RTMod has to be invoked for the event. The module_ids are assigned to modules when they register with the KURT core (see Section 4.2). Each event also has 12 bytes of module specific data that gets passed to the module when the event expires. The time when the event has to occur is specified as an offset the start of the schedule. The schedule file has a binary representation of this structure for each event.
int register_rtmod(struct rtmod_info *info);
register_rtmod returns the module_id for the real-time module. This identifier is used when scheduling events for this RTMod. A user can obtain a mapping of module_id and the name of the module through the get_rtmod_num library call. The prototype of this call is as follows:
int get_rtmod_num(char *rtmod_name);
The return value from this library call is the module_id of the real-time module that has the name rtmod_name. If the specified real-time module is not present then the return value is -1.
The rtmod_info structure (Program 4.2) provides information about the module to the KURT core.
As can be seen in Program 4.2, each module passes its name along with a set of function pointers to the KURT core when registering. These functions are called by the KURT core at the following times:
Each RTMod should at least provide the name and event_handler function to the KURT core. The other elements of rtmod_info can be set to NULL.
int unregister_rtmod(char *name);
The name used in the register_rtmod call is passed to this function to unregister this RTMod. The KURT core makes sure that no more events are scheduled for this RTMod.
Explicit scheduling using schedule files as described above works well for many real-time applications. However, some applications have periodic execution flows, for example, multimedia applications exhibit this characteristic, wherein a certain section of code is executed repeatedly at a fixed time interval. For these applications, specifying explicit event times does not make sense. A more natural approach is to allow these applications to specify the length of one period, and then have KURT make them execute the appropriate piece of code once every period. The system calls that allow a user process to achieve this are explained in the Section 4.3.3. SCHED_KURT processes that are not explicitly scheduled, are scheduled according to a Round-Robin policy based on the priority of the process. Even though these processes are scheduled using a Round-Robin policy, they still have higher priority than other processes using the standard Round-Robin scheduler. The SCHED_KURT policy is different from other scheduling policies in the Linux kernel, in the respect that it comes into effect only when the kernel is in real-time mode. When the kernel is in normal mode these processes are scheduled according to their normal scheduling policy.
![\begin{Program}
% latex2html id marker 731
\begin{verbatim}
1 struct rtparams {
...
... Processes]{Structure Specifying Characteristics of KURT Processes}\end{Program}](img35.gif)
int set_rtparams(int pid, int policy, struct rtparams *param);
In the above prototype:
int get_rtparams(int pid, struct rtparams *params);
int get_rtstats(int pid, struct rtstats *info);
![\begin{Program}
% latex2html id marker 761
\begin{verbatim}
1 struct rtstats {
2...
...atistics]{Structure Specifying Statistics of a SCHED\_KURT process}\end{Program}](img38.gif)
The rtstats structure is shown in Program 4.4. The elements of this structure are as follows:
int get_num_rtprocs(void);
int rt_suspend(int susp_mode);
The susp_mode parameter can have one of the following values:
The pseudo-code for a periodic SCHED_KURT process is shown in
Program 4.5.
![\begin{Program}
% latex2html id marker 792
\begin{verbatim}
1 void main()
2 {
3 ...
... SCHED\_KURT Process]{Pseudo-Code for a Simple SCHED\_KURT Process}\end{Program}](img39.gif)
The KURT version of the dummy application (see Program 4.6) used the scheduler's periodic mode, and set the period to 10 milliseconds. It would first switch the kernel to real-time mode and then suspend itself waiting for the KURT process module to wake it up periodically. At the start of each execution, the application would take a time-stamp and would log this time-stamp to disk. The non-KURT version of the application (see Program 4.7) would set its scheduler to be the standard Linux FIFO scheduler, and then enter a loop. At the top of the loop it would obtain a time-stamp and store it in memory. It would then obtain another time-stamp and compute the remaining time till the next iteration and sleep (using the select system call) for that much time. When it woke up, it would return to the top of the loop and repeat the same procedure. The difference between consecutive time-stamps gave a measure of the length of time between successive executions. The closer this measured length of time was to 10 milliseconds, the more accurate the scheduler.
Tests were performed in the SCHED_KURT and SCHED_FIFO cases using 1, 10, 20, and 30 concurrent executions of the dummy application. For the KURT case, tests were performed using the real-time mode set to SCHED_KURT_PROCS, in which only SCHED_KURT processes are allowed to run, as well as with real-time mode set to SCHED_ALL_PROCS, in which all processes are allowed to run. Comparison of the SCHED_ALL_PROCS test, with the SCHED_FIFO test provides an idea of the advantage of explicit scheduling without focusing the set of running processes. Apart from the test processes, the system was essentially idle.
Figure 4.2 presents the histograms obtained in the SCHED_FIFO case. These histograms plot the percentage of iterations versus the time between consecutive iterations.
Figure 4.3 presents the histograms obtained in the SCHED_KURT case, with real-time mode set to SCHED_KURT_PROCS. Figure 4.4 are the histograms obtained in the SCHED_KURT case, with SCHED_ALL_PROCS mode.
The following observations can be made from the these figures:
From these observations, it can be concluded that explicit scheduling without focusing the set of running processes is itself a viable alternative for scheduling firm real-time processes.
Figure 4.5 summarizes the results shown in Figure 4.2- 4.4. This figure illustrates the percentage of iterations with duration less than or equal to a given value on the horizontal axis.
Figure 4.5(a) clearly shows that the variance of the FIFO scheduler's performance increases significantly as the number of processes increases, whereas that of the KURT scheduler does not. Irrespective of the number of processes, KURT schedules each application at the appropriate time.
Table 4.1 presents a tabular comparison of the three schedulers. In these tables, the percentage of events within a specified time duration from the expected time is shown.
As can be seen in Table 4.1(a), SCHED_FIFO would
probably be suitable for running 1 firm real-time process. For
example, 62% of the events lie within 10
sec
(i.e. 0.1% ) of the expected time, and 100% of events lie
within 500
sec (i.e. 5% ) of the expected time. But as the
number of FIFO processes increase, the performance drops
drastically. For example, with 30 processes, only 1.21% of the
events lie within 0.1% of the expected time, and only
46.61% of the events lie within 5% of the expected
time. The performance is expected to decrease further as more
processes are added to the system.
In contrast to this, in the SCHED_KURT case in the SCHED_ALL_PROCS mode (Table 4.1(b)), for 1 process, 99.8% of the events lie within 0.1% of the expected time, and 100% of the events lie within 5% of the expected time. And for 30 processes, 97.3% of the events lie within 0.1% of the expected time, and 99.99% of the events lie within 5% of the expected time.
The performance of explicit scheduling in the SCHED_KURT_PROCS mode
(i.e. only SCHED_KURT processes are allowed to run) is shown in
Table 4.1(c). There is a slight improvement in
the performance of the explicit scheduler by switching the kernel to
SCHED_KURT_PROCS when compared to SCHED_ALL_PROCS mode. But this
performance improvement is not substantial, because the system was
essentially idle apart from the explicitly scheduled processes. If the
system was loaded, the performance of the explicit scheduler, in the
SCHED_ALL_PROCS mode, would be distorted to the extent of the
duration during which interrupts were disabled by other parts of the
operating system. Figure 4.6
shows the effect of background disk activity on the performance of the SCHED_FIFO and
SCHED_KURT schedulers. These experiments were conducted with
background disk activity because the data presented in
Section 4.4.3 shows that the disk subsystem of Linux
disables interrupts for the maximum duration of time. As can be
observed from these figures and Table 4.2
background disk activity distorts the schedule for durations less 500
sec duration. This is because the disk subsystem of Linux
disables interrupts for durations up to 250
sec and depending upon
the number of back to back blocking of interrupts, the performance of
the scheduler would be distorted to that extent. It can be noted that
even in the presence of background disk activity SCHED_KURT still
performs much better than SCHED_FIFO. Another observation that can be
made from Table 4.2(b) is that as the
number of processes increases, the performance of SCHED_KURT
improves. This is probably because, as the number of processes
increase, the process generating the background disk activity does not
get scheduled in that often and therefore the duration of time during
which interrupts are disabled is reduced. Section 4.4.3 presents some data that would
enable application developers to gauge the effect, in terms of
scheduling distortions, of using certain subsystems of Linux in their
real-time applications.
| Number of Processes | 5c|% of events found within 'x' Microseconds of Expected Time | ||||
| 10 |
50 |
100 |
500 |
1000 |
|
| 1 | 62.22% | 99.59% | 99.80% | 100.00% | 100.00% |
| 10 | 3.36% | 15.12% | 30.25% | 99.34% | 99.72% |
| 20 | 1.44% | 7.45% | 14.76% | 66.48% | 99.78% |
| 30 | 1.21% | 6.10% | 11.03% | 46.61% | 73.11% |
| Number of Processes | 5c|% of events found within 'x' Microseconds of Expected Time | ||||
| 10 |
50 |
100 |
500 |
1000 |
|
| 1 | 99.80% | 100.00% | 100.00% | 100.00% | 100.00% |
| 10 | 96.66% | 99.47% | 99.67% | 100.00% | 100.00% |
| 20 | 99.34% | 99.67% | 99.91% | 100.00% | 100.00% |
| 30 | 97.33% | 99.54% | 99.80% | 99.99% | 100.00% |
| Number of Processes | 5c|% of events found within 'x' Microseconds of Expected Time | ||||
| 10 |
50 |
100 |
500 |
1000 |
|
| 1 | 99.80% | 99.80% | 100.00% | 100.00% | 100.00% |
| 10 | 96.61% | 99.72% | 99.87% | 100.00% | 100.00% |
| 20 | 99.23% | 99.40% | 99.61% | 99.80% | 100.00% |
| 30 | 99.52% | 99.75% | 99.89% | 100.00% | 100.00% |
| Number of Processes | 5c|% of events found within 'x' Microseconds of Expected Time | ||||
| 10 |
50 |
100 |
500 |
1000 |
|
| 1 | 50.20% | 81.33% | 83.74% | 98.99% | 99.59% |
| 10 | 5.50% | 17.19% | 30.70% | 84.40% | 92.25% |
| 20 | 2.00% | 7.07% | 13.21% | 55.50% | 84.20% |
| 30 | 1.09% | 4.71% | 8.65% | 36.47% | 64.32% |
| Number of Processes | 5c|% of events found within 'x' Microseconds of Expected Time | ||||
| 10 |
50 |
100 |
500 |
1000 |
|
| 1 | 40.16% | 60.04% | 73.89% | 97.60% | 99.60% |
| 10 | 68.43% | 81.68% | 88.61% | 97.17% | 99.17% |
| 20 | 87.94% | 95.17% | 97.37% | 99.52% | 99.66% |
| 30 | 56.26% | 60.74% | 65.27% | 82.01% | 95.72% |
Figure 4.7 shows the distribution of the differences between actual and desired inter-arrival times. As can be seen from the figure, ARTS was generally able to send packets with the correct inter-arrival times. Consider the experiment in which packets were sent once every 500 microseconds. The variance of the data was only 2.74 microseconds, and 99.7% of all packets arrived within 1% of the specified interval, and all packets arrived within 7% of the expected time.
| Category | Description | Min |
Mean |
Max |
| Console | Console (TTY) code | 0.51 | 0.86 | 12.13 |
| Network | Network device driver and protocol stack code | 0.35 | 0.64 | 2.74 |
| Disk | Disk device driver and buffer management code | 0.55 | 34.68 | 250.45 |
| Process | Scheduling and run-queue management code | 0.48 | 1.47 | 12.06 |
| Timer | Timer handling and time query code | 0.50 | 1.81 | 40.65 |
| Memory | Memory management code | 0.49 | 1.40 | 6.40 |
| Other | All other code that does not fit into the above categories. This mainly consists of routines interfacing with the PCI BIOS. | 1.48 | 2.34 | 7.55 |
Figures 4.8 and 4.9 shows histograms of the
percentage of events versus the duration of time during which
interrupts were disabled for that event. As can be seen in Figure 4.8(a), in
the console subsystem of Linux, disabling of interrupts occurs for
period less than 2
sec. The maximum duration was 12.13
sec. In
the network subsystem of Linux, most disabling of interrupts occurred
for less than 2.5
sec, and the maximum duration was 2.74
milliseconds. In the disk subsystem (Figure 4.8(c)) of
Linux, the maximum duration for which interrupts were disabled was
found to be 250
sec. As can be observed from these measurements,
most subsystems of Linux are reasonably suited for firm real-time
applications. To make Linux suitable for hard real-time applications
(or much more demanding firm real-time applications like ARTS),
significant modifications have to be made to existing Linux code. The
disk subsystem blocks interrupts for the longest period of time and
hence it has to be modified before it can be used for demanding firm
real-time applications like ARTS.
Since the duration during which interrupts are disabled is highly dependent on the hardware used in the system, these results cannot be generalized, and measurements should be made on the target hardware under a variety of conditions to gain a better overall understanding of the suitability of the Linux kernel for a firm real-time system on that hardware. It should also be noted that apart from the duration during which interrupts are blocked, the frequency at which they occur is also a factor on which the performance of real-time systems depends. Further experiments need to be conducted to gain a better understanding of the effect of blocked interrupts on the performance of the KURT scheduler.
With the advent of multimedia and high-speed networking, a new class of real-time application software has emerged. The applications in this class require many of the services offered by generic timesharing or soft real-time operating systems, but have timing constraints more characteristic of hard real-time systems. Since applications of this nature cannot be described as either hard or soft real-time, they can be called firm real-time applications. In addition to the unique computing requirements presented by firm-real time applications, many are developed and used under significant budgetary constraints, precluding the use of expensive commercial real-time operating systems to support them.
The KURT system described here is based on the well-known and widely-used Linux operating system that offers a broad range of services to the applications it runs. With innovative modifications to the way timer interrupts are handled in the kernel, it was possible to increase the temporal resolution of Linux, without causing a significant increase in system overhead. The new scheduler that KURT adds to Linux is able to achieve high levels of scheduling accuracy and can coexist with the other currently available schedulers. As evidenced by the ARTS application, which makes use of existing ATM networking services offered by Linux, the changes made in KURT do not prevent firm real-time applications from using the existing operating system services. For these reasons, KURT successfully transforms Linux into a firm real-time operating system. In addition to the ability to support firm real-time applications, KURT has the advantage of low cost, since it runs on commercial off-the-shelf hardware, and is based on free software.
As the evaluation shows, KURT was able to demonstrate firm real-time scheduling capabilities, although a few sources of scheduling distortion remain. It should be noted, however, that the frequency of distortions in KURT is low, and falls well within the limits appropriate to a number of interesting applications. Since most distortions occur beca