A Firm Real-Time System Implementation using Commercial Off-The-Shelf Hardware and Free Software




by




Balaji Srinivasan


B.Tech. (Chemical Engineering), Indian Institute of Technology, Bombay, India, 1993
M.S. (Chemical Engineering.), University of Kansas, Lawrence, Kansas, 1995




Submitted to the Department of Electrical Engineering and Computer Science and the Faculty of the Graduate School of the University of Kansas in partial fulfillment of the requirements for the degree of Master of Science




(1,0)2
Professor in Charge



(1,0)2


(1,0)2
Committee Members



(1,0)1.5
Date Thesis Accepted


© Copyright 1998 by Balaji Srinivasan
All Rights Reserved


This thesis is dedicated to my parents Lakshmi Srinivasan and K.S. Srinivasan

Acknowledgments

I would like to express my sincere gratitude to Dr. Douglas Niehaus, my advisor and committee chairman, for his invaluable guidance and advice throughout the course of this research work. I would also like to thank Dr. John Gauch and Dr. Susan Gauch for accepting to be in my committee.

I would like to take this opportunity to thank my group members Robert Hill, Matthew Peters, Shyam Pather, Sean House, Anil Gopinath, Sandeep Bhatt, and Shyam Murthy for making it a joy to come to work and for helping me whenever humanly possible. I would also like to thank Anil Gopinath and the other users of KURT for their important suggestions on how to improve the system. I would like to thank Shyam Pather for helping me complete some of the work that I started.

I am indebted to Raghavan Menon for clarifying all my doubts (even the really silly ones) about the Linux kernel. I hope at some point in my life I will be able to achieve his level of maturity of the thought process.

I would like to thank Linus Torvalds and the thousands of other volunteers around the world who contributed to the Linux kernel and made it the best operating system in the world. I would also like to thank the creators of LaTEXfor making it so simple to write a technical document.

A million thanks goes to the encouragement provided during these years by my friends Preeti, Prashant (NBD), JP, Vinai, and Jaisri. Without their motivation, I would never have completed my second Masters. I would also like to thank Vishal, Vrushali, Dilip, and Praveen for making my last year in Lawrence a memorable one.

This acknowledgement would be incomplete if I forgot to mention my whole family for their love, support, and encouragement throughout this course of study.


\begin{spacing}
{1}
\begin{abstract}

The age of multimedia and high-speed netwo...
 ...llustrate the performance and utility of this system.\end{abstract}\end{spacing}

Contents


List of Tables


List of Figures

ProgramList of Programs

1 Introduction

Until the fairly recent past, most real-time applications fit into two broad categories [10]. The first consists of those applications with soft real-time constraints which require timely execution of tasks at coarse temporal resolution, but do not produce catastrophic consequences if their deadlines are violated. The second group consists of hard real-time applications. These applications impose stringent timing demands on their operating systems, with disastrous, and sometimes fatal, consequences resulting from temporal errors.

Many commercial operating systems have been extended in a straightforward manner to meet the modest timing needs of soft real-time applications. Typically, when equipped with a comparatively fine-grain timing mechanism (usually with resolution no finer than milliseconds) and some type of priority scheduling algorithm, most conventional operating systems can support such applications. Providing this slightly finer than normal temporal resolution seldom precludes these extended operating systems from providing the services such as network, graphic display, and disk access that they typically offer. Hard real-time applications, in contrast, typically require specialized operating systems that run on very specific hardware, and come at a significant cost. In order to meet the strict timing requirements of hard real-time applications, these operating systems can provide only a very austere execution environment, offering few services such as network access, graphics display or file-system interfaces.

Recent developments in many areas, including multimedia and ATM networking, have spawned several applications that defy the binary hard/soft classification system. Multimedia video applications, for example, exhibit comparatively strict timing requirements typical of hard real-time applications since they need to maintain very precise display refresh rates. However, most hard real-time operating systems cannot, or are only beginning to, support such video applications because they require comparatively sophisticated operating system services such as access to video display adapters and sound cards. Our own research in the development and performance evaluation of ATM networks has included several applications with similar combinations of fine grain timing requirements, typical of hard real-time systems, with system service requirements typical of soft real-time systems. These applications, which fall neither into the hard nor soft real-time classes, might reasonably be called firm real-time applications. Also, since most existing real-time systems are priority based, there is no explicit support for QoS guarantees (in terms of CPU time) required by multimedia applications [30]. Furthermore, these applications have stringent cost constraints. At present, there is no system which adequately satisfies the combination of temporal, service, and cost constraints exhibited by this class of applications.

The KU Real-Time (KURT) system satisfies the constraints of many of the applications in the firm real-time category through innovative modification of free software running on commercial off-the-shelf (COTS) hardware. KURT is based on the Linux operating system. The choice of Linux, a freely available, popular Unix clone, as the base operating system was motivated by two important design goals.

Linux meets both of these requirements since it runs on commercial off-the-shelf hardware, and is distributed freely.

Linux had to be modified in several important ways to support firm real-time applications. First, using the scheme described in Chapter 3, the system's temporal resolution has been increased, without significantly increasing the overhead of the software clock. This UTIME patch to Linux increases the utility of Linux for soft real-time applications.

The KURT patch builds on top of UTIME, adding a number of features. KURT enables the system to switch between normal mode, in which it functions as a normal Linux system, and real-time mode, in which it executes only designated real-time processes according to an explicit schedule. When in real-time mode, real-time processes can still access any of the system services that are normally available to non-real-time processes. While many approaches to scheduling can be easily implemented, KURT employs explicit scheduling, as it was found to be the most appropriate for our current set of firm real-time applications.

KURT and closely related approaches are also being applied in our lab to a range of problems, including synchronized distributed real-time computation, the implementation of a software ATM switch with precise rate control, the implementation of a real-time Object Request Broker (ORB), and support for multimedia applications. One such firm real-time application considered in this work is the ATM Reference Traffic System (ARTS), described in detail in Chapter 4. This application can record and generate packet-level ATM traffic streams with microsecond accuracy. ARTS uses KURT's scheduling abilities in its traffic generation module by placing packet transmission events in an explicit KURT schedule. In order to use inexpensive COTS hardware for such an application, two key simplifying assumptions were made. First, traffic is controlled at the packet, and not at the cell level. Also, ARTS only records and reproduces the load, and not the data content of the traffic. These restrictions are perfectly acceptable in a wide range of performance evaluation applications and ARTS has become an important part of our ATM performance evaluation toolkit.

KURT fills an important gap in the existing spectrum of real-time systems by providing a low cost system with fine-grain temporal resolution, capable of satisfying firm real-time performance constraints. However, the system is new, still under development, and is as yet not appropriate for all applications. It schedules tasks with extremely high, yet not perfect, accuracy and reliability. The generic Linux services have not been adapted to real-time execution, and are thus the source of some scheduling distortion. However, in spite of this limitation KURT has provided excellent support to our firm real-time applications for which no other system was appropriate. KURT also provides a well-structured environment within which to investigate the source and reduction of the remaining scheduling distortion.

The next chapter discusses related work on real-time operating systems. Chapter 3 describes the implementation of the UTIME Linux patch and provides a performance analysis of the UTIME system. Chapter 4 describes the design and implementation of the KURT system on top of UTIME. This chapter also provides results comparing the scheduling efficiency of KURT with the existing First-In-First-Out scheduler of Linux. Chapter 5 discusses our conclusions and possible future work.

2 Related Work

 Real-time systems are defined as those systems in which the correctness of the system depends not only on the logical result of computation, but also on the time at which the results are produced. Since real-time systems have timing constraints, generating a schedule so that all real-time tasks meet their deadlines is of utmost importance. Scheduling involves the allocation of resources and time to tasks in such a way that they meet their deadlines. Scheduling has been perhaps the most widely researched topic within real-time systems [21].

In this chapter the characteristics of existing firm real-time applications will be presented first. Several scheduling paradigms will then be analyzed to find the best scheduling algorithm for these applications. Finally, several available commercial and research real-time operating systems will be discussed and the constraints on their application to the class of emerging real-time applications will be presented.

2.1 Characteristics of Firm Real-Time Applications

 Since multimedia applications form a major portion of the present day firm real-time applications, the characteristics of multimedia applications will be presented in this section. This will be helpful in understanding the properties that a scheduling algorithm should have to be able to support multimedia applications.

Owing to the periodicity of continuous-media data, the processing of audio and video data must occur at specific times. Moreover, the execution of these operations must be finished within certain deadlines to serve the real-time characteristics of these media. Owing to these characteristics of audio and video data, multimedia systems have to provide mechanisms to support time-related quality-of-service (QoS) guarantees [30].

Multimedia systems for single-user, and especially single-task, machines provide only simple mechanisms to provide time-based operations (e.g. for delaying program execution) but no real-time support. It is often argued that this approach is sufficient for these systems since the CPU is used mostly for the multimedia application during its run time. In those situations where the user has another time-consuming application running, it is easy for him to abandon that application. For multi-user and server systems, such as video-on-demand servers, this assumption is not valid. Other user applications, including other multimedia sessions, can disturb multimedia applications in such a way that the perceived QoS is not acceptable.

In addition to providing time-related QoS guarantees, the operating system should also provide some programming constructs for supporting multimedia applications. Since continuous-media applications are inherently periodic, it will be instructive to look at approaches taken by application programmers for writing periodic programs.

2.1.1 Loop/Delay Constructs

Loop/Delay constructs rely on mechanisms provided by most operating systems to delay the execution of a program for a certain amount of time. One such example is the UNIX sleep function. They are used as shown in Program 2.1 where f denotes some function for processing the data (e.g. displaying a frame on the screen).


 \begin{Program}
% latex2html id marker 333
\begin{verbatim}
1 get_time(t);
 2 wh...
 ... Periodic Programs]{A Periodic Program using Loop/Delay Constructs}\end{Program}

While such constructs avoid the influence of the actual execution time of the operations, in a preemptive multi-tasking environment this approach is incorrect since the statements are not executed atomically. Any preemption between the calculation of the length of the delay period and its use in the delay operations leads to time drifts. In this case the computed value is too large. Using a function (e.g. delay_until) which delays the execution until a specified point in time instead of delaying a specified amount of time can solve this problem. This approach is shown in Program 2.2.
 \begin{Program}
% latex2html id marker 339
\begin{verbatim}
1 get_time(t);
 2 wh...
 ...iodic Programs]{A Periodic Program using the Delay-Until Construct}\end{Program}

Although this approach leads to correct timing, it places the burden of time handling and temporal state information management on the application programmer. It is better, and safer, to have the underlying operating system do this work [30].

2.1.2 Periodic Processes

The periodic processes approach separates the functional and the temporal specification and execution of the application program. In the periodic process model the process calls a function schedule_me to wait for the beginning of the next period. This execution scheme is illustrated in Program 2.3.
 \begin{Program}
% latex2html id marker 348
\begin{verbatim}
1 p = create_new_pro...
 ...Programs]{A Periodic Program using the Periodic Processes Approach}\end{Program}

Note that the function schedule_me can be easily implemented using the delay_until primitive. However, with schedule_me the programmer does not have to worry about the calculation of times as this is done by the system software.

The weakest point of the delay, delay-until, and other user-level scheduling paradigms in general, is that they do not provide any information about timing and requested resource usage to the system scheduler. Hence, no schedulability tests can be performed to check whether the system is able to support all requests. This may lead to overload situations which may degrade the QoS to unacceptable levels. Therefore, these models can be seen as simple ad hoc methods that are not acceptable in general multimedia systems where mechanisms for reliable QoS provision have to be available. As will be shown in Chapter 4, KURT provides the schedule_me construct to application programs.

Aside from research operating systems, most commercially available systems do not offer the needed periodic process model, but only provide basic real-time mechanisms to implement these models. For example, the POSIX threads extension provides the possibility of associating scheduling attributes with each thread defining the scheduling policy and the thread's fixed priority [11]. The timing of the thread's operation, e.g. suspending the thread until the next period, has to be implemented by the programmer. The timer operations offered by the POSIX real-time extension can be used for that purpose, but due to the soft real-time nature of the POSIX real-time extension, this is usually insufficient for handling multimedia tasks [11]. Chapter 4 presents data supporting this argument.

2.2 Scheduling Paradigms

Predictability is one of the primary issues in real-time systems [21]. Hence, a schedulability analysis has to be performed to predict whether a task will meet its deadline. Based on whether a system performs schedulability analysis, and whether the analysis is done statically or dynamically, existing scheduling algorithms can be split into four categories [21]. They are% latex2html id marker 2750
\setcounter{footnote}{1}\fnsymbol{footnote} % latex2html id marker 2751
\setcounter{{}{f}\fnsymbol{{} % latex2html id marker 2752
\setcounter{{}{{}\fnsymbol{{} :

2.2.1 Static Table-Driven Scheduling

Static table-driven approaches are applicable to tasks that are periodic. Given task characteristics, a table is constructed (using various search heuristics or other algorithms), that identifies the start and completion times of each task and tasks are dispatched according to this table. It is possible to check for the schedulability of tasks under this scheme. For example, for periodic tasks, there exists a feasible schedule if and only if there exists a feasible schedule for the LCM (least common multiple) of the periods. Given a set of periodic tasks, a typical algorithm attempts to construct a schedule for the LCM of the task periods. At run time, the set of tasks is repeatedly executed according to this schedule every LCM [21].

The static table-driven approach is highly predictable. But it suffers from inflexibility, since any change to the tasks or their characteristics may require a complete overhaul of the table. Though this method is suitable to many hard real-time systems, it is not suited for the emerging firm real-time applications, because of the unpredictable arrival rate of real-time tasks in present day systems (ie. the user may start one or more than one multimedia application at arbitrary times).

2.2.2 Priority-Driven Preemptive Scheduling

Liu and Layland [18] were the first to produce formal results for priority-driven algorithms for real-time systems. They focussed on the problem of scheduling periodic tasks on a single processor and proposed two preemptive solutions. The first algorithm, called the Rate-Monotonic (RM) algorithm assigns static priorities to tasks based on their periods. It assigns higher priorities to tasks with shorter period. They showed that this scheme is optimal among static-priority schemes. They also analyzed the Earliest-Deadline-First (EDF), a dynamic priority-assignment problem. In this algorithm, the closer a task is to its deadline, the higher is its priority. Dynamic priority schemes are applicable to both periodic as well as aperiodic tasks, whereas the RM priority-assignment policy is only applicable to periodic tasks, or those that can be converted to periodic tasks. On the other hand, dynamic priority schemes have more run-time overhead as the priorities have to be recalculated at regular intervals and when a new task arrives. Several modifications to the RM algorithms exist for improving the performance of this algorithm for tasks with different characteristics [12,17,25]. Rajkumar et. al. studied the effect of priority inversion on priority driven approaches [20]. Priority inversion is said to occur when a low priority process holds a resource needed by a high priority process. This would lead to a medium priority process being allowed to run and thereby lead to deadlines being missed by the higher priority process. Priority inheritance is one solution for this problem. In this scheme, if a low priority process holds a resource that a higher priority process is waiting for, the low priority process inherits the priority of the higher priority process.

2.2.3 Dynamic Planning-Based Scheduling

Dynamic planning-based schedulers provide the flexibility of dynamic approaches with some of the predictability of approaches that check for feasibility. In this approach, after a task arrives, but before its execution begins, an attempt is made to create a schedule that contains the previously guaranteed tasks as well as the new arrival. This approach provides for predictability with respect to individual arrivals and for achieving admission control [21].

Most next generation firm real-time applications like multimedia are periodic and need explicit guarantees from the operating system. Dynamic planning-based schedulers can have admission control mechanisms that prevent the overload of the CPU. Hence schedulers belonging to this class hold the most promise for scheduling the next generation firm real-time applications.

2.2.4 Dynamic Best Effort Scheduling

If a purely priority-driven preemptive approach is used without any planning, a task could be preempted at any time during its execution. In this case, until the deadline arrives or until the task finishes, whichever comes first, it is not known whether a timing constraint will be met. This is the major disadvantage of dynamic best effort approaches and hence they cannot be used to schedule firm or hard real-time applications.

2.2.4.1 Scheduling in Non-Real-Time Systems

Traditionally, a priority-based preemptive scheduling approach has been used in non-real-time systems. In most non-real-time systems, responsiveness is of more importance than throughput or meeting a deadline. For example, UNIX was originally designed as a time-sharing system in which average performance of the system is optimized [23]. Most UNIX operating systems have schedulers that are intended to balance response time and throughput and to ensure fair distribution of CPU time among processes [16]. This goal is often at odds with the low latency and high predictability requirements of real-time systems. The POSIX.1b-1993 [11] standard for real-time Unices try to correct some of these shortcomings by providing additional schedulers like the First-In-First-Out and the Round-Robin schedulers.

As noted in Section 2.1, continuous media applications are periodic and have explicit timing requirements. Priority driven approaches are either insufficient, in case there is no admission control mechanism, or unnecessarily complicated, when it takes into account priority inversions and priority inheritance. Hence a planning based scheduling approach might be necessary for supporting multimedia and other present day firm real-time applications.

2.3 Existing Real-Time Systems

Existing real-time systems can be grossly divided into two categories:

1.
Systems implemented using somewhat stripped down and optimized versions of conventional timesharing systems, and

2.
Systems starting with a clean slate, focusing on predictability as a first class design feature.

The major design challenge for a operating system supporting firm real-time applications is to strike an appropriate balance between system services and predictability. Many real-time UNIX operating systems, have been developed under the first approach [10]. There are also a large number of proprietary real-time systems, over 70 extant commercial real-time kernels, that take a similar design approach, including QNX [4], OS-9 [3], and LynxOS [1]. Research projects have also taken this approach, the most obvious example being Real-Time Mach, which is an extension of Mach [29]. There are other research operating systems like Spring [28], Maruti [24], and YARTOS [14] which take the second approach and design an operating system with predictability as a first class design feature.

2.3.1 Commercial Real-Time Systems

LynxOS, developed by Lynx Real-Time Systems, is an interesting example of the first class of systems. It has gone to greater lengths of adaptation than has been typical of this class in the past, and is POSIX-compliant [1]. It offers many of the features boasted by conventional, non-real-time operating systems, including demand-paged virtual memory, a graphical user interface, industry standard networking and a complete set of development tools. Due to its full POSIX compliance, applications written in C on conventional POSIX-compliant operating systems can be ported to LynxOS with relative ease. In addition, LynxOS ships with a complete suite of debugging tools to ease the task of application developers.

LynxOS uses a patented interrupt handling technology which makes use of kernel threads to decrease scheduling latency and distortion caused by heavy interrupt load [2]. Without this feature, several back-to-back interrupts running at higher priority than existing real-time tasks can cause significant scheduling distortion. The claims made for LynxOS' combination of conventional operating system features with the ability to meet stringent real-time performance constraints is impressive. As a set of modifications to the Linux system, the KURT approach is similar in large measure to that of LynxOS. However, use of free software and low cost COTS hardware provides a significant cost advantage over proprietary LynxOS. Also, the UTIME approach to event scheduling provides an important advantage in temporal granularity, since it appears, from references obtained so far, that LynxOS provides only the conventional 10 millisecond temporal resolution. Further, KURT's support for explicit event scheduling provides an alternative to the conventional priority scheduling of LynxOS and systems like it.

VxWorks [5] is a commercial real-time operating system that is not compatible with UNIX. Though VxWorks provides some of the POSIX interface functions, most of the VxWorks API is proprietary. In VxWorks the kernel and tasks run in one address space. This allows task switching to be very fast and eliminates the need for system call traps. A run-time linker allows dynamic loading of both tasks and system modules. An interactive shell with C-like syntax can be used to examine and modify variables, evaluate expressions, call functions and perform simple debugging. These features make development of applications for VxWorks somewhat easier. But since the same address space is shared by both the kernel and the real-time tasks, errors in one real-time task can easily affect other tasks as well as the whole system. Hence, VxWorks is not suitable for many present day firm real-time applications due to the lack of protection and other support common to conventional systems.

There is a current trend to use Windows NT for real-time processing. This would be the most viable scenario for present day firm real-time applications as Windows NT is the most popular desktop operating system and as it has a multitude of available applications. But Windows NT kernel is incapable of real-time processing [6]. The Win32 API is not designed for real-time since interrupt notification can be delayed for an unpredictable amount of time and memory preallocation is a problem. Several companies provide solutions for some of these problems. The system provided by LP-Elektronik uses non-maskable interrupts (NMI) generated by a special hardware device [7]. This approach guarantees timing independence of Windows since NMI interrupt requests are never disabled. Since multimedia applications are targeted towards systems that are inexpensive and widely used, and since this approach requires specialized hardware which is expensive, it is not suitable for running multimedia applications. The INtime product of Radisys Inc. modifies NT's Hardware Abstraction Layer (HAL) to trap Windows' attempt to disable interrupts or reset the clock [8]. This is an approach similar to that taken in RT-Linux [31]. As shown in the next section, this is not a viable approach to providing desktop computers with firm real-time capabilities.

2.3.2 Research Real-Time Systems

Real-Time Linux (RT-Linux) is a research operating system that has been developed at New Mexico Tech [9,31]. Though KURT and RT-Linux use Linux as the base operating system, the two systems take a fundamentally different approach to providing real-time abilities to Linux. Instead of improving the abilities of Linux directly, RT-Linux implements a small, real-time kernel which runs a non-real-time Linux kernel as a completely preempt-able, low-priority task. The RT-Linux model states that real-time applications can be split into real-time and non-real-time parts running under the separate environments. Communication between the real-time and non-real-time portions of a system is supported by lock-free queues and shared memory.

While in some respects this implementation is able to combine hard real-time performance with a wide range of services, it is applicable only to a limited set of applications that can be conveniently split into co-operating sets of real-time tasks requiring almost no services, and tasks that make use of operating system services but have almost no timing constraints. There are many important applications, including the ARTS system described in Chapter 4, that cannot be implemented this way. For example, if ARTS were implemented under RT-Linux, then its real-time module would have to be able to send an ATM packet into the network. This is not possible, however, because access to the networking subsystem of Linux is not supported by RT-Linux. Another example of an application that does not fit the RT-Linux model is a multimedia viewer. Usually a multimedia viewer reads a multimedia file (ie. mpeg, avi etc.) and displays it. The display of the frames in the multimedia stream have to occur at specific rates. Using the RT-Linux model it is not clear how this could be accomplished, as RT-Linux does not provide any access to video adapters, filesystems etc.

Other research systems take a substantially different approach. The Spring system developed at University of Massachusetts-Amherst took predictability as a first class design criterion, providing system services in a bottom-up manner only as implementations consistent with the predictability goals could be created [19,27,28]. One purpose of the project was to investigate the limits on predictability imposed by conventional hardware, but maximizing predictability clearly minimized services. Further, the only hardware supported by Spring is too costly and not always appropriate for the applications currently being addressed.

The Maruti system developed at the University of Maryland is not constrained by the same hardware dependencies as Spring [24]. In its most basic form, Maruti consists of a library of core routines needed to support the real-time demands of its applications. These core library routines were originally implemented on top of Unix, and were thus subject to the temporal resolution of the underlying system. Subsequent targeting on Mach is subject to the temporal granularity of Mach, and is similar to KURT in the sense that both help add real-time semantics to an existing system. Maruti, like KURT, represents the execution of its applications with an explicit schedule. Maruti is more sophisticated than KURT in many respects, and thus imposes more stringent constraints on the semantics of the applications it supports. Similar considerations of restrictions on supported hardware, cost, availability or the tradeoff between services and predictability apply to other well known research systems, such as MARS [22] and HARTS [26].

YARTOS is an experimental real-time operating system kernel that provides guaranteed response times to tasks [13,14,15]. The YARTOS scheduling model is composed of two basic abstractions: tasks and resources. A task is an independent thread of control that is invoked at sporadic intervals. Each invocation of a task must complete execution before a well defined deadline. The invocation interval and deadline of a task is derived from constructs in the higher level programming model. During the course of execution, a task may require access to a number of resources. A resource is a software object that encapsulates shared data and exports a set of procedures for accessing and manipulating the data. The scheduling algorithm used is an integration of a synchronization scheme for access to shared resources with the earliest deadline first processor scheduling algorithm. In YARTOS, the deadlines of the task are manipulated so that there is no contention for a shared resource and hence it ensures mutual exclusion on resource operations. This dynamic manipulation of a task's deadline is similar to the concept of priority ceiling in priority inheritance protocols [14]. Jeffay et. al. implemented a desktop video-conferencing utility using YARTOS [15]. They state that the YARTOS programming model has not supported the conferencing application seamlessly. Writing multimedia applications for YARTOS seems to increase the development complexity of the application and hence YARTOS is probably not suited for supporting present day firm real-time applications.

2.4 Limitations of Existing ATM Analyzers

Due to the proliferation of high speed networks, such as ATM, tools that can analyze the performance of network components have assumed great importance. The ATM Reference Traffic System (ARTS), described in detail in Chapter 4 can record and generate packet-level ATM traffic streams with microsecond accuracy. ARTS uses KURT's scheduling abilities in its traffic generation module by placing packet transmission events in an explicit KURT schedule. The motivation for the ARTS system was cost, as well as flexibility in conducting performance evaluation studies of ATM networks. Network analyzers are the obvious candidates for such a role, and are made by many companies, including: Hewlett-Packard, Adcom, Net2Net, and Radcom. These devices are, however, generally targeted at cell level behavior requiring significantly more expensive hardware and providing more restricted support for recording and reproducing network traffic. These devices commonly cost tens and even hundreds of thousands of dollars. Further, they often only provide restricted support for generating and recording traffic which is inadequate for the performance evaluation application we were considering. In contrast, the UTIME and KURT patches to Linux enabled the implementation of a system capable of sending and receiving essentially unlimited sequences of packets. For example, an ARTS system with 128 Megabytes of main memory can send or receive schedules of roughly 3 million packets without using the system disk during real-time execution.

3 UTIME: On Demand Microsecond Resolution Timers

  In order to implement KURT, several changes had to be made to Linux. First, its temporal granularity had to be refined, since many firm real-time applications need to track time at resolutions higher than that provided by generic Linux. In the ARTS application, for example, scheduling resolution on the order of 10 microseconds is required. This makes the standard Linux software clock resolution of 10 milliseconds (1 millisecond on the DEC Alpha architecture) unacceptable. In this chapter, the design of the UTIME system which allows for microsecond resolution timers is outlined. Performance measurements of the UTIME system are also presented.

3.1 Asynchronous Event Handling in Standard Linux

In Linux, any subsystem of the kernel that needs asynchronous notification creates a timer and adds it to a list maintained by the kernel. The structure of a timer is given in Program 3.1. As can be seen in Program 3.1, each timer has an expires field and a function (fn) field. The expires field specifies when the asynchronous notification represented by the timer is to happen and the fn field specifies the function that should be called when the timer expires. Each timer also has a data value that is passed to the function when it is called. The timers are kept in a doubly linked list sorted in ascending order of time% latex2html id marker 2753
\setcounter{footnote}{2}\fnsymbol{footnote}. Functions are provided to add and delete timers from this list.


 \begin{Program}
% latex2html id marker 442
\begin{verbatim}
1 struct timer {
2 s...
 ...xpires;
6 }\end{verbatim}\caption{Structure of Linux Kernel Timers}\end{Program}

In generic Linux, a hardware timer chip is programmed at boot time to interrupt the CPU at a fixed periodic rate. Each time the CPU is interrupted, the kernel updates its software clock to indicate the passing of another tick. Thus the software clock is nothing more than a running count of the number of clock ticks, or hardware timer interrupts, that have passed since the kernel was started. The length of time between each software clock tick is referred to as a jiffy. A simplified version of the timer interrupt service routine (ISR) is shown in Program 3.2. In the timer ISR, after updating the jiffies counter, the kernel calls run_timer_list (Program 3.3). The run_timer_list routine is the timer dispatch function which checks if there are any scheduled timers that are due for processing, and calls the appropriate function if necessary. Finally, the kernel performs other activities such as updating process times, scheduling a new process if necessary in the do_other_stuff routine.


 \begin{Program}
% latex2html id marker 453
\begin{verbatim}
1 void timer_interru...
 ...Routine]{Pseudo-code for the Linux Timer Interrupt Service Routine}\end{Program}


 \begin{Program}
% latex2html id marker 457
\begin{verbatim}
1 void run_timer_lis...
 ...mer Dispatch Function]{Pseudo-code for the Timer Dispatch Function}\end{Program}

Since the kernel only checks its list of pending scheduled events at each timer interrupt, the length of time between interrupts (or the length of each jiffy) is the smallest meaningful unit of time with which events can be scheduled. Thus the length of a jiffy determines the kernel's timing resolution.

3.1.1 Implementation of UTIME

In order to increase the resolution of the Linux software clock, the basic mechanism by which it is implemented had to be altered. A naive approach to increasing the timing resolution would be to simply program the timer chip to interrupt the CPU at a higher frequency. This would reduce the length of a jiffy, and therefore increase the software clock resolution. However, since the timer chip would regularly interrupt the CPU irrespective of whether or not any events were scheduled to occur at the time of each interrupt, this approach leads to excessive, unsupportable overhead.

A better solution to the problem of increasing the kernel's timer resolution is based on the following observation: there is a crucial difference between the temporal resolution and the frequency of events. In other words, even though firm real-time applications schedule events with microsecond resolution deadlines, events are rarely scheduled to occur every microsecond. To support such applications, what is needed is a mechanism by which timer interrupts are allowed to occur at any microsecond, not necessarily every microsecond. In light of this observation, UTIME departs from the common practice of using the timer chip to interrupt the CPU at a fixed rate. Instead, the timer chip is programmed to interrupt the CPU in time to process the earliest scheduled event.

In order to achieve microsecond accuracy, a new member had to be added to the timer structure. This member (usec) specifies the microsecond within the jiffy at which the timer will expire. Two global variables were also added: jiffies_u, which maintains the microsecond within the current jiffy, and jiffies_intr, the use of which will be explained later in this section.

Since interrupts no longer happen periodically, a function (update_time) that keeps track of elapsed time was added. The update_time routine makes use of the time stamp counter (TSC) that is available on most modern CPUs to maintain time% latex2html id marker 2754
\setcounter{footnote}{3}\fnsymbol{footnote}. This function calculates the time elapsed since the last time this function was called. It then updates the global variables jiffies and jiffies_u to reflect the elapsed time. The pseudo-code for update_time is shown in Program 3.4.


 \begin{Program}
% latex2html id marker 477
\begin{verbatim}
1 void update_time()...
 ...intaining Time in UTIME]{Pseudo-code for Maintaining Time in UTIME}\end{Program}

The pseudo-code for the modified timer ISR is shown in Figure 3.5. When an interrupt is serviced, the timer ISR first calls update_time to update the software clock. Then the time till the next scheduled event is found, and the timer chip is programmed to interrupt the CPU at that time. Finally, the modified run_timer_list (Program 3.6) is called to service all the events that have expired.


 \begin{Program}
% latex2html id marker 484
\begin{verbatim}
1 void timer_interru...
 ... ISR]{Pseudo-code for the Modified Timer Interrupt Service Routine}\end{Program}


 \begin{Program}
% latex2html id marker 488
\begin{verbatim}
1 void run_timer_lis...
 ...tch Function]{Pseudo-code for the Modified Timer Dispatch Function}\end{Program}

Thus the CPU is interrupted only when it needs to be, and not at regular intervals. In this simple form, our scheme eliminates the timer interrupts at which no events are due for processing. Since the timer chip can be programmed with microsecond accuracy, we effectively achieve microsecond resolution.

In this simple version UTIME, the jiffy counter is no longer incremented regularly, as is the case in standard Linux. Indeed the timer interrupts, at which jiffy counter updates can be made, often occur with little or no regularity. This presents a problem since there are several kernel subsystems that assume both implicitly and explicitly that the jiffy counter is being updated at regular intervals. To keep these systems working, a way to reproduce this periodic heartbeat% latex2html id marker 2755
\setcounter{footnote}{4}\fnsymbol{footnote} in the absence of regular, periodic interrupts had to be devised. Hence, when reprogramming the timer chip to trigger the next interrupt in time to process the next event, a check is made to see if that event occurs after the next jiffy counter update would have occurred. If this is found to be the case, then an interrupt is scheduled to occur just in time to increment the jiffy counter (Program 3.7). This introduces some interrupts at which no events are due for processing, but keeps all kernel subsystems, including the software clock, working as normal. If during any call to update_time (Program 3.4), the accumulated jiffies_u crosses the number of microseconds in one jiffy, the global variable jiffies_intr is incremented. As can be seen in Program 3.5, the timer ISR performs its normal functions only when a jiffy boundary is crossed. Though this increases the jitter of jiffy events, this jitter is still less than the resolution required by the systems assuming the periodic heartbeat.

The implications of UTIME, in terms of system overhead and scheduling effectiveness, are discussed in the next section.


 \begin{Program}
% latex2html id marker 501
\begin{verbatim}
1 unsigned long time...
 ...seudo-code for Calculating the Time until the Next Scheduled Event}\end{Program}

3.2 Evaluation of UTIME

  Most schemes that promise to increase the temporal resolution of an operating system do so by increasing interrupt frequency. This raises system overhead to unacceptable levels, even when no events are explicitly scheduled. By changing the fundamental mechanism supporting the Linux software clock, the UTIME system avoids this overhead increase, while achieving significant gains in temporal resolution. The gains in resolution are evident from the ability of the KURT scheduler (which is based on UTIME) to schedule events with microsecond accuracy. This is discussed further in Sections 4.4 and 4.4.2. In this section, we will discuss the tests performed to measure the overhead and accuracy of UTIME.

Most of the UTIME implementation centers around changes to the processing of timer interrupts. For this reason, comparing the processing time of a timer interrupt under UTIME with that of a timer interrupt in standard Linux, gives a quantitative measure of the overhead resulting from UTIME. To measure the processing time of a timer interrupt, we inserted code into the timer interrupt handler that would take time-stamps upon entry and upon exit from the ISR. Time-stamps were obtained by reading the time-stamp counter (TSC), which increments at the clock rate of the Intel Pentium Pro Processor used in the test system% latex2html id marker 2757
\setcounter{footnote}{5}\fnsymbol{footnote}.

Timer interrupt processing times were measured for three cases: standard Linux with no timer modifications, Linux with the UTIME modifications but without any events scheduled at microsecond resolution (i.e. behaving like a normal Linux system), and Linux with the UTIME system with several events scheduled at microsecond resolution. The processing times for around 8000 timer interrupts were measured for each of these three cases and Figure 3.1 shows the processing time distributions. Note that this figure illustrates the percentage of interrupts with overhead less than or equal to a given value on the horizontal axis. As can be seen from the figure, without UTIME, most timer interrupts take approximately 1 microsecond to process, and virtually all take less than 5 microseconds. With UTIME in place, most timer interrupts take approximately 15 microseconds to process, regardless of whether or not any microsecond events are scheduled. It should be noted that the timer interrupt processing time, as described here, does not include the time required to process any scheduled events: the timer interrupt handler simply checks which events are due for processing, and marks them as such.


  
Figure 3.1: Effect of UTIME on Timer Interrupt Processing Time
\begin{figure}
\begin{center}

\epsfig {file=figs/utime_overhead.eps, width=5.5in}\end{center}\end{figure}

The exact value of the timer overhead is less important than the fact that it is greater under UTIME than under standard Linux for two main reasons. Since timer interrupts do not occur on any regular interval, there is some amount of extra computation required to determine the new value to load into the timer chip to trigger the next interrupt. This is not the case in standard Linux because the timer chip is simply programmed, during system boot-up, to interrupt the CPU at a fixed rate. A large portion of the processing time increase, however, is because the timer chip on the systems we used required approximately 4 microseconds of latency before its inputs are latched. A more appropriately designed timer chip would therefore decrease the amount of overhead represented by UTIME. Even though the actual time required to process timer interrupts using UTIME increases, this penalty is incurred at most once every 10 milliseconds when no microsecond events are scheduled. Thus, the overhead presented by the UTIME modification is only about 0.15% , with an additional 15 microseconds per event explicitly scheduled with microsecond resolution.

To test the effect of UTIME on soft real-time applications, we wrote a dummy application that would try to run once every 10 milliseconds. The pseudo-code for this application is shown in Program 3.8. The application first sets its scheduler to SCHED_FIFO so that it gets preferentially scheduled over other processes% latex2html id marker 2758
\setcounter{footnote}{6}\fnsymbol{footnote}. Then it enters a loop, collects a time-stamp, and finds the difference between the current timestamp and the previous one. It then sleeps for the remainder of 10 ms. The time difference between consecutive loops is plotted in the UTIME case in Figure 3.2(a). As can be seen from Figure 3.2(a), the process actually gets scheduled every 10 ms. Whereas in the non-UTIME case% latex2html id marker 2759
\setcounter{footnote}{7}\fnsymbol{footnote} (Figure 3.2(b)) the process ends up getting scheduled every 20 ms. Since the POSIX standard for select states that the process should sleep for at least the time stated in the select call, the Linux kernel adds 1 jiffy (10 ms) to the time provided in the select call. As can be seen from these two figures, UTIME alone increases the utility of Linux for soft real-time applications.


 \begin{Program}
% latex2html id marker 521
\begin{verbatim}
1 void main()
2 {
3 ...
 ...udo-code for Program Testing Effect of UTIME on Process Scheduling}\end{Program}


    
Figure 3.2: Time Difference between Consecutive Loops with and without UTIME
\begin{figure}
\begin{center}
\subfigure[With UTIME]{
\epsfig {file=figs/with_ut...
 ...fig {file=figs/without_utime.eps, width=0.5\textwidth}
}\end{center}\end{figure}

Finally Figure 3.3 shows the amount of time elapsed between when a microsecond resolution event was scheduled, and when the function associated with the timer (fn in Program 3.1) was called. As can be seen from the figure, the timer handling function for most events are called within 50 $\mu$sec of the scheduled time. Hence this is the actual minimum resolution with which events in UTIME can be scheduled. This result has some implications on the scheduling of firm real-time events and will be discussed further in Chapter 4.


  
Figure 3.3: Difference between Scheduled Time and Timer Dispatch
\begin{figure}
\begin{center}

\epsfig {file=figs/utime_diff.eps, width=5.5in}\end{center}\end{figure}

4 KURT: KU Real-Time Linux

 As discussed earlier, in addition to increasing the temporal resolution of Linux, a new scheduling algorithm had to be implemented for firm real-time applications. Standard Linux is designed to schedule processes so that they each receive a fair share of CPU time. This is a worthwhile goal in a normal timesharing system, but is not well suited to real-time applications. Hence, a new scheduling policy had to be implemented that would give firm real-time tasks% latex2html id marker 2760
\setcounter{footnote}{8}\fnsymbol{footnote} the priority needed to meet their deadlines. This chapter discusses the architecture of the KURT system and presents some performance measurements of it.

4.1 Design of KURT

KURT is designed in a modular fashion: the scheduling mechanism and the functionality of the real-time system are independent of each other. KURT consists of two parts:

4.1.1 The KURT Core

 The KURT core is responsible for scheduling real-time events and invoking the appropriate RTMod at the appropriate time. The KURT scheduler is an explicit plan scheduler, which means that applications need to specify explicitly the times at which real-time events (i.e. the invocation of RTMods) are to occur. These times are specified in advance, and stored in a schedule file, which also list the data that should be passed to each RTMod at invocation. The format of the schedule file is explained in Section 4.1.4. The KURT core adds timers to schedule each real-time event in the schedule file and calls the appropriate RTMod when the event expires. As can be seen in Figure 3.3, it takes anywhere from 20 to 50$\mu$sec between the time when the interrupt is actually programmed to occur and when the timer handling function (fn in Program 3.1) is called. Hence, the KURT core programs the timer chip to interrupt the CPU 50$\mu$sec earlier then when the real-time event is actually due. It then uses the Pentium Time Stamp Counter (TSC) to spin loop till the actual time of the event. Though this scheme increases the minimum inter-event separation, it reduces the timing distortions that would have occurred otherwise.

Since Linux is a time sharing operating system, many of its components are not tuned to perform well under real-time constraints. As shown in Section 4.4.3, the filesystem code and the device driver code for the disk drive may block interrupts for up to 250$\mu$sec. Hence, it might be advantageous to turn off these components when performing real-time tasks. KURT provides a way in which this can be achieved. The KURT kernel can run in one of two modes. In the normal mode, all processes are allowed to run. Since all processes are allowed to run in the normal mode, some process may be using kernel services (like the disk) that might block interrupts for an extended period of time. To avoid this, the kernel can be switched to a real-time mode in which only processes that are marked as real-time processes are allowed to run. Since the scheduling occurs within a focussed set of processes, it is possible for the developer to make sure that those processes do not use any kernel service that might block interrupts and distort the performance of the KURT scheduler beyond acceptable limits. The KURT core provides system calls that allow a user process to toggle the kernel between real-time mode and normal mode (Section 4.1.3). The steps required to mark processes as real-time is explained in Section 4.3.

4.1.2 Real-Time Modules

 Real-time modules are loadable kernel modules that implement the functionality of a specific real-time task. Real-time modules execute in kernel mode and can therefore access devices, as well as other parts of the kernel, in ways that would not be permitted to normal user-level processes. For example, in the ARTS RTMod the module directly accesses the ATM network interface card (NIC) to transmit packets at accurate times. Since the real-time modules run in kernel mode, the overhead incurred in context switching to a specific process is avoided. Hence, writing real-time modules to do a certain task is much more efficient than writing real-time user mode processes to do the same task. In this respect, KURT is similar to VxWorks in which the real-time tasks and the kernel execute in the same address space [5].

The Process RTMod is a built-in real-time module that switches context to a specific user mode process when invoked. Through the Process RTMod, KURT provides the functionality of writing real-time processes that can be explicitly scheduled. The use of real-time processes increase the context-switch and system call overhead and latency, but simplify the problem of application development in some cases. There is less risk associated with using real-time processes, as a bug in one real-time process cannot affect the rest of the system. Further details on the interface between the KURT core and RTMods is explained in Section 4.2.

Figure 4.1 shows the relationship between the KURT core, the RTMods, and user programs.


  
Figure 4.1: Architecture of the KURT system
\begin{figure}
\begin{center}

\epsfig {file=figs/kurt_design.eps, width=5.5in}\end{center}\end{figure}

4.1.3 System Calls in the KURT Core

 The KURT core provides the following system calls to schedule real-time events, and to toggle the kernel between the real-time mode and the normal mode:

4.1.3.1 switch_to_rt

This system call allows a user process to switch the kernel to real-time mode. The prototype of this system call is as follows:
    int switch_to_rt(int timer_chip_mode, unsigned long period,
                     int rt_mode, char *cmdline, int length);
In the above prototype:

4.1.3.2 switch_to_normal

This system call switches the kernel to the normal mode. In the normal mode, the kernel behaves like a normal Linux kernel, and all processes are allowed to run. The prototype of this system call is as follows:
    int switch_to_normal(int force);

The force parameter specifies whether the kernel has to be immediately switched into the normal mode. If this parameter is set to 1, then all existing scheduled events are canceled, and the kernel is immediately switched to the normal mode. If this parameter is set to 0, then the system call blocks till all the scheduled real-time events have been serviced.

4.1.3.3 rtmod_cmd

This system call allows a user process to directly communicate with a real-time module. This is necessary in case a user program has to exchange data with a specific RTMod. The rtmod_cmd routine is similar to the ioctl call in UNIX. The prototype of this call is as follows:
    int rtmod_cmd(int rtmod_id, int command, void *buffer,
                  int size);

In the above prototype:

4.1.3.4 rt_schedule_events

This system call allows a user process to schedule real-time events. The prototype of this call is as follows:
    int rt_schedule_events(struct timeval *time, int type, 
                           int num_times, char *filename);
In the above prototype:

4.1.4 Format of the Schedule File

 The schedule file is a binary representation of the set of real-time events that have to be executed by KURT. The structure of an event is shown in Program 4.1.


 \begin{Program}
% latex2html id marker 679
\begin{verbatim}
1 struct rt_event {
...
 ...Event Structure]{Structure of a Real-Time Event in a Schedule File}\end{Program}

As can be seen from Program 4.1, each event has a module identifier module_id associated with it. This specifies which RTMod has to be invoked for the event. The module_ids are assigned to modules when they register with the KURT core (see Section 4.2). Each event also has 12 bytes of module specific data that gets passed to the module when the event expires. The time when the event has to occur is specified as an offset the start of the schedule. The schedule file has a binary representation of this structure for each event.

4.2 Interface Between Real-Time Modules and the KURT Core

 The KURT core provides functions that are used to register/unregister each RTMod.

4.2.1 register_rtmod

RTMods issue this call to register with the KURT core. The prototype of this function is as follows:
    int register_rtmod(struct rtmod_info *info);

register_rtmod returns the module_id for the real-time module. This identifier is used when scheduling events for this RTMod. A user can obtain a mapping of module_id and the name of the module through the get_rtmod_num library call. The prototype of this call is as follows:

    int get_rtmod_num(char *rtmod_name);

The return value from this library call is the module_id of the real-time module that has the name rtmod_name. If the specified real-time module is not present then the return value is -1.

The rtmod_info structure (Program 4.2) provides information about the module to the KURT core.


 \begin{Program}
% latex2html id marker 694
\begin{verbatim}
1 struct rtmod_info ...
 ...Core]{Information passed by each real-time module to the KURT core}\end{Program}

As can be seen in Program 4.2, each module passes its name along with a set of function pointers to the KURT core when registering. These functions are called by the KURT core at the following times:

Each RTMod should at least provide the name and event_handler function to the KURT core. The other elements of rtmod_info can be set to NULL.

4.2.2 unregister_rtmod

This function allows a previously registered RTMod to unregister itself, so that the KURT core cannot schedule events for it. The prototype of this call is as follows:
    int unregister_rtmod(char *name);

The name used in the register_rtmod call is passed to this function to unregister this RTMod. The KURT core makes sure that no more events are scheduled for this RTMod.

4.3 Process Module

 The Process RTMod is a built-in real-time kernel module whose function is simply to switch context to a specified user process. Through the Process RTMod, KURT provides the functionality of writing real-time processes that can be explicitly scheduled. The use of real-time processes increase the context-switch and system call overhead and latency, but simplify the problem of application development in some cases. The process module is special in that it cannot be unregistered.

4.3.1 The SCHED_KURT Scheduling Policy

The normal Linux scheduler is tuned to minimize response time and hence is not ideal for real-time systems. Though processes in Linux can be scheduled using the FIFO and Round-Robin schedulers, these schedulers are meant to support soft real-time processes. To allow firm real-time scheduling of user processes a new scheduling policy had to be implemented. This scheduling policy (SCHED_KURT) is implemented as an addition to the existing policies like SCHED_FIFO, SCHED_RR, and SCHED_OTHER. Processes that use the SCHED_KURT scheduling policy can be explicitly scheduled. The schedule file for a process-centric application would simply specify the times at which the process module should allow certain user processes to run.

Explicit scheduling using schedule files as described above works well for many real-time applications. However, some applications have periodic execution flows, for example, multimedia applications exhibit this characteristic, wherein a certain section of code is executed repeatedly at a fixed time interval. For these applications, specifying explicit event times does not make sense. A more natural approach is to allow these applications to specify the length of one period, and then have KURT make them execute the appropriate piece of code once every period. The system calls that allow a user process to achieve this are explained in the Section 4.3.3. SCHED_KURT processes that are not explicitly scheduled, are scheduled according to a Round-Robin policy based on the priority of the process. Even though these processes are scheduled using a Round-Robin policy, they still have higher priority than other processes using the standard Round-Robin scheduler. The SCHED_KURT policy is different from other scheduling policies in the Linux kernel, in the respect that it comes into effect only when the kernel is in real-time mode. When the kernel is in normal mode these processes are scheduled according to their normal scheduling policy.

4.3.2 Characteristics of SCHED_KURT Processes

 Each SCHED_KURT process has a structure that specifies its characteristics. This structure is used in the set_rtparams system call (see Section 4.3.3), and is illustrated in Program 4.3.
 \begin{Program}
% latex2html id marker 731
\begin{verbatim}
1 struct rtparams {
...
 ... Processes]{Structure Specifying Characteristics of KURT Processes}\end{Program}
The various members of the rtparams structure have the following meaning:

4.3.3 System Calls in the Process Module

 The following system calls are provided as part of the process module, to allow a user process to register/unregister as a KURT process, and to schedule themselves.

4.3.3.1 set_rtparams

This system call allows a user process to register/unregister itself as a SCHED_KURT process. The prototype of this system call is as follows:
    int set_rtparams(int pid, int policy, struct rtparams *param);
In the above prototype:

4.3.3.2 get_rtparams

This system call returns the SCHED_KURT characteristics of a particular process. The prototype of this call is as follows:
    int get_rtparams(int pid, struct rtparams *params);

4.3.3.3 get_rtstats

This system call returns the statistics that is collected for the specified SCHED_KURT process. The prototype of this call is as follows:
    int get_rtstats(int pid, struct rtstats *info);

 \begin{Program}
% latex2html id marker 761
\begin{verbatim}
1 struct rtstats {
2...
 ...atistics]{Structure Specifying Statistics of a SCHED\_KURT process}\end{Program}

The rtstats structure is shown in Program 4.4. The elements of this structure are as follows:

4.3.3.4 get_num_rtprocs

This system call returns the number of registered SCHED_KURT processes. The prototype is as follows:
    int get_num_rtprocs(void);

4.3.3.5 rt_suspend

 This system call suspends the calling process till the next time it is scheduled to be woken up. The prototype is as follows:
    int rt_suspend(int susp_mode);

The susp_mode parameter can have one of the following values:

The pseudo-code for a periodic SCHED_KURT process is shown in Program 4.5.
 \begin{Program}
% latex2html id marker 792
\begin{verbatim}
1 void main()
2 {
3 ...
 ... SCHED\_KURT Process]{Pseudo-Code for a Simple SCHED\_KURT Process}\end{Program}

4.4 Performance Measurements

 To validate the performance of the KURT system, tests were performed using user processes as well as real-time kernel modules. The tests performed with user processes is explained in Section 4.4.1 and the tests performed with the ARTS RTMod is explained in Section 4.4.2

4.4.1 Process Scheduling Performance

  To test the new firm real-time scheduler, a dummy KURT application that would try to run once every 10 milliseconds was written. We also wrote a version of this application that would use only the timing provided by Linux with UTIME running the standard FIFO scheduler (which claims to offer soft real-time performance). By comparing the effectiveness of these two applications with respect to the actual length of time between executions, we were able to evaluate the gains in scheduling accuracy provided by KURT.

The KURT version of the dummy application (see Program 4.6) used the scheduler's periodic mode, and set the period to 10 milliseconds. It would first switch the kernel to real-time mode and then suspend itself waiting for the KURT process module to wake it up periodically. At the start of each execution, the application would take a time-stamp and would log this time-stamp to disk. The non-KURT version of the application (see Program 4.7) would set its scheduler to be the standard Linux FIFO scheduler, and then enter a loop. At the top of the loop it would obtain a time-stamp and store it in memory. It would then obtain another time-stamp and compute the remaining time till the next iteration and sleep (using the select system call) for that much time. When it woke up, it would return to the top of the loop and repeat the same procedure. The difference between consecutive time-stamps gave a measure of the length of time between successive executions. The closer this measured length of time was to 10 milliseconds, the more accurate the scheduler.


 \begin{Program}
% latex2html id marker 805
\begin{verbatim}
1 void main()
2 {
3 ...
 ...seudo-code for Program Testing Effectiveness of the KURT Scheduler}\end{Program}


 \begin{Program}
% latex2html id marker 809
\begin{verbatim}
1 void main()
2 {
3 ...
 ...ode for Program Testing Effectiveness of the SCHED\_FIFO Scheduler}\end{Program}

Tests were performed in the SCHED_KURT and SCHED_FIFO cases using 1, 10, 20, and 30 concurrent executions of the dummy application. For the KURT case, tests were performed using the real-time mode set to SCHED_KURT_PROCS, in which only SCHED_KURT processes are allowed to run, as well as with real-time mode set to SCHED_ALL_PROCS, in which all processes are allowed to run. Comparison of the SCHED_ALL_PROCS test, with the SCHED_FIFO test provides an idea of the advantage of explicit scheduling without focusing the set of running processes. Apart from the test processes, the system was essentially idle.


      
Figure 4.2: Scheduling Effectiveness of SCHED_FIFO
\begin{figure}
\begin{center}
\subfigure[1 Process]{
\epsfig {file=figs/hist_fif...
 ...sfig {file=figs/hist_fifo_30.eps, width=0.5\textwidth}
}\end{center}\end{figure}


      
Figure 4.3: Scheduling Effectiveness of SCHED_KURT (SCHED_KURT_PROCS Mode)
\begin{figure}
\begin{center}
\subfigure[1 Process]{
\epsfig {file=figs/hist_kur...
 ...ig {file=figs/hist_kurt-1_30.eps, width=0.5\textwidth}
}\end{center}\end{figure}


      
Figure 4.4: Scheduling Effectiveness of SCHED_KURT (SCHED_ALL_PROCS Mode)
\begin{figure}
\begin{center}
\subfigure[1 Process]{
\epsfig {file=figs/hist_kur...
 ...ig {file=figs/hist_kurt-2_30.eps, width=0.5\textwidth}
}\end{center}\end{figure}

Figure 4.2 presents the histograms obtained in the SCHED_FIFO case. These histograms plot the percentage of iterations versus the time between consecutive iterations.

Figure 4.3 presents the histograms obtained in the SCHED_KURT case, with real-time mode set to SCHED_KURT_PROCS. Figure 4.4 are the histograms obtained in the SCHED_KURT case, with SCHED_ALL_PROCS mode.

The following observations can be made from the these figures:

From these observations, it can be concluded that explicit scheduling without focusing the set of running processes is itself a viable alternative for scheduling firm real-time processes.


     
Figure 4.5: Effect of Number of Processes on Scheduling
\begin{figure}
\begin{center}
\subfigure[FIFO]{
\epsfig {file=figs/breakthrough_...
 ...ile=figs/breakthrough_kurt-2.eps, width=0.5\textwidth}
}\end{center}\end{figure}


    
Figure 4.6: Effect of Number of Processes on Scheduling in Presence of Background Disk Activity
\begin{figure}
\begin{center}
\subfigure[FIFO]{
\epsfig {file=figs/breakthrough_...
 ...reakthrough_kurt-2_with_disk.eps, width=0.5\textwidth}
}\end{center}\end{figure}

Figure 4.5 summarizes the results shown in Figure 4.24.4. This figure illustrates the percentage of iterations with duration less than or equal to a given value on the horizontal axis.

Figure 4.5(a) clearly shows that the variance of the FIFO scheduler's performance increases significantly as the number of processes increases, whereas that of the KURT scheduler does not. Irrespective of the number of processes, KURT schedules each application at the appropriate time.

Table 4.1 presents a tabular comparison of the three schedulers. In these tables, the percentage of events within a specified time duration from the expected time is shown.

As can be seen in Table 4.1(a), SCHED_FIFO would probably be suitable for running 1 firm real-time process. For example, 62% of the events lie within 10$\mu$sec (i.e. 0.1% ) of the expected time, and 100% of events lie within 500$\mu$sec (i.e. 5% ) of the expected time. But as the number of FIFO processes increase, the performance drops drastically. For example, with 30 processes, only 1.21% of the events lie within 0.1% of the expected time, and only 46.61% of the events lie within 5% of the expected time. The performance is expected to decrease further as more processes are added to the system.

In contrast to this, in the SCHED_KURT case in the SCHED_ALL_PROCS mode (Table 4.1(b)), for 1 process, 99.8% of the events lie within 0.1% of the expected time, and 100% of the events lie within 5% of the expected time. And for 30 processes, 97.3% of the events lie within 0.1% of the expected time, and 99.99% of the events lie within 5% of the expected time.

The performance of explicit scheduling in the SCHED_KURT_PROCS mode (i.e. only SCHED_KURT processes are allowed to run) is shown in Table 4.1(c). There is a slight improvement in the performance of the explicit scheduler by switching the kernel to SCHED_KURT_PROCS when compared to SCHED_ALL_PROCS mode. But this performance improvement is not substantial, because the system was essentially idle apart from the explicitly scheduled processes. If the system was loaded, the performance of the explicit scheduler, in the SCHED_ALL_PROCS mode, would be distorted to the extent of the duration during which interrupts were disabled by other parts of the operating system. Figure 4.6 shows the effect of background disk activity on the performance of the SCHED_FIFO and SCHED_KURT schedulers. These experiments were conducted with background disk activity because the data presented in Section 4.4.3 shows that the disk subsystem of Linux disables interrupts for the maximum duration of time. As can be observed from these figures and Table 4.2 background disk activity distorts the schedule for durations less 500 $\mu$sec duration. This is because the disk subsystem of Linux disables interrupts for durations up to 250$\mu$sec and depending upon the number of back to back blocking of interrupts, the performance of the scheduler would be distorted to that extent. It can be noted that even in the presence of background disk activity SCHED_KURT still performs much better than SCHED_FIFO. Another observation that can be made from Table 4.2(b) is that as the number of processes increases, the performance of SCHED_KURT improves. This is probably because, as the number of processes increase, the process generating the background disk activity does not get scheduled in that often and therefore the duration of time during which interrupts are disabled is reduced. Section 4.4.3 presents some data that would enable application developers to gauge the effect, in terms of scheduling distortions, of using certain subsystems of Linux in their real-time applications.


[SCHED_FIFO]  
Table 4.1: Confidence Intervals for Process Scheduling using Different Schedulers
Number of Processes 5c|% of events found within 'x' Microseconds of Expected Time        
  10 $\mu$sec 50 $\mu$sec 100 $\mu$sec 500 $\mu$sec 1000 $\mu$sec
1 62.22% 99.59% 99.80% 100.00% 100.00%
10 3.36% 15.12% 30.25% 99.34% 99.72%
20 1.44% 7.45% 14.76% 66.48% 99.78%
30 1.21% 6.10% 11.03% 46.61% 73.11%
 
[SCHED_KURT (SCHED_ALL_PROCS Mode)]
Number of Processes 5c|% of events found within 'x' Microseconds of Expected Time        
  10 $\mu$sec 50 $\mu$sec 100 $\mu$sec 500 $\mu$sec 1000 $\mu$sec
1 99.80% 100.00% 100.00% 100.00% 100.00%
10 96.66% 99.47% 99.67% 100.00% 100.00%
20 99.34% 99.67% 99.91% 100.00% 100.00%
30 97.33% 99.54% 99.80% 99.99% 100.00%
 
[SCHED_KURT (SCHED_KURT_PROCS Mode)]
Number of Processes 5c|% of events found within 'x' Microseconds of Expected Time        
  10 $\mu$sec 50 $\mu$sec 100 $\mu$sec 500 $\mu$sec 1000 $\mu$sec
1 99.80% 99.80% 100.00% 100.00% 100.00%
10 96.61% 99.72% 99.87% 100.00% 100.00%
20 99.23% 99.40% 99.61% 99.80% 100.00%
30 99.52% 99.75% 99.89% 100.00% 100.00%
 


[SCHED_FIFO]  
Table 4.2: Confidence Intervals for Process Scheduling with Background Disk Activity
Number of Processes 5c|% of events found within 'x' Microseconds of Expected Time        
  10 $\mu$sec 50 $\mu$sec 100 $\mu$sec 500 $\mu$sec 1000 $\mu$sec
1 50.20% 81.33% 83.74% 98.99% 99.59%
10 5.50% 17.19% 30.70% 84.40% 92.25%
20 2.00% 7.07% 13.21% 55.50% 84.20%
30 1.09% 4.71% 8.65% 36.47% 64.32%
 
[SCHED_KURT (SCHED_ALL_PROCS Mode)]
Number of Processes 5c|% of events found within 'x' Microseconds of Expected Time        
  10 $\mu$sec 50 $\mu$sec 100 $\mu$sec 500 $\mu$sec 1000 $\mu$sec
1 40.16% 60.04% 73.89% 97.60% 99.60%
10 68.43% 81.68% 88.61% 97.17% 99.17%
20 87.94% 95.17% 97.37% 99.52% 99.66%
30 56.26% 60.74% 65.27% 82.01% 95.72%
 

4.4.2 ATM Reference Traffic System Performance

 Experiments were conducted with the ARTS system to test the effectiveness of scheduling real-time kernel modules. The ARTS system was tested by generating traffic streams with several inter-packet transmission times, and then observing these traffic streams with a hardware ATM cell analyzer. For example, in one experiment, ARTS was instructed to transmit an ATM packet once every 100 microseconds. We then used an ATM analyzer to record the actual inter-arrival time of the packets sent by ARTS. The accuracy of the ARTS system was measured by observing the difference between the actual packet inter-arrival times and those that ARTS was supposed to generate. This experiment was performed for various inter-packet transmission times.


  
Figure 4.7: ARTS Performance at Various Inter-packet Transmission Times
\begin{figure}
\begin{center}

\epsfig {file=figs/arts.eps, width=5.5in}\end{center}\end{figure}

Figure 4.7 shows the distribution of the differences between actual and desired inter-arrival times. As can be seen from the figure, ARTS was generally able to send packets with the correct inter-arrival times. Consider the experiment in which packets were sent once every 500 microseconds. The variance of the data was only 2.74 microseconds, and 99.7% of all packets arrived within 1% of the specified interval, and all packets arrived within 7% of the expected time.

4.4.3 Distortions In Scheduling Due to Blocked Interrupts

 Since the KURT scheduler depends on timer interrupts arriving on time for scheduling real-time tasks accurately, it is important to assess the effect of having interrupts disabled by the various subsystems of Linux. In order to do this, the standard Linux kernel was modified to timestamp (using the Pentium Time Stamp Counter) all occurrences of enabling and disabling interrupts. This data was then processed to find the duration of time during which interrupts were disabled. Table 4.3 list the subsystems and the minimum, maximum, and the mean durations for which interrupts were disabled in those subsystems.


 
Table 4.3: Duration for which Interrupts are Disabled in the Linux Kernel
Category Description Min $\mu$sec Mean $\mu$sec Max $\mu$sec
Console Console (TTY) code 0.51 0.86 12.13
Network Network device driver and protocol stack code 0.35 0.64 2.74
Disk Disk device driver and buffer management code 0.55 34.68 250.45
Process Scheduling and run-queue management code 0.48 1.47 12.06
Timer Timer handling and time query code 0.50 1.81 40.65
Memory Memory management code 0.49 1.40 6.40
Other All other code that does not fit into the above categories. This mainly consists of routines interfacing with the PCI BIOS. 1.48 2.34 7.55
 

Figures 4.8 and 4.9 shows histograms of the percentage of events versus the duration of time during which interrupts were disabled for that event. As can be seen in Figure 4.8(a), in the console subsystem of Linux, disabling of interrupts occurs for period less than 2$\mu$sec. The maximum duration was 12.13$\mu$sec. In the network subsystem of Linux, most disabling of interrupts occurred for less than 2.5$\mu$sec, and the maximum duration was 2.74 milliseconds. In the disk subsystem (Figure 4.8(c)) of Linux, the maximum duration for which interrupts were disabled was found to be 250$\mu$sec. As can be observed from these measurements, most subsystems of Linux are reasonably suited for firm real-time applications. To make Linux suitable for hard real-time applications (or much more demanding firm real-time applications like ARTS), significant modifications have to be made to existing Linux code. The disk subsystem blocks interrupts for the longest period of time and hence it has to be modified before it can be used for demanding firm real-time applications like ARTS.

Since the duration during which interrupts are disabled is highly dependent on the hardware used in the system, these results cannot be generalized, and measurements should be made on the target hardware under a variety of conditions to gain a better overall understanding of the suitability of the Linux kernel for a firm real-time system on that hardware. It should also be noted that apart from the duration during which interrupts are blocked, the frequency at which they occur is also a factor on which the performance of real-time systems depends. Further experiments need to be conducted to gain a better understanding of the effect of blocked interrupts on the performance of the KURT scheduler.


      
Figure 4.8: Duration of Disabled Interrupts in Various Subsystems of Linux - I
\begin{figure}
\begin{center}
\subfigure[Console]{
\epsfig {file=figs/irq_tty.ep...
 ...ig {file=figs/irq_process.eps, width=0.5\textwidth}
}\ \end{center}\end{figure}


     
Figure 4.9: Duration of Disabled Interrupts in Various Subsystems of Linux - II
\begin{figure}
\begin{center}
\subfigure[Timer]{
\epsfig {file=figs/irq_timer.ep...
 ...epsfig {file=figs/irq_memory.eps, width=0.5\textwidth}
}\end{center}\end{figure}

5 Conclusions and Future Work

 

With the advent of multimedia and high-speed networking, a new class of real-time application software has emerged. The applications in this class require many of the services offered by generic timesharing or soft real-time operating systems, but have timing constraints more characteristic of hard real-time systems. Since applications of this nature cannot be described as either hard or soft real-time, they can be called firm real-time applications. In addition to the unique computing requirements presented by firm-real time applications, many are developed and used under significant budgetary constraints, precluding the use of expensive commercial real-time operating systems to support them.

The KURT system described here is based on the well-known and widely-used Linux operating system that offers a broad range of services to the applications it runs. With innovative modifications to the way timer interrupts are handled in the kernel, it was possible to increase the temporal resolution of Linux, without causing a significant increase in system overhead. The new scheduler that KURT adds to Linux is able to achieve high levels of scheduling accuracy and can coexist with the other currently available schedulers. As evidenced by the ARTS application, which makes use of existing ATM networking services offered by Linux, the changes made in KURT do not prevent firm real-time applications from using the existing operating system services. For these reasons, KURT successfully transforms Linux into a firm real-time operating system. In addition to the ability to support firm real-time applications, KURT has the advantage of low cost, since it runs on commercial off-the-shelf hardware, and is based on free software.

As the evaluation shows, KURT was able to demonstrate firm real-time scheduling capabilities, although a few sources of scheduling distortion remain. It should be noted, however, that the frequency of distortions in KURT is low, and falls well within the limits appropriate to a number of interesting applications. Since most distortions occur beca