NSF Support
This research is supported by the National Science Foundation, grant CNS-0716740.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Society is dependent on many engineered systems whose increasing complexity and inter-connectedness have, in turn, increased their vulnerability to adversarial attacks. In many of these systems, protecting the execution of their computations is as crucial as ensuring the security of their data.
This research investigates how to maintain survivable operation of such systems, even in the face of invasive attacks where computations are intentionally subverted to interfere with other computations’ execution constraints. The goal of this research is to develop new techniques for isolating the effects of interactions among computations through specific resources in these systems, including: flexible specification and rigorous enforcement of computations’ execution constraints; explicit control of all OS kernel components under a single scheduler; detailed on-line monitoring of computations and their supporting OS kernel components; automated learning to discover previously unknown interactions among computations; and formal modeling and verification of computations, execution constraints, and system components and resources.
The expected benefits of this project include: a novel approach to non-bypassable isolation of computations from the effects of adversarial attack in which isolation can be enforced flexibly according to the system-specific execution constraints that must be satisfied; a high quality open-source software implementation of kernel-level scheduling and monitoring services that provide and measure such non-bypassable isolation; new formal models, analyses, and methodologies for verifiably correct configuration and management of those services; and empirical studies of the services’ ability to protect computations from interference under a wide range of adversarial attacks.
Investigators
Douglas Niehaus, Associate Professor, Electrical Engineering and Computer Science Department, is PI for collaborative research at the University of Kansas.
Christopher Gill, Associate Professor of Computer Science and Engineering, is PI for this research at Washington University in St. Louis.
William D. Smart, Assistant Professor of Computer Science and Engineering, is co-PI for this research at Washington University in St. Louis.
Students
Matthew Beattie, Undergraduate
Matt was an undergraduate in my Introductory OS class, who expressed some interest in systems programming, so he gave it a try over the summer in 2008. He learned a lot about how our methods of scheduling and performance evaluation work, he made some contributions to the on-line documentation, and he created a prototype GUI for describing scheduling hierarchies controlling computation behavior. However, he also discovered that systems programming was not as interesting as he expected, and so decided not to continue with the project.
Andrew Boie, Graduate Student
Andrew worked on the project for one semester at the very beginning (9/1/07-1/10/08). He worked on prototyping methods related to computation structure discovery. He also helped start Michael Jantz and Jared Straub up their Linux Internals learning curves.
Michael Jantz, Undergraduate and Graduate Student
Michael started as an undergrad during Spring 2008 with a desire to learn systems programming in general, and within this project, to learn how to modify the execution support for processes with the goal of providing execution security. He started as a graduate student in Fall 2008. His concentration during the reporting period was to learn how to work within the Linux kernel and then how to discover the structure of computations by observing their activities. Initially, this was done in post-processing of a recorded event stream describing a wide range of computation actions. The primary work during this phase was determining the full set of computation actions that had to be observed to permit deduction of computation structure during post-processing. The next step was to use this information to permit discovering the components of a computation at run-time, and to then control its execution and that of other computations using shared resources to provide execution security to target processes.
Bala Sridhar, Graduate Student
Bala participates as an unfunded Graduate Student. He first participated in the Advanced OS class project addressing the Discovery of computation structure, a continued during the Summer working with Michael Jantz to work out methods of deducing computation structure and use of system resources, especially those representing interaction channels. IN this phase of the work, the activities concentrated on (1) what the resources are, (2) what system events that can be observed permit deduction that a resource is used, and (3) post-processing of the event stream that supports deducing the structure of the computation. The next steps included transfer of some of the deductive processing to on-line active filters to permit tracking computation structure at run-time and representation of that structure to the scheduling subsystem to provide execution security to the whole computation. He is currently refining methods for discovery of computation structure and integration with Group Scheduling control of the discovered components in a group with the application they support.
Jared Straub, Undergraduate and Graduate Student
Jared started as an undergraduate, and became a Graduate Student in January 2009. Jared has concentrated on learning the methods of integrating scheduling and concurrency control to support controlling groups of computation components under arbitrary semantics that we call Group Scheduling. He has been Noah Watkins apprentice, in preparation for Noah leaving KU. The intent is to be able to write GS based scheduling decision functions which use the information discovered about computation structure to inform the scheduler about all components of the computation. This information can, in turn, be used to provide execution security to specific computations since all components, and thus all interaction channels among computations can be known.
Noah Watkins, Graduate Student
Noah was the developer of the Group Scheduling subsystem for over a year under other projects, and his involvement with this project occupied only the last 3 months before he left KU to begin PhD studies at CU Santa Cruz. However, his contributions were quite important because he helped show how discovered computation structure could be placed under Group scheduling control at run-time. Noah also was quite helpful in helping Michael Jantz and Jared Straub up the initial parts of the Linux systems programming learning curve.
Integration of the methods of discovering computation structure with control over computations. Noah has been teaching Jared how to develop within the Linux kernel with an emphasis on scheduling and concurrency control. Noah is leaving at the end of August and will be finishing his Mater’s Thesis while beginning his PhD this Fall at UC Santa Cruz.
The goal of security, in general, is to ensure that all required actions can take place, and that no forbidden actions are possible. In conventional terms, this generally takes the form of access control to data and system services. This project introduces the idea of execution security which supplements the other aspects of security by creating the ability to provide: (1) flexible specification and rigorous enforcement of computations’ execution constraints, (2) explicit control of all OS kernel components under a single scheduler, (3) detailed on-line monitoring of computations and their supporting OS kernel components. Using these capabilities, higher level monitoring and control functions become possible which use the detailed information about events on the system to construct an application -oriented view of the events on the system, and to make decisions based on that view.
An obvious, but often overlooked fact is that the application programs written by developers do not explicitly specify the entire computation that executes when the application is running. The most obvious and well-known example of this is the use of various libraries, which provide capabilities linked into the program when building the final executable. Less obvious, is that system components supporting interrupt handlers (hard-IRQs), network processing and work queues (soft-IRQs) and server or helper threads running at the user level can also be involved.
The whole computation is what executes on the machine, and thus determines how the application behaves. Thus, if we wish to provide execution security to an application by controlling its execution, we must first be certain of all the components of the computation. The problem with knowing this information is that there is no explicit representation of the whole set of components required for the application to execute. We must thus first discover the computation structure before we can control it.
Andrew Boie’s initial work showed that the idea of discovering computation structure by observing events at the OPS level was feasible. Michael Jantz and Bala Sridhar showed that a great deal could be implemented using only instrumentation points in the kernel and doing post-processing of the event stream. This brought us to the point where we could know what had happened in the computation after it was complete, but we could not yet know the structure of the computation while it was executing.
The next step was to implement this same analysis at run-time. This was done using a Data Stream Active Filter, which is a routine examining each event as it occurs in a stream configured to include all events required to discover computation structure. This active filter was able to deduce computation structure at run-time, an notify the Group Scheduling subsystem which controls the computation, thus controlling the computation as a whole. In the course of this work it was also shown that some aspects of computation structure will have to be observed by instrumenting user-level events, in contrast to the main thrust of discovery which was able to do a great deal using only events observable in the OS. For example, control of Java based applications requires instrumentation of the JVM running at user level to fully represent the set of JVM threads and their role in supporting Java application execution.
Noah Watkins and Jared Straub showed that integration of concurrency control and scheduling was possible and that a specialized programming model, including a specialized scheduler could be used to test the implementation exhaustively. This scheduling framework was show to be capable of supporting applications with specialized semantics, such as PTides, and that the execution security of those applications could be assured under a wide range of conditions. This work also showed that other system components had to be included in the set under control for a computation to complete support for execution security. These system components include: soft-IRQ, hard-IRQ, and work queues.
In combination this work showed discovery of computation structure and subsequent control of the discovered computations was possible. The next steps required implementation of support for tracking computation components and which sets of components comprise various computations. Further, support for a name space of components is required to permit specification of computation execution semantics and enforcement of those semantics at run-time.
The current approach is still vulnerable, it fails to provide perfect execution security, in that some of the computation components are shared among more than one computation. The system components hard-IRQs, soft-IRQs, and work queues are among the most obvious example within the OS. Outside the OS, server processes are the most obvious examples, such as the X-Server, which is a part of most computations on the system which performs graphical display, although some do write into the screen display buffer directly.
Every process performing graphics through the X-server establishes a socket connection to the X-serer, and sends requests for display updates to it as they are generated. The vulnerability to execution security here is that many processes use the X-Server, and so the work of many processes is multiplexed by the X-Server. The X-Server is thus an interaction channel through which the actions of one process may influence how the execution of another process proceeds.
There is an approach that can be taken which will reduce this vulnerability. This approach take measures to influence the order in which the server processes act on their pending requests. The most obvious approach is to rewrite the server in question to support separate work queues for each client it is serving. When the server is invoked by the scheduler it can be told which client’s work should be performed. Andrew Boie and Noah Watkins did this as an Advanced OS project several years ago, in a somewhat hacked-together proof-of-concept.
There is an approach which does not require modifying the application, which is to influence the return bit mask of the POLL and SELECT system calls which many servers use to look for input on their many open connections. Thus, when an application sends a message to the X-Server requesting a display operation, the Discovery Active Filter can note this, and update the GS state variables for the receiving X-Server which has been discovered as part of the computation’s components. The GS framework may then decide to run the X-Server on behalf of the computation that made the display request. Before it switches context to the X-Server thread, it can constrain the return value of the SELECT system call to show only work from the process on whose behalf GS is giving the X-Server context.
This method depends on servers using POLL and Select, and it cannot prevent the server from performing work on behalf of other threads that has already been requested. Only modifying the server cod to permit the GS framework to identify the work that should be performed can do that.
The threads supporting hard-IRQ and soft-IRQ computation components in the Linux kernel are similar to the X-Server in this respect. The multiplex work for many threads and thus many computations. The Discovery and GS frameworks can, in most cases, note when work for a given computation is generated, and can control when the CPU is given to a given soft-IRQ, but they cannot control which work the soft-IRQ thread performs. However, just as a proof-of-concept has shown that the X-Server could be modified to track work for each thread, so too a proof-of-concept implementation was done some years ago in Linux 2.4 showing that the Network Transmit and Receive soft_IRQs could be modified to support separate queues for each socket, and that when invoked, they could be told to perform work for a given computation or a specific set of computations.
This work also showed that there is an irreducible classification computation on incoming packets that cannot be controlled on a per-computation basis. Fairly obviously, when a packet first arrives, some minimum amount of processing of the packet is required to determine the receive port with which it is associated, and thus with which computation it is associated. This classification work must be done for every arriving packet. The proof-of-concept in 2.4 work split the Network Receive soft-IRQ into two pieces. The first was this classification component, placing the incoming packet on the queue associated with the relevant computation. The second soft-IRQ was then controlled on a per-computation basis by the GS framework.
The implication for execution security is that the classification operation will always remain as an interaction channel among computations receiving packets and thus does thus represent an imperfection in the execution security we believe can currently be achieved.
The next few research steps will address several of the existing vulnerabilities by increasing the set of computation components discovered at the OS level, and by modifying several of the OS component computations to reduce the degree to which they act as interaction channels.
Additional OS level computation components that can be discovered include a wide range of concurrency control components, data structures such as buffers used for I/O, and the OS hard-IRQ and soft-IRQ computations that support applications. The code for the latter can be refactored in some cases to maintain per-computation sets of data enabling the OS scheduling framework to not only, for example, give execution context to the relevant soft-IRQ when work for a given computation is pending, but also to specify which work the soft-IRQ should perform. While this is not relevant to every OS computation component, the Network Receive soft-IRQ would benefit because the processing of received network packets could be controlled so that the soft-IRQ would be executed when the packets are required by the computations that will receive them.
Similar measures can be taken to increase the precision with which execution of user-level servers, such as the X-Server, can be controlled. This increase in control precision can then be used to increase execution security by reducing the number of ways in which execution of one computation can have a deleterious effect on another computation. In the case of user-level server using SELECT and POLL, we will experiment with methods of informing the server process of only the pending work it should perform on behalf of the computation whose Group Scheduling representation led to the selection of the X-Server for execution on behalf of that computation.
Integration of control over the whole computation can then be used to ensure the computation executes according to its constraints. On the positive side, the ocmputations can be given the resources required for them execute conrrectly. On the negative side, computations identified as executing outside their specifications, or showing other signs of being a danger to the execution security of other processes, can have their access to resources restricted, to limit the damage they can do. Note that the reasons for their misbehavior, accident or intent, do not really matter all that much. Misbehaving processes will have their consumption of resources, and thus their ability to compromise the execution security of other processes, limited regardless of the reason.
The KUSP code supporting Discovery and the ability to write application-specific programming models is available through the KUSP WWW site: