Hthreads Design
Written by Wesley Peck   
Tuesday, 01 August 2006

The Hthreads system is a hardware/software co-design of existing system components that extends the basic policies of the multithreaded programming model into field programmable gate arrays (FPGAs). These new capabilities are accessed through transparent and familiar APIs that interact with the hardware through memory-mapped registers. Significant system processing that used to require hundreds of assembler instructions is now performed in tens of clock cycles using a single memory-access assembler instruction.

The Hthreads system is designed as a set of cooperating components, each with hardware parts and software parts. The design of each of the Hthreads components is described below.
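
The single-access mechanism described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: the base address, the bit layout, the operation codes, and the names core_encode and sim_load are all hypothetical and are not taken from the actual Hthreads register map. The idea it shows is that the operation and the thread ID are encoded in the address of one load, and the value returned by that load is the status.

```c
#include <assert.h>
#include <stdint.h>

/* Everything below is an assumed layout for illustration only. */
#define CORE_BASE     0x60000000u  /* assumed base address of the core   */
#define CORE_OP_SHIFT 10u          /* assumed: opcode in bits 10 and up  */
#define CORE_ID_SHIFT 2u           /* assumed: thread ID in bits 2..9    */
#define CORE_ID_MASK  0xFFu

enum { OP_ADD_THREAD = 1, OP_REMOVE_THREAD = 2, OP_IS_ACTIVE = 3 };
enum { ST_OK = 0, ST_ERROR = 1 };

/* Build the address whose load would trigger `op` on thread `tid`. */
uint32_t core_encode(uint32_t op, uint32_t tid)
{
    assert(tid <= CORE_ID_MASK);
    return CORE_BASE | (op << CORE_OP_SHIFT) | (tid << CORE_ID_SHIFT);
}

/* Simulated hardware state: one "active" flag per thread ID. */
static uint8_t sim_active[CORE_ID_MASK + 1];

/* Stand-in for the bus: decodes the address and performs the operation.
 * On a real target this would be a single volatile load from `addr`. */
uint32_t sim_load(uint32_t addr)
{
    uint32_t off = addr - CORE_BASE;
    uint32_t op  = off >> CORE_OP_SHIFT;
    uint32_t tid = (off >> CORE_ID_SHIFT) & CORE_ID_MASK;

    switch (op) {
    case OP_ADD_THREAD:
        if (sim_active[tid]) return ST_ERROR;   /* error check: already active */
        sim_active[tid] = 1;
        return ST_OK;
    case OP_REMOVE_THREAD:
        if (!sim_active[tid]) return ST_ERROR;  /* error check: not active */
        sim_active[tid] = 0;
        return ST_OK;
    case OP_IS_ACTIVE:
        return sim_active[tid];                 /* 1 if active, else 0 */
    default:
        return ST_ERROR;
    }
}
```

In this model, a call such as sim_load(core_encode(OP_ADD_THREAD, 5)) stands for the one load instruction the CPU would issue; all decoding and error checking happens on the simulated hardware side.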

Design of the Thread Manager

The Thread Manager (TM) component creates, maintains, and controls all software and hardware threads in the Hthreads system. The TM has no notion of the type of thread it is managing, that is, whether the thread is executing on the CPU or within the FPGA. In addition, the TM performs error checking and is the entry point for all other components, such as the mutex manager and interrupt controller, that need to change the state of a thread from blocked to ready to run. The majority of thread management functions execute in hardware in fewer than 10 clock cycles. See Image:ThreadManager.pdf for a listing of functions and operations. The TM works closely with the scheduler.

The TM was originally written by Mike Finley (Summer 2004) and later revised by Erik Anderson and Jason Agron (Summer 2005).

Design of the Thread Scheduler

The thread scheduler module runs concurrently with the CPU and determines the next thread to run while the current thread is still running. The module includes a standard interface that allows system designers to specify their own custom scheduling algorithms. Its duty is to manage the ready-to-run queue and the operation of the preemption (context switch) interrupt.

The current scheduler module is implemented with a partitioned ready-to-run queue and a parallel priority encoder. This design allows constant-time execution of FIFO, round-robin, and preemptive priority scheduling services, which greatly reduces scheduling overhead and jitter in the Hthreads system. All scheduling operations in the current module execute in under 30 clock cycles regardless of the number of active threads in the system.

Current testing measures the delay from the exact time (10 nsec resolution) at which a scheduling decision should be made, through the time the hardware and CPU take to generate and recognize an external interrupt request (around 100 clock cycles), execute the interrupt handling routine, and perform a context switch between threads. This delay is around 200 clock cycles (2 usecs on our system, with the CPU running at 300 MHz and all other peripherals running at 100 MHz), with jitter (excluding cache misses) of around 10 clock cycles (100 nsecs). Our on-chip tests verify that the current scheduler performs scheduling operations in constant time, independent of the number of threads on the ready-to-run queue. See timing_runs for histograms of the timing tests.

Mike Finley implemented the original FIFO scheduler (Spring/Summer 2004), and Jason Agron is the team leader for the new constant-time scheduler module, SW and HW thread support, and new scheduler algorithms. See Image:NewSchedulerDesignDoc.pdf for more details on functionality, implementation, and documentation.
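
To make the idea of a partitioned ready-to-run queue with a priority encoder more concrete, here is a software model in C. It is a sketch under assumptions, not the hardware design: the number of priority levels, the thread ID range, the convention that priority 0 is the highest, and all names (sched_t, sched_enqueue, sched_next, prio_encode) are hypothetical. The hardware encoder examines all partitions in parallel; the model approximates that with a fixed five-step search over a 32-bit mask, so the work per operation does not depend on how many threads are queued.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIO    32    /* assumed number of priority partitions   */
#define MAX_THREADS 256   /* assumed thread ID range                 */
#define NO_THREAD   (-1)

typedef struct {
    uint32_t nonempty;          /* bit p set: partition p holds a ready thread */
    int head[NUM_PRIO];         /* first thread of each partition (FIFO order) */
    int tail[NUM_PRIO];         /* last thread of each partition               */
    int next[MAX_THREADS];      /* per-thread link to the next queued thread   */
} sched_t;

void sched_init(sched_t *s)
{
    s->nonempty = 0;
    for (int p = 0; p < NUM_PRIO; p++) s->head[p] = s->tail[p] = NO_THREAD;
    for (int t = 0; t < MAX_THREADS; t++) s->next[t] = NO_THREAD;
}

/* Software stand-in for the priority encoder: index of the lowest set bit,
 * found in a fixed number of steps. Priority 0 is assumed to be highest. */
int prio_encode(uint32_t bits)
{
    int idx = 0;
    assert(bits != 0);
    if ((bits & 0xFFFFu) == 0) { idx += 16; bits >>= 16; }
    if ((bits & 0xFFu)   == 0) { idx += 8;  bits >>= 8;  }
    if ((bits & 0xFu)    == 0) { idx += 4;  bits >>= 4;  }
    if ((bits & 0x3u)    == 0) { idx += 2;  bits >>= 2;  }
    if ((bits & 0x1u)    == 0) { idx += 1; }
    return idx;
}

/* Append thread `tid` to the tail of partition `prio`. */
void sched_enqueue(sched_t *s, int tid, int prio)
{
    assert(tid >= 0 && tid < MAX_THREADS && prio >= 0 && prio < NUM_PRIO);
    s->next[tid] = NO_THREAD;
    if (s->head[prio] == NO_THREAD) s->head[prio] = tid;
    else                            s->next[s->tail[prio]] = tid;
    s->tail[prio] = tid;
    s->nonempty |= (uint32_t)1 << prio;
}

/* Remove and return the head of the highest-priority non-empty partition. */
int sched_next(sched_t *s)
{
    if (s->nonempty == 0) return NO_THREAD;
    int p   = prio_encode(s->nonempty);
    int tid = s->head[p];
    s->head[p] = s->next[tid];
    if (s->head[p] == NO_THREAD) {          /* partition became empty */
        s->tail[p] = NO_THREAD;
        s->nonempty &= ~((uint32_t)1 << p);
    }
    return tid;
}
```

In this model, FIFO order falls out of the per-partition lists, round-robin corresponds to re-enqueuing the preempted thread at the tail of its own partition, and priority scheduling comes from the encoder always selecting the lowest-numbered non-empty partition.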

Design of the Synchronization Controllers

The synchronization controllers are new synchronization primitives implemented within the FPGA that provide very fast mutexes and semaphores for hardware threads, software threads, and combinations of the two. A complete blocking semaphore operation, including the request and check of the semaphore and the subsequent queuing of a thread ID if the thread blocks, is performed in 58 clock cycles (about 580 nsecs on our 100 MHz system). The unique capabilities of the FPGA allow the synchronization primitives to encapsulate, within 8 or fewer clock cycles, the atomic processing that is usually achieved by combining conditional instructions with the memory coherency protocols of snooping data caches. This allows the FPGA to provide a much simpler system solution and faster mechanisms for achieving classic semaphore semantics. See ERSA for more details on the semaphores. Razali Jidin is the team leader for the synchronization primitives.
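
The blocking semaphore operation described above (request, check, and queuing of the thread ID if the thread blocks) can be modeled in a few lines of C. This is a behavioral sketch only: the hsem_ names, the queue length, and the return codes are hypothetical, and the model is plain single-threaded code. In the hardware, the whole request is performed atomically by the controller, which is what removes the need for conditional instructions and cache coherency support.

```c
#include <assert.h>

#define HSEM_QUEUE_LEN 8      /* assumed maximum number of waiting threads */
#define HSEM_NO_THREAD (-1)

enum { HSEM_GRANTED = 0, HSEM_BLOCKED = 1 };

typedef struct {
    int count;                    /* available units of the semaphore     */
    int queue[HSEM_QUEUE_LEN];    /* FIFO ring buffer of blocked thread IDs */
    int head;                     /* index of the oldest waiter           */
    int waiting;                  /* number of queued thread IDs          */
} hsem_t;

void hsem_init(hsem_t *s, int initial)
{
    s->count = initial;
    s->head = 0;
    s->waiting = 0;
}

/* Request the semaphore on behalf of thread `tid`.
 * Granted if a unit is available; otherwise the thread ID is queued. */
int hsem_request(hsem_t *s, int tid)
{
    if (s->count > 0) {
        s->count--;
        return HSEM_GRANTED;
    }
    assert(s->waiting < HSEM_QUEUE_LEN);
    s->queue[(s->head + s->waiting) % HSEM_QUEUE_LEN] = tid;
    s->waiting++;
    return HSEM_BLOCKED;
}

/* Release the semaphore. If a thread is waiting, ownership passes to it
 * and its ID is returned so it can be made ready to run; otherwise the
 * count is incremented and HSEM_NO_THREAD is returned. */
int hsem_release(hsem_t *s)
{
    if (s->waiting == 0) {
        s->count++;
        return HSEM_NO_THREAD;
    }
    int tid = s->queue[s->head];
    s->head = (s->head + 1) % HSEM_QUEUE_LEN;
    s->waiting--;
    return tid;
}
```

The thread ID returned by hsem_release corresponds to the thread whose state would be changed from blocked to ready to run through the Thread Manager.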

 

Design of the Thread Interrupt Controller

The thread interrupt controller is still being designed, implemented, and tested. More documentation on this component will be available when it is more complete.

Design of the Software Libraries

The software libraries for the Hthreads system are designed to be API compatible with the POSIX Threads (pthreads) specification. The end goal of this design is that, using a simple wrapper, existing pthreads programs can be compiled into Hthreads programs.

Hardware Thread Interface

The Hardware Thread Interface (HWTI) enables threads to run within the FPGA. The HWTI allows hardware threads to be created, exited, accessed, and synchronized with all other system threads through the Hybridthreads library APIs. The HWTI maintains the state of a hardware thread and provides execution control over the thread. It promotes portability by encapsulating platform-specific signals within generic APIs callable by the user code. See Image:HwtiDesignDocument.pdf for more details on the HWTI.


The HWTI was originally written by Razali Jidin and later revised by Erik Anderson.

Run-Time Debugger

The Run-Time Debug component supports all hardware threads and components. This simple capability, similar to setting breakpoints in software, allows the programmer to stop and start the execution of any hardware component and dump the state of the component to a terminal for analysis. Starting and stopping a hardware component is controlled from a program running on the CPU, which writes commands into a central, memory-mapped control register and then accesses the state of a hardware thread through memory-mapped registers. This capability is fundamental for integrating and debugging hybrid threads and hardware/software co-designed components. Although critical, it is still primitive, and much work remains to be done in this key area. Mike Finley is the team leader for the Run-Time Debug support.
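
The control flow described above (write a command to a central control register, then read component state through registers) can be sketched with a host-side simulation. All details here are assumptions made for illustration: the command encoding, the number of components, and the dbg_ names are hypothetical, and each simulated component's "state" is simply a counter that advances once per clock while the component is running.

```c
#include <assert.h>
#include <stdint.h>

#define DBG_NUM_COMPONENTS 4     /* assumed number of hardware components */
#define DBG_CMD_HALT       0x1u  /* assumed command codes                 */
#define DBG_CMD_RESUME     0x2u
#define DBG_CMD_SHIFT      8u    /* assumed: command above bit 8, ID in bits 0..7 */
#define DBG_ID_MASK        0xFFu

typedef struct {
    uint32_t halted[DBG_NUM_COMPONENTS];  /* 1 while the component is stopped */
    uint32_t state[DBG_NUM_COMPONENTS];   /* simulated per-component state    */
} dbg_sim_t;

void dbg_sim_init(dbg_sim_t *d)
{
    for (uint32_t i = 0; i < DBG_NUM_COMPONENTS; i++) {
        d->halted[i] = 0;
        d->state[i] = 0;
    }
}

/* Pack a command and a component ID into one control-register word. */
uint32_t dbg_make_cmd(uint32_t cmd, uint32_t id)
{
    return (cmd << DBG_CMD_SHIFT) | (id & DBG_ID_MASK);
}

/* Model of a write to the central control register. */
void dbg_write_control(dbg_sim_t *d, uint32_t word)
{
    uint32_t cmd = word >> DBG_CMD_SHIFT;
    uint32_t id  = word & DBG_ID_MASK;
    if (id >= DBG_NUM_COMPONENTS) return;     /* ignore unknown components */
    if (cmd == DBG_CMD_HALT)        d->halted[id] = 1;
    else if (cmd == DBG_CMD_RESUME) d->halted[id] = 0;
}

/* Advance every running component by one simulated clock cycle. */
void dbg_sim_clock(dbg_sim_t *d)
{
    for (uint32_t i = 0; i < DBG_NUM_COMPONENTS; i++)
        if (!d->halted[i]) d->state[i]++;
}

/* Model of a read from a component's state register. */
uint32_t dbg_read_state(const dbg_sim_t *d, uint32_t id)
{
    assert(id < DBG_NUM_COMPONENTS);
    return d->state[id];
}
```

A debugging program in this model would halt a component with dbg_write_control(&d, dbg_make_cmd(DBG_CMD_HALT, id)), read and print its state with dbg_read_state, and then issue DBG_CMD_RESUME to let it continue.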


Last Updated ( Tuesday, 01 August 2006 )
 
© 2008 HThreads