# An FPGA Implementation of an Adaptive Reconfigurable Image Encoder

#### Sarin G. Mathen, Joseph B. Evans

Information and Telecommunication Technology Center University of Kansas, USA

> HPEC 2000 20-21 September 2000





## **Adaptive Image Encoding - Motivation**

#### Why adaptive image compression?

- Application-specific compression requirements, e.g., video conferencing, streaming movies ...
- Changing bandwidth availability of the underlying network, e.g., peak usage times ...

#### **Key requirements?**

- Support different levels of compression
- Real time performance in both encoding and switching between codecs

#### An FPGA based solution (Xilinx 4k series FPGAs + Wildforce PCI board)

- Computationally intensive problems dictate a hardware intensive solution
- Same piece of silicon can be re-used for different configurations/codecs
- Real time performance in encoding: about 10 frames/second achieved
- Real time performance in switching between configurations







## **Wavelet Transform & Image Compression**

### (2,2) Cohen Daubechies Feauveau wavelet

- A sequence of pixels is represented by a set of *average* and *difference* coefficients
- Average coefficients are further resolved into the next level of *average* and *difference* coefficients, giving multiple levels of wave-letting
- Filter implemented with *lifting scheme*
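
The average/difference split above can be sketched with a lifting implementation. The slides do not give the exact arithmetic, so this sketch assumes the common integer-rounded form of the (2,2) lifting steps (predict, then update) with mirrored boundaries; the function names `cdf22_forward` and `cdf22_inverse` are illustrative only.

```python
def cdf22_forward(x):
    """One level of an integer (2,2) lifting transform on an even-length list.

    Returns (averages, differences). Rounding and boundary handling are
    assumptions; the slides only say that a lifting scheme is used.
    """
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: difference = odd sample minus the mean of its two even neighbours.
    # The missing right neighbour of the last odd sample is mirrored.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update: average = even sample plus a quarter of the two adjacent differences.
    # The missing left difference of the first even sample is mirrored.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def cdf22_inverse(s, d):
    """Undo cdf22_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


pixels = [10, 12, 14, 16, 18, 20, 22, 24]
averages, differences = cdf22_forward(pixels)
print(averages, differences)  # [10, 14, 18, 23] [0, 0, 0, 2]
```

On a smooth ramp the differences are zero except at the mirrored boundary, which is what makes the later thresholding and run length steps effective. Because each lifting step is reversed exactly, the integer form reconstructs the input without loss.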

#### Wavelet transform based image compression

- DWT of input image over 3 levels of wave-letting
- Coefficients are <u>quantized</u>: coefficients in each subband are quantized separately
- Coefficients are zero thresholded: different subbands have different thresholds
- Long runs of zeroes are run length encoded
- The coefficients are then entropy encoded

### Achieving different compression ratios

- The error induced by truncating a coefficient to 0 is proportional to its magnitude.
- Different zero thresholds result in different noise levels and compression levels.
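
A small sketch of this trade-off, using made-up coefficient values and thresholds (neither is taken from the slides). It measures error in the coefficient domain only, as a rough stand-in for the noise seen in the reconstructed image.

```python
def zero_threshold(coeffs, threshold):
    """Truncate coefficients whose magnitude is below `threshold` to zero."""
    return [0 if abs(c) < threshold else c for c in coeffs]


def squared_error(original, modified):
    """Sum of squared differences between two coefficient lists."""
    return sum((a - b) ** 2 for a, b in zip(original, modified))


# Hypothetical subband contents, for illustration only.
subband = [0, 3, -1, 12, -7, 2, 0, -25]
for t in (2, 4, 8):
    kept = zero_threshold(subband, t)
    print(f"threshold={t}: zeros={kept.count(0)}, squared error={squared_error(subband, kept)}")
```

Raising the threshold produces more zeros (longer runs, hence better compression) at the cost of a larger error, since each zeroed coefficient contributes the square of its own magnitude.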



FPGA Implementation of an Adaptive Reconfigurable Image Encoder (Adaptive Computing Systems project - University of Kansas)





## **Design Specs and Design Partition**

#### Specs

- Input image: 512x512 pixel gray scale frame, 8 bits/pixel
- Output image: Compressed DWT coefficients
- Support 3 different configurations of encoder with varying levels of compression

#### **Design Partition - into 2 stages**

- <u>Stage 1</u>: DWT coefficients over 3 levels of wave-letting
- <u>Stage 2</u>: Dynamic Quantization, Zero thresholding, RLE of zeroes, and Entropy encoding
- The 2 stages are implemented on 2 separate FPGAs

## **Stage 1 - DWT Coefficients**

- Level 1: 512 pixels in a row => 256 average + 256 difference coefficients
- Level 2: 256 coefficients from level 1 => 128 + 128 coefficients
- Level 3: 128 coefficients from level 2 => 64 + 64 coefficients
- Symmetric extension of coefficients at the boundaries
- In place computation
- Each level computed along X and Y directions
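
The level structure above can be sketched in software as follows. This is an illustration under assumptions, not the hardware data path: `lift_1d` is a hypothetical integer (2,2) lifting step with mirrored boundaries, and "in place" is interpreted here as writing each level's results back into the same array, with averages packed ahead of differences.

```python
def lift_1d(x):
    """One level of integer (2,2) lifting; returns averages followed by differences."""
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Mirrored (symmetric) extension supplies the missing neighbours at the edges.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s + d


def dwt2d_inplace(img, levels=3):
    """Apply `levels` of row/column lifting to a square image stored as a list of rows."""
    size = len(img)
    for _ in range(levels):
        # X direction: transform every row of the active size x size block.
        for r in range(size):
            img[r][:size] = lift_1d(img[r][:size])
        # Y direction: transform every column of the same block.
        for c in range(size):
            col = lift_1d([img[r][c] for r in range(size)])
            for r in range(size):
                img[r][c] = col[r]
        # The next level only processes the average (top-left) quadrant.
        size //= 2
    return img


image = [[100] * 16 for _ in range(16)]  # small flat test image
dwt2d_inplace(image)
print(image[0][:4])  # [100, 100, 0, 0]
```

For a 512x512 frame the active block shrinks from 512 to 256 to 128, leaving a 64x64 block of average coefficients after the third level.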





### **Dynamic Quantization** of DWT coefficients

The dynamic range of coefficients in each subband is divided into 16 levels.
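
One way to read this is as a per-subband uniform quantizer. The sketch below assumes equal-width bins between the subband's minimum and maximum and reconstruction at bin centres; the slides state only that the range is divided into 16 levels, so the bin layout and the names used here are assumptions.

```python
def quantize_subband(coeffs, levels=16):
    """Map each coefficient to a bin index in [0, levels) over the subband's own range."""
    lo, hi = min(coeffs), max(coeffs)
    if hi == lo:
        # Degenerate subband: every coefficient falls in bin 0.
        return [0] * len(coeffs), lo, 0.0
    step = (hi - lo) / levels
    # The maximum value would land in bin `levels`, so clamp it to the last bin.
    indices = [min(int((c - lo) / step), levels - 1) for c in coeffs]
    return indices, lo, step


def dequantize_subband(indices, lo, step):
    """Reconstruct each coefficient at the centre of its bin."""
    return [lo + (i + 0.5) * step for i in indices]


indices, lo, step = quantize_subband([-32, -16, 0, 15, 31, 32])
print(indices)                                 # [0, 4, 8, 11, 15, 15]
print(dequantize_subband(indices, lo, step))   # [-30.0, -14.0, 2.0, 14.0, 30.0, 30.0]
```

Because `lo` and `step` are derived from each subband, the quantizer adapts to that subband's dynamic range, which is presumably what "dynamic" refers to here.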

### Run length encoding of 0's

Coefficients below the zero threshold are truncated to 0. Continuous sequences of 0's are then Run Length Encoded.
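
A minimal sketch of these two steps combined. The symbol format (`"Z"` for a zero run, `"C"` for a literal coefficient) is a hypothetical choice for illustration; the slides do not describe how runs are represented.

```python
def rle_zeros(coeffs, threshold):
    """Zero-threshold the coefficients and collapse each run of zeros into one symbol."""
    symbols = []
    run = 0
    for c in coeffs:
        if abs(c) < threshold:
            run += 1                      # coefficient is truncated to 0
        else:
            if run:
                symbols.append(("Z", run))
                run = 0
            symbols.append(("C", c))      # coefficient survives unchanged
    if run:
        symbols.append(("Z", run))        # flush a trailing run
    return symbols


def rle_decode(symbols):
    """Expand the symbols back into a coefficient list."""
    out = []
    for kind, value in symbols:
        out.extend([0] * value if kind == "Z" else [value])
    return out


encoded = rle_zeros([5, 1, 0, -1, 9, 0, 0, 0, -6, 2], threshold=3)
print(encoded)
# [('C', 5), ('Z', 3), ('C', 9), ('Z', 3), ('C', -6), ('Z', 1)]
```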

#### **Entropy Encoding**

8-bit coefficients are variable-length encoded into codewords of 3 to 18 bits.

Coefficients are then packed into 32 bit words.
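
The packing step can be sketched as a bit accumulator. The actual code table is not given in the slides, so the codewords below are placeholders; MSB-first ordering and zero padding of the final word are also assumptions.

```python
def pack_codewords(codewords):
    """Pack (value, bit_length) pairs MSB-first into a list of 32-bit words."""
    words = []
    acc = 0     # bits collected so far that have not been emitted
    nbits = 0   # number of valid bits in acc
    for value, length in codewords:
        assert 3 <= length <= 18 and 0 <= value < (1 << length)
        acc = (acc << length) | value
        nbits += length
        while nbits >= 32:
            nbits -= 32
            words.append((acc >> nbits) & 0xFFFFFFFF)  # emit the oldest 32 bits
            acc &= (1 << nbits) - 1                    # keep only the leftover bits
    if nbits:
        # Zero-pad the final partial word on the right.
        words.append((acc << (32 - nbits)) & 0xFFFFFFFF)
    return words


# Placeholder codewords: two 18-bit codes of all ones (36 bits in total).
print([hex(w) for w in pack_codewords([(0x3FFFF, 18), (0x3FFFF, 18)])])
# ['0xffffffff', '0xf0000000']
```

Since codewords are between 3 and 18 bits long, a codeword can straddle a word boundary, which is why the leftover bits are carried over in the accumulator rather than discarded.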




## **Encoder at Different Compression Levels**








## **Compression Ratio and Noise Metrics**

 $MSE = \frac{1}{512 \times 512} \sum_{i=1}^{512} \sum_{j=1}^{512} [p(i, j) - p'(i, j)]^2$ 

 $RMSE = \sqrt{MSE} \qquad PSNR = 20\log_{10}(255 / RMSE)$ 
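
These metrics can be computed as in the sketch below, assuming $p$ is the original image and $p'$ the reconstructed image (the slides do not define the symbols) and that images are stored as lists of rows of 8-bit pixel values.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized grayscale images."""
    rows, cols = len(original), len(original[0])
    total = sum(
        (original[i][j] - reconstructed[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    return total / (rows * cols)


def psnr_from_rmse(rmse, peak=255):
    """PSNR in dB for a given RMSE; `peak` is the largest 8-bit pixel value."""
    return 20 * math.log10(peak / rmse)


p = [[10, 20], [30, 40]]
p_rec = [[12, 20], [30, 36]]
print(mse(p, p_rec))                      # 5.0
print(round(psnr_from_rmse(7.368), 2))    # 30.78
```

The second call uses an RMSE of 7.368, the value reported for LENA at minimum compression, and gives a PSNR that agrees with the reported figure to within rounding.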

|                                     | LENA          |       |              |        | BARBARA       |       |              |        | GOLDHILL      |       |              |        |
|-------------------------------------|---------------|-------|--------------|--------|---------------|-------|--------------|--------|---------------|-------|--------------|--------|
| Configuration                       | Comp<br>ratio | bpp   | PSNR<br>(dB) | RMSE   | Comp<br>ratio | bpp   | PSNR<br>(dB) | RMSE   | Comp<br>ratio | bpp   | PSNR<br>(dB) | RMSE   |
| Config. 1<br>Minimum<br>compression | 9.11          | 0.878 | 30.783       | 7.368  | 8.82          | 0.906 | 24.891       | 14.520 | 8.7           | 0.916 | 29.733       | 8.314  |
| Config. 2<br>Medium<br>compression  | 47.18         | 0.169 | 29.630       | 8.414  | 32.01         | 0.249 | 24.412       | 15.343 | 43.18         | 0.185 | 28.001       | 10.150 |
| Config. 3<br>Maximum<br>compression | 69.58         | 0.114 | 28.040       | 10.104 | 53.33         | 0.149 | 23.525       | 16.992 | 72.09         | 0.110 | 26.355       | 12.267 |
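
The compression ratio and bpp columns are linked through the 8 bits/pixel source format: bpp is approximately 8 divided by the compression ratio. A short sketch of that relationship follows; the helper names are illustrative, and the frame-size helper assumes the 512x512 input from the design specs.

```python
def bits_per_pixel(compression_ratio, source_bpp=8):
    """Average bits per pixel after compression of an 8 bits/pixel source."""
    return source_bpp / compression_ratio


def compressed_frame_bits(compression_ratio, width=512, height=512, source_bpp=8):
    """Approximate size in bits of one compressed frame."""
    return width * height * source_bpp / compression_ratio


for name, ratio in (("Config. 1", 9.11), ("Config. 2", 47.18), ("Config. 3", 69.58)):
    print(f"{name}: {bits_per_pixel(ratio):.3f} bpp, "
          f"{compressed_frame_bits(ratio) / 8 / 1024:.1f} KiB per frame")
```

The ratios used here are the LENA values from the table; the computed bpp figures match the tabulated ones to within a few thousandths.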






## **Implementation Costs and Timing Results**

#### **Device Utilization on Xilinx XC4085 XLA**

| Block              | LUTS<br>(4) | LUTS<br>(3) | CLB<br>flops | Total<br>CLBs | I/O<br>Bufs | I/O<br>flops | Gate<br>count | Timing<br>(MHz) |
|--------------------|-------------|-------------|--------------|---------------|-------------|--------------|---------------|-----------------|
| Stage 1            | 547         | 109         | 406          | 399<br>(12%)  | 75          | 88           | 8244          | 26.553          |
| Stage 2<br>Conf. 1 | 1248        | 356         | 924          | 890<br>(28%)  | 77          | 88           | 17058         | 36.381          |
| Stage 2<br>Conf. 2 | 1297        | 367         | 975          | 948<br>(30%)  | 77          | 88           | 17937         | 31.254          |
| Stage 2<br>Conf. 3 | 1297        | 373         | 965          | 925<br>(29%)  | 77          | 88           | 17830         | 34.632          |

- LUTS(4): 4-input look-up tables
- LUTS(3): 3-input look-up tables
- CLB: Configurable Logic Block; the XC4085 has 57x57 = 3249 CLBs

**Timing** : Results of static timing analysis in terms of maximum allowable clock rate


