DESIGN AND ANALYSIS OF H.264 DECODER BY USING VERILOG

ABSTRACT

H.264/AVC is the latest video compression/decompression standard for future broadband systems. The standard was produced by the Joint Video Team (JVT), formed from the ITU-T Video Coding Experts Group and the ISO/IEC MPEG standardization committee. In this project, H.264 decoder functional blocks such as Context-based Adaptive Binary Arithmetic Coding (CABAC), Inverse Quantization and Inverse Discrete Cosine Transform are designed using Verilog. CABAC comprises three basic building blocks: context modeling, binary arithmetic coding and inverse binarization. The compressed bitstream from the NAL unit is expanded by the CABAC module to generate the various syntax elements. The basic arithmetic decoding circuit units are designed so that they can be shared efficiently by all syntax elements. The Inverse Quantization and Inverse Discrete Cosine Transform functional blocks are used to reconstruct the original image pixel values.

1. INTRODUCTION

H.264 is a new video compression scheme that is becoming the worldwide digital video standard for consumer electronics and personal computers. In particular, H.264 has already been selected as a key compression scheme (codec) for the next generation of optical disc formats, HD-DVD and Blu-ray Disc (sometimes referred to as BD or BD-ROM). H.264 has been adopted by the Moving Picture Experts Group (MPEG) as a key video compression scheme in the MPEG-4 format for digital media exchange. H.264 is sometimes referred to as “MPEG-4 Part 10” (part of the MPEG-4 specification), or as “AVC” (MPEG-4’s Advanced Video Coding). This new compression scheme has been developed in response to technical factors and the needs of an evolving market:

• MPEG-2 and other older video codecs are relatively inefficient.

• Much greater computational resources are available today.

• High Definition video is becoming pervasive, and there is a strong need to store and transmit the larger quantity of HD data more efficiently (about 6 times more than Standard Definition video).

H.264, MPEG-4 Part 10, or AVC (for Advanced Video Coding), is a digital video codec standard that is noted for achieving very high data compression. It was written by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a collective partnership effort known as the Joint Video Team (JVT). It aims to increase the compression rate significantly while transmitting high-quality images at both high and low bit rates. Three profiles were initially defined, each with several levels.

Fig 1. Overview

1.1 General Introduction

H.264 represents the state of the art in current video coding standards, and it is increasingly adopted in the consumer electronics market. We see movies in the cinema, we rent movies on DVD or Blu-ray discs and we watch movies and our favorite series every evening on television. Nowadays we can also download our favorite movies and series from the Internet and watch them on our computers, HD televisions or mobile devices. We can even record our own movies with a digital camcorder, edit them on our computer and show them to our family at birthday parties.

In the last couple of years there has been a tremendous shift in the world of consumer video from VHS to DVD to Blu-ray and to even more exotic standards found on the Internet. The demand for better picture quality, smaller file sizes, lower energy consumption, lower appliance cost and the ability to watch movies anytime and anywhere has driven this shift towards ever better video compression standards.

To provide better compression of video images, the Moving Picture Experts Group and the Video Coding Experts Group (MPEG and VCEG) have developed a successor to the earlier MPEG-4 and H.263 standards. The new standard is called Advanced Video Coding (AVC) and is published jointly as MPEG-4 Part 10 and H.264 [14, 16]. It achieves very high compression efficiency compared to earlier standards [19]. It can handle a wide range of applications and is friendlier to networks such as the Internet. The downside of the increased compression efficiency is that the decoder complexity also grows. Context-based Adaptive Binary Arithmetic Coding (CABAC) is one of the two alternative entropy coding methods specified in H.264. The other alternative is Context-based Adaptive Variable Length Coding (CAVLC). With CABAC, the H.264 standard improves compression efficiency by up to 50%, entailing a frame rate increase of 25% to 30% with a bit-rate reduction of up to 16%.

1.2. REVIEW OF LITERATURE

M. Jeanne, C. Guillemot, T. Guionnet, and F. Pauchet [3] address the problem of error-resilient decoding of bitstreams produced by the CABAC (context-based adaptive binary arithmetic coding) algorithm used in the H.264 video coding standard. They describe a maximum a posteriori (MAP) estimation algorithm that improves CABAC decoding performance in the presence of transmission errors. Methods that improve the re-synchronization and error detection capabilities of the decoder are then described. A variant of the CABAC algorithm that supports error detection based on a forbidden interval is presented. The performance of the decoding algorithm is first assessed with theoretical sources and with different binarization codes. It is compared against the performance obtained with Exp-Golomb codes and with a transmission chain that makes use of an error-correcting code. The approach has been integrated in an H.264/MPEG-4 AVC video coder and decoder, and the PSNR gains obtained are discussed.

Peng Zhang, Wen Gao, Don Xie, and Di Wu present an efficient VLSI architecture for H.264/AVC CABAC decoding. They introduce several new techniques to exploit, to the largest extent possible, the parallelism of the decoding process, including line-bit-rate decoding, multiple-bin arithmetic decoding and an efficient probability propagation scheme. The CABAC engine can ensure real-time decoding for H.264/AVC main profile HD level 4.0. Synthesis results show that the multi-bin decoder can run at up to 45 MHz, and the total area is only 42K gates.

Wei Yu and Yun He propose a high-performance hardware architecture for a CABAC decoder. CABAC is the context-adaptive binary arithmetic coding used in the H.264/AVC video standard, which achieves significant compression enhancement while bringing greater complexity and implementation cost. The necessity of hardware implementation for real-time CABAC decoders is introduced, and then a fast and cost-effective architecture is proposed. The new architecture achieves an average decoding speed of 500 cycles/macroblock for a typical 4M bit stream at D1 resolution, 30 frame/s. An ASIC implementation of the new architecture is carried out in a 0.18 µm silicon technology. The estimated area is 0.3 mm² and the critical path is limited to within 6.7 ns.

Yao-Chang Yang, Chien-Chang Lin, Hsui-Cheng Chang, Ching-Lung Su, and Jiun-In Guo present a high-throughput VLSI architecture design for context-based adaptive binary arithmetic decoding (CABAD) in MPEG-4 AVC/H.264. To speed up the inherently sequential operations in CABAD, they break down the processing bottleneck by proposing a look-ahead codeword parsing technique on segmented context tables with cache registers, which reduces the cycle count by up to 53% on average. Based on a 0.18 µm CMOS technology, the proposed design outperforms the existing design by reducing hardware cost by 40% while achieving about 1.6 times the data throughput.

Y. S. Yi and I. C. Park present a high-performance entropy decoding system for H.264/AVC. The system includes a variable length coding (VLC) decoder, a context-based adaptive binary arithmetic coding (CABAC) decoder, and a run-level converter. Each syntax element above slice data is decoded by the VLC decoder within one clock cycle. The CABAC decoder decodes the syntax elements at slice data level and below through a pipeline mechanism. A run-level converting method is designed and integrated into the system to improve the output efficiency of decoded quantized coefficients. The system can process 574,712 macroblock/s when operating at 300 MHz, which is sufficient for real-time decoding of H.264/AVC HD level 4.1 video sequences.

2. CABAC ENCODING AND DECODING PROCESS

2.1.1 Terminology

To provide a better understanding of the H.264 standard it is important to explain the terminology it uses [14]. A coded picture consists of an encoded field (of interlaced video) or a frame (of progressive or interlaced video). Each coded frame has its own frame number and each field has its picture order count, which defines the decoding order. Reference pictures can be used to inter predict further coded pictures.

A coded picture is made of a number of macro blocks. These macro blocks each contain 16 x 16 luma samples and the associated 8 x 8 Cb and 8 x 8 Cr chroma samples. The macro blocks are arranged in slices. A slice is a set of macro blocks in raster scan order. An I slice may only contain I type macro blocks, a P slice may contain P and I type macro blocks and a B slice may contain I-type and B-type macro blocks. Intra prediction is used to predict I-type macro blocks from decoded samples in the current slice. P-type macro blocks are predicted from reference pictures using inter prediction. The prediction of each macro block is done from one picture. The B-type macro blocks are also predicted using inter prediction from reference pictures, but two pictures may be used to predict.
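The macroblock and slice rules above can be summarised in a short sketch. Python is used purely for illustration; the names `ALLOWED_MB_TYPES`, `samples_per_macroblock` and `slice_is_valid` are hypothetical and do not come from the standard.

```python
# Illustrative sketch only: all names are hypothetical, not taken from H.264.

# Macroblock types permitted in each slice type, as described in the text.
ALLOWED_MB_TYPES = {
    "I": {"I"},
    "P": {"P", "I"},
    "B": {"B", "I"},
}


def samples_per_macroblock():
    """Count the samples in one macroblock: 16x16 luma plus 8x8 Cb and 8x8 Cr."""
    luma = 16 * 16
    chroma = 2 * (8 * 8)  # one Cb block and one Cr block
    return luma + chroma


def slice_is_valid(slice_type, mb_types):
    """Return True if every macroblock type is allowed in the given slice type."""
    return all(mb in ALLOWED_MB_TYPES[slice_type] for mb in mb_types)
```

For example, a P slice holding both P-type and I-type macroblocks passes the check, whereas an I slice holding a P-type macroblock does not.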

2.1.2 The H.264 Codec

Figure 2.1: H.264 Encoder.

As with earlier coding standards, H.264 does not define an encoder/decoder pair but rather defines the syntax of an encoded video stream so that it can be decoded properly. This means that everyone is free to design his or her own hardware as long as the encoded video stream can properly be decoded by any decoder. Most encoders and decoders will have similar basic functional elements, as shown in figure 2.1 and figure 2.2.

The encoder has a forward dataflow path and a reconstruction dataflow path. The decoder has only a reconstruction dataflow path. We will look deeper into the decoding dataflow path.

Figure 2.2: H.264 Decoder

The decoder receives a compressed bitstream from the network abstraction layer (NAL). Before reaching the NAL, the video data is stored on a hard disk or transmitted over a transmission line. First the compressed data is entropy decoded to produce a set of quantized coefficients X. These are scaled and inverse transformed to produce a residual difference block Dn’, identical to the Dn’ shown in the encoder. From the decoded header information, the decoder creates a prediction block PRED, which is also identical to the prediction block PRED in the encoder. PRED is added to Dn’ to produce uFn’. Finally, uFn’ is filtered to create each decoded block Fn’.
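The addition of the prediction block and the residual block can be sketched as follows. This is a minimal illustration, assuming 8-bit samples that are clipped to the range 0 to 255; the function name `reconstruct_block` is hypothetical, and the subsequent filtering step is not shown.

```python
def reconstruct_block(pred, residual, max_value=255):
    """Add the residual block to the prediction block, sample by sample.

    Blocks are lists of rows. Clipping to [0, max_value] assumes 8-bit
    samples, which is an assumption of this sketch.
    """
    return [
        [min(max(p + d, 0), max_value) for p, d in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]
```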

2.1.3 H.264 structure

The H.264 standard defines three profiles. The profiles support a particular set of coding functions as can be seen in figure 2.3. The baseline profile supports intra and inter-prediction coding and entropy coding with context-adaptive variable-length codes (CAVLC). The main profile includes support for Context-based adaptive binary arithmetic coding (CABAC), interlaced video, inter coding using weighted prediction and inter-coding using B-slices. The extended profile adds modes to enable efficient switching between coded bit streams and improved error resilience, but does not support interlaced video and CABAC.

units are to be entropy encoded to be sent to the Network Abstraction Layer (NAL). The compressed bitstream from the entropy encoder is made of the entropy-encoded coefficients and side information to decode each block within a macroblock. The side information includes prediction modes, quantizer parameters, motion vector information etc. The compressed bitstream is sent to the Network Abstraction Layer for transmission or storage.

In the main profile CABAC can be selected for the entropy encoding process. The alternative is CAVLC. When CABAC is selected for the entropy encoding process, the syntax elements are routed to a CABAC encoding algorithm to achieve good compression performance. This is done by:

• selecting probability models for each syntax element according to the element’s context;

• adapting probability estimates based on local statistics; and

• using arithmetic coding rather than variable-length coding.

3. OVERVIEW OF THE CABAC DECODING SCHEME

3.1 CABAC Encoding Steps

As could be seen in figure 2.4 in the previous chapter the CABAC encoding process consists of the following three steps:

• binarization;

• probability modeling;

• binary arithmetic coding.

In the binarization process of CABAC encoding, a given non-binary-valued syntax element is uniquely mapped to a binary sequence [32, 31]. This binary sequence is called a bin string. If the syntax element is already a binary sequence, the binarization process can be bypassed. In the probability modeling process, a probability model is selected for each bin of the bin string. The choice of the probability model may depend on previously encoded syntax elements or bins.
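As an illustration of such a unique mapping, the sketch below uses a plain unary code, which is one of the simplest binarization schemes. It is only an example: the scheme actually applied depends on the syntax element, and the function names are hypothetical.

```python
def unary_binarize(value):
    """Map a non-negative integer to a unary bin string: `value` ones, then a zero."""
    if value < 0:
        raise ValueError("unary binarization needs a non-negative integer")
    return [1] * value + [0]


def unary_debinarize(bins):
    """Inverse mapping: count the ones that precede the terminating zero."""
    return bins.index(0)
```

Because every bin string ends at its first zero, no two values share a bin string and the decoder can tell where each one stops.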

After the selection of the probability model the bin enters the arithmetic coding process where the bins are entropy coded into the bitstream. In the binary arithmetic coding process also the model update takes place for the subsequent bins in the probability modeling process. The two last steps can also be bypassed if there is no need for a probability modeling. This can be the case if there is equal probability of the value of the syntax elements. The encoding of the bin values takes place in the bypass coding engine.

3.2 CABAC Decoding

The CABAC decoding process is the inverse of the CABAC encoding process [13, 12]. First the corresponding context model is selected to decode the bin. The bin is then decoded using the arithmetic decoding engine, which is quite similar to the binary arithmetic encoding engine.

3.2.1 FFmpeg

The CABAC decoder is based on the H.264 standard and on the FFmpeg implementation of the standard. FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec, a leading audio/video codec library. FFmpeg is free software and is licensed under the LGPL or GPL. libavcodec includes a highly optimized version of the CABAC decoder for the Intel processor platform, but we used the less optimized general implementation of the H.264 decoder with the CABAC decoder in it.

As we have seen in the second chapter, each H.264 video sequence consists of frames. Each frame is built up out of one or more slices and each slice can have one or more macroblocks. Macroblocks are the units that carry the 16×16 luma samples and the associated 8×8 Cr and Cb chroma samples. When the video sequence reaches the CABAC decoder, it has just been received by the Network Abstraction Layer, either from transmission or from storage. The video sequence consists of a bitstream of encoded and compressed syntax elements. These syntax elements are only readable after the first step in the decoding process, after which they can be used to reconstruct the original frame. The first step is the entropy decoder, in our case CABAC (Context-based Adaptive Binary Arithmetic Coding).

3.2.2 Context Model Selection

The first step in the decoding process is to initialize the CABAC decoder [11, 10]. This is done every time a new slice starts. Together with the encoded syntax elements or bins, extra information is sent with the bitstream, for example the Quantization Parameters. The initial values of the Context Model Selection table depend on these Quantization Parameters. They also depend on some other parameters, which improves adaptation to different types of video content.

There are a total of 366 different context models, all of which are initialized into the table at the beginning of each slice. Depending on the parameters, a large number of different tables could be selected as the initial Context Model Selection table.
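The dependence of the initial context state on the quantization parameter can be sketched as below. The sketch follows the initialization rule of the H.264 specification as we understand it, in which each context model has a pair of constants (m, n). The function names and the example values of m, n and QP are illustrative assumptions, and the output uses the standard's representation (a state index plus an MPS value) rather than FFmpeg's combined state byte.

```python
def clip3(lower, upper, value):
    """Clamp `value` to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def init_context(m, n, slice_qp):
    """Return (state_index, mps) for one context model at the start of a slice.

    m and n are the per-context initialization constants; the pairs used in
    the example below are illustrative, not a copy of the standard's tables.
    """
    pre_state = clip3(1, 126, ((m * clip3(0, 51, slice_qp)) >> 4) + n)
    if pre_state <= 63:
        return 63 - pre_state, 0   # most probable symbol is 0
    return pre_state - 64, 1       # most probable symbol is 1
```

With m = 20, n = -15 and a slice QP of 26, the intermediate value is 17, giving state index 46 with 0 as the most probable symbol.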

3.2.3 Coding engine

The coding engine consists of two registers, named Range and Low (or Value) [20]. At the beginning of a decoding sequence, i.e. at the beginning of a new slice, the coding engine is initialized. Range is set to 0x1FE and the first 9 bits of the bitstream are loaded into the Low register. The CABAC engine is now initialized and can be used to decode the bitstream into bins. A bin string is a string of bits that represents a syntax element. Some syntax elements are just the bits found in the bin string, but other syntax elements are represented as symbols and must be de-binarized. The decoding engine is called either in regular mode or in bypass mode. In bypass mode the context model selection table is not used. In regular mode the decoding engine has to know which context model or state to use. The state is the value found in the context model selection table at a specified index. Each bit is decoded either with the same state as the previously decoded bit of the bin string or with a different one. We now have values for Range, Low and State, and the arithmetic decoder can perform a first iteration.
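The initialization described above can be sketched in a few lines. The `BitReader` helper and the function name `init_decoding_engine` are hypothetical and serve only to make the example self-contained; padding with zeros past the end of the buffer is likewise an assumption of this sketch.

```python
class BitReader:
    """Hypothetical helper that reads bits MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_bit(self):
        if self.pos >= 8 * len(self.data):
            return 0  # assumption: pad with zeros past the end of the buffer
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bit

    def read_bits(self, count):
        value = 0
        for _ in range(count):
            value = (value << 1) | self.read_bit()
        return value


def init_decoding_engine(reader):
    """Set Range to 0x1FE and load the first 9 bits of the bitstream into Low."""
    rng = 0x1FE
    low = reader.read_bits(9)
    return rng, low
```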

Figure 3.1: Arithmetic decoding engine for one bin

Every iteration of the arithmetic decoder will have one bit as a result. What the result is, depends on the value of Low compared to Range. While the range register keeps track of the width of the current interval, the Low register keeps track of the input bitstream. The range is split in two intervals: rLPS and rMPS. The rLPS is the estimated probability interval of the Least Probable Symbol. rMPS is the estimated probability interval of the Most Probable Symbol.

The rLPS value is read from a fixed table that is indexed by the first two bits of the Range value and six bits of the State value. The value of the input bitstream, named Low, falls into one of the two intervals, rLPS or rMPS. This decides whether the bit is decoded as an LPS or an MPS symbol. The result further depends on the LSB of the State value. If the result is MPS, then the LSB of the State value is the output bit.

If the result is LPS, then the output bit is the inverted LSB of the State value. Figure 3.1 shows both the case in which MPS occurs and the case in which LPS occurs. MPS occurs if Low is less than rMPS and LPS occurs if Low is greater than or equal to rMPS. After this iteration the values of Range and Low have to be updated according to equation (3.1).
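One iteration of the regular decoding mode can be sketched as follows. Equation (3.1) is not reproduced in this text, so the update rule used here is the conventional CABAC one and should be read as our assumption of what that equation states. The rLPS value is passed in directly rather than read from the fixed table, the state update is omitted, and the function name is hypothetical.

```python
def decode_decision(rng, low, r_lps, mps):
    """Decode one bin in regular mode, before renormalization.

    rng   : current interval width (Range)
    low   : current offset taken from the bitstream (Low)
    r_lps : width of the LPS sub-interval (would come from the fixed table)
    mps   : value of the most probable symbol (LSB of the state)

    Returns (decoded_bit, new_range, new_low).
    """
    r_mps = rng - r_lps
    if low < r_mps:
        # MPS path: keep the lower sub-interval, Low is unchanged.
        return mps, r_mps, low
    # LPS path: move into the upper sub-interval and output the inverted MPS.
    return 1 - mps, r_lps, low - r_mps
```

With Range = 0x1FE and rLPS = 128, rMPS is 382, so Low = 100 decodes as the MPS while Low = 400 decodes as the LPS and leaves Range = 128 and Low = 18.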

After this update step the next iteration can take place. To keep the precision of the decoding process, the MSB of Range must always be 1. To ensure this, Range is renormalized whenever a zero is detected as its MSB. The renormalization process shifts Range to the left until its MSB is 1 again. Zeros are shifted in at the LSB side so that the value remains 9 bits wide. Low is shifted left by the same amount as Range, but it receives new bits from the input bitstream at the LSB position. In this way the Low register keeps track of the position of the input bitstream within the current interval. In bypass mode no context model is needed because the two bin values are equally probable: the probability of the LPS is 0.5, so it suffices to compare the value of Low with the value of Range divided by two.

3.2.4 De-Binarization

In the last phase of CABAC decoding the resulting bits from the decoding engine are taken and de-binarized. A sequence of bits can form a bin string, which can be translated to a symbol. This symbol represents the syntax element that was encoded. Not all bin strings, and thus not all syntax elements, are represented by a symbol; some are simply the string of bits found in the bin string.
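Renormalization and bypass decoding can be sketched as follows. The function names and the `next_bit` callable, which stands in for the bitstream reader, are hypothetical. In the bypass sketch Low is doubled and compared with Range, which is equivalent to comparing Low with Range divided by two but avoids discarding the LSB of Range; this is the conventional formulation and an assumption about the implementation.

```python
def renormalize(rng, low, next_bit):
    """Shift Range left until bit 8 (the MSB of a 9-bit value) is 1.

    Zeros enter Range at the LSB side; Low is shifted by the same amount
    and receives fresh bits from the bitstream via `next_bit()`.
    """
    while rng < 0x100:
        rng <<= 1
        low = (low << 1) | next_bit()
    return rng, low


def decode_bypass(rng, low, next_bit):
    """Decode one equiprobable bin; Range is unchanged in bypass mode.

    Returns (decoded_bit, new_low).
    """
    low = (low << 1) | next_bit()
    if low >= rng:
        return 1, low - rng
    return 0, low
```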

To de-binarize the bins, the bitstream has to go through a decoding tree. We do not know beforehand where each bin string starts and where it ends, nor which bits from the decoding engine together form a syntax element. This makes de-binarization hard to parallelize and very time consuming. The whole tree has to be walked in order to obtain the right syntax element or symbol.
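The tree walk can be sketched as follows. The tree `EXAMPLE_TREE` and its symbols are invented for illustration; a real decoder uses a different tree per syntax element and interleaves the walk with arithmetic decoding, which this sketch leaves out.

```python
# Hypothetical decoding tree: inner nodes are dicts keyed by the bit value,
# leaves are the decoded symbols.
EXAMPLE_TREE = {0: "A", 1: {0: "B", 1: {0: "C", 1: "D"}}}


def debinarize(bits, tree):
    """Walk the tree bit by bit and emit a symbol each time a leaf is reached."""
    symbols = []
    node = tree
    for bit in bits:
        node = node[bit]
        if not isinstance(node, dict):
            symbols.append(node)  # leaf reached: one bin string is complete
            node = tree           # restart at the root for the next bin string
    return symbols
```

The boundary of each bin string only becomes known when a leaf is reached, which illustrates why the bits cannot simply be split up in advance and processed in parallel.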

3.3 Motivation

The complete CABAC decoder consists of three main stages: context model selection, arithmetic decoding and de-binarization. In this thesis we study the arithmetic decoding engine. We implement the arithmetic decoding engine in hardware and run it in a software/hardware co-design. The main reason for choosing the arithmetic decoding engine for hardware implementation is that it has very strong, uniform, iterative data dependencies between all stages of the algorithm. Every decoded bit depends on all previously decoded bits in the same slice, because the context model selection table is updated for every decoded bit in a slice and the next bit to be decoded can depend on that updated value. We would like to see how fast a software/hardware co-design implementation of the arithmetic decoding engine can be made, what the speedup is, and how the architecture of the hardware implementation can be arranged to obtain the largest increase in speed.

As we focus on the arithmetic decoding engine, we only decode one slice, so we initialize the context model selection table only once. We also do not consider the de-binarization phase of the CABAC decoder. This would, however, be a very good topic for further research. The de-binarizer could be added to our arithmetic decoding engine to measure whether an intelligent de-binarization architecture yields an additional speedup.

Since slices are independent of each other in terms of CABAC decoding, major improvement can be achieved with parallelism at the slice level. Every frame is made up of one or more slices and every slice is made up of one or more macroblocks. The focus in this thesis is to accelerate the decoding of independent slices. Several slice accelerators could be used in parallel to achieve higher frame decoding rates. The main bottleneck in these slice accelerators is the CABAC decoding stage. To accelerate slice decoding as a whole, acceleration of the CABAC decoding is necessary. This is done by building specialized hardware for CABAC decoding instead of decoding CABAC on a general-purpose processor.

4. METHODOLOGIES

4.1. Background

H.264/AVC is a new international standard recommendation published jointly by ITU-T VCEG (Video Coding Experts Group) and ISO/IEC MPEG (Moving Picture Experts Group). The main purpose of this standard is to provide a broad range of multimedia applications with higher reliability and more efficient encoding and decoding performance than former standards when transporting regular video over various networks. As H.264/AVC achieves an enhanced compression rate and better error resilience by employing some unique techniques, the computational complexity of coding is also increased.

4.2. Procedure

4.2.1. Basic concepts of video compression

4.2.1.1 Compression:

Compression is the process of coding that effectively reduces the total number of bits needed to represent certain information. Compression is useful because it reduces the volume of multimedia information and the bandwidth needed to transmit it.

4.2.1.2 Lossy compression

A lossy compression method is one where compressing data and then decompressing it retrieves data that may well be different from the original, but is close enough to be useful in some way. Lossy compression is most commonly used to compress multimedia data (audio, video, still images) especially in applications, such as streaming media and internet telephony.

There are two basic lossy compression schemes:

In lossy transform codecs, samples of picture or sound are taken, chopped into small segments, transformed into a new basis space, and quantized. The resulting quantized values are then entropy coded.

In lossy predictive codecs, previous and/or subsequent decoded data is used to predict the current sound sample or image frame. The error between the predicted data and the real data, together with any extra information needed to reproduce the prediction, is then quantized and coded. In some systems the two techniques are combined, with transform codecs being used to compress the error signals generated by the predictive stage.
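The predictive scheme can be illustrated with a one-dimensional sketch in which each sample is predicted from the previously decoded sample and only the quantized prediction error is stored. This is a generic DPCM example, not the prediction used in H.264; the function names and the uniform quantizer with step size `step` are assumptions.

```python
def dpcm_encode(samples, step):
    """Quantize the error between each sample and the previous decoded sample."""
    codes = []
    prediction = 0
    for sample in samples:
        code = round((sample - prediction) / step)
        codes.append(code)
        # Track the value the decoder will reconstruct, not the original,
        # so that encoder and decoder predictions stay in step.
        prediction += code * step
    return codes


def dpcm_decode(codes, step):
    """Rebuild the samples by accumulating the de-quantized errors."""
    decoded = []
    prediction = 0
    for code in codes:
        prediction += code * step
        decoded.append(prediction)
    return decoded
```

Because the encoder predicts from reconstructed values, the quantization error does not accumulate: each decoded sample differs from the original by at most half a step.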

4.2.1.3 Lossless compression

Lossless compression is a class of compression algorithms that allows the exact original data to be reconstructed from the compressed data. It is often used as a component within lossy data compression technologies. Lossless compression is used when no assumption can be made on whether certain deviation is uncritical.

4.2.2 Pixel

A pixel is the basic unit of the composition of an image on a television screen, computer monitor, or similar display. To produce a digitized version of the camera output, each scan line can be divided into a series of small areas called “pixels”. A pixel is generally the smallest addressable unit on a display screen or bitmapped image.

Each pixel can be encoded as a set of 3 integers: Red, Green, and Blue or Luminance, Y, Blue color difference, Cb, and Red color difference, Cr (Cg can be computed from these.)
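The relation between the two representations can be sketched as follows. The weighting coefficients are the commonly used ITU-R BT.601 values, which the text does not state, so they are an assumption of this sketch, as are the function names. No offset or range scaling is applied.

```python
# Luma weights, assumed to be the ITU-R BT.601 values.
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycbcr(r, g, b):
    """Convert RGB to luminance Y and the colour differences Cb and Cr."""
    y = KR * r + KG * g + KB * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Recover RGB; green follows from Y, R and B, so no Cg is transmitted."""
    r = y + cr / 0.713
    b = y + cb / 0.564
    g = (y - KR * r - KB * b) / KG
    return r, g, b
```

A grey pixel with equal R, G and B values yields Cb and Cr of zero, which is why the colour-difference components of typical images are cheap to compress.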

4.3 Block Diagram

Fig 2. Block diagram

4.3.1 Functional Blocks of H.264 Decoder

• Entropy decoding

• Inverse Quantization

• Inverse Transform

• Motion compensation

• Intra prediction

• De-blocking filter
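As an example of one of these blocks, the sketch below shows the butterfly structure of the 4×4 inverse integer transform used for residual blocks in H.264, as we understand it from the standard. It assumes that the input coefficients have already been scaled by the inverse quantization stage; the function names are hypothetical and this is not presented as the design implemented in this project.

```python
def _inverse_1d(d):
    """One-dimensional 4-point inverse transform using adds and shifts only."""
    e0 = d[0] + d[2]
    e1 = d[0] - d[2]
    e2 = (d[1] >> 1) - d[3]
    e3 = d[1] + (d[3] >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]


def inverse_transform_4x4(coeffs):
    """Apply the transform to each row, then each column, then round and scale."""
    rows = [_inverse_1d(row) for row in coeffs]
    # cols[c] holds the transformed column c, indexed by row.
    cols = [_inverse_1d([rows[r][c] for r in range(4)]) for c in range(4)]
    return [[(cols[c][r] + 32) >> 6 for c in range(4)] for r in range(4)]
```

A block that contains only a DC coefficient of 64 produces a flat residual block of ones, which is a convenient sanity check for a hardware test bench.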

5. IMPLEMENTATION OF THE CABAC DECODER

5.1 Overall system description

5.1.1 Introduction

The CABAC decoder is built around the Xilinx ML410 evaluation board [17, 25, 22, 27, 8]. This board includes the Xilinx Virtex 4 FPGA. This FPGA has two PowerPC 440 processors, of which we use only one. Part of the CABAC decoder is implemented in hardware: the part that we want to accelerate is built in hardware on the FPGA, and the rest of the CABAC decoder runs in software on the PowerPC processor (which also resides on the Virtex 4 FPGA). We only implement a part of the total H.264 video decoder, namely the CABAC decoder that we want to test. The system is therefore only capable of producing test results and cannot actually decode a video stream. It lacks the Network Abstraction Layer (NAL) and the H.264 decoder parts that follow the CABAC decoder.

The CABAC decoder software that runs on the PowerPC delegates some computationally intensive parts of the algorithm to the custom-built hardware [28, 29, 21, 24]. This hardware on the FPGA is built specifically to accelerate that part of the software and cannot be used for anything else. Communication between the PowerPC processor and the hardware accelerator takes place over the Auxiliary Processing Unit (APU) bus of the processor. This bus is specially designed to attach custom hardware accelerators to the processor’s local system.

The hardware accelerator can be handled by the processor via the APU through the use of a special processor instruction.

5.1.2 Validation

To validate and verify the different parts of the system, the system was tested with predefined test vectors. The hardware, written in VHDL, was simulated in ModelSim. A test bench was written to validate correct operation. Different input vectors were made and fed into the system. The input vectors were first run through the software, so that the software output could be compared with the hardware output.

At the level of hardware/software co-design, the system was run on the Xilinx ML410 development board. Since large parts of the H.264 video decoding algorithm were outside the scope of our research and were not implemented in the software, we could not test the system with actual video streams. Test vectors were instead made by capturing, in the original software, the input and output of the part that corresponds to our total system. In this way we obtained test vectors from a real video stream, restricted to the parts that needed to be tested. Again the outputs were compared to validate correct operation.

5.2 Different parts in the system

5.2.1 Hardware Accelerator (cabac decoder)

Figure 4.1: Hardware accelerator

5.2.2 Software

The CABAC decoder software that was run on the platform was a loop of different CABAC decoding instructions. It was first run on the processor alone and then run with the hardware accelerator. Both runs were timed, and the ratio of the two timings gives the speedup.
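The speedup computation can be sketched as follows. This is a host-side illustration only: on the board the runs would be timed with the processor's own timer facilities, and the names `time_run` and `speedup` are hypothetical.

```python
import time


def time_run(decode_fn, repetitions):
    """Return the wall-clock time in seconds of calling decode_fn repeatedly."""
    start = time.perf_counter()
    for _ in range(repetitions):
        decode_fn()
    return time.perf_counter() - start


def speedup(software_seconds, accelerated_seconds):
    """Speedup of the accelerated run relative to the software-only run."""
    return software_seconds / accelerated_seconds
```

A value above 1 means that the run with the hardware accelerator was faster than the software-only run.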

5.3 H.264/AVC Context-Adaptive Binary Arithmetic Coding (CABAC)

The H.264 Advanced Video Coding standard specifies two types of entropy coding: Context-based Adaptive Binary Arithmetic Coding (CABAC) and Variable-Length Coding (VLC). This section provides a short introduction to CABAC. Familiarity with the concept of arithmetic coding is assumed.

5.3.1 Context-based adaptive binary arithmetic coding (CABAC)

In an H.264 codec, when entropy_coding_mode is set to 1, an arithmetic coding system is used to encode and decode H.264 syntax elements. The arithmetic coding scheme selected for H.264, Context-based Adaptive Binary Arithmetic Coding or CABAC, achieves good compression performance through (a) selecting probability models for each syntax element according to the element’s context, (b) adapting probability estimates based on local statistics and (c) using arithmetic coding.

Coding a data symbol involves the following stages.

1. Binarization: CABAC uses Binary Arithmetic Coding, which means that only binary decisions (1 or 0) are encoded. A non-binary-valued symbol (e.g. a transform coefficient or motion vector) is “binarized” or converted into a binary code prior to arithmetic coding. This process is similar to the process of converting a data symbol into a variable length code, but the binary code is further encoded (by the arithmetic coder) prior to transmission.

Stages 2, 3 and 4 are repeated for each bit (or “bin”) of the binarized symbol.

2. Context model selection: A “context model” is a probability model for one or more bins of the binarized symbol. This model may be chosen from a selection of available models depending on the statistics of recently coded data symbols. The context model stores the probability of each bin being “1” or “0”.

3. Arithmetic encoding: An arithmetic coder encodes each bin according to the selected probability model. Note that there are just two sub-ranges for each bin (corresponding to “0” and “1”).

4. Probability update: The selected context model is updated based on the actual coded value (e.g. if the bin value was “1”, the frequency count of “1”s is increased).
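The probability update in the last stage can be illustrated with a context model that keeps explicit frequency counts, as in the example given above. This is a simplified sketch: H.264 itself realises the update with a table-driven state machine rather than counts, and the class name `CountingContextModel` is hypothetical.

```python
class CountingContextModel:
    """Adaptive probability model for one bin, based on frequency counts."""

    def __init__(self):
        # Start both counts at 1 so that neither probability is ever zero.
        self.counts = [1, 1]

    def probability(self, bin_value):
        """Estimated probability that the next bin equals bin_value."""
        return self.counts[bin_value] / sum(self.counts)

    def update(self, bin_value):
        """Adapt the model after a bin has been coded."""
        self.counts[bin_value] += 1
```

After two bins with value “1” have been coded, the estimated probability of “1” rises from 0.5 to 0.75, so a further “1” costs fewer bits in the arithmetic coder.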

6. RESULTS

Normal decoding

CONCLUSION

In this project, H.264 decoder functional blocks such as Context-based Adaptive Binary Arithmetic Coding (CABAC), Inverse Quantization and Inverse Discrete Cosine Transform are designed using Verilog to increase the speed of the decoding operation. Since CABAC decoding is a highly time-consuming process, a CPU or DSP is not an appropriate choice for real-time CABAC decoding applications. This project shows that a hardware design of the CABAC decoder, Inverse Quantization and Inverse Discrete Cosine Transform is feasible for a commercially viable H.264/AVC based video application, especially as image sizes and quality settings increase in the future.

FUTURE DEVELOPMENTS

In this project, the CABAC decoder, Inverse Quantization and Inverse Discrete Cosine Transform are designed using Verilog to increase the speed of the decoding operation. CABAC is a key technology adopted in the H.264/AVC standard: it offers a 16% bit-rate reduction compared to the baseline entropy coder while increasing access frequency by 25% to 30%. CABAC decoding is therefore a highly time-consuming process. Multiple decoding engines and shared memory between the modules can be implemented in the future to increase the decoding speed, especially to suit high bit-rate applications such as HDTV, High Definition DVD, broadcast and streaming, and digital television. Much room is left for real-time applications with higher video quality and larger image resolutions in the future.

REFERENCES

1 Jian-Wen Chen, Cheng-Ru Chang, and Youn-Long Lin, A hardware accelerator for context-based adaptive binary arithmetic decoding in H.264/AVC, ISCAS (5), IEEE, 2005, pp. 4525–4528.

2 Hendrik Eeckhaut, Mark Christiaens, Dirk Stroobandt, and Vincent Nollet, Optimizing the critical loop in the H.264/AVC CABAC decoder, Proceedings of International Conference on Field Programmable Technology (Bangkok), IEEE, December 2006, pp. 113–118.

3 M. Jeanne, C. Guillemot, T. Guionnet, and F. Pauchet, Error-resilient decoding of context-based adaptive binary arithmetic codes, Signal Image and Video Processing 1 (2007), no. 1, 77–87.

4 Chung-Hyo Kim and In-Cheol Park, High speed decoding of context-based adaptive binary arithmetic codes using most probable symbol prediction, ISCAS, IEEE, 2006.

5 Lingfeng Li, Yang Song, Shen Li, Takeshi Ikenaga, and Satoshi Goto, A hardware architecture of CABAC encoding and decoding with dynamic pipeline for H.264/AVC, 2008.

6 D. Marpe, H. Schwarz, and T. Wiegand, Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard, IEEE Transactions on Circuits and Systems for Video Technology 13 (2003), no. 7, 620–636.

7 M. E. Castro, R. R. Osorio, and J. D. Bruguera, Optimizing CABAC for VLIW architectures, Barcelona (Spain), 2006.

8 Harn Hua Ng, Xilinx: Accelerated system performance with the APU controller and XtremeDSP slices, v1.1.1 ed., 2009.

9 Jari Nikara, Stamatis Vassiliadis, Jarmo Takala, and Petri Liuha, Multiple-symbol parallel decoding for variable length codes, IEEE Trans. VLSI Syst 12 (2004), no. 7, 676–685.

10 R. R. Osorio and J. D. Bruguera, High-throughput architecture for H.264/AVC CABAC compression system, IEEE Trans. Circuits and Systems for Video Technology 16 (2006), no. 11, 1376–138