This article discusses the widely adopted current standards for video codecs (compression/decompression), namely MPEG-2, H.264, H.265 and AV1.
MPEG-2 (a.k.a. H.222/H.262 as defined by the ITU)
Generic coding of moving pictures and associated audio information
Combination of lossy video compression and lossy audio data compression methods, which permit storage and transmission of movies using currently available storage media and transmission bandwidth.
better than MPEG 1
Evolved out of the shortcomings of MPEG-1, such as an audio compression system limited to two channels (stereo), no standardized support for interlaced video (which therefore compressed poorly), and only one standardized “profile” (Constrained Parameters Bitstream), which was unsuited to higher-resolution video.
- over-the-air digital television broadcasting and in the DVD-Video standard.
- TV stations, TV receivers, DVD players, and other equipment
- MOD and TOD – recording formats for use in consumer digital file-based camcorders.
- XDCAM – professional file-based video recording format.
- DVB – Application-specific restrictions on MPEG-2 video in the DVB standard:
Advanced Video Coding (AVC), also known as H.264 or MPEG-4 AVC (formally ITU-T H.264 / MPEG-4 Part 10), first standardized in 2003.
Better than MPEG2
40-50% bit rate reduction compared to MPEG-2
Supports up to 4K (4,096×2,304) at 59.94 fps
21 profiles ; 17 levels
Video compression relies on predicting motion between frames. The encoder compares parts of a video frame with neighbouring frames to find regions that are redundant, i.e. unchanged or merely shifted, such as background sections. These regions are replaced with a short piece of information that references the original pixels (inter-frame motion prediction), using a mathematical function and the direction of motion.
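As a rough illustration of this idea, the sketch below does an exhaustive block-matching search: it takes one block of the current frame and looks for the best match in the previous frame, using the sum of absolute differences (SAD) as the cost. The frame layout, function names, block size and search range are assumptions made for this example and are not taken from any standard; real encoders use much faster search strategies and sub-pixel precision.

```python
# Minimal block-matching sketch (pure Python, no external libraries).
# Frames are lists of rows of grayscale values; all names are illustrative.

def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block in ref and a block in cur."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def find_motion_vector(ref, cur, cx, cy, size=4, search=2):
    """Search ref within +/- `search` pixels of (cx, cy) for the best match."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(ref, cur, cx, cy, cx, cy, size)
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would fall outside the frame
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if cost < best_cost:
                best, best_cost = (mx, my), cost
    return best, best_cost


def make_frame(width, height, box_x, box_y, box=4):
    """Black frame with a small textured box at (box_x, box_y)."""
    frame = [[0] * width for _ in range(height)]
    for dy in range(box):
        for dx in range(box):
            frame[box_y + dy][box_x + dx] = 100 + 10 * dy + dx
    return frame


previous_frame = make_frame(12, 12, box_x=3, box_y=4)
current_frame = make_frame(12, 12, box_x=5, box_y=5)  # box moved by (+2, +1)

vector, cost = find_motion_vector(previous_frame, current_frame, cx=5, cy=5)
print(vector, cost)
```

Here the box moved by (+2, +1) pixels, so the matching block is found at an offset of (-2, -1) in the previous frame with a cost of 0. An encoder would then only need to store that vector and a residual, which in this toy case is empty.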
Hybrid spatial-temporal prediction model
Flexible partitioning of macroblocks (MB) and sub-macroblocks for motion estimation
Intra prediction (extrapolates already decoded neighbouring pixels for prediction)
Introduced a multi-view extension
9 intra prediction modes for 4×4 blocks (8 directional plus DC)
Macroblock structure with a maximum size of 16×16
Entropy coding is CABAC (context-adaptive binary arithmetic coding) or CAVLC (context-adaptive variable-length coding)
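To make the intra prediction entry above more concrete, here is a minimal sketch of three of the simpler modes (vertical, horizontal and DC) for a 4×4 block. The block is predicted from the already decoded pixels above it (`top`) and to its left (`left`), and the mode with the smallest residual is chosen. The function names and the plain SAD cost are assumptions for illustration; a real encoder also weighs the bits needed to signal the mode.

```python
# Sketch of simple intra prediction modes for one block.
# `top` is the row of pixels above the block, `left` the column to its left.

def predict_vertical(top, left):
    """Copy the row above straight down."""
    return [list(top) for _ in left]


def predict_horizontal(top, left):
    """Copy the left column straight across."""
    return [[value] * len(top) for value in left]


def predict_dc(top, left):
    """Fill the block with the rounded mean of all neighbouring pixels."""
    count = len(top) + len(left)
    dc = (sum(top) + sum(left) + count // 2) // count
    return [[dc] * len(top) for _ in left]


MODES = {
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
    "dc": predict_dc,
}


def residual_cost(block, prediction):
    """Sum of absolute differences between the block and its prediction."""
    return sum(
        abs(actual - predicted)
        for block_row, pred_row in zip(block, prediction)
        for actual, predicted in zip(block_row, pred_row)
    )


def choose_mode(block, top, left):
    """Return the cheapest mode name and the cost of every mode."""
    costs = {
        name: residual_cost(block, predict(top, left))
        for name, predict in MODES.items()
    }
    return min(costs, key=costs.get), costs


# A block of vertical stripes that continues the row above it.
top = [10, 50, 90, 130]
left = [12, 11, 9, 10]
block = [[10, 50, 90, 130] for _ in range(4)]

best_mode, mode_costs = choose_mode(block, top, left)
print(best_mode, mode_costs)
```

For this striped block the vertical mode predicts every pixel exactly, so it is selected with a residual of zero, while the horizontal and DC modes leave a large residual to encode.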
- most deployed video compression standard
- Delivers high definition video images over direct-broadcast satellite-based television services,
- Digital storage media and Blu-Ray disc formats,
- Terrestrial, Cable, Satellite and Internet Protocol television (IPTV)
- Security and surveillance systems and DVB
- Mobile video, media players, video chat
High Efficiency Video Coding (HEVC), or H.265 or MPEG-H HEVC
Video compression standard designed to substantially improve coding efficiency
Stream high-quality videos in congested network environments or bandwidth constrained mobile networks.
Introduced in Jan 2013 as product of collaboration between the ITU Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
better than H264
overcome shortage of bandwidth, spectrum, storage
bandwidth savings of approx. 45% over H.264 encoded content
resolutions up to 8192×4320, including 8K UHD
Supports up to 300 fps
3 approved profiles, draft for additional 5 ; 13 levels
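To put the bitrate figures quoted in this article side by side (40-50% saving for H.264 over MPEG-2, roughly 45% for H.265 over H.264), the small calculation below chains them together. The 8000 kbit/s MPEG-2 starting point is a hypothetical value chosen only for the arithmetic, and actual savings depend heavily on content and encoder settings.

```python
# Worked arithmetic for the savings quoted in the text.
# The starting bitrate is a made-up example value, not a measurement.

def apply_saving(bitrate_kbps, saving_percent):
    """Bitrate left after removing `saving_percent` percent."""
    return bitrate_kbps * (100 - saving_percent) / 100


mpeg2_kbps = 8000  # hypothetical MPEG-2 stream

# H.264: 40-50% lower than MPEG-2 (best case first).
h264_kbps = (apply_saving(mpeg2_kbps, 50), apply_saving(mpeg2_kbps, 40))

# H.265: roughly 45% lower than H.264.
h265_kbps = tuple(apply_saving(rate, 45) for rate in h264_kbps)

print("H.264:", h264_kbps)
print("H.265:", h265_kbps)
```

Under these assumptions the same stream would need about 4000 to 4800 kbit/s in H.264 and about 2200 to 2640 kbit/s in H.265, i.e. roughly a quarter to a third of the MPEG-2 bitrate.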
Whereas H.264 macroblocks are at most 16×16 pixels (with partitions down to 4×4), HEVC coding tree units (CTUs) can cover blocks as large as 64×64 pixels, which lets the encoder compress large uniform areas more efficiently.
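The benefit of large CTUs comes from splitting them only where the picture needs it. The sketch below partitions one 64×64 CTU as a quadtree: a block is kept whole if it is nearly flat, otherwise it is split into four quadrants, down to 8×8. The "max minus min" flatness test and its threshold are stand-ins invented for this example; real encoders make the split decision by comparing rate-distortion costs.

```python
# Quadtree partition sketch for a single CTU.
# The flatness test is an illustrative assumption, not the HEVC decision rule.

def block_range(frame, x, y, size):
    """Difference between the brightest and darkest pixel in a block."""
    values = [frame[y + dy][x + dx] for dy in range(size) for dx in range(size)]
    return max(values) - min(values)


def split_ctu(frame, x, y, size, min_size=8, threshold=16):
    """Return the leaf blocks of the quadtree as (x, y, size) tuples."""
    if size <= min_size or block_range(frame, x, y, size) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for offset_y in (0, half):
        for offset_x in (0, half):
            leaves += split_ctu(
                frame, x + offset_x, y + offset_y, half, min_size, threshold
            )
    return leaves


CTU = 64
frame = [[50] * CTU for _ in range(CTU)]  # flat background
for y in range(8):
    for x in range(8):
        frame[y][x] = 50 + 20 * ((x + y) % 2)  # small detailed patch

leaves = split_ctu(frame, 0, 0, CTU)
sizes = {}
for _, _, size in leaves:
    sizes[size] = sizes.get(size, 0) + 1

print(len(leaves), sizes)
```

The mostly flat CTU ends up as 10 blocks (three 32×32, three 16×16 and four 8×8), whereas covering the same area with fixed 16×16 macroblocks would take 16 blocks.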
Multiview encoding – a stereoscopic video coding extension that allows efficient encoding of video sequences captured simultaneously from multiple camera angles in a single video stream. It exploits the large amount of statistical dependency between the views.
Enhanced Hybrid spatial-temporal prediction model
CTUs (coding tree units) supporting a larger block structure (up to 64×64) with more variable sub-partition structures
Motion estimation – intra prediction with more modes, asymmetric partitions in inter prediction
Tiles – individual rectangular regions that divide the image and can be coded independently
Parallel processing – the decoding process can be split across multiple parallel threads, taking advantage of multi-core processors.
Wavefront Parallel Processing (WPP) – rows of CTUs are processed in parallel, each row starting once the first two CTUs of the row above are done, so prediction and entropy-coding context can still be carried over from the row above.
33 directional modes for intra prediction, plus DC and planar prediction; Advanced Motion Vector Prediction (AMVP)
Entropy coding is only CABAC
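For the Wavefront Parallel Processing entry above, the following sketch estimates how much a wavefront schedule could shorten the processing of one frame. It assumes each CTU needs its left neighbour and the CTU above and to the right to be finished first, that every CTU takes one time step, and that there is one thread per CTU row. These are simplifying assumptions for illustration; real CTUs take varying amounts of time.

```python
import math

# Wavefront scheduling sketch: earliest start step of every CTU in a frame.
# Unit CTU cost and one thread per row are assumptions of this example.

def wavefront_schedule(rows, cols):
    """Return a grid with the earliest time step at which each CTU can start."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                # Left neighbour in the same row must be finished.
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:
                # CTU above and to the right (or directly above at the edge).
                above = min(c + 1, cols - 1)
                ready = max(ready, start[r - 1][above] + 1)
            start[r][c] = ready
    return start


width, height, ctu_size = 1920, 1080, 64
cols = math.ceil(width / ctu_size)
rows = math.ceil(height / ctu_size)

schedule = wavefront_schedule(rows, cols)
parallel_steps = schedule[-1][-1] + 1
sequential_steps = rows * cols

print(rows, cols, sequential_steps, parallel_steps)
```

For a 1920×1080 frame with 64×64 CTUs this gives 17 rows of 30 CTUs, so 510 steps when processed one CTU at a time versus 62 steps with the idealized wavefront, each row trailing the one above by two CTUs.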
- cater to growing HD content for multi platform delivery
- differentiated and premium 4K content
- reduced bitrate enables broadcasters and OTT vendors to bundle more channels / content on existing delivery media
- also provides a better video quality experience at the same bitrate
Using ffmpeg for H265 encoding
I took an H.264 file (640×480), 30 seconds long and 3,908,744 bytes in size (3.9 MB on disk), and converted it using ffmpeg.
After conversion it was an HEVC (Parameter Sets in Bitstream) MPEG-4 movie of only 621 KB, without any visible loss of clarity!
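The size figures above can be turned into a compression ratio and average bitrates with a few lines of arithmetic. This sketch assumes that "621 KB" means 621,000 bytes (decimal kilobytes); the helper names are made up for this example. Both sizes include the audio track, so these are whole-file averages rather than pure video bitrates.

```python
# Arithmetic on the file sizes quoted in the text.

def size_reduction(original_bytes, converted_bytes):
    """Return (compression ratio, percentage of bytes saved)."""
    ratio = original_bytes / converted_bytes
    saving_percent = 100 * (1 - converted_bytes / original_bytes)
    return ratio, saving_percent


def average_kbps(size_bytes, seconds):
    """Average bitrate of a file in kilobits per second."""
    return size_bytes * 8 / seconds / 1000


ORIGINAL_BYTES = 3_908_744    # H.264 input, from the text
CONVERTED_BYTES = 621 * 1000  # assumption: 621 KB read as 621,000 bytes
DURATION_S = 30

ratio, saving = size_reduction(ORIGINAL_BYTES, CONVERTED_BYTES)
original_kbps = average_kbps(ORIGINAL_BYTES, DURATION_S)
converted_kbps = average_kbps(CONVERTED_BYTES, DURATION_S)

print(f"ratio {ratio:.2f}:1, saving {saving:.1f}%")
print(f"{original_kbps:.0f} kbit/s -> {converted_kbps:.0f} kbit/s")
```

Under that assumption the converted file is about 6.3 times smaller (roughly 84% saved), going from about 1042 kbit/s to about 166 kbit/s on average.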
> ffmpeg -i pivideo3.mp4 -c:v libx265 -crf 28 -c:a aac -b:a 128k output.mp4
ffmpeg version 4.1.4 Copyright (c) 2000-2019 the FFmpeg developers
  built with Apple LLVM version 10.0.1 (clang-1001.0.46.4)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.1.4_2 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags='-I/Library/Java/JavaVirtualMachines/adoptopenjdk-12.0.1.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/adoptopenjdk-12.0.1.jdk/Contents/Home/include/darwin' --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librubberband --enable-libsnappy --enable-libtesseract --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-videotoolbox --disable-libjack --disable-indev=jack --enable-libaom --enable-libsoxr
  libavutil 56. 22.100 / 56. 22.100
  libavcodec 58. 35.100 / 58. 35.100
  libavformat 58. 20.100 / 58. 20.100
  libavdevice 58. 5.100 / 58. 5.100
  libavfilter 7. 40.101 / 7. 40.101
  libavresample 4. 0. 0 / 4. 0. 0
  libswscale 5. 3.100 / 5. 3.100
  libswresample 3. 3.100 / 3. 3.100
  libpostproc 55. 3.100 / 55. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pivideo3.mp4':
...
If you get an error like
Unknown encoder 'libx265'
then reinstall ffmpeg with H.265 (libx265) support.
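One way to check for this up front is to look for `libx265` in the list printed by `ffmpeg -encoders`. The sketch below wraps that in Python; `has_encoder` and `ffmpeg_supports` are hypothetical helper names, and the parsing assumes each encoder row has the form "flags name description", which is how current ffmpeg builds print the list.

```python
import shutil
import subprocess

# Check whether the local ffmpeg build lists a given encoder.
# Helper names and the row format assumption are illustrative.

def has_encoder(encoders_listing, name):
    """Return True if `name` appears as an encoder name in the listing text."""
    for line in encoders_listing.splitlines():
        fields = line.split()
        # Encoder rows look like: "<flags> <name> <description...>"
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_supports(name, binary="ffmpeg"):
    """Run `ffmpeg -encoders` and look for `name`; False if ffmpeg is missing."""
    if shutil.which(binary) is None:
        return False
    result = subprocess.run(
        [binary, "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return has_encoder(result.stdout, name)


if __name__ == "__main__":
    print("libx265 available:", ffmpeg_supports("libx265"))
```

If this reports that `libx265` is not available, the ffmpeg build was most likely compiled without `--enable-libx265`, which is the flag visible in the configuration line of the log above.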
AV1 (AOMedia Video 1)
Real-time, high-quality video encoding
Product of the Alliance for Open Media (AOM)
Carried in Matroska, WebM, ISOBMFF and RTP (WebRTC)
better than H265
AV1 is royalty-free and overcomes the patent complexities around H.265/HEVC
- Video transmission over the internet, VoIP, multi-party conferencing
- Virtual / Augmented reality
- streaming for self-driving cars
- intended for use in HTML5 web video and WebRTC together with the Opus audio format
- MPEG2 – https://en.wikipedia.org/wiki/MPEG-2
- ITU on HEVC H.265 http://www.itu.int/net/pressoffice/press_releases/2013/01.aspx#.XX85C5MzY6V
- HEVC – wikipedia https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding
- Multiview encoding / MVC 3D https://en.wikipedia.org/wiki/Multiview_Video_Coding
- AV1 – https://blogs.cisco.com/collaboration/cisco-leap-frogs-h-264-video-collaboration-with-real-time-av1-codec , https://en.wikipedia.org/wiki/AV1 , https://aomediacodec.github.io/av1-spec/av1-spec.pdf
- FFMpeg wiki – https://trac.ffmpeg.org/wiki/Encode/H.265