Article discusses the popularly adopted current standards for video codecs( compression / decompression) namely MPEG2, H264, H265 and AV1
MPEG 2
MPEG-2 (a.k.a. H.222/H.262 as defined by the ITU)
generic coding of moving pictures and associated audio information
combination of lossy video compression and lossy audio data compression methods, which permit storage and transmission of movies using currently available storage media and transmission bandwidth.
better than MPEG 1
evolved out of the shortcomings of MPEG-1 such as audio compression system limited to two channels (stereo) , No standardized support for interlaced video with poor compression , Only one standardized “profile” (Constrained Parameters Bitstream), which was unsuited for higher resolution video.
Application
- over-the-air digital television broadcasting and in the DVD-Video standard.
- TV stations, TV receivers, DVD players, and other equipment
- MOD and TOD – recording formats for use in consumer digital file-based camcorders.
- XDCAM – professional file-based video recording format.
- DVB – Application-specific restrictions on MPEG-2 video in the DVB standard:
H264
Advanced Video Coding (AVC), or H.264 or aka MPEG-4 AVC or ITU-T H.264 / MPEG-4 Part 10 ‘Advanced Video Coding’ (AVC)
introduced in 2004
Better than MPEG2
40-50% bit rate reduction compared to MPEG-2
Support Up to 4K (4,096×2,304) and 59.94 fps
21 profiles ; 17 levels
Compression Model
Video compression relies on predicting motion between frames. It works by comparing different parts of a video frame to find the ones that are redundant within the subsequent frames ie not changed such as background sections in video. These areas are replaced with a short information, referencing the original pixels(intraframe motion prediction) using mathematical function and direction of motion
Hybrid spatial-temporal prediction model
Flexible partition of Macro Block(MB), sub MB for motion estimation
Intra Prediction (extrapolate already decoded neighbouring pixels for prediction)
Introduced multi-view extension
9 directional modes for intra prediction
Macro Blocks structure with maximum size of 16×16
Entropy coding is CABAC(Context-adaptive binary arithmetic coding) and CAVLC(Context-adaptive variable-length coding )
Applications
- most deployed video compression standard
- Delivers high definition video images over direct-broadcast satellite-based television services,
- Digital storage media and Blu-Ray disc formats,
- Terrestrial, Cable, Satellite and Internet Protocol television (IPTV)
- Security and surveillance systems and DVB
- Mobile video, media players, video chat
H265
High Efficiency Video Coding (HEVC), or H.265 or MPEG-H HEVC
video compression standard designed to substantially improve coding efficiency
stream high-quality videos in congested network environments or bandwidth constrained mobile networks
Jan 2013
product of collaboration between the ITU Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
HVEC wikipedia
better than H264
overcome shortage of bandwidth, spectrum, storage
bandwidth savings of approx. 45% over H.264 encoded content
resolutions up to 8192×4320, including 8K UHD
Supports up to 300 fps
3 approved profiles, draft for additional 5 ; 13 levels
Whereas macroblocks can span 4×4 to 16×16 block sizes, CTUs can process as many as 64×64 blocks, giving it the ability to compress information more efficiently.
multiview encoding – stereoscopic video coding standard for video compression that allows for the efficient encoding of video sequences captured simultaneously from multiple camera angles in a single video stream. It also packs a large amount of inter-view statistical dependencies.
Compression Model
Enhanced Hybrid spatial-temporal prediction model
CTU ( coding tree units) supporting larger block structure (64×64) with more variable sub partition structures
Motion Estimation – Intra prediction with more nodes, asymmetric partitions in Inter Prediction)
Individual rectangular regions that divide the image are independent
Paralleling processing computing – decoding process can be split up across multiple parallel process threads, taking advantage multi-core processors.
Wavefront Parallel Processing (WPP)- sort of decision tree that grants a more productive and effectual compression.
33 directional nodes – DC intra prediction , planar prediction. , Adaptive Motion Vector Prediction
Entropy coding is only CABAC
Applications
- cater to growing HD content for multi platform delivery
- differentiated and premium 4K content
reduced bitrate enables broadcasters and OTT vendors to bundle more channels / content on existing delivery mediums
also provide greater video quality experience at same bitrate
Using ffmpeg for H265 encoding
I took a h264 file (640×480) , duration 30 seconds of size 39,08,744 bytes (3.9 MB on disk) and converted using ffnpeg
After conversion it was a HEVC (Parameter Sets in Bitstream) , MPEG-4 movie – 621 KB only !!! without any loss of clarity.
> ffmpeg -i pivideo3.mp4 -c:v libx265 -crf 28 -c:a aac -b:a 128k output.mp4 ffmpeg version 4.1.4 Copyright (c) 2000-2019 the FFmpeg developers built with Apple LLVM version 10.0.1 (clang-1001.0.46.4) configuration: --prefix=/usr/local/Cellar/ffmpeg/4.1.4_2 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags='-I/Library/Java/JavaVirtualMachines/adoptopenjdk-12.0.1.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/adoptopenjdk-12.0.1.jdk/Contents/Home/include/darwin' --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librubberband --enable-libsnappy --enable-libtesseract --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-videotoolbox --disable-libjack --disable-indev=jack --enable-libaom --enable-libsoxr libavutil 56. 22.100 / 56. 22.100 libavcodec 58. 35.100 / 58. 35.100 libavformat 58. 20.100 / 58. 20.100 libavdevice 58. 5.100 / 58. 5.100 libavfilter 7. 40.101 / 7. 40.101 libavresample 4. 0. 0 / 4. 0. 0 libswscale 5. 3.100 / 5. 3.100 libswresample 3. 3.100 / 3. 3.100 libpostproc 55. 3.100 / 55. 3.100 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pivideo3.mp4': Metadata: major_brand : isom minor_version : 1 compatible_brands: isomavc1 creation_time : 2019-06-23T04:58:13.000000Z Duration: 00:00:29.84, start: 0.000000, bitrate: 1047 kb/s Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480, 1046 kb/s, 25 fps, 25 tbr, 25k tbn, 50k tbc (default) Metadata: creation_time : 2019-06-23T04:58:13.000000Z handler_name : h264@GPAC0.5.2-DEV-revVersion: 0.5.2-426-gc5ad4e4+dfsg5-3+deb9u1 Codec AVOption b (set bitrate (in bits/s)) specified for output file #0 (output.mp4) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream. Stream mapping: Stream #0:0 -> #0:0 (h264 (native) -> hevc (libx265)) Press [q] to stop, [?] for help x265 [info]: HEVC encoder version 3.1.2+1-76650bab70f9 x265 [info]: build info [Mac OS X][clang 10.0.1][64 bit] 8bit+10bit+12bit x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 x265 [info]: Main profile, Level-3 (Main tier) x265 [info]: Thread pool created using 4 threads x265 [info]: Slices : 1 x265 [info]: frame threads / pool features : 2 / wpp(8 rows) x265 [warning]: Source height < 720p; disabling lookahead-slices x265 [info]: Coding QT: max CU size, min CU size : 64 / 8 x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3 x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00 x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2 x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0 x265 [info]: References / ref-limit cu / depth : 3 / off / on x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1 x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60 x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra x265 [info]: tools: strong-intra-smoothing deblock sao Output #0, mp4, to 'output.mp4': Metadata: major_brand : isom minor_version : 1 compatible_brands: isomavc1 encoder : Lavf58.20.100 Stream #0:0(und): Video: hevc (libx265) (hev1 / 0x31766568), yuv420p, 640x480, q=2-31, 25 fps, 12800 tbn, 25 tbc (default) Metadata: creation_time : 2019-06-23T04:58:13.000000Z handler_name : h264@GPAC0.5.2-DEV-revVersion: 0.5.2-426-gc5ad4e4+dfsg5-3+deb9u1 encoder : Lavc58.35.100 libx265 frame= 746 fps= 64 q=-0.0 Lsize= 606kB time=00:00:29.72 bitrate= 167.2kbits/s speed=2.56x video:594kB audio:0kB subtitle:0kB other streams:0kB global headers:2kB muxing overhead: 2.018159% x265 [info]: frame I: 3, Avg QP:27.18 kb/s: 1884.53 x265 [info]: frame P: 179, Avg QP:27.32 kb/s: 523.32 x265 [info]: frame B: 564, Avg QP:35.17 kb/s: 38.69 x265 [info]: Weighted P-Frames: Y:5.6% UV:5.0% x265 [info]: consecutive B-frames: 1.6% 3.8% 9.3% 53.3% 31.9% encoded 746 frames in 11.60s (64.31 fps), 162.40 kb/s, Avg QP:33.25
if you get error like
Unknown encoder 'libx265'
then reinstall ffmpeg with h265 support
AV1
Realtime High quality video encoder
product of product of the Alliance for Open Media (AOM)
Contained by Matroska , WebM , ISOBMFF , RTP (WebRTC)
Wikipedia AV1
better than H265
AV1 is royalty free and overcomes the patent complexities around H265/HVEC
Applications
- Video transmission over internet , voip , multi conference
- Virtual / Augmented reality
- self driving cars streaming
- intended for use in HTML5 web video and WebRTC together with the Opus audio format
- Ref :
- MPEG2 – https://en.wikipedia.org/wiki/MPEG-2
- ITU on HVEC H265 http://www.itu.int/net/pressoffice/press_releases/2013/01.aspx#.XX85C5MzY6V
- HVEC – wikipedia https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding
- Multiview-encoing / MVC 3D https://en.wikipedia.org/wiki/Multiview_Video_Coding
- AV1 – https://blogs.cisco.com/collaboration/cisco-leap-frogs-h-264-video-collaboration-with-real-time-av1-codec , https://en.wikipedia.org/wiki/AV1 , https://aomediacodec.github.io/av1-spec/av1-spec.pdf
- FFMpeg wiki – https://trac.ffmpeg.org/wiki/Encode/H.265