WebRTC Stack Architecture and Layers

WebRTC stands for Web Real-Time Communications. It introduces a real-time media framework in the browser core, along with JavaScript APIs for controlling that framework and HTML5 tags for displaying the media.

If you are new to WebRTC, read "What is WebRTC?" first. From a technical point of view, WebRTC hides all the complexity of real-time media behind a very simple JavaScript API.

Codec Confusion:

Video Codecs

Currently VP8 is the codec of choice since it is royalty-free. On mobile today, the codec of choice is H.264. H.264 is not royalty-free, but it is native to most mobile handsets because of its high performance.

Audio Codecs

Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music. It is an open format standardized through RFC 6716, and a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.

G.711 is an ITU (International Telecommunication Union) standard for audio compression. It is primarily used in telephony. The standard was released in 1972. It is the required standard in many voice-based systems and technologies, for example in the H.320 and H.323 specifications.

Speex is a patent-free audio compression format designed for speech, and also a free software speech codec used in VoIP applications and podcasts. Some consider Speex obsolete, with Opus as its official successor, but since significant content out there uses Speex, it will not disappear anytime soon.

G.722 is an ITU standard 7 kHz wideband audio codec operating at 48, 56 and 64 kbit/s. It was approved by ITU-T in 1988. G.722 provides improved speech quality thanks to a wider speech bandwidth of 50–7000 Hz, compared with 300–3400 Hz for G.711.

AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz. Its data rate is between 6.6 and 23.85 kbit/s, and the codec is generally available on mobile phones.
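
Which of these codecs two endpoints actually use is negotiated through SDP, where each offered codec appears as an "a=rtpmap" attribute. The sketch below is only an illustration of reading those attributes: the SDP fragment is made up, the dynamic payload type numbers (96, 111) are assumptions, and parseRtpMaps is a hypothetical helper name, not part of any WebRTC API.

```typescript
// One codec entry advertised in an SDP "a=rtpmap" attribute.
interface RtpMapEntry {
  payloadType: number;
  codec: string;
  clockRateHz: number;
  channels: number;
}

// Extract every rtpmap attribute from an SDP blob.
// Attribute format: a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
function parseRtpMaps(sdp: string): RtpMapEntry[] {
  const entries: RtpMapEntry[] = [];
  for (const rawLine of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:(\d+) ([^/\s]+)\/(\d+)(?:\/(\d+))?$/.exec(rawLine.trim());
    if (match === null) continue;
    entries.push({
      payloadType: Number(match[1]),
      codec: String(match[2]),
      clockRateHz: Number(match[3]),
      // The channel count is optional and defaults to 1 when omitted.
      channels: match[4] !== undefined ? Number(match[4]) : 1,
    });
  }
  return entries;
}

// Hypothetical SDP fragment: Opus and G.711 (PCMU) for audio, VP8 for video.
const sampleSdp = [
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

const offeredCodecs = parseRtpMaps(sampleSdp).map((entry) => entry.codec);
```

Here offeredCodecs lists the codec names in the order they appear in the fragment, which is a quick way to see what a browser is offering when debugging a call.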

Architecture:

WebRTC offers web application developers the ability to write rich, real-time multimedia applications (think video chat) on the web, without requiring plugins, downloads or installs. Its purpose is to help build a strong RTC platform that works across multiple web browsers and multiple platforms.

(Figure: WebRTC public architecture diagram)

Web API – An API to be used by third-party developers for developing web-based video chat-like applications.

WebRTC Native C++ API – An API layer that enables browser makers to easily implement the Web API proposal.

Transport / Session – The session components are built by re-using components from libjingle, without using or requiring the XMPP/jingle protocol.

RTP Stack – A network stack for RTP, the Real-time Transport Protocol.
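
To make the RTP layer a little more concrete, here is a minimal sketch of reading the 12-byte fixed header that every RTP packet starts with (RFC 3550, section 5.1). The function name parseRtpHeader and the sample packet bytes are assumptions for illustration; a real stack also handles CSRC lists, header extensions and padding.

```typescript
// Fields of the 12-byte fixed RTP header.
interface RtpHeader {
  version: number;
  padding: boolean;
  extension: boolean;
  csrcCount: number;
  marker: boolean;
  payloadType: number;
  sequenceNumber: number;
  timestamp: number;
  ssrc: number;
}

function parseRtpHeader(packet: Uint8Array): RtpHeader {
  if (packet.length < 12) {
    throw new Error("RTP packet shorter than the 12-byte fixed header");
  }
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const b0 = view.getUint8(0);
  const b1 = view.getUint8(1);
  return {
    version: b0 >> 6,                  // top 2 bits, always 2 for RTP
    padding: (b0 & 0x20) !== 0,
    extension: (b0 & 0x10) !== 0,
    csrcCount: b0 & 0x0f,
    marker: (b1 & 0x80) !== 0,
    payloadType: b1 & 0x7f,            // maps to a codec via SDP
    sequenceNumber: view.getUint16(2), // DataView reads big-endian by default
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
  };
}

// Hypothetical packet header, hand-written for this example.
const samplePacket = new Uint8Array([
  0x80, 0xef, 0x12, 0x34, // V=2, marker set, payload type 111, sequence 0x1234
  0x00, 0x00, 0x03, 0xc0, // timestamp 960
  0xde, 0xad, 0xbe, 0xef, // SSRC
]);
const sampleHeader = parseRtpHeader(samplePacket);
```

The sequence number and timestamp read here are what the jitter buffers further down the stack rely on to reorder packets and schedule playout.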

STUN/ICE – A component allowing calls to use the STUN and ICE mechanisms to establish connections across various types of networks.
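
As a rough illustration of what travels on the wire during STUN, the sketch below builds the 20-byte header of a Binding Request as laid out in RFC 5389: a message type, an attribute length, the fixed magic cookie and a 12-byte transaction ID. The helper names are hypothetical, and the transaction ID is hard-coded only to keep the example reproducible; real clients must use random IDs.

```typescript
const STUN_BINDING_REQUEST = 0x0001;
const STUN_MAGIC_COOKIE = 0x2112a442;

// Build a STUN Binding Request with no attributes (header only).
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, STUN_BINDING_REQUEST); // message type
  view.setUint16(2, 0);                    // attribute length: none follow
  view.setUint32(4, STUN_MAGIC_COOKIE);    // fixed value identifying STUN
  message.set(transactionId, 8);           // bytes 8..19
  return message;
}

// Cheap check used to tell STUN apart from other traffic on the same port:
// the first two bits are zero and the magic cookie is present.
function isStunMessage(data: Uint8Array): boolean {
  if (data.length < 20) return false;
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  return (view.getUint8(0) & 0xc0) === 0 && view.getUint32(4) === STUN_MAGIC_COOKIE;
}

// Assumption for the example only: a fixed, non-random transaction ID.
const exampleTransactionId = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);
const bindingRequest = buildBindingRequest(exampleTransactionId);
```

The isStunMessage check matters because STUN and RTP packets can share one socket: an RTP packet starts with version bits 10, so it never passes the "first two bits are zero" test.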

Session Management – An abstracted session layer that allows for call setup and management. This leaves the protocol implementation decision to the application developer.

VoiceEngine – VoiceEngine is a framework for the audio media chain, from sound card to the network.

iSAC / iLBC / Opus

iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. iSAC uses 16 kHz or 32 kHz sampling frequency with an adaptive and variable bit rate of 12 to 52 kbps.

iLBC: A narrowband speech codec for VoIP and streaming audio. Uses 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20 ms frames and 13.33 kbps for 30 ms frames. Defined by IETF RFCs 3951 and 3952.

Opus: Supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s, frame sizes from 2.5 ms to 60 ms, and various sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced). Defined by IETF RFC 6716.
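
The bitrate and frame-duration figures above translate directly into packet sizes. As a small worked example (the function names are made up for this sketch), the payload of one frame is bitrate times duration divided by 8, and the number of audio samples it covers is sample rate times duration:

```typescript
// Encoded payload bytes in one frame: bitrate (bit/s) x duration (s) / 8.
function payloadBytesPerFrame(bitrateBps: number, frameMs: number): number {
  return Math.round((bitrateBps * frameMs) / 1000 / 8);
}

// Audio samples covered by one frame: sample rate (Hz) x duration (s).
function samplesPerFrame(sampleRateHz: number, frameMs: number): number {
  return (sampleRateHz * frameMs) / 1000;
}

// iLBC figures quoted in the text.
const ilbc20 = payloadBytesPerFrame(15200, 20); // 38 bytes per 20 ms frame
const ilbc30 = payloadBytesPerFrame(13330, 30); // 50 bytes per 30 ms frame (49.99 before rounding)

// Opus frame durations at its 48 kHz sampling rate.
const opusShortest = samplesPerFrame(48000, 2.5); // 120 samples
const opusLongest = samplesPerFrame(48000, 60);   // 2880 samples
```

These numbers exclude RTP, UDP and IP headers, so for small frames such as iLBC's the headers make up a large share of what is actually sent.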

NetEQ for Voice – A dynamic jitter buffer and error concealment algorithm used for concealing the negative effects of network jitter and packet loss. It keeps latency as low as possible while maintaining the highest voice quality.
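
NetEQ's internals are beyond the scope of this article, but the quantity any dynamic jitter buffer has to react to can be sketched. The example below is not NetEQ's algorithm: it is the running interarrival jitter estimate from RFC 3550 (section 6.4.1), using milliseconds instead of RTP timestamp units, followed by a deliberately naive playout-delay rule whose factor of 4 is purely an assumption for illustration.

```typescript
// Running interarrival jitter estimate in the style of RFC 3550.
// All times are in milliseconds in this sketch.
class JitterEstimator {
  private jitter = 0;
  private lastTransit: number | null = null;

  // sentMs: sender timestamp of the packet; receivedMs: local arrival time.
  update(sentMs: number, receivedMs: number): number {
    const transit = receivedMs - sentMs;
    if (this.lastTransit !== null) {
      // Difference in transit time between this packet and the previous one.
      const d = Math.abs(transit - this.lastTransit);
      // Smooth with a gain of 1/16, as the RFC specifies.
      this.jitter += (d - this.jitter) / 16;
    }
    this.lastTransit = transit;
    return this.jitter;
  }
}

// Hypothetical rule, NOT taken from NetEQ: buffer one frame plus a
// multiple of the current jitter estimate.
function targetDelayMs(jitterMs: number, frameMs: number): number {
  return frameMs + 4 * jitterMs;
}

const estimator = new JitterEstimator();
estimator.update(0, 50);  // first packet: nothing to compare with, jitter stays 0
estimator.update(20, 70); // same 50 ms transit: jitter stays 0
const jitterAfterLatePacket = estimator.update(40, 106); // 16 ms later than expected
```

After the late third packet the estimate moves by one sixteenth of the 16 ms deviation, so a single late packet nudges the buffer rather than making it jump.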

Acoustic Echo Canceler (AEC) – The Acoustic Echo Canceler is a software-based signal processing component that removes, in real-time, the acoustic echo resulting from the voice being played out coming into the active microphone.

Noise Reduction (NR) – The Noise Reduction component is a software-based signal processing component that removes certain types of background noise usually associated with VoIP (hiss, fan noise, etc.).

Video Engine – VideoEngine is a framework for the video media chain, from the camera to the network, and from the network to the screen.

VP8 – Video codec from the WebM Project. Well suited for RTC as it is designed for low latency.

Video Jitter Buffer – Dynamic jitter buffer for video. Helps conceal the effects of jitter and packet loss on overall video quality.

Image enhancements – For example, removing video noise from the image captured by the webcam.

W3C contribution


  • Media Stream Functions

API for connecting processing functions to media devices and network connections, including media manipulation functions.

  • Audio Stream Functions

An extension of the Media Stream Functions to process audio streams (e.g. automatic gain control, mute functions and echo cancellation).

  • Video Stream Functions

An extension of the Media Stream Functions to process video streams (e.g. bandwidth limiting, image manipulation or “video mute“).

  • Functional Component

API to query the presence of WebRTC components in an implementation, instantiate them and connect them to media streams.

  • P2P Connection Functions

API functions to support establishing signalling-protocol-agnostic peer-to-peer connections between web browsers.

  • API specification Availability

WebRTC 1.0: Real-time Communication Between Browsers – Draft 3 June 2013 available

  • Implementation Library: WebRTC Native APIs

Media Capture and Streams – Draft 16 May 2013

  • Supported by Chrome, Firefox and Opera on the desktop on all operating systems (Linux, Windows, Mac)
  • Supported by Chrome and Firefox in mobile browsers (Android)
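
Because the P2P connection functions listed above are agnostic to the signalling protocol, each application defines its own messages for exchanging session descriptions and ICE candidates. The sketch below shows one possible shape for such messages as JSON. Everything in it (the type name, the field names, the encode and decode helpers) is an assumption for illustration; WebRTC does not prescribe any of it.

```typescript
// Hypothetical application-defined signalling messages.
type SignalMessage =
  | { kind: "offer"; from: string; sdp: string }
  | { kind: "answer"; from: string; sdp: string }
  | { kind: "candidate"; from: string; candidate: string };

// Serialise a message for whatever transport the application uses
// (WebSocket, HTTP polling, SIP, XMPP and so on).
function encodeSignal(message: SignalMessage): string {
  return JSON.stringify(message);
}

// Parse and validate an incoming message, rejecting anything unexpected.
function decodeSignal(text: string): SignalMessage {
  const value: unknown = JSON.parse(text);
  if (typeof value !== "object" || value === null) {
    throw new Error("signalling message must be a JSON object");
  }
  const record = value as Record<string, unknown>;
  const kind = record["kind"];
  const from = record["from"];
  const sdp = record["sdp"];
  const candidate = record["candidate"];
  if (typeof from !== "string") {
    throw new Error("signalling message is missing its sender");
  }
  if ((kind === "offer" || kind === "answer") && typeof sdp === "string") {
    return { kind, from, sdp };
  }
  if (kind === "candidate" && typeof candidate === "string") {
    return { kind, from, candidate };
  }
  throw new Error("unrecognised signalling message");
}

// Example round trip with a placeholder SDP body.
const wireText = encodeSignal({ kind: "offer", from: "alice", sdp: "v=0" });
const received = decodeSignal(wireText);
```

In a browser, the sdp and candidate strings would come from the peer connection object on one side and be handed to the peer connection on the other; the envelope around them is entirely the application's choice.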

IETF contribution

Communication model

Security model

Firewall and NAT traversal

Media functions

Functionality such as media codecs, security algorithms, etc.

Media formats

Transport of non-media data between clients

Input to W3C for API development

Interworking with legacy VoIP equipment

WG draft   Date

  • draft-ietf-rtcweb-audio-02               2013-08-02
  • draft-ietf-rtcweb-data-channel-05        2013-07-15
  • draft-ietf-rtcweb-data-protocol-00       2013-07-15
  • draft-ietf-rtcweb-jsep-03                2013-02-27
  • draft-ietf-rtcweb-overview-07            2013-08-14
  • draft-ietf-rtcweb-rtp-usage-07           2013-07-15
  • draft-ietf-rtcweb-security-05            2013-07-15
  • draft-ietf-rtcweb-security-arch-07       2013-07-15
  • draft-ietf-rtcweb-transports-00          2013-08-19
  • draft-ietf-rtcweb-use-cases-and-reqs-11  2013-06-27
  • Plus over 20 discussion drafts

What will be the outcome of WebRTC Adoption?

In simple words, it is a phenomenal change: communication platforms move away from proprietary vendors who depended heavily on patented, royalty-bound technologies and protocols. It will revolutionize internet telephony. It will also be platform-independent (i.e. any browser, any desktop operating system, any mobile operating system).

WebRTC allows anybody to add real-time communication to their web page as simply as adding a table.

Read more about WebRTC business benefits


Update 2020 – This article was written very early in 2013, while WebRTC was still being standardised and was not yet widely adopted, since work on WebRTC had only begun in 2012.

Many more articles have been written since then to explain the details and applications of WebRTC. They are listed below:

For SIP IMS and WebRTC

Read about STUN and TURN, which form a critical part of any WebRTC-based communication system

Security of WebRTC based CaaS and CPaaS

WebRTC APIs

