Category Archives: webRTC

WebRTC-compatible Android client

This post describes the requirements for building a SIP phone application on Android that uses the same codecs as WebRTC (PCMA, PCMU, VP8). For a project demonstrating WebRTC interoperability (presence, audio/video calls, messaging) with a native Android client, I had to develop a lightweight Android SIP application customized to match the look and feel of the WebRTC web application. This also lets the services added to the WebRTC client, such as geolocation, visual voicemail, phonebook and call control options, be set from the Android application as well.

Aim:

Development of an Android WebRTC-SIP client using the sipml5 stack, implemented through web services and native Android programming.

Software Used:

⦁ Eclipse IDE
⦁ Java SE Development Kit 7.0
⦁ Android SDK

Tasks:

⦁ Authorization of a user, based on his/her credentials (Database local to the application).

⦁ Navigation Drawer on the home page, which shows a menu giving the user options such as:
  ⦁ View Home Page
  ⦁ View Contact List
  ⦁ View/Edit My Profile
  ⦁ View My Location
  ⦁ Sign Out

⦁ Phonebook sync: Importing the contact list of the Android phone into the application. Editing the user profile with values like User Name, Password and Domain.

⦁ Inclusion of a WebView in the application, which currently opens the desired webpage (http://sipml5.org/call.htm).

⦁ Geolocation: Showing a marker for the user's current location in Google Maps. Displaying the user's address in a Toast message.


⦁ Audio / Video call capability 


Figure 1: Login page. Figure 2: Call page. Figure 3: Menu bar.

Future Roadmap:

⦁ Connecting the application to a database which sits on the cloud.
⦁ Based on the entries in the database the user will be able to:
  ⦁ Log in to the application.
  ⦁ View or edit his/her details in the My Profile section.
⦁ Studying the code of sample applications for making SIP calls from Android OS, like:
  ⦁ SipDroid
  ⦁ SipDemo
  ⦁ IMSDroid
⦁ Modifying the existing application to be able to make SIP calls like one of the apps listed above.

Modules:

Development Done:
  1. Development of an authorization page connecting the application to a local database where values are inserted and retrieved.
  2. Development of a navigation drawer where additional options for the application are displayed, making it a user-friendly application.

Development Planned:
  1. Connectivity to a cloud database.
  2. App engine on cloud.
  3. Importing contacts from the phone address book.
  4. Offline storage of profile details and a few call logs.

Architecture:

[Figure: Android WebRTC environment architecture]

……………………………………………………………………………………..


Difference between WebRTC and plugin based communication

Many service providers, i.e. telecom operators, had devised their own ways to provide web-based communication even before WebRTC was born. Over time, as WebRTC has become stronger, more secure and more resilient to failure, they have come around to migrating their existing systems from closed, native APIs to open-source WebRTC APIs.

The first figure (given below) depicts a communication platform built on plugins and proprietary APIs, using HTTP REST based signalling.


Web Communication Service Architecture over HTTP/ REST API

As the migration took place, the proprietary components were replaced by open-standard entities: plugins were replaced by WebRTC APIs, and HTTP REST based signalling was replaced by SIP (Session Initiation Protocol).

Web Communication Service Architecture over WebRTC SIP


Note that the telecom operator network did not have to be transformed to integrate the WebRTC elements.

WebRTC communication diagrams

webrtc Real Time communication between SIP softphone supporting both SIP over websockets


webrtc Real Time communication between native SIP and SIP over Websockets


webrtc Real Time communication between clients supporting sip over websockets


WebRTC business benefits

Historically, RTC has been corporate and complex, requiring expensive audio and video technologies to be licensed or developed in house. Integrating RTC technology with existing content, data and services has been difficult and time consuming, particularly on the web.

Now, with WebRTC, the operator finally gets a chance to shift the focus away from OTT providers (Over The Top service providers like Skype, Google Chat, WebEx etc. that were otherwise eating away at the operator's revenue) to its very own WebRTC client-server solution, making VoIP calls chargeable while keeping them available from any client (web or softphone based).

……………………………………………..

Where are we now?

WebRTC has now implemented open standards for real-time, plugin-free video, audio and data communication.

  • Many web services already use RTC, but need downloads, native apps or plugins. These include Skype, Facebook (which uses Skype) and Google Hangouts (which uses the Google Talk plugin).
  • Downloading, installing and updating plugins such as Flash or Java can be complex, error prone and annoying.
  • Plugins can be difficult to deploy, debug, troubleshoot, test and maintain, and may require licensing and integration with complex, expensive technology. It is often difficult to persuade people to install a plugin in the first place, let alone bookmark it or keep it activated at all times.

……………………………………………

WebRTC supported browsers:

PC:
  • Google Chrome 23
  • Mozilla Firefox 22
  • Opera 12

Android:
  • Google Chrome 28 (needs configuration at chrome://flags/)
  • Mozilla Firefox 24
  • Opera Mobile 12

Google Chrome OS

For other browsers, the “WebRTC4all” plugin:
  • Safari
  • Internet Explorer (v9)
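
Version lists like the one above go stale quickly, so pages usually probe for the APIs themselves. The following is a minimal TypeScript sketch of such a check. The function name `detectWebRtcSupport` and the `BrowserGlobals` shape are made up for this illustration. Only the unprefixed `RTCPeerConnection` and `navigator.mediaDevices.getUserMedia` names are real browser APIs, and browsers from the period listed above exposed vendor-prefixed variants that this sketch does not look for.

```typescript
// Minimal shape of the globals being probed (hypothetical type for this sketch).
// In a real page you would pass `window`.
interface BrowserGlobals {
  RTCPeerConnection?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

interface WebRtcSupport {
  peerConnection: boolean; // peer-to-peer connections can be created
  getUserMedia: boolean;   // microphone / camera capture is available
}

// Reports which WebRTC entry points exist on the given global object.
function detectWebRtcSupport(globals: BrowserGlobals): WebRtcSupport {
  const mediaDevices = globals.navigator
    ? globals.navigator.mediaDevices
    : undefined;
  return {
    peerConnection: typeof globals.RTCPeerConnection === "function",
    getUserMedia:
      !!mediaDevices && typeof mediaDevices.getUserMedia === "function",
  };
}
```

A page could call `detectWebRtcSupport(window)` on load and fall back to a plugin or a plain message when either flag is false.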

…………………………………………..

In Conclusion

The APIs and standards of WebRTC can democratize and decentralize tools for content creation and communication—for telephony, gaming, video production, music making, news gathering and many other applications.
…………………………………………..

What is WebRTC?


  • API definition

WebRTC (Web Real-Time Communication) is an API definition drafted by the World Wide Web Consortium (W3C) that supports browser-to-browser applications for voice calling, video chat, and P2P file sharing without the need for either internal or external plugins.

  • Enables browser to browser applications for voice calling, video chat and P2P file sharing without plugins.
  • Awaiting standardization, on an API level at the W3C and at the protocol level at the IETF.
  • Enables web browsers with Real-Time Communications (RTC) capabilities
  • Free, open project
 The following is the browser-side stack for WebRTC media.
 WebRTC media stack solution architecture

Core technologies:

  • WebM codecs
  • JavaScript functions to access and process the browser media stack
  • HTML5 to embed the video and audio elements
  • Signalling

Why is WebRTC important?

Significantly better video quality: WebRTC video quality is noticeably better than Flash.
Up to 6x faster connection times: Using JavaScript WebSockets, also an HTML5 standard, improves session connection times and accelerates delivery of other OpenTok events.
Reduced audio/video latency: WebRTC offers significant improvements in latency, enabling more natural and effortless conversations.
Freedom from Flash: With WebRTC and JavaScript WebSockets, you no longer need to rely on Flash for browser-based RTC.
Native HTML5 elements: Customize the look and feel and work with video like you would any other element on a web page, using the new video tag in HTML5.

The major players behind the conception and advancement of WebRTC standards and libraries are:

IETF, W3C, Java community, GSMA.
The idea is to develop a lightweight, browser-based call console to make SIP calls from a web page. This was successfully achieved using fundamental technologies such as JavaScript, HTML5, WebSockets and TCP/UDP, together with an open-source SIP server. It is good to note that no extra extension, plugin or gateway, such as Flash support, is required. It also has cross-platform support, including Mozilla Firefox, Chrome and so on.

 Peer to peer Communication

 WebRTC forms a p2p communication channel between all the peers. This means that as the participant count grows, the call turns into a mesh network topology, with an incoming and an outgoing stream towards each of its peers.
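
To see how quickly a full mesh grows, the arithmetic can be sketched in a few lines of TypeScript. The function and field names (`meshLoad`, `totalConnections` and so on) are made up for this illustration, and the sketch assumes every participant sends one stream to, and receives one stream from, every other participant.

```typescript
interface MeshLoad {
  totalConnections: number;       // peer connections in the whole call
  outgoingStreamsPerPeer: number; // streams each participant uploads
  incomingStreamsPerPeer: number; // streams each participant downloads
}

// Full mesh of n participants: each pair shares one connection,
// so there are n * (n - 1) / 2 connections in total.
function meshLoad(participants: number): MeshLoad {
  if (Math.floor(participants) !== participants || participants < 1) {
    throw new RangeError("participants must be a positive integer");
  }
  const others = participants - 1;
  return {
    totalConnections: (participants * others) / 2,
    outgoingStreamsPerPeer: others,
    incomingStreamsPerPeer: others,
  };
}

const twoParty = meshLoad(2);  // 1 connection, 1 stream each way per peer
const fiveParty = meshLoad(5); // 10 connections, 4 streams each way per peer
```

The upload cost per participant grows linearly and the number of connections grows quadratically, which is why a mesh is usually kept to a handful of participants.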

Two party call p2p


Multiparty Call and mesh network

In the special case of broadcasting to a large number of viewers (who send no outgoing media stream), it is recommended to set up a Multipoint Control Unit (MCU), which relays the incoming stream to a large number of users without putting traffic load on the client where the stream originates.
Important note:
1. These diagrams do not depict ICE and NAT traversal and have been simplified for better understanding. In real-world scenarios a STUN and TURN server is almost always involved.

More on TURN servers is given here: NAT traversal using STUN and TURN

2. WebRTC also mandates a secure origin (HTTPS) for any webpage that invokes getUserMedia to capture user media such as audio and video. Browsers apply the same restriction to location access.
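
As a rough illustration of the secure-origin rule, the TypeScript sketch below guesses whether a page URL would be allowed to capture media. The function name `isLikelySecureOrigin` is made up for this example, and the rule it applies (HTTPS, or a loopback host used during development) is only an approximation of what browsers actually implement. In a real page the browser's own `window.isSecureContext` flag is the authoritative answer.

```typescript
// Approximate check: HTTPS pages and loopback hosts are treated as secure.
// This is a simplification of the browser's real secure-context rules.
function isLikelySecureOrigin(pageUrl: string): boolean {
  // Capture the scheme and the host (bracketed IPv6 or plain hostname).
  const match = /^([a-z][a-z0-9+.-]*):\/\/(\[[^\]]+\]|[^\/:?#]*)/i.exec(pageUrl);
  if (match === null) {
    return false;
  }
  const scheme = (match[1] || "").toLowerCase();
  const host = (match[2] || "").toLowerCase();
  if (scheme === "https") {
    return true;
  }
  return host === "localhost" || host === "127.0.0.1" || host === "[::1]";
}
```

Under this approximation a page served from `https://example.com` or `http://localhost:8080` passes, while a plain `http://` page on a public host does not.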

Read more on the layers of WebRTC and their functionalities here: WebRTC layers

Open Source WebRTC SDK and its implementation steps https://github.com/altanai/webrtc

WebRTC layers

WebRTC stands for Web Real-Time Communications and introduces a real-time media framework in the browser core, alongside associated JavaScript APIs for controlling that framework and HTML5 tags for displaying the media.

From a technical point of view, WebRTC hides all the complexity of real-time media behind a very simple JavaScript API.

WebRTC simplified:

In simple words, it is a phenomenal thing that will revolutionize internet telephony. It will also emerge to be platform independent (i.e. any browser, any desktop operating system, any mobile operating system).

WebRTC allows anybody to introduce real-time communication to their web page as simply as introducing a table.

Codec Confusion:

Video Codecs

Currently VP8 is the codec of choice since it is royalty free. In mobility today, the codec of choice is H.264. H.264 is not royalty free, but it is native in most mobile handsets due to its high performance.

Voice Codecs

Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music. As an open format standardized through RFC 6716, a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.
G.711 is an ITU (International Telecommunication Union) standard for audio compression. It is primarily used in telephony. The standard was released in 1972. It is the required standard in many voice-based systems and technologies, for example in the H.320 and H.323 specifications.
Speex is a patent-free audio compression format designed for speech, and also a free software speech codec that is used in VoIP applications and podcasts. Some consider Speex obsolete, with Opus as its official successor, but since significant content is out there using Speex, it will not disappear anytime soon.
G.722 is an ITU standard 7 kHz wideband audio codec operating at 48, 56 and 64 kbit/s. It was approved by ITU-T in 1988. G.722 provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz, compared to 300–3400 Hz for G.711.

AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz. Its data rate ranges from 6.6 to 23.85 kbit/s, and the codec is generally available on mobile phones.

Architecture:

WebRTC offers web application developers the ability to write rich, realtime multimedia applications (think video chat) on the web, without requiring plugins, downloads or installs. Its purpose is to help build a strong RTC platform that works across multiple web browsers, across multiple platforms.


Web API – An API to be used by third party developers for developing web based videochat-like applications.

WebRTC Native C++ API – An API layer that enables browser makers to easily implement the Web API proposal.

Transport / Session

The session components are built by re-using components from libjingle, without using or requiring the xmpp/jingle protocol.

RTP Stack – A network stack for RTP, the Real-time Transport Protocol.

STUN/ICE – A component allowing calls to use the STUN and ICE mechanisms to establish connections across various types of networks.
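
On the web side, the STUN and TURN servers used by ICE are handed to the peer connection as a list of server entries. The TypeScript sketch below builds such a list as plain data. The helper name `buildIceServers`, the host names and the credentials are placeholders invented for this example. The `urls`, `username` and `credential` fields follow the shape of the `iceServers` option of `RTCPeerConnection`, and 3478 is the default STUN/TURN port.

```typescript
// Same field names as an entry in RTCPeerConnection's `iceServers` option.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Builds one STUN entry and one TURN entry (UDP and TCP transports).
// All argument values are supplied by the caller; nothing here is a real server.
function buildIceServers(
  stunHost: string,
  turnHost: string,
  turnUser: string,
  turnPassword: string
): IceServer[] {
  return [
    { urls: `stun:${stunHost}:3478` },
    {
      urls: [
        `turn:${turnHost}:3478?transport=udp`,
        `turn:${turnHost}:3478?transport=tcp`,
      ],
      username: turnUser,
      credential: turnPassword,
    },
  ];
}

// Hypothetical hosts and credentials, for illustration only.
const exampleIceServers = buildIceServers(
  "stun.example.org",
  "turn.example.org",
  "demo-user",
  "demo-password"
);
```

In a browser the result would typically be passed as `new RTCPeerConnection({ iceServers: exampleIceServers })`.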

Session Management

An abstracted session layer, allowing for call setup and management. This leaves the protocol implementation decision to the application developer.

VoiceEngine

VoiceEngine is a framework for the audio media chain, from sound card to the network.

iSAC / iLBC / Opus

iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. iSAC uses 16 kHz or 32 kHz sampling frequency with an adaptive and variable bit rate of 12 to 52 kbps.

iLBC: A narrowband speech codec for VoIP and streaming audio. Uses 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames. Defined by IETF RFCs 3951 and 3952.

Opus: Supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s, frame sizes from 2.5 ms to 60 ms, and various sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced). Defined by IETF RFC 6716.
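
The bitrate and frame-length figures above translate directly into the payload carried by each packet. The TypeScript sketch below does that arithmetic for the two iLBC modes quoted in this section. The function name `payloadBytesPerFrame` is made up for this illustration, and the G.711 line is an added comparison that assumes its standard 64 kbit/s rate, which is not stated in the text above.

```typescript
// Payload size of one codec frame: bits per second * seconds per frame / 8.
// The result is rounded because published bitrates are themselves rounded.
function payloadBytesPerFrame(bitrateBps: number, frameMs: number): number {
  return Math.round((bitrateBps * frameMs) / 1000 / 8);
}

const ilbc20ms = payloadBytesPerFrame(15200, 20); // 38 bytes per 20 ms frame
const ilbc30ms = payloadBytesPerFrame(13330, 30); // 50 bytes per 30 ms frame
const g711For20ms = payloadBytesPerFrame(64000, 20); // 160 bytes (assumed 64 kbit/s)
```

These numbers cover the codec payload only. RTP, UDP and IP headers are added on top of every packet, so short frames carry proportionally more overhead.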

NetEQ for Voice – A dynamic jitter buffer and error concealment algorithm used for concealing the negative effects of network jitter and packet loss. Keeps latency as low as possible while maintaining the highest voice quality.

Acoustic Echo Canceler (AEC) – The Acoustic Echo Canceler is a software based signal processing component that removes, in real time, the acoustic echo resulting from the voice being played out coming into the active microphone.

Noise Reduction (NR) – The Noise Reduction component is a software based signal processing component that removes certain types of background noise usually associated with VoIP (hiss, fan noise, etc.).

VideoEngine 

VideoEngine is a framework for the video media chain, from camera to the network, and from network to the screen.

VP8 – Video codec from the WebM Project. Well suited for RTC as it is designed for low latency.
Video Jitter Buffer – Dynamic jitter buffer for video. Helps conceal the effects of jitter and packet loss on overall video quality.
Image enhancements – For example, removes video noise from the image captured by the webcam.


W3C

  • Media Stream Functions: API for connecting processing functions to media devices and network connections, including media manipulation functions.
  • Audio Stream Functions: An extension of the Media Stream Functions to process audio streams (e.g. automatic gain control, mute functions and echo cancellation).
  • Video Stream Functions: An extension of the Media Stream Functions to process video streams (e.g. bandwidth limiting, image manipulation or “video mute”).
  • Functional Component Functions: API to query presence of WebRTC components in an implementation, instantiate them, and connect them to media streams.
  • P2P Connection Functions: API functions to support establishing signalling-protocol-agnostic peer-to-peer connections between web browsers.
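
Being signalling-protocol agnostic means the browser only tracks where it is in the offer/answer exchange, while the application decides how the descriptions travel (SIP over WebSockets, XMPP, plain HTTP and so on). The TypeScript sketch below models that exchange as a small state machine. The state names match the peer connection's `signalingState` values, but the event names and the `nextSignalingState` function are invented for this illustration, and provisional answers and rollback are left out.

```typescript
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";

// Hypothetical event names: which description was applied, and on which side.
type DescriptionEvent =
  | "setLocalOffer"
  | "setRemoteOffer"
  | "setLocalAnswer"
  | "setRemoteAnswer";

// Allowed transitions of the simplified offer/answer exchange.
const signalingTransitions: Record<
  SignalingState,
  Partial<Record<DescriptionEvent, SignalingState>>
> = {
  "stable": {
    setLocalOffer: "have-local-offer",   // caller created and applied an offer
    setRemoteOffer: "have-remote-offer", // callee received an offer
  },
  "have-local-offer": { setRemoteAnswer: "stable" }, // caller got the answer
  "have-remote-offer": { setLocalAnswer: "stable" }, // callee applied its answer
};

function nextSignalingState(
  state: SignalingState,
  event: DescriptionEvent
): SignalingState {
  const next = signalingTransitions[state][event];
  if (next === undefined) {
    throw new Error(`"${event}" is not valid in state "${state}"`);
  }
  return next;
}
```

A caller goes stable, have-local-offer, stable, and a callee goes stable, have-remote-offer, stable. Whatever signalling protocol is chosen only has to carry the offer one way and the answer back.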

API specification availability:

  • WebRTC 1.0: Real-time Communication Between Browsers, draft of 3 June 2013
  • Media Capture and Streams, draft of 16 May 2013

Implementation library: WebRTC Native APIs

  • Supported by Chrome, Firefox and Opera on desktop on all operating systems (Linux, Windows, Mac)
  • Supported by Chrome and Firefox in mobile browsers (Android)

IETF

  • Communication model
  • Security model
  • Firewall and NAT traversal
  • Media functions
  • Functionality such as media codecs, security algorithms, etc.
  • Media formats
  • Transport of non-media data between clients
  • Input to W3C for API development
  • Interworking with legacy VoIP equipment

WG draft   Date

  • draft-ietf-rtcweb-audio-02   2013-08-02
  • draft-ietf-rtcweb-data-channel-05   2013-07-15
  • draft-ietf-rtcweb-data-protocol-00   2013-07-15
  • draft-ietf-rtcweb-jsep-03   2013-02-27
  • draft-ietf-rtcweb-overview-07   2013-08-14
  • draft-ietf-rtcweb-rtp-usage-07   2013-07-15
  • draft-ietf-rtcweb-security-05   2013-07-15
  • draft-ietf-rtcweb-security-arch-07   2013-07-15
  • draft-ietf-rtcweb-transports-00   2013-08-19
  • draft-ietf-rtcweb-use-cases-and-reqs-11   2013-06-27
  • Plus over 20 discussion drafts



 WEBRTC CALL BETWEEN BROWSER AND SIP PHONE

Call Between Web client and SIP client

  1. HTML5 and WebRTC enabled web client:

We use an open-source HTML5 SIP client written entirely in JavaScript, which keeps it light and easy to integrate with the SIP server. No extension, plugin or gateway is needed to initiate a call from the web client. The media stack relies on WebRTC. The client can connect to any SIP or IMS network from an HTML5 and WebRTC enabled browser to make and receive audio/video calls and instant messages.

  2. Proxy server / WS to UDP translator:

For the proposed solution we use a freeware lightweight SIP server which, besides acting as a normal SIP server and registrar, can also act as a translator engine that converts SIP over WebSocket (WS) messages to SIP over UDP. One of the requirements is to terminate the call on a hard phone such as a turret, which supports only SIP over UDP, so the translator is needed in the overall picture. Through this component, a use case such as initiating a call from the web browser and terminating it on the hard phone becomes possible.

  3. Softphone / SIP client:

We use the Boghe IMS client as the softphone. It supports the audio codecs required to talk to the web client, namely PCMU and PCMA.
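
To give a feel for what the WS-to-UDP translation in the proxy component involves at the message level, here is a heavily simplified TypeScript sketch. When a SIP proxy forwards a request it places its own Via header, naming the outgoing transport, above the existing ones, so a request that arrived with a `SIP/2.0/WS` Via leaves with a `SIP/2.0/UDP` Via on top. The function name `forwardOverUdp`, the host names and the branch value are invented for this example. A real proxy also decrements Max-Forwards, handles Record-Route and actually sends the datagram, none of which is shown, and this is not the code of the SIP server used in the project.

```typescript
// Inserts the proxy's own UDP Via header above the first existing Via header.
// `branchId` must be unique per transaction; the "z9hG4bK" prefix is the
// magic cookie that SIP requires at the start of every branch parameter.
function forwardOverUdp(
  request: string,
  proxyHost: string,
  proxyPort: number,
  branchId: string
): string {
  // Find where the first "Via:" (or compact "v:") header line starts.
  const firstViaAt = request.search(/^(?:via|v)[ \t]*:/im);
  if (firstViaAt === -1) {
    throw new Error("SIP request has no Via header");
  }
  const proxyVia =
    `Via: SIP/2.0/UDP ${proxyHost}:${proxyPort};branch=z9hG4bK${branchId}`;
  return (
    request.slice(0, firstViaAt) + proxyVia + "\r\n" + request.slice(firstViaAt)
  );
}
```

The original WS Via stays in place underneath, which is how the response later finds its way back to the browser through the proxy.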

Working with the components discussed above, we have successfully established the following use-case scenarios.

  1. Call initiated from the browser and terminated on the browser:

(a)   Signalling part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media part – SDP is exchanged, as captured by Wireshark, and both clients can exchange voice.

  2. Call initiated from the browser and terminated on the softphone, and vice versa:

(a)   Signalling part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media part – SDP is exchanged, as captured by Wireshark, and both clients can exchange voice, with some dependency on the machine being used.

  3. Call initiated from the softphone and terminated on the softphone:

(a)   Signalling part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media part – No hiccups, it works fine.
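
The media part of each scenario only works when both ends advertise at least one common audio codec in the SDP they exchange. The TypeScript sketch below pulls the audio codec names out of two SDP bodies and intersects them. The function names `audioCodecs` and `commonAudioCodecs` are made up for this example, and the parsing is deliberately naive: it only reads `a=rtpmap` lines under the `m=audio` section, so codecs announced purely through static payload numbers (without an rtpmap line) would be missed.

```typescript
// Collects codec names (upper-cased) from a=rtpmap lines in the audio section.
function audioCodecs(sdp: string): string[] {
  const names: string[] = [];
  let inAudioSection = false;
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.indexOf("m=") === 0) {
      // A new media section starts; remember whether it is the audio one.
      inAudioSection = line.indexOf("m=audio") === 0;
      continue;
    }
    // Example line: a=rtpmap:0 PCMU/8000
    const match = /^a=rtpmap:\d+ ([^\/]+)\//.exec(line);
    if (inAudioSection && match !== null) {
      names.push((match[1] || "").toUpperCase());
    }
  }
  return names;
}

// Codecs present in both descriptions, in the order the offer lists them.
function commonAudioCodecs(offerSdp: string, answerSdp: string): string[] {
  const answered = audioCodecs(answerSdp);
  return audioCodecs(offerSdp).filter((codec) => answered.indexOf(codec) !== -1);
}
```

For a browser offering Opus, PCMU and PCMA against a softphone that only knows PCMU and PCMA, the intersection is PCMU and PCMA, which matches the codecs the softphone in this setup was chosen for.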


The structure for multi-network traversal using ICE, STUN and TURN is described in the following diagram.

Call Between Web client and SIP client (1)

You can read more about NAT traversal using STUN and TURN here.

A detailed look at TURN servers for WebRTC (RFC5766-TURN-Server, Coturn, Xirsys) is here.