TFX is a WebRTC-based communication platform built entirely on open standards, making it highly scalable. The underlying API completely masks the communication plumbing and lets the user enjoy an interactive communication session. It also provides an easy-to-build widget framework which can be used to build applications on the TFX platform.
TFX Sessions
TFX Sessions is a part of TFX. It is a free Chrome extension WebRTC client that enables parties communicating and collaborating to have an interactive and immersive experience. You can find it on the Chrome Web Store here.
Features of TFX Sessions:
Through TFX, users can have instant multimedia Internet call sessions.
The core features are:
No sign-in or account management
No additional requirements like Flash, Silverlight or Java
URL-based session management
Secure WebRTC-based communication
Complete privacy, with no user tracking or media flow interruption
Ability to share a session on social network platforms like Facebook, Twitter, LinkedIn, Gmail, Google Plus etc.
Ability to choose between multiple cameras
The TFX platform has developer friendly APIs to help build widgets. Some of the pre-built widgets available on TFX are:
Coding
Drawing
Multilingual chat
Screen sharing
TFX Sessions is free for personal use and can be downloaded from the Chrome Web Store.
What differentiates it from other Internet call services?
No registration or login required for account management
Communication is directly peer-to-peer, i.e., information stays private
Third-party apps and services can be included as widgets on the TFX platform
Can be adapted for embedding inside a mobile app WebView, an iframe, other portals etc. at any time
TFX Sessions Integration Models
The three possible approaches for TFX integration, in increasing order of deployment time, are:
Website's widget on the TFX Chrome extension
Launch the TFX extension in an independent window from the website
TFX call from an embedded window inside the website page
1. Website's widget on the TFX Chrome extension
This is the quickest deliverable approach: the website's own customized widget is built on the TFX widgets API and deployed on the existing TFX communication setup.
Step 1: Log in using the website's credentials to access the content
Step 2: Access the website with the other person inside the TFX "Pet Store" widget
2. Launch TFX in an independent window from a "Click to Call" button on the website
This approach outlines the process of launching TFX in an independent window from the click of a button on the website. However, it is a prerequisite to have the TFX extension installed on the Chrome browser beforehand.
Step 1: Have TFX installed on the Chrome browser
Step 2: Trigger and launch the TFX Chrome extension window on the click of a button on the webpage
3. TFX call from an embedded window inside the website page
This section is for the third approach, which is being able to make TFX calls from an embedded window inside the webpage. Refer to the sample screen below:
Step 1: Have TFX embedded in an iframe inside the website
Step 2: Start a session on the click of a button inside the iframe
Technical details about TFX, such as its architecture, widget development and component descriptions, can be found here: TFX Platform
By design, WebRTC was intended to be a secure, peer-to-peer, end-to-end encrypted form of real-time communication. It ensures that:
media is always encrypted (SRTP)
key exchange is secure (DTLS)
WebRTC APIs are invoked from a secure web site (HTTPS)
signaling is secure (TLS on signaling, such as WSS)
Additionally, users and developers can improve security by keeping browsers updated, removing deprecated libraries and patching known vulnerabilities. However, the security challenges with a web-server-based WebRTC service are still many, for example:
If both peers have a WebRTC browser, one peer can place a WebRTC call to the other at any time. As the call can be automatically answered, this might result in a denial of service (DoS) for the receiver.
Since the media is peer-to-peer and can also bypass firewall settings through the TURN server, it can result in unwanted or prohibited data being sent on the network.
WebSocket packets cannot easily be inspected to determine whether they carry normal web navigation traffic or SDP; hence one may secretly set up non-RTP sessions with users through the web server and exchange information.
Threats from screen sharing, for example a user might mistakenly share his Internet banking screen or some confidential information / PII present on the desktop.
Giving long-term access to the camera and microphone for certain sites is also a concern. For example:
In an unclosed tab on a site that has access to your microphone and camera, the remote peer can secretly be viewing your webcam and microphone inputs.
Clever use of the user interface to mask an ongoing call can mislead the user into believing that the call has been cut while it is secretly still ongoing.
Network attackers can modify an HTTP connection through a Wi-Fi router or hotspot to inject an IFRAME (or a redirect) and then forge the response to initiate a call to themselves.
As WebRTC does not have a strong congestion control mechanism, it can eat up a large chunk of the user's bandwidth.
By visiting chrome://webrtc-internals/ in the Chrome browser alone, one can view the full traces of all WebRTC communication happening through the browser. The traces contain details like the signaling server used, relay servers and TURN servers. Additionally, they show peer IPs and frame rates. This information can jeopardize the security of VoIP service providers.
Of course, other challenges that arise with any web-service-based architecture are also applicable here, such as:
Malicious websites which automatically execute the attacker's scripts.
Users can be induced to download harmful executable files and run them.
Improper use of W3C Cross-Origin Resource Sharing (CORS) to bypass the Same-Origin Policy (SOP).
Unlike most conventional real-time systems (e.g., SIP-based softphones), WebRTC communications are directly controlled by a web server over some signaling protocol, which may be XMPP, WebSockets, socket.io, Ajax etc. This poses new challenges such as:
A web browser might expose JavaScript APIs which allow the web server to place a video call itself. This may cause web pages to secretly record and stream the webcam activity from the user's computer.
Malicious calling services can record the user's conversation and misuse it.
Malicious webpages can lure users via advertising and execute auto-calling services.
Since the JavaScript calling APIs are implemented as browser built-ins, unauthorized access to these can also make users' audio and camera streams vulnerable.
If programs and APIs allow the server to instruct the browser to send arbitrary content, then they can be used to bypass firewalls or mount denial of service attacks.
The general goal of security is to identify and resolve security issues during the design phase so they do not cost the service provider time, money and reputation at a later phase. Security for a large architecture project involves many aspects; there is no one device or methodology to guarantee that an architecture is "secure". Areas that malicious individuals will attempt to attack include, but are not limited to:
Improperly coded applications
Incorrectly implemented protocols
Operating System bugs
Social engineering and phishing attacks
As security is a broad topic touching on many parts of WebRTC, this section is not meant to address all topics but instead focuses on specific "hot spots", areas that require special attention due to the unique properties of a WebRTC service. There are several security-related topics that are of particular interest with respect to WebRTC. They are discussed in detail in the sections below.
Today the browser acts as a TRUSTED COMPUTING BASE (TCB) where the HTML and JS act inside a sandbox that isolates them from the user's computer. With the latest tightening of patches around security concerns in WebRTC platforms, a script cannot access a user's webcam, microphone, location, files or desktop capture without the user's explicit consent. When the user allows access, a red dot appears on that tab, providing a clear indication to the user that the tab has media access.
Figure: browser asking for the user's consent to access media devices for WebRTC. Figure: media capture active in the browser, indicated by the red dot.
Persistent identifiers, such as reused certificates, can enable tracking. Mitigation: generate ephemeral DTLS certificates per session.
Some specific security concerns center around browsers. Cross-site scripting (XSS) is a type of vulnerability typically found in web applications (including web browsers, through breaches of browser security) that enables attackers to inject client-side script into web pages viewed by other users.
A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same origin policy.
Cross-site scripting carried out on websites accounted for roughly 80.5% of all security vulnerabilities documented by Symantec as of 2007 according to Wikipedia.
Their effect may range from a petty nuisance to a significant security risk, depending on the sensitivity of the data handled by the vulnerable site and the nature of any security mitigation implemented by the site’s owner.
As the primary method for accessing WebRTC is expected to be HTML5-enabled browsers, there are specific security considerations concerning their use, such as: protecting keys and sensitive data from cross-site scripting or cross-domain attacks, WebSocket use, iframe security, and other issues. Because the client software is controlled by the user, and because the browser does not, in most cases, run in a protected environment, there are additional chances that the WebRTC client will become compromised. This means all data sent to the client could be exposed, for example:
keys
hashes
registration elements (PUID etc.)
Therefore additional care needs to be taken when considering what information is sent to the client, and additional scrutiny needs to be performed on any data coming from the client.
Clickjacking (User Interface redress attack, UI redress attack, UI redressing) is a malicious technique of tricking a web user into clicking on something different from what the user perceives they are clicking on, thus potentially revealing confidential information or taking control of their computer while clicking on seemingly innocuous web pages. It is a browser security issue and a vulnerability across a variety of browsers and platforms. A clickjack takes the form of embedded code or a script that can execute without the user's knowledge, such as clicking on a button that appears to perform another function. A compromised personal computer with installed adware, viruses or spyware such as trojan horses can also compromise the browser and obtain anything the browser sees.
The browser acts as a TRUSTED COMPUTING BASE (TCB) both from the user's perspective and, to some extent, from the server's. HTML and JavaScript (JS) provided by the web server can execute scripts on the browser and generate actions and events. However, the browser operates in a sandbox that isolates these scripts both from the user's computer and from the server.
The user's computer may have a lot of private and confidential data on disk. Browsers do make it mandatory that the user explicitly selects a file and consents to its upload before any file upload or transfer transaction. However, it is still not very rare that misleading text and buttons can make users click on files.
Another way of accessing local resources is by getting the user to download malicious executable files which may harm the user's computer.
We know that the XMLHttpRequest() API can be used to send data from one origin to another, and this could be used to secretly send information without the user's knowledge. However, the Same-Origin Policy (SOP) in browsers now prevents server A from mounting attacks on server B via the user's browser, which protects both the user (e.g., from misuse of his credentials) and server B (e.g., from a DoS attack).
SOP forces scripts from each site to run in their own isolated sandboxes. It enables webpages and scripts from the same origin server to interact with each other's JS variables, but prevents pages from different origins, or even iframes on the same page, from exchanging information.
As part of SOP, scripts are allowed to make HTTP requests via the XMLHttpRequest() API only to servers which have the same origin/domain as that of the originator.
CORS enables multiple web services to intercommunicate. Therefore, when a script from origin A executes what would otherwise be a forbidden cross-origin request, the browser instead contacts the target server B to determine whether it is willing to allow cross-origin requests from A. If it is willing, the browser then allows the request. This consent verification process is designed to safely allow cross-origin requests, as sketched below.
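For illustration, the consent is expressed through response headers: the browser only exposes the response to the script when server B answers with a matching Access-Control-Allow-Origin header. A minimal sketch using the standard fetch API (serverb.example.com is a placeholder host):
// sketch: a cross-origin GET from origin A to a hypothetical server B;
// the browser performs the CORS check and only hands the response to the
// script if B's Access-Control-Allow-Origin header permits origin A
fetch('https://serverb.example.com/v1/getdata', { method: 'GET', mode: 'cors' })
  .then(function (response) { return response.json(); })
  .then(function (data) { console.log('cross-origin data allowed by CORS:', data); })
  .catch(function (err) { console.error('request blocked or failed:', err); });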
Once a WebSockets connection has been established from a script to a site, the script can exchange any traffic it likes without being required to frame it as a series of HTTP request/response transactions.
Even though WebSockets overcome SOP and establish cross-origin transport channels, they pose some challenging scenarios for a secure application design.
WebSockets use a masking technique to randomize the bits that are being transmitted, making it more difficult to generate traffic which resembles a given protocol, and thus making it difficult to inspect the flowing traffic.
JSONP is a hack designed to bypass origin restrictions through script tag injection. A JSONP-enabled server passes the response to a user-specified callback function.
When we use <script> tags the domain limitation is ignored, i.e., we can load scripts from any domain. So when we need to fetch or exchange data, we just pass callback parameters through scripts. For example:
function mycallback(data) {
  // this is the callback function executed when the script returns
  alert("hi " + data);
}
var script = document.createElement('script');
script.src = '//serverb.com/v1/getdata?callback=mycallback';
document.head.appendChild(script);
Vulnerabilities have been found in the existing Java and Flash consent verification techniques and handshakes.
The sender and receiver are able to share a media stream only after an offer/answer handshake, but one is already needed in order to do NAT hole punching. Presuming the ICE server is malicious, and if the STUN transaction IDs are known to the calling scripts, it is not possible for the receiver's webpage to ascertain whether the data is forged or genuine. Thus, to prevent this, the browser must generate hidden transaction IDs and must not share them with the calling scripts, even via a diagnostic interface.
As soon as the callee sends their ICE candidates, the caller learns the callee's IP addresses. The callee's server-reflexive address reveals a lot of information about the callee's location.
To prevent IP address leakage, the server should suppress the start of ICE negotiation until the callee has answered. Also, a user may hide their location entirely by forcing all traffic through a TURN server, as sketched below.
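A minimal sketch of the second mitigation, forcing all candidates through a TURN relay with the standard iceTransportPolicy option (the TURN host and credentials here are placeholders):
// gather only relayed candidates so the peer's host and server-reflexive IPs are never exposed
var pc = new RTCPeerConnection({
  iceServers: [{
    urls: 'turn:turn.example.com:3478',   // placeholder TURN server
    username: 'user',                     // placeholder credentials
    credential: 'password'
  }],
  iceTransportPolicy: 'relay'             // 'relay' suppresses host/srflx candidates
});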
The goal of WebRTC-based call services should be to create a channel which is secure against both message recovery and message modification, for all audio, video and data.
With the increasing requirement for screen sharing in web apps and communication systems, there is always a high threat of oversharing or exposing confidential passwords, PINs, security details etc. This may happen either through some part of the screen or through a notification which pops up.
There is always the case where the user believes he is sharing a single window when in fact he is sharing the entire desktop.
An attacker may request screen sharing and make the user open his webmail, payment settings or even net-banking accounts.
When a user frequently uses a site he/she may want to give the site long-term access to the camera and microphone (indicated by "Always allow on this site" in Chrome). However, the site may be hacked and thus initiate a call on the user's computer automatically to secretly listen in.
Unless the user checks his laptop's glowing camera LED or monitors the traffic himself, he would not know if there is an active call in the background which, as far as he is concerned, he had hung up. In such a case an attacker may pretend to cut a call, showing a red phone icon and supportive text, but still keep the session and media stream active while placing himself on mute.
Even if the calling service cannot directly access keying material, it can simply mount a man-in-the-middle attack on the connection. The idea is to mount a bridge capturing all the traffic.
Mitigation strategy: to protect against this, it is now mandatory to use HTTPS for getUserMedia, and it is otherwise also recommended to keep WebRTC communication services on HTTPS or to use strict fingerprint verification. A sketch follows.
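A minimal sketch of enforcing this on the client side: refuse to start outside a secure context and keep signaling on WSS (signaling.example.com is a placeholder signaling endpoint):
// getUserMedia and other WebRTC APIs require a secure (https) origin;
// keep the signaling channel on WSS as well
if (window.isSecureContext) {
  var signaling = new WebSocket('wss://signaling.example.com/ws');
  signaling.onopen = function () { console.log('secure signaling channel established'); };
} else {
  console.warn('This page is not served from a secure origin; WebRTC media access will be blocked.');
}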
We know that the forces behind WebRTC standardization are the WHATWG, W3C, IETF and strong Internet working groups. WebRTC security was already taken into consideration when the standards were being built. Encryption methods and technologies like DTLS and SRTP were included to safeguard users from intrusions so that information stays protected.
The WebRTC media stack has native built-in features that address security concerns. The peer-to-peer media is already encrypted for privacy. See the figure below:
WebRTC encrypts video and audio data via SRTP (the Secure Real-time Transport Protocol), ensuring that IP communications – your voice and video traffic – cannot be heard or seen by unauthorized parties.
What is SRTP ?
The Secure Real-time Transport Protocol (or SRTP) defines a profile of RTP (Real-time Transport Protocol), intended to provide encryption, message authentication and integrity, and replay protection to the RTP data in both unicast and multicast applications.
Earlier models of VoIP communication, such as SIP-based calls, had the option to use plain RTP for communication, thereby subjecting the endpoint users to problems like compromised media confidentiality. However, the WebRTC model mandates the use of SRTP, ruling out the insecurities of RTP completely. For encryption and decryption of the data flow, SRTP utilizes the Advanced Encryption Standard (AES) as the default cipher.
For such end-to-end media encryption, the shared secret must be exchanged between the endpoints.
SDES (SDP Security Descriptions for Media Streams) carries the SRTP secret key in plaintext inside the SDP of a SIP packet and relies on TLS so it can flow end to end securely. This was common practice among SIP endpoints in IMS and telco ecosystems to share the SRTP secret key. However, given the in-view JS stack in the browser and open access to the page's code, SDES is not applicable to WebRTC systems and is largely outdated.
Currently, DTLS (Datagram Transport Layer Security) is used by WebRTC endpoints for the cryptographic key exchange, multiplexed on the media ports. For WebRTC to transfer real-time data, the data is first encrypted using the DTLS method. In the DTLS-SRTP handshake both ends contribute "half" of the SRTP key.
(+) Already built into all the WebRTC-supported browsers from the start (Chrome, Firefox and Opera).
(+) On a DTLS-encrypted connection, eavesdropping and information tampering cannot take place.
(-) The primary issue with supporting DTLS is that it can put a heavy load on the SBCs handling encryption/decryption duties.
(-) Interworking DTLS-SRTP to SDES is CPU intensive:
SRTP from the DTLS-SRTP end flows easily
SRTP from the SDES end requires auth + decrypt, and encrypt + auth
What is DTLS ?
DTLS allows datagram-based applications to communicate in a way that is designed to prevent eavesdropping, MITM attacks, tampering and message forgery. The DTLS protocol is based on the stream-oriented Transport Layer Security (TLS) protocol.
DTLS certificates are ephemeral: they are generated per session to authenticate peers. Because the certificates are short-lived, they do not act as long-term identifiers and thus avoid tracking. A sketch of generating such a certificate follows.
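Browsers expose this through RTCPeerConnection.generateCertificate; a sketch of creating a fresh certificate for a single session instead of reusing a stored one:
// generate an ephemeral ECDSA certificate and bind it to one peer connection;
// not persisting it means it cannot serve as a long-term tracking identifier
RTCPeerConnection.generateCertificate({ name: 'ECDSA', namedCurve: 'P-256' })
  .then(function (cert) {
    var pc = new RTCPeerConnection({ certificates: [cert] }); // use pc for this session only
    console.log('certificate expires at', new Date(cert.expires));
  });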
DTLS handshake
Together, DTLS and SRTP enable the exchange of the cryptographic parameters so that the key exchange takes place in the media plane, multiplexed on the same ports as the media itself, without the need to reveal crypto keys in the SDP.
The media relay points which proxy the media stream between endpoints can expose traffic and metadata; however, due to the end-to-end encrypted nature of WebRTC, they cannot be used to decipher and listen in on media packets. Such relay points include:
TURN server
Mixers
Media engines
It is important that WebRTC's SRTP stream is linked to another SRTP endpoint; RTP-SRTP gateways should be avoided.
In recent months everyone has been trying to get into the WebRTC space, but at the same time fearing that hackers might be able to listen in on conferences, access user data, or even private networks. Although development and usage around WebRTC is simple, the security and encryption aspects of it remain in the dim light.
A simple WebRTC architecture is shown in the figure below :
By following the simple steps described below, one can ensure a more secure WebRTC implementation. The same applies to healthcare and banking firms looking to use WebRTC as a communication solution for their portals.
Ensure that the signaling platform runs over a secure protocol such as SIP over TLS, HTTPS or WSS. Also, since media is peer-to-peer, the media contents like the audio and video channels flow between peers directly in full duplex.
To protect against Man-In-The-Middle (MITM) attacks, the media path should be monitored regularly for suspicious relays.
Users that can participate in a call should be pre-registered / authenticated with a registrar service. Unauthenticated entities should be kept away from the session's reach.
WebRTC authentication certificate
Make sure that ICE values are masked, thereby not revealing the caller's / callee's IP and location to each other through tracing in chrome://webrtc-internals/ or packet inspection in Wireshark on the user's end.
As the signaling server maintains the number of peers, it should be consistently monitored for the addition of suspicious peers to a call session. If the number of peers actually present on the signaling server is more than the number of peers interacting on the WebRTC page, then it means that someone is eavesdropping secretly and should be forcibly removed from session access.
It has been observed that many times non-tech-savvy users simply agree to all permission requests from the browser without consciously giving consent. Therefore users should be made aware of APIs in websites which ask for undue permissions, for example permission to access the camera, microphone, location or screen.
Third party API should be thoroughly verified before sending their data on WebRTC DataChannel.
Before desktop sharing, users should be properly notified and advised to close any screen containing sensitive information; a sketch of such a prompt follows.
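As an illustration, a site can surface its own reminder before invoking the browser's screen-capture picker (a sketch using the standard getDisplayMedia API):
// remind the user to close sensitive windows before the browser's own picker appears
async function startScreenShare() {
  alert('Please close any window showing passwords, banking pages or other sensitive information before sharing.');
  var stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
  return stream; // attach to a video element or add its tracks to the peer connection
}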
WebRTC uses DTLS-SRTP to authenticate peers and ensure that only authorized parties can participate in the communication.
a. Certificates
Each peer generates a self-signed certificate during the WebRTC session.
These certificates are used to establish a secure DTLS connection and authenticate the peers.
b. Fingerprint Verification
WebRTC uses certificate fingerprints (hashes) to verify the identity of peers.
The fingerprints are exchanged during the signaling process and are included in the SDP (Session Description Protocol).
If the fingerprints do not match, the connection is rejected. A sketch of reading the advertised fingerprint from the SDP follows.
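For illustration, the fingerprint appears as an a=fingerprint line in the SDP and can be inspected from the local or remote description (a sketch; pc is assumed to be an existing RTCPeerConnection):
// read the DTLS certificate fingerprint advertised in the local SDP
function extractFingerprint(sdp) {
  var line = sdp.split('\r\n').find(function (l) { return l.indexOf('a=fingerprint:') === 0; });
  return line ? line.slice('a=fingerprint:'.length) : null;
}

pc.createOffer()
  .then(function (offer) { return pc.setLocalDescription(offer); })
  .then(function () {
    console.log('local DTLS fingerprint:', extractFingerprint(pc.localDescription.sdp));
  });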
Support for WebRTC should not increase the security risk to the telecom network. Any device or software that is in the hands of the customer will be compromised; it is just a matter of time! Some safeguards:
All data received from untrusted / third-party sources (i.e., all data from customer-controlled devices or software) must be validated.
Expect that any data sent to the client will be obtained by malicious users.
Ensure that the new service does not adversely impact the data security, privacy, or service of existing customers.
Remove PII and sensitive information from metadata and other records or traces such as CDRs (Call Detail Records).
For storing logs, recordings, files, SSH keys or any other sensitive information encrypted by keys, we need safe storage for the keys; tools such as Dashlane, LastPass, Bitwarden and 1Password are handy for password and key management.
Auto sign-in for WebRTC apps
Turn user authentication on and enable two-factor authentication / biometrics. OTP-based sign-on and captcha checks are also popular approaches to protect sign-in.
Public Wi-Fi
Even a WebRTC end-to-end encrypted connection can be tampered with on an insecure Wi-Fi network. Even though a man-in-the-middle cannot decipher message content, they can infer intelligible information from packet sizes, frequency, the end parties' IPs and ports in signaling, time delays for remote network detection, etc. For native clients, a precautionary measure is to enable remote lock and data wipe. It is also advised to only use authorized apps to handle sensitive data such as image storage.
If you use a native WebRTC app, there are multiple things that you need to be wary of.
Avoid all jailbreaks: jailbreaking a smartphone can enable the user to run unverified or unsupported apps, and many of these apps carry security vulnerabilities. The majority of security exploits for Apple's iOS only affect jailbroken iPhones.
Add a mobile security app: mobile security reports show that mobile operating systems such as iOS and (especially) Android are increasingly becoming targets for malware. Select a reputable mobile security app that extends the built-in security features of the device's mobile operating system. Some well-known third-party security vendors offering mobile security apps for iOS, Android and Windows Phone are Avast, Kaspersky and Symantec.
Also, as a good practice, turn off Bluetooth, Wi-Fi and NFC when not needed.
Information security ensures that both physical and digital data is protected from unauthorized access, use, disclosure, disruption, modification, inspection, recording or destruction.
Although WebRTC already has the best secure tools in its spec list, providing end-to-end encrypted communication over SRTP with DTLS as well as mandating media device access only from websites of secure origin over TLS, if the endpoints acting as peers are themselves compromised then all of this is in vain. Hence security issues arise when:
Endpoints are recording their media content and storing it in unsafe locations such as public file servers
Endpoints are in turn re-streaming their incoming media streams to unsafe streaming servers
Phishing, pretexting, baiting, quid pro quo, tailgating and water-holing are some of the common tactics used to steal the data of an unsuspecting user. They are as applicable to a WebRTC-based communication site as they are to any other trusted website such as banking sites, customer care contacts, flash sale portals, coupon / discount sites etc.
Phone phishing – voice phishing is a criminal phone fraud, impersonating a legitimate caller such as a bank or tax agent and using social engineering over the telephone system to gain access to private personal and financial information for the purpose of financial reward.
Phishing – WebRTC data channel messages can be used as a method of phishing by sending malicious links while posing as a legitimate sender. It is hard to track such attacks since the data channels are peer-to-peer.
Impersonation attacks – spear-phishing emails that attempt to impersonate a trusted individual or a company in an attempt to gain access to corporate finances, human resource details or sensitive data. Business email compromise (BEC), also known as CEO fraud, is a popular example of an impersonation attack. The fake email usually describes a very urgent situation to minimize scrutiny and skepticism.
Other social engineering tactics – trickery, influencing, deception, spying
Network security breaches
In spite of the fact that WebRTC is a peer-to-peer streaming framework, a signaling server is always required to perform the initial handshake and enable the exchange of SDP so that media can stream peer to peer. Some well-known attacks that compromise networks and remote / cloud servers are:
Viruses, worms and Trojan horses
Zero-day attacks
Hacker attacks
Denial of service attacks
Spyware and adware
It is up to the WebRTC / VoIP service provider to detect emerging threats before they infiltrate the network and compromise data. Some critical components for enhancing security are firewalls, access control lists, intrusion detection and prevention systems (IDS/IPS) and virtual private networks (VPN).
Governance Framework – defines the roles, responsibilities and accountability of each person and ensures that you are meeting compliance.
Confidentiality: ensures information is inaccessible to unauthorized people, e.g., via encryption
Integrity: protects information and systems from being modified by unauthorized people; provides accuracy and trustworthiness
Availability: ensures authorized people can access the information when needed and that all hardware and software are maintained properly and updated when necessary
Authentication, Authorization and Accountability (AAA): validates users' authenticity via credentials, enforces policies on network resources after the user has gained access, and ensures accountability by monitoring and capturing the events generated by the user
Non-repudiation: the assurance that someone cannot deny the validity of something. It provides proof of the origin of the data and the integrity of the data.
As a first defense tactic, if an originating IP address is sending malicious or malformed packets which indicate an exploit or attack, trigger a notification for the tech team and execute a script to block the attacker's origin IP via security groups in AWS or other ACLs on the hosted server. One can also implement a temporary firewall block on it and later monitor it for more violations.
In case a server is compromised beyond repair, such as the attacker taking control of the file system, drain the ongoing sessions from it and store the cached state, including session variables such as CDR entries. Activate the fallback / standby server and turn the current server into a honeypot to explore the attacker's actions. Common attacks involve one of the techniques below:
Exploiting the VoIP system to get free international calls
Ransomware activities, such as copying files out of the server via scp and leaving behind a readme.txt file at the root location asking for a money transfer in return for the data
Brute-force DDoS attacks to bring down the system and make it incapable of catering to genuine requests, perhaps with the intention of giving an advantage to competitors
As the media connections are peer-to-peer, even if we kill the signaling server it will not affect the ongoing media sessions. Only during the time (probably 3–4 minutes) it takes to restart the server will users be unable to reach the signaling server to create new sessions. Therefore, in case a system is under attack and not recoverable, just terminate it and respawn another server attached to the domain name, floating IP or load balancer.
Auto updates
Most browsers today, like Google Chrome and Mozilla Firefox, have a good record of auto-updating themselves within 24 hours of a vulnerability or threat being reported.
Third party Call Control ( 3PCC)
If a call is confirmed to be compromised, it should be within the power of the web application server rendering the WebRTC-capable page to cut off the compromised call session, by a forced termination request to the endpoints or by turning off the TURN services if they are in use.
STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) are protocols that can be used to provide NAT traversal for VoIP and WebRTC. The projects below provide VoIP media traffic NAT traversal servers and gateways.
A TURN server is a VoIP media traffic NAT traversal server and gateway.
This article describes working with some of the TURN servers.
We know that a STUN-only configuration, such as the one below, may not work behind firewalls like those in enterprises or even universities, resulting in black video, one-way video, inconsistent streaming or even no video at all.
// google STUN
var iceservers_array=[{urls: 'stun:stun.l.google.com:19302'}] ;
To overcome this we rely on a publicly available TURN server which can step in and do the ICE exchange to set up relay routes for the media stream. Some of the options, both self-hosted and TURN-as-a-service, are described below.
rfc5766-turn-server
It is a legacy project, mainly archived for reference (links at the end of the section). This is a VoIP gateway for inter-network communication which is open source and MIT licensed.
Platforms supported: any client platform is supported, including Android, iOS, Linux, OS X, Windows, and Windows Phone. This project can be successfully used on other *NIX platforms (e.g., Amazon EC2) too. It supports flat-file or database-based user management (MySQL, PostgreSQL, Redis). The source code project contains the TURN server, a TURN client messaging library and some sample scripts to test various modules like protocol, relay, security etc.
Protocols: protocols between the TURN client and the TURN server – UDP, TCP, TLS, and DTLS. Relay protocols – UDP, TCP.
Authentication: the authentication mechanism uses a key which is calculated over the user name, the realm, and the user password. The key for the HMAC depends on whether long-term or short-term credentials are in use. For long-term credentials, the key is 16 bytes: key = MD5(username ":" realm ":" SASLprep(password))
Installation: I used the Ubuntu Software Center to install the rfc5766-turn-server.
The content gets stored inside /usr/share/rfc5766-turn-server. Also install MySQL for record keeping:
sudo apt-get install mysql-server
Install MySQL Workbench to monitor the values fed into the TURN database in MySQL. Connect to the MySQL instance as shown in the following screenshot.
The database formed with MySQL after successful operation is as follows. Notice that the initial DB is completely empty.
Terminal Commands
These terminal commands (binaries) get stored inside /etc/init.d after installation.
turnadmin – the TURN relay administration tool, used for generating and updating keys and passwords. To generate a key for long-term credentials use the -k option, and to add or update a long-term user use the -a option. A simple command to generate a key looks like the sketch below.
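A sketch of such a command, using the -k option with one of the users shown in the table below; the realm and password values here are placeholders:
turnadmin -k -u altanai -r myrealm.example.com -p mypassword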
We can also check the stored keys using the terminal-based MySQL client:
mysql> use turn;
Database changed
mysql> select * from turnusers_lt;
+------------+----------------------------------+
| name | hmackey |
+------------+----------------------------------+
| altanai | 57bdc681481c4f7626bffcde292c85e7 |
| turnwebrtc | 6066cbe0b5ee14439b2ddfc177268309 |
+------------+----------------------------------+
2 rows in set (0.00 sec)
turnserver – the command to run the TURN server itself. We can start it without any DB support using just turnserver. A screenshot of this is shown below.
We can also use a database like MySQL by starting the server with a DB connection string; a sketch follows.
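A sketch of such an invocation using the project's --mysql-userdb option; the host, database name and credentials here are placeholders:
turnserver --mysql-userdb="host=localhost dbname=turn user=turnuser password=turnpass connect_timeout=30"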
turnutils_uclient: emulates multiple UDP,TCP,TLS or DTLS clients.
turnutils_peer: simple stateless UDP-only “echo” server. For every incoming UDP packet, it simply echoes it back.
turnutils_stunclient: simple STUN client example that implements RFC 5389 ( using STUN as endpoint to determine the IP address and port allocated to it , keep-alive , check connectivity etc) and RFC 5780 (experimental NAT Behavior Discovery STUN usage) .
turnutils_rfc5769check: checks the correctness of the STUN/TURN protocol implementation. This program will perform several checks and print the result on the screen. It will exit with 0 status if everything is OK, and with (-1) if there was an error in the protocol implementation.
Test
1. Test vectors from RFC 5769 to double-check that our STUN/TURN message encoding algorithms work properly. Run the utility to check all protocols :
$ cd examples
$ ./scripts/rfc5769.sh
2. TURN functionality test (bare minimum TURN example).
If everything compiled properly, then the following programs must run together successfully, simulating TURN network routing in local loopback networking environment:
console 1 :
$ ./scripts/basic/relay.sh
console2 :
$ ./scripts/peer.sh
If the client application produces output and in approximately 22 seconds prints the jitter, loss and round-trip delay statistics, then everything is fine.
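With the server running, its address and the long-term credentials created with turnadmin go into the peer connection configuration. A minimal sketch (turn.example.com and the password are placeholders; replace them with your own values):
// TURN entry added alongside the Google STUN server shown earlier
var iceservers_array = [
  { urls: 'stun:stun.l.google.com:19302' },
  {
    urls: 'turn:turn.example.com:3478?transport=udp', // assumed host and default TURN port
    username: 'altanai',                              // long-term user created via turnadmin
    credential: 'mypassword'                          // placeholder password
  }
];
var pc = new RTCPeerConnection({ iceServers: iceservers_array });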
Insert the above piece of code into the peer connection config. Now call from one network environment to another, for example from an enterprise network behind a Wi-Fi router to a WebRTC agent on a public Internet data card. The call should connect with video flowing smoothly between the two.
TURN working across an enterprise firewall on a WebRTC video platform written with the SimpleWebRTC library. Project Tango FX
Project Coturn evolved from the rfc5766-turn-server project with many new advanced TURN specs beyond the original RFC 5766 document. The databases supported are: SQLite, MySQL, PostgreSQL, Redis, MongoDB.
Protocols :
The implementation fully supports the following client-to-TURN-server protocols: UDP, TCP, TLS (SSL3/TLS1.0/TLS1.1/TLS1.2; ECDHE) and DTLS versions 1.0 and 1.2. Supported relay protocols are UDP (per RFC 5766) and TCP (per RFC 6062).
Authentication:
Supported message integrity digest algorithms:
HMAC-SHA1, with MD5-hashed keys (as required by STUN and TURN standards)
HMAC-SHA256, with SHA256-hashed keys (an extension to the STUN and TURN specs)
Install libopenssl and libevent plus their dev or extra libraries. OpenSSL has to be installed before libevent2 for TLS, because when libevent builds it checks whether OpenSSL has already been installed, and its version.
RFC 5766 -Traversal Using Relays around NAT (TURN):Relay Extensions to Session Traversal Utilities for NAT (STUN)
RFC 6062 -Traversal Using Relays around NAT (TURN) Extensions for TCP Allocations
RFC 6156 – IPv6 extension for TURN
RFC 7443 – ALPN support for STUN & TURN
Datagram Transport Layer Security (DTLS) as a transport for Traversal Using Relays around NAT (TURN). It facilitates the resolution of TURN URIs into the IP address and port of TURN servers supporting DTLS as a transport protocol.
Mobile ICE (MICE) – mobility with TURN. It minimizes traffic disruption caused by changing IP addresses during a mobility event by using a shorter network path.
Xirsys is a provider of WebRTC infrastructure, which includes STUN and TURN server hosting as well.
The process of using their services involves signing up for an account and choosing whether you want a paid service capable of handling more calls simultaneously or the free one, handling only up to 10 concurrent TURN connections.
The dashboard appears like this :
To receive the API credentials one needs to make a one-time call to their service, the result of which contains the keys to invoke the TURN services from the WebRTC script.
window.addEventListener("load", function (e) {
let xhr = new XMLHttpRequest();
xhr.onreadystatechange = function ($evt) {
if (xhr.readyState == 4 && xhr.status == 200) {
let res = JSON.parse(xhr.responseText);
console.log("response: ", res);
iceservers_array.push(res.v.iceServers);
alert( iceservers_array);
}
};
xhr.open("PUT", "https://global.xirsys.net/_turn/webrtc", true);
xhr.setRequestHeader("Authorization","Basic " + btoa("altanai:<sec rettoken>"));
xhr.setRequestHeader("Content-Type","application/json");
xhr.send(JSON.stringify({"format": "urls"}));
});
The resulting output should look like the following (my keys are hidden with a red rectangle, of course).
The process of adding a TURN / STUN server to your WebRTC script in JS is as follows:
iceServers:[
{"url":"stun:turn2.xirsys.com"},
{"username":"< put your API username>","url":"turn:turn2.xirsys.com:443?transport=udp","credential":"< put your API credentail>"},
{"username":"< put your API username>","url":"turn:turn2.xirsys.com:443?transport=tcp","credential":"< put your API credentail>"}]
// on page load, fetch ICE servers (including TURN credentials) from our token service
window.addEventListener("load", function (e) {
let xhr = new XMLHttpRequest();
xhr.onreadystatechange = function ($evt) {
if (xhr.readyState == 4 && xhr.status == 200) {
let res = JSON.parse(xhr.responseText);
console.log("response: ", res);
iceservers_array = res.iceServers;
console.log("iceservers_array: ", iceservers_array);
CallSessionBegins();
}
};
xhr.open("POST", "https://<ourdomain>:3000/token", true);
xhr.setRequestHeader("Content-Type","application/json");
xhr.send(JSON.stringify({"format": "urls"}));
});
Asterisk is an open source carrier-grade SIP server which also provides firewall traversal. A GitHub repo containing some Asterisk dialplan examples is https://github.com/altanai/asteriskexamples.
WebRTC: web-based real-time communication is a game changer for real-time communication systems. WebRTC is an open-source, royalty-free, unencumbered browser-based platform using the browser's embedded media application programming interface (API). It allows developers to add custom JavaScript and HTML5 to control the media setup and flow. WebRTC has enabled developers to build apps, sites, widgets, plugins and extensions capable of delivering simultaneous audio, video, data and screen-sharing capability in a peer-to-peer fashion.
Issues across networks: something which escapes our attention is how media traverses the network. Of course, the WebRTC sessions run smoothly when both peers are on the open public Internet without any restrictions or firewall blocks. But the real problem begins when one of the peers is behind a corporate / enterprise network or using a different Internet service provider with some security restrictions. In such a case the normal ICE capability of WebRTC is not sufficient to set up a bidirectional media stream. For network restrictions, what is required is a NAT (Network Address Traversal) mechanism that performs address discovery.
NAT and ICE solution: the STUN and TURN server protocols handle session initiation with handshakes between peers in different network environments. In the case of a firewall blocking a STUN peer-to-peer connection, the system falls back to a TURN server, which provides the necessary traversal mechanism through the NAT.
Network Address Translation provides a mapping of internal to external IP addresses. It modifies the network addresses of packets while they are in transit across a traffic routing node, such as between networks.
A private address on the inside of the NAT is mapped to an external public address. Port address translation (PAT) resolves conflicts that arise when multiple hosts happen to use the same source port number to establish different external connections at the same time.
Some ways to achieve this:
Application Layer Gateway (ALG)
Interactive Connectivity Establishment ( ICE )
UPnP Internet Gateway Device Protocol
Proprietary SIP-based Session Border Controllers, and so on
Let us look at ICE in detail, which is the default mechanism for WebRTC.
ICE (Interactive Connectivity Establishment) candidates play a key role in the process of establishing peer-to-peer connections between devices. These are basically the various routes that can potentially carry the peer-to-peer path for media flow.
The ICE (Interactive Connectivity Establishment) framework (mandatory per WebRTC standards) finds network interfaces and ports in the offer/answer model to exchange network information with the participating communication clients. ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relays around NAT (TURN). ICE is defined by RFC 5245 – Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
Types of ICE candidates:
Host: local IP addresses on the device.
Server Reflexive (STUN): public IP addresses learned via a STUN server.
Peer Reflexive: public addresses discovered through peer communication.
Relayed: IP addresses from a TURN server that relays media traffic between peers.
WebRTC needs the SDP offer to be sent from A to B. Client B uses this SDP offer to generate an SDP answer for A. The SDP (as seen on chrome://webrtc-internals/) includes ICE candidates which map open ports in the firewalls.
However, in case both sides are behind symmetric NATs, the media flow gets blocked. For such a case TURN is used, which provides a public IP and port mapped to the internal IP and port. This relay path provides an alternative routing mechanism, like a packet mirror. It can carry a DTLS connection, which is used to key the SRTP-DTLS media streams.
Full cone NAT: all requests from the same internal IP address and port are mapped to the same external IP address and port. It also allows external hosts to send packets to the internal host by using the mapped external address.
(Address) restricted cone NAT: all requests from the same internal IP address and port are mapped to the same external IP address and port, but external hosts can send packets to the internal host only if the internal host had previously sent a packet to that IP address.
Port restricted cone NAT: all requests from the same internal IP address and port are mapped to the same external IP address and port, but external hosts can send packets to the internal host only if the internal host had previously sent a packet to that IP address and that port.
Symmetric NAT: all requests from the same internal IP address and port, to a specific destination IP address and port, are mapped to the same external IP address and port. Any traffic from the same internal IP and port to a different destination uses a new mapping. Only an external host which receives a packet can send a UDP packet back to the internal host.
To understand this better, consider the various scenarios that determine the NAT mapping behavior. One could run tests using CLI or network analyzer tools and check the XOR-MAPPED-ADDRESS value of the Binding Response message that the client receives.
Mapping behavior
Endpoint-Independent Mapping NAT (EIM-NAT)
Address-Dependent Mapping NAT (ADM-NAT)
Address and Port-Dependent Mapping NAT (APDM-NAT)
Filtering behavior
Endpoint-Independent Filtering NAT (EIF-NAT)
Address-Dependent Filtering NAT (ADF-NAT)
Address and Port-Dependent Filtering NAT (APDF-NAT)
CGNAT
We just saw that NAT is widely used in home networks where multiple devices share a single public IP address. Unlike standard NAT, which is typically used within a home or small business network, CGNAT is deployed by ISPs on a larger scale, typically in the carrier's infrastructure, to manage IP address resources. This lets the carrier keep a smaller pool of public IPv4 addresses. While it lets ISPs scale their infrastructure to handle more users and reduces the need for purchasing large blocks of public IPs, this process creates performance degradation for real-time applications and VPNs, sometimes leading to blackholing of traffic.
CGNAT challenges:
Breaks end-to-end connectivity for peer-to-peer traffic
Limited port range mapping restricts simultaneous connections, leading to port exhaustion
Lack of static IPs restricts hosting
One potential long-term solution to CGNAT limitations is the adoption of IPv6, which provides a vastly larger pool of IP addresses compared to IPv4, allowing every device to have its unique public IP address. Other solutions include using STUN, TURN and ICE as we have learned in this blog.
As long as one end of the connection is able to determine the dynamic association created for the other party by the NAT and send data to it, hole punching can work.
Permissive NAT mapping techniques which map the same internal address/port consistently to an external address/port are suitable for hole punching, such as full cone and address- or port-restricted NAT. However, a pure symmetric NAT has inconsistent, destination-specific port mappings and thus cannot do hole punching.
In that case the media is blocked, resulting in a black / empty / no-video situation for both peers. To combat such a situation a relay mechanism such as TURN is required, which essentially maps public IPs to private IPs, creating an alternative route for media and data to flow through.
WebRTC media flow when peers are behind NAT, using the TURN relay mechanism.
The config file of the TURN server needs to be altered to map the public and private IP. The diagrammatic description of this is as follows, with a config sketch after it:
WebRTC media flow when peers are behind NAT and the TURN server is behind NAT as well. The TURN config file binds a public interface to a private interface address.
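A minimal sketch of such a mapping in a coturn-style turnserver.conf; every address, realm and credential below is a placeholder to be replaced with your own values:
# bind to the private interface and advertise the public address that maps to it
listening-port=3478
listening-ip=10.0.0.5
external-ip=203.0.113.10/10.0.0.5
realm=myrealm.example.com
lt-cred-mech
user=altanai:mypassword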
References :
RFC 3489 – STUN: Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)
RFC 5928 Traversal Using Relays around NAT (TURN) Resolution Mechanism
This blog is a continuation of the attempts, outcomes and problems in building a WebRTC-to-RTP media framework that successfully streams / broadcasts WebRTC content to non-WebRTC-supported browsers (Safari / IE) and media players (VLC).
Attempt 4: Stream the content to a WebRTC endpoint which is hidden in a video call. Pick the stream from the VP8 object URL and send it to a streaming server.
This process involved the following components :
WebRTC API : simplewebrtc on Chrome
Transfer mechanism from client to Streaming server: webrtc media channel
Problems: no streaming server is equipped to handle a direct WebRTC input and stream it onto the network.
Attempt 4.1: Stream the content to a WebRTC endpoint. Do a WebRTC-endpoint-to-RTP-endpoint bridge using Kurento APIs.
Use the RTP port and IP address as input to an ffmpeg, GStreamer or VLC terminal command and output a live H264 stream on another IP and port.
This process involved the following components :
API : Kurento
Transfer mechanism: HTML5 WebRTC client -> application server hosting Java -> media server -> application for WebRTC-media-to-RTP-media conversion -> RTP player
Screenshots of attempts with Wowza to stream RTP from an IP and port
Problems: the stream was black, which means 100% loss.
Lesson learned: RTP is not suitable for over-the-Internet transmission, especially with firewalls.
Attempt 4.2: Build a WebRTC endpoint to HTTP endpoint in Kurento and force the video and audio encoding to be H264 and PCMU.
Code snippet for adding constraints to the output media via the pipeline and forcing the choice of codecs (H264 for video and PCMU for audio):
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
HttpGetEndpoint httpEndpoint=new HttpGetEndpoint.Builder(pipeline).build();
org.kurento.client.Fraction fr= new org.kurento.client.Fraction(1, 30);
VideoCaps vc= new VideoCaps(VideoCodec.H264,fr);
httpEndpoint.setVideoFormat(vc);
AudioCaps ac= new AudioCaps(AudioCodec.PCMU, 65536);
httpEndpoint.setAudioFormat(ac);
webRtcEndpoint.connect(httpEndpoint);
Alternatively, one can opt to use a GStreamer filter to force the output into raw format.
// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();
// adding Gstream filters
GStreamerFilter filter1 = new GStreamerFilter.Builder(pipeline, "videorate max-rate=30").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter2 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=video/x-h264,width=1280,height=720,framerate=30/1").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter3 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=audio/x-mpeg,layer=3,rate=48000").withFilterType(FilterType.AUDIO).build();
// connecting all points to one another
webRtcEndpoint.connect (filter1);
filter1.connect (filter2);
filter2.connect (filter3);
filter3.connect (rtpEndpoint);
// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);
End result: the output is still WebM-based and doesn't work on H264 clients.
Attempt 5: Use an RTP SDP endpoint (i.e., an SDP file valid for a given session) and use it to play the WebRTC media over the Wowza streaming server.
This process involved the following components
WebRTC Stream and object URL of the blob containing VP8 media
Kurento WebRTC Endpoint bridge to generate SDP
Wowza Streaming server
Snippet used for Kurento to generate an SDP file from the WebRTC-to-RTP bridge:
@RequestMapping(value = "/rtpsdp", method = RequestMethod.POST)
private String processRequestrtpsdp(@RequestBody String sdpOffer)
throws IOException, URISyntaxException, InterruptedException {
// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();
// connecting all points to one another
webRtcEndpoint.connect (rtpEndpoint);
// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);
// write the SDP connector to an external file
PrintWriter out = new PrintWriter("/tmp/test.sdp");
out.println(requestRTPsdp);
out.close();
HttpGetEndpoint httpEndpoint = new HttpGetEndpoint.Builder(pipeline).build();
PlayerEndpoint player = new PlayerEndpoint.Builder(pipeline, requestRTPsdp).build();
httpEndpoint.connect(rtpEndpoint);
player.connect(httpEndpoint);
// Playing media and opening the default desktop browser
player.play();
String videoUrl = httpEndpoint.getUrl();
System.out.println(" ------- video URL -------------"+ videoUrl);
// send the response to front client
String responseSdp = webRtcEndpoint.processOffer(sdpOffer);
return responseSdp;
}
End result: Wowza does not recognize the WebRTC SDP and does not play the video.
Screenshot of Wowza with SDP input
Attempt 5.1: Use an RTP SDP endpoint (i.e., an SDP file valid for a given session) and use it to play the WebRTC media in the default Ubuntu media player.
End result: the player does not recognize the WebRTC SDP and plays only deformed media.
Screenshot of playing from an SDP file
Attempt 5.2: Use an RTP SDP endpoint (i.e., an SDP file valid for a given session) and use it to play the WebRTC media over VLC using socket input.
End result: nothing plays.
Screenshot of VLC connected to play from the socket, and its failure to play anything
Attempt 5.3: Create a WebRTC endpoint and connect it to an RTP endpoint via media pipelines. Also make the RTP SDP offer and answer it locally. Play with ffmpeg / ffplay / gst playbin.
Write the requestRTPsdp to a file and obtain an RTP connector endpoint with Application/SDP. It plays okay with gst playbin (10 seconds, without audio). Successful attempt to play from a gst playbin:
gst-launch -vvv playbin uri=file:///tmp/test.sdp
but it refuses to be played by VLC, ffplay and even Wowza. The errors generated:
End result: this results in "Could not find codec parameter for stream1 (video: h263, none)". Other error types are "Could not write header for output file" and "output file is empty, nothing was encoded".
Error screenshots of trying to play the RTP SDP file with ffmpeg
Attempt 6: Use a WebRTC-capable media and streaming server (e.g., Kurento) to pick up a live stream of VP8.
Convert the VP8 to H264 (ffmpeg / RTP endpoint)
Convert the H264 to MP4 using an MP4 parser and pass it to a streaming server (Wowza)
End result: yes, it did work on Mozilla, but with considerable lag.
Update: thankfully, updates to the WebRTC standards mandated support for PCMU and the AVC/H264 Constrained Baseline profile in the browser's media stack, thus removing the need to build a transcoder between WebRTC and non-WebRTC endpoints from scratch.
Video Codecs : RFC 7742 specifies that all WebRTC-compatible browsers must support VP8 and H.264’s Constrained Baseline profile for video.
Audio Codecs : RFC 7874 specifies that browsers must support at least the Opus codec as well as G.711’s PCMA and PCMU formats.
The latest WebRTC specification lists a set of codecs which all compliant browsers (including Chrome 52+, Firefox, Safari and Edge) are required to support. A quick check of what a given browser exposes is sketched below.
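A small sketch using the standard capabilities API; a compliant browser should list VP8 and H264 for video, and opus, PCMU and PCMA for audio:
// list the codec MIME types the browser can send for each kind of media
var videoCodecs = RTCRtpSender.getCapabilities('video').codecs.map(function (c) { return c.mimeType; });
var audioCodecs = RTCRtpSender.getCapabilities('audio').codecs.map(function (c) { return c.mimeType; });
console.log('video codecs:', videoCodecs);
console.log('audio codecs:', audioCodecs);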
References :
RFC7742: WebRTC Video Processing and Codec Requirements
RFC 7874: WebRTC Audio Codec and Processing Requirements
As the title of this article suggests, I am going to pen down my attempts at streaming / broadcasting a live WebRTC video call to non-WebRTC-supported browsers and media players such as VLC , ffplay , the default video player in Linux etc.
Some of the high level architectures for streaming WebRTC video to multiple endpoints can be viewed in the post below.
Aim : I will be attempting to create a lightweight WebRTC to raw/H264 transcoder by making my own media engine which takes input from a WebRTC peerconnection or getUserMedia. I am sharing my past experiments in the hope of helping someone whose objective may be to achieve the same, since many non-WebRTC-supported endpoints ( RPi , kiosks , mobile browsers ) could benefit heavily from WebRTC streaming . Even if your objective is not the same as mine, you may gain some insight into what not to do when making a media transcoder.
Attempt 1 : Use the broadcasting API from webrtc-experiment.com. The broadcast is in one direction only, where the viewers are never asked for their mic / webcam permission .
Problem : The broadcast is for WebRTC browsers only and does not support non-WebRTC players / browsers
Attempt 1.1 : Stream the media directly to nodejs through a websocket
window.addEventListener('DOMContentLoaded', function () {
var v = document.getElementById('v');
navigator.getUserMedia = (navigator.getUserMedia ||
navigator.webkitGetUserMedia ||
navigator.mozGetUserMedia ||
navigator.msGetUserMedia);
if (navigator.getUserMedia) {
// Request access to video only
navigator.getUserMedia(
{
video: true,
audio: false
},
function (stream) {
var url = window.URL || window.webkitURL;
v.src = url ? url.createObjectURL(stream) : stream;
v.play();
var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
waitForSocketConnection(ws, function () {
console.log(" url.createObjectURL(stream)-----", url.createObjectURL(stream))
ws.send(stream);
console.log("message sent!!!");
});
},
function (error) {
alert('Something went wrong. (error code ' + error.code + ')');
return;
}
);
} else {
alert('Sorry, the browser you are using doesn\'t support getUserMedia');
return;
}
});
//Make the function wait until the connection is made...
function waitForSocketConnection(socket, callback) {
setTimeout(
function () {
if (socket.readyState === 1) {
console.log("Connection is made")
if (callback != null) {
callback();
}
return;
} else {
console.log("wait for connection...")
waitForSocketConnection(socket, callback);
}
}, 5); // wait 5 milliseconds before re-checking the connection
}
Problem : The video arrives only as a buffer object and does not play
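A MediaStream object itself cannot be serialised over a websocket, which is why the snippet above fails. In current browsers the usual workaround is to let MediaRecorder emit encoded webm chunks and ship those instead; a minimal sketch, assuming the same ws://localhost:3000 echo-protocol server as above :
navigator.mediaDevices.getUserMedia({ video: true, audio: false }).then(function (stream) {
    var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
    var recorder = new MediaRecorder(stream, { mimeType: 'video/webm; codecs=vp8' });
    // every second MediaRecorder hands us an encoded webm chunk (a Blob)
    recorder.ondataavailable = function (event) {
        if (event.data.size > 0 && ws.readyState === WebSocket.OPEN) {
            ws.send(event.data); // Blobs are sent as binary WebSocket frames
        }
    };
    ws.onopen = function () { recorder.start(1000); }; // timeslice = 1000 ms
});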
Attempt 2 : Record the WebRTC media ( 5 secs each ) into chunks of webm format -> transfer them to the other end -> append the chunks together like a regular file
This process involved the following components :
Recorder Javascript library : RecordJs
Transfer mechanism : record using RecordRTC.js -> send to the media server at the other end -> stitch the small webm files together into a big one at runtime and play
Programs :
Code for video recorder
navigator.getUserMedia(videoConstraints, function (stream) {
video.onloadedmetadata = function () {
video.width = 320;
video.height = 240;
var options = {
type: isRecordVideo ? 'video' : 'gif',
video: video,
canvas: {
width: canvasWidth_input.value,
height: canvasHeight_input.value
}
};
recorder = window.RecordRTC(stream, options);
recorder.startRecording();
};
video.src = URL.createObjectURL(stream);
}, function () {
if (document.getElementById('record-screen').checked) {
if (location.protocol === 'http:')
alert('https is mandatory to capture screen.');
else
alert('Multi-capturing of screen is not allowed. Have you enabled the flag: "Enable screen capture support in getUserMedia"?');
} else
alert('Webcam access is denied.');
});
Code for video append-er
var FILE1 = '1.webm';
var FILE2 = '2.webm';
var FILE3 = '3.webm';
var FILE4 = '4.webm';
var FILE5 = '5.webm';
var NUM_CHUNKS = 5;
var video = document.querySelector('video');
window.MediaSource = window.MediaSource || window.WebKitMediaSource;
if (!!!window.MediaSource) {
alert('MediaSource API is not available');
}
var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);
function callback(e) {
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
    var files = [FILE1, FILE2, FILE3, FILE4, FILE5];
    // Fetch chunk i, append it, and wait for the append to complete before moving on.
    (function readChunk_(i) {
        GET(files[i - 1], function (uInt8Array) {
            sourceBuffer.addEventListener('updateend', function onUpdate() {
                sourceBuffer.removeEventListener('updateend', onUpdate);
                if (i == NUM_CHUNKS) {
                    mediaSource.endOfStream();
                } else {
                    if (video.paused) {
                        video.play(); // Start playing after 1st chunk is appended.
                    }
                    readChunk_(i + 1);
                }
            });
            sourceBuffer.appendBuffer(uInt8Array);
        });
    })(1); // Start the recursive call with the first chunk.
}
mediaSource.addEventListener('sourceopen', callback, false);
mediaSource.addEventListener('webkitsourceopen', callback, false);
mediaSource.addEventListener('webkitsourceended', function (e) {
logger.log('mediaSource readyState: ' + this.readyState);
}, false);
// function to get the video via XHR
function GET(url, callback) {
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.responseType = 'arraybuffer';
xhr.send();
xhr.onload = function (e) {
if (xhr.status != 200) {
alert("Unexpected status code " + xhr.status + " for " + url);
return false;
}
callback(new Uint8Array(xhr.response));
};
}
Shortcomings of this approach
The webm files failed to play on most of the media players
The recorder can only record either a video or an audio file at a time .
Attempt 2 ( continued ) : Chunking and media proxy
Since the previous approach failed to work for non-WebRTC endpoints , the next iteration was to channel the WebRTC media via a nodejs server, thus disrupting the peer-to-peer media stream in favour of a centralized / proxied media stream. This would enable me to obtain raw media packets from the stream using low level C based VP8 decoder libraries and then re-encode them to H264 or other media formats suitable for the endpoints .
In theory the media could be re-encoded using the openH264 library and the frames then pushed to players, roughly as in the illustrative pseudocode below ( the real MSE addSourceBuffer() accepts only a MIME type string, so this is a sketch of the idea, not working code ) :
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs=vp9',
    new VP9Decoder());
let buffer = await loadBuffer();
sourceBuffer.appendBuffer(buffer);
Further extending it for uncompressed video :
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/raw; codecs=yuv420p');
for (const p of demuxPackets()) {
    let frame = await codec.decode(p);
    sourceBuffer.appendBuffer(frame);
}
At least that was the plan .
Attempt 2.1 : Record the WebRTC media ( 5 secs each ) into chunks of webm format ( RecordRTC.js ) -> use the Kurento JS client ( kws-media-api.js ) to make an HTTP endpoint for the recorded webm files -> append the chunks together like a regular file at runtime
// UI elements
function getByID(id) {
return document.getElementById(id);
}
var recordAudio = getByID('record-audio'),
recordVideo = getByID('record-video'),
stopRecordingAudio = getByID('stop-recording-audio'),
stopRecordingVideo = getByID('stop-recording-video'),
broadcasting = getByID('broadcasting');
var canvasWidth_input = getByID('canvas-width-input'),
canvasHeight_input = getByID('canvas-height-input');
var video = getByID('video');
var audio = getByID('audio');
// Audio video constraints
var videoConstraints = {
audio: false,
video: {
mandatory: {},
optional: []
}
};
var audioConstraints = {
audio: true,
video: false
};
// Recording and stop recording - to be converted into real time capture and chunking
const ws_uri = 'ws://localhost:8888/kurento';
var URL_SMALL = "http://localhost:8080/streamtomp4/approach1/5561840332.webm";
var audioStream;
var recorder;
recordAudio.onclick = function () {
if (!audioStream)
navigator.getUserMedia(audioConstraints, function (stream) {
if (window.IsChrome) stream = new window.MediaStream(stream.getAudioTracks());
audioStream = stream;
audio.src = URL.createObjectURL(audioStream);
audio.muted = true;
audio.play();
// "audio" is a default type
recorder = window.RecordRTC(stream, {
type: 'audio'
});
recorder.startRecording();
}, function () {
});
else {
audio.src = URL.createObjectURL(audioStream);
audio.muted = true;
audio.play();
if (recorder) recorder.startRecording();
}
window.isAudio = true;
this.disabled = true;
stopRecordingAudio.disabled = false;
};
Recording and stopping the recording of video into small media files ( chunks )
recordVideo.onclick = function () {
recordVideoOrGIF(true);
};
stopRecordingAudio.onclick = function () {
this.disabled = true;
recordAudio.disabled = false;
audio.src = '';
if (recorder)
recorder.stopRecording(function (url) {
audio.src = url;
audio.muted = false;
audio.play();
document.getElementById('audio-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Audio URL</a>';
});
};
function recordVideoOrGIF(isRecordVideo) {
navigator.getUserMedia(videoConstraints, function (stream) {
video.onloadedmetadata = function () {
video.width = 320;
video.height = 240;
var options = {
type: isRecordVideo ? 'video' : 'gif',
video: video,
canvas: {
width: canvasWidth_input.value,
height: canvasHeight_input.value
}
};
recorder = window.RecordRTC(stream, options);
recorder.startRecording();
};
video.src = URL.createObjectURL(stream);
}, function () {
if (document.getElementById('record-screen').checked) {
if (location.protocol === 'http:')
alert('https is mandatory to capture screen.');
else
alert('Multi-capturing of screen is not allowed. Capturing process is denied. Have you enabled the flag: "Enable screen capture support in getUserMedia"?');
} else
alert('Webcam access is denied.');
});
window.isAudio = false;
if (isRecordVideo) {
recordVideo.disabled = true;
stopRecordingVideo.disabled = false;
} else {
recordGIF.disabled = true;
stopRecordingGIF.disabled = false;
}
}
stopRecordingVideo.onclick = function () {
this.disabled = true;
recordVideo.disabled = false;
if (recorder)
recorder.stopRecording(function (url) {
video.src = url;
video.play();
document.getElementById('video-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Video URL</a>';
});
};
Broadcasting the chunks to media engine
function onerror(error) {
console.log(" error occured");
console.error(error);
}
broadcasting.onclick = function () {
var videoOutput = document.getElementById("videoOutput");
KwsMedia(ws_uri, function (error, kwsMedia) {
if (error) return onerror(error);
// Create pipeline
kwsMedia.create('MediaPipeline', function (error, pipeline) {
if (error) return onerror(error);
// Create pipeline media elements (endpoints & filters)
pipeline.create('PlayerEndpoint', {uri: URL_SMALL}, function (error, player) {
if (error) return console.error(error);
pipeline.create('HttpGetEndpoint', function (error, httpGet) {
if (error) return onerror(error);
// Connect media element between them
player.connect(httpGet, function (error, pipeline) {
if (error) return onerror(error);
// Set the video on the video tag
httpGet.getUrl(function (error, url) {
if (error) return onerror(error);
videoOutput.src = url;
console.log(url);
// Start player
player.play(function (error) {
if (error) return onerror(error);
console.log('player.play');
});
});
});
// Subscribe to HttpGetEndpoint EOS event
httpGet.on('EndOfStream', function (event) {
console.log("EndOfStream event:", event);
});
});
});
});
}, onerror);
}
Problem : dissecting the live video into small files and appending them to each other on reception is an expensive , time and resource consuming process . It also involves heavy buffering and other problems pertaining to real-time streaming .
Attempt 2.2 : Send the recorded chunks of webm to a port on linux server. Use socket programming to pick up these individual files and play using VLC player from UDP port of the Linux Server
End Result : Small file containers play, but slow buffering makes this approach unsuitable for streaming file chunks and appending them as a single file.
Attempt 2.3 : Send the recorded chunks of webm to a port on a linux server socket . Use socket programming to pick up these individual webm files and convert them to H264 format so that they can be sent to a media server.
This process involved the following components :
Recorder Javascript library : RecordJs
Transfer mechanism : WebRTC endpoint -> call handler ( record in chunks ) -> ffmpeg / gstreamer to put it on RTP -> streaming server like wowza -> viewers
Programs : HTML webpage with a Websocket connection -> nodejs program to write content from the websocket to a linux socket -> nodejs program to read that socket and print the content on the console
Snippet to transfer the webm recorded files over a websocket to the nodejs program
// Make the function wait until the connection is made.
function waitForSocketConnection(socket, callback) {
setTimeout(
function () {
if (socket.readyState === 1) {
console.log("Connection is made")
if (callback != null)
callback();
} else {
console.log("wait for connection...")
waitForSocketConnection(socket, callback);
}
}, 5); // wait 5 milliseconds before re-checking the connection
}
function previewFile() {
var preview = document.querySelector('img');
var file = document.querySelector('input[type=file]').files[0];
var reader = new FileReader();
reader.onloadend = function () {
preview.src = reader.result;
console.log(" reader result ", reader.result);
var video = document.getElementById("v");
video.src = reader.result;
console.log(" video played ");
var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
waitForSocketConnection(ws, function () {
ws.send(reader.result);
console.log("message sent!!!");
});
}
if (file) {
// converts to base64 encoded string of the file data
//reader.readAsDataURL(file);
reader.readAsBinaryString(file);
} else {
preview.src = "";
}
}
Program for the Linux socket sender, which creates the socket for the webm files in nodejs
var net = require('net');
var fs = require('fs');
var socketPath = '/tmp/tfxsocket';
var http = require('http');
var stream = require('stream');
var util = require('util');
var WebSocketServer = require('ws').Server;
var port = 3000;
var serverUrl = "localhost";
var socket;
/*----------http server -----------*/
var server = http.createServer(function (request, response) {});
server.listen(port, serverUrl);
console.log('HTTP Server running at ', serverUrl, port);
/*------websocket server ----------*/
var wss = new WebSocketServer({server: server});
wss.on("connection", function (ws) {
console.log("websocket connection open");
ws.on('message', function (message) {
console.log(" stream recived from broadcast client on port 3000 ");
var s = require('net').Socket();
s.connect(socketPath);
s.write(message);
console.log(" send the stream to socketPath", socketPath);
});
ws.on("close", function () {
console.log("websocket connection close")
});
});
Program for the Linux socket listener using nodejs . Here the Unix domain socket is at /tmp/mysocket
var net = require('net');
var client = net.createConnection("/tmp/mysocket");
client.on("connect", function() {
console.log("connected to mysocket");
});
client.on("data", function(data) {
console.log(data);
});
client.on('end', function() {
console.log('server disconnected');
});
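Note that both nodejs snippets above open client connections; for data to actually flow, one side has to own and listen on the Unix domain socket. Below is a minimal sketch of such a listener using net.createServer, assuming the same /tmp/tfxsocket path that the sender connects to :
var net = require('net');
var socketPath = '/tmp/tfxsocket';
// create a server that owns the Unix domain socket and logs whatever arrives on it
var server = net.createServer(function (connection) {
    console.log('sender connected on', socketPath);
    connection.on('data', function (chunk) {
        console.log('received', chunk.length, 'bytes');
    });
    connection.on('end', function () {
        console.log('sender disconnected');
    });
});
server.listen(socketPath, function () {
    console.log('listening on', socketPath);
});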
Output 1 : video buffer displayed
Output 2 : payload from the video displayed, which shows the pipeline works but there is no playable output yet.
ffmpeg command for transferring the content from the socket to a UDP IP and port ( <output format> is a placeholder for the actual container format ) :
ffmpeg -i unix://tmp/mysocket -f <output format> udp://192.168.0.119:8083
Problems with this approach : the video merely passed through the socket and contained no usable information when we tried to play it or inspect it on the console.
Attempt 3 : Use an existing media engine like Kurento to do the transcoding for me
Send the live WebRTC stream from a Kurento WebRTC endpoint to a Kurento HTTP endpoint, then play it using the Mozilla VLC web plugin
So I haven’t written anything worthy in a while , just published some posts that were lying around in my drafts . Here I write about the main thing : something awesome that I was trying to accomplish in the last quarter .
TFX Sessions is a plug and play platform for VoIP ( voice over IP ) scenarios. Intrinsically it is a very lightweight API package shipped in the form of a Chrome extension . It is a turn-key solution for when parties want instant audio/video communication without any sign-in , plugin installation or additional downloads . Additionally, TFX Sessions is packaged with some interesting plugins which give the communicating parties the interactive and immersive experience of a face to face meeting.
There is a market requirement for an utterly simple WebRTC API that has everything needed to build bigger aggregate projects, but the available solutions are either just too basic or much too complex . So I initially started writing my own getUserMedia APIs, but left that midway and picked up the simplewebrtc API instead for want of time . Then I focused on the main crux of the project, which was the widget API and ease of integration.
How Does TFX Sessions Work ?
The signalling channel establishes the session using the offer-answer model
The browser’s media APIs , like getUserMedia and PeerConnection , are used for media flow
A widget is essentially any web project that wants communication over a WebRTC channel . Once the platform was ready I had the core APIs , widgets and a signalling server. Then came the subject of enterprise networks blocking my communication stream , which is where TURN comes in ( Coturn in my case ). A rough sketch of the offer-answer flow is shown below.
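This is not the actual broplug code, but a rough sketch of that offer-answer flow on the caller side with a bare RTCPeerConnection and a socket.io signalling channel; the server URL and the 'offer' / 'answer' / 'candidate' event names are assumptions for illustration :
var socket = io('https://signaller.example.com');   // hypothetical signalling server, socket.io client assumed loaded
var pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });

navigator.mediaDevices.getUserMedia({ audio: true, video: true }).then(function (stream) {
    stream.getTracks().forEach(function (track) { pc.addTrack(track, stream); });
    return pc.createOffer();
}).then(function (offer) {
    return pc.setLocalDescription(offer);
}).then(function () {
    socket.emit('offer', pc.localDescription);       // send the SDP offer over the signalling channel
});

pc.onicecandidate = function (e) { if (e.candidate) socket.emit('candidate', e.candidate); };
socket.on('answer', function (answer) { pc.setRemoteDescription(answer); });
socket.on('candidate', function (c) { pc.addIceCandidate(c); });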
Screenshots : TFX create / join room ; TFX startup screen
Components of TFX
Client side components of TangoFX :
broplug API : in-house master library for TangoFX that makes up the TFX Sessions platform . It masks the low level WebRTC and socket.io functions and provides simple-to-use handles for developing interesting plugins on the platform .
simplewebrtc : provides WebRTC support , peer configuration , wildemitter , utils , event emitters , JSON formatter , websocket , socket namespace , transport , XHR and the like.
socket.io.js : exports and listeners for real time, event based, bidirectional communication.
jQuery : JS library for client side scripting.
Bootstrap : HTML and CSS based design templates for typography, forms, buttons, navigation and other interface components.
Server Side components
Signalling Server : signalmaster , a socket.io server for WebRTC signalling.
TURN : Coturn , TURN protocol based media traversal for connecting media across restricting domains, ie firewalls, network policies etc .
redis : data structure store used to maintain current and lost sessions.
Architecture
So here is the final architecture of the TFX chrome extension widget based platform .
The client side contains widgets , chrome extension APIs , chrome’s WebRTC APIs , the socket.io client for signalling , HTML5 , jQuery , Javascript , and CSS for styling.
The server side of the solution contains the socket.io server for signalling , manuals and other help/support materials , the HTTPS certificate and the TURN server implementation for NAT traversal .
TFX platform server and client components : WebRTC media and socket.io communication , built as a Chrome extension
Salient Features
The underlying technology of TangoFX is WebRTC with socket based signalling . It also adheres to the latest standards of W3C , IETF and ITU on internet telephony .
TangoFX sessions is extremely scalable and flexible due to the abstraction between communication and service development. This makes it a piece of cake for any web developer using the TangoFX interface to add his/her own service easily and quickly without diving into the nitty-gritty .
TangoFX is currently packaged as a chrome extension supported on the chrome browser on desktop operating systems like Windows , Mac , Linux etc .
The call is private to both parties as it is peer-to-peer, meaning that the media / information exchanged by the parties over TangoFX does not pass through an intervening server as in other existing internet calling solutions.
TangoFX is very adaptive to slow internet and can be used across all kinds of networks, from corporate to public, without being affected by firewalls or restricting policies .
TFX Widget Screens
Alright, so that’s there . Tada, the platform is alive and kicking . It is right now in beta stage, with intensive testing going on . Here are some screenshots from my own developer version .
Widget screenshots : TFX recording widget ; TFX face detection and overlay widget ; TFX multilingual communication ; TFX screen-sharing ; TFX video filters ; TFX audio visualizer ; TFX text messaging widget ; TFX cross domain access ( flicker ) ; TFX draw widget ; TFX code widget ( supports many programming languages ) ; TFX WebRTC dynamic stats ; TFX introduction widget
Note that the widgets described above have been made with the help of third party APIs.
TFX Sessions Summary
We saw that TFX is a WebRTC based communication and collaboration solution . It is built on open standards from W3C , IETF , Google etc. : scalable and customizable , with an immersive and interactive experience and an easy-to-build widgets framework using the TangoFX APIs.
Steps for building and deploying a WebRTC solution. Step 1 : Pick a WebRTC API and run it locally ( ie open 2 browsers and run on the local machine )
Step 2 : Use a cloud server and different client browsers
Now what good is it doing anyone if it is running locally on my machine with addresses like localhost and 127.0.0.1 ? Let us put it on the cloud and at least let my colleagues / friends enjoy it : a cloud web server and a nodejs signalling server . Amazon’s EC2 works for most of the people most of the time .
Steps for building and deploying a WebRTC solution, step 2 : Put the server on the cloud and the WebRTC clients on different machines
Here is when we discover the issues of ICE ( Interactive Connectivity Establishment ). I have mentioned this in detail in the post NAT Traversal using STUN and TURN . Briefly, ICE helps us cope with NAT ( Network Address Translation ) and firewalls .
Note that this step only works if everyone you want to connect to is either on the same intranet or on the public internet without any UDP blocks / firewalls / restrictions .
As we try to connect 2 WebRTC clients from different machines and different networks, we find that the network address from the client’s OS and network card fails to connect to the signalling server due to firewall issues or other network policies . We therefore use a STUN server to map the private IP to a publicly accessible IP that helps complete the signalling .
The signalling is established using a STUN server for address mapping and NAT traversal . One can use google’s default STUN server stun.l.google.com:19302 . Easy and free .
Steps for building and deploying a WebRTC solution, step 2.1 : Put the server on the cloud and the WebRTC clients on different machines + STUN for address discovery ( NAT traversal )
There you go, everything is looking good from here now : both peers join the session successfully , but the video may appear black . This is because under most inter-network conditions the media fails to flow between private and public networks .
This is where step 3 comes into picture ie using a TURN ( media relay ) server .
Step 3 : TURN server to call people in an inter-network fashion
Sure, the architecture I have set up is bound to work wherever the network is open and public . However, connectivity errors, errors in the console and blank video are the problems that might appear when one tries to connect from private to public networks .
To bypass network firewalls , corporate net policies , UDP blocks and filters we require a TURN server, which helps in media traversal across different networks via a relay mechanism.
2. Build your own TURN server with RFC 5766 ( COTURN ) , or, rather easier, use any open source TURN server code available on Github.
3. Pay for and use a commercial TURN service provider, or even use their trial version to see if things work out for you ( example : Xirsys ) .
Remember, you can use any TURN service ; it does not affect your WebRTC API functionality . All we need to do is add it to the peerconnection configuration, like so :
peerConnectionConfig: {
    iceServers: [
        { "url": "<stun server address>" },
        { "username": "xx", "url": "<turn server address transport=udp>", "credential": "yy" },
        { "username": "xx", "url": "<turn server address transport=tcp>", "credential": "yy" }
    ]
},
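For reference, the same servers are handed to the standard RTCPeerConnection constructor ( the current spec spells the field urls, while older adapter libraries accepted url ) ; the addresses below are placeholders :
var pc = new RTCPeerConnection({
    iceServers: [
        { urls: 'stun:stun.example.org:3478' },                                              // placeholder STUN server
        { urls: 'turn:turn.example.org:3478?transport=udp', username: 'xx', credential: 'yy' },
        { urls: 'turn:turn.example.org:3478?transport=tcp', username: 'xx', credential: 'yy' }
    ]
});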
There we go ; now anyone from anywhere should be able to use our WebRTC setup for making audio and video calls or just exchanging data via the DataChannel ( like screen-sharing , file transfer , messages , playing games , collaborative office work etc ) .
Steps for building and deploying a WebRTC solution : TURN based media relay for WebRTC traffic
The setup covers scenarios wherein the user is on an office corporate network , home network or mobile network : no problem, as long as he / she has a WebRTC enabled browser ( read Chrome , Mozilla , Opera ) .
It is noteworthy that, ideally, voice should traverse over TCP while video and data can go over UDP ; however, unless restrained, the WebRTC APIs self-determine the best protocol to route the packets / stream .
Debug helper
Common issues around media playback
DOMException: The play() request was interrupted by a new load request
webrtcdevelopment_min.js:1 [Violation] Only request notification permission in response to a user gesture.
Read more about best WebRTC frameworks and code in this book
In the last few months, I have been keenly tracking how the course for WebRTC is turning out. In my opinion, it is an incredible game-changer and a market disrupter for the telco industry plagued by licensed codecs and heavy call session control software.
Contrary to my expectations, the fundamental holes in the WebRTC specification are still the same, with less being done to fill them – desktop sharing only via a Chrome extension and media compatibility with the desktop-popular H264 being prominent obstacles to adoption. Of course, now there exists an abundance of interactive use-cases for WebRTC APIs, from gaming to telemedicine. However, none of the applications is complete and standalone, since each uses a new gateway to connect to its existing platform or service.
As new WebRTC SDKs and open-source platforms surface, many seem to be wrapping around the same old WebRTC functions ( getUserMedia, DataChannel and PeerConnection ) with few or no add-ons. I am listing some popular working ones in this blog, but there still exists no concrete, stable, reliable guide to set up the backbone network ( yes, I am referring to media interconversion, relay, TURN, STUN servers ). It is left to a telecom software engineer/developer to find and figure out the best integrations to configure session handling and PSTN or desktop application compatibility.
Some commercial off-the-shelf service providers have begun to extend interconnecting gateways ( SBCs ) from their backends for Web-Javascript based WebRTC implementations, but there are concerns about end to end encryption and media management as the media passes via a transcoding media server and many points of relay. This in my opinion completely defeats the objective of WebRTC’s peer-to-peer communication, which by design is supposed to be independent of a centralised server setup. WebRTC was meant to enable *everything you can’t do with proprietary communication tools and networks*.
Well, moving on , here are some nice API implementations of WebRTC ( only WebSockets, no SIP over WebSockets ) which can be quickly used by web developers to create a peer to peer media session between web endpoints via a WebRTC supported browser web page.
1.appRTC
Neat process of setting up offer-answer and SDP . Notice the relay candidate gathering .
Session Description ( SDP ) for the WebRTC peer with audio / video codecs and other session specifications such as bitrate , framerate , codec profile , RTP specs etc.
No over the top media control which is good as media flows end to end here without any centralized media server .
2. talky.io
Also related to SimpleWebRTC, which is a lightweight MIT licensed library providing wrappers around the core WebRTC API to support application building while hiding the lower level peerconnection and session management from the developer.
Similar offer-answer handshake and SDP exchange.
There seems to be better noise control management, which could also be my browser acting on my fluctuating network bandwidth .
3. tokbox
More control from the UI on media settings, which is provided as part of getUserMedia in the WebRTC specification. Read more about the WebRTC APIs in my other writeup .
Similar to the previous WebRTC session ( totally dependent on my own network and CPU ) , independent of any third party control .
As my peer’s network degraded , my WebRTC session automatically adjusted based on RTCP feedback to send lower resolutions and framerates.
4. webrtcdevelopment
Open source MIT licensed library to spin up WebRTC calls quickly and easily from a Chrome supported browser . It is a fork of the best features from multiple libraries such as apprtc , webrtcexperiments and simplewebrtc, and is maintained by an open community of users including me.
The WebRTC media stack is explained in the following companion articles : echo cancellation and noise suppression ; WebRTC audio and video codecs ; multiple media streams and the Unified Plan ( msid ) ; streaming / broadcasting a live WebRTC call to non-WebRTC browsers and media players ; and media stream tracks ( capture, resolution, orientation, audio levels ).
We know the power of the Internet protocol suite as it takes on the world of telecom. Already half of all communication has been transferred from legacy telecom signaling protocols like SS7 to IP based communication ( Skype, Hangouts, WhatsApp, Facebook Messenger, Slack, Microsoft Teams ). TV service providers too are largely investing in IP based systems like SIP and IMS to deliver their content over the telecom’s IP based ( packet-switched ) network.
IPTV
A consumer today wants HD media content anytime, anywhere. Traditional TV solutions just don’t match up to those expectations anymore. The IPTV provider of today must invest in delivering content that is media-aware and device-aware. Not only that, it should be personal, social and interactive and enhance the user experience.
A few popular applications for IPTV solution developers :
Menu overlay with detailed descriptions of channels , categories , programs , movies
Record and replay, also referred to as timeshift : it allows a user to pause , resume and record a show in his absence and view it later
Video on demand, which concerns paying for and viewing music albums , movies etc on demand
Live streaming of events such as a presidential speech , a tennis match etc .
Applications that can be built around the IPTV context :
Record and Playback content
Information overlay on streaming content
Social networking services integrated with IPTV content
Parental control to view , monitor and control in real time what your child is watching on the IPTV
Watch the surveillance footage from IP cameras anywhere
Real time communication on IPTV with advanced features like call continuity , content sync .
IPTV over IMS core
Key Components for IPTV service includes
IPTV client such as a set top box (STB) or smart TV which is capable of processing SIP packets to initiate and terminate SIP streams
SIP is also used for channel switching
SIP proxy server which can route messages between IPTV clients and media server
Media server that can deliver RTP/RTSP streams
RTSP manages playback (play/pause/seek), while RTP delivers video
IMS core which can employ its node for QoS, billing and other SIP based processes
IPTV multicast
In this case a single stream is sent to multiple users simultaneously. It uses UDP multicast ( IGMP ) to efficiently distribute live TV. However, this requires multicast-enabled routers ( PIM, IGMP snooping ).
Since one stream serves thousands of viewers, this approach suits very scalable media streams such as live sports or concerts. Also, in contrast to unicast IPTV, this approach avoids duplication on network links, since viewers watching a single stream are simply added to a group.
IPTV unicast vs IPTV multicast :
Unicast : 1-1 flow of media ; scales with horizontal scaling of the media server ; employs CDN based edge caching and prebuffering for fast access ; better for personalized ads / targeted content, with more user control such as rewind.
Multicast : 1:many flow of media, where the server responds with a multicast group and the STB joins the IGMP multicast group to receive the stream.
Mobile TV is an advanced service that lets users download and play songs and videos, send text messages, conference with colleagues and friends, and exchange pictures or videos on whichever device they are using.
Mobile TV Service platform bundles different types of content (live TV, VoD, podcast, etc.) into IP service streams and selects the transmission bearer depending on the targeted audience.
broadcast (DVB-H or DVB-SH based)
unicast (2G, 3G, etc.)
The DVB-H standard optimizes broadcast transmission for hand-held, low power devices.
Components
bearer selection
Electronic Service Guide (ESG) to list the available programs and contents.
Streaming servers
Video Encoders
rich media service node
Add-on
recommendation engine
personalized advertising
audience measurement
Content and subscriber management
(+) use existing cellular networks
radio sites and antennas enhanced with broadcast repeaters
Video on demand using adaptive multirate dispatch via a transcoder. This enables flexible playback over unicast streams.
Components
HTML5 webpage on the chrome browser for WebRTC input : contains the client side script to record the video and send the blob to the server side for processing.
Amazon EC2 instance : the EC2 instance hosts the web interface for login and video recording . It converts the incoming blob / webm format to mp4 . After the video conversion it uploads the mp4 to the S3 bucket.
Amazon S3 bucket : the S3 bucket is connected to the transcoder via a pipeline and holds the video storage as well.
Amazon transcoder : the transcoder provides adaptive multirate dispatch on viewer access .
RDS for any mysql storage ( optional ) : optional record keeping for credentials, storage links, linked information etc.
WebRTC has been applied in the basic communication sector with overwhelming results. However, the capability to stream media is not limited to communication : it can be applied to stream multimedia content from streaming servers as well. This section describes the application of WebRTC in IPTV ( IP Television ), VOD ( Video on Demand ) and online FM ( audio from radio stations online ) applications, all without the need to download plugins or any additional installation of third party products.
An architecture representing some of these use cases in the context of WebRTC based media transmission is depicted below.
References:
RFC 3515 The Session Initiation Protocol (SIP) Refer Method
In order to enable gradual deployment of voice communication over IP networks, 3GPP enabled service continuity, which defines that : when a party is in an area with a good quality IP network, the UA establishes voice communication using SIP ; when moving to an area with an IP network of insufficient quality, the SIP session(s) are transferred to a traditional circuit switched call.
This ensures an uninterrupted user experience of a service when a UE ( User Equipment ) changes its network access or location. A similar call / service continuity principle can be applied to SIP phone and WebRTC calls.
WebRTC is an evolving technology which promises a simplified communication platform and stack for developers and a hassle-free experience for users. It has the potential to provide in-context calls routed to the best personnel in service scenarios. Real time mapping of the caller’s IP , location and source metadata can be used so that the IVR is eliminated. Such a complete collaboration tool is possible through WebRTC, which is easy to set up and requires no installation, no plugins and no downloads. Extremely secure, WebRTC can interoperate with existing VoIP, video conferencing and even the PSTN. The only concern is the integration with the legacy PSTN and telco environment.
In the present age of IP telephony, when telecom convergence is the big thing all around the world, the need of the hour is to enable fixed and mobile Service Providers ( SPs ) to monetize the subscriber’s phone by extending it to new web based services. SPs can offer a WebRTC communicator endpoint that uses the same phone number as the subscriber’s fixed or mobile phone. Advanced features enable calls to be transferred between fixed-line, mobile and WebRTC endpoints.
Position of WebRTC on Network protocol stack.
GSM is incompatible with the WebRTC media stream due to legacy codecs ; even if the WebRTC UA were to support these codecs, the signalling translation would be a difficult feat. Signalling is used for subscriber mobility, subscriber registration, call establishment, etc. Mobile Application Part (MAP), Base Station System Application Part (BSSAP), Direct Transfer Application Part (DTAP) and ISDN User Part (ISUP) are some of the protocols making up GSM. In my opinion some of the ways to integrate WebRTC with a GSM backend could be :
Develop a GSM-to-IP interworking component and integrate it with GSM network components ( like the BTS ).
Integrate the solution with H.323 based VoIP ( Voice over IP ) components like gatekeepers and gateways/PBXs to provide a complete voice/data network solution.
Use the telco service provider’s SIP trunk , if available ; this is the easiest way to connect to such backend communication systems.
A interface – connection between MSC and BSC
Abis interface – connection between BSC and BTS
D interface – connection between MSC and HLR
Um interface – radio connection between MS and BTS
GPRS/UMTS Mobile Network with WebRTC
A GPRS/UMTS mobile network can be made compatible with WebRTC via data based communication on the GPRS gateway.
An LTE network using the Evolved Packet Core can communicate with WebRTC using realtime transcoding and SIP ( Session Initiation Protocol ) endpoints connected to the core IMS. An ICE server provides the reflexive IP addresses that the WebRTC implementation needs ; the signaling gateway converts the WebRTC webapp’s communication into SIP/IMS signaling and the media relay converts the WebRTC media framing into the telco conformant representation.
Interworking between a WebRTC enabled browser and an IMS based telco backend
A session is established so that the web app sends an initial INVITE, including an SDP offer for the “outgoing” stream, to the gateway. The signalling gateway will reserve the resources from the media relay in both directions. Consequently, the signalling gateway will send an SDP answer to the initial INVITE and create an SDP offer of its own. This SDP offer is carried in a SIP UPDATE. Once the media between the web app and media relay is set, the session will progress towards IMS and will be handled like any other session. At this point, the media relay has mapped two unidirectional “web app streams” into one bidirectional “IMS stream” and will forward all media between the two. The mapping is done for both audio and video streams, meaning that we are able to support both audio and video calls between WebRTC and Telco clients and conferences.
WebRTC bypasses many limitations of earlier p2p ( peer-to-peer media ) streaming frameworks, such as NAT traversal. It opens avenues for innovative cross-platform use cases such as healthcare, service technicians on call, retail and financial communications, phone payments and insurance claims. Other applications, such as unified communication and collaboration, are applicable to sales, CRM, remote education etc.
Transfer of a mobile call to a WebRTC session
SPs can offer 3rd party WebRTC endpoints access to the user’s phone number and subscription , e.g. enable web applications such as Facebook, Amazon or Netflix to allow their users to make/receive calls or messages directly from the web applications.
Revenue Streams :
Monthly fee for access to WebRTC endpoints and for receiving calls from 3rd party WebRTC endpoints
One time upgrade fee for accessing the web service integration with the telecom network, like a plan upgrade
Brownie points
No software is required to be downloaded on the subscriber’s computer, tablet or mobile phone
No desktop support required for the service provider
Plans For Consumer Customers:
Subscribers can use the WebRTC endpoints on their computers, tablets or mobile phones as a fixed-line device at home, as a desktop solution when away from home and to avoid international tolls when traveling
Subscribers can connect their web services (e.g. Websites , Facebook, Amazon, Netflix) to their fixed or mobile services subscriptions using their SP-provided phone number
Call continuity for SIP based UCaaS or PBX Enterprise Customers:
Enterprises can deploy a WebRTC endpoint for their employees that provides a single corporate communications endpoint that can be connected to any of the corporation’s UC/PBX and Call Recording systems. Then a verified enterprise user can use the WebRTC endpoint as their office phone at work, home or when traveling
Connects to all leading UC/PBX and Recording platforms simultaneously
Enterprises can deploy a single WebRTC endpoint across all their UC/PBX and Recording platforms
Easy for IT departments to deploy – no software is required to be downloaded to employees’ computers, tablets or mobile phones
Enables corporate policies and features from the WebRTC endpoint including
This post is about communication from an application to WebRTC using web services, for instance showing advertisements on the WebRTC interface before p2p streaming or even during it. Advertisements could be an overlay or a multiplexed stream.
WebRTC + Advertisement Engine
HTTP and XML are the basis for web services. The WebRTC engine, in addition to media stream processing and transmission , typically requires support for mixing streams from different sources. It would also need NAT/FW traversal support for media streams.
Approach 1 : Direct Play
The third party advertisement engine plays the media file containing the advertisement directly to the caller and callee in their browsers
Approach 2 : Multiplexed Play
Use multiplexing to embed the advertisement in the media content of the caller or callee using external media processing
Components of the advertisement engine that interact with the WebRTC engine :
WSDL / Web Services Description Language : specifies the location of the service and the operations ( or methods ) the service exposes. An XML-based language for describing web services.
SOAP / Simple Object Access Protocol : an XML based protocol for accessing web services.
UDDI / Universal Description, Discovery and Integration : a directory service where companies can search for web services. UDDI is described in WSDL and communicates via SOAP.
RDF / Resource Description Framework : a framework for describing resources on the web, written in XML.
Logically, the communication service provider or the platform hosting the WebRTC engine needs to define the functional split between the WebRTC browser call and the third party webservice with which the WebRTC engine is interacting .
Other use cases of WebRTC with web services can offer application components like currency conversion, weather reports, or even language translation as services.
The MediaStreamTrack interface typically represents a stream of data of audio or video and a MediaStream may contain zero or more MediaStreamTrack objects.
The objects RTCRtpSender and RTCRtpReceiver can be used by the application to get more fine grained control over the transmission and reception of MediaStreamTracks.
Figures : media flow in a VoIP system ; media flow in a WebRTC call
WebRTC compatible browsers are required to support white balance , light level and autofocus from the video source .
Video Capture Resolution
The minimum WebRTC video attributes, unless specified otherwise in the SDP ( Session Description Protocol ), are 20 FPS and a resolution of 320 x 240 pixels.
Mid-stream resolution changes are also supported, such as when the screen source comes from desktop sharing .
SDP attributes for resolution, frame rate, and bitrate
SDP allows for codec-independent indication of preferred video resolutions using a=imageattr to indicate the maximum resolution that is acceptable.
The sender must limit the encoded resolution to the indicated maximum size, as the receiver may not be capable of handling higher resolutions.
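As an illustration ( the payload type 96 and the sizes are made up for the example ), such an SDP line could look like :
a=imageattr:96 send [x=1280,y=720] recv [x=640,y=480]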
Dynamic FPS control based on actual hardware encoding
The video capture source should adjust the frame rate according to low bandwidth , poor light conditions and the hardware supported rate, rather than force a higher FPS .
Stream Orientation
Support generating the R0 and R1 bits of the Coordination of Video Orientation (CVO) mechanism and sharing them with the peer.
Audio level for speech transmission should be normalized to avoid users having to manually adjust the playback and to facilitate mixing in conferencing applications.
Normalization considers frequencies above 300 Hz, regardless of the sampling rate used. It can be adapted to avoid clipping, either by lowering the gain to a level below -19 dBm0 or through the use of a compressor.
GAIN calculation
If the endpoint has control over the entire audio-capture path, like a regular phone, the gain should be adjusted in such a way that an average speaker would have a level of 2600 (-19 dBm0) for active speech.
If the endpoint does not have control over the entire audio-capture path, as with a software endpoint, then the endpoint SHOULD use automatic gain control (AGC) to dynamically adjust the level to 2600 (-19 dBm0) +/- 6 dB.
For music- or desktop-sharing applications, the level SHOULD NOT be automatically adjusted, and the endpoint SHOULD allow the user to set the gain manually.
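Browsers expose these processing stages as getUserMedia audio constraints, so an application that wants manual gain ( eg for music or desktop-sharing, as above ) can ask for AGC to be switched off. A minimal sketch using the standard constraint names :
// ask for audio without automatic gain control; echo cancellation and noise suppression stay on
navigator.mediaDevices.getUserMedia({
    audio: {
        autoGainControl: false,
        echoCancellation: true,
        noiseSuppression: true
    }
}).then(function (stream) {
    var applied = stream.getAudioTracks()[0].getSettings();
    console.log('autoGainControl actually applied:', applied.autoGainControl);
});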
Media plane adaptation is done at the SBC for network carried media, it should be done for all network hosted media services which face peer-to-peer media.
The high-level architecture elements of WebRTC media streams consist of :
Encryption, RTP multiplexing, support for ICE
Audio – interworking of differing WebRTC and legacy codec sets
Video – Use of VP8, Support for H.264
Data – Support of MSRP ( RCS standard for messaging over DataChannel API)
Direct connection to media servers and media gateways.
Use a common codec set wherever possible to eliminate transcoding . Use regionalized transcoding where a common codec is not available . Real-time video transcoding is expensive and impacts performance.
On-going standards/device/network work needs to be done to expand the common codec set. WebRTC codec standards have not been finalized yet ; the WebRTC target is to support royalty free codecs within its standards.
Codec support comparison :
Audio : WebRTC – G.711, Opus ; legacy – G.711, AMR, AMR-WB (G.722.2)
Audio ( extended ) : WebRTC – none ; legacy – G.729a[b], G.726
Video : WebRTC – VP8 ; legacy – H.264/AVC
Supporting common codecs between VoLTE devices and WebRTC endpoints requires one or more of the following:
Support of WebRTC codecs on 3GPP/GSMA
Support of 3GPP/GSMA codecs on WebRTC
WebRTC browser support of codecs native to the device
After considerable time ( 10 minutes in my case ) the quality of the media stream adjusts to network conditions, and the variations ( peaks and dips ) flatten out.
Screenshots : stream quality after some time has passed
The DataChannel API of WebRTC allows bidirectional communication of arbitrary data between peers. It uses the same API as WebSockets and has very low latency.
(+) DataChannel is p2p and is also end to end encrypted, leading to higher privacy
(+) built-in security due to p2p transfer
(+) higher throughput than text transfer via a messaging server
(+) lower latency as p2p transfer takes the shortest route
SCTP is the protocol that opens connections for peer to peer data channel support in WebRTC. It can be configured for reliability and ordered delivery, and it provides flow and congestion control for the data messages. A sketch of these per-channel options is shown below.
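A minimal sketch of those per-channel choices with the standard createDataChannel() options ( the channel labels are just examples ) :
var pc = new RTCPeerConnection();
// reliable, ordered channel (TCP-like semantics over SCTP) - the default
var fileChannel = pc.createDataChannel('file-transfer');
// unordered channel that never retransmits (UDP-like, lowest latency)
var gameChannel = pc.createDataChannel('game-state', { ordered: false, maxRetransmits: 0 });
gameChannel.onopen = function () { gameChannel.send(JSON.stringify({ x: 1, y: 2 })); };
gameChannel.onmessage = function (e) { console.log('peer state:', e.data); };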
WebRTC changes bitrate , resolution and framerate dynamically to accommodate network conditions, policy constraints or user equipment capability. The higher the bitrate, the higher the media quality.
Bitrate of audio codecs
Lossy formats :
– iLBC ( narrowband ) 13.33, 15.20 kbit/s
– iSAC ( wideband ) 10–52 kbit/s
– GSM-EFR 12.2 kbit/s
– AAC 8–529 kbit/s ( stereo )
– AMR-WB (G.722.2) 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 kbit/s
– Opus 6–510 kbit/s
(-) higher bitrate consumes more bandwidth
(-) can cause congestion on the network route
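Beyond what the codec itself allows, an application can cap the sending bitrate explicitly through RTCRtpSender.setParameters(). A minimal sketch, assuming an existing RTCPeerConnection pc and an example cap of 500 kbit/s :
var sender = pc.getSenders().find(function (s) { return s.track && s.track.kind === 'video'; });
var params = sender.getParameters();
if (!params.encodings || !params.encodings.length) params.encodings = [{}];
params.encodings[0].maxBitrate = 500 * 1000; // bits per second, example value only
sender.setParameters(params).then(function () {
    console.log('video send bitrate capped at 500 kbit/s');
});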
This post deals with some lesser known real world implications of developing and integrating WebRTC with a telecom service provider’s network and bringing the solution into action . The regulatory and legal constraints are brought to light only after the product is in action and are mostly the result of short-sightedness . The following is a list of factors that must be kept in mind while a WebRTC solution is in its development stages :
WebRTC services from a telecom provider depend on the access technology, which may differ if the user accesses the network through a third party Wi-Fi hotspot.
The user/network type may also dictate whether decryption of the media is possible/required.
For peer-to-peer paths, media could be extracted through the use of network probes or other methodologies.
Then there are other considerations for specific services : for example, if WebRTC is used to create softphone software permitting users to receive or originate calls to the PSTN, the current view is to treat this as a fully interconnected VoIP service subject to all the rules that apply to the PSTN, regardless of the technologies employed.
CALEA
The Communications Assistance for Law Enforcement Act (CALEA) is a United States wiretapping law passed in 1994, during the presidency of Bill Clinton.
The CALEA requirements for an LTE user may be very different from the CALEA requirements for a user accessing the network through a third party Wi-Fi hotspot.
For media going through the SBC, CALEA may use a design similar to existing CALEA designs.
CALEA intercept infrastructure
Read more on WebRTC security here, which discusses SOP ( same origin policy ) , CORS ( cross origin requests ) , JSONP , ICE , location sharing , screensharing , long term access to camera and microphone , SRTP and DTLS, as well as best practices for secure communication.
VoIP and WebRTC platform security largely depends on the underlying protocols such as SIP . SIP is a robust and time-tested VoIP protocol for facilitating VoIP calls . To learn more, read about SIP security against common attacks .
This post describes the requirements for creating a SIP phone application on Android using the same codecs as WebRTC ( PCMA , PCMU , VP8 ) . In my project concerning the demonstration of WebRTC interoperability ( presence , audio / video call , messaging ) with a native Android client , I had to develop a lightweight Android SIP application , customized for the look and feel of the WebRTC web application . This also enables added services to the WebRTC client, such as geolocation , visual voice mail , phonebook and call control options, to be set from the Android application as well .
Aim :
Android WebRTC-SIP client development , using the sipml5 stack implemented through web services and native Android programming .
Software Used:
⦁ Eclipse IDE
⦁ Java SE Development Kit 7.0
⦁ Android SDK
Tasks :
⦁ Authorization of a user, based on his/her credentials (Database local to the application).
⦁ Navigation drawer on the home page which shows a menu giving the user various options like:
⦁ View Home Page
⦁ View Contact List
⦁ View/Edit My Profile
⦁ View My Location
⦁ Sign Out
⦁ Phonebook sync : importing the contact list of the Android phone into the application ; editing the user profile with values like user name , password , domain.
⦁ Inclusion of a web view in the application, which currently opens the desired webpage ( http://sipml5.org/call.htm ).
⦁ Geolocation : showing a marker for the current location of the user in Google Maps and displaying the address of the user in a toast message.
⦁ Audio / Video call capability
figure 1 : Login page , figure 2 : Call page , Figure 3 : Menu bar
Future Roadmap:
⦁ Connecting the application to a database which sits on the cloud.
⦁ Based on the entries in the database the user will be able to:
⦁ Login to the application.
⦁ View or edit his/her details in the My Profile section.
⦁ Understanding the code of sample applications for making SIP calls from Android OS like:
⦁ SipDroid
⦁ SipDemo
⦁ IMSDroid
⦁ Modifying the existing application to be able to make SIP calls like one of the apps listed above.
Modules :
Development Done:
Development of an authorization page connecting the application to a local database from which values are inserted and retrieved.
Development of a navigation drawer where additional options for the application will be displayed, making it a user friendly application.
Development Planned:
1.Connectivity to a cloud database.
2. App engine on cloud.
3. Importing contacts from the phone address book .
4. Offline storage of profile details and a few call logs .