WebRTC, SIP, IMS, VoLTE, SaaS, SBC, REST, Cloud, IoT, Media Streams
Category: WebRTC Media Stack
Management of the media plane in WebRTC-based communication. It involves the media engine and codecs, and also covers interactive WebRTC media applications such as streaming and recording.
Echo is the sound of your own voice reverberating. If the amplitude of such a sound is high and the delay exceeds 25 ms, it becomes disruptive to the conversation. Echo can be acoustic or hybrid. Echo cancellers need to eliminate the echo while preserving call quality and without disrupting tones such as DTMF.
It is usually background or reflected noise: undesired voiceband energy that transfers from the speaker to the microphone and into the communication network, mostly in hands-free sets or speakerphones. In a multiparty call it can also occur due to unmatched volume levels, challenging network conditions at one party, background noise, double talk, or the proximity between user and microphone.
In a public telephone system, local loop wiring is done using two-wire connections carrying bidirectional voice signals. In a PBX, a two-to-four wire conversion is done using a hybrid circuit, which never achieves a perfect impedance match and therefore produces hybrid echo.
An efficient echo canceller should cancel the entire echo tail without introducing packet loss. It needs to adapt to changing IP network bandwidth, and the algorithm should work equally well in conference scenarios where there may be more than one echo source. Benchmarking metrics such as MOS (Mean Opinion Score) are used to gauge the results. Voice quality enhancement technologies, such as noise suppression and automatic gain control (AGC), are often integrated into AEC modules as well.
Codecs handle the media stream's compression and decompression. For peers to exchange media successfully, they need a common set of codecs to agree upon for the session. The candidate codecs are exchanged as part of the offer and answer (the SDP in SIP).
WebRTC provides containerless, bare MediaStreamTrack objects; the codecs for these tracks are not mandated by the WebRTC API itself. Instead, the codecs are specified by two separate RFCs:
RFC 7874, WebRTC Audio Codec and Processing Requirements, specifies at least the Opus codec as well as G.711's PCMA and PCMU formats.
RFC 7742, WebRTC Video Processing and Codec Requirements, specifies support for VP8 and H.264's Constrained Baseline profile for video.
In WebRTC, media is protected using Datagram Transport Layer Security (DTLS) and the Secure Real-time Transport Protocol (SRTP). In this article we are going to discuss the audio/video codec processing requirements only.
WebRTC is free and open source, and its working bodies promote royalty-free codecs. The RTCWEB working group and the IETF ensure that royalty-free codecs are mandatory, while other codecs remain optional for WebRTC non-browsers.
WebRTC browsers MUST implement the VP8 video codec as described in RFC 6386 and H.264 Constrained Baseline as described in RFC 7742.
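A minimal sketch in browser JavaScript (assuming an environment where RTCRtpSender.getCapabilities() is available) to check that these mandated codecs show up in the local sending capabilities:
// query the codecs this browser can send and check the mandated ones are present
const videoCodecs = RTCRtpSender.getCapabilities('video').codecs;
const audioCodecs = RTCRtpSender.getCapabilities('audio').codecs;
console.log(videoCodecs.some(c => c.mimeType === 'video/VP8'));  // expected: true
console.log(videoCodecs.some(c => c.mimeType === 'video/H264')); // expected: true
console.log(audioCodecs.some(c => c.mimeType === 'audio/opus')); // expected: true
console.log(audioCodecs.some(c => c.mimeType === 'audio/PCMU')); // expected: true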
Most of the codecs below follow a lossy, discrete cosine transform (DCT) based algorithm for encoding. A sample SDP offer from Chrome v80 for Linux includes these profiles:
AVC's Constrained Baseline (CBP) profile, which is compliant with WebRTC.
Proprietary, patented codec, maintained by MPEG / ITU.
Constrained Baseline Profile Level 1.2 and H.264 Constrained High Profile Level 1.3. Constrained Baseline is a subset of the Main profile, suited to low-delay, low-complexity use and to lower-powered processing devices such as mobile video.
Multiview Video Coding – can carry multiple views of the same scene, such as stereoscopic video.
Other profiles, which are not supported, are Baseline (BP), Extended (XP), Main (MP), High (HiP), Progressive High (ProHiP), High 10 (Hi10P), High 4:2:2 (Hi422P) and High 4:4:4 Predictive.
supported containers are 3GP, MP4, WebM
Parameter settings:
packetization-mode
max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br
sprop-parameter-sets: H.264 allows sequence and picture information to be sent both in-band and out-of-band. WebRTC implementations must signal this information in-band.
Supplemental Enhancement Information (SEI) "filler payload" and "full frame freeze" messages (used during video switching in MCU streams)
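For illustration, a typical H.264 media description in a Chrome-generated offer carries these parameters in rtpmap/fmtp lines (the payload type and values here are representative examples, not copied from a specific capture):
m=video 9 UDP/TLS/RTP/SAVPF 102
a=rtpmap:102 H264/90000
a=fmtp:102 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f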
Already used for video conferencing on PSTN (Public Switched Telephone Networks), RTSP, and SIP (IP-based videoconferencing) systems.
suited for low bandwidth networks
(-) not compatible with WebRTC
However, many media gateways include real-time transcoding between H.263-based SIP systems and VP8-based WebRTC ones to enable video communication between them.
H.265 / HEVC
proprietary format and is covered by a number of patents. Licensing is managed by MPEG LA .
Container – Mp4
Interoperability between non-WebRTC-compatible and WebRTC-compatible endpoints
With the rise of the Internet of Things, many endpoints, especially IP cameras connected to Raspberry Pi-like SoCs (systems on chip), want to stream directly to the browser within their own private network, or even over the public network using TURN / STUN.
The figure below shows how such a call flow is possible between an IP camera (such as a baby cam) and the parent monitoring it from a WebRTC-supported mobile phone browser. The process involves streaming the content from the IoT device over RTSP and transcoding in real time between H.264 and VP8.
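As a rough sketch, the transcoding leg from the camera's H.264 RTSP feed into a WebRTC-friendly VP8/Opus container could be done with ffmpeg along these lines (the RTSP URL and output path are placeholders, not from a real deployment):
ffmpeg -i rtsp://<camera-ip>/stream -c:v libvpx -b:v 1M -c:a libopus -f webm /tmp/out.webm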
Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music, and it supports multiple compression algorithms.
Constant and variable bitrate encoding – 6 kbit/s to 510 kbit/s
frame sizes – 2.5 ms to 60 ms
sampling rates – 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced).
containers – Ogg, WebM, MPEG-TS, MP4
As an open format standardized through RFC 6716, a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.
(+) flexible, suited for both speech (via SILK) and music (via CELT)
(+) support for mono and stereo
(+) in-built FEC (Forward Error Correction), thus resilient to packet loss
(+) compression adjustability for unpredictable networks
(-) highly CPU intensive (unsuitable for embedded devices like the Raspberry Pi)
(-) processing and memory intensive
For all cases where the endpoint is able to process audio at a sampling rate higher than 8 kHz, the W3C recommends that Opus be offered before PCMA/PCMU.
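A minimal sketch of honouring that preference in browser JavaScript, assuming RTCRtpTransceiver.setCodecPreferences() is available: reorder the audio codec list so Opus entries come before PCMU/PCMA in the generated offer.
const pc = new RTCPeerConnection();
const transceiver = pc.addTransceiver('audio');
const codecs = RTCRtpSender.getCapabilities('audio').codecs;
// move the Opus entries to the front, keep everything else behind them
const preferred = [
  ...codecs.filter(c => c.mimeType === 'audio/opus'),
  ...codecs.filter(c => c.mimeType !== 'audio/opus')
];
transceiver.setCodecPreferences(preferred);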
AAC (Advanced Audio Coding)
Part of the MPEG-4 standard. Lossy compression, but with a number of profiles suiting each use case, from high-quality surround sound down to low-fidelity audio for speech-only use.
supported containers – MP4, ADTS, 3GP
G.711 (PCMA and PCMU)
G.711 is an ITU standard (1972) for audio compression. It is primarily used in telephony.
The ITU published Pulse Code Modulation (PCM) with either µ-law or A-law companding; it is vital for interfacing with the standard telecom network and carriers. G.711 PCM (A-law) is known as PCMA and G.711 PCM (µ-law) is known as PCMU (a rough companding sketch follows this list).
It is the required standard in many voice-based systems and technologies, for example in H.320 and H.323 specifications.
Fixed 64 kbps bit rate
supports 3GP container formats
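As a rough illustration of µ-law companding (continuous formula only; real G.711 uses a segmented 8-bit approximation of this curve), a hypothetical helper might look like:
// compress a normalised sample x in [-1, 1] with µ = 255
function muLawCompress(x, mu = 255) {
  return Math.sign(x) * Math.log(1 + mu * Math.abs(x)) / Math.log(1 + mu);
}
console.log(muLawCompress(0.5)); // quiet samples keep proportionally more resolution than loud ones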
G.722
ITU standard (1988), encoded using Adaptive Differential Pulse Code Modulation (ADPCM), which is suited for voice compression.
7 kHz wideband audio codec
Bitrate 48, 56 and 64 kbit/s.
containers used 3GP, AMR-WB
G.722 improves speech quality thanks to a wider speech bandwidth of 50–7000 Hz, compared to G.711's 300–3400 Hz.
Comfort noise (CN)
Artificial background noise used to fill gaps in a transmission instead of pure silence. It prevents jarring silence and RTP timeouts.
Should be used for streams encoded with G.711 or any other supported codec that does not provide its own CN. Use of Discontinuous Transmission (DTX) / CN by senders is optional
Internet Low Bitrate Codec (iLBC)
An open-source narrowband speech codec for VoIP and streaming audio.
8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames.
Defined by IETF RFCs 3951 and 3952.
Internet Speech Audio Codec (iSAC)
iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. It is designed for voice transmissions which are encapsulated within an RTP stream.
16 kHz or 32 kHz sampling frequency
adaptive and variable bit rate of 12 to 52 kbps.
Speex
A patent-free audio compression format designed for speech, and also a free software speech codec used in VoIP applications and podcasts. It may be considered obsolete, with Opus as its official successor.
AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality. This codec is generally available on mobile phones.
wider speech bandwidth of 50–7000 Hz.
The data rate is between 6.6 and 23.85 kbit/s.
DTMF and ‘audio/telephone-event’ media type
Endpoints may send DTMF events at any time and should suppress in-band dual-tone multi-frequency (DTMF) tones, if any.
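A small sketch of sending such events via the RTCDTMFSender attached to an existing audio sender (pc is assumed to be an established RTCPeerConnection):
const audioSender = pc.getSenders().find(s => s.track && s.track.kind === 'audio');
if (audioSender && audioSender.dtmf) {
  // sends RFC 4733 telephone-events rather than in-band tones
  audioSender.dtmf.insertDTMF('1234#', 100, 70); // digits, tone duration (ms), inter-tone gap (ms)
}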
Until recently, a customised or proprietary extension could signal multiple media streams within one m= section of an SDP and experiment with the media-level "msid" (Media Stream Identifier) attribute, used to associate RTP streams described in different media descriptions with the same MediaStreams. However, with the transition to Unified Plan, such applications will experience breaking changes.
The previous SDP format implementation, called "Plan B", was transitioned to "Unified Plan" in 2019.
Who does it affect?
Applications that use multiple media tracks within an m= line in SDP, such as a video stream and screen sharing simultaneously
Applications that munge SDP, or use MCUs or SFUs
Applications that use the track-based APIs addTrack, removeTrack, and sender.replaceTrack, or the legacy addStream / removeStream APIs, which exposed senders and receivers to edit tracks and their encoding parameters
Who does it not affect?
This does not affect any application which has only a single audio and video track.
Multiple media streams may be required for cases such as a video and a screen-share stream in the same SDP, or in specific SFU cases.
With Plan B, a single "m=" section of SDP is used per media type, and within the video m= section multiple "a=ssrc" lines are listed for the multiple media tracks.
In Unified Plan, every media track is assigned to a separate "m=" section. Hence, for video and screen sharing simultaneously, two video m= sections are created.
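A hedged sketch of how that looks from the application side: adding a camera track and a screen-share track produces two m=video sections in the offer (sdpSemantics was a Chrome-specific switch during the transition period; Unified Plan is now the default):
async function offerTwoVideoTracks() {
  const pc = new RTCPeerConnection({ sdpSemantics: 'unified-plan' });
  const cam = await navigator.mediaDevices.getUserMedia({ video: true });
  const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
  pc.addTrack(cam.getVideoTracks()[0], cam);
  pc.addTrack(screen.getVideoTracks()[0], screen);
  const offer = await pc.createOffer();
  console.log(offer.sdp.match(/^m=video/gm).length); // expected: 2, one m= section per track
}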
Interoperability between Unified Plan and Plan B
A mismatch in SDP semantics (between Plan B and Unified Plan) usually results in the following:
If a Unified Plan client receives an offer generated by a Plan B client, the Unified Plan client must reject the offer with a failed setRemoteDescription() error.
If a Plan B client receives an offer generated by a Unified Plan client, only the first track in every "m=" section is used and the other tracks are ignored.
This blog is a continuation of the attempts, outcomes and problems in building a WebRTC-to-RTP media framework that can successfully stream / broadcast WebRTC content to non-WebRTC-supported browsers (Safari / IE) and media players (VLC).
Attempt 4: Stream the content to a WebRTC endpoint which is hidden in a video call. Pick the stream from the VP8 object URL and send it to a streaming server.
This process involved the following components :
WebRTC API : simplewebrtc on Chrome
Transfer mechanism from client to streaming server: WebRTC media channel
Problems: No streaming server was able to handle a direct WebRTC input and stream it onto the network.
Attempt 4.1: Stream the content to a WebRTC endpoint. Bridge the WebRTC endpoint to an RTP endpoint using Kurento APIs.
Use the RTP port and IP address as input to an ffmpeg, GStreamer or VLC command and output a live H.264 stream on another IP address and port.
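A rough form of such a command (the SDP file path, target IP and port are placeholders, not from a working setup) might be:
ffmpeg -protocol_whitelist file,udp,rtp -i /tmp/test.sdp -c:v libx264 -preset ultrafast -f mpegts udp://<target-ip>:<port>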
This process involved the following components :
API : Kurento
Transfer mechanism: HTML5 WebRTC client -> application server hosting Java -> media server -> application for WebRTC-to-RTP media conversion -> RTP player
Screenshots of attempts with Wowza to stream RTP from an IP and port
Problems: The stream was black, which means 100% loss.
Lesson learned: RTP is not suitable for transmission over the Internet, especially through firewalls.
Attempt 4.2: Build a WebRTC endpoint to HTTP endpoint in Kurento and force the video and audio encoding to be H.264 and PCMU.
Code snippet for adding constraints to the output media via the pipeline and forcing the choice of codecs (H.264 for video and PCMU for audio).
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
HttpGetEndpoint httpEndpoint=new HttpGetEndpoint.Builder(pipeline).build();
org.kurento.client.Fraction fr= new org.kurento.client.Fraction(1, 30);
VideoCaps vc= new VideoCaps(VideoCodec.H264,fr);
httpEndpoint.setVideoFormat(vc);
AudioCaps ac= new AudioCaps(AudioCodec.PCMU, 65536);
httpEndpoint.setAudioFormat(ac);
webRtcEndpoint.connect(httpEndpoint);
Alternatively, one can opt to use a GStreamer filter to force the output into raw format.
// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();
// adding Gstream filters
GStreamerFilter filter1 = new GStreamerFilter.Builder(pipeline, "videorate max-rate=30").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter2 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=video/x-h264,width=1280,height=720,framerate=30/1").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter3 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=audio/x-mpeg,layer=3,rate=48000").withFilterType(FilterType.AUDIO).build();
// connecting all points to one another
webRtcEndpoint.connect(filter1);
filter1.connect(filter2);
filter2.connect(filter3);
filter3.connect(rtpEndpoint);
// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);
End result: the output is still WebM based and does not work on H.264 clients.
Attempt 5: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media over the Wowza streaming server.
This process involved the following components
WebRTC Stream and object URL of the blob containing VP8 media
Kurento WebRTC Endpoint bridge to generate SDP
Wowza Streaming server
Snippet used in Kurento to generate an SDP file from the WebRTC-to-RTP bridge
@RequestMapping(value = "/rtpsdp", method = RequestMethod.POST)
private String processRequestrtpsdp(@RequestBody String sdpOffer)
throws IOException, URISyntaxException, InterruptedException {
// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();
// connecting all points to one another
webRtcEndpoint.connect (rtpEndpoint);
// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);
// write the SDP connector to an external file
PrintWriter out = new PrintWriter("/tmp/test.sdp");
out.println(requestRTPsdp);
out.close();
HttpGetEndpoint httpEndpoint = new HttpGetEndpoint.Builder(pipeline).build();
PlayerEndpoint player = new PlayerEndpoint.Builder(pipeline, requestRTPsdp).build();
httpEndpoint.connect(rtpEndpoint);
player.connect(httpEndpoint);
// Playing media and opening the default desktop browser
player.play();
String videoUrl = httpEndpoint.getUrl();
System.out.println(" ------- video URL -------------"+ videoUrl);
// send the response to front client
String responseSdp = webRtcEndpoint.processOffer(sdpOffer);
return responseSdp;
}
End result: Wowza does not recognize the WebRTC SDP and does not play the video.
screenshot of wowza with SDP input
Attempt 5.1: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media with the default Ubuntu media player.
End result: the player does not properly recognize the WebRTC SDP; the video plays as deformed media.
screenshot of playing from a SDP file
Attempt 5.2: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media in VLC using socket input.
End result: nothing plays.
screenshot of VLC connected to play from socket and failure to play anything
Attempt 5.3: Create a WebRTC endpoint and connect it to an RTP endpoint via media pipelines. Also make the RTP SDP offer and answer it on the same endpoint. Play with ffmpeg / ffplay / gst playbin.
Write the requestRTPsdp to a file and obtain an RTP connector endpoint with Application/SDP. It plays okay with gst playbin (10 seconds, without audio). Successful attempt to play from gst playbin:
gst-launch -vvv playbin uri=file:///tmp/test.sdp
but it refuses to be played by VLC, ffplay and even Wowza. The errors generated:
End result: this results in "Could not find codec parameters for stream 1 (video: h263, none)". Other error types are "Could not write header for output file" and "output file is empty, nothing was encoded".
Error screenshots of trying to play the RTP SDP file with ffmpeg
Attempt 6: Use a WebRTC-capable media and streaming server (e.g. Kurento) to pick up a live stream of VP8.
Convert the VP8 to H264 ( ffmpeg / RTP endpoint )
Convert H264 to Mp4 using MP4 parser and pass to a streaming server ( wowza)
End result: yes, it did work on Mozilla, but with considerable lag.
Update: Thankfully, updates to the WebRTC standards mandated support for PCMU and the AVC/H.264 CB profile in the browser media stack, thus removing the need to build a transcoder between WebRTC and non-WebRTC endpoints from scratch.
Video Codecs : RFC 7742 specifies that all WebRTC-compatible browsers must support VP8 and H.264’s Constrained Baseline profile for video.
Audio Codecs : RFC 7874 specifies that browsers must support at least the Opus codec as well as G.711’s PCMA and PCMU formats.
The latest WebRTC specification lists a set of codecs which all compliant browsers are required to support; this includes Chrome (52+), Firefox, Safari and Edge.
References :
RFC 7742: WebRTC Video Processing and Codec Requirements
RFC 7874: WebRTC Audio Codec and Processing Requirements
As the title of this article suggests, I am going to pen my attempts at streaming / broadcasting a live WebRTC video call to non-WebRTC-supported browsers and media players such as VLC, ffplay, the default video player in Linux, etc.
Some of the high-level architectures for streaming WebRTC video to multiple endpoints can be viewed in the post below.
Aim: I will be attempting to create a lightweight WebRTC to raw/H.264 transcoder by making my own media engine which takes input from a WebRTC peer connection or getUserMedia. I am sharing my past experiments in the hope of helping someone whose objective may be the same, since many non-WebRTC-supported endpoints (RPi, kiosks, mobile browsers) could benefit heavily from WebRTC streaming. Even if your objective is not the same as mine, you may gain some insight into what not to do when making a media transcoder.
It uses the API from webrtc-experiment.com. The broadcast is in one direction only, where the viewers are never asked for their mic / webcam permission.
Problem: the broadcast is for WebRTC browsers only and does not support non-WebRTC players / browsers.
Attempt 1.1: Stream the media directly to Node.js through a WebSocket
window.addEventListener('DOMContentLoaded', function () {
var v = document.getElementById('v');
navigator.getUserMedia = (navigator.getUserMedia ||
navigator.webkitGetUserMedia ||
navigator.mozGetUserMedia ||
navigator.msGetUserMedia);
if (navigator.getUserMedia) {
// Request access to video only
navigator.getUserMedia(
{
video: true,
audio: false
},
function (stream) {
var url = window.URL || window.webkitURL;
v.src = url ? url.createObjectURL(stream) : stream;
v.play();
var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
waitForSocketConnection(ws, function () {
console.log(" url.createObjectURL(stream)-----", url.createObjectURL(stream))
ws.send(stream);
console.log("message sent!!!");
});
},
function (error) {
alert('Something went wrong. (error code ' + error.code + ')');
return;
}
);
} else {
alert('Sorry, the browser you are using doesn\'t support getUserMedia');
return;
}
});
//Make the function wait until the connection is made...
function waitForSocketConnection(socket, callback) {
setTimeout(
function () {
if (socket.readyState === 1) {
console.log("Connection is made")
if (callback != null) {
callback();
}
return;
} else {
console.log("wait for connection...")
waitForSocketConnection(socket, callback);
}
}, 5); // wait 5 milisecond for the connection...
}
Problem: the video arrives as a buffer and does not play.
Attempt 2: Record the WebRTC media (5 seconds each) into chunks of WebM format -> transfer them to the other end -> append the chunks together like a regular file
This process involved the following components :
Recorder Javascript library : RecordJs
Transfer mechanism: record using RecordRTC.js -> send to the other end to a media server -> stitch the small WebM files together into a big one at runtime and play
Programs :
Code for video recorder
navigator.getUserMedia(videoConstraints, function (stream) {
video.onloadedmetadata = function () {
video.width = 320;
video.height = 240;
var options = {
type: isRecordVideo ? 'video' : 'gif',
video: video,
canvas: {
width: canvasWidth_input.value,
height: canvasHeight_input.value
}
};
recorder = window.RecordRTC(stream, options);
recorder.startRecording();
};
video.src = URL.createObjectURL(stream);
}, function () {
if (document.getElementById('record-screen').checked) {
if (location.protocol === 'http:')
alert('https is mandatory to capture screen.');
else
alert('Multi-capturing of screen is not allowed.Have you enabled flag: "Enable screen capture support in getUserMedia"?');
} else
alert('Webcam access is denied.');
});
Code for video append-er
var FILE1 = '1.webm';
var FILE2 = '2.webm';
var FILE3 = '3.webm';
var FILE4 = '4.webm';
var FILE5 = '5.webm';
var NUM_CHUNKS = 5;
var video = document.querySelector('video');
window.MediaSource = window.MediaSource || window.WebKitMediaSource;
if (!!!window.MediaSource) {
alert('MediaSource API is not available');
}
var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);
function callback(e) {
var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
GET(FILE1, function (uInt8Array) {
var file = new Blob([uInt8Array], {type: 'video/webm'});
var i = 1;
(function readChunk_(i) {
var reader = new FileReader();
reader.onload = function (e) {
sourceBuffer.appendBuffer(new Uint8Array(e.target.result));
if (i == NUM_CHUNKS) mediaSource.endOfStream();
else {
if (video.paused) {
video.play(); // Start playing after 1st chunk is appended.
}
readChunk_(++i);
}
};
reader.readAsArrayBuffer(file);
})(i); // Start the recursive call by self calling.
});
}
mediaSource.addEventListener('sourceopen', callback, false);
mediaSource.addEventListener('webkitsourceopen', callback, false);
mediaSource.addEventListener('webkitsourceended', function (e) {
logger.log('mediaSource readyState: ' + this.readyState);
}, false);
// function get the video via XHR
function GET(url, callback) {
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.responseType = 'arraybuffer';
xhr.send();
xhr.onload = function (e) {
if (xhr.status != 200) {
alert("Unexpected status code " + xhr.status + " for " + url);
return false;
}
callback(new Uint8Array(xhr.response));
};
}
Shortcomings of this approach
The WebM files failed to play on most media players.
The recorder can only record either a video or an audio file at a time.
Attempt 2: Chunking and media proxy
Since the previous approach failed on non-WebRTC endpoints, the next iteration was to channel the WebRTC media via a Node.js server, thus disrupting the peer-to-peer media stream in favour of a centralized / proxied media stream. This would enable me to obtain raw media packets from the stream using low-level C-based VP8 decoder libraries and then re-encode them to H.264 or other media formats suitable for the endpoints.
In theory, the media could be re-encoded using the openH264 library and the frames could then be sent to the players.
// conceptual sketch only – the real MediaSource.addSourceBuffer() takes just a MIME type,
// not a decoder object
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs=vp9',
    new VP9Decoder());
let buffer = await loadBuffer();
sourceBuffer.appendBuffer(buffer);
Further extending for uncompressed video
// also conceptual – 'video/raw' is not a type MediaSource currently supports
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/raw; codecs=yuv420p');
for (let p of demuxPackets()) {
  let frame = await codec.decode(p);
  sourceBuffer.appendBuffer(frame);
}
At least, that was the plan.
Attempt 2.1: Record the WebRTC media (5 seconds each) into chunks of WebM format (RecordRTC.js) -> use the Kurento JS client (kws-media-api.js) to make an HTTP endpoint for the recorded WebM files -> append the chunks together like a regular file at runtime
// UI elements
function getByID(id) {
return document.getElementById(id);
}
var recordAudio = getByID('record-audio'),
recordVideo = getByID('record-video'),
stopRecordingAudio = getByID('stop-recording-audio'),
stopRecordingVideo = getByID('stop-recording-video'),
broadcasting = getByID('broadcasting');
var canvasWidth_input = getByID('canvas-width-input'),
canvasHeight_input = getByID('canvas-height-input');
var video = getByID('video');
var audio = getByID('audio');
// Audio video constraints
var videoConstraints = {
audio: false,
video: {
mandatory: {},
optional: []
}
};
var audioConstraints = {
audio: true,
video: false
};
// Recording and stop recording - to be converted into real-time capture and chunking
const ws_uri = 'ws://localhost:8888/kurento';
var URL_SMALL = "http://localhost:8080/streamtomp4/approach1/5561840332.webm";
var audioStream;
var recorder;
recordAudio.onclick = function () {
if (!audioStream)
navigator.getUserMedia(audioConstraints, function (stream) {
if (window.IsChrome) stream = new window.MediaStream(stream.getAudioTracks());
audioStream = stream;
audio.src = URL.createObjectURL(audioStream);
audio.muted = true;
audio.play();
// "audio" is a default type
recorder = window.RecordRTC(stream, {
type: 'audio'
});
recorder.startRecording();
}, function () {
});
else {
audio.src = URL.createObjectURL(audioStream);
audio.muted = true;
audio.play();
if (recorder) recorder.startRecording();
}
window.isAudio = true;
this.disabled = true;
stopRecordingAudio.disabled = false;
};
Recording and stopping recording of video into small media files (chunks)
recordVideo.onclick = function () {
recordVideoOrGIF(true);
};
stopRecordingAudio.onclick = function () {
this.disabled = true;
recordAudio.disabled = false;
audio.src = '';
if (recorder)
recorder.stopRecording(function (url) {
audio.src = url;
audio.muted = false;
audio.play();
document.getElementById('audio-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Audio URL</a>';
});
};
function recordVideoOrGIF(isRecordVideo) {
navigator.getUserMedia(videoConstraints, function (stream) {
video.onloadedmetadata = function () {
video.width = 320;
video.height = 240;
var options = {
type: isRecordVideo ? 'video' : 'gif',
video: video,
canvas: {
width: canvasWidth_input.value,
height: canvasHeight_input.value
}
};
recorder = window.RecordRTC(stream, options);
recorder.startRecording();
};
video.src = URL.createObjectURL(stream);
}, function () {
if (document.getElementById('record-screen').checked) {
if (location.protocol === 'http:')
alert('<https> is mandatory to capture screen.');
else
alert('Multi-capturing of screen is not allowed. Capturing process is denied. Are you enabled flag: "Enable screen capture support in getUserMedia"?');
} else
alert('Webcam access is denied.');
});
window.isAudio = false;
if (isRecordVideo) {
recordVideo.disabled = true;
stopRecordingVideo.disabled = false;
} else {
recordGIF.disabled = true;
stopRecordingGIF.disabled = false;
}
}
stopRecordingVideo.onclick = function () {
this.disabled = true;
recordVideo.disabled = false;
if (recorder)
recorder.stopRecording(function (url) {
video.src = url;
video.play();
document.getElementById('video-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Video URL</a>';
});
};
Broadcasting the chunks to media engine
function onerror(error) {
console.log(" error occured");
console.error(error);
}
broadcast.onclick = function () {
var videoOutput = document.getElementById("videoOutput");
KwsMedia(ws_uri, function (error, kwsMedia) {
if (error) return onerror(error);
// Create pipeline
kwsMedia.create('MediaPipeline', function (error, pipeline) {
if (error) return onerror(error);
// Create pipeline media elements (endpoints & filters)
pipeline.create('PlayerEndpoint', {uri: URL_SMALL}, function (error, player) {
if (error) return console.error(error);
pipeline.create('HttpGetEndpoint', function (error, httpGet) {
if (error) return onerror(error);
// Connect media element between them
player.connect(httpGet, function (error, pipeline) {
if (error) return onerror(error);
// Set the video on the video tag
httpGet.getUrl(function (error, url) {
if (error) return onerror(error);
videoOutput.src = url;
console.log(url);
// Start player
player.play(function (error) {
if (error) return onerror(error);
console.log('player.play');
});
});
});
// Subscribe to HttpGetEndpoint EOS event
httpGet.on('EndOfStream', function (event) {
console.log("EndOfStream event:", event);
});
});
});
});
}, onerror);
}
Problem: dissecting the live video into small files and appending them to each other on reception is an expensive, time- and resource-consuming process. It also involves heavy buffering and other problems pertaining to real-time streaming.
Attempt 2.2: Send the recorded chunks of WebM to a port on a Linux server. Use socket programming to pick up these individual files and play them using VLC player from a UDP port of the Linux server.
End result: small file containers play, but slow buffering makes this approach unsuitable for streaming file chunks and appending them as a single file.
Attempt 2.3: Send the recorded chunks of WebM to a socket on a Linux server. Use socket programming to pick up these individual WebM files and convert them to H.264 format so that they can be sent to a media server.
This process involved the following components :
Recorder Javascript library : RecordJs
Transfer mechanism: WebRTC endpoint -> call handler (record in chunks) -> ffmpeg / GStreamer to put it on RTP -> streaming server like Wowza -> viewers
Programs: an HTML webpage with a WebSocket connection -> a Node.js program to write content from the WebSocket to a Linux socket -> a Node.js program to read that socket and print the content on the console
Snippet to transfer the recorded WebM files over a WebSocket to the Node.js program
// Make the function wait until the connection is made.
function waitForSocketConnection(socket, callback) {
setTimeout(
function () {
if (socket.readyState === 1) {
console.log("Connection is made")
if (callback != null)
callback();
} else {
console.log("wait for connection...")
waitForSocketConnection(socket, callback);
}
}, 5); // wait 5 milisecond for the connection...
}
function previewFile() {
var preview = document.querySelector('img');
var file = document.querySelector('input[type=file]').files[0];
var reader = new FileReader();
reader.onloadend = function () {
preview.src = reader.result;
console.log(" reader result ", reader.result);
var video = document.getElementById("v");
video.src = reader.result;
console.log(" video played ");
var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
waitForSocketConnection(ws, function () {
ws.send(reader.result);
console.log("message sent!!!");
});
}
if (file) {
// converts to base64 encoded string of the file data
//reader.readAsDataURL(file);
reader.readAsBinaryString(file);
} else {
preview.src = "";
}
}
Program for the Linux socket sender, which creates the socket for the WebM files in Node.js
var net = require('net');
var fs = require('fs');
var socketPath = '/tmp/tfxsocket';
var http = require('http');
var stream = require('stream');
var util = require('util');
var WebSocketServer = require('ws').Server;
var port = 3000;
var serverUrl = "localhost";
var socket;
/*----------http server -----------*/
var server = http.createServer(function (request, response) {});
server.listen(port, serverUrl);
console.log('HTTP Server running at ', serverUrl, port);
/*------websocket server ----------*/
var wss = new WebSocketServer({server: server});
wss.on("connection", function (ws) {
console.log("websocket connection open");
ws.on('message', function (message) {
console.log(" stream recived from broadcast client on port 3000 ");
var s = require('net').Socket();
s.connect(socketPath);
s.write(message);
console.log(" send the stream to socketPath", socketPath);
});
ws.on("close", function () {
console.log("websocket connection close")
});
});
Program for the Linux socket listener using Node.js. Here the socket is at /tmp/mysocket.
var net = require('net');
var client = net.createConnection("/tmp/mysocket");
client.on("connect", function() {
console.log("connected to mysocket");
});
client.on("data", function(data) {
console.log(data);
});
client.on('end', function() {
console.log('server disconnected');
});
Output 1: Video Buffer displayed
Output 2: payload from the video displayed, which shows the pipeline works, but there is no playable output yet.
ffmpeg command for transferring the content from the socket to a UDP IP and port
ffmpeg -i unix://tmp/mysocket -f <format> udp://192.168.0.119:8083
Problems of this approach: the video was just passing through the socket and contained no usable information when I tried to play it or inspect it on the console.
Attempt 3: Use an existing media engine like Kurento to do the transcoding for me.
Send the live WebRTC stream from a Kurento WebRTC endpoint to a Kurento HTTP endpoint, then play it using the Mozilla VLC web plugin.
The MediaStreamTrack interface typically represents a stream of audio or video data, and a MediaStream may contain zero or more MediaStreamTrack objects.
The RTCRtpSender and RTCRtpReceiver objects can be used by the application to get more fine-grained control over the transmission and reception of MediaStreamTracks.
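For instance, a minimal sketch (assuming pc is an established RTCPeerConnection) of inspecting a sender's encodings and swapping its outgoing track:
async function switchToScreenShare(pc) {
  const videoSender = pc.getSenders().find(s => s.track && s.track.kind === 'video');
  console.log(videoSender.getParameters().encodings); // current encoding parameters
  const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
  // replaceTrack swaps the media source without SDP renegotiation
  await videoSender.replaceTrack(screen.getVideoTracks()[0]);
}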
(figures: media flow in a VoIP system; media flow in a WebRTC call)
WebRTC-compatible browsers are required to support white balance, light level and autofocus from the video source.
Video Capture Resolution
Unless otherwise specified in the SDP (Session Description Protocol), the minimum WebRTC video attributes are 20 FPS and a resolution of 320 x 240 pixels.
Mid-stream resolution changes are also supported, such as when the screen source comes from desktop sharing.
SDP attributes for resolution, frame rate, and bitrate
SDP allows for codec-independent indication of preferred video resolutions using a=imageattr to indicate the maximum resolution that is acceptable.
The sender must limit the encoded resolution to the indicated maximum size, as the receiver may not be capable of handling higher resolutions.
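For illustration, an a=imageattr line in RFC 6236 syntax might look like this (the payload type and sizes are chosen arbitrarily here):
a=imageattr:97 send [x=1280,y=720] recv [x=640,y=480]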
Dynamic FPS control based on actual hardware encoding
The video source capture should adjust the frame rate according to low bandwidth, poor light conditions and the hardware-supported rate, rather than forcing a higher FPS.
Stream Orientation
Support generating the R0 and R1 bits of the Coordination of Video Orientation (CVO) mechanism and sharing them with the peer.
Audio levels for speech transmission are normalized to avoid users having to manually adjust the playback and to facilitate mixing in conferencing applications.
Normalization considers only frequencies above 300 Hz, regardless of the sampling rate used. It can be adapted to avoid clipping, either by lowering the gain to a level below -19 dBm0 or through the use of a compressor.
Gain calculation
If the endpoint has control over the entire audio-capture path, like a regular phone, the gain should be adjusted in such a way that an average speaker would have a level of 2600 (-19 dBm0) for active speech.
If the endpoint does not have control over the entire audio-capture path, like a software endpoint, then it SHOULD use automatic gain control (AGC) to dynamically adjust the level to 2600 (-19 dBm0) +/- 6 dB.
For music- or desktop-sharing applications, the level SHOULD NOT be automatically adjusted, and the endpoint SHOULD allow the user to set the gain manually.
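A minimal Web Audio sketch of such a user-controlled gain stage on a captured stream (the 0.8 value is an arbitrary example):
async function captureWithManualGain() {
  const ctx = new AudioContext();
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const gainNode = ctx.createGain();
  gainNode.gain.value = 0.8; // set by the user, not adjusted automatically
  const destination = ctx.createMediaStreamDestination();
  ctx.createMediaStreamSource(stream).connect(gainNode).connect(destination);
  return destination.stream; // add this to the RTCPeerConnection instead of the raw capture
}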
Media plane adaptation is done at the SBC for network-carried media; it should be done for all network-hosted media services which face peer-to-peer media.
The high-level architecture elements of WebRTC media streams consist of:
Encryption, RTP Multiplexing, Support for ICE
Audio – Interworking of differing WebRTC and codec sets
Video – Use of VP8, Support for H.264
Data – Support of MSRP ( RCS standard for messaging over DataChannel API)
Direct connection to media servers and media gateways.
Use a common codec set wherever possible to eliminate transcoding. Use regionalized transcoding where a common codec is not available. Real-time video transcoding is expensive and performance impacting.
Ongoing standards/device/network work needs to be done to expand the common codec set. WebRTC codec standards have not been finalized yet; the WebRTC target is to support royalty-free codecs within its standards.
Media | WebRTC | Legacy
Audio | G.711, Opus | G.711, AMR, AMR-WB (G.722.2)
Audio – Extended | – | G.729a[b], G.726
Video | VP8 | H.264/AVC
Supporting common codecs between VoLTE devices and WebRTC endpoints requires one or more of the following:
Support of WebRTC codecs on 3GPP/GSMA
Support of 3GPP/GSMA codecs on WebRTC
WebRTC browser support of codecs native to the device
After considerable time (10 minutes in my case), the quality of the media stream adjusts to network conditions and the variations (peaks and dips) flatten out.
(screenshots: the stream after some time has passed)
The DataChannel API of WebRTC allows bidirectional communication of arbitrary data between peers. It uses the same API as WebSockets and has very low latency.
(+) DataChannel is p2p and is also end-to-end encrypted, leading to higher privacy
(+) built-in security due to p2p transfer
(+) higher throughput than text transfer via a messaging server
(+) lower latency as p2p transfer takes the shortest route
SCTP is the protocol that opens connections for peer-to-peer data channel support in WebRTC. It can be configured for reliability and ordered delivery, and it provides flow and congestion control for the data messages.
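A small sketch of opening a channel and trading reliability for latency (the label and option values are chosen for illustration; pc is an existing RTCPeerConnection):
const channel = pc.createDataChannel('telemetry', {
  ordered: false,     // allow out-of-order delivery
  maxRetransmits: 0   // drop rather than retransmit on loss
});
channel.onopen = () => channel.send('hello');
channel.onmessage = (e) => console.log('received', e.data);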
WebRTC changes bitrate, resolution and framerate dynamically to accommodate network conditions, policy constraints or user-equipment capability. The higher the bitrate, the higher the media quality.
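An application can also cap this adaptation explicitly per sender; a sketch (the 300 kbps cap and the scale factor are arbitrary examples, pc an existing RTCPeerConnection):
async function capVideoBitrate(pc) {
  const sender = pc.getSenders().find(s => s.track && s.track.kind === 'video');
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) params.encodings = [{}];
  params.encodings[0].maxBitrate = 300000;       // bits per second
  params.encodings[0].scaleResolutionDownBy = 2; // send at half the capture resolution
  await sender.setParameters(params);
}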
Bitrate of audio codecs
Lossy formats:
– iLBC (narrowband): 13.33, 15.20 kbit/s
– iSAC (wideband): 10–52 kbit/s
– GSM-EFR: 12.2 kbit/s
– AAC: 8–529 kbit/s (stereo)
– AMR-WB (G.722.2): 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 kbit/s
– Opus: 6–510 kbit/s
(-) higher bitrate consumes more bandwidth
(-) can cause congestion on the network route