Augmented Reality – Telecom R & D

For the last couple of weeks, I have been working on rendering 3D graphics on a WebRTC media stream using different JavaScript libraries, as part of a Virtual Reality project.

AR vs VR
WebRTC
WebGL
Three.js
WASM/OpenGL

What is Augmented Reality?

Augmented reality (AR) is a view of a real-world environment whose elements are supplemented by computer-generated sensory input such as sound, video, graphics, and location data.

How is AR different from VR (Virtual Reality)?

Virtual Reality	Augmented Reality
Replaces the real world with a simulated one	Blends virtual reality with real life
User is isolated from real life	User interacts with the real world through digital overlays
Examples: Oculus Rift and Kinect	Examples: Google Glass and HoloLens

Methods for rendering Augmented Reality

Computer Vision
Object Recognition
Eye Tracking
Face Detection and substitution
Emotion and gesture picker
Edge Detection

Building a web-based Augmented Reality platform draws on three groups of components for an end-to-end AR solution. Web components include WebRTC getUserMedia, the Web Speech API, CSS, SVG, HTML5 canvas, and the sensor APIs. Hardware components can include the graphics driver, media capture devices such as the microphone and camera, and sensors. 3D components include geometry and math utilities, 3D model loaders and models, lights, materials, shaders, particles, and animation.
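As a rough sketch of how a page might check for these web building blocks before starting an AR session, the helper below inspects a window-like object. The function name detectArCapabilities and the shape of its result are my own illustration, not part of any library; the properties it looks up are standard browser globals.

```javascript
// Hypothetical helper: reports which AR building blocks a window-like object exposes.
function detectArCapabilities(win) {
  const nav = win.navigator || {};
  return {
    // Camera and microphone capture (WebRTC getUserMedia)
    camera: !!(nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function'),
    // Web Speech API (still prefixed in some browsers)
    speech: !!(win.SpeechRecognition || win.webkitSpeechRecognition),
    // WebGL rendering support
    webgl: typeof win.WebGLRenderingContext !== 'undefined',
    // Device orientation sensor events
    orientation: typeof win.DeviceOrientationEvent !== 'undefined',
  };
}

// In a browser: console.log(detectArCapabilities(window));
```

Passing the window in as a parameter keeps the check easy to try outside a browser with a stand-in object.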

WebRTC (Web Real-Time Communication)

WebRTC gives the browser access to media streams and data. It is standardized at the API level by the W3C and at the protocol level by the IETF. WebRTC equips web browsers with Real-Time Communications (RTC) capabilities, enabling browser-to-browser applications for voice calling, video chat, and P2P file sharing without plugins.
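To make the browser-to-browser part a little more concrete, here is a minimal sketch of attaching a local media stream to a peer connection. The helper name createPeer is hypothetical, the STUN server URL is an assumption (any reachable STUN server would do), and signalling between the two browsers is left out entirely.

```javascript
// Sketch: attach a local media stream to a new peer connection.
// PeerConnection is passed in so that window.RTCPeerConnection can be used
// in a browser and a stand-in elsewhere. The STUN URL is an assumption.
function createPeer(PeerConnection, stream) {
  const pc = new PeerConnection({
    iceServers: [{ urls: 'stun:stun.l.google.com:19302' }],
  });
  // Each audio/video track of the stream is offered to the remote peer.
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));
  return pc;
}

// Browser usage (offer/answer exchange not shown):
// const pc = createPeer(window.RTCPeerConnection, stream);
```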

Code snippet for WebRTC API

1. To begin with WebRTC, we first need to check that the user’s browser supports the getUserMedia API.

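function hasGetUserMedia() {
	// Modern browsers expose getUserMedia on navigator.mediaDevices;
	// the prefixed navigator.webkitGetUserMedia is deprecated.
	return !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
}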

2. Get the stream from the user’s webcam.

var video = $('#webcam')[0];
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
	navigator.mediaDevices.getUserMedia({ audio: true, video: true })
		.then(function (stream) {
			// Attach the stream directly; createObjectURL(stream) is no longer supported.
			video.srcObject = stream;
			return video.play();
		})
		.catch(function (e) { alert('Webcam error: ' + e.name); });
}
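The constraints passed to getUserMedia can also ask for a preferred resolution and camera, which matters for AR on phones where the rear camera is usually the one you want. Below is a small sketch; the helper name buildConstraints and its option names are my own, while the width, height, and facingMode fields are standard media track constraints.

```javascript
// Hypothetical helper: builds a constraints object for getUserMedia.
// Option names (width, height, rearCamera, audio) are illustrative only.
function buildConstraints(opts = {}) {
  const { width = 1280, height = 720, rearCamera = true, audio = false } = opts;
  return {
    audio,
    video: {
      // "ideal" lets the browser fall back to the closest supported size.
      width: { ideal: width },
      height: { ideal: height },
      // 'environment' is the rear camera, 'user' the front-facing one.
      facingMode: rearCamera ? 'environment' : 'user',
    },
  };
}

// Browser usage:
// navigator.mediaDevices.getUserMedia(buildConstraints({ width: 640, height: 480 }));
```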

Screenshot AppRTC

https://apprtc.appspot.com

End to End RTC Pipeline for AR

WebGL

Web Graphics Library
JavaScript API for rendering interactive 2D and 3D computer graphics in the browser
no plugins required
uses GPU (Graphics Processing Unit) acceleration
can be mixed with other HTML elements
uses the HTML5 canvas element and is accessed through Document Object Model interfaces
cross-platform; works on all major desktop and mobile browsers
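Since WebGL is reached through the canvas element, getting started comes down to asking a canvas for a rendering context. A minimal sketch follows; the helper name getWebGLContext is hypothetical, and the context names it tries are the ones browsers have historically accepted.

```javascript
// Hypothetical helper: returns the first WebGL context a canvas can provide.
function getWebGLContext(canvas) {
  // Try the newest API first, then fall back to older names.
  const names = ['webgl2', 'webgl', 'experimental-webgl'];
  for (const name of names) {
    const gl = canvas.getContext(name);
    if (gl) return { name, gl };
  }
  return null; // WebGL is not available
}

// Browser usage:
// const result = getWebGLContext(document.createElement('canvas'));
```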

WebGL Development

To get started, you should know about:

GLSL, the shading language used by OpenGL and WebGL
Matrix computation to set up transformations
Vertex buffers to hold data about vertex positions, normals, colors, and textures
Matrix math to animate shapes
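To give a feel for the matrix part, here is a small sketch of a 4x4 rotation about the Y axis, stored column-major in a Float32Array, which is the layout WebGL uniform uploads expect. The function names rotationY and transformPoint are my own; real projects usually take these from a math library.

```javascript
// Builds a 4x4 rotation matrix about the Y axis, in column-major order.
function rotationY(angleRadians) {
  const c = Math.cos(angleRadians);
  const s = Math.sin(angleRadians);
  return new Float32Array([
    c, 0, -s, 0, // first column
    0, 1, 0, 0,  // second column
    s, 0, c, 0,  // third column
    0, 0, 0, 1,  // fourth column (no translation)
  ]);
}

// Applies a column-major 4x4 matrix to a 3D point (w is taken as 1).
function transformPoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

// Rotating the X axis unit vector by 90 degrees lands it on the negative Z axis.
const rotated = transformPoint(rotationY(Math.PI / 2), [1, 0, 0]);
```

Animating a shape is then a matter of recomputing the angle every frame and uploading the new matrix.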

Clearly, WebGL is a bit tough given the amount of careful coding, mapping, and shading it requires.

Let's move on to some JS libraries that can make 3D easier for us.

CCV

website : http://libccv.org/
SourceCode : https://github.com/liuliu/ccv

Awe.js

Website : https://buildar.com/awe/tutorials/intro_to_awe.js/index.html#
SourceCode : https://github.com/buildar/awe.js

ArUco

SourceCode: https://github.com/jcmellado/js-aruco

Potree

Karenpeng

emotion & gesture-based arpeggiator and synthesizer

SourceCode : https://github.com/karenpeng/motionEmotion
Demo: http://motionemotion.herokuapp.com/

Three.JS

An MIT-licensed JavaScript 3D engine (WebGL and more).

website : http://threejs.org/
SourceCode : https://github.com/mrdoob/three.js/
Demo: http://www.davidscottlyons.com/threejs/

3D space with webcam input as texture

Display the video as a plane that can be viewed from various angles against a given background landscape. Credits for the code below: https://stemkoski.github.io/Three.js/

1. Use the getUserMedia code shown earlier to get the user’s webcam input

Make a scene, camera, and renderer (see the reference snippets at the end)
Add orbit controls for viewing the media plane from all angles

	controls = new THREE.OrbitControls( camera, renderer.domElement );

Make the floor with an image texture

[sourcecode language="javascript"]
	// TextureLoader replaces the deprecated THREE.ImageUtils.loadTexture
	var floorTexture = new THREE.TextureLoader().load( 'imageURL.jpg' );
	// Tile the image 10 x 10 times across the floor
	floorTexture.wrapS = floorTexture.wrapT = THREE.RepeatWrapping;
	floorTexture.repeat.set( 10, 10 );
	var floorMaterial = new THREE.MeshBasicMaterial({map: floorTexture, side: THREE.DoubleSide});
	var floorGeometry = new THREE.PlaneGeometry(1000, 1000, 10, 10);
	var floor = new THREE.Mesh(floorGeometry, floorMaterial);
	// Lay the plane flat, just below the origin
	floor.position.y = -0.5;
	floor.rotation.x = Math.PI / 2;
	scene.add(floor);
[/sourcecode]

6. Add Fog


scene.fog = new THREE.FogExp2( 0x9999ff, 0.00025 );
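To see what a density of 0.00025 means in practice, the sketch below evaluates the usual exponential-squared fog formula, 1 - exp(-(density * depth)^2), which I am assuming is what FogExp2 applies per fragment. The function name fogExp2Factor is my own.

```javascript
// Fraction of fog colour blended in at a given distance from the camera,
// assuming the exponential-squared formula 1 - exp(-(density * depth)^2).
function fogExp2Factor(density, depth) {
  return 1 - Math.exp(-((density * depth) ** 2));
}

// With the density used above, objects a few hundred units away are barely
// fogged, while objects several thousand units away are mostly fog colour.
const nearFog = fogExp2Factor(0.00025, 300);
const farFog = fogExp2Factor(0.00025, 4000);
```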

7. Add the video image context and texture.

// 'monitor' is the <video> element fed by getUserMedia;
// 'videoImage' is a <canvas> that each video frame is copied onto
video = document.getElementById( 'monitor' );
videoImage = document.getElementById( 'videoImage' );
videoImageContext = videoImage.getContext( '2d' );
// Paint the canvas black until the first video frame arrives
videoImageContext.fillStyle = '#000000';
videoImageContext.fillRect( 0, 0, videoImage.width, videoImage.height );
// Use the canvas as a texture
videoTexture = new THREE.Texture( videoImage );
videoTexture.minFilter = THREE.LinearFilter;
videoTexture.magFilter = THREE.LinearFilter;
var movieMaterial = new THREE.MeshBasicMaterial({
	map: videoTexture,
	overdraw: true,
	side: THREE.DoubleSide
});
// Map the texture onto a plane standing 50 units above the floor
var movieGeometry = new THREE.PlaneGeometry( 100, 100, 1, 1 );
var movieScreen = new THREE.Mesh( movieGeometry, movieMaterial );
movieScreen.position.set(0,50,0);
scene.add(movieScreen);
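The plane above is a 100 x 100 square, so a 4:3 or 16:9 webcam feed will look stretched on it. One way around that, sketched below, is to derive the plane size from the video dimensions. The helper name fitPlaneToVideo and its maxSize parameter are my own additions, not part of the credited code.

```javascript
// Hypothetical helper: plane dimensions that keep the video's aspect ratio,
// with the longer side equal to maxSize.
function fitPlaneToVideo(videoWidth, videoHeight, maxSize = 100) {
  const aspect = videoWidth / videoHeight;
  return aspect >= 1
    ? { width: maxSize, height: maxSize / aspect } // landscape or square
    : { width: maxSize * aspect, height: maxSize }; // portrait
}

// A 640x480 feed gives a plane of about 100 x 75.
// Usage with three.js, once the video metadata has loaded:
// const size = fitPlaneToVideo(video.videoWidth, video.videoHeight);
// new THREE.PlaneGeometry(size.width, size.height, 1, 1);
```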

Set camera position

	camera.position.set(0,150,300);
	camera.lookAt(movieScreen.position);
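As a quick sanity check on these numbers, the sketch below works out how far the camera is from the screen and how tall the visible area is at that distance. The 45 degree field of view is an assumption (it depends on how the PerspectiveCamera was created), and both function names are my own.

```javascript
// Straight-line distance between two points given as {x, y, z}.
function distanceBetween(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Height of the view frustum at a given distance for a vertical field of view.
function visibleHeightAt(fovDegrees, distance) {
  const fovRadians = (fovDegrees * Math.PI) / 180;
  return 2 * distance * Math.tan(fovRadians / 2);
}

const cameraPos = { x: 0, y: 150, z: 300 };
const screenPos = { x: 0, y: 50, z: 0 };
const distanceToScreen = distanceBetween(cameraPos, screenPos); // about 316 units
// Assuming a 45 degree vertical FOV, roughly 262 units are visible at that
// distance, so the 100-unit screen takes up a bit over a third of the view height.
const visibleHeight = visibleHeightAt(45, distanceToScreen);
```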

Define the render function

    videoImageContext.drawImage( video, 0, 0, videoImage.width, videoImage.height );
    videoTexture.needsUpdate = true; // re-upload the canvas to the GPU each frame
    renderer.render( scene, camera );

Animation

function animate() {
   requestAnimationFrame( animate );
   render();
}
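Webcams typically deliver fewer frames per second than the display refreshes, so redrawing on every animation frame can be more work than needed. Below is a sketch of a frame limiter that could sit between requestAnimationFrame and render. The name makeFrameStepper and the 30 fps cap are my own assumptions.

```javascript
// Hypothetical helper: wraps a render function so it runs at most maxFps
// times per second, based on the timestamps requestAnimationFrame provides.
function makeFrameStepper(renderFn, maxFps) {
  const minIntervalMs = 1000 / maxFps;
  let lastRenderMs = -Infinity;
  return function step(timestampMs) {
    if (timestampMs - lastRenderMs >= minIntervalMs) {
      lastRenderMs = timestampMs;
      renderFn();
      return true; // a frame was rendered
    }
    return false; // skipped, too soon after the previous frame
  };
}

// Browser usage:
// const step = makeFrameStepper(render, 30);
// function animate(t) { requestAnimationFrame(animate); step(t); }
// requestAnimationFrame(animate);
```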

WASM/OpenGL

WebAssembly (WASM) is a portable binary code format with a corresponding text format. It lets programs written in languages such as C++ run in a host environment such as the browser and interact with the JavaScript code there.

Emscripten is one such compiler toolchain; it compiles C++ to WASM.
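To show what the JavaScript side of that interaction looks like, here is a minimal sketch that loads a tiny WASM module and calls its exported function. The bytes are a hand-assembled module exporting add(a, b); they stand in for what a toolchain like Emscripten would produce from C++ and are not actual Emscripten output.

```javascript
// A hand-assembled WASM module that exports add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is fine for a module this small; larger modules
// should go through WebAssembly.instantiate or instantiateStreaming.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// The export is callable like any JavaScript function.
const add = wasmInstance.exports.add;
const sum = add(2, 3);
```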

GPL support for AR

Compiler/CUDA/OpenGL/Vulkan/Graphics/Fortran/GPGPU Developer.

Web media APIs like MSE (Media Source Extensions) and EME (Encrypted Media Extensions)

AR Processing pipeline

Credits: MediaPipe, Google AI

The on-device machine learning pipeline consists of a platform solution such as MediaPipe (shown above) along with WASM. The WASM SIMD (single instruction, multiple data, for parallel processing) ML inference engine can be XNNPACK or any other mobile neural network inference framework. This is followed by rendering.

GPU-accelerated segmentation (WebGL) outperforms CPU segmentation using WASM SIMD, taking the latency down from ~8.7 ms to ~4.3 ms. A novel WebGL interface can achieve this via optimized fragment shaders using MRT (multiple render targets).
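To put those two latencies in context, the sketch below compares them against a per-frame time budget. The 30 fps target is my own assumption for illustration, and the function name latencyStats is hypothetical; only the 8.7 ms and 4.3 ms figures come from the text above.

```javascript
// Compares CPU and GPU segmentation latency against a per-frame time budget.
function latencyStats(cpuMs, gpuMs, fps) {
  const frameBudgetMs = 1000 / fps;
  return {
    speedup: cpuMs / gpuMs,            // how many times faster the GPU path is
    cpuShare: cpuMs / frameBudgetMs,   // fraction of the frame spent on CPU segmentation
    gpuShare: gpuMs / frameBudgetMs,   // fraction of the frame spent on GPU segmentation
  };
}

// At an assumed 30 fps, each frame has about 33 ms. The GPU path is roughly
// twice as fast and uses about 13% of the budget instead of about 26%.
const stats = latencyStats(8.7, 4.3, 30);
```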

Credits Intel Video analytics pipeline p7

GStreamer Video Analytics

Ref :

<html>
<head>
	<title>Spinning colored Cube</title>
	<style>
		body { margin: 0; }
		canvas { width: 100%; height: 100% }
	</style>
</head>
<body>
	<script src="js/three.min.js"></script>
	<script>
		// Our Javascript will go here.
	</script>
</body>
</html>

<html>
<head>
	<title>Shaded Material on Sphere </title>
	<style>
		body { margin: 0; }
		canvas { width: 100%; height: 100% }
	</style>
	<script src="js/jquery.min.js"></script>
	<script src="js/three.min.js"></script>
	<script>
		// Our Javascript will go here.
	</script>
</head>
<body>
	<div id="container"></div>
</body>
</html>

var scene = new THREE.Scene();
var camera = new THREE.PerspectiveCamera(45, 600/600 , 0.1, 10000);
var renderer = new THREE.WebGLRenderer();
renderer.setSize(600 , 600 );
// Look up the container div with jQuery before appending the canvas to it
var $container = $('#container');
$container.append(renderer.domElement);
scene.add(camera);
camera.position.z = 300; // the camera starts at 0,0,0 so pull it back

scene = new THREE.Scene();
camera = new THREE.PerspectiveCamera(125, window.innerWidth / window.innerHeight, 1, 500);
camera.position.set(0, 0, 100);
camera.lookAt(new THREE.Vector3(0, 0, 0));
var renderer = new THREE.WebGLRenderer();
renderer.setSize( window.innerWidth, window.innerHeight );
document.body.appendChild( renderer.domElement );

var geometry = new THREE.TorusKnotGeometry( 8, 2, 100, 16, 4, 3 );
var material = new THREE.MeshLambertMaterial( { color: 0x2022ff } );
var torusKnot = new THREE.Mesh( geometry, material );
torusKnot.position.set(3, 3, 3);
scene.add( torusKnot );
camera.position.z =25;

