openRMC Documentation


Author(s): Brendan O’Farrell

Email List:

Rev Date: September 18, 2012

Date: June 14, 2012

Copyright: LICENSE.txt


This document introduces WebRTC: real-time communication across multiple browsers.

1. Introduction

The primary aim of this document is to gain an understanding of WebRTC (Real-Time Communication): how and why it is being developed. It looks at the organisations developing WebRTC and those which will set the standards for developers and users alike. It investigates the required components of WebRTC and reviews its key features and the associated technology. It discusses the possible effects of WebRTC on the established telecommunications industry, and takes a look at the future possibilities and benefits of WebRTC.

As WebRTC is still in the process of having standards and protocols ratified, this document will evolve over time and will require regular content updates.

2. What is WebRTC

"WebRTC is a free, open project that enables web browsers with Real-Time Communications (RTC) capabilities via simple JavaScript APIs. The WebRTC components have been optimized to best serve this purpose." (Source: WebRTC)

WebRTC allows real-time, peer-to-peer audio-visual communication via an HTML5-compliant browser. Not all browsers have WebRTC capability at present. At the time of writing, both Google Chrome and Opera 12 have it available to test; Firefox 18 will have it in the first quarter of 2013, with Internet Explorer following with its own version, CU-RTC-Web, in early 2013. No plugins and no expensive hardware are required for WebRTC to work: just a WebRTC-enabled browser, a camera (now standard on most new laptops) and a mic/headset or mic/speakers, and you have real-time communication available to you.

Having WebRTC integrated in an HTML5-enabled browser means you can make real-time audio and video calls to any other WebRTC-enabled device, including tablets, smart phones and e-readers. The quality of these calls is limited only by the quality of the hardware: the audio and video codecs are open source and of very high quality, giving the client an excellent audio-visual experience.

With the IETF setting the standard for protocols and signalling, and the W3C setting the standard for the APIs used by app developers, millions of JavaScript developers can now define and deliver web-based communication. It will no longer be the domain of a small number of SIP developers and VoIP system resellers.

WebRTC has the potential to change how we communicate, in much the same way that the browser changed how we access information. The effect could be that big. What is needed is for all invested parties to comply with the standards laid down by the W3C and the IETF; whether this happens, only time will tell.

3. Who are the organisations behind WebRTC

A number of companies are involved in the WebRTC project in its present state: Google, Mozilla and Opera on the browser side, with the W3C and the IETF on the standards side. Microsoft has only recently shown an interest in the project, and we do not expect to see any development from them until 2013. Having said that, they are currently recruiting developers to work on a project combining Skype with WebRTC, so development could occur earlier than expected.

3.1. Google and WebRTC

Google have been at the forefront of this project. They want to develop a standards-based real-time media engine available in all browsers. To drive the development of real-time communication, Google have released nearly $70 million worth of open source code to developers. These open source audio and video codecs came about through Google’s acquisition of companies such as Global IP Solutions and On2 Technologies.

In early 2010, Google finalized its acquisition of On2, a video codec company that developed the VP series of codecs, the latest being VP8. On2 had always positioned its codecs as a patent-free replacement for the H.26x series of codecs, which are standardized, patented and widely used. Google then opened On2’s technologies to the world and open sourced VP8 as part of the WebM project. The idea was to replace H.264 for web video and, by doing so, reduce patent costs for everyone, especially Google itself.

During 2010 Google went on to acquire Global IP Solutions (GIPS), a company known for its media frameworks: technology that makes developing VoIP and video calling applications easier. At the time, GIPS had a large market share in VoIP, and the acquisition caused most of the industry to scurry in search of alternative solutions. As with On2, Google took the GIPS assets and open sourced them. This time they threw out all voice and video codecs that had patent owners and added an additional layer: a JavaScript API as an integration layer to web browsers. The idea was to have bidirectional media processing and media coding technologies available in every browser. Google then went on to push it as a standard at the IETF and W3C, where such standards are set and approved. This real-time communication technology is now called WebRTC. Google have a Google+ forum page which keeps up with the latest WebRTC developments. (Source: Google+ WebRTC)

3.2. Mozilla/Firefox and WebRTC

Mozilla has started showing off WebRTC support in Firefox. It is not yet stable enough to be released in the main branch of the browser, but this is a positive step as developers can test in multiple browsers.

Mozilla attended IETF 83 in Paris, and showed an early demo of a simple video call between two Browser-ID-authenticated parties in a special build of Firefox with WebRTC support.

Mozilla have been experimenting with integrating social features in the browser and combining them with WebRTC to establish a video call between two users who are signed in using BrowserID (now called Persona). The Social API add-on, once installed, provides a sidebar where web content from the social service provider is rendered. In the demo social service, you can see a "buddy list" of people who are currently signed in using

3.3. Opera

Opera released Opera 12, the new version of its web browser, on June 14, 2012. The new version includes preliminary support for WebRTC, which will eventually enable standards-based audio and video chat in web applications. It also supports the WebRTC media capture APIs, which allow web content to capture live media streams from the user’s microphone and webcam.

The WebRTC getUserMedia API works out of the box in Opera 12 and can be used by any website. Due to the potential privacy and security implications, the user is automatically prompted by the browser before the feature is allowed to be

3.4. Microsoft and WebRTC

"Microsoft, Internet Explorer has put its weight behind WebRTC, a plugin-free technology for voice and video communications in the browser. However, it proposed a different approach other than the one currently favored by other browser vendors, and warned against implementing the technology before there’s a common standard." Janko Roettgers

"Customizable, Ubiquitous Real Time Communication over the Web", or CU-RTC-Web for short, is Microsoft’s contribution to the W3C WebRTC working group. Microsoft have stated that they have been closely involved in the work on the WebRTC standard with both the IETF and W3C since 2010. Unlike the other browser companies, their work has been very quiet and not as publicly visible. This has now changed with the release of their version of WebRTC, CU-RTC-Web.

There are a number of reasons why Microsoft have taken a different approach, the most notable being the VP8 video codec that has been put forward as the default video codec. Microsoft have had, and still have, some issues with this codec, and they feel developers should not be tied down to individual codecs. They also have concerns about the predetermined way media is proposed to be sent over the network; they prefer a more flexible and customisable approach so that the technology can be implemented on legacy devices.

Aside from the issues with codecs and media management, Microsoft feel that all parties concerned will eventually agree on common standards. With CU-RTC-Web, Microsoft envisage that the walled garden approach of Skype will come tumbling down, allowing interoperability between Skype and other OTT operators such as Google Talk.

3.5. Apple/Safari and WebRTC

At present Apple play no part in the implementation or development of WebRTC in their Safari web browser, as they have a vested interest in their FaceTime walled garden of audio/visual communication and in the H.264 video codec, which isn’t part of WebRTC at present.

However, WebRTC4all, an open source project from Doubango Telecom hosted on Google Code, is an extension for Safari and other browsers. It allows developers to add WebRTC functions such as audio/video streaming in all browsers now, and to switch easily to the official implementation when it is added by Apple and others.

3.6. IETF & W3C

The organisations responsible for the WebRTC standards are the IETF, which is responsible for protocols and signalling, and the W3C, which will look after the standards for the APIs used by app developers. We will take a more in-depth look at these organisations and their roles later in this document.

4. Who will set the standards for WebRTC

The organisations which are responsible for the WebRTC standards are the IETF, which is responsible for protocol and signalling, and the W3C which looks after the standards for APIs for all app developers.

4.1. W3C (World Wide Web Consortium)

Web Real-Time Communications Working Group Charter (Formed 2011)

"The mission of the Web Real-Time Communications Working Group, part of the Ubiquitous Web Applications Activity, is to define client-side APIs to enable Real-Time Communications in Web browsers. These APIs should enable building applications that can be run inside a browser, requiring no extra downloads or plugins, that allow communication between parties using audio, video and supplementary real-time communication, without having to use intervening servers (unless needed for firewall traversal, or for providing intermediary services)

The working group’s scope: enabling real-time communications between Web browsers requires the following client-side technologies to be available:

  • API functions to explore device capabilities, e.g. camera, microphone, speakers (currently in scope for the Device APIs & Policy Working Group)
  • API functions to capture media from local devices (camera and microphone) (currently in scope for the Device APIs & Policy Working Group)
  • API functions for encoding and other processing of those media streams,
  • API functions for establishing direct peer-to-peer connections, including firewall/NAT traversal
  • API functions for decoding and processing (including echo cancelling, stream synchronization and a number of other functions) of those streams at the incoming end,
  • Delivery to the user of those media streams via local screens and audio output devices (partially covered with HTML5)"

The working group will not be looking at the protocols for WebRTC; these will be the domain of the IETF. (Source: W3C)

4.2. IETF (Internet Engineering Task Force)

The IETF creates standards that apply to the internet in order to improve its usability. The IETF RTCWEB working group was formed in April 2011 and is currently producing drafts as it moves towards a standard. The IETF is charged with setting all the protocols and signalling involved in WebRTC. All of the drafts can be viewed on the IETF site. (Source: IETF and WebRTC)

Key documents are:

  • draft-ietf-rtcweb-data-channel
  • draft-ietf-rtcweb-jsep
  • draft-ietf-rtcweb-overview
  • draft-ietf-rtcweb-rtp-usage
  • draft-ietf-rtcweb-security-arch
  • draft-ietf-rtcweb-security
  • draft-ietf-rtcweb-use-cases-and-requirements

5. The components of WebRTC

There are 3 components to WebRTC:

  • Audio
  • Video
  • Network

5.1. Audio

The WebRTC project offers a complete stack for voice communications. It includes not only the necessary codecs but also other components crucial for a great experience: software-based acoustic echo cancellation (AEC), automatic gain control (AGC), noise reduction, noise suppression, and hardware access and control across multiple platforms. The audio codec standards set by the WebRTC working group charter are iLBC and iSAC, both developed by Global IP Solutions. Global IP Solutions was purchased by Google in 2010, and since then Google have provided the audio codecs royalty free.

There is a third codec which many have been clamouring to have accepted into WebRTC: the Opus codec. The Internet Engineering Task Force standardized the Opus audio compression technology as RFC 6716 in early September 2012. The move paves the way for much broader use of Opus for anything from playing music to online voice chats. Opus is implemented in Skype, and Mozilla are now adopting it in their Firefox browser for WebRTC.

The iSAC audio codec

iSAC is a robust, bandwidth-adaptive, wideband and super-wideband voice codec developed by Global IP Solutions and used in many Voice over IP (VoIP) and streaming audio applications. iSAC is used by industry leaders in hundreds of millions of VoIP endpoints. This codec is included as part of the WebRTC project.


  • The sampling frequency is 16 kHz (wideband) or 32 kHz (super-wideband)
  • Adaptive and variable bit rate of 10 kbit/s to 52 kbit/s
  • Adaptive packet size of 30 to 60 ms
  • Complexity comparable to G.722.2 at comparable bit rates
  • Algorithmic delay of frame size plus 3 ms

The iLBC audio codec

iLBC is a free narrowband voice codec developed by Global IP Solutions and used in many Voice over IP (VoIP) and streaming audio applications. In 2004, the final IETF RFC versions of the iLBC codec specification and the iLBC RTP Profile draft became available. This codec is included as part of the WebRTC project. (Sources: webrtc.org, IETF Draft)


  • Bitrate of 13.33 kbps (400 bits, packetized in 50 bytes) for a frame size of 30 ms, and 15.2 kbps (304 bits, packetized in 38 bytes) for a frame size of 20 ms
  • Basic quality higher than G.729A, with high robustness to packet loss
  • Computational complexity in a range of G.729A
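
The iLBC bit rates quoted above follow directly from the packet size and the frame duration. The short sketch below reproduces that arithmetic; the function name `frameBitrateKbps` is a hypothetical helper chosen for illustration and is not part of any WebRTC API.

```javascript
// Bit rate of a fixed-size codec frame: payload bits divided by frame duration.
// frameBitrateKbps is a hypothetical helper used only for this illustration.
function frameBitrateKbps(payloadBytes, frameMs) {
  const bits = payloadBytes * 8;
  return bits / frameMs; // bits per millisecond is the same as kilobits per second
}

const ilbc30 = frameBitrateKbps(50, 30); // 50-byte packet every 30 ms
const ilbc20 = frameBitrateKbps(38, 20); // 38-byte packet every 20 ms

console.log(ilbc30.toFixed(2), "kbps"); // 13.33 kbps
console.log(ilbc20.toFixed(2), "kbps"); // 15.20 kbps
```

The shorter frame costs more bandwidth but adds less delay, which is the usual trade-off when choosing a frame size.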

Opus interactive audio Codec

The Opus codec is designed for interactive speech and audio transmission over the Internet. It is designed by the IETF Codec Working Group and incorporates technology from Skype’s SILK codec and Xiph.Org’s CELT codec.

The Opus codec is designed to handle a wide range of interactive audio applications, including Voice over IP, videoconferencing, in-game chat, and even remote live music performances. It can scale from low bit-rate narrowband speech to very high quality stereo


  • Bit-rates from 6 kb/s to 510 kb/s
  • Sampling rates from 8 to 48 kHz
  • Frame sizes from 2.5 ms to 60 ms
  • Support for both constant bit-rate (CBR) and variable bit-rate (VBR)
  • Audio bandwidth from narrowband to full-band
  • Support for speech and music
  • Support for mono and stereo
  • Support for up to 255 channels (multistream frames)
  • Dynamically adjustable bitrate, audio bandwidth, and frame size
  • Good loss robustness and packet loss concealment (PLC)
  • Floating point and fixed-point implementation

You can read the specification in RFC 6716.

5.2. Video

VP8, introduced in 2010 as part of the WebM project, is the video codec of choice for the WebRTC project. It includes components to conceal packet loss and clean up noisy images, as well as capture and playback capabilities across multiple platforms. WebM is an audio-video format designed to provide royalty-free, open video compression for use with HTML5 video. The project’s development is sponsored by Google Inc. (Source: WebM project)

The VP8 video codec

"VP8 is a highly efficient video compression technology that was developed by On2 Technologies. Google acquired On2 in February 2010. It is the video codec included in the WebRTC project. This is the first time a video codec that has been open sourced compares favourably with the industry standard H.264. In July 2010 the MPEG-4 Tech group carried out a comparison between H.264 and VP8. (Source: MPEG-4 Tech group) Every year, the MP4-Tech experts group compares every H.264 implementation in order to track performance and quality improvements. The Graphics and Media Lab of Moscow State University published a new, in-depth study of the performance of VP8, x264 and XviD implementations.

It’s unusual that Mpeg-4 tech group would test a codec other than h.264 but they did with VP8 and they prove that the results are respectable in many areas.

In HDTV for example, VP8 performed similar to x264 (considered the best implementation of h.264 by previous comparisons) but with 5-20% lower encoding speed. Comments from VP8 developers say that "old comparisons results have an inherent bias against VP8 because input sequences were previously encoded using another codec before being applied to VP8".

These results can improve very quickly with optimizations, and the Russian lab hasn’t yet tested implementations of VP8 other than the one provided by Google developers. Ronald Bultje, David Conrad, and Jason Garret-Glaser, x264 developers, are now creating a native VP8 video codec implementation for the open source FFmpeg project. This is the most in-depth study done to date; all other studies that compare the two codecs have been specific and not as wide-ranging.

The two may differ in results at times, but the difference is not visually apparent to the naked eye. The way seems wide open for the WebM (VP8) project to provide a truly free, high quality codec for the world." (Source: OS News) At the time of writing, however, the VP8 codec has still to be ratified.

The VP8 encoder and decoder are available from the WebM Project

VP8 Data Format and Decoding Guide from The WebM project

The H.264 video codec

H.264 is an ITU standard for compressing video based on MPEG-4. Taking advantage of today’s high-speed chips, H.264 delivers MPEG-4 quality with a frame size up to four times greater. It can also provide MPEG-2 quality at a reduced data rate, requiring as little as one third of the original bandwidth. H.264 is used in many video applications today, such as Blu-ray players and mobile devices.

H.264 Features

  • High compression performance
  • Advanced Intra-Prediction
  • Strong Motion Isolation (4x4, 1⁄4-pel resolution)
  • Multiple Reference Frames
  • Very High Complexity!
  • Weighted Bi-Prediction
  • Context-adaptive VLC/BAC
  • Average bit rate reduction of 50% given fixed fidelity compared to any other video standard
  • Exact match decoding
  • Integer Transform
  • Improved Perceptual Quality
  • In-Loop De-blocking Filter
  • Network friendliness
  • NAL (Network abstraction layer)
  • Enhanced Error Resilience

The biggest disadvantage of the H.264 codec with regard to WebRTC is that, unlike VP8, it is not royalty free. However, the VP8 codec is not an option on many existing smart phones and tablets. So what does this mean for WebRTC? Google and Opera are firmly in the VP8 camp, while Apple have a large vested interest in H.264. Microsoft is an unknown at present, although they do use H.264 in IE9, and if rumours are to be believed, Mozilla will use H.264 but offer the option of using VP8. A standard industry approach, called confusion. Looking at the lie of the land now, it seems we could end up with dual codec support. This would defeat the ideal of WebRTC: royalty-free codecs for everyone and interoperability.

5.3. Network

Dynamic jitter buffers and error concealment techniques are included for audio and video; these help mitigate the effects of packet loss and unreliable networks. Also included are components for establishing a peer-to-peer connection using ICE / STUN / TURN / RTP-over-TCP, with support for proxies. This technology comes in part from the Libjingle project.


"Libjingle is a collection of open-source C++ code and sample applications that enables you to build a peer-to-peer application. The code handles creating a network connection (through NAT and firewall devices, relay servers, and proxies), negotiating session details (codecs, formats, etc.), and exchanging data. It also provides helper tasks such as parsing XML, and handling network proxies.

Features of Libjingle:

  • A multi-user voice chat application
  • A multi-user video conferencing application
  • A multi-user live music streaming application
  • A peer-to-peer file sharing application

Libjingle is available on Google Code for both Windows and UNIX/Linux operating systems. The source code is provided as part of Google’s commitment to promoting consumer choice and interoperability in Internet-based real-time-communications. This code is made available under a Berkeley-style license, which means you are free to incorporate it into commercial and non-commercial software and distribute it." (Source: Libjingle project)

Libjingle was formed when Google and the XMPP Standards Foundation designed Jingle. Jingle is an extension of XMPP that adds peer-to-peer session control for VoIP and video conferencing. Google use the libjingle library to implement Jingle in their Google Talk client application. Google Talk provides both voice and text communication.


ICE and STUN are standardized methods for establishing a peer-to-peer connection on the internet, even if the two end points are behind private network addresses (NAT). At present Google’s stack deviates from the official standard; they are working to rectify this. Google will also support TURN servers to allow connections through tougher firewalls, where relaying and encapsulation are needed. Exactly what type of TURN will be supported has yet to be defined.


"ICE (Interactive Connectivity Establishment) is a protocol for NAT (Network Address Translator) traversal for UDP-based multimedia sessions established with the offer/answer model. ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN). ICE can be used by any protocol" (Sources: IETF/ICE, IETF/ICE/PDF presentation)


"Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs) (STUN) is a lightweight protocol that allows applications to discover the presence and types of NATs and firewalls between them and the public Internet. It also provides the ability for applications to determine the public Internet Protocol (IP) addresses allocated to them by the NAT. STUN works with many existing NATs, and does not require any special behavior from them. As a result, it allows a wide variety of applications to work through existing NAT infrastructure." (Source: IETF/STUN)


"If a host is located behind a NAT, then in certain situations it can be impossible for that host to communicate directly with other hosts (peers). In these situations, it is necessary for the host to use the services of an intermediate node that acts as a communication relay. This specification defines a protocol, called TURN (Traversal Using Relays around NAT), that allows the host to control the operation of the relay and to exchange packets with its peers using the relay. TURN differs from some other relay control protocols in that it allows a client to communicate with multiple peers using a single relay address.

The TURN protocol was designed to be used as part of the ICE (Interactive Connectivity Establishment) approach to NAT traversal, though it can be also used without ICE." (Source: IETF/TURN)
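
In practice a web application tells the browser which STUN and TURN servers ICE may use. The sketch below builds such a configuration object. It assumes the `iceServers` / `urls` / `username` / `credential` shape used by the W3C API as it was later standardized (early browser builds differed), and every host name and credential shown is a hypothetical placeholder, as is the helper name `buildIceConfig`.

```javascript
// Build an RTCConfiguration-style object listing STUN and TURN servers.
// buildIceConfig is a hypothetical helper; all hosts and credentials are placeholders.
function buildIceConfig({ stunHosts = [], turn = null } = {}) {
  // STUN servers let a peer discover its public address.
  const iceServers = stunHosts.map((host) => ({ urls: `stun:${host}` }));
  if (turn) {
    // A TURN server relays media when no direct path can be found.
    iceServers.push({
      urls: `turn:${turn.host}`,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

const config = buildIceConfig({
  stunHosts: ["stun.example.org:3478"],
  turn: { host: "turn.example.org:3478", username: "demo", credential: "secret" },
});
console.log(JSON.stringify(config, null, 2));

// In a browser the object would be handed to the connection constructor:
//   const pc = new RTCPeerConnection(config);
```

Listing the STUN server first is only a convention here; ICE itself decides which candidate pair to use, and falls back to the relay only when a direct route fails.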

6. The key features of WebRTC

6.1. The 3 key features of WebRTC

  • Media streams (getUserMedia): used to gain access to the user’s camera and microphone. It can also be used to design WebRTC applications.
  • Peer connection: the engine required to make high quality audio-visual calls on the web.
  • Data channels: the specification for data channels has yet to be ratified. It will be used for such things as file transfer, games, co-browsing, shared whiteboards, shared document editing and more.

Media Integration Coding examples

6.2. Media streams

A media stream represents a media source containing one or more synchronized media stream tracks. It can be converted to an object URL and passed to a <video> element. You use the getUserMedia API to get a media stream for the webcam and microphone. The user is then prompted to give their consent; an example of this can be seen in Opera 12, where a box drops down and requests permission from the user to allow use of their camera.
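
Because browsers have exposed media capture under different names over time, a page usually has to check which form is available before asking for the camera. The following is a minimal sketch of that check, not a definitive implementation: `resolveGetUserMedia` is a hypothetical helper, the prefixed names are the ones early Chrome and Firefox builds used, and the promise-based `mediaDevices.getUserMedia` form and the `srcObject` property in the usage note come from later versions of the W3C specification.

```javascript
// Return a function that requests camera/microphone access, or null when the
// browser-like object `nav` offers no capture API at all.
// resolveGetUserMedia is a hypothetical helper name.
function resolveGetUserMedia(nav) {
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === "function") {
    // Promise-based form from later versions of the specification
    return (constraints) => nav.mediaDevices.getUserMedia(constraints);
  }
  const legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (typeof legacy === "function") {
    // Callback-based form, wrapped so callers always receive a promise
    return (constraints) =>
      new Promise((resolve, reject) => legacy.call(nav, constraints, resolve, reject));
  }
  return null;
}

// Usage in a page (not executed here):
//   const request = resolveGetUserMedia(navigator);
//   if (request) {
//     request({ audio: true, video: true })
//       .then((stream) => { document.querySelector("video").srcObject = stream; })
//       .catch((err) => console.error("Permission denied or no device:", err));
//   }
```

The browser shows its consent prompt when the returned function is called, so the rejection branch is where a page should handle the user declining.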

Here are some examples of using getUserMedia. The photo booth app allows you to take a photograph of the video image and add different effects to it. This app is great fun for the consumer, but it also has many commercial uses. You could be looking to buy a car: the seller can show you the car live using a smart phone, and you can take snapshots of, say, the engine number or some damage that might require a repair quote. This is only the tip of the iceberg.

Another example is face recognition. We could do away with passwords for online banking, social media sites, online shopping or any interaction which requires a password to gain access to an account. These are only two examples of using getUserMedia; we can expect to see the number of these apps multiply daily. (Source: Media capture and streams, W3C)

Image source W3C

6.3. Peer connection

This is the engine behind making high quality audio/visual calls on the web. Peer connection allows us to take the media stream and send it across the web peer to peer. The actual code that implements PeerConnection is now a part of libjingle. While PeerConnection has no session protocol and no XMPP/Jingle is required, it reuses many useful components from the Libjingle package. (Source: WebRTC)

Peer connection is the API for establishing audio video calls (sessions).

Some built in features:

  • It establishes peer to peer links
  • Manages all the various audio and video codecs
  • Sets the encryption
  • Tunes the audio video streams to make better use of the bandwidth

Peer connection calling sequences

The original content for the following 4 sequence calls was supplied by

Start a call sequence diagram

This diagram takes us through the step by step sequence of starting a call.

Disconnect a call sequence diagram

This diagram takes us through the step by step sequence of disconnecting a call.

Receiving a call from a remote peer

This diagram takes us through the step by step sequence of receiving a call from a remote peer.

The remote peer begins the disconnection of a call

This diagram takes us through the step by step sequence of the remote peer initiating the disconnection of a call.
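
The four sequences above can be summarised as a small state machine. The sketch below is a simplified illustration under stated assumptions: it uses the offer/answer model described in the JSEP draft, the state names are those of the `signalingState` attribute in the W3C API as later standardized, and the action labels and function names are hypothetical. Provisional answers and rollback are left out.

```javascript
// Simplified offer/answer state table. Keys are the current state; each entry
// maps an action to the state that follows it. Action labels are hypothetical.
const TRANSITIONS = {
  "stable": {
    "setLocal(offer)": "have-local-offer",   // we start a call
    "setRemote(offer)": "have-remote-offer", // a remote peer calls us
  },
  "have-local-offer": { "setRemote(answer)": "stable" },
  "have-remote-offer": { "setLocal(answer)": "stable" },
};

function nextState(state, action) {
  if (action === "close") return "closed"; // either side may disconnect
  const next = (TRANSITIONS[state] || {})[action];
  if (!next) throw new Error(`"${action}" is not valid in state "${state}"`);
  return next;
}

// Apply a list of actions in order, starting from the idle "stable" state.
function run(actions) {
  return actions.reduce((state, action) => nextState(state, action), "stable");
}

console.log(run(["setLocal(offer)", "setRemote(answer)"]));          // stable
console.log(run(["setRemote(offer)", "setLocal(answer)", "close"])); // closed
```

Modelling the sequences this way makes invalid orderings, such as answering before any offer exists, fail loudly instead of silently.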

6.4. Data channels

A data channel is a peer to peer exchange of arbitrary application data. It has low latency, high message rate/throughput and optional unreliable semantics.

There are many potential use cases for the Data Channel API, including:

  • Gaming: if you need to send data such as positions and directions, it is more efficient to send it over a peer-to-peer connection than over HTTP
  • Real-time text: for example, sending code or a process to an engineer who is out on site
  • File transfer: no more having to drive to your accountant’s with boxes of paper
  • Remote desktop applications
  • Decentralized networks: you can communicate on a private encrypted channel

Key features of the Data channel API:

  • Leverages peer connection session setup
  • Multiple simultaneous channels, with prioritization
  • Reliable and unreliable delivery semantics
  • Built in security (DTLS)
  • Congestion control
  • Can be used with or without audio and video
  • Similar API to WebSocket
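
The choice between reliable and unreliable delivery is made when a channel is created. The sketch below maps a delivery style to an options object; the fields `ordered`, `maxRetransmits` and `maxPacketLifeTime` are taken from the W3C API as it was later standardized, while the style names, the 200 ms lifetime and the helper name `channelOptions` are hypothetical choices for illustration.

```javascript
// Map a delivery style to an RTCDataChannelInit-style options object.
// channelOptions and its style names are hypothetical.
function channelOptions(style) {
  switch (style) {
    case "reliable":   // in-order, retransmitted until delivered (file transfer)
      return { ordered: true };
    case "unreliable": // never retransmitted, may arrive out of order (game positions)
      return { ordered: false, maxRetransmits: 0 };
    case "partial":    // retransmitted only while the data is still fresh
      return { ordered: false, maxPacketLifeTime: 200 }; // milliseconds
    default:
      throw new Error(`Unknown delivery style: ${style}`);
  }
}

console.log(channelOptions("unreliable"));

// In a browser the options are passed as the second argument:
//   const dc = pc.createDataChannel("positions", channelOptions("unreliable"));
```

An unreliable channel suits data that is superseded quickly, because a late retransmission of an old position is worth less than the next update.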

The syntax is somewhat similar to WebSocket, with send() and onmessage, as you will see in the code sample below:

// PeerConnection setup and offer-answer exchange omitted
var dc1 = pc1.createDataChannel("mylabel"); // create the sending DataChannel
var dc2 = pc2.createDataChannel("mylabel"); // create the receiving DataChannel

// append received DataChannel messages to a textarea
var receiveTextarea = document.querySelector("textarea#receive");
dc2.onmessage = function(event) {
  receiveTextarea.value += event.data;
};

var sendInput = document.querySelector("input#send");
// send message over the DataChannel
function onSend() {
  dc1.send(sendInput.value);
}

At present the specification for data channels has yet to be ratified; here are some initial proposals and an interim report.

IETF/Data Channels

Data channels WebRTC

Interim report on data channels from Randell Jesup/IETF

7. WebRTC Technology

A number of elements are required for a client with real-time communication: a media engine, an interface and a framework. See figure 1, Media engine components.

The large rectangular area to the left houses the communications/collaboration GUI; this is the visual interface. The visual interface can be a hard interface such as a tablet screen, phone keypad, PC or any other device. The media engine is housed in the largest rectangular area; its main function is to manage the real-time transmission and receipt of a video/audio stream. The rest of the diagram contains the framework. (Image source: PKE)

Here are the set of functions which enable the media engine to deliver and receive high quality sound.


  • Setup and control the hardware
  • RTP, compression, encryption, statistics, etc.
  • Produce low-latency audio from microphone
  • Conceal loss, de-jitter and play audio from the network
  • Cancel echo, VAD, reduce noise, etc.
  • Manage codecs
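
The "conceal loss, de-jitter" step in the list above can be pictured as a buffer that reorders packets and flags gaps. The sketch below is a deliberately simplified illustration and not how the WebRTC media engine is implemented: the class `JitterBuffer` is hypothetical, it uses a fixed depth rather than adapting dynamically, and it ignores sequence number wraparound.

```javascript
// Minimal reordering jitter buffer: packets are stored by sequence number and
// released in order once `depth` packets are waiting. JitterBuffer is a
// hypothetical class; real engines adapt the depth to network conditions.
class JitterBuffer {
  constructor(depth) {
    this.depth = depth;
    this.packets = new Map(); // sequence number -> payload
    this.nextSeq = null;      // next sequence number to play out
  }

  push(seq, payload) {
    if (this.nextSeq === null) this.nextSeq = seq; // first packet sets the start
    if (seq >= this.nextSeq) this.packets.set(seq, payload); // too-late packets are dropped
  }

  // Returns the next payload in order, null for a missing packet that the
  // caller should conceal, or undefined while the buffer is still filling.
  pop() {
    if (this.packets.size < this.depth) return undefined;
    const payload = this.packets.has(this.nextSeq) ? this.packets.get(this.nextSeq) : null;
    this.packets.delete(this.nextSeq);
    this.nextSeq += 1;
    return payload;
  }
}

const buffer = new JitterBuffer(2);
buffer.push(1, "a");
buffer.push(3, "c"); // arrives early
buffer.push(2, "b"); // arrives late but still in time
console.log(buffer.pop(), buffer.pop()); // a b
```

A deeper buffer tolerates more reordering at the cost of added delay, which is why real engines adjust the depth as network conditions change.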

Here are the set of functions which enable the media engine to deliver and receive high quality vision.


  • Render video, capture camera input
  • Video processing (blue screen, gamma, etc.)
  • Conceal loss, de-jitter and play video from the network
  • Cancel echo, VAD, reduce noise, etc.
  • Manage codecs
  • Bandwidth Management

The main aim of WebRTC is to combine the media engine and a set of standard APIs with the result being a browser capable of real time communication. In figure 2 you can see the internal workings of a media engine.

Image source WebRTC

The WebRTC media engine uses a set of standard components, including codecs, to minimize the issues of two WebRTC end points communicating. It also includes a set of standard APIs so that a server the browser connects to can control the WebRTC media engine in the client. Beyond the basic media functions, WebRTC includes an API set that enables the controlling server software to cause a direct connection between two WebRTC devices without any other external signaling. By merely telling two WebRTC devices to communicate, the server can initiate IP-based voice or video communications. (Source: PKE Consulting)

Image source Silvia Pfeiffer

Sample code

WebRTC native APIs

Sample client application

Libjingle source code

8. WebRTC and the communications industry

This section looks at VoIP and video conferencing and how they will have to adapt and change with the advent of WebRTC. It also looks at existing audio-visual communication companies such as Skype.

8.1. VoIP

There are three main components to VoIP:

  • Network
  • Signaling
  • Media


The network gets the data from one point to the other. It has its own characteristics, which affect what types of signaling and media are used. In the case of the internet, we can generally speak about these characteristics:


  • No guaranteed quality of service: data sent might not be received
  • No guaranteed latency: data sent may arrive with different delay characteristics
  • Heterogeneous in nature: different parts of the network behave differently (WiFi, LAN, Ethernet, MPLS, LTE, etc.)
  • Asymmetric: NAT and firewall devices may restrict reaching certain addresses


Signaling is the addressing part of VoIP.

Signaling includes:

  • Registration and discovery - how do I tell the world how to reach me?
  • Dialling - how do I dial, receive calls, drop them, etc.
  • Capabilities - how do participants in the call understand what each side is capable of?
  • Supplementary services - the usual suspects: hold, mute, transfer, forward, park, conferencing.

All of the above must be done in the same language by every participant: all of them must know how to communicate in the same protocol, for example SIP, H.323 or XMPP.
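
The capabilities step in the list above can be illustrated with a short sketch. It assumes that each side simply advertises the codec names it supports; the `CallCapabilities` shape and both helper functions are hypothetical, and real protocols such as SIP carry this information in SDP rather than in plain arrays. The endpoints keep only what both of them understand:

```typescript
// Hypothetical capability record; not taken from any signaling specification.
interface CallCapabilities {
  audio: string[]; // codec names, most preferred first
  video: string[];
}

// Keep the caller's preference order, but only the codecs the callee also supports.
// Codec names are compared case-insensitively.
function commonCodecs(offered: string[], supported: string[]): string[] {
  const lowered = supported.map((codec) => codec.toLowerCase());
  return offered.filter((codec) => lowered.indexOf(codec.toLowerCase()) !== -1);
}

function negotiate(caller: CallCapabilities, callee: CallCapabilities): CallCapabilities {
  return {
    audio: commonCodecs(caller.audio, callee.audio),
    video: commonCodecs(caller.video, callee.video),
  };
}

const callerCaps: CallCapabilities = { audio: ["opus", "iSAC", "iLBC"], video: ["VP8"] };
const calleeCaps: CallCapabilities = { audio: ["iLBC", "Opus"], video: ["VP8"] };

// agreed.audio is ["opus", "iLBC"] and agreed.video is ["VP8"]
const agreed = negotiate(callerCaps, calleeCaps);
```

An empty result for audio or video would mean the two sides share no codec for that medium, which is exactly the situation a common standard set of codecs is meant to avoid.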

You can also look at signaling from another angle, namely the non-functional features it needs to provide:

  • Networking: IPv4, IPv6, UDP, TCP, etc.
  • Security: things like privacy, authentication, etc.
  • Connectivity: ability to connect endpoints no matter where they are. This focuses mainly on NAT traversal issues.

The non-functional features also fit in with the requirements for the media.


Media is what we are here for. The moment enough knowledge exists between the participants of the call, media kicks in and starts to flow between them to give us our call.

Media is usually thought of as the voice and video codecs along with their transport mechanism. In recent years, vendors have wrapped media into components called media engines or media frameworks, which take care of the codecs and their transport. Like everyone else, these vendors tried (and are trying) to move up the food chain, and for them this means adding some signaling features to these media engines. It works well most of the time, as it eases the integration and development efforts of application vendors.

8.2. WebRTC and VoIP

WebRTC is a media engine that targets browsers. It offers a JavaScript API that is being standardized, so it will eventually be available to all web developers.

WebRTC also includes a bit of signaling, in the form of the non-functional NAT traversal mechanisms STUN, TURN and ICE. Luckily, these three mechanisms are also the ones defined for NAT traversal in both SIP and XMPP. Bloggeek
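
To make the roles of the three mechanisms a little more concrete, here is a minimal sketch of how an application might describe its STUN and TURN servers. The host names and credentials are hypothetical, and the `IceServer` type is declared locally (modelled on the `iceServers` entries of the W3C peer connection configuration) so that the sketch does not depend on a browser:

```typescript
// Declared locally so the sketch also runs outside a browser; the fields
// are modelled on the `iceServers` entries of the peer connection configuration.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

type IceServerKind = "stun" | "turn";

// STUN lets an endpoint learn its public address; TURN relays media when
// no direct path exists. ICE is the procedure that tries both.
function iceServerKind(url: string): IceServerKind {
  const colon = url.indexOf(":");
  const scheme = (colon === -1 ? url : url.slice(0, colon)).toLowerCase();
  if (scheme === "stun" || scheme === "stuns") {
    return "stun";
  }
  if (scheme === "turn" || scheme === "turns") {
    return "turn";
  }
  throw new Error("not a STUN or TURN URL: " + url);
}

// Hypothetical hosts and credentials, for illustration only.
const iceServers: IceServer[] = [
  { urls: ["stun:stun.example.org:3478"] },
  { urls: ["turn:turn.example.org:3478"], username: "demo-user", credential: "demo-secret" },
];

// One list of kinds per configured server.
const serverKinds = iceServers.map((server) => server.urls.map(iceServerKind));
```

In a browser, an array of this shape is what would typically be handed over as `iceServers` when the peer connection is created; the browser then runs ICE itself, so the application never has to implement NAT traversal by hand.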

When Google purchased GIPS and open sourced its code, the VoIP industry had to look for alternatives. GIPS was the largest voice engine company around and almost every VoIP developer had some dealings with it.

With Google having open sourced the code, there was no real way for a customer to get an SLA from Google for maintaining and improving the GIPS voice engine. This is an issue, as a lot of GIPS customers used custom-built voice engine packages specific to their clients’ needs.

The VoIP industry can still continue, but it will have to adapt. It will have to include the audio codecs of WebRTC, which will be awkward but not too difficult. Vendors can take WebRTC and embed it into their own products, web based or not. Gateways will need to be developed to connect the world of WebRTC with that of SIP and the PSTN. The two worlds can live together once the industry adapts and evolves.

8.3. Skype

Skype was launched as a start-up in 2003 and to date it has more than 100 million active users. Microsoft purchased Skype in 2011 for an astonishing $8.5 billion, which at that user count works out at roughly $85 per active user.

Presently Skype are running an ad campaign called "It’s time to Skype". One of the more effective ads is titled "When did it become okay to text Mum Happy Birthday?". Skype believe that face-to-face communication is being lost in this digital age and needs to be brought back.

While Skype may view the advent of texting and instant linkage through social media as a threat to its market share in communication, it may have a bigger threat on the horizon: WebRTC. Will Skype simply become obsolete? Microsoft state they did not purchase Skype for what it is but for what it will become. Three things have occurred recently which lend credence to that statement. First, Skype have started to adopt the VP8 video codec, which at present is the codec of choice for WebRTC. Secondly, they have been actively trying to recruit developers to build an app which will incorporate WebRTC and Skype. The app will allow communication between their legacy customers and WebRTC. The third move was partnering with Facebook to offer Skype calling through Facebook. In addition, in early September 2012 Skype announced that it is to start using the Opus audio codec. Many are looking for this codec to be implemented as standard in WebRTC. Also in early September, the IETF announced the standardization of the Opus codec in RFC 6716.

These moves indicate that Skype is heading towards a purely web-based service, but more importantly they show that Skype are not afraid to change to fit the current or future market. And at present the biggest change of all is WebRTC.

Skype can and will adopt WebRTC when the time comes. They will make the necessary changes in their network architecture to fit WebRTC right into their business plan and continue to grow.

8.4. WebRTC Disruptive Technology for Enterprise Communications.

This is a newsletter from the STC (Society of Telecommunications Consultants) giving their view on WebRTC's impact on their industry. The article is written by Chris Vitek, a board member of the STC.

"The ubiquitous adoption of true peer-to-peer voice and video communications is upon us. It will be free, easy-to-use and compatible with SIP via SDP. WebRTC has been embraced by the IETF and the W3C just as SIP was 16 years ago. The ITU may have to sit this one out for a while since the architecture does not lend itself to centralized control. Which is the precise issues that contact centers and enterprises will need to address if they want to compete for the modern, smart-phone or tablet wielding consumer.

To help put WebRTC in perspective I have chosen to use some common use-cases for the technology.

Web-page initiated communications. With traditional tools there is a need to download an application that will support live communications. Multiple clicks and loading a new application on your PC is not the stuff of convenience, not to mention the enterprise security issues that come into play. HTML 5 has gone a long way to minimize the number of clicks, but it still requires the user to download the application. WebRTC is different. With WebRTC the voice/video application will be embedded in the browser (called via Java Script) so no download is necessary. The impact on the user is that a voice/video communications session can be initiated with a single click. Fast, easy-to-use and convenient and it will work the same way from a smart-phone, tablet or computer, except for Apple (more about Apple later ). The simplicity of this approach will drive adoption by consumers. Imagine a LinkedIn hyperlink that connects you live with your business contacts. No more 10 digit dialing, no more phone directories and it works the same way to-and-from all of your devices. Now imagine an e-mail shipping confirmation that has embedded links for the customer service contact center. If the package does not arrive on time the customer simply clicks the hyperlink and connects with a representative to get information about the status.

The use cases are compelling for both personal and enterprise use; however, the enterprise implementation of this technology can only be described as disruptive. Traditional communications are routed through a public switched telephone network (PSTN) using route selection based on a multi-digit, DTMF dialing plan. WebRTC communications are not. They are routed via TCP/IP and DNS. Traditional trunks are not necessary nor are traditional switches. There will be a period of years where enterprises need to support traditional and WebRTC communications. This will be a time of great transition. Should the enterprise continue to support a traditional infrastructure? Should they explore SIP switches? Are the current crop of SIP switches compatible with WebRTC? Does the web team at any given enterprise understand the engineering issues surrounding WebRTC? The result of answering any of these questions wrong could cripple a B-2-C company in a matter of months.

Consider the office products business. 12 years ago 99% of their orders came through telephone conversations or fax. Today, 70% of orders come through the web. If an order goes bad, then the customer has to send an e-mail or talk to customer service. The latter case requires that the customer picks up the phone and dials a number. Then, the customer has to identify themselves, the order and the item. This all takes time. In the WebRTC model, the confirmation e-mail can have embedded links next to each item. These links can offer options to e-mail or call customer service about the item. If a call is initiated, then information about the call (item SKU, order number, customer identifier) arrives at the contact center before the call arrives. Routing to the right agent and CTI are triggered by the associated data and the customers can solve their problems more quickly. These communications will be supported by SDP compliant, peer-to-peer signalling; however, they will not be handled by a telephony carrier, they will communicate directly with the IP enterprise infrastructure. Basically, virtual SIP-like sessions established on the enterprise Internet infrastructure. Network security folks will have a significant hand in architecting these solutions. Where to place and how to configure the SBCs, WebRTC gateways, SDP protocol conversion, firewalls, proxies, web services, user agents and/or SIP switch (ACD) will all come into play. Further, there will be a need to rethink processes in terms of this new communications infrastructure. Some of the most complex processes that customer service operations support today will become automated in ways that we never imagined.

WebRTC is available today on Google’s Chrome browser (Alpha) and Opera Mobile. Mozilla/Firefox has an active development project underway. Microsoft has publicly supported the effort, but they have a lot on their plate with the roll out of Win 8 later this year. Voxeo has an SDK available. Apple offers a competing solution in the form of Face Time.

Skype took nine years to acquire 700 million users and it requires a down-loadable application. Without the need for a download, WebRTC has the potential to triple the number of voice end-points in the public network within two to three years. At the same time it will create an order of magnitude increase in video end-points. Ubiquitous availability of WebRTC will drive adoption at a much faster rate than Skype. Contact centers and web developers will need to collaborate more than ever to leverage this new opportunity." Chris Vitek

9. Possible applications for WebRTC

Here we take a look at a small selection of possible use case scenarios.

  • Have you ever played an online game where you found yourself screaming at your team mate or opponent, but to no avail, because the technology doesn’t allow for audio/visual interaction while playing online? WebRTC will give you that capability. You can play a game where you see the video and hear the audio of your opponent or team mate, while the game itself is played live over WebRTC data channels.
  • Do you find looking for the right employee time consuming and a drain on your resources? You have to put an interview panel together or use the services of an employment agency, and this all takes time and money to organise. We have all heard of companies like Microsoft using LinkedIn to find the right type of employee. That is an excellent use of available resources, but what if we could take it one step further? Initial contact is made by Microsoft’s HR department using LinkedIn and WebRTC peer-to-peer communication, so they can filter out the people who don’t meet their criteria without any costly interviews. Once they have the people they think can suit their needs, the panel interviews them. But WebRTC is not finished: all the participants can take part in the interview from their office or home, and with file sharing, and soon screen sharing, any tasks the interviewee is asked to complete during the interview can be viewed by all the panel members on their screens. The monetary cost and the time it normally takes to fill a position are drastically reduced. This is all made possible by using WebRTC.
  • So, you’re self-employed, maybe with a couple of employees, or you just work on your own. You are normally on site from early morning and not home until late the same evening. The tax returns are due but you can’t afford the time to drive to your accountant’s and lose a day’s work. But you have WebRTC on your desktop PC, so you have peer-to-peer communication, file sharing and screen sharing. Instead of you going to the accountant’s office, they visit you at your office via WebRTC in the HTML5-enabled browser on your desktop PC. Now the accountant can guide you through your tax returns by talking you through the online process and organising the files you have shared with them during the call or, if need be, you can share your screen so they can confirm that your online submission to the tax authorities is correct. The process has taken only a small amount of time, so you are still available to work on site that same day.
  • You purchased a smart TV a few months ago and you have just received an e-mail offering a range of new services. These require a firmware update and, not being technically minded, you will need help with the upgrade. A call-out fee of one hundred euro from the retailer is much too expensive. So what to do? Press the help button on your remote control. The browser opens on your TV and asks whether you require assistance from technical help; if so, press select. Now, like magic, a real person appears on your TV screen and talks you through the process. He can show you which buttons to press on your remote, as he has the same remote in his hands. In the space of a few minutes, and at no expense to you, the upgrade is complete and you can avail of all the new services.
  • You really need to talk to your friend face to face about your school project but you live too far away. WebRTC is the answer. Open your Facebook account, select the audio/visual call button, then choose the friend you wish to talk with. They accept the request and now you have one-to-one audio-visual communication. With WebRTC file sharing, you can send and receive the project notes while still discussing the project. This form of integrated communication will take social media to a whole new level.
  • No more having to drive long distances to the monthly sales meeting. Now you have video conferencing and file sharing from inside an app. The office calls its sales representatives from inside the app and, while they are talking, shares files with them over the data channel of the same peer-to-peer connection. This saves valuable sales time and in turn creates a better profit margin. All of it is done in real time across multiple HTML5-enabled browsers.
  • You are a paramedic or doctor attending a serious road traffic accident. A driver is trapped in his car and needs to be taken to hospital urgently, but this requires an on-site operation of which you have no experience. With a smart phone or tablet using WebRTC, an off-site surgeon based anywhere in the world can direct you as he views the situation and talks you through the procedure, thus saving the driver’s life.
  • The area where you live can become isolated in the winter months, leaving you unable to leave your house to go to school or college. But you still have deadlines to meet, such as a project which requires the implementation of an application. The lecturer has to interview you about the project, view your code and talk through it, and finally see the application running on your desktop. With WebRTC peer-to-peer communication, file sharing and screen sharing, all of this is possible, so you never have to fall behind the other students or lose valuable marks for a late presentation.
  • Dating sites can now introduce possible couples to one another using WebRTC, negating the security risks of a blind date. The first meeting can take place in the comfort of your own home, where everybody is at their most natural. No costly dinners or new clothes, and if you don’t like them, just end the call.
  • This use case came to mind when the local priest was visiting my father at home. He bemoaned the fact that he now had to look after a number of small parishes while trying to call out to the old and sick for confession or communion, and time was against him. Now WebRTC can’t do a whole lot about the communion, not yet anyway. But confession can be facilitated quite easily with WebRTC. Yes, you will say the elderly are not particularly computer savvy, but they don’t have to be. They press the power button on their PC, they click on their web browser, they wait, the priest calls, they are asked whether they want to accept the call, they press yes, and communication can begin. It really is that simple.
  • Adult entertainment industry. There is no need to expand on this use case, other than to say the possibilities are endless.

10. Current innovators of WebRTC applications

Here we will take a look at some of the current and up-and-coming WebRTC innovators in the communication market. We will also have a look at some fun interactive games which highlight the different possibilities of WebRTC. Although WebRTC is being driven by the large browser companies, app development is coming from small to medium-sized companies.

10.1. Current Innovators

  • Asterisk/Digium (Open Source PBX) is free, open source software that makes it simple to create and deploy a wide range of telephony applications and services. They are adding WebRTC to the Asterisk project which means that with a software update to your Asterisk PBX, your smart-phones, tablets, laptops and desktops can become endpoints for all your PBX features and services. Digium discusses WebRTC and Asterisk Digium
  • Bistri (Social Video): Video chat with fun video effects, take screen shots of calls, share them with friends or social networks. Bistri runs in the browser, so there’s no need to install additional software or plugins. Bistri
  • frisB (Free global calling) provides free global calling between any web browser and any phone (or web browser) with no downloads. Simply type in the number you want to call from the frisB screen in a browser that implements WebRTC. The service then calls your contact but does not connect, so a local number is available to call back in the contact’s phone call history. The contact then calls back this local number; for the initial use case, the contact needs to know it’s you that’s calling using frisB. The frisB service then takes the contact’s local call and directly connects it to you on the browser (wherever you may be in the world). It’s a two-step connection process between a browser and a phone, but does work with any phone on the planet. frisB
  • Tenhands (Enterprise HD Video Collaboration) is a free desktop HD video collaboration service built for business needs. They’re in the process of adding WebRTC to their service, so using it will not require any download to a browser. Tenhands
  • Utribo (Software as a Service): "Connect" by Utribo is a service that enables subscribers to receive calls made in a web browser to their computer, phone, or PBX. Based on WebRTC, "Connect" provides voice and video calling capabilities. Utribo
  • Voxeo Labs (Open source enabler for WebRTC services): Voxeo’s Phono is a jQuery plug-in that turns any Web browser into a multichannel communications platform, capable of placing and receiving VoIP telephone calls from the browser, as well as handling real-time chat communications. jQuery is a cross-browser JavaScript library, so developers can do more and code less. Phono is a client-side solution and requires zero server-side logic on the part of a developer; all communication is handled by the Voxeo Cloud. Using the Phono plug-in, applications such as Your Second Phone have been created and are available in the Chrome Web Store. Voxeo Labs
  • is a simple service that lets you make voice and video calls straight from your browser. No downloads, installations or firewall configurations are required. also offers chat (with in-line media, so your shared content loads right beside your conversation), history, SMS, and call options like Quick Call and Group Call. Previously they required a Flash plug-in to run from a web browser, but with WebRTC no download is required. They’re available in the Chrome Web Store.
  • Zingaya ("Call" button for websites) enables voice calls through any computer from a web page. No download or phone is required. Zingaya offers this seamless voice calling capability to any website, whether it’s a large e-commerce enterprise or a start-up. Simply embed a "Call" button into the website. Visitors can click that button and the call is forwarded to the website operator’s preferred land-line or mobile phone. All that is required is a website; all the visitors need is a browser and microphone. Zingaya
  • Doubango Telecom: After the world’s first SIP video clients for Android and iOS (early 2009), Doubango Telecom are proud to present the sipML5 project. sipML5 is the world’s first open source HTML5 SIP client written entirely in JavaScript for integration in social networks (Facebook, Twitter, Google+), online games, e-commerce websites… No extension, plugin or gateway is needed. The media stack relies on the WebRTC project. The client can be used to connect to any SIP or IMS network from your preferred browser to make and receive audio/video calls and instant messages. Live demo
  • Vline: A cloud video conferencing platform for developers. VLine enables web and mobile developers to integrate high quality, video conferencing into their applications without using Flash or plug-ins. The product provides the necessary tools and cloud services for developers to add instant, live video capabilities into the browser, facilitating Real Time

Original source: Alan Quayle, No Jitter

10.2. Ones to watch out for.

  • Hookflash: Have produced an excellent iPad app which integrates LinkedIn’s directory, giving business users a free over-the-top alternative for voice, HD video and messaging. They are now looking for developers to take this app and redevelop it using WebRTC protocols. Hookflash
  • basysKOM: Are currently working on WebRTC-based video chat as a plugin for the set-top box and TV solution Qt Media Hub. See their latest demonstration. basysKOM
  • Twinsee: Are developing a WebRTC app for mobile and web that gives the user real-time communication. The app will be specially geared towards the older generation, allowing them to interact with their family and friends. Twinsee

10.3. WebRTC/Games

Some fun and interesting prototype games using WebRTC. Check out Protothon Blog for some more ideas using WebRTC.

  • itspuzzlible: A live-video puzzle game. A leap into the future of social in-browser gaming. There are 2 scrambled screens of real-time video. The aim of the game is to reorganise the scrambled video stream whilst your opponent tries to outwit you by moving around and confusing you. A truly fun, truly interactive WebRTC puzzle. itspuzzlible
  • We Blend: A live, real-time distortion tool so friends can play with friends. From far away. An elegant, poetic and addictive take on the uses of WebRTC. We blend
  • Team Charades: A multiplayer, real-time, online classic. Built by a team of international masters. Team charades is a game in which teams of two compete to guess a number of secret phrases. The phrases are revealed by the app and given to a player in each team. Charades
  • PingPong: A 3D WebRTC ping-pong game. By using colour tracking PingPong turns a post-it note into a table-tennis paddle. A fun, real-time video game of PingPong! Real-Time Colour Tracking. Ping pong

11. The benefits of WebRTC

  • Voice becomes just like all your other communications: organized into your preferred social or office tools.
  • As long as you’re data connected, communications is in the cloud. People need only break out to PSTN (Public Switched Telephone Network) when the other person is not data connected, or the call quality is too low due to their Internet connection. The PSTN becomes the communications path of last resort.
  • The company’s website now becomes its call center front end. A web-log becomes your personal communications assistant.
  • Communication service aggregators save customers running multiple clients on their phone; these would run in the cloud and be controlled from the browser.
  • Click to call doesn’t require an operator’s voice network, just access to the Internet.
  • Communications becomes like using any application on a smart-phone; users can add features or capabilities or people throughout a call; e.g., N-way calling finally becomes simple and obvious with a simple point and swipe.
  • New CRM (Customer Relationship Management) methods: click from email, from web page, from app, from TV. The ability to communicate becomes embedded in most transactions.
  • QoS (Quality of Service) remains an issue, but as the people using Vonage and Skype over the years will attest, QoS is rarely an insurmountable obstacle.
  • Directory services become critical sources of value in connecting all the different IDs: telephone numbers, SIP IDs (IDentifier), web session IDs, other OTT (Over The Top) IDs (e.g. Skype name), etc.
  • All the OTT communication applications e.g. Viber/Skype/Whatsapp can now be used everywhere: on your PC, smart phone, tablet, and TV. Breaking free from the app store allows them to explore new business models, just like the Financial Times did in breaking free of the app store in using HTML5.
  • The new communication services that telcos are building on top of RCS (Rich Communication Services) will need to interoperate with the WebRTC world; currently WebRTC is a closed book to RCS.
  • VAS (Value Added Services) leaves telco. Any web developer can create new communication services that deliver value and solve problems for customers. It is the customer who will decide.
  • Advertising finally enters the communications space, opening up business model innovation.
  • Gaming becomes even more interesting, as any connected device with a camera can become a controller using gesture controls, as well as the more traditional methods for network-based games. Alan Quayle

12. Conclusion

Up until now you could do almost anything in your browser except real-time video calling. Of course you could use Flash, but Flash is not available on all devices or platforms. WebRTC gives you real-time communication in the browser and will reduce the need for Flash.

Releasing the audio and video codecs under a very lenient open source license has made it very attractive to place WebRTC into commercial products. With Google, Microsoft, Opera and Mozilla backing WebRTC, it would seem that all is well in the browser world. But until all the standards have been set by the W3C and IETF, and adopted by all interested parties, we won’t celebrate the dawn of a new communication phase. This has become all the more apparent in recent weeks with Microsoft releasing their suggested version of WebRTC, called CU-RTC-Web. All we as end users can hope for is that some common sense will prevail, sooner rather than later.

Aside from all the political posturing, WebRTC is here to stay. It will bring great innovation, vast commercial opportunities and most importantly real time communication for everyone.


AEC: Acoustic echo cancellation

AGC: Automatic gain control

API: Application programming interface

CBR: Constant bitrate

CELT: Constrained energy lapped transform

DTLS: Datagram transport layer security.

DTMF: Dual tone multi frequency

GIPS: Global IP Solutions

HD: High definition

HTML: Hypertext mark-up language

HTTP: Hypertext transfer protocol

ICE: Interactive connectivity establishment

IETF: Internet engineering task force

iLBC: Internet low bitrate codec

iSAC: Internet speech audio codec

ITU: International Telecommunication Union

JSEP: JavaScript session establishment protocol.

LAN: Local area network

LTE: Long term evolution

MPEG: Moving Picture Experts Group

MPLS: Multi protocol label switching

NAT: Network address translator.

OTT: Over the top

PBX: Private branch exchange

PLC: Packet loss concealment

PSTN: Public switched telephone network

RFC: Request for Comments

RTC: Real time Communication.

RTP: Real-time transport protocol.

SDP: Session description protocol

SIP: Session initiation protocol.

SLA: Service level agreement.

STC: Society of Telecommunications Consultants

SRTP: Secure real-time transport protocol

STUN: Session traversal utilities for NAT

TCP: Transmission control protocol

TURN: Traversal using relays around NAT

UDP: User datagram protocol.

URL: Uniform resource locator

VAD: Voice activity detection

VBR: Variable bitrate

VoIP: Voice over internet protocol

VP: Video protocol

VP8: Video codec

W3C: World Wide Web consortium

XML: Extensible mark-up language

XMPP: Extensible messaging and presence protocol

