NETINT Technologies about

Good morning. We are here for another episode of Voices of Video, and, as always, we are bringing you the foremost experts and, I would even say, interesting people who are doing really fun and cool things in video. This week we are not going to disappoint you, I promise. With that, I want to welcome Tsahi. Thank you for joining us. It's great to have you here. So let's just jump right into this conversation. But first of all, tell us who you are and how it is that you began to work in WebRTC.

So, Tsahi Levent-Levi, BlogGeek.me, for those who don't know, and also senior director of product management at Cyara. I guess I started doing voice and video over IP years ago, from the age of 21; that was 25, 26 years ago. I've been doing that as a developer, then project manager, product manager, and then CTO. I was CTO of one of the business units at RADVISION back in the day, and then WebRTC came out. When it came out, that was 2011, I looked at it and I said, well, something is going to change. I went back and said we need to invest in this. I gave four different angles to invest in; one of them was WebRTC: [inaudible] a person, and we decide what we want to do with it. They said no budget, and then they said, tell us what we're going to do next, so we can use the budget. And I said, well, that's not how you build the future as a CTO, and I left. When I left, I opened BlogGeek.me as a website just to write the things that I wanted to write about, and for one reason or another it became the place for people to read about WebRTC, unintentionally, and that's the case even today. A few years later I co-founded testRTC with two other people, a company that does WebRTC monitoring and testing. The company was acquired by Spearline in 2021, and then Spearline was acquired by Cyara in 2023, which is where I am now, with testing and monitoring. And
other than that, I've got my courses on WebRTC and a bit of consulting around WebRTC. So that's where I come from to this specific space.

Yeah, it's great. I've heard you tell the story before, of when Google first introduced WebRTC as a protocol. And we're going to jump right in here to what WebRTC really is, because I think, on the surface, all of our listeners are saying, well, of course I know what WebRTC is. But it's a protocol; it's not actually an application, even though it gets talked about a lot. I've heard you tell the story that in 2011 you read about it, you learned about it, and you said the world's going to change in terms of video, and it certainly has. So why don't we start there? Go ahead and assume that maybe not everybody is an expert in understanding the WebRTC protocol and what the components are.

OK, so let's start from the beginning. OK, we actually need to stop, because it's Israel and there is a siren.

Oh, OK.

So we can continue in about two, three minutes, but...

Yeah, yeah. Well, you take care of yourself.

I'm all right, but we need to close here and stuff.

Yeah, no, you take care of yourself. OK. Well, this is live, and we here on Voices of Video absolutely stand with Israel. Tsahi is joining us from Israel, so we want to take this very seriously. While we're waiting for Tsahi to rejoin us, I might make a few comments. For those that see me on video, you know that my background's a little bit different, and even blurred you can probably tell I'm in a hotel room. I am here in San Francisco, where Demuxed just wrapped up. If any of you were at Demuxed and we
didn't get to meet, shake hands, or get introduced, then apologies, because we would love to meet all of you. But what an amazing conference. If Demuxed has been on your short list of must-attend conferences, I really can't recommend it highly enough. It's two days of very interesting talks. It is an engineering conference; it is for engineers. And it's a wonderful time of networking. Like I said, there are a lot of really great conversations, and some really interesting themes are coming out, both at Demuxed and at other venues at the moment, and they actually play into our conversation right now. There's one very common theme: anybody who isn't taking a very hard look, or a sharp pencil, to their operational cost, well, it's coming. Get ready; you might as well start now. I think everybody's already well on the way. And then there is this whole move to live workflows, and of course ultra-low latency plays into this, and obviously that's where WebRTC comes into play. But this shift to live, and then reducing cost... Hey, Tsahi.

All right, hello. Good. OK, that's live here.

Yeah, this is live. I felt like a news reporter. I'm like, well, let's see, let's talk about Demuxed, and, you know, fill in there. So anyway, welcome back. We had just started. So, WebRTC.

Yeah, right. So, WebRTC. When you look at WebRTC, the first thing is that it's two separate things, by the way; it's not one. The first one is that there is a protocol stack, which is a standard specification of what WebRTC is. (People are getting out of the room now.) And the second thing is that it's actually the implementation by Google itself of that protocol stack. That implementation today is called libWebRTC, and this is what is currently implemented in all browsers. So
if you use Chrome or Firefox or Edge or Safari, they all use the same implementation, with different wrappers around it, but at the end of the day this is the main implementation. There are others, but this is the main one. From the perspective of what WebRTC is designed to do, from my point of view, and I come from video conferencing, WebRTC is a kind of media engine. We had media engines before: if you have a video conferencing solution, it means you have a media engine there somewhere that needs to handle the voice and video. That means it is built and designed for real-time, live stuff. I don't care about VOD, things that are streamed, where I can send it to you and, if you get it a minute from now, that's fine. We're not talking about these things. It's live; I need it below one second or things are going to be bad. And the main difference between that media engine and everything else that came before it is that it has a standardized API. That's very important, because up until then what you had were these different voice over IP protocols, where someone needed to go think about what the API surface needed to be and then implement it. WebRTC came and said, well, you know what, there is an API, that API is JavaScript, we're going to plug it into the browser, and this is how developers are going to use it. It's not how developers need to implement WebRTC; it's how developers need to use WebRTC when they're running in the browser. These are the capabilities that we're giving them. So, first of all, it means that you don't have access to everything. Yes, you get packets over the network; yes, they're received over UDP; no, as an application in the browser you don't have access to them. If you have a native application, yes, but not in the browser. Why? Because nobody gave you that kind of API surface. And if you have problems with that, go to the W3C, explain why you need it, and go through the motions of that process. And that
mindset changed a lot of things, because now you don't need to learn voice over IP in order to build a voice over IP application. You just need to be a developer, a web developer, and there are a bunch of them, a lot more than voice over IP developers. They're building a lot of different things, and some of the things they are building are not even video conferencing; they're streaming services or totally different things that nobody thought about before giving them such an API. That, for me, is what makes it so magical and what gives it all of the power that it has today.

Let's walk through the process of building with WebRTC. Thank you for that introduction. I'm sure in our audience we have some folks who are working in video conferencing, so I don't want to dismiss anyone who is, but mostly our audience would be building, let's say, direct-to-consumer type applications: live streaming, or maybe working in a very large social network where you're trying to give your users the ability to communicate, etc. So how would you begin approaching building or implementing WebRTC into your solution?

I would start by using a third party, so I don't need to learn WebRTC too much, because it's too much of a hassle. That's the main thing. But really, first of all you need to decide where to use it and why. Let's say that I want to build the next Netflix. I wouldn't touch WebRTC with a long stick. Now, why is that? Because all of the content on Netflix is pre-recorded. And if I'm going to play it back, I don't care if you wait 5 seconds before your movie begins. I just don't care; it's fine. You won't care either. And if you care, then we can optimize it to two seconds and everyone will be happy. But there is no need for me to go down to zero, and if you go down to zero, you start losing quality. And WebRTC is
about compromises. What does that mean? I'm using UDP and not TCP on the network, because I don't need retransmissions. Retransmissions are bad in WebRTC. Why? If I'm sending you video and you don't get it, and then I need to retransmit it, by the time you get it again it might be too late to use it. There are times and instances where you use retransmissions in WebRTC for specific things, but they're not the things that you use them for in streaming. Usually there is no buffering in WebRTC in the sense that you see in streaming videos.

So, when you say I want live, you care more about the fact that this will be live than about the quality of what you're receiving. Let's say I am betting on something in sports. Then live is what matters most to me, because if I add latency, it means that someone has an edge on me when he's betting; he knows things before I do. A latency of two seconds, or five seconds, becomes important. I want to hear the goal along with all of my neighbors, not after them. Or I want to do this session with you, Mark, and then stream that to someone else; at least the two of us need to be in a conversation that is live. So the first thing to ask yourself is where exactly you are going to place WebRTC, because there are different places to plug it in. The first question is the viewers: do they need it live, yes or no? If they don't, don't use WebRTC. If they do, that's a good reason to consider using WebRTC. Then let's go to the ingest: the broadcasters, should they be on WebRTC? Well, maybe. If they're having this kind of conversation, they should be, at least between themselves, but then someone needs to go mix that and record that, and then it goes wherever.

Yeah, that's right.

Or let's say, you know what, I want to have someone join from a browser. I don't want him to need to install OBS or any other application. I just want him to open a URL in his
browser and, magically, he's there, and he can broadcast from the browser, from anywhere. The only way to actually do that today is WebRTC. So when I start looking at the solution, I would look at each and every component that requires video and start asking myself things like: What's the latency? What is being used? Does it need to run in the browser, or can I use something else? Would using a browser for the user, with a camera or microphone, be useful for me? Is that beneficial? Does it give me an edge? And again, if the answer is yes, I'll go do that and use WebRTC. Usually I would use a third party, either commercial or open source. There are many of them out there. Self-development is nice, but you still need to start somewhere. Nobody starts by using WebRTC directly, not in these domains.

When you say third party, you're talking about someone who's developed an SDK that is supported, that I can license, and they'll help me? What do you mean by that?

There are three different alternatives. The first one is to go and use a CPaaS vendor, communication platform as a service. For WebRTC and streaming, I can use Twilio to do the video, Vonage, Daily, Dolby, and many other companies. What they do is give me a managed service that I can use, that includes WebRTC, and I can just go and embed that into my application, the whole experience around that. For some of them, the main focus is going to be streaming. Dolby has a solution for that. Daily goes to, I think, up to 10,000 or 100,000 viewers. And then you have Cloudflare, which has its own solution.

Yeah, there's LiveSwitch. There's a lot of them.

Those are the CPaaS vendors. The other one is, well, I'm going to go with the streaming vendors that have WebRTC solutions, and I'm going to use them because, well, I know how to use them. There you'll find Red5 Pro, Wowza, Ant Media, nanocosmos. With these vendors, what they give
you is the traditional, or classic, solutions for streaming that also have WebRTC support, so they know how to mix the two. And the third alternative is to go it alone. I'm going to develop it myself, and I'm going to use an open source platform. I'm going to use Janus or mediasoup for that, or I'll build it from scratch, or I'll use Pion, different types of, usually, media servers that would be able to get the traffic that I need and forward it to wherever I need it. From that I can mix and match the solution that I need. Now, if you go that route, you should really know WebRTC well.

Understood. You mentioned quality, that quality in WebRTC is more challenged. Talk to us about the codecs that are supported, and then where those quality cliffs are. Why is there a quality issue?

Codecs: for audio you've got Opus, mainly. There's G.711, but everyone uses Opus. For video, today there's VP8, VP9, H.264, and AV1. All of them are available. The most common ones are still going to be VP8 and H.264. You'll see a lot more VP9 these days, and you see the beginnings of AV1, especially on the decoding side in the browser; encoders are hardware-based and different in nature. So everything is there, or most of the ones that are interesting. HEVC: there are noises from Apple about doing that in iOS. The main challenge there is going to be all the patents and royalties around it, so it seems that most companies in the domain of WebRTC are shying away from using HEVC, and they would use H.264 only if they must. That's usually how things happen with WebRTC today. The next part of your question was about quality and cliffs. I guess if the network is great, and it never is, then the quality is going to be great, period. Either that, or you've got issues with the CPU and the processing power that is needed. But mostly you'll get issues from the network, if the application is built correctly.
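The codec list above can be turned into a small illustration of how two endpoints might settle on a video codec. This is only a sketch under stated assumptions: the function name `pick_codec` and the preference order are hypothetical, chosen to mirror the "VP8 and H.264 are most common" remark, and real WebRTC negotiates codecs through the SDP offer/answer exchange rather than a helper like this.

```python
from typing import Optional, Sequence

# Hypothetical preference order, for illustration only: the two most common
# codecs first, then the newer ones mentioned in the conversation.
VIDEO_PREFERENCE = ("VP8", "H264", "VP9", "AV1")


def pick_codec(
    local: Sequence[str],
    remote: Sequence[str],
    preference: Sequence[str] = VIDEO_PREFERENCE,
) -> Optional[str]:
    """Return the first preferred codec that both sides support, or None."""
    # Compare names case-insensitively so "h264" and "H264" match.
    local_set = {name.upper() for name in local}
    remote_set = {name.upper() for name in remote}
    for codec in preference:
        if codec in local_set and codec in remote_set:
            return codec
    return None


if __name__ == "__main__":
    browser = ["VP8", "VP9", "H264", "AV1"]
    h264_only_device = ["h264"]  # assumed example of a limited endpoint
    print(pick_codec(browser, h264_only_device))
```

The point of the sketch is the fallback behavior: an endpoint that only supports H.264 forces the call onto H.264 even if the other side could do AV1, and two endpoints with no codec in common get nothing at all.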
Now, the challenge with the network is that you don't really own or control it in any way. I'm doing this from my desktop, with my machine sitting right next to the switch that I have, connected to it with an Ethernet cable, because I know the things that I need to use, and it's fiber to the home, and I pay dearly for that. That's not necessarily the case with other people. This is me, because that's my job. Mostly you will find that people are on Wi-Fi, located far from the access point, or trying to do video calls while they drive on the highway over cellular. The problem with these networks is that they have packet loss, latency, and jitter, all of the nice things that we like to say about these networks. And what happens is this: if I throw a packet at you and you catch it, you can decode it. But if you can't, what happens then? If you go the, let's call it, traditional streaming way, I'm throwing a packet at you, but I'm not throwing it directly. I'm putting it inside TCP, or HTTPS, which runs over TCP, and TCP will make sure that if I send it to you, you're going to receive it, because if you don't receive it, I'm going to retransmit it again and again until you either receive it or we both decide that the connection got broken. So life is going to be easy. The only thing that's going to happen is a bit of buffering: the video is going to be stuck on your screen, and then it's going to continue right where we stopped. You can't do that with live. So I'm throwing packets at you; if you don't catch them because they didn't arrive, then you need to do something about that, or we need to do something about that. One thing that you can do is say, well, I missed that packet, but I'm going to assume it was an audio packet, take the previous one that I received, reduce the volume a bit, and hope for the best. I'll skip that one. Or
I didn't receive it on time, but I can wait 20 or 30 more milliseconds just to be sure. Or I can say, well, I didn't receive it, I know that it's an important packet, so I'm going to ask for it to be retransmitted. So there are a lot of mechanisms in there to try and fix things, but you're fixing them while you know that they're broken. It's like "the show must go on." I cannot just wait and pause everyone. We had this chat between us, then the siren came up here, and for a few minutes I was out, and that was the end of it, but you had to continue.

Yeah.

Because this is live. Assume the same thing, but shrink it into 50 or 100 milliseconds of time. I cannot lose that audio, because then you won't understand whatever I'm saying, or my video won't pass the way it should. And again, we have a challenge, because the tools that we have must be tools that run in real time and cannot use retransmissions too much.

I think most of our listeners are probably familiar with scalable video codecs, but if you haven't worked in WebRTC, it's kind of an interesting concept. Why don't you explain it, both temporal and spatial scalability, because it is interesting how these bandwidth issues are handled.

I think, let's start with something simple. In live streaming, what you usually do is use something called ABR, adaptive bitrate, which means I'm going to receive the video on the server, and I have time, so I'll take that video and transcode it into five, eight, whatever number of other streams that I want, each stream with a different bitrate, quality, and capabilities, and I'm going to segment it into two seconds or whatever, so someone can jump from one to the next. Now, WebRTC doesn't work like that. Not because it can't, but because WebRTC,
again, is for video conferencing, and in video conferencing I don't have the time to do these eight different bitrates. And I also don't need to, because this isn't going to a million people; it's going to the other three people, so I don't want to invest so much energy for so little benefit. So what happens in WebRTC? The first thing is, I don't know what the bitrate is going to be, so I'm going to decide dynamically during the call. We're on this call using Zoom, which isn't WebRTC, but it's going to use the same kind of mechanism. Zoom is going to check on each side of this call how much bitrate is available for me and for you, and then it's going to play with that throughout the call, going up or down based on the network, the CPU capabilities, and everything else. Just this movement of my camera changes the amount of bits that need to be sent over video, and that might change how bitrates are going to be allocated for the codec. So the first thing that you have is a bandwidth estimator inside WebRTC that dynamically changes the bitrate. That's the first thing, which means that you don't get perfect quality at all times, because I might not have the network for that. The second thing is that you can use SVC, scalable video coding. It's not that common today with WebRTC, by the way; it's there, but a lot don't use it. With scalable video coding, I can create a single encoded bitstream for video and layer it as if it's an onion, and each layer is going to add something to the video. I can say the first layer is a temporal layer with low resolution and 15 frames per second. In the second layer I'm going to add more frames, so go to 30 frames per second. The next one is going to add resolution, and the last one is going to add quality on that resolution. Now I'm going to take that bitstream that I created and send it to a server. Now the server can
take that and peel the layers like an onion, and it can use however many layers it wants when it's sending to other participants. So now I'm gaining something like ABR on the server without the server needing to transcode anything. It can use less CPU but serve a lot of different participants that have different capabilities, either because of network or CPU power. This is nice, but what's actually used in WebRTC in a lot of scenarios is something called simulcast. In simulcast, what I'm going to do is something else: I'm going to create two or three different videos of the same source, each one at a different bitrate. One of them is going to be at 100 kilobits per second, the next one at 500, then the third one with whatever is left. I'm sending all three of these to the server, and the server will decide which one to send to whom. It's like a poor man's SVC. Why do I do that? Why do I need that? SVC doesn't work well with hardware encoders and decoders, which means that everything needs to be done on the CPU. But simulcast is something that hardware decoders and encoders are fine with; it makes it simpler for them to do that, and the extra bandwidth that I'm paying for is not that high in a lot of different use cases.

There's a lot to WebRTC. You started a company doing tests and helping companies test. Talk to us about where you see the biggest pitfalls. What are the common mistakes that people make in building their workflows or implementing, even if they're turning to a third party? What are some lessons learned that you can share?

I think the biggest one is not knowing or understanding WebRTC, and when you don't know it, you come with unrealistic expectations. I'll give an example. I had someone come to me, that was before the pandemic even, and say: I want to build an application that allows people to go travel
the world. I want them to be able to go, you know, see the Himalayas. So the actual person out there goes into the Himalayas with his iPad, and the person at home can watch that from the comfort of his TV screen. And it is so fine, the quality would be so great, that it's better than Google Meet. And then you try to explain: well, there is no network in the Himalayas, and WebRTC was created by Google, which runs Google Meet, so how can you get quality that is higher than that? It doesn't match. This is an extreme, but there are a lot of these kinds of unrealistic expectations.

Yeah. Did they actually try and build it, or did they listen to you?

No, no. My kids call me a shatterer of dreams, because that's my job most of the time. The other one: most of the mistakes you'll see people make are with TURN and with bitrate calculations. They just don't understand them. Somehow TURN servers, or NAT traversal, is black magic for a lot of people, and then when they try solving it, they solve it incorrectly. Usually I get people who come to me and say: we have a problem, it doesn't work well, we're trying to optimize, and we're checking if we can use VP9. And you go, OK, but what is the problem? Where did you start from? And the story is: well, we have one user on desktop, the other on mobile, we did a native application, and the quality isn't good. OK, why isn't it good? Is it the network? The CPU? Is it for a specific user in a specific location or scenario? People rush to fix things before they understand what the problem really is, because they heard somewhere that VP9 has better quality than VP8. And if I've got a quality problem with my video, it must be that I need a different codec. Go explain to them that up until a few years ago, MPEG-2 was still the dominant codec running their TV shows, and they were happy with the quality there. So it
might not be the codec; it might be something else. So let's go back to basics. Let's see what the problem is, analyze that, and then decide on what the correct solution is. Most of the time it's just that: figuring out what the actual problem is, not what solution doesn't work for that person. Just understanding the problem before charging towards the solution.

That is gold, the advice that you just gave. Even the VP9 example: oh, I have a quality problem, so I need to implement a new codec, because that's going to fix it. I need a better codec; we're going to AV1.

Yeah, AV1.

AV1, exactly. It's interesting. Obviously at NETINT we have a hardware AV1 encoder, so we're in the middle of a lot of AV1 migrations; it's a major driver for us. But we hear consistently that one of the very first tests that people do is, they need to understand: wait a second, if I'm going to adopt AV1, I need to make sure that it's at least on par with VP9. No, it needs to be better than VP9 in terms of bitrate efficiency and the quality metrics. In some ways that sounds obvious, right? But it's shocking how many evaluations, or engineers, start running down a path and they haven't even started with the first principles.

Yes. And I need a lot of CPU power to decode AV1, especially if what I'm looking for is high quality at high bitrate. Now, if you look, for example, at video conferencing today, where do you use AV1? You use it at very low bitrates, and you're starting to see companies experimenting with it for screen sharing. Why low bitrates? Because it takes so much CPU that it's easier to do at low bitrates, and then you actually see a quality improvement. And why screen sharing? Because text looks better in AV1 encoding.

Yes. So, while you
stepped away there for a few minutes, I was sharing a little bit from Demuxed, since I just came back. Well, I'm still in San Francisco. There was a presentation yesterday from a company that showed a video conferencing application. It was screen sharing, and they showed a Wikipedia page, so, to your point, text; it was pretty much all text, with some images. I think H.264 was at around 900, maybe slightly above 900 kilobits, and then they showed AV1 at something like 170 kilobits. It was pretty remarkable, the difference, not only in bitrate. So why is it not a match? Explain.

Because, in a way, AV1 has the tools that make it better for text, period. Once you have that, there's no competition. But the thing is, how the hell do I know beforehand that this is going to be the content that I'm going to encode? It's not as if I saw the movie at Netflix, and I know that this is, I don't know, a drama, or something that is drawn. How do you call that?

Oh, animated.

Yeah, an animated movie, or is it an action film? I don't know that in advance. So I don't have the luxury of doing the encoding three times, in three different codecs, and then deciding which one is better. Netflix can do that, and they are doing that to reduce bitrates. I can't. It's live.

Yeah. That's interesting.

You need to get everything a lot more optimized, in a lot of different, broader use cases, to be able to work with it everywhere at all times. It's not that you can't use it today; you definitely can, and in some use cases it's the best thing ever. But you need to know what it is that you're doing if you're going to use that specific codec.

Yeah, and other codecs. Tsahi, you
mentioned TURN servers, and it got me thinking that we haven't explained the path that a packet goes through using WebRTC, from the moment the photons hit my camera sensor until they show up on your screen. Why don't you real quick explain, you know, it's complicated, what does that look like? What is the traffic? Where are the toll booths? Tell us where the toll booths are.

I think the answer is: it depends.

OK.

WebRTC as a whole doesn't require anything. It works peer-to-peer, from my machine to yours directly, if we want to and if it can. If it can't, we're going to use a TURN server to relay that data. There are STUN servers; there are a lot of things out there. But at the end of the day, there are two things that are going to decide how these media packets flow. The first one is the architecture of your application, and that you can control, because you own it. That's a very important piece: do you have media servers? If you do, then you go through media servers, and that's out of the scope of WebRTC, although these media servers communicate in WebRTC protocols. They're not defined in the standard. You're just using them because it makes sense. And they're very common. Most sessions today, I guess, would be between a user and a media server, especially in large calls or in streaming. So a lot of these sessions are going to be from a client to a server, and where the server is located is important. I want it to be as close as possible to the user. And then there are TURN servers. TURN servers are there if I cannot reach the other side. The other side might be you, or it might be a media server, and if I can't reach you directly, I'm going to use a TURN server. The purpose of the TURN server is to relay the media through it to wherever it needs to go. A TURN server can run media across UDP, TCP, or TLS. In WebRTC, everything is
kind of best effort. I'm going to try UDP directly. If that doesn't work, I'm going to try a relay. If it doesn't relay over UDP, I'll try to relay over TCP, and if that doesn't work, I'll try to do that over TLS. And what's going to be used? Well, I don't know. Let's try and see what happens in this specific call, in this specific scenario.

You mentioned media servers. Is there a trend towards media servers being used, and where, and why? And also explain the functionality that can be in a media server within a WebRTC construct.

Okay. So first of all, it's not a trend, it's a reality of life. It's been that way from day one. What do I mean? I see a lot of people that say that WebRTC isn't good and "we're going to improve it and make WebRTC better." We have some streaming companies doing that. You named some of them earlier when you were mentioning the platforms. Anyway, we won't rename them. What they say is "we make WebRTC better because WebRTC is not good, it's peer-to-peer." And the answer is no, it's not. You can do whatever you want with it. If you want, you can use a media server. It's great that you're doing that, but everyone else in the industry is doing that too.

I cannot send packets from my machine to a million machines. Impossible. So what am I going to do? I'm going to send the packets to a server, and that server is going to route those packets to everyone, because that's the role of that server. Hence in streaming, everything you're going to do is going to have media servers in there. In video conferencing for large groups, everything I'm going to do is going to have media servers. If I want to record the session, most probably I want to record it and then play it back somewhere in the cloud, so I need to record it on a server, an online media server or whatever, but I need a server.

Sometimes I would say, well, I'm going to do it peer-to-peer because it's only the two of us, I don't care about recording, and I want to be cheap and not
pay for the bandwidth over the servers. That's fine, but it's just one of the use cases where you can do this thing. It's not the major use case; it's one of them. So media servers are used everywhere. They're used to do group calls, they're used to do streaming to large numbers of people, they're used to do recording and transcription, and a lot of other things that you need when you want to host a meeting, a session, or whatever you want to call it.

Is it easy to build your own media server, or is this where you really wind up using a third party?

A lot of the courses that I have, I do with Philipp Hancke. He's great. He's like the person with the largest number of bugs opened on Chrome on WebRTC. And we talked about that: why would someone in his right mind, after 2020, go and write his own media server? And the only answer is because it's fun. There is no other reason to do that.

There are multiple alternatives today that are quite good: Janus, Jitsi, mediasoup, Pion. These are the main ones. They are all open source. You can download them, use them, change their code, modify them, optimize them, improve on them, whatever it is that you want. They all have large communities around them. Each one is suitable for slightly different use cases and scenarios, and they use different programming languages, but nobody cares. You can just go and use them. So that's what I would do. If I had to build something today from scratch, I wouldn't write code in C++ to start implementing WebRTC on the media server. I would go to one of the existing ones and just use that, and I would use the one that I'm most comfortable with and that the developers around me can use.

So we had a question come in, and I'm going to ask it. VJ, if you're still on, we will get to it. But it's appropriate that we talk about the scale of WebRTC, because I was smiling at your comment about, you know,
commercial companies saying "WebRTC is bad, but we've made it better."

Scale.

Yeah. And then the next comment is "it doesn't scale, but we scale; no one else can scale."

Yes.

So why don't you explain where the "WebRTC doesn't scale" comment comes from, and then what the solution is?

So I'll start with a solution. Have you ever used Google Meet?

Mhm.

Does it scale?

I mean, I use it for one-to-one or, you know, six people or something, so I don't know.

It scales. It scales like Zoom; it doesn't really matter. They have millions of users across the globe, and people are complaining, but they're complaining like they complain about everything else.

Sure, sure.

They complain about Zoom, they complain about Teams.

We complain about, exactly.

Google Meet, at the end of the day, is pure WebRTC. WebRTC scales. Now, you started by explaining at the beginning that, look, WebRTC is a protocol. Do whatever you want with it. If you want it to scale, make it scale. If you don't want it to scale, don't invest the time in making it scale. That's it.

Where does that comment come from? You go into a room. There's a product manager, three marketing guys, and ten salespeople, and they say "give me something that I can tell the customers who will buy from us. What do we tell them?" And then you get these stupid comments and things that companies say because they need to say something. "It doesn't scale. We help you scale."

You know: "WebRTC was never tested. If you want to test it, come use testRTC." I can say that, but that would be stupid, because WebRTC was tested. But if you're building your own application, you probably need to test your application. You can use someone else, and you can do it manually. It's up to you to decide how. "But we are the only ones doing testing." Yes, sure.

Yeah, okay.

So this is where it comes from. I don't think they do it on purpose. It's just how companies think and work, where what they want
to do is FUD: fear, uncertainty, and doubt. That's fine.

Yeah. So explain again, how does someone...

Let me ask a question. It's my turn.

Okay.

Does HLS scale?

Well, yes, because... oh, it's dependent upon the network, and you need to buy servers, and you need to buy transcoders, and you need to build the application properly, and you need to put a CDN in place, and then it scales.

Yeah. HLS on its own is written on a piece of paper somewhere in, you know, IETF documentation or whatever. It doesn't scale. You can copy that HTML document and put it somewhere else, but someone needs to build the application that would scale. Someone decided what type of servers to use, how to use transcoding, how big to make the chunks. Someone did all that.

Now, WebRTC isn't any different. WebRTC scales? No, it doesn't. But if I want to build an application with WebRTC, I need to build an application that would scale for my audience, not more than that, not less than that. If I'm going to have a target audience of 100 people in a single session, I'm going to run different types of servers than if I need to scale to millions in real time. It's going to have a very different impact on how I'm going to architect my solution and what the media servers are going to look like on the network.

I remember the first time I talked to a company that wanted to build a kind of social network slash messaging solution. What they said was, "we want to start by thinking about 100 million users and more." When you look at that kind of scale, you cannot use the same techniques that you use when you want to build a signaling solution for a dating application for the country of Israel, which has 10 million people, and with luck 100,000 are going to use that solution. It's a totally different problem with a totally different architectural solution. I can say that because at testRTC, for example, each year we need to change the way we architect the solution. Why? Because we need to increase our scale
considerably versus the previous year, and that means taking the next step, or the next leap, in how you need to do things and work to reach that kind of scale. So does WebRTC scale? Yes, it does, but you need to put an effort into that.

Where is the cost in scaling? Because you also hear this comment, I've certainly heard it, from people who maybe have a decent understanding and agree with you: WebRTC is great. It's perfect if you have 50 participants, maybe you want a hundred on a session, maybe you want a couple hundred, no problem. But if you want a thousand, 10,000, 50,000, it's too expensive.

It is.

So I've heard that. So where is that cost?

I think you can put it into three buckets. The first one is that it is too expensive because we're still early in the game of WebRTC. Costs will go down, because today you pay for the investment of me doing an implementation that was never there, in order to optimize it to run at such scales. Over time we're going to reduce that price point. Twenty years ago, using H.264 was ridiculously expensive. Today it's cheap as hell compared to that. So give it time, five or ten more years, and the technology will be commonplace everywhere in terms of how you architect it, and all of the best practices are going to be in place.

The second thing, I think: remember when we talked about real time?

Yes.

When I go from 10 seconds to five, there is a cost associated with that. Now, if I'm going to spend the energy on going from five to close to zero, the price goes up, because I need to invest more energy into the tools and the solutions that I'm using to get to that level, to that sub-second fraction that I'm looking for. If you look today at video conferencing, it's really easy to do, because you've already got 30 years or so of experience doing that. It's still hard, but it's really easy compared to what? Compared to cloud gaming, which also uses
WebRTC. Now why? Because in video conferencing I can use 200 milliseconds and that's fine. I'll talk, you'll barge into what I'm saying, and we'll manage. 200 milliseconds of latency is fine, 300 also, 400 will still be okay. Now, if I'm in a cloud game, and that's 100 milliseconds, and it's a first-person shooter, I'm dead. So then I don't need 200 milliseconds, I need to push it down to 50. And you see the energy that people invest in building cloud gaming, and in putting in tools that are not really used when you do video conferencing.

So the faster you want a solution to be, the lower the latency needs to be, the more energy you need to invest, and the higher the price point is going to be, just because of the realities of life. So if you're saying "well, it's too expensive," yes, but you wanted it real time. You didn't want it at five seconds.

Exactly.

So either you pay for it or you don't.

And this is an interesting observation. There's a very well-known industry analyst, and I think some of the industry are tired of hearing him say this, but it's true. His observation is that as much as we all like to talk about ultra-low latency, and he's obviously not referencing gaming or gambling type apps, there are clearly video applications where it needs to be low. You mentioned that earlier: if I'm gambling, two seconds really matters, or could really matter. But just in terms of entertainment distribution, there just isn't a business case for it. And look, we're friends with everybody in the ecosystem, so I know I'm going to get some hate mail from somebody who's building a platform for ultra-low-latency sports or whatever. That's great, and maybe we'll get there. But as you said, as you push those latencies down, the costs just go up so high, and to the
consumer, you know, it doesn't matter.

You will see that needed elsewhere, though, and that would be on the production floor.

Yes, yes.

That's where you start seeing solutions that are interesting. First of all, I want this discussion between us to be in real time and as quick as possible before it actually goes out there.

Of course.

And if we're talking about TV-level production, maybe I want to do that remotely and have the producers sit somewhere else. Then using WebRTC to go over the streams and decide which camera to take, and how to merge, and whatever, is something that you do with WebRTC, while the stream goes out to viewers some other way. So it's not that there is no room for WebRTC, but it's not on the side of the viewers. As a distribution to mass viewers, there's not.

Okay. Well, we're coming to a close, and I promised we'd ask this question. The question was: broadly speaking, where do you see the industry going in terms of simulcast adoption versus SVC?

This goes back to what we were talking about a few minutes ago. I think we're going to stay in the world of simulcast for at least three or four more years, at the very least. It's not going to change. I think that we might see more SVC when AV1 becomes commonplace, but only if AV1 SVC is good enough and we have support for hardware encoders and decoders that make sense, which means it is useful for real-time solutions and not just for streaming. Otherwise we'll keep on using simulcast.

Yeah. Working in the codec and encoding space, I can confirm what you just said about AV1. AV1 does have much more comprehensive tools for SVC, which is why you're pointing out AV1. I can't comment on some of the open-source projects, just because I'm not really on top of them, but I do know of one commercial company that has an AV1 encoder, and they've been focusing quite a lot on the video conferencing use case and
exploiting those SVC tools.

So what I said is slightly different, because what you need is for Intel, NVIDIA, Qualcomm, Arm, for these companies to have an AV1 encoder and decoder with SVC support that is suitable for video conferencing. If you have that, then SVC in WebRTC will become commonplace. Otherwise it will be simulcast.

Here's another question that came in: can you use WebRTC to transport audio only?

I can also use it to transport video only. I can also use it to transport only data, no audio, no video.

Yeah, that's right. There's a data channel.

Yeah, it's good.

And so for the person that asked the question, you can reach out directly to Tsahi, or to us, and we'll put you in touch if you want to chat more about that.

Well, this was a wonderful conversation. Thank you again for sharing this amazing body of knowledge that you've accumulated. And thank you to the listeners. Whether you're watching live or on the replay, as they say, feel free to reach out to us or reach out directly to Tsahi. He's an amazing resource, and make sure you sign up for his newsletter, by the way. All right, well, thank you again, and thank you to all the listeners. Until next time, keep encoding and streaming video using WebRTC.

Thanks for having me, Mark.
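
The "best effort" connection order described in the conversation (try a direct path, then a TURN relay over UDP, then TCP, then TLS) can be sketched roughly as below. This is a simplified illustration only: the transport labels and the `pick_transport` helper are hypothetical names made up for this sketch, and a real WebRTC stack gathers and checks candidates through ICE rather than walking a fixed list one step at a time.

```python
from typing import Callable, Optional

# Fallback order as described in the conversation: a direct path first,
# then a TURN relay over UDP, then over TCP, then over TLS.
# These labels are hypothetical names for this sketch, not WebRTC API values.
TRANSPORT_ORDER = ["direct-udp", "turn-udp", "turn-tcp", "turn-tls"]


def pick_transport(can_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first transport in TRANSPORT_ORDER that connects, else None."""
    for transport in TRANSPORT_ORDER:
        if can_connect(transport):
            return transport
    return None


# Example: a network that blocks all UDP traffic.
blocked = {"direct-udp", "turn-udp"}
chosen = pick_transport(lambda t: t not in blocked)
print(chosen)  # turn-tcp
```

Which transport ends up being used depends on the specific call and network, which is why the only way to know is to "try and see what happens."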

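The point made earlier about media servers ("I cannot send packets from my machine to a million machines") comes down to simple uplink arithmetic, sketched below. The bitrate and viewer count are hypothetical numbers chosen for illustration, and the calculation ignores protocol overhead.

```python
def sender_uplink_kbps(stream_kbps: int, viewers: int, via_media_server: bool) -> int:
    """Uplink the sender needs, ignoring protocol overhead (a simplification)."""
    # Peer-to-peer: one copy of the stream per viewer leaves the sender.
    # Via a media server: one copy leaves the sender and the server fans it out.
    copies = 1 if via_media_server else viewers
    return stream_kbps * copies


STREAM_KBPS = 1_000   # hypothetical 1 Mbps video stream
VIEWERS = 1_000_000   # hypothetical audience size

print(sender_uplink_kbps(STREAM_KBPS, VIEWERS, via_media_server=False))  # 1000000000
print(sender_uplink_kbps(STREAM_KBPS, VIEWERS, via_media_server=True))   # 1000
```

The fan-out cost does not disappear; it moves to the server side, which is where the scaling and bandwidth costs discussed in the conversation come from.
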
2023-11-10
