
Interviewed by Hans Tung and Rita Yang.
For this episode, we have Tony Zhao, founder and CEO of Agora.io. Agora is the only real-time engagement platform designed to cross borders and reach users on low-bandwidth networks and lower-powered devices. It currently serves over 40 billion minutes of voice and video on its network, supporting a wide range of industries including social, gaming, education, IoT, finance, healthcare, enterprise training, and much more.
As a response to the recent outbreak of the coronavirus, Agora.io teamed up with New Oriental, one of the largest providers of private educational services in the world, to launch the “New Oriental Cloud Classroom”, bringing more than one million students into virtual classrooms in just seven days’ time.
Tony is a serial entrepreneur. He founded Agora.io in 2014, with a vision to provide high-quality voice and video as a ubiquitous platform to developers and businesses around the world. Before Agora, he was CTO and board director at YY.com (NASDAQ: YY), one of the world’s first video-based social networks and live streaming apps, with over 300 million users. He was also a founding engineer at WebEx, which was acquired by Cisco in 2007 for $3.2 billion.
Rita Yang: So first question for Tony, in the simplest possible language, tell us what Agora does.
Tony Zhao: Agora builds state-of-the-art real-time audio and video capabilities and delivers them as SDKs and APIs for app developers. In short, Agora enables app developers to create a world where everyone can easily connect and interact with others in real time to conduct social and business activities as if they were in the same place, without feeling any technical, physical or financial difficulties.
Rita Yang: We actually have a lot of developers listening to the show. So can you give an introduction for them specifically?
Tony Zhao: Of course. We are a developer-friendly company, maybe because I, myself, was a developer for such a long time, and I’m still a developer. So from the beginning, when we didn’t even have a business model and had zero revenue, we decided to open our API services to all developers with 10,000 minutes every month for free. And I think it’s really for the benefit of all developers to explore. Basically, it’s a playground for developers to test and create new use cases and new services.
Hans Tung: In just this year alone, you and your team have done an incredible job of helping students, especially in China, to continue to learn online during the outbreak of coronavirus. Most people outside of the tech world or outside of China don’t quite understand how you could do this so quickly, rolling out with New Oriental, one of the largest education companies in the world, which is listed on the New York Stock Exchange, in seven days. How is it possible to do that so quickly? And how do you make that work?
Tony Zhao: So, there were a lot of challenges, because when this situation happened, New Oriental didn’t realize it in the beginning, and by the time they realized, it was already the holiday. A lot of facilities and equipment were actually not accessible. So it was thanks to our past intelligent design, which can scale easily and quickly with a small resource adjustment. But eventually what we did was leverage all the vendors we could get and increase our capacity quickly to serve all that surging demand, including New Oriental. And we also have a more mature and professional service team, which can effectively help them to build up their education classroom experience and also diagnose certain issues, to solve their challenges in a practical environment with their students globally. So that’s basically how we did it.
Hans Tung: Among people in the know, you’re very well known for your special technical architecture of how your product works. Can you kind of explain what the uniqueness and differentiation are at a high level?
Tony Zhao: We do look into this problem deeper, and we solve it in quite a different way. The way we look at it is, overall, the major quality issue that comes with voice over IP, or video over the public internet, is the instability of the public internet. The internet itself is a best-effort network for all the audio packets or video packets. There’s no guarantee that a packet is transmitted on time to the receiver. There’s not even a guarantee that the packets are going to arrive or won’t be lost. To deal with that, we actually created, or invented, a software-defined network optimized for real-time traffic. We call it a software-defined real-time network, to combat the instability of the public internet, and that allows us to have a certain quality-of-service level, which can get close to a dedicated network or dedicated lines, to give the overall service the stability we need. And we also came up with deep technology innovation around audio codecs, video codecs and other signal processing algorithms, so in a challenging environment, dealing with low-end devices or unstable last-mile transmission, we can still make it work.
Rita Yang: So that’s probably most of the technical side of things. And as a business responding to a public crisis like this, you must have a lot more key decisions you have to make as the founder and CEO of Agora. Can you share with us the key decisions that you have made during the past few weeks?
Tony Zhao: We do have a culture of putting customers at the center of our considerations. So as soon as we realized a lot of offline education companies were going to be shut down, we mobilized the team to prepare for the potential needs of migrating them online. One of the hard decisions came when we tried to satisfy all the surging demand. We found that we needed to purchase a lot more new resources, but by that time, a lot of facilities were locked up due to the holiday, and a lot of vendors were not working. So resources were quite scarce and hard to get, plus at that time, a lot of different companies were also trying to get resources. In the end, what we could get was some really expensive fundamentals and resources. The decision was challenging because we are still a startup, and we decided not to hike any prices at that time. In fact, we actually decided to donate some of the services for charity purposes to online education and online medical services, because with the virus, people are scared to go outside, and basically any kind of offline training or school is shut down. A lot of kids cannot access their usual classes as in the past, so we decided to do that. That’s one challenge.
On the other side, we were facing hiked resource purchase prices. The question was whether we sacrifice what we can do for our education customers, or sacrifice the quality to our customers. But we eventually decided to stick to the quality assurance we planned to offer to our customer base, and to take whatever cost on the resource side to make sure we have, first, enough resources, and second, the quality assurance to our customers. Some of the resources we purchased eventually cost us five times more than the normal price. I guess that’s one of the challenging decisions.
Hans Tung: Just give our listeners a sense of scale. How many students at the same time can you support with your technology?
Tony Zhao: I think we do have quite an intelligent design, where the architecture has literally no limitation on how many students can be using it. However, it has to have enough resources, working as designed, to support that. The crisis, I think, is not completely over yet, and maybe the situation is still unclear in all regions or countries. But at the peak time so far, we supported more than a million users concurrently online, at the same moment, taking classes.
Hans Tung: Is it one session or multiple sessions? How does it work?
Tony Zhao: It is a lot of sessions. It’s like a normal classroom experience. A lot of those are like 10 to 40 students in one classroom, but there are many forms. There are large sessions, like one session with 20,000 students.
Hans Tung: 20,000? How do you manage with 20,000?
Tony Zhao: It’s part of our design. The software-defined real-time network optimizes the transmission paths, but it also has the capability of aggregating all the students’ transmission paths into one single session. And actually it can support a large number of people in one session also. The design itself can support a million people within one session. In practice, we do have experience supporting 100,000 people in one single session.
Hans Tung: What kind of session is that, 100,000?
Tony Zhao: It’s a broadcasting session with some performers who are quite influential, and all the fans and whoever likes that type of performance get into that.
Hans Tung: Was that in the case of YY or in the case of Agora?
Tony Zhao: It’s in the case of Agora. You know YY well, so in YY there were other similar or even bigger types of events happening. That was many, many years ago on the original platform, and most of the audience was in China, but for us, the audience is really global.
Hans Tung: You’re a technologist, you can build a product for almost any situation on anything. And you started about the same time as Eric Yuan, founder of Zoom, who was also an earlier guest on our podcast. You guys both started your companies not too far apart from each other. You evolved your business model a couple of times and decided on the current model of working for companies like New Oriental who already have customers, and helping them to do a better job of delivering a better user experience. You ended up not choosing the model that Eric had, or the model that YY had. So what was some of the thinking process that led you to the current model you have now? Or why did you choose not to adopt some of the other models?
Tony Zhao: I do have working experience on real-time audio and video itself for a long enough time, and I personally was the guy who wrote the audio session and video session for WebEx. So I know a lot of the headaches in it. And by the time I saw the smartphone really growing into the whole population, I saw there was an opportunity to combine this real-time audio technology with the smartphone, because the smartphone is such a complete and ready device for this technology. It has high-quality built-in cameras, two, nowadays maybe three or four, and a high-quality microphone, designed as a communication device with programming capability. So, by that time, when I started to look into this opportunity, I felt there was room for us to create value to deal with the challenge: to build high-quality, real-time audio and video technology, to deal with network uncertainty, and to deal with signal processing algorithm challenges on different devices.
The reason for me to really work on API services is that when I looked into this opportunity emerging with smartphone penetration, I saw the industry could also be changing, because the smartphone is such a ready device for this kind of session. I saw maybe one thing could be different, comparing mobile apps to the apps from the PC era. In the PC era, applications could not just assume the PC had a camera or a high-quality microphone. On smartphones, you can make that kind of assumption. And with that, maybe if someone can provide an easy-to-integrate API to support that capability, app builders can feel no barrier to using real-time audio and video in their apps. And therefore, a lot of use cases could be created from there. And since this is such a new possibility, I think as a developer, I naturally would want to create something new and something useful for our customers or consumers. So the best way of doing it maybe is through serving developers with our API. And that’s where it comes from.
Rita Yang: And talking about use cases of this technology, what are some of the biggest categories you have seen on Agora’s platform? And how does it vary across the different markets you serve?
Tony Zhao: We see a lot of use cases growing on our platform. I think there are more than 100 use cases coming along. But the most significant use cases this year and the last few years were in social, education, gaming, podcasting and sometimes workplace collaboration and healthcare, and some IoT equipment is starting to quickly adopt this technology as well.
Rita Yang: Can you give us an example in the case of healthcare and IoT?
Tony Zhao: For sure. Under this virus situation, people are scared. So one of our customers leveraged our technology to build an online doctor visit. So people who just have a cold but also have a symptom like a fever are scared and unsure whether they should go to the hospital or not, but they can see someone online with the app, and quickly get some advice on whether this is something where you need to take more rest, take some general pill and most likely recover, or this is something you need to be serious about and find the right hospital to go to. So that’s one of them.
And we also have an early use case in telehealth, and I think it makes a lot of sense. There are telepsychology services using apps, where the patient is basically talking to the therapist. In the telepsychology service, the therapist will chat with you and talk about your experience growing up and all that. Patients find the environment comfortable and relaxing, and a lot of the time this is their home. So this is also how the service becomes more comfortable and more popular among some of those patients.
Rita Yang: What about IoT?
Tony Zhao: On IoT use cases, there are a lot of robots, if you know, that actually also put some video chat capability inside the robot. So someone could talk through that robot to someone who’s in front of the robot. A lot of the robot providers today are actually using us to do that. And it’s a very interesting use case. We sometimes also play with those robots in our developer gatherings. It’s a very interesting experience.
Rita Yang: And with the different use cases and the growth you have seen, what are some growth areas that are unique to, say, the Southeast Asia market or the Indian market compared to what’s going on in the US and China?
Tony Zhao: In Southeast Asia, I think there is growing demand for online education as well, which maybe lags a little compared to the US and China. But it’s growing very rapidly. In terms of different use cases, sometimes there are more interesting use cases, using this technology for religious purposes, like people having gatherings online through our platform for certain religious rituals.
Rita Yang: Interesting. And with the mission to democratize real-time communication technology, and with all the types of clients, industries and markets you serve, how do you prioritize the needs coming from them?
Tony Zhao: So looking at all of the demands coming to our platform, our first priority is definitely where developers’ interest goes. When we see a lot of developers trying to build a certain social use case, we will focus more on enabling the use case, and sometimes co-inventing with the developers. Sometimes it’s not about a use case, it could be something more fundamental, such as quality of user experience. That’s why we developed tools like Agora Analytics, to make this quality of experience more transparent to them so they can better leverage the technology and services.
Hans Tung: You talked about ubiquitous interactivity as the next evolution for the internet. Can you share with us more about your thinking behind that?
Tony Zhao: I think the internet is pretty mature for ordinary purposes, like sending emails, exchanging messages or just reading some articles online. But if you look into our real life, real life is full of interactions, while there are only limited occasions on which we have real-time interactivity when we surf the internet or use technologies to go online. By adding real-time interactions, an app user’s experience can simply be more natural, efficient and engaging. It will increase the user’s stickiness and engagement and often disrupt an existing vertical. We already see examples from Peloton, changing the way people see workout equipment, and the doorbell, and how powerful this technology is.
Hans Tung: As an investor in Peloton, we can totally relate to what you’re saying as well. Zoom has done very well since its IPO, and you can see that there’s more competition heating up. Microsoft Teams has added a video conferencing capability. You also see other players, RingCentral for example, teaming up with Avaya to offer video conferencing as well. Video conferencing is one use case. There are many others that you mentioned earlier, like education, healthcare and other things. In terms of product roadmap, how do you think about changing the product or tailoring the product to more specific use cases that most existing video conferencing companies today may not spend as much time thinking about?
Tony Zhao: First of all, like I said, we definitely put developers’ needs at a higher priority. So that decision comes more from what developers really want to do. So far on our platform, we do see a lot of demand more on the social, gaming and online education side. Even within online education, some of the most interesting and heaviest demand comes from non-traditional online education, meaning more social education or casual education. Although it could be designed for K12, which is a very traditional education need, it becomes more fun, more casual and more social.
And with that, the API design for the purpose is different. It is going to support different angles of the real-time experience. And on our platform, online collaboration is actually not that big of a demand, but we support more of the embedded workflow. A lot of our customers want to use it in their intranet apps, as something built into their workflow. They can spontaneously jump on a call just as if they were working in the same office room and could stand up and talk to someone. So we see that as a growing demand also, and we are more focused on serving that purpose.
Hans Tung: Yeah, as you know, we also had David Li from YY on this podcast before. I think you and your work at YY helped to make YY what it is today. Without all your technology, YY probably wouldn’t be able to support millions of concurrent users live. So at YY, you were CTO, and before that you were a founding engineer at WebEx, actually building the audio and video product. You evolved from a pure technologist to someone who’s increasingly good at figuring out how to solve other people’s pain points and problems, and at having live video be used in more and more untraditional areas. Can you walk us through your journey and kind of just highlight some of the key changes that made this evolution possible? What were the lessons you learned at each stop that allowed you to become the next version of yourself?
Tony Zhao: It is a long journey, and I definitely learned a lot, starting as a pure engineer in the beginning, trying to leverage my skills or knowledge in programming to help people build something or solve some problems. I think WebEx was quite a successful startup and I learned a lot, even on the technology itself. That’s where I really first started working on real-time audio sessions. And I still remember, in the early days, it was only me building that session. And soon after I released the first version of that function, I started to regret taking that job, because I got a lot of complaints after the first release, and I realized there were some complaints that I believed I could not solve at that time. For the audio sessions, people would complain that it’s cutting off, it’s broken, or say you definitely have some serious bugs.
But when I looked into it, a lot of the time, this being around the year 2000, the last mile was all dial-up. And even in the corporate network, the internet inside had a lot of different difficulties. I had to prove to them that it was the network’s problem, not my problem. And it was really a headache. Soon I realized that I could not teach everyone that the problem was in the network, not in my code. And it was really a headache at that time. So that’s why I started to regret it. But it took time, and I gradually realized that if that’s a problem, maybe as an engineer, I could think out of the box and do something, do the actual work, and help them to deal with the technical problem, which they also don’t understand. And that leads to Agora, and why and how Agora was created. The software-defined network is something designed to deal with that.
But that’s only one side. I think WebEx taught me to really know the challenges around real-time video sessions. Meanwhile, as it was a very successful startup, I think I learned a lot from the experience of growing from 10-plus people to a pioneer that created the web conference industry, like the persistence of the leaders and their willpower, and the quality of our teammates. We had high-quality teammates, especially from the beginning of the team, and I learned a lot from them. And for YY, I’m proud of what I built and what I achieved at YY. It was also an experience that opened my eyes to the industry. I started to realize this technology is not going to be only sitting in the conference room, not only serving the purpose of people negotiating or discussing certain business topics. It actually helps people to live online. They can play, they can have a party, they can sing karaoke, they can actually make friends online through video chat and audio chat.
I could envision from there that people can actually live online, and that’s part of the inspiration that led to Agora as well. And David, he’s an incredible business leader, with courage, and he has a deep understanding of how to create a successful business. And he’s a great product guy. He has deep thoughts on almost every area of business operation and business strategy. As an engineer trying to build a company, I learned a lot about business operations, about the market environment, about the philosophy of how to create a successful business out of nowhere. So those are all the things that helped me. Also, with Agora, I still need to continue to learn and change to adapt to the new challenges. So that’s probably a high-level look back at what’s been going on in the past 10 years.
Hans Tung: Yeah, I highly encourage people to listen to the podcast we have with Wu Hao, the director of People’s Republic of Desire. It is fascinating how people live online, and everything in what he produced happened on the YY platform that you built.
Tony Zhao: Yeah, it is. And I was truly amazed by how creative YY’s users could be. They invented so many use cases online.
Hans Tung: Did any use case strike you as something that surprised you the most from your YY experience, that people who live outside of China may not even be aware of?
Tony Zhao: Yeah, there are a lot of those, like people literally teaching how to play the violin online. At that time, I felt it was unimaginable. And some people taught a very rare language online. I forgot what language that is. But that’s a rare language, and there were people really learning it.
Hans Tung: That’s kind of the inspiration that made you want to do Agora.
Tony Zhao: Exactly. If the audio and video are connected, like right now when we’re face to face, it opens up the audio-video channel. So basically I can see you and I can hear you, and you also can see me and hear me, and that’s how we can start to socialize or discuss or do everything. And now if we open that channel when Hans is remote, with all the video, it feels like face to face. So my understanding became that if the audio and video are connected in real time, people can basically do almost everything online.
Hans Tung: Here comes part of the most important question. Based on your experience at WebEx as a founding engineer, you realized the infrastructure is a huge bottleneck for you to build your product. 5G is just around the corner, and China will probably roll it out first. How will the rollout of 5G impact the way you build your product and solution? What kind of experiences do you think you can deliver to users in the 5G world? And how would this be different? Will VR or AR be any part of what you would do in the future?
Tony Zhao: 5G definitely opens up more imagination, making a lot of things that were not likely in the past possible. And we are also very excited about how it will eventually evolve. Like I said, first of all, developers are going to be much more flexible in building use cases on wireless equipment, including our equipment, and developers would also be interested in first building a higher-fidelity experience with video and audio. They will naturally increase video resolution and the quality of the picture. They will also start to leverage more soundtracks in audio communication or audio route and engagement. And they might start to use ambisonics in some use cases to create an even better user experience.
And for VR, we already see some companies using us to build things like VR education, which is quite attractive. I had my own experience with it. It’s still early days, but I think it’s a game changer, because it’s a language learning class, and you wear the equipment. You feel you’re in a country whose native language is that language, and you start to do things like buy stuff in a shop, or go to a restaurant to order food, or go into a bar to socialize with others. Then you naturally start to practice the language online in the classroom. And it’s a lot more fun and natural compared to a teacher standing in front of a whiteboard trying to teach you a language.
Rita Yang: So can you help us imagine, with 5G and with machine learning and artificial intelligence, what real-time communication would look like in our daily life in five years?
Tony Zhao: I think in five years, there are definitely going to be a lot more use cases that are available and popular among internet users. The things that could be hard today will be more common. Online education has been growing rapidly in the past few years, but it is still more focused on larger cities, tier-one cities. It will definitely go to rural areas more, and allow teaching resources like high-quality teachers to help students in less developed areas or countries. Without the teacher making a long trip to see those students, they can have daily classes in their homes, and that’s going to be helpful to a lot of people.
And I know the situation right now relies a lot on doctors’ effort, but we can relieve a lot of the burden by not having patients all go to the hospital. Patients now can have a video consultation with a doctor online. And, I don’t know, even for 50% of those with not-that-serious symptoms, it’s going to be much more efficient. And I think that should be everywhere. It will change a lot more industries, even insurance. Why do you still need to have someone go check your car if it’s having some problem or has run into a small accident? A video call should solve all those problems. Everyone from consumers to insurance companies can benefit from the efficiency there. Those should all be happening. And a lot of IoT use cases can also make people’s lives easier. We see a trend of a lot of companies trying to build video calls into large-screen TVs, so families can easily connect through a smartphone to the TV in their living room. When they travel or are outside of the home, they can connect to their kids anytime. So those are all things I think would be happening in five years. A lot of them could be a normal life experience, not really something new anymore.
Hans Tung: I’m not sure whether you have spent time thinking about it, but in the next 10 years, there will be 6G for sure. How will that be even more different?
Tony Zhao: In my imagination, I think with the maturity of the network and connectivity everywhere, plus a technology like ours to enable all use cases, I feel that in people’s lives, whatever things they’re doing, they would have two options for doing it. One is the traditional way, doing it offline in the real world. The other is that they can do almost everything online with the same or similar or sometimes maybe even better experience, with VR equipment or other help from technology. You can achieve a certain experience offline, but you are still going to be limited by physical constraints. In the online world, there will be literally no limitation.
Rita Yang: You talked about some of the principles you learned from David when you were at YY? What are some of the philosophies or principles that you do now at Agora in helping the company to grow?
Tony Zhao: I think one thing I learned from David, the most impressive one, is actually quite simple. It’s focus. And of course I learned this not just from David, but also from some other friends in my early days. I heard “less is more”, this kind of philosophy. But honestly, how deeply you understand that is different. Through these years, I have gradually understood more and more how valuable that simple piece of advice is. Focus could be everywhere. And of course, you just need to deeply understand what it means.
Hans Tung: Can you give us an example? What is it in YY or Agora? How does his definition of focus differ from yours, and what did you learn from it?
Tony Zhao: For Agora it’s quite simple. Actually, in the beginning, it was a challenging time. We saw a future where we thought there were going to be a lot of use cases coming onto the platform, but in 2014, I think there was not nearly that much. And a lot of people disagreed. At the time, we made the decision to focus only on the real-time audio and video side. We decided to not even do non-real-time audio and video. And we also didn’t do pure instant messaging types of services or SMS services. Those services actually were much more popular than real-time audio and video at that time. But we still believed real-time audio and video is something that’s more challenging for developers to build themselves.
And by continuing to work and invest in building a better technology stack and creating simple APIs, we would be able to create value for people. By the time the demand gradually comes to the mobile platform and to the business world, that value could be amplified many times. So that’s one focus, the decision to focus just on the real-time part of the audio and video sessions. I think it paid off later on, because we only had like 10 people in the beginning, and a decision to allocate even one person to do something else would have been a big factor.
Hans Tung: Makes a lot of sense. Thank you for sharing those words of wisdom.
Rita Yang: So let’s go to the final round of quick fire questions. Just tell us what’s on top of your mind. First question. If you could invite three people that are alive to a dinner party, who would they be and why?
Tony Zhao: Well, it’s all out of nowhere. I would say Bell, who invented the telephone. And Shannon, who discovered the secret of communication bandwidth. And maybe one of my college teachers, who taught me not to give up on anything you have a passion for.
Rita Yang: What’s the best advice you have ever been given?
Tony Zhao: Well, I think I just mentioned it: being focused and understanding what focus means. That is one piece of it.
Hans Tung: What’s the best advice you’ve ever given other people?
Tony Zhao: I realized sometimes being humble is not just the attitude, it is about how much you want to build or how much you want to create something, if the thing you want to achieve is even bigger than your current knowledge level or capability level, you will be humble. So actually, I think I recently gave some advice that like I’m not asking you to be humble. I’m asking you to expect yourself bigger than what you can do today.
Rita Yang: And what is the investment, financial or non-financial, you have made in the past year that yielded the biggest return?
Tony Zhao: Well, I will say spending more time with my kids.
Rita Yang: Have you tried a ping pong table?
Tony Zhao: Well, I did, actually.
Rita Yang: That’s one piece of advice Grab’s president Ming gave on this podcast. A ping pong table was his best investment. What’s something that you’ve read recently that you would recommend?
Tony Zhao: The Selfish Gene. It tells why selfish genes actually help people to evolve and naturally build a better society.
Rita Yang: Interesting. Last question: what’s a habit you have that you think has changed your life?
Tony Zhao: Like I just mentioned, in the beginning, building real-time audio and video sessions was like a headache for me. I seriously thought of stopping and changing to another project, because I realized that it’s really hard to explain to everyone that there’s something wrong with the device or the network. And that was the initial one or two years. But then gradually, on one side, I started to think of other approaches, and on the other side, I felt like I learned something from dealing with that, even in just communicating with people about it. I started to be able to quickly explain things to them. Although I still cannot explain to everyone, I’m more experienced in doing that. And I even came up with a list of frequently asked questions. So I think that’s something that helps me even with how to run an organization or build a team. The engineering background is hard initially. Later on, I also learned something from it and started to like it.
Rita Yang: So do it until you love it.
Tony Zhao: Yeah, kind of.
Rita Yang: All right. Thank you, Tony, for being on the show with us.
Tony Zhao: Thank you.
Hans Tung: Very enjoyable. Thank you.