- Good morning, everyone. I'm Dan Galves, Chief Communications Officer of Mobileye. Let me start by reading a brief forward-looking statement.
Before we begin, please note that today's discussion contains forward-looking statements based on the business environment as we currently see it. Such statements involve risks and uncertainties. Please refer to the accompanying presentation and Mobileye's periodic reports and other filings with the U.S. Securities and Exchange Commission, in particular the sections therein entitled Risk Factors, which include additional information on the specific risk factors that could cause actual results to differ materially. Welcome to CES, everyone.
We're about one hour in. And this kind of annual talk by Amnon, it's an event. As a comms person, it's really tough to find events that appeal to OEMs, auto suppliers, investors, analysts, regulators. But, every year, I see those different groups in the crowd, and it's because this has become a real highlight of CES.
It's our 10-year anniversary of being at the show and doing this talk. And I'm excited to introduce Professor Amnon Shashua, President and CEO of Mobileye, for our annual address. Thank you. (audience applauds) - Hello, good morning.
CES is a great opportunity to collect our thoughts and not only talk about what Mobileye is doing and how we see the industry, but also take a broader look, kind of a bird's eye view of where everything is going. I think Mobileye is on a very exciting, very meaningful trajectory, whose end goal is to revolutionize transportation. And what I want to spend the next hour on is how we are going about it: what will really revolutionize transportation, and what is merely a very cool product. What will make the big impact? That is where we are targeting.
You know, we spent the last few months on this. Three months ago we had an AI Day, our Artificial Intelligence Day, where Professor Shai Shalev-Shwartz, our CTO, and I gave two hours of talks about what exactly transformers are and how they fit into our software stack. We showed some, I think, very innovative techniques on how we take a transformer and make it 100 times more efficient in order to, you know, serve the very particular task of autonomous driving. We published an academic-style safety report two months ago.
I'll briefly touch on it. I think it's kind of the, you know, next generation after RSS. It took a number of years to collect our thoughts on what should come after RSS in terms of safety.
We had the Investor Relations Day. And now everything comes together into this talk. So the question that I want to lead the presentation with is: what does it take to revolutionize transportation? First I'll show two clips. You know, on the left you see our SuperVision on the Zeekr 001 in China, driving in Shanghai. On the right is our robotaxi project with ADMT on the ID Buzz platform.
We have additional platforms that we're working on, but the lead customer is ADMT, with the ID Buzz, slated for end of 2026 production. And you see here hands-off driving on the left-hand side. Still supervised. It's eyes-on.
And on the right-hand side, the activity going into building a robotaxi. So now the question is, what I showed you right now, is this an example of a revolution? The answer is no, but I want to build up all the elements of why it is no. So here is one way to look at this. In machine learning, there's this precision-recall trade-off.
So recall is availability, and precision is all about safety. In terms of precision, the KPI that people normally talk about measuring is mean-time between critical interventions, or mean-time between failures. There are many numbers being thrown around. There's not one accepted number. What we know, for example, is that in the U.S., on average, human drivers are involved in a crash once every 500,000 miles.
So that's somewhere around tens of thousands of hours of driving. But there are many nuances. What type of road you are driving on, what type of crash you are involved in. So it's really not just one number. This is why it's such a complicated area. But we're talking about at least tens of thousands of hours of driving as the mean-time between critical interventions.
That could be a KPI for precision. Later in the talk, I'll show you why this is not sufficient. But let's stay there. So what would be a sufficient MTBF for an eyes-off system? At least tens of thousands of hours of driving.
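As a rough back-of-the-envelope check on how the crash-per-miles figure maps to hours of driving, here is a small sketch; the assumed average speed is an illustrative figure, not a number from the talk.

```python
# Rough conversion of the human crash statistic into hours of driving.
# The average speed is an illustrative assumption (mixed urban/highway driving).
miles_between_crashes = 500_000      # from the talk: one crash per ~500,000 miles in the U.S.
avg_speed_mph = 35                   # assumed average speed; real values vary by road type

hours_between_crashes = miles_between_crashes / avg_speed_mph
print(f"~{hours_between_crashes:,.0f} hours between crashes")  # ~14,286 hours

# i.e. "tens of thousands of hours" of driving, which is why the talk treats that
# order of magnitude as a floor for eyes-off MTBF.
```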
When we talk with OEMs, there are much bigger numbers being thrown around. One million hours, 10 million hours. Availability, we break down into three parts. One is the operational design domain: on what types of roads is the system available, in what weather conditions, at what times of day, you know, day and night and dusk. The times of day, the types of road, and the weather conditions in which the system is fully available, this is the ODD.
So the more availability, the higher the recall. Geographic scalability. How good is out-of-the-box performance in a new location? So maybe you collected data in one geography.
How does it generalize to a different geography? Or do you need to collect more data in every geography that you want to deploy? So that's geographic scalability. The more scalable you are, the more available you are, the more on the right you are on this axis, the recall axis. And then cost.
It's not just the system cost, it's maintenance cost. If you have a system that has, you know, very delicate sensors that require very expensive maintenance, it also increases the cost. If you're talking about the robotaxi and you have many tele-operations, it also increases the cost. So cost is not just the bill of materials of a system.
So the lower the cost, the higher the availability. So we have these two axes, and where we want to be is the top right corner. We want to have 100% recall, 100% precision; let's call that Level 5 at scale. I'll show three schools of thought. The first school of thought is what Waymo and companies like Waymo have been doing: they went for relatively small recall but very high precision.
Small recall means that the system could be very expensive. Small recall means that it's not geographically scalable. That means every new geography, you need to build maps, you need to collect data. So the scalability is limited. Could be even, you know, the type of driving, the type of roads could be limited.
So that's a limited recall but very high precision. This is where Waymo is right now. And where Waymo needs to go, it needs now to increase its recall.
So, to continue along the x-axis in order to reach the Holy Grail. Now, is Waymo at the point of revolutionizing transportation? Well, the answer is clearly no, because this kind of service hasn't changed, you know, the behavior or the adoption of people. They did not yet abandon their private cars in favor of riding in an autonomous car. Because the cost is very high. It's a multiple of an Uber or Lyft ride. The drive is relatively slow, it's more conservative.
It could take more time to reach your destination. So it's not yet at a point where you could say it's a revolution. When will it be a revolution? When the cost comes down significantly below the price of taking an Uber or a Lyft and you have full geographic scalability, then you'll create a revolution. So we're not there yet. This is the first school of thought: identify a relatively low recall, but go to 100% precision, and then try to increase your recall.
The second school of thought is what Tesla is doing: optimize recall and figure out precision later. And with Tesla, if you look at the versions, Version 11, Version 12, Version 13, you see a very big expansion of recall. They can do much more in Version 13 than they did in Version 11. But when you look, for example, at the FSD Tracker website, where people upload their MTBFs, the mean-time between critical interventions, you see it's more or less flat. It's about five to 10 hours of MTBF. So precision is not yet the axis being optimized; recall is.
And Tesla's goal is to reach almost 100% recall, or 100% availability, and then start going up the ladder of precision. And there's a question mark whether this is a point of discontinuity or a smooth transition. What is Mobileye doing? Mobileye is a third school of thought. On one hand, our SuperVision goes through a similar trajectory. We have a system in China based on EyeQ5.
It was designed back in 2021. There are 300,000 vehicles in China. It's coming out in the West with the Polestar 4, both in Europe and, later in 2025, also in the U.S. And the clip that I showed you before, the clip on the left-hand side, was a demonstration of what the system can do out of the box, both on highways and in urban driving. The recall is slightly less than the recall of Tesla's Version 13.
And our next generation, what the entire company is working on, with EyeQ6 coming out next year, 2026, with Porsche and Audi and the entire Volkswagen Group, will have a much higher level of precision. But still not the level of precision that allows you to be eyes-off. And in the next few slides, I'll show you what we're doing to get there. So this is on one hand. What we're doing in parallel is, a year later, introducing Chauffeur. Now, Chauffeur is an eyes-off system.
And what we're doing here is sacrificing recall. What are we sacrificing in recall? First of all, it's the type of roads. It's going to be only highways. But on highways it's going to be a very useful system. 130 kilometers per hour driving, autonomous lane changes. And it's an eyes-off system.
An eyes-off system is a big discontinuity jump. Because you can do something else in the car: you can work on your smartphone, you can do something else entirely. And this would be approved in terms of what you're allowed to do while the system is driving. And this requires a very, very high precision.
What are we not sacrificing? We're not sacrificing cost. This is volume production cost. We are not sacrificing geographic scalability. This would be able to drive everywhere on the planet out of the box, using techniques that I'll show you in a moment. And then going into the future after that is gradually increasing the recall just by adding more and more road types.
And I'll show you what we're going to do there. So it's a different school of thought. It's not that we take the SuperVision 62, which is coming out next year and which is basically camera-based (there are, you know, a few radars, but it's a camera-based system), and try to crank up the precision. Instead, we're introducing more sensors: we're introducing a lidar, we're introducing an imaging radar. So we're introducing more sensor capabilities and more compute, and reducing the road types to highways, in order to reach that very high level of precision.
And from there, start moving up, moving right on the x-axis to gradually increase the availability of the system towards the end of the decade. So it's a different school of thought. And those two activities are in parallel. The SuperVision 62 is led by Porsche, and the Chauffeur 63 with three EyeQ6 is led by Audi coming up a few months after the SuperVision 62. Okay, so now the question is what does it take to get a high precision and sufficient recall? This is what Mobileye is really focusing on, focusing on the next generation system, which is coming out next year. So first the challenge is generalization.
So what is the challenge? First, there are many edge cases that you need to handle if you want to reach very high precision. And you have this out-of-distribution problem. You're trained on a certain geographical area; you've got data which is biased towards one geographical area.
And the question is, when you go to a different geographic area, is it going to work out of the box or not? Maybe you need to collect more data, you need to build maps, you need to do more training. It's a question, right? So this is the challenge of generalization. Now, there are two approaches to handle this. One is collect sufficiently diverse data, collect more and more data, more and more geographies. This is what we have, for example, in our ADAS. We have close to 300 petabytes, tens of millions of hours of driving, of data collected over more than 20 years of our ADAS programs.
And it's very diverse. All over the planet, different weather conditions, day and night and dusk and whatever. So collect sufficiently diverse data. Another thing that you can do is inject abstractions.
So these abstractions take what is out-of-distribution and make it typical: things that you have prior information on, which you can inject into your system and, in that way, increase the generalization. So what are the Mobileye design principles here? First, we leverage ADAS, all the tens of millions of hours of driving, of video clips that we have from all over the world. We leverage REM, our crowdsourced mapping technique. And we're using redundant systems, injecting abstractions, in order to increase the mean-time between failure and increase the accuracy.
So I'll go into a few details here. So there are five pillars to this. So kind of the building blocks in order to get to this point where you can revolutionize transportation. One is about safety.
And safety is more than just mean-time between intervention. Second will be the technology stack. It is the software, the silicon, the sensors, all the technology stack. So there's more than just the software. Third would be the scalability.
How do you get out-of-the-box performance everywhere on the planet? What do you need to do there in terms of scalability? Then execution, productization. There is a huge discontinuity between having a nice demo and going into volume production, multiple brands with multiple OEMs. This is a big discontinuity. And what does it take in terms of execution to get there? And then cost. How do we create a system which synergizes an entire product portfolio and brings this to a consumer-level cost? And Mobileye, in its DNA, is a volume production company, right? We're not building something which could be very, very limited in terms of its volume production. We're building things that, you know, go into very high-volume production.
So, first, the high precision. So I mentioned before the MTBF, and here I'm going to make an argument why it's not sufficient. And it's not sufficient because of two things. One is there's this notion of unreasonable risk.
So I'm giving here the example. Let's assume that there is a baby lying on a highway. Now, this is such a rare event. Perhaps it never happened during the last century, a baby lying on a highway. But when a human driver sees this thing, right, they'll take action.
It'll not say, "No, this is above the MTBF of my system, "so I'm going to run over it." So even if something is extremely rare, if it's an unreasonable risk, you need to be able to take action. So MTBF alone is not the only KPI. You need to be clear about what is reasonable and what is unreasonable risk. For example, a reasonable risk that you can argue is that I'm not going to account for two flat tires happening simultaneously.
One flat tire, I'm accounting for; two flat tires, I'm not accounting for, right? And I'll be open about it. So this could be a reasonable risk. The second argument for why MTBF is not sufficient is that comparing the MTBF of humans and of a robotic driving engine is not exactly apples to apples. Because, you know, the MTBF of humans is largely driven by illegal behavior, driving under the influence, being distracted and so forth, whereas a machine is never distracted, never driving under the influence. So comparing the MTBF of these two kinds of systems, the human driver and the robotic one, gets into this issue where it's not really comparable. So you need more than just MTBF, and you need this notion of unreasonable risk.
So at Mobileye, and this is part of this, you know, paper that we published recently, we see four pillars of errors that the system could have. One is planning errors, a lapse of judgment in making a decision, like merging into traffic. This is something that we handled back in 2017 with the RSS paper. Our position is that this kind of error should be zero, eliminated altogether, by having a system of rules that tells the robotic driver exactly what to do, what kind of lateral and longitudinal margins it should keep in order to never create an accident.
You can be involved in an accident when it's not your fault, but you never create an accident. So this is the RSS, this is something that we published back in 2017, and it's really a pillar of our safety methodology. Then there are identifiable errors, hardware and software.
Say a camera malfunctions, a sensor malfunctions, you have a software bug in the system. This is all about FuSa, functional safety. Here you need to create hardware redundancies.
For example, in our system, the Chauffeur system, there are two boards. One board has two EyeQ6 and the other board has a single EyeQ6. And those are also redundant to one another.
So when one board fails, the other board can take over and bring the car into a safe stop. So you create redundancy, and this is the whole chapter of FuSa. Then there are reproducible errors.
The system fails, it doesn't know that it failed, but you can reproduce the error. So you can reproduce the error and fix the system, through training or whatever. But you can fix the system. And here you have to be clear about what is a reasonable risk and what is an unreasonable risk.
So, for example, the baby lying on the road, as a metaphorical example, would be an unreasonable risk. Two flat tires, perhaps you can claim this is a reasonable risk, but you need to be open about it. And there are standards around this. It's called SOTIF, safety of the intended functionality.
Then comes the last one, which is black swans. These are errors that are very difficult to reproduce; let's call these AI bugs.
Because machine learning, at the end of the day, is kind of a black box. Sometimes it's difficult to understand why the machine learning engine failed, and we call that kind of failure a black swan. And here the remedy is redundancies. In the case of identifiable errors, these were hardware redundancies. Here, we need to think about software redundancies, and you have sensor modality redundancies, and how to put everything together in a redundant system. And this is how you can increase the MTBF of the entire system.
So let's talk about the redundancy. When we say redundancy, first, it's key for both identifiable errors and black swans. We boost the precision through design rather than purely through data. So we design the system to be immune to failures by having no single point of failure.
In many aspects, at least two subsystems should have to fail in order for the system to fail. It's not just sensor modality. And we obtain that by building multiple subsystems. And the challenge would be, how do you fuse everything together? So you have all these different subsystems giving you input. How do you fuse them together? On the right-hand side, we have the sources of redundancy. It could be sensors: we have cameras, radars, lidars.
You know, another source of redundancy could be the type of algorithms we use, whether they're appearance-based in computer vision or geometry-based, which you get from lidars and which you can also get from computer vision using different techniques, and the architecture that you build. All of these can create redundancies. But now comes the question of how you go and fuse it. And this is part of, you know, this paper that we wrote about our safety concept. We call it the primary-guardian-fallback fusion methodology.
Now, in order to motivate it, let's take the simple example where we have three different subsystems, one based on camera, the other one based on radar, the other one based on lidar. And we are following a vehicle in front, and we need to make a decision should we brake or not brake. We can have two options here. One is take the worst case. If one of the subsystems says you need to brake, you brake.
The downside of this is that you are tripling the false braking. So it'll create kind of an un-smooth ride; you know, the driving experience would not be as comfortable because you are increasing the false braking rate. The second option would be a majority vote. It will deal with both mis-detections and false detections.
So you're taking a majority, two out of three, and you can then show that, if each subsystem has a probability of error of epsilon, then with majority voting the error is on the order of epsilon squared. So you're really enjoying the fact that you have redundant systems. Okay, so this is simple, but it's not so good. Why is it not so good? First, you'll compromise comfort. Because each sensor has weaknesses and advantages that the other sensors do not have. So cameras are very good at very high resolution and capturing appearance.
They're less, you know, precise in getting depth measurements. Radars and lidars are very good at getting depth measurements. So you have advantages and disadvantages of every system, and you're not taking advantage of it by taking the majority rule. And, second, it'll not cover non-binary decisions.
So, for example, a non-binary decision here is when we need to decide whether we're steering right, steering left, or continuing straight, and each subsystem might give a different answer. So there's no majority here. So what do we do? What we built here is a concept which we call primary-guardian-fallback: we have a primary system that will predict where the lane is. We also have a fallback system that, using a different technique, will predict where the lane is. And we have a guardian system, maybe using kind of a discriminative neural network, that will check whether the primary is correct or not. And if it, you know, decides that the primary is valid, it'll choose the primary.
Otherwise, it'll choose the fallback. And what we can show here is that this system will fail if and only if two subsystems fail. So we get this epsilon squared, but we're not limiting ourselves to binary decisions. And this is kind of the paper. You can find it online.
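To make the epsilon-squared argument concrete, here is a minimal numerical sketch, not Mobileye's implementation: it compares a single subsystem, a two-out-of-three majority vote, and a simplified primary-guardian-fallback selection, under the assumption of independent failures with the same error rate epsilon.

```python
# Minimal sketch of why redundancy pushes error rates from epsilon toward epsilon^2.
# Assumes independent subsystem failures with identical error probability eps;
# this illustrates the argument, it is not Mobileye's actual fusion code.
eps = 0.01  # probability that a single subsystem is wrong on a given decision

single = eps
# 2-out-of-3 majority vote fails only if at least two subsystems fail simultaneously.
majority = 3 * eps**2 * (1 - eps) + eps**3

# Simplified PGF: the output is wrong only if the primary is wrong AND the guardian
# misses it, or the guardian wrongly rejects a good primary AND the fallback is also
# wrong. Both terms are of order eps^2. Unlike majority voting, this also works for
# non-binary outputs such as choosing a trajectory.
pgf = eps * eps + eps * eps

print(f"single subsystem : {single:.6f}")
print(f"majority (2-of-3): {majority:.6f}")   # ~3e-4, i.e. on the order of epsilon^2
print(f"PGF (simplified) : {pgf:.6f}")        # ~2e-4, also on the order of epsilon^2
```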
So now let's try to put everything together and look at what Mobileye's architecture is going forward for next year with our EyeQ6 system. So, the architecture has cameras; the sensors are cameras, plus a map, which could be an SD map or our REM map, our high-definition map. It has radars, it has lidars. When you're talking about the SuperVision, it's mostly the camera system. When you're talking about the Chauffeur, it has both radars and lidars. So let's first focus on the top, on the camera system. The first abstraction is that we have three different tokenizers, or preparations of the image.
One is take the image view as is. The other one is take a wrapped view to create a system in which you are doing triangulation among the cameras in order to get explicit depth. I'll show you an example of that. We call that vidar, like visual lidar.
And the third one is a drone view, which is really optimal for deciding about the road structure. So this is kind of a preparation, what we call a tokenizer. If we go to the top, these tokens go into an encoder. This is a transformer.
It's called STAT. And we talked about this transformer back at our AI Day. It's 100 times more efficient than an out-of-the-box transformer. And we explained why at AI Day. Our tokens have different types, and we designed the types in a way that really increases the efficiency 100-fold.
And this transformer has two heads. One is an end-to-end 3D of all objects, vehicles and pedestrians. And the other head is a trajectory, kind of an end-to-end steering and braking control. So this is one route: call it the 3D route, from images to 3D to trajectory, as an end-to-end. The second one is an end-to-end 2D.
What do we want to optimize here? What we want to optimize is leveraging the 300 petabytes of clips that we have from our ADAS activities, such that the system will have a very high chance of operating everywhere on the planet out of the box, just like our ADAS systems do. So we can take all this data and create an end-to-end in the 2D space, finding the vehicles and the pedestrians end-to-end from images to the output, which is where the vehicles and pedestrians are, but in the image space. Now, our ADAS data comes from a front-facing camera, but it really is representative of any camera, in whatever position it is. Because when, for example, you're approaching a junction, you see vehicles from the side, you see vehicles from any angle. So it's not just that you see, you know, the front or the back of vehicles.
So we can create an end-to-end two-dimensional representation of the vehicles. And then there is another transformer for traffic lights and traffic signs. Let's go down to the wrapped view.
The wrapped view is used to create a system we call vidar, which is a different type of redundancy, because now it's based on creating 3D depth and finding 3D objects. And I'll show an example of that, which we talked about a few years ago. It's part of the system. And the third one, using the drone view, is another transformer that receives two inputs. It receives the image tokens, but it also receives our map data, whether it's an SD map or a REM map. And then it handles the handoff.
If it has map data, it uses the map data to determine the road structure. If it does not have map data, or the map data is not reliable, it leans towards what it sees in the computer vision, what it sees online, and it does this in a very seamless manner. Going back, we have another piece, which is the EyeQ6 Lite software stack.
So here we have kind of the best ADAS system in the world. That's really an objective claim, given the amount of business that we have with ADAS. So we don't want to reinvent stuff that we have already done and that is in volume production. AEB and ACC and traffic signs and ego-motion, all sorts of things going on there that are highly, highly optimized.
So this EyeQ6 Lite software stack takes a fraction of our EyeQ6 High silicon. So we put it inside there. So this also is fed into the integrator. And then with radars, we have a transformer that takes the radar input and outputs 3D objects. And the same thing for lidar.
We have another transformer that takes the lidar input and transforms it into 3D objects. So all these inputs go into this PGF fusion engine. And, for example, these are three different examples of the PGF in three different areas. One is, say, physical objects, vehicles and pedestrians. The primary is kind of an integration, a graph neural network that integrates all the inputs from the end-to-end 3D, from the end-to-end 2D, from the vidar, the visual lidar, and from RoadX, the road structure. It integrates everything into a perception of the world.
The guardian would be RSS being checked on each sensor individually, following a majority rule, two out of three. And the fallback would be: if according to two out of three sensors we violate our RSS, then we stop, or create a safe maneuver to pull over to the side, a minimum risk maneuver, or just rely on the end-to-end stream. So this would be, for example, for physical objects.
On lane semantics, it's completely different. The primary would be the RoadX fusion, fusion of the camera input and also the maps, the different maps. The guardian would be a discriminative deep network that will decide, you know, whether the lane is valid or not valid. And the fallback would be an end-to-end lane trajectory.
So you get this redundancy. And the same thing with traffic lights. And we have more with ego-motion, view range, policy, and so forth.
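As a schematic summary of the per-domain role assignments just described (the key names are descriptive placeholders, not Mobileye module names, and the domains not detailed in the talk are omitted), this could be written down roughly as:

```python
# Schematic summary of primary / guardian / fallback roles per output domain,
# as described in the talk. Strings are descriptions, not real module names; the
# talk mentions further domains (traffic lights, ego-motion, view range, policy).
PGF_ROLES = {
    "physical_objects": {
        "primary":  "graph-network fusion of end-to-end 3D, end-to-end 2D, vidar, RoadX",
        "guardian": "RSS checked per sensor, 2-out-of-3 majority",
        "fallback": "safe stop / minimum-risk maneuver, or the end-to-end stream",
    },
    "lane_semantics": {
        "primary":  "RoadX fusion of cameras with REM / SD map",
        "guardian": "discriminative deep network validating the lane hypothesis",
        "fallback": "end-to-end lane trajectory",
    },
}
```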
So this is an example where this kind of fusion technique of making sure that, if you fail, you have two subsystems failing simultaneously. It's applied not only for sensor modalities, it's applied all over the software, the software stack. Again, because the purpose here is to focus on the precision axis because this is where we can create a revolution. If we get to the 100% precision at the cost which is good for consumers and a sufficient recall, like Chauffeur is a sufficient recall to start with, highways, 130 kilometers per hour, then we're creating a revolution. But you need to reach 100% precision.
Without that, it's only going to be a cool product, but it's not going to give us this revolution point. Okay, so let me show you a bit with examples. So here is this end-to-end from pixels to a control command. So you have the various cameras.
We have 11 cameras. Seven long-range cameras and four parking cameras are all fed into this big transformer. And the output is control commands.
You don't really want to output control commands because then it will change from vehicle model to vehicle model. What you want to output is a trajectory. And then you have a piece of software that takes the trajectory and translates it to control commands based on the vehicle model.
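To illustrate that separation, here is a hedged sketch of the idea: the network outputs a trajectory, and a small vehicle-specific piece of software turns it into control commands. This is a generic pure-pursuit-style example with made-up parameters, not Mobileye's controller.

```python
import math

# Hedged illustration of "output a trajectory, then translate it per vehicle model".
# Generic pure-pursuit-style lateral control plus a naive speed command; the
# wheelbase, lookahead and speed scaling are illustrative assumptions.

def trajectory_to_controls(trajectory, wheelbase_m=2.9, lookahead_m=10.0):
    """trajectory: list of (x, y) points in the ego frame along the planned path.
    Returns (steering_angle_rad, target_speed_mps)."""
    # Pick the first point at least `lookahead_m` ahead as the pure-pursuit target.
    target = next((p for p in trajectory if math.hypot(*p) >= lookahead_m), trajectory[-1])
    ld = math.hypot(*target)
    # Pure pursuit: curvature = 2*y / ld^2, steering = atan(wheelbase * curvature).
    curvature = 2.0 * target[1] / (ld ** 2)
    steering = math.atan(wheelbase_m * curvature)
    # A longer planned trajectory implies a higher allowed speed (as in the demo clip);
    # the scale factor here is purely illustrative.
    target_speed = min(36.0, 0.5 * math.hypot(*trajectory[-1]))
    return steering, target_speed

# Example: a gently left-curving 60 m path.
path = [(float(x), 0.002 * x * x) for x in range(0, 61, 5)]
print(trajectory_to_controls(path))
```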
So what does this look like? Before I run this clip: this is our vehicle, and the trajectory is displayed here. When the trajectory is long, it means that you can drive fast; when the trajectory is short, it means that you're driving slowly. And you'll see how the trajectory changes based on what happens in the scene. So as we are driving here, there will be a parked car at some point. So you see that now we are slowing down.
This parked car is now making a maneuver. You see it here. And then the speed increases. So this is end-to-end from images to control commands. So this is one path in what we have. Next is the vidar, the visual lidar.
Let me run this clip. Again, input, cameras, output, what you see here. Being able to translate all the cameras into a 3D perception, just like a lidar. And this is a completely redundant path compared to looking at images from the appearance point of view. Everything here is 3D. Now, why is that important? Let's have a look at this clip.
This is from an actual scene from actual data. The car that you see here, and I'm going to run this clip, is actually moving in reverse. So it's facing us, moving in reverse. So if it was just appearance-based, you would think that this car is approaching us.
But because you have 3D perception directly through vidar, you know that this car is moving with the same trajectory like all other cars, even though it's facing us. So this is the power of redundancy. This is the power of abstraction. Instead of waiting to see this as an edge case and then collecting data for this particular edge case, we already abstracted it in.
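As a hedged illustration of the underlying principle (multi-camera triangulation, rather than Mobileye's actual vidar network), the classic two-view depth relation looks like this:

```python
# Illustrative two-camera triangulation, the geometric principle behind recovering
# explicit depth from cameras ("vidar"). In practice this is done by a learned network
# over many cameras; the numbers below are made-up examples.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic rectified-stereo relation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

# Example: 2000 px focal length, 0.3 m baseline between two cameras,
# a feature seen 12 px apart between the views -> ~50 m away.
print(depth_from_disparity(focal_px=2000, baseline_m=0.3, disparity_px=12))  # 50.0 m

# Because depth is measured geometrically, a car that is facing us but reversing
# still gets a correct 3D velocity, independent of its appearance.
```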
It's part of the system because we have another route that looks at direct 3D perception, just like a lidar but using cameras. Next, RoadX. RoadX's input is a drone view of all the images, and REM, if available.
And, REM, we have a very, very big coverage, you know, Europe, U.S., big parts of China as well. SD map when available. And the output are the drivable paths, the boundaries, the lane marks and road edges, semantics where there's a split or merge in the road. Priorities right and left. And all sorts of attributes, whether it's a high-occupancy vehicle, a traffic direction, you know, bicycle lanes and so forth.
All of this is the output of the system. And you can see here, on the right-hand side is the top view. So you can see all the lane structure that comes. So this is an integration of REM and the camera sensing. But when REM is not available, when SD map is not available, it will do the best it can just from images alone. So it will allow us to do this very smooth handover between relying on a map and just relying on cameras in this fused network.
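A minimal sketch of the handover behavior described above, under the assumption that the fused network weighs map priors against what the cameras see; the weighting scheme and function names here are purely illustrative, not the actual RoadX network.

```python
# Illustrative blend between map-derived and camera-derived road structure.
# The real RoadX module is a learned fusion network; this only sketches the
# "lean on the map when it's reliable, on cameras otherwise" behavior.

def fuse_road_structure(map_lanes, camera_lanes, map_confidence):
    """map_lanes / camera_lanes: lane-geometry estimates (comparable representations).
    map_confidence: 0.0 (no/unreliable map, e.g. outside REM coverage) .. 1.0 (fresh REM data)."""
    if map_lanes is None or map_confidence < 0.2:
        return camera_lanes                      # vision-only fallback, out of the box
    w = map_confidence
    # Seamless handover: interpolate instead of hard-switching between sources.
    return [w * m + (1.0 - w) * c for m, c in zip(map_lanes, camera_lanes)]

# Example with lane-center lateral offsets (in meters) at a few points ahead:
print(fuse_road_structure([0.0, 0.1, 0.2], [0.0, 0.12, 0.25], map_confidence=0.9))
print(fuse_road_structure(None, [0.0, 0.12, 0.25], map_confidence=0.0))
```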
I mentioned, you know, a transformer taking radar input. So what you see here is an imaging radar. On the left-hand side is the radar input, on the right-hand side is the output processing, and it looks as good as what you would expect from camera processing, right? And this leads me also to our own imaging radar, which I'll show next. So this is a redundant path. It's a redundant path for detecting objects and pedestrians, redundant from a sensor modality point of view, not just from the software architecture.
This is the lidar-net. Again, it's a network taking front-facing lidar, in this case, or taking multiple lidars that we have on the ID Buzz and creating a 3D percept, as you would expect you can get from a lidar. So this is another redundant path. So what we talked about is, we call this compound AI system. This is a term used in language models.
It's a compound AI system. So you're using AI modules, whether end-to-end or decomposable, injecting abstractions where needed. And, by the way, this injecting of abstractions is done all the time in language models.
For example, an example that we gave at AI Day: if you ask a language model to compute the multiplication of two numbers, say, you know, 10 digits each, it'll do a very lousy job, because it does not generalize; it does not understand the long multiplication algorithm just from looking at examples of two numbers and their product. What is done with language models is to translate this into a tool, where a tool is a piece of Python code that will run a calculator and do the calculation for you, not through a neural output. So this is, again, an abstraction. There are many abstractions being done in the language models that we all use today.
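To make the calculator-tool analogy concrete, here is a toy version of the pattern the speaker describes; the routing rule and function names are illustrative, not from any specific language-model framework.

```python
# Toy illustration of "tool use" as an injected abstraction: instead of asking the
# model to multiply two long numbers neurally, the request is routed to exact code.
import re

def calculator_tool(expression: str) -> int:
    a, b = map(int, re.findall(r"\d+", expression))
    return a * b  # exact long multiplication, something a neural decoder does poorly

def answer(query: str) -> str:
    # Illustrative routing rule: arithmetic goes to the tool; everything else would
    # go to the language model itself (omitted here).
    if re.search(r"\d+\s*[*x]\s*\d+", query):
        return str(calculator_tool(query))
    return "<answer produced by the language model>"

print(answer("What is 4837261905 * 9182736450?"))  # exact product via the tool
```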
And what I've been showing here are examples of our abstractions being injected into the system. Now, all of this needs very good silicon. So next I'll be talking about our chip, the EyeQ6 chip. And it's really designed for efficiency. So what you see here is a comparison between EyeQ5 and EyeQ6.
But here I'm not comparing TOPS, tera operations per second. We always claimed that this is the wrong KPI to look at. We're looking at the number of different algorithms that we have, and we're looking at the number of frames per second at which we can run these algorithms. And what you see here is a factor of 10.
Basically, the compute density of EyeQ6 is 10 times the compute density of EyeQ5. So it gives us a lot of room to work with. So let's look at our Generation I versus Generation II. Generation I is the SuperVision we have today.
It's based on two EyeQ5 High. The mean-time between failures is about five to 10 hours. And the first clip that I showed you at the beginning of this slide deck where the car, the SuperVision car was driving in Shanghai is an example of this kind of performance.
Gen II will be based on two EyeQ6. By the way, we have Gen II as a B sample. It's already running in pre-production cars of Porsche and Audi. So Gen II is on two EyeQ6, has 10 times compute, but due to the redundancy, 10 times compute could translate to 100 times in terms of MTBF, this epsilon squared thing. So we are expecting our architectural design to bring us to 500 to 1000 hours of MTBF just from the camera system alone, not counting the contribution of redundancies from radars and lidars in a Chauffeur system.
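A small worked version of that scaling argument, under the idealized assumption that the extra compute is spent on redundant subsystems whose failures are roughly independent (the epsilon-squared argument from earlier, simplified):

```python
# Back-of-the-envelope version of "10x compute could translate to ~100x MTBF".
# Idealized assumption: the extra compute buys a redundant path, and a system-level
# failure requires two roughly independent subsystems to fail together (epsilon^2).
gen1_mtbf_hours = (5, 10)     # camera-only Gen I figure quoted in the talk
gain = 100                    # the ~epsilon -> epsilon^2 improvement claimed for Gen II

gen2_camera_only = tuple(h * gain for h in gen1_mtbf_hours)
print(gen2_camera_only)       # (500, 1000) hours, before adding radar/lidar redundancy
```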
So this is Gen II coming out next year, 2026, at first with the Porsche and Audi vehicles and then the Chauffeur with three EyeQ6 on Audi. Now, another very important piece to complete this picture of moving up the precision axis is sensors. Now, back in 2018, we were kind of looking for the ideal sensor to complement cameras. Now, people think the ideal sensors are lidars. We disagree.
Lidars and cameras have lots of commonalities in terms of failure: bad weather conditions, wet road, rain. There are many commonalities of failures. Radars and cameras have little commonality of failures; they work on completely different principles. Radars are immune to weather conditions, immune to low sun. The problem with radars back in 2018 was the resolution, the performance.
A radar could not be a standalone sensor that creates a full perception of what's around us. So we started to design an imaging radar. You know, these were assets that we received from Intel at the time, both IP and people, full teams, and we built, you know, a software-defined imaging radar whose spec is really off the charts. There's nothing comparable to it. And I'll show you in the next slide what I mean.
You know, the dynamic range is 100 dB compared to 60 dB of any other competing radar, whether competing physically or on paper. And the ability to separate, you know, objects, whether it's a pedestrian standing near a vehicle or, you know, a small wooden piece near a guardrail, the ability to do this kind of separation is really amazing. And we already have this radar in B sample, with production starting next year.
And this is an ideal redundant sensor because it's very high resolution. In terms of virtual channels, it's 48 by 32. Say, compared to 16 by 16 of other competing imaging radars.
And, again, 100 dB in terms of dynamic range. So it has density of cameras, high accuracy, and we can get out of it the same kind of sensing state that we get out of cameras. So this is an ideal redundant sensor.
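Two of those spec numbers can be unpacked with simple arithmetic; the comparison figures are the ones quoted in the talk.

```python
# Unpacking the imaging-radar spec figures quoted in the talk.
# Virtual channels: 48 x 32 versus a typical 16 x 16 competing array.
mobileye_channels = 48 * 32      # 1536 virtual channels
typical_channels = 16 * 16       # 256 virtual channels
print(mobileye_channels / typical_channels)   # 6x the channel count

# Dynamic range: 100 dB versus 60 dB. A 40 dB difference is a factor of 10^4 in power,
# which is what lets a weak reflector (a pedestrian) be separated from a strong one
# (a vehicle or guardrail) right next to it.
print(10 ** ((100 - 60) / 10))                # 10,000x
```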
And it's very low-cost. That's a few hundred dollars. So, for example, building a 360-degree perception setup of one BSR and four BSRC is less than $1,000. It's something like that.
So it's really very, very efficient in terms of cost. So here's work we're doing with a specific OEM on the imaging radar. And you see here in the table a list of tests.
And it's all about hazards. Because when you're talking about an eyes-off system on a highway, driving at 130 kilometers per hour, the first thing that you need to protect against is hazards. Whether it's a tire on the road or a small wooden block on the road, this thing could cause the car to flip over. Since you're driving at 130 kilometers per hour, you need to see it well beyond 100 meters away. So what you see in the middle column is the requirement, above 130 meters. What you see on the right-hand side is the performance of the sensor, which is, in most categories, way, way above the requirement.
But we put in boldface here the weakest link, which is very, very close to the requirement. The requirement is 130 meters, and we have 136 meters.
So you need to pass all the tests, it's not just pass one and not pass another. So let me show an example. Here you have a dummy simulating or emulating the fact that you have a pedestrian lying on the road head first. It's very narrow in terms of what you see.
So this is eight meters away. And this would be 100 meters away. You see in the middle... The mouse here is not working. In the lower middle, where the 100-meter arc is, you see a green dot. This is the detection of the radar.
And this is 220 meters away. So the system can reliably detect that kind of object 220 meters away. So this is an ideal sensor for getting us to this 100% precision.
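A rough sanity check on why a detection requirement of roughly 130 meters follows from 130 km/h driving; the reaction-time and deceleration values are illustrative assumptions, not figures from the talk.

```python
# Why hazard detection well beyond 100 m matters at highway speed.
# Reaction-time and deceleration values are illustrative assumptions.
speed_kmh = 130
v = speed_kmh / 3.6            # ~36.1 m/s

reaction_time_s = 1.0          # time to confirm the hazard and begin braking (assumed)
decel_mps2 = 7.0               # firm braking, roughly 0.7 g (assumed)

stopping_distance = v * reaction_time_s + v**2 / (2 * decel_mps2)
print(f"{stopping_distance:.0f} m")   # ~129 m, consistent with the ~130 m requirement
```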
So we have the camera subsystem, we have the radar, this imaging radar subsystem, and we also have a lidar for three-way redundancy. So we talked about the technology. The technology was the architecture, this compound AI architecture; the silicon; our safety concept, how we fuse things together; and the sensors themselves, the imaging radar. And there is other stuff, but I think those are the main pillars. The third pillar is scalability, geographic and ODD.
So here what we have is REM, our crowdsourced way of creating data, very low-bandwidth data from cars, millions of cars sending us data. What you see here is that, as of 2024, we have received close to 56 billion miles of road data.
In 2024 alone, it's about 30 billion miles of harvested data. It's about 60 million miles of data we receive per day. And you see that we have growth. Quarter by quarter, it's growing. So more and more cars are being added to this network sending us data. And this data is very important to create kind of a memory.
Any worthy production system that is doing hands-off driving is using some sort of detailed map. Even those that claim they are not using a map, you look at the display and you see immediately that they're using a map. So having a memory is critical to get this very high precision. Here's an example of how this looks.
And you see that, you know, we have... This is Europe. We have 95% of Europe already covered. Okay.
ODD expansion. So we said that our first step is the Chauffeur, which is just highway. And then over time we would like to move on the recall axis and add more availability. The way we add more availability is adding four surround imaging radars.
So not just the front-facing imaging radar that Chauffeur would have, but also four surround imaging radars to get 360 degrees of radar input. And adding another EyeQ6. So four EyeQ6 in terms of compute, or two EyeQ7. Both are a marginal addition of cost. So we're talking about high-volume production cost while increasing the ODD. We don't need to deal with cost, we don't need to deal with geographic scalability, because we have REM and we have all the ADAS data that allows us out-of-the-box generalization.
So just by adding more sensors and a bit more compute, we can gradually start increasing this recall. Productization, the execution. Going from a demo to a real product. This is something that is a bit underappreciated.
In order to kind of have a good productization, you know, say, in consumer car volumes, first of all you need geographic scalability. As I said, all the data that we have, more than 50 car makers. REM covers 95% of roads in Europe and U.S.
About 300 petabytes of video clips. Multiple car models and OEMs. If you are a supplier of a system to OEMs, you need to allow them to tune the system. This is the DXP that we introduced last year, which allows the OEM to completely control the driving experience of the car. The modular AI stack that we have allows us to be flexible in terms of sensor design: more cameras, fewer cameras, with the radar, without the radar. This allows a very large flexibility without reinventing the system from scratch each time.
And then, you know, industry standards. As I said, it's not just mean-time between critical interventions. You have FuSa, you have SOTIF. As I said, we have more than 50 OEMs and 1,200 car models with our systems. We have shipped 190 million chips to date.
So this is very significant in terms of execution. And, you know, we successfully deployed the SuperVision in China. That gives us a lot of experience in terms of productization of these kinds of systems.
And also the safety architecture that I mentioned before. Here's an example of the advanced products, SuperVision and Chauffeur. SuperVision first generation is already launched. With Polestar, it'll come to Europe and the U.S. this year. Then there is the pre-production of our next generation.
The entire focus of the company is the next generation. This is what 90%, 95% of the company is focused on. With the Volkswagen Group, with the SuperVision and Chauffeur, and then with the robotaxi, which, as I said, is led by ADMT with the ID Buzz. But we also have Schaeffler and Benteler and Verne following them with the same technology. Now some statistics about Mobileye.
As I mentioned, 190 million chips so far, 1200 car models. But also if you look at 2024 numbers, you know, 313 car models were equipped with EyeQ and launched in 2024. 82 software SOPs during 2024. So this is a very huge execution machine.
80 active ADAS products. So ADAS product means multiple OEMs, multiple brands. More than 460 software versions delivered in 2024.
Okay, so execution here is key to go from demo to a volume production. And this is key for success. Last is cost.
I think this will be the last slide. If we look at the pillars: the ADAS, which is hands-on, eyes-on; the SuperVision, now hands-off, eyes-on; then Chauffeur, which is the first eyes-off system; and Drive, which is the robotaxi. The numbers you see there are the system costs. It's not the EyeQ cost, it's the entire system cost to the OEMs.
And we're talking about the generation of EyeQ6 and then the generation of EyeQ7. These are costs that fit high-volume production. And this is critical; again, this availability axis is very important to get there. So let me summarize what we have. I think this chart puts things in perspective, the different schools of thought on how to reach the Holy Grail. The Holy Grail is the top right-hand corner.
How to reach it? One school starts with precision and then works its way along the recall axis, where recall here is the cost, the geographic scalability, and the availability in terms of road types. The other one goes for recall first: it gets to 100% recall, all road types, optimizes the cost, the geographic scalability, optimizes all possible maneuvers, and then figures out how to go up the precision axis. Mobileye is going through that route with SuperVision but then is forking into Chauffeur, which compromises recall, adds more sensors, and focuses on specific road types like highways, but targets 100% precision, and then gradually adds more and more recall once you reach the 100% precision. But that recall, that availability, is not in terms of cost and geographic scalability; it's just the different types of roads that you want to support. So this is kind of a summary of where Mobileye is going.
More details. If people are interested in more details, we have the two hours of AI Day online. We did that back in September.
The safety report is about PGF. It's written in an academic style, but it's really deep. And it is the pillar of our philosophy of how to build a safe system.
And it all converges into the presentation I made today. Thank you very much. (audience applauds) - Thank you, Amnon. And thanks, everyone, for coming. Please come visit us in our booth here in West Hall. At one o'clock, we're having a small talk with VW executives on the mobility-as-a-service activity.
So that could be interesting as well. But thank you very much for attending our press conference today. Thank you.
(soft electronic music)