Chip Wars: The AI Revolution Unleashing a High-Stakes Silicon Showdown

It's not like they sunk billions of dollars into R&D overnight or something. How did this happen so suddenly over those three years? It seems like a huge story, right, not one we're only now learning about. The hockey stick happens after 2020, which is obviously the pandemic and everything, but investment-wise, Intel has been sinking money in as well, right? It's not like the red bar on the left dwarfs the others. There are three factors at play. The first is the realization from investors that Intel is not catching up, in 2023, 2024.

Hello, everyone. Welcome back to another episode of this amazing show with Sid and Bundy. Today we're going to be talking about chips. Not potato chips, which I really, really love, or banana chips for that matter, but electronic chips, semiconductors, the kind of stuff that goes inside every device. And not just computing devices: these things are in every single object, from cars to washing machines to refrigerators. Pretty much any device you can think of, the odds are it has a microchip inside. These are things that have absolutely revolutionized our lives over the last 25 to 50 years. And they're an amazing piece of technology, all built on a piece of sand, interestingly. Silicon, the key ingredient in all these chips, is one of the main ingredients of sand. It's amazing to think that when you're walking on a beach,

you're stepping on little grains of sand that might one day end up in a microchip. It's truly fascinating to think about. But yeah, Bundy, this is an exciting topic today, chips. And I'm really excited to also discuss some of the recent advances. One of the things that has happened with the AI wars is that chips have become one of the core resources fundamental to AI: they're the core input you need to train these models. So, going back to just understanding what chips are: chips are essentially the brains of the computer.

And if you actually look at any computer, your iPad, iPhone, or MacBook, the most expensive part of the hardware turns out to be the chip, the microchip; it costs even more than the screen, the touchscreen. I get my screen cracked all the time and Apple charges me a bomb to fix it. But you're saying that's because it's Apple, right? You could fix it for 30 bucks in India; I can show you some shops that do that. So the chips are the most expensive part. And very interestingly, what has been happening is there's this law called Moore's law, proposed decades back, which states that the power of chips doubles every two years. And this is Gordon Moore, the chairman and CEO of Intel. Yes, one of the first Silicon Valley companies back in the day. Exactly. What he conjectured was that the number of transistors, which are essentially the basic atomic computing units in a microchip, would double every two years, right? Okay.

And that law has held up for decades, which is amazing. So you're telling me you can fit double the number of transistors on a chip every two years? Wow, that's insane. And that's been happening for the last two to three decades. What that means is that what started off as a simple device now carries billions of transistors on a single microchip. Right, literally billions. And the cheapest phone you can buy today is more powerful than any computer that was available three decades back. The main reason is the number of transistors: your cheapest phone has more transistors than the largest supercomputer from 30 years back.
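
To make that doubling concrete, here is a toy back-of-the-envelope sketch in Python. The 1971 Intel 4004 and its roughly 2,300 transistors are well-known figures; the strict two-year doubling is just the conjecture discussed above, not a claim from the speakers:

```python
# Toy illustration of Moore's law: transistor counts doubling every two years,
# starting from the Intel 4004 (1971), which had roughly 2,300 transistors.
def transistors(year, base_year=1971, base_count=2_300, doubling_years=2):
    """Projected transistor count under a strict doubling rule."""
    return base_count * 2 ** ((year - base_year) / doubling_years)

for year in (1971, 1991, 2011, 2021):
    print(f"{year}: ~{transistors(year):,.0f} transistors")
# The 2021 projection lands in the tens of billions, the same ballpark
# as today's largest single chips.
```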

So that's been a major driving force behind these chips and their role in computing. I think we've said it before, right, that we sent a man to the moon with a computer less powerful than the phone in most people's pockets these days, which are obviously used for checking out videos of cats. Right. It's a very good use case. Yes, yes.

Compared to sending a person to the moon. Exactly. So today's chips have billions of transistors. Now, what has become more important is not creating chips that are general purpose. That was the original business of Intel: to make chips that work everywhere, on a Mac laptop, in a phone, and so on. But now a lot of work has gone into creating customized chips, specialized chips optimized for certain processes. And the one area they have been customized for is AI. These are what are called AI chips: chips focused on optimizing for AI workloads.

That's something we'll dig deeper into with our guest later in the episode, and we'll understand the business of AI chips and so on. But just to put everything in context, this is all essentially semiconductor-based computing. What's

been happening in parallel is the question of how we might use other technologies. For example, quantum has come up as an alternative to semiconductor-based chips. It's still in very early stages, but it's a backup plan if Moore's law stops or starts slowing down: quantum computing, which is an episode we'll do later. Awesome. Yeah. So I have a basic question: why are they called chips? Essentially, you can think of it as a physical thing. It's a plate on which you're putting transistors, and that's essentially why they were initially called chips. So just like tortilla

chips, where you dip in the guacamole and whatever else you put on top, it's the platform on which all the cool stuff happens. Yes, you're saying it's the same idea. That's my guess. All the cool stuff is essentially the transistors being connected in many different ways. Right. There are billions of transistors now, connected and organized in different forms, and that determines the function of the chip. Interestingly, we can think of our brain as a chip. The brain is composed of many neurons, and they can be treated

as transistors. One of the analogies people use is each transistor as a neuron. And when people ask when we will achieve AGI, artificial general intelligence, the conjecture is that we need trillions of transistors, and that's when we would achieve the power of a human brain. But of course, that's just a loose conjecture. Yeah. I mean, I

don't think my brain is as powerful as a trillion transistors. That seems a little far-fetched. Maybe there are parts of my brain that I'm not accessing or something, but I don't think I'm that smart. Yeah, you use maybe 10% of your brain, so the remaining 90% of the neurons are never activated. Got it. Maybe more for some people. Maybe more in certain situations, like when I'm eating, or gaming. Yeah, when you're gaming, maybe that's why you feel that trance, with the experience that you feel. Self-actualization is amazing. Yeah. So what is a transistor?

Yeah. A transistor is a basic electronic device with multiple nodes, in some sense a positive node and a negative node, and information flows through it. Transistors were used to create the first logic gates. What are logic gates? Say you're sending electricity through two transistors: you can compute a result, which could be the product of two signals or the sum of two signals, what are called AND gates, OR gates, XOR gates, and so on. These are more technical, but essentially you take two very basic binary signals, zero-one signals, and compute either the product of those signals or the sum of those signals. So two inputs, one output; that's essentially a logic gate built from transistors. And once you put a lot of these binary operations together, you can create any complex task, right? Okay.
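
As a minimal illustration of that composition idea, here is a sketch in Python. The gate functions stand in for idealized transistor circuits, and the half adder (a standard first step toward full binary addition) shows two gates combining into something more capable; the names are illustrative, not from any particular library:

```python
# Idealized logic gates on 0/1 signals, and a half adder composed from them.
def AND(a, b): return a & b   # the "product" of two bits
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b   # the "sum" of two bits, modulo 2

def half_adder(a, b):
    """Add two 1-bit numbers, returning (sum_bit, carry_bit)."""
    return XOR(a, b), AND(a, b)

for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(f"{a} + {b} -> sum={s}, carry={c}")
```

Chain enough of these together (full adders, multipliers, and so on) and you get exactly the "any complex task from binary operations" idea described above.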

This goes back to even our own thinking. We can think about any task we do as a series of decisions. Do I wake up or not? Do I brush? You can model anything as binary decisions and build out that decision tree. And by the way, just to be clear, a binary decision means two possible outcomes, yes or no. Okay. So most decisions can be broken down into a series of binary decisions, given enough granularity. And why is binary so important? Why is everything binary? Well, there are ternary chips as well; binary was just the easier thing to do and became the standard. But it turns out going from two states to three doesn't really increase the power as much as

going from one to two. Look at the exponent: one to the power hundred is just one, while two to the power hundred is huge. Three to the power hundred is bigger still, definitely bigger, but it's in the same league as two to the power hundred. The real jump comes as soon as you move away from base one; after that, the order of growth is roughly the same. And then I think the other benefit is that with only two states, one is like the circuit being on and the other off. You don't get measurement errors where the circuit sits midway between on and off and reads as some third state, or weird stuff happening if somebody accidentally touches the wire. Yeah. And I think it's also driven by nature. We have a lot of binary things in nature: electrons and protons, right? Even going to quantum states, we have the positive spin and the negative spin, depending on which direction you look.
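
A quick check of that exponent arithmetic (nothing here beyond the numbers in the conversation):

```python
import math

n = 100
print(1 ** n)            # 1: a base of one never grows
print(2 ** n)            # ~1.27e30: astronomically large
print(3 ** n)            # ~5.15e47: larger still, but the same regime,
print(n * math.log2(3))  # ~158.5, since 3**100 equals roughly 2**158.5
```

So the qualitative change is from constant (base 1) to exponential (base 2); base 3 only rescales the exponent.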

So there's some kind of natural precedent for binary, and that's what we're trying to replicate here. I love that, dude. That's a very nice way to think about it. Left and right is a binary choice. Up and down is a binary choice. Forward and backward is binary. Our world, at some level, any choice with two outcomes, which is most of our choices, you're saying is a binary choice. Yes. That's so interesting. So when you say breaking down everything

into binary decisions, it's almost like going from here to the bakery is just a series of decisions where I'm either going forward or backward, left or right, up or down. You can break the entire journey into steps that only involve these binary decisions, and that way I can achieve a very complex task, like going from here all the way to New York. Yeah, that's pretty amazing when you think about it. That's

awesome. Yes. In fact, there's this mathematician, Stephen Wolfram, who has a theory of how very complex behavior can come from very simple rules applied at a binary level. You put simple constraints on how agents behave, on their binary choices, and you get very complex behavior. Is this like chaos theory, or is it different? Related: the exponential part of chaos theory comes from this. You can have decisions where, as soon as you take one side at the right point, you get completely different behavior. Everything comes back to this natural phenomenon that explodes as soon as you have multiple choices. It almost sounds like a fractal, right? What you're describing is starting with a very simple rule, and given enough iterations it generates incredible complexity that just blows up: flower arrangements, the golden ratio, a bunch of cool things come out of all this. Okay, so that's how a simple thing can become complicated. But the flip side is that complicated behavior can also be captured by very simple binary rules. I see. And therefore, by modeling those binary operations with transistors, we are able to compute very complicated decisions. That's the key idea. Okay.
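
For the curious, here is a minimal sketch of the kind of system Wolfram studied, an elementary cellular automaton (Rule 30 is a standard example; Rule 110 is another). Each cell's next state is a binary function of itself and its two neighbors, and a single "on" cell blooms into a famously intricate pattern:

```python
# Elementary cellular automaton: the rule number encodes the next state for
# each of the 8 possible (left, center, right) neighborhoods.
RULE = 30

def step(cells):
    n = len(cells)
    return [
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

cells = [0] * 31 + [1] + [0] * 31  # a single "on" cell in the middle
for _ in range(16):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```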

So going back to what you said: a computer is nothing but some input, where you give the computer instructions; memory, where those instructions are stored; processing, which, in terms of chips, is where all the brains of the actual operations are; and then some output, where the computer says, I've done what you told me to do, here's the result. You're saying the most important part of this entire sequence is the processing, and that's where all our advancements in semiconductors have happened over the last few years. Yes, though memory chips have advanced as well; now we have solid-state drives and so on, which are much faster, copying at gigabytes per second. That's true. But memory is still a static

thing; that was not the main driver of the development of computing. What has changed is the chips, the processing chips; that's been the major driver. I'm just making a random analogy here, but a change in input device is like a person going from bare eyes to using a microscope: you're making the eyes better. The brain getting better is something else, like having two brains instead of one, like organizing a problem with multiple people coming together. That's an example of

the brain being enhanced. And output being enhanced is like putting the plans on a piece of paper so that thousands of people can come together and build a pyramid: together you can do things you cannot do as an individual. Yeah. Okay. That's a great analogy. And in fact, look at the development of the human species as well. The eyes of many animals are very similar in the sense

of the power to convert light into something understandable by the brain. But where the human brain is different from, let's say, other brains is the processing power. The eyes are converting about the same amount; dogs can see almost as well as we can. They see black and white, though, right? Or something like that; that's what I've heard. Yeah, that's one aspect they don't have, but other than that, it's very similar. I mean,

or eagles are supposed to have better eyesight than humans, but humans have domesticated eagles, not the other way around. So clearly there's something happening between the ears that's more important than what's happening behind the eyes. Yes, yes. In fact, eyes are very important, but some animals have much more complicated eyes. Take seahorses, just as an example: their eyes are actually independent of each other, which means their two eyes can be looking in different directions, so they're capturing more information. Which we can't do. Whenever we look

at something, our two eyes can only follow one direction; they are dependent. So that shows the eyes of a seahorse are more powerful, or at least more versatile. But that doesn't mean seahorses see better, of course; if they don't have better brains, they're not able to use it. That's amazing. Yeah. I think chameleons have the same thing, right? Chameleons can also move their eyeballs independently. Yeah, chameleons too. It looks so weird when you watch it. I can't even visualize it: it's seeing two images that probably

don't even overlap with each other. Yes. It's like watching two TVs at the same time: one has a football game and the other has, I don't know, a parliamentary speech, and you're watching the two together. How are you making sense of what's happening there? It's completely crazy. Exactly. So that's the idea. Okay, so today we're going to talk about advances in chips. You mentioned AI, by which I presume you mean action items and not artificial intelligence. I'm kidding, of course. So all the changes in artificial intelligence over the last few years, you're

saying, have spurred a revolution in the world of chips specialized for artificial-intelligence processing. So all the ChatGPTs, all the wonderful, amazing Instagram images I now get flooded with, or the YouTube algorithm that tells me what video I need to watch every single minute of the day: all of that, you're saying, is happening on some very specialized infrastructure and hardware, which is pretty amazing. Exactly. And let's deep dive with our guest on these AI chips and understand not just the technology part of it, but also the business and political part, because chips are now the resource, literally the new oil, and there are going to be fights over them. So let's invite our guest and continue from there. All right, sounds wonderful. Welcome back, everyone. We are incredibly excited today to have a special

guest, Dr. Mrinal Iyer, on the show with us. Mrinal did his undergrad at IIT Madras, where he decided he wanted to make some ships, and at some point decided that ships weren't nearly as interesting as chips. So he moved from the world of ships to chips and did his PhD at the University of Michigan, where he shifted completely from engineering water-based devices to solid-state devices, a transition I think was super fun and interesting. After an illustrious PhD dissertation, about 99% of which I probably would not understand, he moved to Intel in the Pacific Northwest, where he continued his journey with semiconductors and all the cool stuff Intel makes. He has since been involved with a bunch of very interesting companies in the area: Graphcore, one of the leading-edge specialized AI chip makers, and then Modular, which plays in a similar space. Outside of all these incredible professional accomplishments, Mrinal is also

an incredible athlete. Thanks for the great intro. Just to add a little to what Siddharth mentioned: I have spent at least the last seven years of my career in various chip companies, all geared towards making ASICs for AI. So these are AI hardware

accelerator companies. And my running joke is that I've spent my career trying to beat Nvidia; it's taken me through three companies and four architectures, and the fight is still on. You can see how the rest of the talk will come back to why Nvidia is such a king in this field, why they have such an entrenched position. We will probably cover that today. Yeah, let's start from the basics, with some of the words you just mentioned. What's an ASIC? What's an architecture? Let's go through that, in your own words. So when most people think about a chip, the chips we're most comfortable with are the ones in our laptops or in our phones. These are called central processing units,

or CPUs: good general-purpose programmable devices. ASICs are application-specific integrated circuits. What they give you is this: if you have a given budget of transistors, say 50 billion transistors to play with, and you design all of them with the sole purpose of doing one application or a small set of applications, you get an ASIC. And ASICs are not as far from everyday life as they sound. The phones we have already have

ASICs in them: for example, the modem, or the Wi-Fi chip. These are all tiny ASICs geared to doing just one thing correctly. The LTE modem from Qualcomm in most people's phones just handles cell-phone signals. The Wi-Fi chip, which typically comes from Broadcom, specializes in taking Wi-Fi and converting the data in and out between devices. So ASICs are already part and parcel of our daily life. Now the same thing is happening in AI. As the workloads are maturing

into a particular family of possible models, chips are getting more and more specialized. And why do we do all of this? To get either lower power or faster speed; those are the only two things we are ever trading off. So just to dig deeper there: ASICs are essentially

application-specific integrated circuits, so essentially application-specific chips. Just to give a sense of scale, how many ASICs are there in a phone or a MacBook, for example? What's the order of magnitude? My guess is there's one for Wi-Fi, one for LTE, a tiny graphics card inside, an image decoder. So all in all, at least four, and if you're Apple, probably many more, like motion units and so on. For every

particular sensor, we also sometimes have a specific chip that just deals with processing data from that sensor. As an example, take Face ID. Initially it makes sense to run it on the standard Apple chip. But if it's very commonly used, you don't want Face ID to drain your whole battery, so it's good to shift that algorithm into a specific chip and run it at much lower power, so that it can be ambient. So if I understand what you're saying, it's a bit like, instead of having one brain in your head, you're pushing the brain closer to all the different sensory areas.

It's almost like having a mini brain in my tongue, highly simplified, that knows how to taste stuff and nothing else; and a similar brain in my eyes that just knows how to recognize pictures of Marilyn Monroe, one of my favorite actresses; and then maybe a little part of the brain in my fingertips to help me identify what I'm touching, maybe a piece of cloth with a texture I really like, or a cat, without it seeming too awkward.

Yeah, that's not quite what you mean by... That's exactly it. And the technical word for it is SoC, which means system on a chip. You have a central chip, your brain, or the CPU, which controls all these peripheral chips. And then it's always

trading information, sending data from one to the other and deciding what to do with what comes back from these peripheral chips. So, coming back to these application-specific chips: the applications could be as simple as sensing touch, or opening a door, or detecting something in the visual field, or whatever goes on in, say, washing machines or cars. So they can be very simple chips as well. Is that

right? Yeah. The biggest sellers tend to be really specific; you can't have too many things going on. As long as we can specialize and say, this is the math I care about, and all the math I do fits into two or three buckets, then I can design a chip that just

bakes that math down into the silicon, and that allows you to save a lot of power. Yes, so I get a calculator that just does addition, a different calculator that just does multiplication, and a different one that just does subtraction; that's what you're talking about, using three different devices. Yeah, if you want to put it that

way, though these are all calculators that can do all sorts of calculations; it's just that they're wired so that some are specialized for trigonometry, others for squares, exponentials, signal processing, those kinds of things. But they all do adds and subtracts. Okay, so maybe not that basic, not that specialized, is what you're saying. Before we deep dive into AI chips themselves: we're now living in a world where AI is going to play a major role and people are fighting over it. But before that, I'd like to understand what's happening in the

world. I hear about these chip shortages. Is it all kinds of chips? I hear about them in cars and washing machines, but what's the story, what's happening there? So that's interesting. Steadily, more and more devices are getting the ability to be remote controlled or to run logic automatically; there's more and more logic coming into them. Washing machines, for example: you can start them remotely. Dishwashers and heaters can be made to turn on and off at whatever time. Same with lights. To control them remotely, you need a chip inside. These are not very complicated

chips; they're the simplest kind, called microcontrollers. All they do is handle maybe ten options: take a light, turn it off, turn it on; there's a clock inside that says, at this time, turn it off or turn it on, those kinds of things. Very simple devices that let you do a few sets of things, and they're very common in appliances today. Cars are different: cars have much more complex chips, because today they sometimes control how much petrol actually goes into your engine.

They read the ambient temperature inside the engine gasket, then control the ignition based on that, allowing you to get much better mileage. Then they control several other things: your car has a display that lets you set the time, play music, and all of that. But the logic here is not very difficult. These do not compare in complexity

to what goes into your phone or your laptop today. So when you hear about chip shortages, the challenge is actually limited either to the lower-end devices, like your appliances, or to extremely popular higher-end devices. If you're someone like Ford, you've always operated on just-in-time, JIT, manufacturing, so you don't build up huge inventories; for efficiency reasons, you depend on everything arriving just in place. Then, when everybody wanted more and more devices,

it so happened that the factories got full, and they started prioritizing devices that bring in more money. A chip for Ford, Ford will buy for 20 to 50 bucks, in that ballpark. But if Nvidia comes in with a high-performance graphics card to be manufactured at that factory, it's going to sell for anywhere from a few hundred up to $10,000. So there is much higher margin to be made there. What you started seeing is that factories basically prioritized whatever would make them more money, because they were running 100% of the time anyway. And that resulted in a huge problem for everybody else, the lower-end manufacturers, compounded by the problem that they run just-in-time, so they did not have inventory to last them more than a few months of production. Right, so it looks like it's more of an inventory-management and operations problem; that's my area of research. That's true.
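
A toy sketch of that prioritization dynamic; the prices and volumes below are illustrative stand-ins for the rough figures quoted above, not real order data:

```python
# A fab running at full utilization fills the highest-margin orders first.
orders = [
    {"customer": "automaker",  "price_per_chip": 35,   "wafers_wanted": 400},
    {"customer": "GPU vendor", "price_per_chip": 8000, "wafers_wanted": 400},
]
capacity = 500  # wafers the fab can actually produce this period

for o in sorted(orders, key=lambda o: o["price_per_chip"], reverse=True):
    filled = min(o["wafers_wanted"], capacity)
    capacity -= filled
    print(f'{o["customer"]}: {filled} wafers filled, '
          f'{o["wafers_wanted"] - filled} short')
# The GPU vendor gets fully served; the automaker comes up short, and a
# just-in-time automaker has no inventory buffer to absorb that shortfall.
```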

Yeah, and today, if the same thing were to happen, the appliance chips are not going to get any more expensive; chips will continue to cost the same few dollars to a few tens of dollars. The right solution there would just be to build up inventory and invest more in chips; that would be a simple way to fix it. And what's the economics here? Why can't we just, let's say, airlift a unit from Taiwan and put it up in Texas? What's stopping us? Is it the cost of land? There's a lot of land

in Texas. So what's the issue here? Actually, hold on, are they even manufactured in Taiwan? Where are these things coming from? Even that I don't know. Is it Taiwan, is it China? I thought the US... doesn't Intel manufacture tons of chips in the US? That's a good question. In the US, there are only two companies that manufacture logic chips, the kind we're talking about in phones and laptops, the advanced chips. They are GlobalFoundries, which used to be owned by AMD, and then Intel's own factories, which make only Intel chips, pretty much only what goes into your laptop or a server. There are a few others, but they are not as competitive. The major competition to the American manufacturers is from the Taiwanese, and the spearhead of this is TSMC, the Taiwan Semiconductor Manufacturing Company.

It's the most valuable company in Taiwan and probably among the most valuable in the world. They are the most advanced, and then there is Samsung in Korea. Those are the major high-end manufacturers. There are other, lower-end manufacturers in China, and China also has a huge role in the labor-intensive parts of assembling chips and sorting them into medium quality and top quality. Because in your laptop, if you have Intel chips, you have i3s, i5s, i7s, and these are all manufactured together. It's just that the i3 has some defects. It's still good enough to run, so they discount it and market it as an i3 chip. It's not a different chip, just a chip with some flaws in it. But the flaws are not enough to stop it working; it's functional, just not as fast or as efficient as the i7s, and you pay much more for the i7s. But they're all from the same wafer. You're saying it's the same chip that's just

sold at a cheaper price because it happens to have a defect? Right, it's the same wafer. You have a wafer, and a wafer has several hundred chips on it, every little square inside a big circle. And typically it's harder to get good-quality manufacturing toward the rim of the circle, because of all sorts of optical problems in the process; that's just part of manufacturing. So right at the rim and right at the center, you will always have chips of lower quality than the ones in the central ring, and the ones in the central ring are sold at higher prices than the ones at the rim and the center. You're saying it's the exact

opposite of a pizza or a dosa, where I think the tastiest parts are the edges; you're saying it's the opposite with chips. The absolute opposite. The center and the edges are the first to be discarded, sold cheap, or just junked, and everything else is sold at higher prices. But coming back to manufacturing: those are the companies, TSMC is the best of the lot, and Samsung is very good at manufacturing memory.

And that's a great visual; that's what a wafer looks like. As you can see, each of the squares within the wafer is a chip. And what I'm saying is that the ones in the central ring are the better quality; the ones closer to the edges and the center are typically worse quality.
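
A toy model of that binning-by-position effect; purely illustrative: the ring thresholds and die size below are made up, and real binning is based on measured defects and speed, not geometry alone:

```python
import math

WAFER_R = 150.0   # a 300 mm wafer: radius in mm
DIE = 10.0        # assume 10 mm square dies

def bin_for(r):
    """Hypothetical quality bin by distance from the wafer center."""
    if r < WAFER_R * 0.15 or r > WAFER_R * 0.9:
        return "low bin (discounted or scrapped)"
    return "high bin (full-spec part)"

counts = {}
steps = int(WAFER_R // DIE)
for i in range(-steps, steps + 1):
    for j in range(-steps, steps + 1):
        r = math.hypot(i * DIE, j * DIE)
        if r <= WAFER_R:  # this die's center lands on the wafer
            b = bin_for(r)
            counts[b] = counts.get(b, 0) + 1
print(counts)
```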

And so we're back to Chaitanya's question: what is the challenge in taking a factory and replicating it? In fact, when I used to work at the factory, there was an entire slogan called Copy Exactly, which speaks to how difficult it is to take a factory running one set of processes and copy it exactly into another factory, even within the US. Intel had factories in, say, Oregon and Arizona, and replicating the same factory in another state is really, really hard. You get all sorts of excursions, and many times the issues are way outside the typical statistical zone of comfort.

Intel also then started pushing manufacturing to Ireland, and those factories went through a long period, long being relative, where they were manufacturing but would not hit the same quality as the other factories. It took continuous iteration and clever debugging to figure out exactly why they were off, even though procedurally it was exactly the same menu, the same recipe to follow; you still would not get the same level of output. And that is the same company, same set of people, same management chain, trying to create two factories. Now compound that with different companies trying to move across the globe and set up plants in the US. The TSMC founder, Morris Chang, has said publicly many times that they were not interested in coming to the US. So it is only after heavy, heavy subsidies, which basically give them

a huge run-up time, during which they might just be producing junk before they figure everything out and start producing top quality, that they will be able to produce from both locations, the US and Taiwan, at equal quality. And it's expensive, because you have a fully functional factory just producing garbage; that's hemorrhaging money, and it's these subsidies that cover it, soften the blow, and make the shift worthwhile. I mean, what you're saying, if I understand,

is that what McDonald's or Subway or every other franchise restaurant has figured out, which is how to create the same thing in a hundred places with the exact same quality, is hard to do in the world of semiconductors. Very, very hard to do. That's a very good analogy; scaling out has been really painful. And the precision required is very high, right? It's not like a burger, where a centimeter here or there doesn't matter. Here, if you're a couple of tens of nanometers off, it's over. So a nanometer is what, a millionth of a meter or something? Ten to the minus nine, so a billionth of a meter, which is some absurdly small fraction of an inch, for the way we describe things in the US; I don't even really know inches and feet. To give a sense of the scale: a COVID virus is really small, and a modern transistor, at something like 16 nanometers, is even smaller. So we should not be off by even a single virus. Wait, so this is the size of the transistor on the chip? That's right. Wow, the size of a COVID virus. Smaller, actually much smaller, and you can't tolerate errors of more than a fraction of a COVID virus. That's how crazy the accuracy requirements are, and that's what makes it a really difficult problem

to copy exactly. But these are made by machines, not humans, right? It's not like a guy flipping burgers; there's very specialized equipment making these. Why can't you configure the equipment exactly the same? That's what I'm struggling to understand. So, as an example: it is almost fully automated, and there are very few humans actually involved in the process; you're just setting the overall recipe. But the way you build up transistors is you take a wafer and hit it with a laser. And you have parts of the wafer which

are masked out and parts which are not. Wherever the light goes through, those areas become the transistors, the little square patterns where you want them. But the challenge is what happens once the light hits the wafer. There is a chemical sitting there which changes properties according to whether light hits it or not; if light hits it, it becomes very soft and can be removed easily. And this part is mechanical. The coating that is applied is measured not in nanometers but in angstroms, tenths of a nanometer. You can have errors there, and it's a very mechanical process: you drop a liquid, spin the wafer, and it spreads around almost perfectly. But you can still get variations. And you're

talking about these kinds of variations across a wafer that is 300 millimeters in diameter. To take an analogy, it's like a whole pizza where you're worrying about a tenth of a flake of oregano: a small error at that scale causes a problem for the entire pizza. So that's the challenge, even fully automated, of getting it out correctly. And just to show the size of these things: you have the human head, which looks like a mountain at this scale, versus coming down to the Zika virus, the smallest thing you can see here, which

is not even in nanometers; it's still in micrometers. So what does 0.045 micrometers mean in nanometers? That's 45? 45 nanometers, correct. Okay. And the size of a transistor is what again, around the same? The latest transistors are somewhere around there, in that ballpark.
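
A quick sanity check on those units, since the prefixes fly by fast in conversation:

```python
nm = 1e-9   # a nanometer is a billionth of a meter
um = 1e-6   # a micrometer is a millionth of a meter
print(round(0.045 * um / nm))   # 45: so 0.045 micrometers is 45 nanometers
print(round(300e-3 / nm))       # a 300 mm wafer spans 300,000,000 nanometers
```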

And you're saying that an error, the laser burning a little too much or a little too little, to the tune of just a few nanometers here and there, would materially change the outcome: whether the transistor is usable or not, alive or not. Yeah. Wow. I mean, we're used to imagining the world as just a bunch of digital stuff, copy-paste, all these things we do on the internet all the time. This is where the digital world interacts with the physical, with physical processes. Absolutely. It's very sophisticated, and it's a very hard problem to get right. So this is the world's smallest transistor. It's about a nanometer. So we're literally going to the atomic level, well, not

atomic, I guess molecular, level, right? Wait, so Chetan, that's one nanometer for the entire transistor? That's the smallest one. Yeah. Wow. This is far away from production, let me tell you; it's a proof of concept to show that it can be done, especially with carbon nanotubes, which are a completely different mechanism from what's done in practice. The challenge at manufacturing scale is that you have to make 55 billion of these, and

almost all of them have to be correct. If we can get to a point where we can manufacture about a hundred billion of these correctly, that'll be awesome; we'll see a huge jump in compute capabilities, way beyond what is available today. And Bundy, we were talking about Moore's

law just a little earlier, right? We talked about how it doubled transistor counts every two years or so, and you're saying that has reached a natural limit; that's the problem we were getting at earlier. Yes. We cannot keep increasing it, because we have now reached the limits of physical space, literally going down to nanometers. And I guess this is where

semiconductor-based classical computing will probably stop. And just to be clear, when you say limits of physics, you mean we're going down to the size of individual molecules and atoms, and that's why we can't go any further. Right. Gotcha. And it's also a question of our current way of creating transistors. A transistor is fundamentally a gate. If you want to think about the schematic here: squeeze

the tube and nothing goes through; keep the tube open and things flow from one end to the other. Transistors are logically just gates you can turn on or off. And the way we do it today, the standard approach of doping silicon and so on, is reaching its limits. Now, if we can take the nanotube-based methods into large-scale production, we're back into Moore's law again; that's a completely different paradigm. But with what we are doing today, which is just continuous refinement

of standard CMOS technology, we are massively hitting limits. Right. Cool. And just to push a bit more on the state of the art: I came across this chip by Cerebras with 1.2 trillion transistors, and it looks like the one they're comparing against is the Nvidia one. That's right. Yeah. Could you tell us

a bit more? Yeah. So Cerebras' chip is not a chip, it's a wafer; the entire wafer is treated as a single chip. Normally, all the little squares you see on a wafer are where you cut the chips apart and sell the 800-odd different copies. But the bet Cerebras took is: if we're patterning this wafer one wafer at a time anyway, why don't we use the entire wafer, gear the whole thing towards running an AI model? It comes with its pros and cons. The pros: if you had five different chips, you would need wires to connect them, and wires are where it gets really

slow and where you consume a lot of power; that becomes the new bottleneck. But if they're all part of the same silicon, they talk to each other as fast as they talk internally, so your inter-chip communication is blisteringly fast. The challenge is that not all of the chips on a wafer work. You're manufacturing a wafer, and as we discussed, no wafer gives you full yield; some chips will not work, or will be of lower quality. So every wafer Cerebras sells will be of a different quality; each will have some fraction of

the chip that just doesn't work, and they have to manage their software so it can route around all the dead silicon and use the parts that are working. It's a humongous challenge. The second major thing they've solved, which was always a massive engineering problem: transistors operate at around 0.9 volts. That's the operating voltage. Go any lower and the transistors fail, the logic starts failing; go higher and again they can't take it. So it's a very tight 0.9 volts they operate at. Now multiply

all of these transistors by the current they draw at that voltage, and you have a huge power draw. I'm talking on the order of tens of thousands of amperes that these chips need to be supplied with, at a very low voltage of 0.9 volts. So it becomes a giant engineering challenge to make a wafer-sized chip: how do you keep up the supply? Imagine a chip which has to ensure that every transistor always gets 0.9 volts.
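
Back-of-the-envelope on why that is scary, using the figures quoted here; the 20,000 A value is an assumed midpoint of "tens of thousands of amperes", not a published spec:

```python
V = 0.9       # operating voltage in volts
I = 20_000    # assumed supply current in amperes
print(f"P = V * I = {V * I / 1000:.0f} kW")   # ~18 kW for one wafer-scale part
# For comparison, a single high-end GPU draws on the order of 0.3 to 0.7 kW.
```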

It's a giant engineering problem to get the kind of power supply units that can keep that sustained. They've solved that problem, but they have other problems, like a much more complicated software stack. It's not as easy as a GPU, and GPUs themselves are not easy; Cerebras is an order of magnitude harder on the hardware-software side. And they have other giant challenges, which we can get to, when it comes to handling large models of the type coming out in GPT today. Yes. Yeah. So just to understand the state of the art: it looks like Cerebras, at about 1.2 trillion

transistors, is getting close to the number of neurons a human brain has. A human brain has about 100 billion neurons, 86 billion, taking the Nature article. So literally we could have a transistor representing every neuron. We are reaching that; I don't want to call it the singularity, but we're getting there. Right. But of course, this transistor-to-neuron equivalence is a false equivalence, because it doesn't mean the chips are getting smarter or better or anything; it's just a number to throw around. It could absolutely

mean we might need 100 trillion transistors before we can match the brain of a frog; it's just not clear today. Is that because a neuron does much more than what a transistor can do? A transistor is mostly fixed, while a neuron has plasticity and all of that, and those properties don't exist in chips today. I mean, the other thing I was thinking of is that neurons are also three-dimensional, whereas we build all these things flat. Why can't we build them in 3D? That always seems so bizarre, right? Why are we spreading everything onto this piece of paper? We have 3D printing and all this shit we talk about, pardon the French. Why can't we use 3D chips instead of this planar design? That just seems bizarre.

That is a very deep question, and I'll give you two answers. One is that what you see as planar is not actually planar; it's already 3D. Imagine every chip as a skyscraper with a hundred floors; that's roughly the number of layers you require to build a chip. And why do you need these? Imagine the transistors are floor zero; then you need to connect them, which is floor one; then you connect groups of them, and even bigger groups of them, and that keeps going through more and more levels before you can start connecting big units together. But there is another question, which is that the transistor itself sits on a piece of silicon; you're basically doping some silicon to some depth, and the transistor stays there. It's not

fully 3D; it's what's called 2.5D. There is a lot of research going on to push the transistor to the point where the conducting part is completely surrounded by the gate, the other material. What we have today is a source and a sink, one with a surplus of charge carriers and one with a deficit, and then a gate sitting on top which can switch the channel on and off. The source and sink are at two corners, and the gate is

just on top. Can it be such that the gate completely covers the channel, with the source and sink just at the corners? If you imagine a pipe, can the gate wrap around the whole pipe? Then you can make the transistors way more efficient. And that isn't there today. Right, I think this would require a visual for us to understand. I've also heard of something called 3D stacking. Is that a completely different concept, 3D stacking of chips? So

that's another area. Yes, that is yet another thing, and a very good point; it comes at the packaging level. Today you have chips in the market, like Intel Sapphire Rapids, which are 3D stacked already. But go back even one year, and if I talk

about your SoC, take your phone as an example: all the chips that form the system in your phone plug onto a motherboard. You stick one here, stick one there, and it's all 2D. The motherboard then does the job of talking between these different chips, through a substrate in which there are wires, and there are pins; chips slot onto these pins on the motherboard, and through those wires the

pins can talk to each other. That's how they were initially designed to communicate. But as you have more and more chips in your system, you need to communicate faster and faster between them, and this became a bottleneck. That's when 3D stacking became a thing. It goes into this realm called packaging, and packaging is another area where Intel is actually among the most advanced; it's all about whether you can put chip on top of chip on top of chip. And this is one example: the blue part that you see in the center is called a through-silicon via, because it goes through silicon, between different silicon layers. And it's a via, as in a

duct. Imagine a fat pillar which just connects all these chips together. What you can do now is mix and match technologies: memory on one process technology, logic on another, I/O on yet another. And this fat duct in between is the wire that allows you to communicate at terabytes per second of bandwidth. So you can communicate very quickly without having to spread the chips across a motherboard; this fat pillar lets all these chips talk together. Got it. It's like an elevator connecting multiple floors of a building at very high throughput. Yeah. And I was actually deeply involved in the manufacturing

of this giant pillar in between. It's a nightmare. Because, you can imagine, a transistor is a tiny speck of metal, everything is at that scale, and then there's this giant slab of whatever fancy metal is being used; it absolutely screws up the density. It is very hard to manufacture, because you've got raw current flowing through this giant wire, and the enormous inductance and capacitance it generates just

messes up everything around it. You have to do a lot of clever manufacturing to account for this massive change in density, where you have so much metal piled up in one place and just bits and pieces of transistor-scale metal away from it. So it really is a challenge to manufacture this particular via. That's amazing. So let's

get to... I think we've been talking about the state of the art in general chip design. Something we wanted to deep dive into with you is applications to AI. What are the chips being developed for AI models, for training neural networks, for instance? Tell us a bit more about that. Sure. There were a few things, economic, technological, and mathematical, that resulted in this. The math was basically figuring out that, for all of its jazz, a lot of AI is just multiplying matrices. That is the first problem to fix, and then other things follow after that.
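
Since "multiplying matrices" is carrying so much weight in that sentence, here is the operation itself in plain Python; AI chips essentially bake this loop nest into silicon, and the tiny layer below is a made-up example:

```python
def matmul(A, B):
    """Naive matrix multiply: (n x k) times (k x m) gives (n x m)."""
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A)
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

# A tiny neural-network "layer": outputs = inputs x weights
x = [[1.0, 2.0]]            # 1x2 input
W = [[0.5, -1.0, 2.0],      # 2x3 weight matrix
     [1.5,  0.0, 0.5]]
print(matmul(x, W))         # [[3.5, -1.0, 3.0]]
```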

Once you've restricted your math like that, you can start creating custom logic for it. The second thing is that the tools have improved a lot, to the point that a 15-to-20-person company is absolutely capable of coming up with a hardware design, creating that design, handing it off to a contract manufacturer like TSMC, and getting fabrication done. You don't need capital investment of billions of dollars; you need about five to six million bucks to come up with a prototype chip today. I see. And the third thing was that AI is so expensive, in terms of the

number of mathematical operations required to get an answer, from your image to all the detections, or from your prompt in ChatGPT to the text it produces, that serving that demand would just blow up the cost for any company. Google basically saved the equivalent of a dozen data centers of cost by creating their own chip; they did this massive vertical integration. So whenever

we think of anything, the first thing is performance. These players want more and more performance: either serving faster, responding to your queries more quickly, or doing the same thing at a lower dollar spent. That is the fundamental thing driving everything, and behind it comes the question of what we can do. And the fourth thing was the spectacular drop in interest rates that we saw until recently, which allowed a lot of money to be invested into the space to try to come up with newer solutions. These four or five things together created this Cambrian explosion of all sorts of architectures and chip designs. Google came up with

its custom chip called the tensor processing unit, the TPU. Nvidia always had its graphics processing unit, the GPU. Graphcore had something called the IPU, which they call the intelligence processing unit. And then FPGAs have always been the very flexible option: you can pretty much decide whatever you want to build and put any kind of logic into them. With an FPGA you can control things down near the gate level, but

the challenge is that it is very, very inefficient. It lets you experiment with all sorts of hardware designs, and the only major player I know using FPGAs today is Microsoft. There is an adage in the industry that if you're using FPGAs, you don't know what you're doing yet. No ding at Microsoft, but everybody else tries to make real ASICs; the phrase we use is, we bake the ASIC, we freeze the design. Once we know what we're doing, we stop using FPGAs and create a full chip with just that. So, yeah, go on. I think everybody

understands CPUs from when we were buying computers. Could you take us through what CPU stands for, what GPU stands for, and the key differences? Yeah. The CPU is the central processing unit, the central unit which does the math. It started off as just a scalar unit: going way back, the core was called the ALU, the arithmetic logic unit, which would take two numbers and either add them, subtract them, or do a logical operation on them and return the result. Then it evolved to vectors: you had a row of units, and you could take two vectors of inputs, take their dot product, take the cross product, do a bunch of operations on whole vectors at once. For the uninitiated, is a vector the same as a matrix? No, a vector is a one-dimensional matrix: when you have just a single line of numbers, that's a vector.
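
A tiny illustration of that scalar-to-vector progression, with plain Python standing in for what the hardware does in single instructions:

```python
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

# scalar (ALU-style): one operation on one pair of numbers
print(a[0] + b[0])                       # 5.0

# vector unit: a dot product across whole vectors
print(sum(x * y for x, y in zip(a, b)))  # 32.0

# elementwise, SIMD-style operation on whole vectors at once
print([x + y for x, y in zip(a, b)])     # [5.0, 7.0, 9.0]
```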
