AIR Overtime The high stakes race for the future of AI - The Chip Wars

Back to some of the recent technologies that you mentioned, some of the exotic hardware, right? So Cerebras, of course, we talked about that. Could you touch upon Graphcore and d-Matrix and how these compare with each other?

Yeah, so Graphcore, the company I used to work for, had a different bet. If you go deep into what makes GPUs really attractive today, it's the memory, the combination of memory and compute. Right. So GPUs come with their own private, attached memory. This memory is called HBM, or high bandwidth memory. And there's only one company which manufactures it, and that's Samsung. And this is very, very expensive, but it basically gives you memory at high bandwidth. As a point of comparison, the standard memory modules that you attach to a CPU give you about 200 gigs a second in very high-end, server-grade CPUs.

200 gigabytes per second of memory transfer, right?

Memory transfer, correct, correct. The thing is, if you start to compute very fast, if you take your matrix multiplies and just smack them with raw silicon and do them very fast, you're basically waiting on memory, on data to come. So the first-order problem is gone, but the second-order problem is, okay, now I just need data coming in at the rate at which I'm consuming it. And that's why you need it.

So, let's say ChatGPT, right? Suppose ChatGPT is creating this sequence of text, but then it has to be printed on paper and sent outside. Then there's no point making GPT faster if paper is the bottleneck. Right, right. Got it.

Exactly. So HBM basically gave you a lot of bandwidth, and that was required for tensor cores. Once you started putting these kinds of units in, you needed proportionately fast memory to keep the whole thing going. But the biggest problem with HBM, which is what Nvidia is always grappling with today, is that it is hard to increase capacity. The best HBMs today get you a capacity of about 80 gigs and a bandwidth of about two terabytes a second.

So five times.

2,000 gigabytes a second and a capacity of 80 gigs. Now, take GPT-3 as an example. It has 175 billion parameters. So if each parameter, each weight, took 16 bits, so two bytes, that's already 350 gigs of storage. So you cannot fit that in one GPU. You need at least four to eight GPUs to start squeezing that in.
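The capacity arithmetic in this part of the conversation can be checked with a few lines of Python. This is only a rough sketch: it assumes 2 bytes per weight (which is what the 350-gig figure implies) and 80 GB of HBM per GPU as quoted, and the function names are made up for illustration. It counts weights only, so real deployments need headroom for activations and other state, which is one reason the practical number lands higher than the bare minimum.

```python
import math

def weights_gigabytes(num_params: int, bytes_per_param: int = 2) -> float:
    """Storage needed for the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

def min_gpus_for_weights(num_params: int,
                         bytes_per_param: int = 2,
                         hbm_gb_per_gpu: float = 80.0) -> int:
    """Smallest number of GPUs whose combined HBM can hold the weights.

    Assumption: weights only; activations and caches are ignored.
    """
    total_gb = weights_gigabytes(num_params, bytes_per_param)
    return math.ceil(total_gb / hbm_gb_per_gpu)

gpt3_params = 175_000_000_000  # 175 billion parameters, as quoted in the talk

print(weights_gigabytes(gpt3_params))     # 350.0
print(min_gpus_for_weights(gpt3_params))  # 5
```

With these numbers the weights alone already spill across five 80 GB GPUs, before a single request is served.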

800 gigabytes. Right. So you automatically need that many GPUs before you can even start, just to serve your request, just to fit the model.

That many GPUs for training?

No, it's just inferring, just running it, for sure. Yeah. So just to fit it in, that many GPUs are gone. So Graphcore came with a bet: what if we start increasing the amount of SRAM in a chip? SRAM is what is called cache. Today in a laptop, for example, you have 12 megs of SRAM, broken into L1, L2, L3 caches. This is the standard CPU terminology. Graphcore said, I will give you one gigabyte of that. Imagine, 12 megs, and this is one gig. And if we give you one gig, you're going to be so busy utilizing that one gig that I can then connect it to low-bandwidth memory. You can live with a hundred gigs a second, and a hundred gigs a second is commodity memory modules, which are used everywhere in the world. And you can scale that independently to terabytes of storage. So your capacity can go up to terabytes, mostly because you've allowed for much lower bandwidth. And why lower bandwidth? Because even if you have tensor cores, I've given you so much SRAM, so much cache, that it's basically a giant pizza to eat. Even though my delivery is really, really slow, I will reach you before you finish eating your pizza. Right.

So that was the basic idea: it has a lot of compute, but it's the memory, you see, the AI memory is about 3.6 terabytes. So it gives you a lot more memory, but there's a trick here, which is that you have lots of SRAM, and the aim is to accommodate all the live computation within that large SRAM. The jury is out on whether this is fundamentally a successful idea or not. The idea harks back to very standard computer science: if you have a lot of cache, you can go with very slow main memories.
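The "big cache, slow main memory" argument can be put into a toy model. The sketch below is not Graphcore's actual sizing method, just an illustration under simple assumptions: compute wants data at some steady rate, a fraction of those accesses is served from on-chip SRAM, and the rest must come from off-chip memory. The 2,000 and 100 gigabytes-per-second figures are the ones quoted in the conversation; the function names are hypothetical.

```python
def required_hit_rate(consume_gb_s: float, offchip_gb_s: float) -> float:
    """Fraction of traffic the on-chip SRAM must absorb so that the
    slower off-chip memory can supply the remainder without stalling."""
    if consume_gb_s <= 0:
        raise ValueError("consume_gb_s must be positive")
    return max(0.0, 1.0 - offchip_gb_s / consume_gb_s)

def offchip_demand_gb_s(consume_gb_s: float, hit_rate: float) -> float:
    """Off-chip bandwidth needed when hit_rate of accesses stay on chip."""
    return consume_gb_s * (1.0 - hit_rate)

compute_appetite = 2000.0  # GB/s, the HBM-class rate quoted in the talk
commodity_memory = 100.0   # GB/s, the commodity-module rate quoted in the talk

# Under this toy model the SRAM has to serve roughly 95% of accesses.
print(required_hit_rate(compute_appetite, commodity_memory))

# At a lower hit rate, say 80%, off-chip demand is about 400 GB/s,
# which is more than the commodity memory in this model can deliver.
print(offchip_demand_gb_s(compute_appetite, 0.80))
```

The model makes the bet visible: the slower the delivery, the larger the share of the working set that has to live in the on-chip "pizza".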

But it's the same analogy, right? If Ford had millions of chips for its computers stored in its inventory, it would care less about whether there are supply chain shortages in the outside world, because it has so much in its cache. By the time it consumes all its chips, COVID can come and COVID can go, and the factories can completely start again. You're not exposed to the vagaries of the world economy if you have enough cache, enough storage, enough inventory.

And this was the bet. But AI is growing so fast that I think most people are coming around to the idea that you absolutely need high bandwidth and high capacity, which are two very opposite problems. You can give one, you usually can't give the other. And all of the advancements that Nvidia, AMD, anyone in architecture is making are to address this one problem: how do you give high capacity and high bandwidth memory together? I think the problem of compute is largely solved in AI hardware. Everybody knows tensor cores, everybody has them. The only question is, can you give a lot of memory at very high speeds? And that is what we're grappling with today: multi-chip packages, more HBM stacks. And today what people are thinking about is a coupled CPU-GPU, where the GPU and CPU are not separate chips, but you're building that logic and tying them up very, very close to each other. So you basically have the capacity of a CPU, and the entire memory of the GPU becomes a cache. Right. And then you're very quickly pulling from that and feeding the GPU. So that's where all the latest advancements are: can you pull from the CPU's main memory fast enough, and can you use the GPU's memory as a cache, so that you're able to hide this entire trade-off between capacity and bandwidth? It's again a standard problem. For example, if Netflix is giving you movies, it would keep the movies you watch more frequently close to you rather than in a data center very, very far away. It's the most logical next step. It's not a fundamental revolution. So we're all doing the logical next step.

It looks like all these fights are still based in silicon and semiconductors, but I would also be interested to learn more about some of the other types, right? So photonics, for instance. How do they fit into this whole thing?

Yes, that is very interesting. So photonics is what I am calling the revolution.

It's fundamentally capable of providing high bandwidth at very high capacity together. And the way it does it is it completely removes wires. You have lasers communicating. It's basically switches. You have some data here, which is digital. Then you have a transducer which converts that into pulses of laser: as the data is being read, the laser turns on or off. And then you have another thing where that laser input is received and converted back into digital. So the chip is all digital, the storage is all digital, and it's just the connection between them which is now through lasers.

So the transistors are still semiconductor, still silicon-based, right? Yeah, yeah, yeah. It appears that we are replacing the wires with photonic connections. Is that right?

Yes, that is exactly the thing. So at the boundary of the chip, you have this sea of transceivers, which basically are analog-to-digital or digital-to-analog converters. And that is the new bottleneck. It's easy to send a bit by laser and receive it. But the question is, how quickly can you do this digital-to-analog process? And that's where all the challenges of photonics are today: can you do it fast enough so that it is not the new bottleneck? It should still be faster than just sending it by wire.

Take the analogy in the real world: you have the typical wired cables to transmit information, and now optical fiber cables are coming up, right? So it's that analogy.

Yes, yes. These are actually optical fiber cables, just a little more advanced. It can be a bundle of fibers or any which way you want, but they're fundamentally optical fibers. The signal, while it's going through the fiber, is capable of a certain bandwidth. But the bandwidth at the end, when you convert that back into digital, is much, much lower today. That needs to be overcome to make it useful for chips.

So what problem is this, Aumra? I didn't follow. So is there still a physical cable or there is no cable?

You can choose. You can actually have physical cables, where you just have the laser inside the cable, or you can have lasers outside. It's up to you. Most people would prefer. Yeah. But if you have lasers outside, I mean, I'm trying to build this picture in my head. So you're saying there is a chip, and there's a bunch of lasers being shot off right

above the chip that are communicating signals to each other. Won't that just be massive signal interference and all sorts of crazy stuff going on? The brilliance is that, imagine even if it's just one cable, all these things could be at different frequencies so they don't interfere. Oh, okay. So different frequencies. Yeah, different frequencies of light, so they don't interfere with each other. So this one is reading the violet light, green light,

yellow light, and they're far enough apart that they're not close to each other. Okay. No light is typically exactly one wavelength. It's like a Gaussian. So as long as they are spread far enough from each other, they're not going to interfere. They can go through each other

with no problem. Yeah, but the problem you're solving, you're saying, is that the speed of interconnectivity on the chip itself is too slow? No, it's between the chip and the memory. So wherever you're storing your model and storing the data, getting the data quickly enough has become the new challenge. It's become exceptionally acute with the ChatGPT kind of models, where you're predicting literally character by character, or sometimes one word at a time, not more than that. So word at a

time. And it's not that it's predicting the whole thing. So you predict one word, and based on that word, it predicts the next word and the next word. So it's all back and forth going between, you know, what's happening on the chip. Yeah. And what's the back and forth? Yeah. So the back and forth is that you basically,

you know, every word has to go again through the entire model to give you the next word. So if you have predicted two words, both those words then go through the entire model to get the third word. It's not that you send your input once through the model and you have all of the output, which is the standard way of imagining it. If your entire model was just like, here's my input, and all I'm saying is true or false or something, you've sent it to me once. But what's happening in these, the

generative models is that the model is changing. I mean, your output depends on what you've just produced, the previous words. So as you're generating words, your next word is responding to that. So you're going through the model: every single word involves another loop through the model. You're saying that the model has to be fully loaded into memory every single time. And that's what you said is not possible on one core. It has to

happen on multiple GPUs, let's say eight GPUs today. Yeah. So then the input and the model are being cycled through the entire memory for every additional word? The input goes in once, okay, and all of the input produces an output, which at the end of the day is that one word that joins the input? Yes, but which then goes all the way through the model again. So it's not that you've gone into your first GPU, come out of

your GPU and you give it to the user. Gotcha. You go through this thing for every single word, several times, because every additional word generated has now changed the state of the input, and therefore the new input is the earlier input until now plus this new word that I've generated. And I have to do that for every new word that I generate. Yes. Yes. That's not how humans think at all, by the way. I mean, I don't think so. You think of the answer and you put a sentence around it. But this one, imagine you're going word by word and creating your answer on the fly. And that generates insane requirements

on the entire memory stack, which is what these photonic startups are trying to address. There are also others where people are trying to do the entire matrix multiplications using lasers, where interference is equal to addition. They just change the phase of the same frequency. When you change the phase by enough, you can always have two waves cancel each other. So that becomes like a subtraction of two things. Right. And if you have exactly the same phase, they become addition. So once you have addition and subtraction, then enough adds becomes a

multiply, and then you can pretty much get all of arithmetic happening just through lasers. Some startups are trying that, but it is still not clear whether they will succeed. If they do, that's another massive revolution, because it will cut down the power or increase the speed of computations enormously, because you can have so many frequencies, lasers

operating in parallel, doing all your different multiplications. It just blows up. It's mind-blowing, the number of things that can be done. I thought lasers on lightsabers in Star Wars were cool, but now you're talking about those lightsabers doing math while Darth Vader and Luke Skywalker are fighting. That's a pretty crazy image. But they all deal with

one fundamental problem, which is that lasers are very sensitive to temperature. Star Wars doesn't show you this, but if it's hot, Luke Skywalker's saber might actually be shorter. If he's on a cold planet, his thing might be longer. They don't talk about temperature effects, but temperature effects are a serious problem when you're doing AI, because your answers could keep changing depending on whether the chip is hot or not. And you cannot expect results to depend on, you know, whether

the day was, you know, a hot day. It's a hot day, so I'm going to tell you that all the cats are actually dogs. And it's a cold day, so all the cats now become mice. That is unacceptable. So they have these kinds of physical things to deal with. You're saying this is like a turtle egg, where the gender of the turtle depends on the temperature at which the egg hatches. This is a problem with global warming, where a bunch of turtles were, I think, all turning out male or something. So you're telling me the cutting edge of

AI is like a turtle egg that's dependent on the temperature. If you go using lasers for math. If you stay off that, then you're fine. Digital logic is reliable and consistent. It is not dependent on temperature. But one thing that all this brings up is that we talk about blockchain being horrible for the environment, but what you're describing is that even this AI stuff, ChatGPT, is probably terrible in terms of power consumption and heat. It's enormous. It's just absolutely horrendous in terms of the resources it consumes, which is why there is a massive effort to really make it faster. Right, but I have a different... Yeah, go on. No, no, I have a different take in general,

like, hey, are you making something? You know, you're training this one computer once, and then inferring is relatively easier, versus somebody driving 20 miles and sitting in an office in front of a computer and then doing the same work for a day. Right. So. Yeah, I guess it's a debate that we can have. Yeah, yeah. But, you know, AI models will become bigger before they become smaller. There is no way we can economically continue to sustain these kinds of big models. They will grow bigger. And when people figure out, okay, what is like, you

know, what's my working formula, they will aggressively shrink it to something which is not as expensive. Makes sense, something like the brain, right? So we have the human brain, which has a hundred billion neurons, but then we only use over 10% of it. So we will build the entire thing and then figure out which parts are needed. And then I guess compress the remaining parts and so on. So probably. But what does this mean? Bundy, and maybe it's something we've talked about, carbon capture and trading of carbon credits, right? Maybe there's a future view where the computation itself now becomes a constraint, not because of any other reason outside of the environment.

And now companies have to figure out how to really ration how much compute you throw at what given problem. And maybe you can figure out how humans also have to trade this, because I don't feel terrible about going and asking ChatGPT to come up with a stupid poem about a cat. Maybe it consumes, I don't know, the power equivalent of a city in Asia or something. And there are two answers here. One is that data centers are trying to go green, where as long as it's going to a data center and they power it with renewable resources, you need to be less worried about the environmental impact. But AI is also going to be personalized, going onto your device. So your phone and your laptop will have AI smarts. Now does Apple pay the carbon price

of its manufacturing or the carbon price of all that the laptops burn? Because, I mean, they're putting all this big-ass silicon into your phones and laptops and that's gonna keep running. So to be personalized, and for privacy, AI is gonna go on device. And if it's on device, the devices have to get far more beefy. Yeah. Now who is talking about where the carbon price is coming in? Is it with the company or with the consumer? I don't know. Great question. I think it's a good segue into going from tech to other aspects of the business,

right? So something that I'm very curious about is the whole economics of it. We have touched upon the carbon economics of it, but just thinking about the pricing, even how this industry is organized. For example, I wanted to learn more about the contract manufacturers, how they work, and how this fits into the entire ecosystem. Yeah. So the biggest contract manufacturer today is TSMC, as we touched upon, Taiwan Semiconductor Manufacturing. And then there is of course Foxconn, which does a lot of the later assembly work once the chip's made. But typically, if you go back 15, 20 years, all of manufacturing was done by Intel and AMD. They made the chips, they

designed the chips, they manufactured the chips, they had this entire vertically integrated business. Right. Everything they did was tightly coupled. They would only make designs which are manufacturable. They would not go outside that. Yeah. When I was in my undergrad, the only decision that I had to make was, hey, is it an Intel computer or an AMD computer? Yeah, exactly. Or a core used to work well on AMD. You know, you wouldn't want Intel. You know,

things used to be very simple. Yeah. And these were the only ones manufacturing, and TSMC was not as good. But the magic has happened over actually just the last three, four years. Intel was always a generation or two ahead of TSMC, mostly because it could tweak its manufacturing. And when I was in the factory, we did the same thing, where we canceled the designs if they were hard to manufacture. So there were particular

designs where we would tell the designers, go fix it, go change it. Right. It's logically perfect, but it's a pain to manufacture. So TSMC was always fighting with one hand behind its back, because it said, I can manufacture anything. I'll give you a rule book, and as long as you stick to the rule book, I guarantee you I'll do anything. And the fact that they could fight with one arm behind their back and produce good quality chips basically allowed Apple to start designing its own chips. I see. And Apple then started gearing its chips to its software. So the apps that run on the iPhone, the most popular apps, will be given priority to become more and more efficient.

And it will design its logic to basically make these apps go even faster. For example, Uber, if you run it on a very old iPhone, will just kill the battery. But if you run it on a new iPhone, it will run for long, because these apps have been always optimized

on Apple Silicon to make sure they're always running really well. So they were able to separate that out and start giving their designs over to TSMC to do it. So today Apple, NVIDIA, Qualcomm, all these huge players go to TSMC for manufacturing. TSMC will come up with a rule book, called the design rule book, and you have to honor that. And then that's it. They will make sure it works. So that has happened. Yeah. So the success of the iPhone and Apple's decision to build its own silicon

that has led to the rise of TSMC, if I understand what you're saying. It's the other way around: the rise of TSMC, to be as good, allowed Apple to leave Intel and start making its own chips. Interesting. I see. Not before. That should be market cap, right? You know, this is only the last five years. So Intel has been just flat, flat 200.

Yeah. Yeah. Both of, um, TSMC. I mean, this is something that should get heads rolling, right? It has got heads rolling. It has. And the thing is, I was at Intel in the era when you see this divergence happen. I was in the factory, right, when this divergence happened. I mean, Intel used to convince itself, saying, we have hit the limits of physics. So, you know, if we are slowing, everybody is slowing,

you know, no big deal. We're still a generation ahead. What happened, when you start seeing the stock diverge, is also the point when TSMC actually started beating Intel. It started going ahead in manufacturing technology. That never happened before. And they're a contract manufacturer. They don't know the designs that come to them. They don't know the applications. They don't know those kinds of things. So they just operate off a rule book. And the fact that they were able to beat Intel changed the industry in such massive ways. The fact that you have

an M1 silicon in your MacBook is a direct consequence of this. The fact that Nvidia is so good in terms of its capabilities is one of the consequences. The fact that you have Qualcomm producing all this is a direct consequence. The fact that AMD, which always used to be second fiddle, caught up so much with Intel is because they became Intel's equal in design, and now they have better manufacturing technology that they can depend upon. So, I mean, so they are...

completely eating into Intel's data center business because of this. Right. Two very interesting things, right? I think one timeline wise, I think 2020 is when Apple Silicon gets released, like for the iPhone, for the Mac and a bunch of like things together. That's I think where in that chart, Chaitanya, like the TSMC market cap starts to like really, really start to hockey stick. And the other interesting thing is, I mean, it seems like then, Renal, from

what you're describing, it's like America's lost its competitive edge over the last few years with the chipset. Two years. Yeah, two years. It's very, very recent, but you can see the impact how like on product. It's such a dramatic impact because people could always ensure that, okay, like, you know, everybody can design as well as me, but I can manufacture better than you all.

So overall my chip is better, but that the moment it changed, like suddenly you had like an explosion of other providers, all having like equal or better designs than them. Yeah. So this chipset. Right. This is so dramatic, right? So chips being such an important part of the AI revolution, right? And this should be a much bigger story than hey, I just think flat versus this. And

especially in Taiwan, which is like, which is going to be, I guess, you know, we'll go into the geopolitics now. Why it has become such a hot, hot issue now. I mean, Nancy Pelosi's visit to Taiwan now takes on a whole new meaning when you look at this chart. It's not just posturing. It's like security of the United States at some level. Yes. It's like linked

to this. Absolutely. Which, I mean, geopolitics-wise, yeah, you know, Taiwan says they don't need a missile defense. What they have is a silicon shield, and their silicon shield is basically TSMC. The presence of TSMC ensures that the world will come and save Taiwan. Because without it, all of the silicon, basically the industries

and all of the advanced economies just grind to a complete halt. Because if TSMC's factories are invaded and occupied, that's a chip stop. And then you can just imagine the massive ramifications on everything, because we are in a highly automated economy. Without a continuous improvement in chips, we're in deep trouble. Yeah. So just understand that. Would it not be good for you then? Oh, sorry. No, I was just saying that chips are almost

like the new oil, right? If you are an oil-producing country, other countries will come to your aid in case of an attack, because oil is the foundation of the economy. You're almost saying the silicon shield is the same thing in the 21st century. Right. Pretty much. It's Saudi Arabia in 1990, when Kuwait, sorry, 1990.

When basically Saddam threatens it, the world comes to its rescue. They honestly don't care about Kuwait as much as its oil. It's the same thing with Taiwan. The world will come to its rescue. The challenge, of course, is that it's China. It's not Iraq versus Kuwait, which was, you know, militarily weak countries attacking each other. So this is another added conflict. It's even bigger

risk, right? So now you're going against another world power, right? Yeah. Right. You had a question though, I think. Yeah, I was just gonna say. So the difference from chips being the oil, right? Oil is physically restricted to be there. So one thing that I wanted to understand, the question I had asked, is why can't the TSMC factory just be airlifted and put in Texas? So what's called copy exact is very, very hard. So you need a lot of, you

need some time, like several billion dollars' worth, where the factory is set up and it's just producing junk before it catches up with the capability of the Taiwan factory, and all the supply chain of chemicals, the equipment, everything has to be set up. And that is something that TSMC is very unwilling to spend on. So that's where the subsidies come in. They're like, we're happy to do it, but you pay for all this downtime. Because economically it makes no

sense to me because I will continue investing more into Taiwan and just making the factories even bigger. And so if you want me to take an economic decision, pay for it, pay for my factories. They're willing to spend billions of dollars in one place. Right. So they should, yeah. I mean, this should be a no brainer, I guess. Right. Yeah. Yeah. Yeah. Yeah. That's what

the CHIPS Act is probably trying to do, right? It's trying to attract, exactly, subsidizing these things so that people can come back. It's to reverse Chaitanya's chart. I think it's to put Intel, some American company, back in the business. I don't think the chart will reverse. The chart will actually expand. They're basically telling TSMC to come and set up a factory here. So TSMC makes more money. It's not about making Intel catch up. It's just saying, TSMC, you guys can be the most profitable. You're fine. You can continue beating Intel, but make your chips here, because if your Taiwan

factory is occupied, you're still producing something here, which is going to feed my country. I guess it looks like it will happen at some point. It is going to happen. Yeah. So I mean, we have just been talking about TSMC versus Intel, right? So where do the other players fit in: NVIDIA, Google, the Oracles of the world? Take Google, right? How are they thinking about these things? So Google, of course, manufactures at TSMC. They all manufacture at TSMC, but they're in a different space. So if you have your chip, the question is, what can you do? And Google has the story of being vertically integrated for everything except manufacturing.

So it designs the chips, it has them manufactured, it's got its own chips, it's got its own data centers. And it's got its own software, AI software. And what it's then trying to do is basically, can I then just serve it through the cloud? Since I own the entire stack, can I come up with a much cheaper alternative than my competition? And that has been Google's strategy throughout. It created the software, the framework, TensorFlow. It has TPUs, it has Google Cloud. They all

exist to basically feed one story: that if you use AI, do it through Google's cloud. I see. You get the best price and, you know, the best models and best everything. Right. And their AI-first investment was to make a vertically integrated AI company. I see. I see. Got it.

Right. And who's making TPUs by the way? Is it TSMC as well or? Yeah. Yeah. TPU is fabricated by TSMC. Everything is. Everything except Intel chips. And the really embarrassing part is some of Intel's chips are also being manufactured at TSMC. Wow. They are struggling with delays and their own process not being good enough. And they're like, I cannot lose market share. So I will also go to TSMC and manufacture it there. And it is like a real egg on the face. I mean, the other thing I want to share, Shivan, in the chart that you showed,

like, that's what I still can't get my head around. It's not like they, they sunk, like, billions of dollars into R&D or something. Like, how did it suddenly happen in those three years? It seems like this is a huge story, right? That we're just now learning about. Like, how did that happen? Because the hockey stick happens after 2020. That's obviously the pandemic and everything, but investment-wise, like, Intel has been sinking money as well, right? It's not like the red bar on the left has dwarfed the realization. There are three factors at play. The first is a realization from investors that Intel is not catching up in 2023, 2024, because Intel's been claiming every year that it's going to catch up with TSMC, and then one year goes by and it's like, oh, there's a delay. There's another delay. And this is

where investors have realized that it is not catching up. Um, second would be... Catching up, though, my point is, like, how did TSMC even get ahead? Like it's just, that is, uh, I wish people would do a documentary on TSMC. They took the lead and now people have realized that the lead is there to stay. It's not going to be covered anytime soon. So that's when, you know, it's, it's a remarkable growth. You know, what TSMC has pulled off is just humongous. In the US we were busy with the culture wars. Hey, should I go to office? Should I vaccinate myself? Um, should I commute or just work from home? When all of this was being discussed, um, TSMC was just chipping away. Right. So that's the charts. Yeah. And also

during the pandemic, right, they were able to shut down, quarantine the entire island, the whole country. The factories were running 24-7 all the time. And the total production they were doing just exploded. So they sold more. Their competition was nowhere in sight. And they have this lead that is going to be maintained for several more years. And that

is when people realized they have a monopoly. Samsung has the same monopoly in memory. TSMC has it in logic. And these are the two things which create AI. So NVIDIA's chips are logic plus memory. And that's when you see the realization that, oh my gosh, these companies should be worth way more than they are. Right. Boom. Right. Yes. This is such, I mean, the story of three years, like '19, '20, '21. That's when I think you see that little bar, the red bar on the left, starting to really rise. Yeah. And then there's a lead time, what, a year that it takes

the market cap to then catch up. Like, that's just... Who is the CEO of TSMC? That dude, like, should we, I mean, we should get them on the podcast. Totally. Sure. You totally should get somebody from TSMC to just narrate how heroic it was. And, you know, they've had this technology lead since almost 2018, right? Even from 2018, it's taken two years for people to really say that, okay, you know, this lead is there to stay, and all of this before the market catches up shows, you know, reflexes. Right. Yeah. Cool. I mean, this was great, and this is definitely a good cliffhanger, you know, for a part two record. Uh, and you know, a lot of questions are unanswered. Uh, for example, I would like to understand how the whole NVIDIA

is playing out. Right. So, um, how NVIDIA is going big there. What are the challenges that NVIDIA faces, and also some details on, um, AI accelerators and so on. So just focusing more on that. So we should definitely. Absolutely. Yes, that will come in the next version of

our interview. But yeah. Absolutely. This was great. So any closing thoughts, you know, or any trailer for the next part? I mean, I would say the amount of action in the space is massively underrated and understated. It is like, you know, a cliffhanger, you know, it is also like, you know, extreme competition that even market leaders are facing. It is not that TSMC is sitting fat and happy. It is not that Nvidia is sitting

fat and happy where they are. You know, competition, they're, you know, they're fighting battle by battle, day by day, to reach where they are. And the challenges that they will have to fight and the battles that they... So for Nvidia, you know, it's becoming a game of Go. So it's no longer just one opponent. You know, Nvidia's opponent is now, you know, AWS, it is Google Cloud. And how does it, you know, it is the standard AMD, which it always was, but then also

these new things opening up, and how does it continue, you know, being the numero uno in this new landscape is a massively interesting problem. And Renal, I think when you'd spoken before, you'd mentioned this very interesting elephant versus whale story. So maybe you can tee that up as a teaser for what's going to come the next time. Absolutely. Like, you know, each one is a specialist. Like NVIDIA is a king in the GPU ring. AMD is a king in the CPU ring. And now we have, like, you know, AWS is a king in the cloud ring. Everybody's trying to eat into the other market. How they do that is going to be fascinating. Because each one is creating their chips. You know, is there

a question, like, does everybody start creating their own data centers? Does NVIDIA go that route? How does it keep the, you know, what advantages does each one have? What are their disadvantages? It's a, you know, they're all champions of their own arena, but they're very weak outside it. And they're all trying to go into that. And the question is, will they or will they not? Right. It's the story of the elephant and the whale, where the whale is, you know, king of the sea and the elephant is king of the land. But can the whale walk, or can the elephant, you know, swim? That is the, that's the question. Elephant versus whale, assuming both are carnivores and both are, like, desperate to become king of their realms. A tiger versus a killer whale or, like, what would... But I get the point, right?

It looks more like Godzilla versus Kong, right? Yeah, absolutely. It's like you didn't mention Intel though. Like, was that a strategic omission? So no, no, Intel's in the same. No, not at all. They are firing back. Um, they have, like, you know, a lot more cash in hand than other companies, so they are plowing it into anything that works. They have like a portfolio: they have a CPU, they have a GPU, they have an out-and-out accelerator, they have FPGAs, you name it, everything, but they're not the king of anything. So the question is, like, how will they come and, you know, um, you know, really catch up here. And that's, that's their challenge. Amazing. So

good, good time to, uh, I think pause here before we, you know, go through the second part, right? Uh, and, uh, I think, yeah, let's dig deeper into each of these bars next time. Yeah. Thanks for coming on the show. It was awesome having you on. It was a pleasure. Absolute pleasure. It was a lot of fun talking. Lovely. Talk soon.

2023-05-21
