How Chips are made Worse to get Better

How can you make a chip better by making it worse? It has to do with the height of transistors, the height of cells, and the question of how we can continue to scale chips when transistors just don't get smaller anymore. A special technique called "fin depopulation" is one of the reasons why modern chips continue to scale, even though new process nodes struggle to keep up with Moore's Law. It's a less talked about and often misunderstood technology, especially when it comes to its most advanced implementation in the form of TSMC's FinFlex, a technology that even experts sometimes seem to get wrong.

Let's take a look at the tricks of modern semiconductor manufacturing and the secret behind TSMC's FinFlex, which, I promise, you will be able to understand by the end of this video.

In order to figure out what fin depopulation really is, how FinFlex works, and why it's so important for Moore's Law, we have to set the stage with three basic but very important concepts of semiconductors.

The first one is the electrical properties of transistors. At the most basic level, a transistor is an electrical switch that can control the flow of current between two areas, called "source" and "drain". This control is enabled by physically placing a so-called "gate" between source and drain. If the gate is active, it creates an electric field that affects the flow of current between source and drain.

Now you might ask yourself, when does the current actually flow? When the gate is on or when it's off? Does the electric field stop the flow of the current or does it facilitate it? The answer is: it depends. There are actually two different types of transistors, "N"-type and "P"-type. Since we are talking about electricity, "N" stands for negative and "P" for positive.

An N-type, or NMOS, transistor uses n-type material in the source and drain regions of the transistor, while the body of the transistor is made of the opposite, p-type material. If you apply a positive voltage to the gate, it creates an electric field that attracts electrons to the region beneath the gate, which forms a conductive channel through the p-type material and lets the current flow between source and drain. If you turn the gate off, the electric field disappears and the flow of current stops too.

A P-type, or PMOS, transistor uses p-type material for source and drain and n-type material for the body of the transistor. A PMOS transistor conducts when you apply a negative voltage to the gate, which repels the electrons in the n-type body and attracts electron holes, so that a channel forms between the positive source and drain regions.

NMOS and PMOS are very similar, they just function in electrically opposite ways. An NMOS transistor needs a positive voltage to be turned on, a PMOS transistor needs a negative voltage. As such they also have their own advantages and disadvantages. NMOS switches faster and PMOS is more power efficient.

For a while, NMOS and PMOS transistors fought over which one is the best to use in semiconductors. Today, almost all semiconductors use so-called CMOS transistors, where the "C" stands for complementary. CMOS is a transistor type which combines one NMOS and one PMOS transistor. It's really just two transistors next to each other. CMOS offers the best combination of switching speed for high performance and energy efficiency. But it also means CMOS transistors are larger, because they are quite literally made out of two transistors instead of only one. That's the first concept we need to understand: CMOS means double the transistors.

The second concept we have to understand in order to make sense of fin depopulation is the actual fins that are being depopulated. What are those? When we talked about NMOS, PMOS and CMOS, we talked about the electrical properties of the transistor. But there's another aspect to a transistor: its physical form, meaning how source, drain and gate are manufactured. Up until around 10 years ago, transistors were manufactured on a flat surface and fittingly called planar transistors. In this layout, the gate sits above the channel between source and drain and the electric field affects the channel from the top.

But this 2D approach has a big flaw. If you want to reduce the size of a transistor in order to increase transistor density, which is what new process nodes are supposed to do, you have to shrink the size of all its components. This means that source and drain are getting closer and closer to each other and at the same time, the gate itself also gets smaller. At some point, source and drain are physically so close to each other that not even the strongest gate can control the flow of the current anymore. So-called "short channel effects" ruin everything and the electrons just do whatever they want, no matter if the gate is turned on or off. The planar transistor couldn't scale any further.

In 2012, Intel pioneered a new type of 3D transistor, called a FinFET. In FinFETs, source, drain, channel and gate are built as elevated 3D structures. This means the gate wraps around the channel from three sides instead of just one, which allows the electric field of the gate to assert greater control over the channel. 3D fin-based transistors enable stronger gates, so source and drain can be physically closer without negative short channel effects. Which means you can continue to make transistors smaller.

If we look at the cross-section of a FinFET, we can actually see the physical height of a FinFET transistor, as the fins are built upwards from the silicon wafer. They are called fins because they look somewhat like the fin of a fish. And while early FinFET nodes had less pronounced fins, their structure and height started to improve with each following process node. In this comparison we can see Intel's first 22nm FinFET node next to 14nm and 10nm fins. The 14nm process already shows a great improvement and the 10nm node, now called Intel 7, has much taller fins.

But this isn't just for show, there's a reason why fins are being built taller with each new generation. In a FinFET, the height of the fin is pretty much proportional to the transistor's conductive strength. By increasing the height of the fin, the transistor can transport more current. So with the evolution of the FinFET, the transistors not only got smaller, as in they were more tightly packed, but also better at the same time.

Right now we are at the cusp of a new generation of transistor, the successor to the FinFET. I've talked about it in detail in my last video, go check that out if you are interested. But we are not quite there yet. Almost all current process nodes still use FinFETs, with TSMC's N3 process node family being the most widespread one. That's why further optimizing fin-based transistors is important to improve transistor scaling, even in the years to come.

So far, we have covered two of the three basic concepts we need in order to understand what fin depopulation is and how FinFlex works. We know that current transistors are CMOS based, meaning they combine an NMOS and a PMOS transistor. And we know that these transistors are built as so-called FinFETs, with elevated 3D structures. The higher the fins, the better the conductive performance of the transistor.

The third and last concept has to do with how these transistors are combined to actually serve a purpose. Because an electrical switch alone doesn't do anything besides turning on and off. Even if it's super fancy with CMOS and FinFET.

If you want to build a chip, let's say we're designing a new CPU, you need to create structures that can perform calculations, store and read data, and much more. That's where the so-called logic gate comes in. A logic gate takes an input in the form of voltage and produces an output, also in the form of voltage. Perfect for a transistor, which works with voltages as input and output.

Logic gates always act in a predictable manner. For example, a logic gate might have two inputs and only if both inputs are the same, as in the same voltage, there is an output. If the input voltages are different, there is no output.
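The example gate described here, which only produces an output when both inputs match, is what is usually called an XNOR gate. As a rough illustration, it can be written down as a truth table in a few lines of Python. This is only a sketch that treats voltages as plain booleans; the function names `same_inputs_gate` and `truth_table` are made up for this example.

```python
def same_inputs_gate(a: bool, b: bool) -> bool:
    """Output is high only when both inputs carry the same level (XNOR)."""
    return a == b


def truth_table(gate):
    """Evaluate a two-input gate for every combination of input levels."""
    return [(a, b, gate(a, b)) for a in (False, True) for b in (False, True)]


if __name__ == "__main__":
    # Print one row per input combination: input A, input B, output.
    for a, b, out in truth_table(same_inputs_gate):
        print(int(a), int(b), "->", int(out))
```

A real gate works with voltage levels and transistor networks rather than booleans, but the predictable input-to-output mapping is the same idea.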

This logic gate could be used to check if outputs from other logic gates are the same or not.

There are many different types of logic gates. The most common ones perform operations like AND, OR, NOT-AND (which is also called NAND) and many more. Don't worry, you don't need to understand how they work in order to understand FinFlex.

Logic gates are first described as a technical diagram, but in order to do actual work, they have to be built using transistors. And that's where the CMOS FinFET transistors come in. Based on the electrical diagram of the logic gate, transistors are combined in such a way that they perform the same function as described by the electrical drawing. For example, only generate an output when two input voltages are the same.

Let's get back to our chip design example. We are still working on our new CPU, which means we need to combine a lot of different logic gates into even larger structures, so that these combined logic gates can actually perform calculations. But we are in luck, because we don't have to worry about creating the logic gates that we need to build our CPU. Our foundry partner, such as TSMC, Samsung or Intel Foundry, already did that for us.

When you design a chip on a specific process node, your foundry partner has already prepared a large variety of different logic gates for you to use. And I'm not talking about the electrical drawings, I mean real-world implementations based on transistors of the specific node you selected. They are all part of the node's process design kit (PDK) and are commonly called "standard cells". There are so many of them that there is even a "standard cell library". When someone talks about libraries related to process nodes, that's what they are talking about. It's literally a huge library of pre-designed logic gates to choose from. What great customer service. I'm guessing that's why foundries are so expensive.

But it gets even better. Because not only did our foundry partner already pre-design every logic gate we could possibly need as a standard cell, they also offer different versions of the same logic gate for us to choose from. But why would they do that? What's the use of different versions of the same logic gate?

Every time I approach a complex topic in semiconductors, I try to decrease complexity by looking back in time. It's often easier to understand a concept in a chip that was designed 20 or even 30 years ago. A fantastic resource for that is Ken Shirriff's blog, which I not only highly recommend, but which also provides the perfect explanation for our question regarding different implementations of the same logic gate.

Ken Shirriff has a blog post about reverse engineering an Intel 386 processor from 1985, which already used CMOS transistors and a standard cell approach. Let's take a look at how Intel was designing CPUs back then.

Like every CPU, Intel's 386 made heavy use of the NAND logic gate. Ken Shirriff went the extra mile and produced high-res die shots that allow us to take a look at real photos of the NAND standard cells inside the 386. As you can see, it's a CMOS-based standard cell, as it uses both NMOS and PMOS transistors. This specific combination of transistors is able to perform a NAND logic operation, that's why it's a NAND standard cell.

But there's more. The Intel 386 also implemented much smaller and much larger NAND standard cells. All three cell types perform the same function, but they do so with different electrical characteristics. The small NAND cell uses the same number of transistors as the normal one, but each transistor is smaller. The gates are only about half as wide as on the normal-sized NAND cell, which reduces the output current.

The large NAND cell, although performing the same function as both the normal and the small cell, uses more transistors. Ken Shirriff notes that it's essentially two standard NAND gates in parallel. This allows the large cell to provide double the current.

Even back in 1985, the same logic gates were implemented in the form of different standard cells, each with its own unique electrical and density characteristics. Smaller cells take up less space and use less power, but also provide lower current. Larger cells provide higher currents, but use more power and need more space. It's always a trade-off.

The reason why engineers create these different versions of the same logic gates is that different areas of a chip have different clock speed and current requirements. If we are selecting a standard cell to use inside our high-performance CPU cores, we want to select the extra beefy implementation that can handle very high switching speeds. Our low-power efficiency cores, on the other hand, can make do with the normal cells, so we save power and silicon area at the same time. Less important areas of the chip might even be best suited for the half-size cells, because we don't need a lot of current there. Even if we had infinite die space available, using only large cells would lead to very high power consumption, something we want to avoid.

Once we have chosen the standard cells we want to use, we have to place them on the chip. That of course doesn't happen at random, there are always rules to follow. Standard cells are placed in so-called "cell rows", which is exactly what it sounds like: a row of standard cells next to each other. And many of these rows combined form a so-called logic block. With chip design, it's all about efficiency. You start with the smallest parts, transistors, and build up from there. Transistors are used to create standard cells, which go into rows, which build the logic blocks. And many logic blocks combined create a chip. Actually pretty simple, if you look at it that way.

Now we have all the ingredients. We know that CMOS transistors are a combination of NMOS and PMOS, that current-gen transistors are built with elevated 3D fins, and that these transistors are then used to design various types of standard cells which are placed in neat little rows. And the standard cell is exactly where "fin depopulation" starts to work its magic.
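Before moving on, the trade-off between the small, normal and large NAND cells from the 386 example can be sketched as a tiny selection routine: pick the smallest cell that still delivers enough current. This is a minimal sketch, not how a real design tool works. The relative drive values (half, normal, double) follow the description above, while the area numbers, the cell names and the `pick_cell` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandCell:
    name: str
    relative_drive: float  # output current relative to the normal cell
    relative_area: float   # hypothetical area relative to the normal cell


# Three versions of the same NAND gate. Drive ratios follow the 386 example,
# the area ratios are invented purely for illustration.
LIBRARY = [
    NandCell("nand_small", 0.5, 0.6),
    NandCell("nand_normal", 1.0, 1.0),
    NandCell("nand_large", 2.0, 2.0),
]


def pick_cell(required_drive: float, library=LIBRARY) -> NandCell:
    """Return the smallest-area cell that meets the required drive."""
    candidates = [c for c in library if c.relative_drive >= required_drive]
    if not candidates:
        raise ValueError("no cell in the library is strong enough")
    return min(candidates, key=lambda c: c.relative_area)


if __name__ == "__main__":
    for need in (0.3, 0.8, 1.5):
        print(need, "->", pick_cell(need).name)
```

Choosing by area first reflects the point made earlier: a larger cell is only worth its extra power and space where the current is actually needed.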

Remember the reverse-engineered Intel 386 from Ken Shirriff's blog? While it was a CMOS design, it was long before the invention of FinFETs, so it used planar transistors.

Today, standard cells are based on FinFET transistors. And FinFET transistors don't just have a single fin, they usually come in flavors of between two and four fins per transistor. The reason for that is performance, drive strength and reliability. As you saw before, early FinFET nodes didn't have very pronounced fins. Multiple fins were needed to achieve the required transistor performance. That's why FinFETs use multiple fins.

And since we are talking about CMOS transistors, which contain two transistors, the actual number of fins is doubled. Now we are talking about between four and eight fins per CMOS FinFET. That's a lot of fins. And these fins take up space. Not because of their height, but because these fins sit next to each other, which takes up space on the silicon wafer.

In the past, FinFET process node scaling was achieved by reducing the so-called fin pitch, which is how close the individual fins are to each other, and the gate pitch, which determines how close the gates are to each other. If we look at a 3D representation of FinFETs, we can see that fin and gate pitch are related to the size of the structure. If we reduce these, we can pack more transistors into the same area, thus increasing transistor density.

When it comes to the size of cells, the next step up from transistors, the most important variable is the "cell height". The term "cell height" is a bit confusing though, because unlike transistor height for FinFETs, we are not actually talking about height as in going upwards from the silicon wafer. Instead, the layout of most cells is depicted by looking at the cell from above. In this view, cell "height" is more akin to cell "length". Reducing the fin pitch reduces the length of the cell and thus its area.

But the miniaturization has slowed down. New process nodes are not able to achieve significant reductions in fin pitch and gate pitch. And that's where fin depopulation comes in. On the left we can see an Intel 7 standard cell. That's the process node used for Alder Lake and Raptor Lake CPUs. We can see the CMOS character of the cell, with PMOS transistors on top and NMOS transistors below. And we can see that each area is using a 4-fin implementation. The whole cell has eight fins in total.

Now let's compare that to a cell from the new Intel 4 process node. That's Intel's first EUV node and it's used to manufacture parts of Meteor Lake.

As you can see, the Intel 4 cell is a lot smaller than the Intel 7 cell, and that's due to multiple factors. First of all, Intel was able to reduce the space between the PMOS and NMOS areas of the cell. Next, Intel 4 also offers a slightly reduced fin pitch, so the node still does some traditional scaling. But the biggest change is the reduction in fin count.

Instead of four fins, the Intel 4 based standard cell is designed with only three fins. And since it's a CMOS design, both the PMOS and the NMOS transistors get a reduction in fins. The Intel 7 cell on the left has 8 fins, while the Intel 4 cell on the right has only 6 fins. This reduction in area isn't due to shrinking features with a more advanced process node, it's simply based on removing features. The fins are being depopulated. On paper, it's a worse transistor.

But it doesn't have to negatively affect the performance, because as you remember, the height of the fin is proportional to its conductive strength. It's possible that a 3-fin transistor based on Intel 4 could offer better performance than a 4-fin Intel 7 transistor, if the increased fin height adds more conductive strength than what is lost by removing one fin.

This example is based on an Intel node, but it's the same for all other process nodes, no matter if it's TSMC, Samsung or any other FinFET design. At the beginning of FinFETs, designs with four or even more fins were the standard. Nowadays, even three-fin cells are usually reserved for high-performance cells, while two-fin designs are becoming the new standard. And 1-fin cells are not too far away. That's why recent FinFET process nodes still offer a decent increase in transistor density. Because once the traditional scaling of FinFETs slowed down, the foundries started to reduce the number of fins used in their standard cell libraries.
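The arithmetic behind this trade can be sketched in a few lines. The sketch assumes, as a simplification of what was said above, that drive strength scales with fin count times fin height, and that the part of the cell height taken up by fins scales with total fin count times fin pitch. All numbers below are placeholders chosen for illustration and are not measured values for any real node; the `FinFetCell` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinFetCell:
    fins_per_transistor: int
    fin_height_nm: int  # placeholder value, not real process data
    fin_pitch_nm: int   # placeholder value, not real process data

    def relative_drive(self) -> int:
        """Per-transistor drive in arbitrary units: fin count x fin height."""
        return self.fins_per_transistor * self.fin_height_nm

    def fin_span_nm(self) -> int:
        """Cell height taken up by fins: PMOS plus NMOS fins x fin pitch."""
        return 2 * self.fins_per_transistor * self.fin_pitch_nm


# Older-style cell: four shorter fins per transistor.
four_fin = FinFetCell(fins_per_transistor=4, fin_height_nm=50, fin_pitch_nm=34)
# Depopulated cell: three taller fins per transistor at a slightly tighter pitch.
three_fin = FinFetCell(fins_per_transistor=3, fin_height_nm=70, fin_pitch_nm=30)

if __name__ == "__main__":
    for label, cell in (("4-fin", four_fin), ("3-fin", three_fin)):
        print(label, "drive:", cell.relative_drive(), "fin span (nm):", cell.fin_span_nm())
```

With these made-up numbers the 3-fin cell ends up both smaller and slightly stronger, which is exactly the case in which removing a fin pays off. With a smaller height gain the result would flip.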

Modern FinFET process nodes are intentionally made worse by removing features fundamental to the performance of the transistors. But because the ever-increasing height of the fins can somewhat compensate for the removal, and removing them achieves higher density, the result of a worse transistor is a better chip. At least until you have removed all the fins. Because just like scaling the planar transistor, there is a finite number of fins to remove. You will always need at least one per transistor and two per CMOS FinFET.

Now that we know what fin depopulation is, how it works and what it actually achieves, what about FinFlex? TSMC has been hyping its FinFlex technology and at the same time, many people still misunderstand what it actually does.

At its core, FinFlex is a type of fin depopulation. But it has one advantage that normal fin depopulation techniques don't have. It's a lot more flexible, hence the name. To understand the difference, we have to circle back to standard cells and how they are placed. And here Ken Shirriff's Intel 386 deep dive comes in handy again.

As I mentioned before, we can't just place the standard cells we selected from the cell library anywhere we want. We have to follow certain design rules. And one of them is that standard cells are always placed in rows. These cell rows are exactly what it sounds like, just different cells next to each other. On the Intel 386 die shot we can actually see the cell rows with the naked eye. The combination of multiple cell rows then creates a logic block.

But not only do we have to place our cells in rows, we also have to follow another important guideline: only cells with the same cell height can be placed together, and by extension this also applies to the whole logic block. Remember, cell height was that term that refers to the length of a cell and not its actual height.

The reasons for this rule are simple. First of all, it simplifies placing the cells with automated processes. Second, placing the cells is just the first step of manufacturing. In a second step, all the cells have to be connected with tiny wires, so they can actually input and output voltage. That's the so-called metal layer. Manufacturing the metal layer is already a very complex production step, as you have to connect billions of individual transistors with super small wires.

If the cells placed in a row had different cell heights, planning and placing the metal layer would be even more complex. It's not impossible, but because of automation and workflow optimizations, cells inside the same row and the same logic block usually use the same cell height. And since height is determined by the number of fins, that means the cells in a logic block use the same number of fins.

And that's where the unique properties of TSMC's FinFlex come into play. With the N5 process node, the standard cell library was based on a 2-fin design. The N3 process node also offers a 2-fin standard cell library, but adds two new options: a 3-2 fin and a 2-1 fin based library. And no, this doesn't indicate different fin counts for NMOS and PMOS transistors. Instead, TSMC is adding the option of so-called alternating row heights. You still have to use cells with the same height within a cell row, but with FinFlex, the very next row can now use a different cell height.

The 3-2 FinFlex library, for example, alternates rows with three fins and rows with two fins. That means a single logic block can now contain cell rows with different cell heights. As a result, chip engineers have more options to fine-tune the process to achieve the highest transistor density.

Before FinFlex, you could choose if you want to use cells based on a 3-fin or on a 2-fin design. That was the entire amount of customization. With FinFlex, you now have two more in-between steps.
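The alternating rows and the in-between steps they create can be illustrated with a short sketch. It only models the bookkeeping: which fin count each row in a block gets, what the average per row is, and the rule that cells within one row still share the same fin count. The function names are made up for this example and have nothing to do with TSMC's actual design kit.

```python
from fractions import Fraction
from itertools import cycle, islice


def row_fin_counts(pattern, num_rows):
    """Fin count of each cell row when the pattern repeats down the block."""
    return list(islice(cycle(pattern), num_rows))


def average_fins(pattern):
    """Average fin count per row for a repeating row pattern."""
    return Fraction(sum(pattern), len(pattern))


def row_is_uniform(cell_fin_counts):
    """Within a single row, every cell must still use the same fin count."""
    return len(set(cell_fin_counts)) <= 1


if __name__ == "__main__":
    libraries = {"3-2": (3, 2), "2-2": (2, 2), "2-1": (2, 1)}
    for name, pattern in libraries.items():
        # Show the first four rows of a block and the average fins per row.
        print(name, row_fin_counts(pattern, 4), float(average_fins(pattern)))
```

Using `Fraction` keeps the averages exact, so a 3-2 pattern comes out as precisely five halves rather than a rounded float.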

A combination of 3-fin and 2-fin cells inside a single logic block results in an average of 2.5 fins. And the combination of 2-fin and 1-fin rows has an average fin count of only 1.5.

FinFlex is explicitly not the use of different libraries in different areas of a chip, like some claim. This has always been possible, even before FinFlex. FinFlex brings this flexibility into the logic blocks themselves. With FinFlex, TSMC offers a 1.5-, a 2- and a 2.5-fin cell library. Something that no other foundry is able to offer at this point.

But it's not all sunshine. Placing alternating rows with different fin counts increases design complexity. You have to make sure that very important logic gates and so-called critical paths are not placed in the rows with the lower fin count, where they would suffer from reduced performance. Instead, it's important to use the alternating rows in such a way that less important cells are placed on rows with lower fin counts, while important cells can benefit from the rows with more fins.

Of course this is all done in software. No modern chip can be designed without Electronic Design Automation. Companies like Cadence and Synopsys have already updated their tools for FinFlex designs. As process nodes and their design rules become more complex, EDA providers will continue to be on the winning side.

SemiAnalysis has a very interesting blog post about the use of FinFlex and other ways to increase transistor density outside of the process node itself. It's very in-depth, but they give real-world examples and explain how it affects different companies and products. And if you understand the basics of fin depopulation, it might be up your alley. I've put a link in the video description.

Chip design and semiconductor manufacturing have always been a complicated subject. But when we break it down into little pieces, even complex technologies become tangible. And I hope I didn't promise too much at the start of the video and you now actually understand the concept behind fin depopulation and what makes FinFlex so special.

Fin depopulation and FinFlex are not a very hot topic. But considering their importance, I think they should be. They are one of the major reasons why FinFET process nodes still continue to scale. Without fin depopulation and other design technology co-optimization, Moore's Law would have died long ago. Cutting the fins off of transistors has been a lifeline for semiconductor manufacturing. Even though, at face value, the idea looks like a brute force approach. But at some point, you just have to start cutting things off, if you want to stay competitive. Even in the semiconductor industry.

I hope you found this video interesting and see you in the next one.

2024-09-01 18:05
