# The Evolution of the Operating System


In his video on Large Language Models, or LLMs, OpenAI cofounder and YouTuber Andrej Karpathy likened LLMs to operating systems. Karpathy said:

> I see a lot of equivalence between this new LLM OS and operating systems of today.

I am intrigued by this notion. Operating systems are some of the world's most important technologies, with a history spanning 80 years.

It mirrors the journey of computing in all of its physical forms. In today's video, we look at the evolution of the Operating System.

## Aspects of the OS

So what does an operating system do? Perhaps not so surprisingly, thanks to that long history, this is hard to pin down.

One definition I like says that the OS manages the computer's resources for the User efficiently, reliably, and unobtrusively.

Hardware is hard. There is a lot of it - the CPU, main memory, secondary memory, display, keyboard, mouse, and the network. Users and their applications must navigate the idiosyncrasies and pains of that hardware to make it do something useful.

Operating Systems help make this easier by giving the User or their application programs a clean, pleasant interface for their task - abstracting away the horrors of hardware.

An operating system is defined by its abstractions, because those are what Users interact with on a daily basis. Some have been around for so long we forget how revolutionary they are.

For instance, take the humble File. In the beginning, Users dealt with physical memory - working with cells and bits. But each memory system type has its own peculiarities, and dealing with all of them is a pain in the butt.

You might risk one program overwriting data used by another program - causing both to crash. Not something you want to happen.

The File throws a blanket on top of all that and just gives you this nice, clean abstraction.

You might think that your "file" is sorted away somewhere as a discrete entity on computer memory - like a book in a bookcase. But this is a fraud!

In reality, the file data is scattered in pieces like Cheetos across wherever the computer happened to have storage.

When you "open a file", the file system is gathering those bits, putting them into the right order, and presenting them to you. The OS automatically handles all that behind the scenes, putting the pieces in either primary or secondary storage as needed.

Abstractions like this are needed for us to do our work. Every day we are interacting with abstractions built on top of more abstractions. And it all somehow works.
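
To make that concrete, here is a minimal sketch in Python (the filename is just an illustration): the program asks the operating system for a file by name and gets an ordered stream of bytes back, without ever knowing where the underlying blocks physically sit.

```python
# A minimal sketch of the File abstraction. The program below never sees
# sectors, tracks, or block addresses - it just names a file and reads it.
# The OS and filesystem decide where the bytes actually live on storage.

with open("report.txt", "w") as f:      # the OS picks where the blocks go
    f.write("quarterly numbers\n" * 1000)

with open("report.txt") as f:           # the OS gathers and orders them again
    first_line = f.readline()

print(first_line.strip())               # -> quarterly numbers
```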

## Beginnings

The first computers of the 1940s and 1950s were made to be used by just one user or group of users at a time. So they just gave that user every available resource. These expensive devices cost millions of dollars in today's money and so were rented to individuals and billed by the hour.

However, those individuals found that most of their allotted time was being wasted setting up the equipment for the job. This was costing hundreds of thousands of dollars in lost productivity each month.

So in 1956, the General Motors Research Lab realized that they could make software for their IBM 701 mainframe to automatically handle the loading and unloading of each job - "batch computing".

With batch computing, jobs are transferred from cards to magnetic tape. The computer would then run them all at once sequentially, with the outputs recorded onto a second tape.

Special cards between each job told the computer what resources would be needed to do the jobs. These were called "Job Control Languages".

There are a few who call these the first Operating Systems, but the debate among historians on the validity of that statement remains fierce.

## Multi-programming

The 1960s saw better and pricier hardware - card readers, magnetic tape, disk drives, and I/O devices.

Users realized that not every job used all of the computer's resources. So these expensive resources could be better utilized if different jobs could be run in parallel. Was there some way to take advantage of this?

There was. Back in 1956, the UNIVAC 1103A computer introduced a new concept called the "Interrupt". It let a piece of peripheral hardware call for the processor's attention.

At the same time, there were new innovations in memory capacity. Items like magnetic drums were giving the processing units more memory than they had before.

Together, this let the computer hold and run multiple programs at the same time. While one program is occupied with something like input/output, another can be simultaneously running on the processor. This is known as "multi-programming".
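
As a rough modern analogy - a toy sketch in Python, not a model of how a 1950s mainframe actually worked - the payoff looks like this: a job that is only waiting on I/O does not need to keep the processor idle.

```python
# Toy illustration of the multi-programming payoff: overlap an I/O wait
# with useful computation instead of running them back to back.

import threading
import time

def io_bound_job():
    time.sleep(2)                 # stand-in for waiting on a tape or card reader

def cpu_bound_job():
    total = 0
    for i in range(10_000_000):   # stand-in for real computation
        total += i
    return total

start = time.time()
t = threading.Thread(target=io_bound_job)
t.start()                         # the "I/O" wait proceeds in the background...
cpu_bound_job()                   # ...while the processor does useful work
t.join()

# Takes roughly max(wait, compute) seconds instead of their sum.
print(f"overlapped run took {time.time() - start:.1f}s")
```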

## Time Sharing

If you think about it, running multiple programs simultaneously inside a computer is a small step away from having that computer service multiple users simultaneously.

One of the major problems with batch computing was slow development times. That in turn was the result of a long "edit-compile-run" sequence.

Big batches took hours or even an entire day to run. If there was a bug somewhere, the day's entire output might be just an error message. Immensely frustrating.

So in 1959, the computer and cognitive scientist John McCarthy proposed a possible solution to his colleagues at MIT:

> An operating system ... that will substantially reduce the time required to get a problem solved on the machine ... The only way quick response can be provided at bearable cost is by time-sharing. That is, the computer must attend to other customers while one customer is reacting to some output.

What this meant was a large central computer connected to what they called "terminals" - a monitor and keyboard. This gives the user the illusion that they are the only person using the computer. Very powerful software was needed to coordinate all this and provide this illusion.

Two years later in 1961, the MIT team led by Fernando Corbato managed to get a prototype working on their IBM 709 machine. Lacking hard disk drives, they used a bunch of tape drives attached to four typewriters. It just barely worked.

In 1962, MIT announced the Compatible Time-Sharing System, or CTSS as it was called. A year later CTSS got a hard disk drive, and was offered to large-scale users - though MIT was not allowed to charge for it.

Though hints of the feature were implemented for the military's massive SAGE radar coordination system and other specialized systems at the time, we consider CTSS the first time-sharing system expressly made for the purpose.

By 1965, CTSS had hundreds of registered users at MIT and other colleges across New England, handling up to 30 users at once.

It also implemented the first mail and mailbox function between users - a spiritual precursor of e-mail.

Several other timesharing services emerged throughout the late 1960s and early 1970s. One notable system was the Dartmouth Time Sharing System, on which the BASIC language was developed.

An early version of DTSS later powered a popular timesharing service offered by General Electric. GE was the market leader until the mid-1970s, when competition overwhelmed them.

Today, we have largely forgotten the phrase "timesharing" - though it underpins the idea of what we now call cloud computing. But it had a lasting impact on the history of computers and their operating systems. Before the rise of the PC, this was how people experienced the computer.

## Multics and OS/360

CTSS's success spurred MIT to create a successor.

So in 1964, the CTSS MIT team joined with Bell Labs and General Electric, the market leader in timeshare systems, to create new software called the "Multiplexed Information and Computing Service", or Multics.

Multics' great vision was to enable a time-sharing "computer utility" capable of handling hundreds of users. So kind of like how water utilities provide cheap, ubiquitous water, Multics would bring computer service to the masses.

But the project ballooned as it tried to be everything for everyone, and progress bogged down. Bell Labs finally pulled out in 1969 and things fell apart.

MIT finally got Multics to work on their own, and it was eventually sold to Honeywell, which installed it on a few systems. It gained a cult following and persisted despite Honeywell's determined efforts to kill it. The last site shut down in 2000.

Multics' troubles were reflected in another legendary OS project happening at about the same time. In 1964, IBM announced its historic computer line - the System/360.

Ideally, a program written for one System/360 computer was supposed to run on all of them. That was the whole schtick. But programmers struggled to write software that could work on all these different hardware environments.

Famously, IBM tried to build a single operating system for it - the OS/360. Despite a monumental budget and an army bigger than the Roman legions, OS/360 fell way behind schedule. And in the end, they had to split it up anyway.

Its project leader, Fred Brooks, later wrote a book based on his learnings from the OS/360 experience - "The Mythical Man-Month".

## Unix

Multics failed as a commercial product.

But its groundbreaking ideas - security, hierarchical file systems, a command shell, and more - were incorporated into its spiritual successor, Unix.

I already did a video about Unix's development, so I am not going to reinvent the wheel. But I think it is important to emphasize that Unix was the right thing at the right time.

It had many of the revolutionary ideas of Multics and added a few of its own.

For instance, the pipeline, which let you pipe the output of one process into the input of another. It is like a Human Centipede, but for computer processes.
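
To make the pipeline idea concrete, here is a minimal sketch in Python - assuming a Unix-like system where `ls` and `wc` are available - that wires the output of one process into the input of another, the same plumbing the shell does for `ls | wc -l`.

```python
# A minimal sketch of a pipeline: whatever "ls" writes to its standard
# output becomes the standard input of "wc -l", just like `ls | wc -l`.

import subprocess

ls = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
wc = subprocess.Popen(["wc", "-l"], stdin=ls.stdout, stdout=subprocess.PIPE)
ls.stdout.close()                  # let "ls" see a broken pipe if "wc" exits early
output, _ = wc.communicate()

print(output.decode().strip())     # number of entries in the current directory
```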

Unix and its helpful utilities were written for cheaper, lower-end minicomputers just as those devices came to be popular with users beyond those of traditional mainframe computers.

And because it was written in the high-level C programming language, Unix could be easily ported to other minicomputers. This - and the weird Bell Labs situation that left it in copyright limbo - helped Unix gain wide adoption in universities and beyond.

Unix's rise was a cultural phenomenon that paved the way for other decentralized software communities, like those for the open source Linux OS and the hobbyist microcomputers.

## The Microcomputer

In the mid-1970s, new semiconductor technologies enabled the creation of integrated circuits with thousands of devices on them.

These ICs were powerful enough to be general purpose chips. The first such microprocessor was Intel's 4004, a four-bit chip originally made for a calculator and released in 1971.

Intel later released the updated 8008 in 1972, and then the 8080 in 1974. Other firms like Zilog and Motorola released their own microprocessors as well. These powerful chips would be the heart of what was then called the microcomputer.

Intel - then very small - hired Gary Kildall, a computer scientist and professor at the Naval Postgraduate School in Monterey, California, as a consultant to produce certain software for their 8080 chip.

Intel needed an 8080-compatible operating system for testing purposes. So they helped Kildall port one he had written while working at the Naval School. The new OS was called CP/M - which originally stood for Control Program/Monitor.

Kildall believed that personal computer hardware was getting good enough to compete with existing timesharing systems as a programming tool.

Remember, the great illusion of timesharing is that every terminal user thinks that they are programming on their own computer. What if that were actually true and not an illusion?

## CP/M

Key to achieving this dream would be the memory. Existing microcomputer memory systems - particularly secondary memory - sucked. Things like paper tape and cassettes. None of this was acceptable.

So Kildall got interested in a new secondary storage technology first invented and introduced by IBM, called the floppy disk drive. It offered far more storage at a relatively cheap price.

Oh, and unlike paper tape, it was random access. You can just jump to the data point you want rather than spooling through the whole thing sequentially.

Kildall got a sample drive from Shugart Associates, at the time just a few miles from Intel. Founded by storage legend Alan Shugart, Shugart Associates would later dominate the 8-inch and 5.25-inch floppy drive markets.

Now what? So there was Kildall, in his room with just a naked floppy drive on his desk and a crude Intel CPU microcomputer. So, as you do, he programmed controller software that helped a microcomputer running the CP/M OS interface with this floppy disk drive and its data.

It might not sound like much. But hardware limitations meant that early microcomputer operating systems were often just that: a file system for organizing and managing files on an external disk, plus the ability to load and run programs on that disk. Disk Operating Systems, or DOS.

Kildall founded a company called Digital Research and began licensing CP/M to microcomputer end users, who paid him thousands of dollars. By 1981, he had several hundred licensees.

CP/M - retroactively renamed to stand for "Control Program for Microcomputers" - quickly became the dominant operating system for the small and burgeoning microcomputer community.

Though it was not alone in the industry. Others included Apple DOS, the OS for the very popular Apple II computer by Apple Computer.

A Unix-cloneish thing called Coherent, which was ported down from minicomputers. And this small thing from Microsoft: MS-DOS.

## MS-DOS

Fatefully, CP/M lost its early lead.

In 1980, a rogue team at IBM began a secret project to make their own microcomputer - the IBM PC.

Facing a tight deadline, the team built the machine with parts and software sourced from outside vendors.

The PC team licensed a BASIC interpreter from Bill Gates and his company, Microsoft. They were connected to IBM through Gates' mother, who was co-chair of the United Way non-profit along with John Opel, IBM's CEO.

The IBM PC team asked Gates if he knew anyone making a microcomputer OS, and he pointed them to CP/M. But for reasons that remain unclear today, Kildall did not personally take the meeting with IBM and refused to sign their Non-Disclosure Agreement.

So IBM went back to Bill Gates for an OS. Microsoft was then in negotiations with Bell Labs for a Unix license. That effort would eventually result in the Xenix operating system. However, that was not yet done and there was no time.

So Gates went out and bought a DOS from a local computer manufacturer. He then hired its developer, Tim Paterson, to make a few modifications and rebranded it as MS-DOS.

Critically, Microsoft did not sell MS-DOS outright to IBM, but rather licensed it to them on a non-exclusive basis.

The IBM version that ran on the PC at its release in 1981 was called PC-DOS. To protect PC-DOS from clones, IBM wrote part of the OS - the BIOS - to a hardware chip and copyrighted it by publishing it in a journal.

The IBM PC - with its iconic name and marketing muscle - quickly became the most popular microcomputer on the market. Its setup became an industry standard, inviting competitors and clones.

At the start, MS-DOS was a crude piece of software - about 4,000 lines. Nevertheless, it allowed software vendors like the makers of VisiCalc to bring their software packages onto the IBM PC platform.

A bevy of computer-makers then managed to work their way around the IBM PC's BIOS copyright - kicking off the PC clone industry. Microsoft struck licensing deals with those PC-makers, rapidly grabbing market share in the industry.

Working directly with PC assemblers, or OEMs, scaled far better than CP/M's approach of going right to end users. Microsoft's MS-DOS overthrew CP/M as the dominant PC OS. By 1983, they had a fifth of the microcomputer operating system market.

## Applications

Today, we might see Microsoft as one and the same with their operating system. But in the early days, Gates and Microsoft saw themselves more as an applications company.

Operating systems were important - they provided 50% of the company's revenues - but they were seen as a means to an end.

Gates' thinking at the time was that with an OS you get just a few percentage points of the machine's price - something like $40 for a $2,000 machine. But with an application, you can earn hundreds of dollars.

In 1981, their top-selling application was Multiplan - a now somewhat dated-looking spreadsheet application for MS-DOS. It sold a million copies over its lifetime.

For that reason, Microsoft in 1983 was big, but nowhere near the giant we know them to be today. That year they generated $70 million in revenues - very good, but VisiCorp did $60 million and Lotus did $48 million.

This thinking was why we had the interesting situation of Microsoft offering two operating systems to its customers. MS-DOS was for its low-end IBM PC users.

And Xenix, the version of Unix that Microsoft had licensed from AT&T, was for high-end users.

## Windows

It took time for Microsoft to realize how powerful an asset it really had.

By 1983, semiconductor hardware had gotten good enough that PC operating systems could start incorporating a few needed features.

One of the most needed was multi-tasking. Computer work was getting more interrelated and complicated, involving the outputs of several different programs.

For example, making a company report might require a painting program, spreadsheet, and word processor to be open all at once.

With MS-DOS and other single-task Operating Systems of the day, users had to entirely close down the one program running in front of them before starting another, which was annoying.

Also, the way people interacted with MS-DOS was through a command line. You had to type in the right command to get the computer to do what you wanted. Deviations in the command could give unwanted results.

By 1983, the PC community had narrowed in on the windowing graphical user interface as an elegant solution to these problems. It was first demonstrated by Xerox in the 1970s, and later incorporated into the operating systems for the Apple Lisa and Macintosh - sold in 1983 and 1984 respectively.

Microsoft adopted the windowing GUI for its Windows operating system - first released in late 1985, basically as a shell on top of MS-DOS.

Throughout the late 1980s and 1990s, the PC ecosystem exploded in size. CPUs and other semiconductor hardware advanced in performance like never before. Hardware capabilities once seen only on mainframes made their way to the PC.

The PC's modular design encouraged a plethora of hardware peripherals and software drivers. On the software side, an ecosystem of utilities and applications emerged to suit different environments like the home desktop, the high-performance workstation, and the enterprise server.

To handle all this, Windows evolved a sprawling modular architecture, with each system function handled by a separate OS component. Multiple software layers added new abstractions to help programmers and users navigate these environments.

It took years for Microsoft to get this incredibly complicated piece of software working to its full potential.

But they benefitted as Windows established itself as the dominant operating system, and the company started bundling adjacent software like Office into it.

By 1993, Office had 90% of the productivity market, contributing 50% of Microsoft's revenues. Its low prices - in part due to scale and subsidies from Windows - drove competitors like Lotus and WordPerfect out of the market a few years later.

Thanks to its grip on the PC universe through Windows, Microsoft became the defining technology company of the 1990s. But the sun don't shine on the same dog's butt every day.

## Mobile

The first mobile "computers" were the Personal Digital Assistants, or PDAs. These were handheld PCs popular in the mid-1990s for helping people manage their contact information, addresses, notes, and to-dos.

Apple had been one of the pioneers in the industry, releasing the Newton in 1993. It was an ambitious product, but the hardware was not ready yet - making it difficult to fulfill its promises. For instance, the ability to recognize handwriting.

These early devices were extremely constrained in terms of resources. The original Palm Pilot ran on a 16 megahertz processor and 128 kilobytes of RAM.

This made them extremely challenging to build for - you can't just scale down a PC OS.

Microsoft initially struggled to bring Windows to the PDA market. Their first offering was the Windows CE OS - later rebranded as Windows Mobile - which they produced with hardware partners. Released in 1996, CE struggled with bad battery life, poor OS stability, and a very bad interface.

Successful companies like Palm produced their operating systems from the ground up with these constraints in mind from the start. This meant compromises. For instance, the Palm Pilot lacked a keyboard and full handwriting recognition, relying instead on a shorthand system called Graffiti.

## Phones

It did not take a lot of foresight to see that PDAs and mobile phones would eventually collide.

Having seen what Microsoft did to the PC industry, in 1998 three of the largest phone makers joined together and bought into an operating system called Symbian.

The Symbian phone OS was produced by a British company of the same name that had previously produced PDA software. Adopted by the phone-makers, Symbian became an early leader, with 65% market share and one hundred million users at its peak.

But Symbian failed to build a powerful and lasting ecosystem around its advantages. None of the handset makers wanted to give up their connection to the user, causing serious fragmentation issues and a whole bunch of different UIs.

And since it had to serve so many different hardware environments, Symbian was notoriously hard to develop for. The company struggled to build good tools and distribution channels for its developers.

Nokia was the leading Symbian phone-maker, driving 80% of its sales. And while they grabbed significant market share in Europe and Asia, they struggled in the United States - in part because of the dominant position of mobile carriers like Verizon and Cingular.

## iPhone & Android

The early 2000s saw more improvements in semiconductor hardware.

In addition to faster and more power-efficient processors enabled by the Arm instruction set and its ecosystem, the decade saw the rise of flash memory as a compelling secondary memory option.

The only thing now missing was a compelling interface to pull it all together - as well as a company capable of cutting through the sort of red tape that turned Symbian into a convoluted mess.

Apple made the first breakthrough with the iPhone, famously building its operating system by scaling down the desktop Mac OS. Its multi-touch interface and desktop-class browser instantly connected with users.

And because it was based on Mac OS X, Apple was able to port over its ecosystem of passionate developers. Developers so passionate that they were hacking the OS to make apps of their own before an official SDK was released. The opening of the App Store in 2008 only poured gasoline on that fire.

Google saw the writing on the wall and pivoted their Linux-based Android phone OS in the same direction. By giving Android away for free as open source, Google rapidly stole share from the then closed-source Symbian.

Those old legacy operating systems are now gone. What we now call iOS has made Apple one of the biggest companies in the world. And Android is the world's most widely used OS, period - and a powerful asset for Google.

It is interesting to see how iOS and its deep ties with the App Store help drive Apple's massive Services division - kind of like how bundled apps like Office made Microsoft king of the tech world in the 1990s.

## Conclusion

I have noticed that the story of operating systems across their various form factors shares a bit of a theme.

In the beginning, systems were limited by compute. The first devices - mainframes, microcomputers, and mobile PDAs - were not fast enough to run anything other than the most rudimentary programs. Compromises had to be made to get the products to work.

Over time, the processors did get fast enough. Then the new limit became memory.

Mainframes needed DRAM and the disk drive to manage multiple tasks and users. PCs had a craving for memory that was eventually fulfilled by the floppy disk drive. And mobile OSes could not support bigger programs until flash memory got cheap enough.

Then finally, after that, we are limited by input/output, or the interface. We needed new paradigms of communicating and interacting with our computers to get the results we need. For the PC, that was the GUI. For mobile, that was multi-touch.

So we cycle back to our original question. Are LLMs the next operating system? I don't know. But I did notice something.

It first took breakthroughs in compute to show that larger neural networks had some potential.

Then after that, we leveraged improvements in DRAM memory to really scale up LLM sizes to where they could show economic value.

And then most recently, we needed to find new paradigms of interacting with these LLMs - which is what ChatGPT gave us.

It is fun to ponder the possibilities of an LLM operating system, and where the metaphor can take us. What new abstractions and environments for doing work can an LLM OS offer us?

What might that actually look like? I'm not sure about the answers to these questions. But I look forward to finding out.
