The Evolution of the Operating System
In his video on Large Language Models, or LLMs, OpenAI cofounder and YouTuber Andrej Karpathy likened LLMs to operating systems. Karpathy said:

> I see a lot of equivalence between this new LLM OS and operating systems of today.

I am intrigued by this notion. Operating systems are some of the world's most important technologies, with a history spanning 80 years.
It mirrors the journey of computing in all of its physical forms. In today's video, we look at the evolution of the Operating System.

## Aspects of the OS

So what does an operating system do? Perhaps unsurprisingly, given that long history, this is hard to pin down. One definition I like says that the OS manages the computer's resources for the User efficiently, reliably, and unobtrusively. Hardware is hard. There is a lot of it - the CPU, main memory, secondary memory, display, keyboard, mouse, and the network. Users and their applications must navigate
the idiosyncrasies and pains of that hardware to make it do something useful. Operating Systems make this easier by giving the User or their application programs a clean, pleasant interface for the task - abstracting away the horrors of the hardware. An operating system is defined by its abstractions, because those are what Users interact with on a daily basis. Some have been around for so long that we forget how revolutionary they are.
For instance, take the humble File. In the beginning, Users dealt with physical memory directly - working with cells and bits. But each type of memory system has its own peculiarities, and dealing with all of them is a pain in the butt. You risk one program overwriting data used by another program - causing both to crash. The File throws a blanket on top of all that and just gives you a nice, clean abstraction. You might think that your "file" is stored away somewhere as a discrete entity in computer memory - like a book in a bookcase. But this is a fraud!
In reality, the file's data is scattered in pieces like Cheetos across wherever the computer happens to have storage. When you "open a file", the file system gathers those pieces, puts them into the right order, and presents them to you. The OS handles all of that behind the scenes, shuffling data between primary and secondary storage as needed. Abstractions like this are what let us get our work done. Every day we interact with abstractions built on top of more abstractions. And it all somehow works.
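To make that concrete, here is a minimal Python sketch of the abstraction at work. The file name is invented for illustration; the point is that we ask for bytes by name and offset, and never learn which disk blocks actually hold them.

```python
# A minimal sketch: the File abstraction hides the physical layout entirely.
# We name the data and ask for offsets; the file system finds the actual blocks.
with open("notes.txt", "w") as f:        # "notes.txt" is just an example name
    f.write("hello, abstraction\n")

with open("notes.txt", "rb") as f:
    f.seek(7)                            # jump straight to byte offset 7
    print(f.read(11).decode())           # prints "abstraction" - no block numbers in sight
```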
## Beginnings

The first computers of the 1940s and 1950s were made to be used by just one user or group of users at a time. So they just gave that user every available resource. These expensive devices cost the equivalent of millions of dollars today, and so were rented out to individuals and billed by the hour. However, those individuals found that most of their allotted time was being wasted setting up the equipment for the job. This was costing
hundreds of thousands of dollars in lost productivity each month. So in 1956, General Motors Research Lab realized that they could write software for their IBM 701 mainframe to automatically handle the loading and unloading of each job - "batch computing". With batch computing, jobs were transferred from cards to magnetic tape. The computer would then run through the whole batch sequentially, with the outputs recorded onto a second tape.
Special cards between jobs told the computer what resources would be needed to run them. These were written in what were called "Job Control Languages". A few people call these batch systems the first Operating Systems, but the debate among historians on the validity of that claim remains fierce.

## Multi-programming

The 1960s saw better and pricier hardware - card readers, magnetic tape, disk drives, and other I/O devices.
Users realized that not every job used all of the computer's resources. Those expensive resources could be better utilized if different jobs could be run in parallel. Was there some way to take advantage of this? There was. Back in 1956, the UNIVAC 1103A computer introduced a new concept
called the "Interrupt". It let a peripheral hardware call for the processor's attention. At the same time, we introduced new innovations in memory capacity. Items like magnetic drums were giving the processing units more memory than they had before. Together, this let the computer hold and run multiple programs at the same time. While one program is occupying something like input/output,
## Time Sharing

If you think about it, running multiple programs simultaneously inside a computer is a small step away from having that computer serve multiple users simultaneously. One of the major problems with batch computing was slow development time, which in turn was the result of a long "edit-compile-run" cycle. Big batches took hours or even an entire day to run. If there was a bug somewhere, the day's entire output might be just an error message. Immensely frustrating.
So in 1959, the computer and cognitive scientist John McCarthy proposed a possible solution to his colleagues at MIT:

> An operating system ... that will substantially reduce the time required to get a problem solved on the machine ... The only way quick response can be provided at bearable cost is by time-sharing. That is, the computer must attend to other customers while one customer is reacting to some output.

What this meant was a large central computer connected to what they called "terminals" - a monitor and keyboard. This gives each user the illusion that they are the only person using the computer. Very powerful software was needed to coordinate all this and provide that illusion.
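The core trick is simple time slicing: cycle through the connected users, giving each one a short burst of processor time before moving on. Here is a toy Python sketch of that round-robin loop; the user names and jobs are invented for illustration.

```python
# A toy round-robin time-sharing loop: each user's job gets one small slice of
# work per turn, so every terminal sees steady progress and feels alone on the machine.
from collections import deque

def make_job(name, steps):
    """A pretend interactive job that needs `steps` units of processor time."""
    def job():
        for i in range(steps):
            yield f"{name}: finished step {i + 1} of {steps}"
    return job()

ready_queue = deque([make_job("alice", 3), make_job("bob", 2), make_job("carol", 4)])

while ready_queue:
    job = ready_queue.popleft()          # pick the next user's job
    try:
        print(next(job))                 # run it for one time slice
        ready_queue.append(job)          # not finished - send it to the back of the line
    except StopIteration:
        pass                             # job is done; drop it from the queue
```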
Two years later in 1961, the MIT team led by Fernando Corbato managed to get a prototype working on their IBM 709 machine. Lacking hard disk drives, they used a bunch of tape drives attached to four typewriters. It just barely worked. In 1962, MIT announced the Compatible Time-Sharing System, or CTSS as it was called. A year later CTSS got a hard disk drive and was offered to large-scale users - though MIT was not allowed to charge for it.
Though hints of time-sharing had been implemented in the military's massive SAGE radar coordination system and other specialized systems of the time, we consider CTSS the first time-sharing system expressly built for the purpose. By 1965, CTSS had hundreds of registered users at MIT and other colleges across New England, handling up to 30 users at once. It also implemented the first mail and mailbox functions between users - a spiritual precursor of e-mail.
Several other time-sharing services emerged throughout the late 1960s and early 1970s. One notable system was the Dartmouth Time Sharing System, on which the BASIC language was developed. An early version of DTSS later powered a popular time-sharing service offered by General Electric. GE was the market leader until the mid-1970s, when competition overwhelmed them. Today, we have largely forgotten the term time-sharing - though it underpins the idea of what we now call cloud computing. But it had a lasting impact on the history of computers and
their operating systems. Before the rise of the PC, this was how people experienced the computer.

## Multics and OS/360

CTSS's success spurred MIT to create a successor. So in 1964, the MIT CTSS team joined with Bell Labs and General Electric - the market leader in time-sharing systems - to create a new system called "Multiplexed Information and Computing Service", or Multics. Multics' great vision was a time-sharing "computer utility" capable of handling hundreds of users. Kind of like how water utilities provide cheap, ubiquitous water, Multics would bring computing service to the masses. But the project ballooned as it tried to be everything for everyone, and progress bogged down. Bell Labs finally pulled out in 1969 and things fell apart.
MIT finally got Multics working on their own, and it eventually passed to Honeywell, which installed it on a number of systems. It gained a cult following and persisted despite Honeywell's determined efforts to kill it. The last site shut down in 2000. Multics' troubles were mirrored in another legendary OS project happening at about the same time. In 1964, IBM announced its historic computer line - the System/360.
Ideally, a program written for one 360 computer was supposed to run on all of them. That was the whole schtick. But programmers struggled to write software that could work across all these different hardware environments. Famously, IBM tried to build a single operating system for the line - OS/360. Despite a monumental budget and an army of programmers to rival the Romans, OS/360 fell way behind schedule. And in the end, they had to split it up anyway. Its project leader, Fred Brooks, later wrote a book about the lessons of the OS/360 experience - "The Mythical Man-Month".

## Unix

Multics failed as a commercial product.
But its groundbreaking ideas - security, hierarchical file systems, a command shell, and more - were incorporated into its spiritual successor, Unix. I already did a video about Unix's development, so I am not going to reinvent the wheel. But I think it is important to emphasize that Unix was the right thing at the right time.
It had many of the revolutionary ideas of Multics and added a few of its own. For instance, the pipeline, which let you pipe the output of one process into the input of another. It is like a Human Centipede, but for computer processes. Unix and its helpful utilities were written for cheaper, lower-end minicomputers just as those devices were becoming popular with users beyond the traditional mainframe world. And because it was written in the high-level C programming language, Unix could be easily ported to other minicomputers. This - and the weird Bell Labs situation that left it in copyright limbo - helped Unix gain wide adoption in universities and beyond. Unix's rise was a cultural phenomenon that paved the way for other decentralized software communities, like those around the open source Linux OS and hobbyist microcomputers.
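As an illustration of the pipe idea using modern tools, here is a small Python sketch that wires the output of one ordinary Unix utility into the input of another - the same plumbing the shell sets up when you type something like `ps -e | grep python`.

```python
# A minimal sketch of a Unix pipeline built by hand: the stdout of `ps`
# becomes the stdin of `grep`, with the OS moving the bytes between the processes.
import subprocess

ps = subprocess.Popen(["ps", "-e"], stdout=subprocess.PIPE)
grep = subprocess.Popen(["grep", "python"], stdin=ps.stdout, stdout=subprocess.PIPE)
ps.stdout.close()                 # lets ps receive SIGPIPE if grep exits first
output, _ = grep.communicate()
print(output.decode())
```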
## The Microcomputer

In the early 1970s, new semiconductor technologies enabled the creation of integrated circuits with thousands of devices on them. These ICs were powerful enough to serve as general-purpose chips. The first such microprocessor was Intel's 4004, a four-bit chip originally made for a calculator and released in 1971. Intel later released the updated 8008 in 1972, and then the 8080 in 1974. Other
firms like Zilog and Motorola released their own microprocessors as well. These powerful chips would become the heart of what was then called the microcomputer. Intel - then very small - hired Gary Kildall, a computer scientist and professor at the Naval Postgraduate School in Monterey, California, as a consultant to produce certain software for their 8080 chip. Intel needed an 8080-compatible operating system for testing purposes. So they helped
Kildall port one he had written while working at the Naval School. The new OS was called CP/M - which originally stood for Control Program/Monitor. Kildall believed that personal computer hardware was getting good enough to compete with existing timesharing systems as a programming tool. Remember,
the great illusion of timesharing is that every terminal user thinks that they are programming on their own computer. What if that were actually true and not an illusion?

## CP/M

Key to achieving this dream would be memory. Existing microcomputer memory systems - particularly secondary memory - sucked. Things like paper tape and cassettes. None of this was acceptable. So Kildall got interested in a new secondary storage technology invented and introduced by IBM: the floppy disk drive. It offered far more storage at a relatively cheap price. Oh, and unlike paper tape, it was random access. You could just jump to the data
point you want rather than spooling through the whole thing sequentially. Kildall got a sample drive from Shugart Associates, at the time located just a few miles from Intel. Founded by storage legend Alan Shugart, Shugart Associates would later dominate the 8-inch and 5.25-inch floppy drive markets. Now what? So there was Kildall, in his room with just a naked floppy drive on his desk and a crude Intel-based microcomputer. So, as you do, he wrote controller software that let a
microcomputer running the CP/M OS interface with this floppy disk drive and its data. It might not sound like much. But hardware limitations meant that early microcomputer operating systems were often just that: a file system for organizing and managing files on external disk storage, plus the ability to load and run programs from that disk. Disk Operating Systems, or DOS.
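To show just how small that job description is, here is a purely illustrative Python toy - the class name, file name, and the exec-based "run" shortcut are all invented - of a DOS boiled down to its essentials: a directory of files on a disk, plus a way to load and run one of them.

```python
# A toy, purely illustrative "Disk Operating System": a directory of named files
# plus a loader that runs a stored program. Real DOSes did this in a few KB of assembly.
class ToyDOS:
    def __init__(self):
        self.directory = {}                  # filename -> bytes on the "disk"

    def save(self, name, data):
        self.directory[name] = data          # write a file to the disk

    def load(self, name):
        return self.directory[name]          # read the file back

    def run(self, name):
        # A real DOS would copy the program image into RAM and jump to its entry
        # point; exec-ing Python source stands in for that step here.
        exec(self.load(name).decode())

dos = ToyDOS()
dos.save("HELLO.COM", b"print('hello from the disk')")
dos.run("HELLO.COM")
```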
Kildall founded a company called Digital Research and began licensing CP/M to microcomputer end users, who paid him thousands of dollars. By 1981, he had several hundred licensees. CP/M - retroactively renamed to stand for "Control Program for Microcomputers" - quickly became the dominant operating system for the small but burgeoning microcomputer community. Though it was not alone in the industry. Others included Apple DOS, the OS for Apple Computer's very popular Apple II.
A Unix-cloneish thing called Coherent, which was ported down from minicomputers. And a small thing from Microsoft: MS-DOS.

## MS-DOS

Fatefully, CP/M lost its early lead. In 1980, a rogue team at IBM began a secret project to make their own microcomputer - the IBM PC. Facing a tight deadline,
the team built the machine with parts and software sourced from outside vendors. The PC team licensed a BASIC interpreter from Bill Gates and his company, Microsoft. They were connected to IBM through Gates' mother, who was co-chair of the United Way non-profit along with John Opel, IBM's CEO. The IBM PC team asked Gates if he knew anyone making a microcomputer OS, and he pointed them to CP/M. But for reasons that remain unclear today, Kildall did
not personally take the meeting with IBM and refused to sign their Non-Disclosure Agreement. So IBM went back to Bill Gates for an OS. Microsoft was then in negotiations with Bell Labs for a Unix license - an effort that would eventually result in the Xenix operating system. However, that was not yet done, and there was no time. So Gates went out and bought a DOS from a local computer maker. He then hired its developer, Tim Paterson, to make a few modifications and rebranded it as MS-DOS.
Critically, Microsoft did not sell MS-DOS outright to IBM, but rather licensed it to them on a non-exclusive basis. The IBM version that shipped with the PC at its release in 1981 was called PC-DOS. To protect the machine from clones, IBM put the lowest-level system software - the BIOS - on a ROM chip and asserted copyright over it by publishing the source code in its technical reference manual.
The IBM PC - with its iconic name and marketing muscle - quickly became the most popular microcomputer on the market. Its setup became an industry standard, inviting competitors and clones. At the start, MS-DOS was a crude piece of software - about 4,000 lines of code. Nevertheless, it allowed software vendors like VisiCorp, maker of VisiCalc, to bring their packages onto the IBM PC platform. A bevy of computer-makers then managed to work their way around the IBM BIOS copyright - kicking off the PC clone industry. Microsoft struck licensing deals with those PC-makers, rapidly grabbing market share in the industry.
Working directly with PC assemblers, or OEMs, scaled far better than CP/M's approach of going straight to end users. Microsoft's MS-DOS overthrew CP/M as the dominant PC OS. By 1983, Microsoft had a fifth of the microcomputer operating system market.

## Applications

Today, we might see Microsoft as one and the same as its operating system. But in the early days, Gates and Microsoft saw themselves more as an applications company. Operating systems were important - they provided about half of the company's revenues - but they were seen as a means to an end. Gates' thinking at the time was that with an OS you get just a few percent of the machine's price, say $40 on a $2,000 machine. But with an application, you could earn hundreds of dollars.
In the early 1980s, their top-selling application was Multiplan - a spreadsheet for MS-DOS that looks rather dated now. It sold a million copies over its lifetime. For that reason, the Microsoft of 1983 was big, but nowhere near the giant we know today. That year they generated $70 million in revenue - very good, but VisiCorp did $60 million and Lotus did $48 million. This thinking is why we had the interesting situation of Microsoft offering two operating systems to its customers. MS-DOS was for its low-end IBM PC users. And the version
of Unix that Microsoft had licensed from AT&T - Xenix - was for high-end users.

## Windows

It took time for Microsoft to realize how powerful an asset it really had. By 1983, semiconductor hardware had gotten good enough that PC operating systems could start incorporating a few much-needed features. One of the most needed was multi-tasking. Computer work was getting more interrelated and complicated, involving the outputs of several different programs.
For example, making a company report might require a painting program, a spreadsheet, and a word processor to be open all at once. With MS-DOS and the other single-tasking Operating Systems of the day, users had to completely close the one program running in front of them before opening another, which was annoying. Also, the way people interacted with MS-DOS was through a command line. You had to type the right command at the prompt to get the computer to do what you wanted, and deviations could give unwanted results. By 1983, the PC community had converged on the windowing graphical user interface as an elegant solution to these problems. It was first demonstrated by Xerox in the 1970s, and later incorporated into the operating systems for the Apple Lisa and Macintosh - released in 1983 and 1984, respectively.
Microsoft adopted the windowing GUI for its Windows operating system - first released in late 1985, basically as a shell on top of MS-DOS. Throughout the late 1980s and 1990s, the PC ecosystem exploded in size. CPUs and other semiconductor hardware advanced in performance like never before. Capabilities once found only on mainframes made their way to the PC.
The PC's modular design encouraged a plethora of hardware peripherals and software drivers. On the software side, an ecosystem of utilities and applications emerged to suit different environments - the home desktop, the high-performance workstation, and the enterprise server. To handle all this, Windows evolved a sprawling modular architecture, with each system function handled by a separate OS component. Multiple software layers added new abstractions to help programmers and users navigate these environments. It took years for Microsoft to get this incredibly complicated piece of software working to its full potential. But they benefited as Windows established itself as the dominant operating
system and the company started bundling adjacent software like Office with it. By 1993, Office had 90% of the productivity market, contributing 50% of Microsoft's revenues. Its low prices - in part due to scale and subsidies from Windows - drove competitors like Lotus and WordPerfect out of the market a few years later. Thanks to its grip on the PC universe through Windows, Microsoft became the defining technology company of the 1990s. But the sun don't shine on the same dog's butt every day.
## Mobile

The first mobile "computers" were the Personal Digital Assistants, or PDAs. These were handheld devices, popular in the mid-1990s, for helping people manage their contacts, addresses, notes, and to-dos. Apple had been one of the pioneers in the industry, releasing the Newton in 1993. It was an ambitious product, but the hardware was not ready yet, making it difficult to fulfill its promises - for instance, the ability to recognize handwriting. These early devices were extremely constrained in terms of resources. The original Palm Pilot ran on a 16-megahertz processor with 128 kilobytes of RAM. This
made them extremely challenging to build for - you can't just scale down a PC OS. Microsoft initially struggled to bring Windows to the PDA market. Its first offering was the Windows CE OS - later rebranded as Windows Mobile - which it produced with hardware partners. Released in 1996, CE struggled with poor battery life, OS stability problems, and a clunky interface.
Successful companies like Palm built their operating systems from the ground up with these constraints in mind from the start. This meant compromises. For instance, the Palm Pilot lacked a keyboard and full handwriting recognition, relying instead on a shorthand input system called Graffiti.

## Phones

It did not take a lot of foresight to see that PDAs and mobile phones would eventually collide. Having seen what Microsoft did to the PC industry, in 1998 three of the largest phone makers joined together and bought into an operating system called Symbian. The Symbian phone OS was produced by a British company of the same name that had previously made PDA software. Adopted by the phone-makers, Symbian became an early leader, with 65% market share and one hundred million users at its peak.
But Symbian failed to build a powerful and lasting ecosystem around its advantages. None of the handset makers wanted to give up their connection to the user, causing serious fragmentation issues and a whole bunch of different UIs. And since it had to serve so many different hardware environments, Symbian was notoriously hard to develop for. The company struggled to build good tools and distribution channels for its developers.
Nokia was the leading Symbian phone-maker, driving 80% of its sales. And while they grabbed significant market share in Europe and Asia, they struggled in the United States - in part because of the dominant position of mobile carriers like Verizon and Cingular.

## iPhone & Android

The early 2000s saw more improvements in semiconductor hardware. In addition to faster and more power-efficient processors enabled by the Arm instruction set and its ecosystem, the decade saw the rise of flash memory as a compelling secondary memory option.
The only things now missing were a compelling interface to pull it all together, and a company capable of cutting through the kind of red tape that had turned Symbian into a convoluted mess. Apple made the first breakthrough with the iPhone, famously building its operating system by scaling down the desktop Mac OS. Its multi-touch interface and desktop-class browser instantly connected with users. And because it was based on Mac OS X, Apple was able to port over its ecosystem of passionate developers - developers so passionate that they were hacking the OS to make apps of their own before an official SDK was released. The opening of the App Store in 2008 only poured gasoline on that fire.
Google saw the writing on the wall and pivoted its Linux-based Android phone OS in the same direction. By giving Android away for free as open source, Google rapidly stole share from the then closed-source Symbian. Those old legacy operating systems are now gone. What we now call iOS has made Apple one of the biggest companies in the world. And Android is the world's most widely used OS, period - and a powerful asset for Google. It is interesting to see how iOS and its deep ties with the App Store help drive Apple's massive Services division - kind of like how bundled apps like Office made Microsoft king of the tech world in the 1990s.
## Conclusion

I have noticed that the stories of operating systems across their various form factors share a bit of a theme. In the beginning, systems were limited by compute. The first devices - mainframes, microcomputers, and mobile PDAs - were not fast enough to run anything other than the most rudimentary programs. Compromises had to be made to get the products to work. Over time, the processors did get fast enough. Then the limit became memory.
Mainframes needed DRAM and the disk drive to manage multiple tasks and users. PCs had a craving for memory that was eventually fulfilled by the floppy disk drive. And mobile OSes could not support bigger programs until flash memory got cheap enough. Then, finally, the limit became input/output - the interface. We needed new paradigms for communicating and interacting with our
computers to get the results we need. For the PC, that was the GUI. For mobile, that was multi-touch. So we cycle back to our original question. Are LLMs the next operating system? I don't know. But I did notice something. It first took breakthroughs in compute to show that larger neural networks had potential. Then we leveraged improvements in DRAM to scale up LLM sizes to where they could show economic value.
And then, most recently, we found a new paradigm for interacting with these LLMs: the chat interface of ChatGPT. It is fun to ponder the possibilities of an LLM operating system, and where the metaphor can take us. What new abstractions and environments for doing work could an LLM OS offer us? What might that actually look like? I'm not sure of the answers to these questions. But I look forward to finding out.