NEW: What is the FASTEST Computer Language - 45 Tested: Round Two! (E02)
Hey I'm Dave, Welcome to my shop! I'm Dave Plummer, a retired operating systems software engineer going back to the MS-DOS and Windows 95 days, and today it's time for Round 2 in the epic Software Drag Racing Series, where we're pitting 45 brave computer languages against one another in a sudden death, no holds barred showdown of prime number generating fury. You'll get desensitized to the carnage soon enough, but until then, I must warn you to brace yourself for some of the most computationally intense YouTube footage you've ever seen: all today in Dave's Garage. [Intro] Our first order of business today will be to deal with a hot topic that has come up again and again in the video comments, and after that, we'll be getting much deeper into the results as we keep on adding new languages to the competition. We'll drop right into the code editor and I'll take you on a guided tour of a few languages that everyone has heard of, where you'll learn cool new keywords like unless, and bless, and last. Still, even if none of the languages I feature today become your favorite, that's ok: remember the whole point of this series is to go through a quick review of languages that you DON'T use daily to give you a sense of the language flavor, how it works, and as I like to say, just enough knowledge to be dangerous.
If you just stumbled across this video without seeing Round One of the competition, fear not, these language episodes can be watched out of order, so you can and absolutely should go back and watch the others later. For now, grab yourself a venti chai and have your assistant hold your calls for a little bit, we've got a lot of languages to cover! Once we're done with the language review, we'll take the new contenders to the track and run them heads up for a side-by-side performance showdown. Just like a lane at a real drag strip, each language will be given a dedicated CPU core and the task of solving for all of the prime numbers under one million, as many times per second as possible. The language with the most passes per second becomes today's winner and at the end of the competition, the highest performing language AT THAT TIME will be crowned the overall winner. Right now, the crown is being tossed back and forth between Zig, Rust, Nim, and C#, which really surprised me. In fact it surprised me so much that right now would be a good time to get a vidcap of my surprised face for the video thumbnail! If you absolutely positively can't wait and want to get right into the code, you can jump ahead to about . You would, however, miss the latest gossip, rules, rants, and my philosophizing
on assembly language, all up next. And be sure to subscribe to the channel so you get notified of future episodes in the series, which I will be releasing on a regular basis. After all, if you don't sub you just might miss your favorite language! The Fastest Computer Language series has really taken off, even to the extent that the Primes project was suddenly the number 2 trending project on GitHub! A huge thanks goes out to everyone involved in that end of things, both as administrators, contributors, and even just folks interested in seeing how it all works! Be sure to check out the video description for a link to the Github repo where you can just lurk and check out the code in 45 completely different languages.
Go have a browse and see what some of the weirder languages look like at your own speed! And don't worry, you can't break anything by looking, but you CAN cruise around every solution in all of its syntax-highlighted glory right from the comfort of your web browser. Join the discussions and ask your questions. If you like what you see, please give the project a star on Github and tell a friend to check it out! And just when I think I've seen it all, someone will add a new language or a new platform. A few minutes ago, I saw a pull request for what looks like an IBM/360 mainframe version! So by all means, if you know your way around a language we don't have, like SNOBOL or PROLOG or many others, please dive on in! The more languages we stick on my newfangled Rosetta stone, the better. And it's a great place to preserve some of the oldest ones.
I mean, what if you're the only person left who knows how to write a prime sieve in Job Control Language? Don't you have a moral obligation to the children? Or at least to the child programmers? And what is that hot topic in the comments I was referring to? Since that first episode aired, a number of people in email and online and in the comments have flatly stated that of course assembly is by definition the fastest language. One person even went so far as to state that because it was a well-known fact that C and Assembly are the fastest languages, this whole thing is just pointless and does not even need to exist as a concept! Sort of nihilist in a "No Country for Old Programmers" kind of way, if you know what I mean. Either way this question has stimulated some very healthy debate in the comments section of round one, so it's something I wanted to address: Since all languages ultimately compile down to assembly language anyway, isn't assembly by definition the fastest language? And since C is so powerful and compiles so directly to assembly, isn't it going to be the second fastest? It'd be a short video, 3 minutes, and then off to watch me some Amanda McCants re-runs.
Done and done. The problem is in part the way the question is stated: it's wrong because high level languages don't compile down to assembly language, they compile down to binary machine code. While it is true that you could theoretically write assembly that would also compile to that same machine code directly, assembly language is a language to be used by humans to write machine code. The fact that there is a direct translation to and from machine code doesn't mean that assembly language can therefore somehow take credit for the victories of other languages.
Just because you can translate losslessly between two languages doesn't mean Shakespeare was originally written in Klingon. Let's say for a moment that Zig becomes the ultimate winner. Zig compiles to x86 machine code, not assembly language. For assembly language to take credit for such a win or even a tie with Zig, you would need to disassemble the Zig and convert it into Assembly language such that you had an assembly source file that you could check into the assembly branch of the project. And that means three things: it means that this supposed fastest assembly language version is dependent on a Zig compiler as an intermediate tool, and it means an extra step in converting back to assembly language, and it means a human was never directly involved in the creation of the assembly language version anyway.
Because they wrote it in Zig. Some have made the similarly absolutist claim of saying "Well then, machine language is fastest. Mic drop."
And perhaps it is - but to actually earn title to that claim, a programmer would have to use it as a language to express the algorithm natively in that language. Expressing it in another language and then translating it to machine code isn't the same as using machine code to express the algorithm - if you can even consider machine code a language, which I don't readily grant. If you can write a sieve in hex that is faster than all the compiled versions, well then you'd win the argument. And I would be, in fact, mighty impressed.
Another quick thing I wanted to note is that I grew up in Saskatchewan, way back before the introduction of US cable TV, which meant we were isolated enough during my formative years that the region even had its own unique pronunciations for things, such as the Toyota Cell-i-ca, the Kia Sport-ahge, Nike shoes, and so on. And that means maybe I say Seequel and you say S-Q-L. Or the other way around.
If so, I apologize in advance. But perhaps instead of cringing every time, remember: I'm just a whacky Canadian immigrant who's talking aboot stuff that I only read in books anyway, eh? At least I don't do these videos in my original accent. One other issue that has come up a few times is people butthurt over the title because I didn't cover all 45 languages in that first episode. It would have either been hours long or each language review would have been shortened down to being a picture of me holding the printout. And hey, it's a series, and has a series indicator and an episode number right in the title, so... yeah.
They don't tell you who shot JR in episode one. As we get deeper into the rounds, however, I do hope to spend more time on the actual languages and less on the talking! If you want the immediate spoilers, I have made the results available in the GitHub repo, but even I don't know the final results because the winner won't be chosen until all the language reviews are completed and everyone gets a chance to improve their favorite language. And speaking of the rules and picking a winner: it turns out that we have a couple of different categories in which I will announce winners, but the overall champion will be determined when I'm done covering the languages individually. At that point I will review the then-current leaderboard and select the highest performing single-threaded implementation that meets the following criteria:
- 1) The algorithm must be "faithful" to the one used in the original C/C#/Python episode.
- 2) It must return its results as a packed bit array that uses at most one bit per non-negative integer up to the limit size of the sieve. If the limit is one million, it may return at most one million bits.
- 3) I significantly prefer solutions that operate on one bit at a time in memory as well, but appreciate it may not be possible in all languages, so haven't made it a strict requirement. But if you wantonly waste memory, I do reserve the right to mock it mercilessly!
- 4) The algorithm cannot encode any pre-existing knowledge of primes other than ignoring even numbers. Storing 3, 5, 7, etc. or any factor-wheel derived from them is 'unfaithful'. It's sometimes cool and can be very fast, but unfaithful nonetheless.
- 5) The sieve must be completely torn down and be recreated for each pass. It cannot cache any info or state from one run to the next or reuse allocations.
- And Finally 6) The sieve must accept the sieve size as a runtime parameter, as if you were writing a prime number API of some sort. It cannot be baked in as a fixed compile-time constant.
That's the basics.
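To make those rules a little more concrete, here is a minimal Python sketch of what a sieve in that spirit could look like. It is not the code from the repo or from any episode; the class name PrimeSieve and its method names are made up for illustration, and the bit layout (bit i stands for the odd number 2*i + 1) is just one reasonable assumption.

```python
import math


class PrimeSieve:
    """Illustrative odd-only sieve backed by a packed bit array (names are hypothetical)."""

    def __init__(self, limit: int):
        # Rule 6: the limit arrives at runtime rather than being baked in.
        self.limit = limit
        # Rule 2: one bit per odd candidate, packed eight to a byte.
        # Bit i stands for the odd number 2*i + 1, so evens take no space at all.
        self.num_bits = (limit + 1) // 2
        self.bits = bytearray(b"\xff" * ((self.num_bits + 7) // 8))

    def get_bit(self, n: int) -> bool:
        i = n // 2
        return bool(self.bits[i >> 3] & (1 << (i & 7)))

    def clear_bit(self, n: int) -> None:
        i = n // 2
        self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF

    def run(self) -> None:
        # Rule 4: no baked-in primes; the only shortcut is skipping even numbers.
        factor = 3
        q = math.isqrt(self.limit)
        while factor <= q:
            if self.get_bit(factor):
                # Clear odd multiples of factor, starting at factor squared.
                for multiple in range(factor * factor, self.limit + 1, 2 * factor):
                    self.clear_bit(multiple)
            factor += 2

    def count_primes(self) -> int:
        if self.limit < 2:
            return 0
        # The leading 1 accounts for 2, the only even prime.
        return 1 + sum(1 for n in range(3, self.limit + 1, 2) if self.get_bit(n))


def one_pass(limit: int) -> int:
    # Rule 5: build a brand new sieve for every pass, nothing is reused.
    sieve = PrimeSieve(limit)
    sieve.run()
    return sieve.count_primes()
```

Calling one_pass(1_000_000) in a timed loop is roughly what each language does on the track, though pure Python written this way will be far slower than the compiled entries.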
In some cases, we will only have unfaithful implementations for some languages. Ideally someone will jump into the GitHub for that language and fix it up in time for the final showdown to make it eligible, but if not, I'll still feature the language but with an asterisk next to its score to indicate that it's a non-official, exhibition style run. We'll get to check out the code no matter what, though. With all that out of the way, it's finally time to dive on into some code, starting today with PHP. PHP was created back in 1994 by Rasmus Lerdorf, and the real raison d'etre for PHP is to run on web server backends. In fact, PHP originally stood for Personal Home Page, though now it somehow means "Hypertext Preprocessor".
Rasmus has said that he did not plan to invent a programming language. In fact, he's gone so far as to say: "there was never any intent to write a programming language [...] I have absolutely no idea how to write a programming language". When you see the first version of PHP syntax, you might agree with him! Let's imagine you had a web page on your personal website and you wanted it to say "Good Morning" in the AM and "Good Afternoon" after noon and so on. In a web server running PHP on the back end, the web page content is loaded from disk and then the server scans for the PHP code inserted into the HTML content. When it finds it, it executes that code which can range from making subtle customizations to the page, such as adding the Good Morning text, all the way up to completely generating the entire rest of the page programmatically.
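As a rough illustration of that idea, here is a tiny Python sketch of a server filling in a greeting before the page goes out. It is not PHP and not how a real PHP engine works; the {{greeting}} placeholder, the function names, and the 6 PM cutoff for "Good Evening" are all assumptions made up for the example.

```python
from datetime import datetime


def greeting(hour: int) -> str:
    """Pick a greeting from the hour of the day (0-23)."""
    if hour < 12:
        return "Good Morning"
    if hour < 18:  # assumed cutoff; the "and so on" part is a guess
        return "Good Afternoon"
    return "Good Evening"


def render(page: str, hour: int) -> str:
    # {{greeting}} is a made-up placeholder standing in for embedded server code.
    return page.replace("{{greeting}}", greeting(hour))


if __name__ == "__main__":
    print(render("<h1>{{greeting}}, visitor!</h1>", datetime.now().hour))
```

The point is only that the visitor's browser never sees the placeholder, just the finished HTML.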
For many years, the vast majority of backend text processing on pages was done by PHP, and today it is said that about 250 million websites use it, which amounts to about 75% of the active web servers. Although managed by a web server and operating within a web page is where you'd normally find PHP, you can also use it for processing and storing form data that a user enters into an HTML form, for storing data within and manipulating web cookies, or for filtering and restricting access to certain pages within your site. One could also use it for standard server scripting as well. You can execute PHP code directly from the command line with the -r switch. That is how our prime sieve application will function.
Let's drop into the PHP code and have a first-hand look. Right away we can see that PHP allows you to define classes, not unlike C++, and that you can mark things as public or private access. The declaration of the results dictionary is simple and elegant, and it's even properly marked as static, which I always like to see.
The constructor right below it is a little less intuitive: the constructor for a class is implemented by defining the underscore, underscore, construct function, and this is where you would perform any necessary instance initialization and setup. Next, we are met with the two main bitarray functions, getBit and clearBit. This implementation likely runs at half the speed it could because it contains an extra, unnecessary step: each time, it's checking to make sure the array index is never an even number. That's fine, but it's not really required because the sieve should never be attempting to even do so in the first place unless it had a bug. So it's better to place that test in an assert or other debug message so that you catch any such scenarios without slowing the program down during benchmarking. Happily, the RunSieve function is equally clear.
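Here is one way that suggestion might look, sketched in Python rather than PHP. The class name OddBitArray is hypothetical and this is not the repo's code. The idea is that the even-index check lives in an assert, which Python strips out entirely when run with the -O flag, so a benchmark build doesn't pay for it.

```python
class OddBitArray:
    """Illustrative bit array that only tracks odd indices (hypothetical name)."""

    def __init__(self, size: int):
        self.size = size
        # One bit per odd index up to size, packed eight to a byte, all set.
        self.bits = bytearray(b"\xff" * ((size // 2 + 8) // 8))

    def get_bit(self, index: int) -> bool:
        # Debug-only sanity check: removed when Python runs with -O.
        assert index % 2 == 1, "sieve asked about an even index"
        i = index // 2
        return bool((self.bits[i >> 3] >> (i & 7)) & 1)

    def clear_bit(self, index: int) -> None:
        assert index % 2 == 1, "sieve tried to clear an even index"
        i = index // 2
        self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF
```

During development the assert catches the bug loudly; during the race it costs nothing.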
We start with the factor 3 and prepare to run the loop out to the square root of the sieve size. The use of the dollar sign on every variable is a little weird and ornate for my tastes but I'll assume they had their reasons. Otherwise, the logic here is clearly apparent and the code implements a sieve that is easy to read and clearly follows the same algorithm as the other languages we've inspected so far.
Computing the raw bit count is done in an interesting fashion - it's done by summing the entire array of yes/no bits as though they were actually integers. Since each array slot that was left as set would contain a single 1 bit, adding them up as integers has the effect of giving you the total count of how many primes were left unmarked in the bitarray. The gettimediff function seems straightforward enough - it works in microseconds and returns the time delta in milliseconds. The validateResults function is more interesting. The question mark int means this is a nullable type.
That means the value it returns could be an integer, but it could be null. And null is NOT the same as zero, it literally means no value whatsoever. If there is an entry in the historical table it returns that int value, otherwise it returns the null. This is used by the printf function above to display an indication of whether the results are valid or not. Apparently comparing a nullable int to an int constant is allowed, as that's what it seems to be doing. It's also worth noting that clearly PHP supports good old C style printf formatting.
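The same two ideas, counting by summing and a lookup that may come back empty, can be sketched in Python. The names below are invented for the example, and Python's None stands in for PHP's null. The table values are the well-known prime counts below each power of ten.

```python
from typing import Iterable, Optional

# Known number of primes at or below each limit.
KNOWN_COUNTS = {
    10: 4,
    100: 25,
    1_000: 168,
    10_000: 1229,
    100_000: 9592,
    1_000_000: 78498,
}


def count_by_summing(flags: Iterable[int]) -> int:
    # Every slot still marked prime holds a 1, so the sum is the count.
    return sum(flags)


def expected_count(limit: int) -> Optional[int]:
    # Returns an int when the limit is in the table, otherwise None.
    return KNOWN_COUNTS.get(limit)


def is_valid(limit: int, count: int) -> bool:
    # Comparing None to an int is allowed and is simply False.
    return expected_count(limit) == count
```

So is_valid(1_000_000, 78498) is true, while a limit that isn't in the table can never validate, because "no value" never equals a real count.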
It appears both the echo command and the printf command do not provide a carriage return, so it's up to you to do so at the end of your output line if you want one. Now I think the problem with this PHP sample is that it's too nice. Seeing it might give you the idea that HTML pages marked up with PHP for server-side processing were also elegant and readable. In my humble estimation, that's not always the case, or at least, it certainly wasn't always true for PHP. Let's take a look at a small sample of early inline PHP code from Wikipedia: As you can see the code in question is actually embedded within HTML comments, which was pretty much the only way that was practical at the time.
Code such as "is less num entries 1" doesn't read quite as clearly as we would like, but the syntax of PHP would improve across the years. PHP 3 was the first version that really resembled what we know as PHP today. For PHP 4, it adopted the Zend execution engine. PHP 5 introduced the second version of that engine and was succeeded by PHP 7. And whatever happened to PHP 6 in between them? Well, over the years there have been a number of abortive attempts to integrate UTF-8 and Unicode support, and PHP 6 was but one of those failures. As one of the guys who rewrote the entire Windows shell to make it Unicode, I appreciate what a challenge compatibility can be.
About 2/3 of all PHP installations are updated to PHP 7 today and all but a scant few percent of the other third are running PHP 5. If you've never used the live interpreters at W3Schools, they are a handy scratchpad on which you can write your experimental PHP code. If you have PHP installed, you can execute PHP source files from the command line and if you wish, you can execute code directly as follows. Note the use of the -r command line switch to tell PHP to run the code directly rather than looking for script begin and end tags within the file. From PHP we now turn our attention to Perl. I thought the best way to segue from one language to the next would be to summarize the differences between the two languages, and so I turned to Wikipedia, where I learned that while PHP is a basic, intelligent, object-oriented, utilitarian and procedural programming dialect, Perl is a basic, intelligent, object-oriented, procedural, useful, multi-paradigm and event-driven dialect.
The syntax for declaring an associative array, or dictionary, can be seen where the code initializes MYDICT with this historical results data. It would appear that the constructor paradigm is implemented in the form of a member function called new. But before an object can work properly in Perl, you'll see an odd looking statement known as the bless command. It serves to associate an object with its class. When you create an object, it's generic, but when you want it to become an instance of a particular class and cause the constructor to run, you bless it with a class name.
In this case it is instantiating those two pieces of instance data - the sieve size and the bit array - and turning it into an instance of the PrimeSieve class. In the RunSieve function we see variables with dollar signs again, just like PHP. One perhaps odd and unique feature of the language is that it always tries to give you more than one way to do everything.
Simply having an if function or even an if not function was not enough - they chose to add an unless statement. If the Boolean expression evaluates to FALSE, then the block of code will be executed. Thus, if you were to say "unless x is less than 20" and hit it with 30, the code WOULD execute. You can use the unless keyword as a conditional for a block, as we see in this case, or by appending it after an arbitrary statement, a style of use we'll see later. In the for loop there can be found the use of the last keyword, which I would say is effectively a break statement. At first I thought perhaps you could say this pass was your last at any time and the remainder of the loop pass would complete, but it does not.
No more statements are executed and control drops out of the loop immediately. Thus it's really a break statement, but you know... the language was designed by fast talking rebels who play by no one's rules but their own. So last it is. We see frequent use of the self pointer and the arrow operator, which are very similar to what you've come to expect in languages like C++. If you've seen a this pointer, you know what a self pointer is.
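For anyone who thinks in other languages, here is a small Python sketch of how unless and last map onto more familiar constructs. The function name scan and its arguments are invented for the example; the Perl fragments in the comments are just the generic forms of those keywords, not lines from the sieve.

```python
def scan(values, threshold):
    """Return the first value that is not below threshold, plus what was visited."""
    visited = []
    hit = None
    for x in values:
        visited.append(x)
        # Perl: unless ($x < $threshold) { ... }
        # The block runs only when the condition is FALSE.
        if not (x < threshold):
            hit = x
            # Perl: last;
            # Leaves the loop immediately; nothing else in this pass runs.
            break
    return hit, visited
```

With a threshold of 20, scan([5, 30, 40], 20) stops at 30 and never looks at 40, which matches the "unless x is less than 20, hit it with 30" case described above.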
We also see the my keyword used a lot, and it is like the var keyword in some languages in that it indicates that the variable being declared is local in scope to that block and inward. In print_results we can see a good example of the somewhat odd syntax for declaring function parameters. The arguments are represented by the "at underscore" operator and assigned to local variables with the names specified in the parentheses. The this pointer is listed as the first argument and is passed implicitly. One of the sketchy parts of these untyped languages is, of course, that you can pass anything in any order and it'd still compile and run but do fun things when you pass an array where a string was supposed to be. Count primes reads quite clearly, with the possible exception of another use of the unless keyword, which somehow really doesn't fit with the way I think.
I'm fairly sure that I would personally stick with if when using the language, but to each their own. Note how the unless keyword can be applied after a statement, such that count will only be incremented when the bit is NOT set. Down in the program main we have a nice example of the use of the high-resolution timer available to perl programmers. It's rather elegant in that you simply subtract one time from another and the difference is automatically provided in floating point seconds.
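The subtract-two-timestamps pattern looks much the same in Python, sketched below with time.perf_counter, which returns floating point seconds. The function name passes_per_second and the short default duration are assumptions for the example, not the harness used in the race.

```python
import time


def passes_per_second(work, duration: float = 0.2) -> float:
    """Call work() repeatedly for about `duration` seconds and return the rate."""
    start = time.perf_counter()
    passes = 0
    while time.perf_counter() - start < duration:
        work()
        passes += 1
    # Subtracting two readings gives elapsed time in floating point seconds.
    elapsed = time.perf_counter() - start
    return passes / elapsed


if __name__ == "__main__":
    rate = passes_per_second(lambda: sum(range(10_000)))
    print(f"{rate:.1f} passes per second")
```

Swap the lambda for a function that builds and runs a fresh sieve and you have the basic shape of a drag race lane.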
That's about all the Perl we've got time for today if we're still going to race them, so let's head on over to the virtual dragstrip and run PHP and Perl head to head, prime sieves to one million, as many times per second as possible. Well, those sound effects might be a little optimistic for the Perl case! Its score of 64 places it squarely in the bottom fifth of our leaderboard. While our PHP version is 3-4 times faster than that, it's only a few slots ahead on the leaderboard. By the way, both are faithful implementations of the algorithm.
I want to reiterate that when a language comes in with a low score like that, it's not necessarily a slight against that language if your use of it isn't performance sensitive. If the language does what you need it to do, and you're not waiting around for it to do its part, that's all that ultimately matters. And besides, if you can get all the primes to a million in under one hundredth of a second, how can you call that slow? But it will definitely look that way in relative terms when we break out the C and Rust and other languages that are ripping off thousands or even tens of thousands of passes per second. Going forward, I'd like to get 3-4 languages into an episode if possible, and if you have suggestions on how to match those languages up, please share them in the comments, which I do read fairly diligently. Some groupings, like the functional languages, are obvious, but when it comes to oddities like PostScript and SQL, who do you pair them with? Please take a look at the list of all 45 and share your thoughts.
And speaking of sharing, I might as well ask a favor as well - this kind of topic, racing computer languages, is a very narrow niche. So, if you know someone who might be interested in it, or you participate in an online community where it would be specifically relevant, please do share it! Post the link to Round 1 on Facebook, Reddit, wherever you think folks would be interested in it. But do make sure it's an on-topic value-add wherever you do decide to share it. The goal is to reach more people who would be interested without spamming anyone who wouldn't. Please make sure you're subscribed to the channel and turn on the bell icon so that you don't miss future episodes including the grand finale where the overall winner is revealed.
I'll also be announcing winners in various categories of languages as well, like functional, scripting, and so on. Which is why I want your suggestions on grouping them! Thanks for joining me out here in the shop today. In the meantime, and in between time, I hope to see you next time, right here in Dave's Garage.