CS50 2020 - Security

[MUSIC PLAYING] DAVID J. MALAN: All right, this is CS50. And this is not a typical week of CS50. Indeed, we're midweek here in the fall of 2020 here on campus, which would normally be first year family weekend, an opportunity for the parents and family members of the undergraduates here at Harvard to actually come to campus, sit in on classes, attend talks, and get to know their students in situ here right on campus.

Of course, this particular year, all of us are staying afar, digitally, except those of us who are actually here living in Cambridge itself already. And so what we thought we would do this year is hold a virtual talk of sorts, a virtual discussion focused on one topic that doesn't require any prior familiarity with computer science, does not require that you be in or have taken CS50 itself because it's about a topic that's at least in part familiar to all of us. Even if it's a little bit less familiar technically, it's certainly familiar to all of us as users of technology. And that topic is security or cybersecurity. And what we propose today is that we have a discussion about how you might go about keeping your own computer-- be it a laptop or desktop or your own phone, which is also a form of computer-- secure. And I daresay that this topic, even though we'll get into some of the technicalities of what it means to be secure, is familiar in the sense that all of us think about or encounter good security and bad security in the real world every day.

Think about the home that you live in, be it a house or an apartment or a dormitory or somewhere else. Typically, you'll have things, like, locks on the doors. And you might even, depending on where you live, have bars on the windows and the like. But typically, there are manifestations of security at different levels.

And I mean that literally. For instance, in a typical city, there might be bars on the first floor windows, but not on the second floor or the third floor. And that is to say that someone could technically make their way into your home by way of the second floor or the third floor, but it's going to be more difficult, of course, because they need a ladder. They need some other form of physical access to a height like that, at which point, they're probably going to attract more attention. And so the probability that an adversary is going to break into your home given that they have to actually rise to that level and get above the first floor is probably relatively low. It's not zero.

There's nothing stopping someone technically from pulling up a ladder and going into that open window or the like that has no bars. But it's less likely. And that's actually a good way to think about security in the digital world as well, that there's really no such thing as secure. Like, your phone is not secure fundamentally.

Your laptop, your desktop is not secure fundamentally. It's secure to some extent. It might be secure against certain attacks or certain types of adversaries or adversaries with certain amounts of resources. And those resources might be time, might be money, might be technical savvy. But it really is going to be a trade-off.

And so while a bit unfortunate, one way of thinking about security is that you don't want to be secure in an absolute sense. In the real world, you want to be more secure than your neighbor's house, for instance. You want to somehow raise the bar, either physically or metaphorically, to the adversary so that it's going to take that adversary just too much time, too much money, too much effort to break into your home that they might as well just go next door instead. And the same is going to be true in the world of computers.

But we're going to measure the security of systems more computationally, not so much physically. So with that said, let me invite you to open up this URL here on your screen. If you're using a laptop or desktop, go ahead and just open it up in a separate tab in another browser. If you're on your phone, you can go back and forth between two windows most likely, depending on your operating system. But go ahead, when you have a moment, and open up this URL.

And we'll use this URL to ask a few interactive questions that you can respond to digitally. And we'll also take questions and comments throughout today as well. So with that said, what does it mean to be secure, then? Let's take a couple of thoughts on this.

What do you think of the word "secure" as meaning in the context of your phone, of your computer, of your home? Interpret as you will. What does it mean to be secure, would you say? Any digital hands in Zoom? If you're feeling shy, feel free to chime in via the chat and Brian can proxy. But otherwise, do feel free to raise your hand virtually if you would like to offer your definition.

Yeah, how about over to Pranav, if I'm pronouncing it right? What does it mean to be secure? PRANAV: Yeah, I think it means, by security, you mean to protect all the data that's stored on a particular system if we're talking about technology. And at least make it hard and buy yourself enough time that a certain person may not hack into your system at the current moment because-- DAVID J. MALAN: Good. PRANAV: --let's face it.

You may not be able to protect your system for your entire lifetime. But I would say, at least buy yourself, continuously buy yourself time. DAVID J. MALAN: OK, I like that. So security is all about keeping someone out of your resources.

But as I myself have claimed thus far, that's hard to argue in the absolute. Really you want your system to just take too much time to compromise, your phone or your laptop to take too much time to compromise, at which point you're sort of probabilistically, statistically safe against adversaries. Because again, they're not going to want to waste that much time or effort or money hacking into your particular system versus someone else.

Now, there are different ways that you and I in the real world try to keep our laptops and our phones secure. And one of those most popular mechanisms is, of course, passwords. Passwords, being some kind of phrase, some kind of number that you actually configure your device with so that ideally, only you know that password.

And only you, therefore, can get into the device by using that password. And so by a show of physical hands, how many of you have passwords on your laptops or desktops if you use one of those devices? So almost all of the hands are going up. Those of you who don't have your hand going up, you've probably made, I presume, a conscious choice to not use a password.

Maybe it's annoying to type in. Maybe you don't really worry about anyone around you getting into the device. But you should concede or recognize that there is therefore a threat. It's much easier for someone to get into your laptop or desktop than into that of anyone else who raised their hand just a moment ago.

Now, those of you who have a phone, a mobile device, those of you with that device, how many of you have a password or a passcode on that device, on your phone? So somewhat fewer hands I'm seeing. So it's good that so many hands are going up. But there, too, it seems that some of you don't have one. And hopefully, you've thought about the implications of that, which means that your parents, your siblings, a stranger, if they just physically pick up your phone, whether it's in your home or in a cafe or an airport, has immediate access to all of your data. So arguably, much less secure, certainly, than someone that requires a password.

But let's consider how we can measure the security of your phone, measure the security of your computer, just by using this simple familiar mechanism, like, a password. So it turns out that you and I, frankly, as humans, aren't very good at picking these passwords in the first place. As of 2019, just some months ago at year's end, this was determined by security researchers to be sadly, the most common password in the world, literally, 123456. That was the most common password according to many measures this past year among those passwords that were known. Number two on the list was slightly better, 123456789.

After that was qwerty. If that one looks a little weird, if you have a US English keyboard and you look at the top left row of your keys, Q-W-E-R-T-Y is what they would spell on a US keyboard. People are really not trying very hard to come up with their password, even though it's not technically an English word, per se.

Password was the number four most popular password, P-A-S-S-W-O-R-D, which is a little too tongue in cheek to be at all secure. After that was slightly worse, 1234567; after that, 12345678; after that, 12345. You can perhaps see the pattern here. After that was, adorably, iloveyou. But if you think you're being clever by having iloveyou as your password, well, there's a lot of other humans in the world that think they're being cute, too. 111111 was also popular.

And then lastly, 123123. So now why these passwords? You can perhaps infer from this list why some of these passwords are the way that they are. Odds are these people were using these passwords on phones or on websites or in other systems that probably had, like, a minimum password length. These people probably needed a password that was six characters long. These people probably needed one that was nine characters long, and so forth. So you can perhaps see some manifestations of policies that companies and universities and software manufacturers might have in place.

But suffice it to say, if your password is on this list, your first takeaway from today's discussion should be change that password-- at least if you care about the account. And I would argue, too, and we'll come back to this, it really probably should figure into your decision making what type of account it is. If it's for some silly website or game that you're never going to use again, maybe it's not a big deal.

If it's your bank account, your student record, something medical related, probably you really don't want your password on this list. So there, too, consider the context in which we make all of today's decisions. Now, why are these passwords bad? And why are passwords themselves potentially at risk? So a term of art in computer science is that of brute force attacks.

And this kind of is what it says. This refers to an adversary-- someone who's out to get you or get someone-- has a device or writes software that tries to just guess your password. Brute force attack means that if they don't know your password, they're not just going to try random numbers necessarily.

They're going to try 111111. And then they're going to try 111112. Then they're going to try 111113, either manually, by typing it into the phone that they might have stolen off of you, or maybe by writing software, and then connecting that software via a laptop or desktop to your phone via USB cable or lightning connector or the like. A brute force attack pretty much just means that the adversary doesn't necessarily know anything about you-- your name, your birthday, your children's names, nothing like that. But they do have a lot of time or a lot of skill.

And so they're just going to try all possible passwords. And what's eye opening, I think, about this type of attack is that it already gives us an opportunity to start thinking about how can we protect ourselves against an attack? And just right now, how secure are your accounts on your phones and computers against brute force attacks? Well, let's consider how an adversary might do this. This is kind of a silly YouTube video here. But let me go ahead and play this animation, really, which shows a small robot of sorts that is typing using this little robotic arm onto an Android phone down there.

There's a zoomed in version of it. And pretty much this is a brute force attack by a robot, a physical device that an adversary has designed to just type in all possible passcodes. And even though the video itself is short, you can imagine the adversary going about their day, going to sleep. And this thing just keeps brute forcing its way through your password. So eventually, it might get lucky and stumble upon whatever code you were indeed using. But of course, there's probably other threats, too.

There's other threats. In fact, anyone who's taken CS50 or CS50x or even just the first few weeks of it, learning a little bit of C or Python, both of which are common programming languages, anyone who knows a little bit about programming can certainly write software that simulates what that robot was physically doing. And the thing about software is as soon as you don't have any moving parts, you can do things much, much faster because it's all electronic. It's not at all mechanical. And so in this case, what if I were to steal your phone off of you, for instance, write some software on my Mac or PC, and then plug my Mac or PC into your phone with, again, a USB cable or a lightning connector, such that I could write code that tries all possible passcodes again and again? For instance, suppose that your phone is using-- and this is not an uncommon default on iPhones or on Android phones, at least in the past-- four digits. Suppose that you're required to choose minimally a passcode or password, synonymous here, that are four digits long.

And we're talking decimal digits, so 0 through 9. So 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, you need to choose four of those digits in some pattern. Well, how many possible passcodes are there that are four digits long? If your passcode is 4 digits long, you can begin to think about the security of your passcode in terms of, well, how long would it take an adversary to brute force their way to my actual password starting at 0000, going all the way up to, for instance, 9999. Well, let me go ahead and open up the screen. If you pull up that same URL from before, you'll see in just a moment a poll that'll ask you this very same question, that being, how many 4-digit passcodes are possible? In just a moment, you'll see this on your screen. Let me go ahead and full screen it on my end as well.

Go again to the URL that's atop my screen here, if you missed the URL earlier or happened to close the tab. How many 4-digit passcodes are possible? How many 4-digit passcodes are possible? Among the answers here are 4 or 40 or 9,999 or maybe 10,000-- or quite fine, too, you're unsure. Go ahead and buzz in with one of those responses, if you could. All right, looks like we have got a few hundred responses thus far. We'll give you a few more seconds to buzz in.

And let me go ahead and begin to reveal the results here. So it looks like quite a few of you, 60-plus% think it's 10,000 possibilities. 27% of you think it's 9,999 possibilities.

And then a few others think it's 40 or four. And a bunch of you are unsure. So let's consider, then, how we would answer this question so that we have a mental model for answering this on our own.

Let me go ahead and propose that to answer this question, we just do some very simple arithmetic. It doesn't need to get very complicated. But the math could be thought of in the following way. If we've got a 4-digit passcode, that's four digits, each of which can be zero through nine. And there's 10 total digits.

Zero through nine, so 10 possible values for each of those four digits. So if that's the case, I think it's fair to say that there's 10 possibilities for the first digit times 10 possibilities for the second times 10 times 10. And of course, if you multiply this all out, the answer was indeed 10,000 possibilities.
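The multiplication above can be checked with a few lines of Python. This is only an illustrative sketch, not code from the lecture, and the variable names are made up for the example. It also suggests why 10,000 rather than 9,999 is the answer: 0000 counts as a passcode too.

```python
# Count 4-digit passcodes two ways (illustrative names, not from the lecture)
choices_per_digit = 10  # the digits 0 through 9
length = 4

# 10 * 10 * 10 * 10
total = choices_per_digit ** length
print(total)  # 10000

# Cross-check by listing them: the integers 0..9999, zero-padded to 4 digits
codes = [f"{n:04d}" for n in range(total)]
print(codes[0], codes[-1], len(codes))  # 0000 9999 10000
```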

So if you have an iPhone or an Android phone right now and you've got a 4-digit passcode that you think no one knows, that may very well be the case. But you should worry about or consider, well, what happens if a friend with a fancy robot connects your phone to that and just tries all possible values from 0000 to 9999, or smarter still, connects your phone via cable to their laptop, writes software to generate all of those possibilities? Well, a little worrisomely, it's not all that hard to do the latter and to actually write code. So in fact, let me go ahead here and on my own Mac, let me go ahead and open up a program that's going to let me write some code in a file called crack.py. So "crack" is a term of art in programming, which means to brute force your way into a password somehow, so to figure out what it is algorithmically. Those of you, particularly parents and family members who have never seen any of this before, totally fine.

That's new to you. Your sons and daughters and others here in the room have seen a little bit of this code. But we'll keep it short, which is to say that it actually doesn't take all that much effort to write code that brute forces an attack on your own phone. And the code I'm going to write here is in a language called Python, which is quite popular these days. And I'm going to say a command like this, from string import digits, which is just a clever way in Python, this programming language, to just get access to all the possible digits in decimal, 0 through 9. And then I'm going to import, so to speak, from a library, some software that some other smart people wrote, something called product.

So it turns out, in a programming language, you have lots of functions or functionality. Much like in the world of math, you have functions, like, addition, subtraction, multiplication, and division. In the world of programming, you have all of those capabilities, but many more. And so one of the functions I'm importing here is this notion of a product, which really just means a permutation of all possible digits.

And now I'm going to use what's called a loop in programming. A loop in a program is just something that does something again and again. And I'm going to go ahead and say this, for passcode in, the product of all of those digits, and repeat the digits four times total, go ahead and print out each passcode. Let me go ahead and print it out using somewhat cryptic syntax.

But that's only because I'm going to print out a list as an actual string. Parents and family members, don't worry for now what that means. CS50 and CS50x students, this is just a clever way with a couple of lines of code to iterate over all of the digits 0 through 9, combine them four at a time, and print out all possible permutations of those four digits.
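A sketch of what crack.py might look like at this point, reconstructed from the description above. The two imports and the loop follow what is said in the lecture (product lives in Python's itertools module). The exact print call is an assumption, since the transcript only calls it "somewhat cryptic syntax", so this version simply joins each tuple of characters into a string.

```python
# crack.py: a reconstruction, not necessarily the exact code typed in the lecture
from itertools import product
from string import digits

# product(digits, repeat=4) yields every 4-tuple of digits, in order,
# from ('0', '0', '0', '0') through ('9', '9', '9', '9')
for passcode in product(digits, repeat=4):
    # Assumed printing style: join the tuple of characters into one string
    print("".join(passcode))
```

Strictly speaking, product computes a Cartesian product: every ordered arrangement of four digits with repeats allowed, which is exactly the set of possible passcodes.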

So if I didn't screw up here, I'm going to go ahead and save my file and run a command called python on crack.py and hit Enter-- boom. That was so fast. In fact, let me do it again. Let me clear my screen and rerun this crack.py program--

boom. That's how fast a computer, my little Mac here, can try all possible codes between 0000 and 9999. And it's so fast because it did them all in the blink of an eye. So if you're thinking that your 4-digit passcode is keeping you somewhat secure, it probably really isn't because it wouldn't take that much effort for maybe someone in your household to write code like this, connect to your phone secretly at night when you're not paying attention, and figure out, potentially, what your code actually is.

So what would be better than using just digits? What would be better? Well, why don't we use letters of the alphabet, English alphabet, for today's purposes? And in the English alphabet, we have more letters than we have numbers. So how might we think about this? Let's go ahead and ask a question here. If you change your phone after today to use four letters of the English alphabet instead of using numbers alone, how many possibilities are there then? Well, let me go ahead and open up a different poll question here, which asks this time, how many 4-letter passcodes are possible? And we'll see what folks think and answer to this, as the answers begin to come in. To be fair, I have not qualified one thing.

So you might have to be making certain assumptions. There are indeed 26 letters of the English alphabet. However, there's uppercase and lowercase. So if you allow the user to type in something case sensitively, so to speak, where case matters, it's not 26 possibilities for each of those four characters.

It's instead 52 possibilities. So it looks like an overwhelming number of you, 78% think there's some seven million possibilities when using 4-letter passcodes. About 11% of you think that 52,000 are all of the passcodes. So let's go ahead and do the quick math.

Again, it doesn't need to be particularly sophisticated, the math. Let me go ahead and open up this time, similar approach to this problem, whereby if we have four letters of the alphabet, and let's assume case sensitivity, which, to be fair, you might not have assumed, well, then I think we have 52 possibilities times 52 times 52 times 52 for each of the four letters in your passcode. And if you multiply that out-- boom-- you indeed get seven million plus possibilities.
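The poll answer depends on the case-sensitivity assumption just mentioned. A quick sketch of both counts, using the letter constants from Python's string module (the variable names are illustrative only):

```python
from string import ascii_lowercase, ascii_uppercase

length = 4

# Case-insensitive: only 26 choices per position
insensitive = len(ascii_lowercase) ** length
# Case-sensitive: 26 lowercase + 26 uppercase = 52 choices per position
sensitive = len(ascii_lowercase + ascii_uppercase) ** length

print(insensitive)  # 456976
print(sensitive)    # 7311616
```

Either way the count is well above the 10,000 four-digit passcodes, but it is case sensitivity that pushes it past seven million.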

So consider the takeaway here. If you are currently using a passcode that's four digits, purely numeric, you have only 10,000 possible passcodes between you and some adversary hacking into your phone potentially. If you change your 4-digit passcode to be a 4-letter passcode, then you've got seven million possible passcodes between you and the adversary. Now, why is this better? Well, again, whether they're using a robot or using code, it's just going to take them more time to hack into your device. And again, at that point, if it's going to take them that much time, that much effort, maybe even that much money to hack into your phone, you, relative to other people might indeed be more secure because it's probably going to be easier for that adversary to go steal someone else's phone and try to get into that one instead.

Well, let's consider what this does in actual code. Let me go back to my Mac here. And let me go ahead and open up that same file as before.

And let me go ahead and change something as follows. Instead of using just digits, let me use what I'm going to call ASCII letters. Families who are not familiar with CS, ASCII just refers to essentially all of the printable letters of the alphabet that you would typically see in English, so A through Z, capital and lowercase here. And I'm going to go ahead and change my mention of digits here to be ASCII letters as well. So again, the program is almost identical.

But it's going to use all 52 uppercase and lowercase English letters instead of all 10 digits. Let me save this file. Let me rerun python of crack.py. And this time I actually have a moment to walk over to the screen and point out now that we're just now through the lower case zs. Now we're going through all the possible passcodes that start with capital letters. It's still pretty fast.

This is maybe, what, 10 seconds later done? We went from aaaa to ZZZZ. So we've raised the bar. And again, the security of our phone in this case is arguably more. It's higher because now it's going to take the adversary more time or more effort to actually hack into our device. Well, let's consider, perhaps, another question, then.
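The change described above amounts to swapping digits for ascii_letters. A sketch of that version follows. The use of islice is an addition for this example only: it caps the output at the first five candidates so the sketch finishes instantly, whereas the program in the lecture loops over all seven million. Joining the tuple into a string is likewise an assumed way of printing.

```python
from itertools import islice, product
from string import ascii_letters

# The lecture's loop runs over product(ascii_letters, repeat=4) in full;
# islice (an addition here) stops after the first 5 candidates
for passcode in islice(product(ascii_letters, repeat=4), 5):
    print("".join(passcode))
```

ascii_letters lists the lowercase letters first and the uppercase ones second, which is why the output passes the lowercase zs before it reaches the capital letters.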

What if we generalize it further to be characters? And those of you among families, perhaps, might not know the distinction between characters and letters. So let me open this up to the floor here. When you register for a website these days, it's somewhat annoying because those websites typically force you to choose a good password.

And what do they typically mean by good password? What does your password these days often have to contain before the website even lets you proceed? Any thoughts? And let's see. Brian, who do we have? How about Dax? What are your thoughts? DAX: Eight characters at the very least, number, and a capital. DAVID J. MALAN: So at least a number and character.

So combine the two. I like that. So instead of 26 or 52 or 10, we instead have, maybe, 62 if we combine letters and numbers. Other thoughts on what websites typically-- DAX: Special characters. DAVID J. MALAN: --force you to do? DAX: Special characters-- asterisk, hashtag, dollar sign.

DAVID J. MALAN: OK, so special characters or punctuation characters. So maybe it's a hash symbol. Maybe it's an exclamation point, a parenthesis, a comma, a period, something else-- yeah, so these symbols. And frankly, I get as annoyed as you probably do when these websites annoy you and say, no, that you can't use that password.

No, you can't use that password. You need to choose something that's much harder to guess. But indeed, if we add punctuation to the mix, I think we can do even better. In fact, a character, therefore, is any type of character.

Maybe it's punctuation. Maybe it's a letter. Maybe it's a digit, unlike just letters alone. So if we have four characters, it turns out that typically, at least in ASCII, the system that CS50 students will know, computers typically use, there's 94 possibilities for each symbol because you've got 10 digits, zero through nine. You've got 26 lowercase letters, 26 uppercase letters-- and then if you count them up on an English keyboard, 32 characters more that represent punctuation, like, hashes and exclamation points and commas and periods.

So if you have 94 possibilities for each of those symbols, it turns out that you then have a total of 78 million possible passcodes. And that's pretty good. Now we're really raising the bar to the adversary because now they have to waste even more time trying to hack into your passcode.
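The figure of 94 can be checked against the constants in Python's string module, which is a convenient stand-in for counting keys on a keyboard. A small sketch (the dictionary and variable names are only illustrative):

```python
from string import ascii_lowercase, ascii_uppercase, digits, punctuation

# How many characters each category contributes
counts = {
    "digits": len(digits),              # 10
    "lowercase": len(ascii_lowercase),  # 26
    "uppercase": len(ascii_uppercase),  # 26
    "punctuation": len(punctuation),    # 32
}

per_symbol = sum(counts.values())
print(per_symbol)       # 94
print(per_symbol ** 4)  # 78074896
```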

And in fact, let me go ahead and simulate that with some actual code. Let me go ahead and open up my same program as before. And this time let me go ahead and import not just ASCII letters, but also digits, and also literally, punctuation. The code I'm writing in this language called Python literally gives me access to all printable punctuation by just importing it with this first line of code. And I just need to change one line of code down here. I need to actually say ASCII letters plus digits plus punctuation.

So this is Python shorthand notation for joining multiple lists. Those CS50 students among you will know that you can join two lists, perhaps, in this way, using what looks like concatenation. But with lists, it combines them all together. But I'm still going to do of length 4 here. Now let me go ahead and save this program and rerun it as python of crack.py.
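A sketch of that final version. In Python's string module these three names are strings, so + concatenates them into a single 94-character alphabet. The name alphabet and the use of islice are assumptions made for this example: the sketch prints only a handful of the 78 million candidates so that it finishes immediately, while the lecture's program loops over every one.

```python
from itertools import islice, product
from string import ascii_letters, digits, punctuation

# 52 letters + 10 digits + 32 punctuation characters
alphabet = ascii_letters + digits + punctuation
print(len(alphabet))  # 94

# First five candidates only; drop islice to enumerate them all
for passcode in islice(product(alphabet, repeat=4), 5):
    print("".join(passcode))

# The full loop would start and end at these two candidates
print(alphabet[0] * 4, alphabet[-1] * 4)  # aaaa ~~~~
```

Because the alphabet is built in the order letters, digits, punctuation, a full run passes through the letters first, then the numbers, and only then the punctuation symbols.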

And now I can frankly take my time walking over to the screen because now what you're seeing on the screen is four possible symbols. But it's including 32 possible punctuation symbols, which means this list is much longer, right? At this point in the story, we were already through all of the lowercase letters up through Z a moment ago. Now we're only at the Ms, Ns, Os, Ps, which is to say, that if my Mac weren't just printing this on the screen, but were instead connected to your phone that I stole and somehow sending all of these possible passcodes into your phone, it would be taking this much time to actually solve. Now, to be fair, we're almost at the lowercase zs. So if we stall for a minute or two longer, this program, too, will finish. So even 78 million possibilities is not all that impressive.

And so I daresay that we should do even better than this. So what might be better than four characters for a password? Any thoughts or volunteers? What would be a better password than four characters, where, again, each character is a letter, a number, or a punctuation symbol? The list is pretty good. But I think we can do better because even this will be done in under a minute. Yeah, thoughts about it, Leo? LEO: Right. Have a longer password to use, like, at least eight characters. DAVID J. MALAN: Perfect.

So have a longer password using at least eight. And notice here, we're even now going through the numbers. But we're almost done, it seems, with the numbers. But now we're going through punctuation. But again, if I give this a little more time-- and I think I was a little overzealous.

Under a minute probably isn't going to fly. But certainly, by the end of class, that will have been done. But what if we do a little better and use eight characters? Well, eight characters is going to take even longer.

But let's go ahead and ask you all how much longer this might take. Let me go ahead and open up a somewhat different question, but similar in spirit. In just a moment that will appear on your screen.

And the question here is going to be how many 8-character passcodes are possible? And this time I'm waving my hand at it. I didn't even bother doing the math precisely yet. But I'm proposing that it's roughly a million, a billion, a trillion, a quadrillion, a quintillion.

Some of you are perhaps noticing a pattern here. And you went straight for quintillion. That bar jumped up really fast. So maybe you're right. Good instincts, perhaps. It looks like we're getting equilibrium.

About 60% of you think it's 1 quintillion. 25% of you think it's a quadrillion. And then fewer and fewer for the others.

Well, let's take a look at what the actual answer is. Give me just a moment to actually do out the math here on my screen. And if we do out the math on my screen here, we'll see, of course, that we need to do some more math. We need to do 94 times itself eight times instead of just four, to Leo's suggestion of using eight possible symbols. And if you do this out, I had to think about this.

This in fact is, let's see, we've got millions, billions, trillions, quadrillions. Gotcha. So it wasn't the biggest option on the list. The answer is indeed quadrillion.

So 6 quadrillion, if you will. But-- but-- but those of you who are fans of having quintillion possibilities, which is pretty, pretty secure because it's just going to take the adversary way longer to hack into your password, well, all it takes to go from 6 quadrillion to some number of quintillion is just two more characters. So in fact, if Leo had proposed not an 8-character passcode, but a 10-character passcode, we actually would have hit quintillions.
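The jump from quadrillions to quintillions can be seen by tabulating 94 raised to a few lengths. A rough sketch; the function name keyspace is made up for this example.

```python
def keyspace(symbols: int, length: int) -> int:
    """Number of possible passcodes of the given length."""
    return symbols ** length

# 94 printable ASCII symbols per position, as counted in the lecture
for length in (4, 8, 10):
    print(f"{length:>2} characters: {keyspace(94, length):,}")
```

With 8 characters the count is on the order of 10^15 (quadrillions); with 10 characters it passes 10^18 (quintillions). Each extra character multiplies the total by 94.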

So life gets interesting. Life gets more secure, the longer and longer and more complicated these passcodes get. All right, so by logic, then, you should all probably have passcodes that are not eight, not 10. Maybe they're 20 characters long. Maybe they're 100 characters long.

But here we see another theme in security, that of trade-offs. Like, the end all is probably not to be as secure as possible, but to be as secure as possible conditional on some other goals you might have. So let me ask this, what's the trade-off here? In making your password longer and longer and more and more complicated, what price do you pay as the human? What's the downside? In computer science, as in life, there's always a catch. There's always a cost. So what's the cost when you make your passcode more and more secure? Any thoughts? Let's see.

Who do we have, Brian? Over to Jenny? What do you think? JENNY: Yeah, I feel that it is very difficult for a human being to remember such a long password. And due to that, we even store those long passwords somewhere in the system itself so that we can use that whenever we have to log in to the system. DAVID J. MALAN: Yeah, there's this trade-off of just remembering the darn things. And you make a perfect point.

If I can get on my soapbox again, if you are among those people who have pretty good passwords, and by good passwords, I mean, some numbers, some letters, some punctuation, but it's written on a Post-It note on your monitor at work, or maybe it's slightly more cleverly written in a Microsoft Word file in your hard drive, or maybe it's in a Google Doc, or maybe it's even on a piece of paper in your drawer-- you're just exposing yourself to other threats, of course. But here, too, is a sociological consideration or just a policy consideration, whether you're running a business or a university or just a household with multiple family members. What should your own policies be? Because arguably it's not Jenny's fault, it's not our fault if we are resorting to writing things down on paper if our passwords are so darn hard to remember. And moreover, I haven't even made the suggestion yet, but if you are one of those people in life who is using the same password on multiple devices or on multiple websites or on multiple apps, you are bad.

Like, you are also doing something bad. Why? Because if any one of those apps or websites is compromised and your password gets out, whether it's "iloveyou," quote unquote, or something much more complicated, all an adversary has to do now is try that same password on your other accounts. And so you're just exposing yourself to more risk by reusing passwords.

But to Jenny's point here, my God, where does it end? Now I need a really long random password on this website, this one, and this one, and this app, all over the place. I mean, honestly, I as a human certainly can't remember all of those passwords. And even if I could, I feel like there's better things in life to be remembering than passwords for accounts like this. So there's surely a trade-off here. But again, the goal is to keep the adversary out with some probability, not necessarily out in the absolute. So what else can we do to prevent the adversary from hacking into our systems so that I can have a somewhat easier, more memorable passcode, but at least keep them out? Well, here's a screenshot of something you might have done by accident, perhaps late at night when a little groggy, or a little blurry-eyed, trying to type in your password incorrectly too many times.

In fact, by a show of physical hands, how many people have locked yourself out of your phone before by typing in the wrong password too many times? I did it, like, literally just the other day. And so on iPhone, for instance, it looks a little something like this. And if we zoom in, notice that it's saying, try again in 1 minute. So you don't have to get rid of the phone and start over. But the iPhone is telling you to come back in a minute. And if we look at, for instance, Android, something similar-- your Android wallpaper will differ, certainly.

But down here, for instance, it says too many attempts. Try again later. I mean, that's a little infuriating because if I pick up my phone now, I want to get in now. Well, when the heck is later? So putting that aside, what's the takeaway here? Why are Apple and why are Google doing this? Because I bet all of you, if you've ever locked yourself out of your phone, are super annoyed at that moment in time and probably don't appreciate Apple or Google.

But what's the upside of what they've just done when they lock you out of your phone for having guessed your password incorrectly? Why is this arguably a feature and not a bug, a mistake? Sam? SAM: Yeah, it's used to decrease the chances of a successful brute force attack. DAVID J. MALAN: And how does it decrease the chance of that, would you say? SAM: Because it makes the attacker have to commit more tries before they can successfully get into the phone. So it decreases the chances. DAVID J. MALAN: Exactly.

So this is a very common principle in security. And it was pointed out earlier, too, just slow the adversary down. We don't have to rethink the problem of security. We don't have to redesign passwords necessarily.

But we should make it harder for the adversary to log in, ideally, without making it harder for you and I to log in to our own devices. So consider the simplest passcode that had four digits. A 4-digit passcode, there were 10,000 possibilities. A computer, a robot could guess all of those pretty quickly.

But what if after typing in the wrong passcode three times or maybe ten times, some small number of times, what if the iPhone or Android phone locks you out for a minute, just like iPhone did a moment ago? Well, that might mean, even though there's only 10,000 possibilities, maybe it will take the adversary 10,000 minutes to crack your password because they keep getting slowed down every time they type in an incorrect one. And maybe it's not quite 10,000. It's some factor of that.
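To put rough numbers on that slowdown, here is a minimal sketch in Python. It assumes the phone imposes a fixed one-minute lockout after every batch of wrong guesses and it ignores the time spent typing. The function name and the batch sizes are illustrative assumptions, not how any particular phone actually behaves.

```python
def worst_case_lockout_minutes(possibilities: int,
                               attempts_per_lockout: int,
                               lockout_minutes: int = 1) -> int:
    """Minutes spent locked out before every possibility has been tried."""
    # A lockout follows each full batch of wrong guesses. In the worst case
    # the correct passcode is the very last one tried, so the first
    # (possibilities - 1) guesses are all wrong.
    lockouts = (possibilities - 1) // attempts_per_lockout
    return lockouts * lockout_minutes


# A 4-digit passcode has 10,000 possibilities.
print(worst_case_lockout_minutes(10_000, attempts_per_lockout=10))
print(worst_case_lockout_minutes(10_000, attempts_per_lockout=1))
```

Under these assumptions, a lockout after every 10 wrong guesses costs the adversary 999 minutes in the worst case, and a lockout after every single wrong guess costs 9,999 minutes, which is the "not quite 10,000" figure.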

But you can slow them down in that way. Maybe you have a 10-character passcode with 78 quadrillion possibilities. And imagine the phone just slows you down 1 second. Maybe you can only type in one passcode per second.

That sounds pretty fast. But 78 quadrillion seconds is crazy long. And so even that kind of slowdown might very well be enough to keep the adversary out. And so if you don't have features like this enabled on, really, any device, you should look for them. Nowadays, thankfully, they tend to come pre-configured for this. But there is a downside.
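Just how long is 78 quadrillion seconds? A quick back-of-the-envelope conversion, sketched in Python, assuming one guess per second and a 365-day year (the function name is made up for illustration):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def years_to_try_all(possibilities: int, guesses_per_second: float = 1.0) -> float:
    """Years needed to try every possibility at a fixed guessing rate."""
    return possibilities / guesses_per_second / SECONDS_PER_YEAR


years = years_to_try_all(78 * 10**15)  # 78 quadrillion possibilities
print(f"{years:,.0f} years")
```

That works out to roughly 2.5 billion years, so even a one-second delay per guess puts an exhaustive search far out of reach.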

There is a downside. You shouldn't just turn on these kinds of defenses blindly because what's the downside of keeping this feature enabled or leaving it enabled-- those are the same things-- or enabling it, if it's not already enabled? What's the downside here, to be clear? Because none of our advice today will be 100% a win. David? DAVID: Well, if you forget your password, that means it's going to take longer for you to access your phone again. DAVID J. MALAN: Yeah, it's going to take you, the user, the owner of the device, even longer to log in.

And I'll admit, too, I have on multiple occasions locked myself out not just once. I then got stubborn. And I think my anger level just rose. So I started typing in more angrily, and therefore making more mistakes.

And what Apple and Google do is they have what you might describe as exponential backoff, which is a fancy way of saying, the first time you get penalized one minute. Now you have to wait one minute. If you screw up again, then you have to wait two minutes.

If you screw up again, maybe it's five minutes. Maybe it's 10 minutes. Maybe it's an hour. And I swear, at that point I wanted to throw my phone across the room because I couldn't get into my own device. And there you start to sacrifice, of course, usability, right? If my device is so secure that even I can't get into it, then is it really worth having at all? And so finding that inflection point is part of engineering good secure systems because you have to find that inflection point so that your users are using good passwords and passcodes.
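The idea of exponential backoff can be sketched in a few lines of Python. This is only an illustration: the exact schedules Apple and Google use are not given here, so the number of free attempts, the strict doubling, and the one-hour cap below are all assumptions.

```python
def lockout_minutes(failed_attempts: int,
                    free_attempts: int = 5,
                    base_minutes: int = 1,
                    cap_minutes: int = 60) -> int:
    """Minutes to wait after the given number of consecutive failures."""
    # The first few mistakes cost nothing (assumed threshold).
    if failed_attempts < free_attempts:
        return 0
    # After that, each further failure doubles the wait, up to a cap.
    doublings = failed_attempts - free_attempts
    return min(base_minutes * 2 ** doublings, cap_minutes)


for failures in range(4, 13):
    print(failures, lockout_minutes(failures))
```

The cap is the usability knob: without it the wait grows without bound, and with too low a cap the adversary is barely slowed down.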

But they're not just taping them onto the monitor on a Post-It note or disabling them altogether. All right, let me pause here to see, are there any questions about passwords, passcodes, brute forcing or these kinds of defenses, given that passwords are perhaps our most common defense against adversaries accessing hardware and software that we don't want them to? Yeah, Dax, question? DAX: Now so there is a definite number we can calculate that for 4-digit numbers this is the most possible number of outcomes. But what about biometrics? Fingerprints? Face scanning? DAVID J. MALAN: Yeah, really good question. So what about biometrics, using face scanning? Like, Apple has face ID these days, which also annoys me sometimes if it doesn't quite get my face right. Or these days if we're wearing masks, it's infuriating to use that kind of feature.

But maybe probabilistically, there are fewer people with exactly your facial features than someone else. And so that would be more secure than picking some passcode. Sometimes you use fingerprints or retinal scans or the distance between your fingers, all of these different measures that statistically tend to not so much uniquely identify us, but uniquely identify us all enough.

And there's threats there, too. A former colleague of ours, for instance, had a twin brother who because of Apple's face ID was now able to get into his phone by just picking it up off of the table because as twins, they both looked all too similar. So there's downsides and upsides there, too. But biometrics can also help things so that it's a factor you have on you always and not something, for instance, that you just only have to remember. And in fact, that's a perfect segue to what computer scientists call two-factor authentication. In the security world, security people would call the passwords we're using one factor, and something like biometrics, a second factor.

And indeed, two-factor authentication means a defense mechanism against the adversaries that doesn't rely just on something you know, like, a password. It also relies on something typically that you have, like, a hand or fingers or eyes or face or the like, so that even if someone compromises your password and downloads it somewhere from a database where you've used it before, they don't necessarily have access to your eyes and your hands and your face and the like, unless they have physical access to you. So it just narrows the scope of the threats.

But there's other forms of two-factor authentication. For instance, if this sounds familiar now, and maybe you don't even call it two-factor authentication. It's often called two-step authentication. By a show of physical hands, who has one or more accounts that uses two factors instead of just one? Yeah, so here, too, it's good to see so many hands going up. But if you do not use two-factor authentication for things like your email account or your bank accounts or your brokerage accounts or your health medical accounts, you really should start considering doing so.

And what form does this typically take? Well, let me show a screenshot here, for instance. Even if you just have a simple Gmail account that you use for work or for personal use, you can enable what Google calls two-step verification, which is two-factor authentication. And what you'll be prompted for when logging into your Gmail account if you enable this is not only your username and your password, but also a 6-digit code. And six digits doesn't sound terribly long. But in this case, the way these technologies typically work is that you are sent that 6-digit code once via email or via text message or via a special app that you install on your phone or some other device so that only you have that code.

Only you have that device. And therefore, only you know that code. And better yet, these codes expire. So even if some adversary intercepts it or sees you typing it in over your shoulder, you can only use these codes once, which makes them even better than passwords alone because they expire after a single use.
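Those two properties, expiring and working only once, can be sketched in Python. This is a toy model, not how Google or any real service implements it: the class name, the five-minute lifetime, and the in-memory storage are all assumptions made for illustration.

```python
import secrets
import time


class OneTimeCodes:
    """Toy issuer of 6-digit codes that expire and can be used only once."""

    def __init__(self, lifetime_seconds: float = 300, clock=time.monotonic):
        self.lifetime_seconds = lifetime_seconds  # assumed lifetime
        self.clock = clock
        self.pending = {}  # username -> (code, expiry time)

    def issue(self, username: str) -> str:
        # Pick uniformly from 000000 through 999999 with a secure generator.
        code = f"{secrets.randbelow(10**6):06d}"
        self.pending[username] = (code, self.clock() + self.lifetime_seconds)
        return code  # a real service would send this to the user's device

    def verify(self, username: str, attempt: str) -> bool:
        # Removing the entry up front means any attempt, right or wrong,
        # uses the code up, so it can never be replayed.
        entry = self.pending.pop(username, None)
        if entry is None:
            return False
        code, expires_at = entry
        return self.clock() < expires_at and secrets.compare_digest(code, attempt)
```

An adversary who glimpses the code over your shoulder gains nothing once you have typed it in, because the second `verify` call for the same code fails.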

And so consider now, again-- and I can't emphasize this enough-- if you are of the age where you have your own bank accounts, again, brokerage accounts, anything medically related, anything that you find especially important or personal, like, your own email or chat accounts, if you're only using a password, you now as of today already have the mathematical tools and the mental model, I daresay, to figure out just how easily someone could compromise your account and get into your information and take your money or read your emails or the like. So you can improve that situation by just coming up with a better, longer, more random password that you remember or memorize in some way, or additionally, by enabling the second factor so that you narrow the number of threats that are dangerous to you as a result. So with that said, too, with two-factor authentication, there's another thing you can bring into play when it comes to managing all your passwords. I alluded to using Microsoft Word before or a Post-It note.

There are software solutions to this, too. So another defense we would like to offer up for your consideration today is what's generally called a password manager. This is a piece of software, either for free or that you pay for, for your phone or your laptop or desktop, that literally manages your passwords. In its simplest form, think of it like a spreadsheet, but that's "secure," quote unquote, on your own computer. That is, these password managers-- and here's two popular ones. 1password.com is one popular tool. lastpass.com is another one.

And there's others if you google around. But I would, as always, read up on reviews or get second opinions. Don't just take at face value what we propose. But these password managers are programs that you type your usernames and passwords into. And then you save them all behind one master password, one password that's really long, hopefully, really random with lots of numbers and letters and symbols. But all you have to remember is that one main password.

And by entering that password into your Mac or PC or phone, you then unlock all of your other accounts. And you can then just copy and paste your actual accounts' usernames and passwords. Or these programs also give you keyboard shortcuts.

So you hit a keyboard command, and voila, you're automatically logged into websites. You don't have to copy/paste or manually transcribe them. So to this day, what does this mean? For me, I use one of these password managers.

And most of my colleagues do as well. Many of us, most of us, don't even know the passwords we use for various websites or apps or the like. Why? Because we now trust that the password manager can, with the click of a button, generate a really long random password with lots of letters, digits, and punctuation. And then it will remember it for me.
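What "generate a really long random password" might look like can be sketched with Python's standard library. The 24-character default and the choice of alphabet are assumptions for illustration; real password managers let you configure both.

```python
import secrets
import string

# Letters, digits, and punctuation: 94 printable ASCII symbols in total.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

The sketch uses `secrets` rather than `random` because `random` is designed for simulations, not for values an adversary must be unable to predict.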

And I just have to remember that one main password that's protecting all of those others. So that's good in that now I can practice what I've been preaching. But there is a downside.

I'm exposing myself to a new risk or vulnerability. That is to say, what's the trade-off here? Why should you not necessarily just run off after today's class, download and install a password manager, and start using it without a little bit of thought first? What's the downside, perhaps? Yeah, over to Lexlene if I'm saying it right? LEXLENE: Yeah, if someone cracks your password manager password, then they have access to all your passwords.

DAVID J. MALAN: Yeah, so really depends on what the threat here is, or what you're most worried about. If someone compromises, guesses, figures out your main password that protects all of the others, now you've just handed them all of your accounts at once.

And that's a massive trade-off. However, if you again consider the alternative, coming up with big random passwords and then memorizing them all, or somewhat foolishly, writing them down on a Post-It note and putting it on your monitor, the question shouldn't be is this the right way to do things, but really, relatively speaking, is this a better way to do things? So you're always going to be vulnerable to some risk. Which of those risks do you worry about? And maybe you can mitigate that concern by maybe you could write down your main password for your password manager and maybe put it in a physical vault or a fire locker or the like that with very low probability someone else would get access to, unless they physically attack that device, or hide it somewhere in a book on your shelf or the like. So that yes, it's vulnerable. But the odds that someone finds it might just be relatively low. But again, this is the theme, figuring out what the right balance is for your accounts and the type of security that you want to aspire to achieve.

Well, let's consider a few other defenses. And we'll leave time at the very end for questions about particular tools and techniques. What's another building block that we can bring to bear when it comes to protecting ourselves online? So encryption-- CS50 students will know that encryption refers, again, to the scrambling of information, making data look like it's random data, but by encrypting it with what's called a key, typically, a key that only you and the recipient somehow know. Encryption tends to be the solution to a lot of our problems. And indeed, these password managers typically additionally encrypt your data so that even someone who steals your Mac or PC can't just open up the program and see it. All of the data, too, is similarly encrypted.
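To make "scrambling with a key" concrete, here is a deliberately tiny Python sketch. It uses XOR with a random key as long as the message, which is a teaching toy rather than what password managers or websites actually use (those rely on vetted ciphers), and the key must never be reused. The function names are made up for this example.

```python
import secrets


def make_key(length: int) -> bytes:
    """Random key, one byte per byte of message."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt: XORing twice with the same key restores the data."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = "meet at noon".encode()
key = make_key(len(message))           # shared only with the recipient
ciphertext = xor_bytes(message, key)   # looks like random bytes
recovered = xor_bytes(ciphertext, key)
print(ciphertext.hex(), recovered.decode())
```

Anyone who intercepts `ciphertext` without `key` sees only random-looking bytes, while the recipient who holds the key gets the original message back.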

Many of you have already been trained or conditioned by society to at least look for or hope for or recognize https://. The s means secure. That just tends to be a good thing because it means a website you're visiting is secure.

It's encrypted, as opposed to just http, which was much more common just a few years ago and is completely unencrypted. So that is to say if you visit a website that says just http in the URL, anyone between you and that website theoretically can be listening in, so to speak, on your traffic, the zeros and ones going back and forth. Anyone can see what pages you're visiting. If you're in some foreign country visiting sensitive materials, the government could know what websites you're visiting and what content, for instance, you're reading. https makes that much harder. It's not 100%.

There are attacks still that are possible. But again, it just raises the bar. But there's another technique that's increasingly being discussed in the media, and with which you should be familiar, known as end-to-end encryption. End-to-end encryption means that when you're using a third-party service, typically, whether it's a chat service, a video conferencing service or the like, you're not just encrypting your traffic, the zeros and ones, between you and Google, you and Microsoft, you and Amazon, or some other third-party.

You are encrypting your data between you and the person you're talking to. So WhatsApp, for instance, the popular messaging tool, early on had this feature. And many other chat programs nowadays have it as well, including iMessage and Signal and Telegram and the like. End-to-end encryption means that even though you're using a third-party service, a company that you may or may not trust, your communications are encrypted between you and the person with whom you're speaking.

The company in between, their servers, even though your data is going through their servers, cannot decrypt that information. They cannot see the information in its raw form. So that's a good thing. So WhatsApp does this, too.

Zoom kind of does this, at least, only recently does this. So Zoom, for instance, the technology that we are all using right now, actually took some flak, rightly so, some months back, when their marketing literature on their website, as I recall, advertised Zoom as offering end-to-end encryption, which was false because what end-to-end encryption means is, as I described it, between you and the person with whom you're communicating. But the marketing literature at the time was referring to end-to-end encryption between you and Zoom, which is not what security researchers or computer scientists or technologists in general would define end-to-end encryption as. And so they took some flak for that, rightly so. They've begun, though, in recent weeks, rolling out actual end-to-end encryption.
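The difference between the two meanings can be modeled in a short Python sketch. Everything here is hypothetical: the `Server` class, the function names, and the toy XOR cipher stand in for real systems, and none of it reflects how Zoom or any other service is actually built. The only point is who holds which key.

```python
import secrets


def xor(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR with a key at least as long as the data."""
    if len(key) < len(data):
        raise ValueError("key too short")
    return bytes(d ^ k for d, k in zip(data, key))


class Server:
    """Records whatever bytes the company in the middle is able to see."""

    def __init__(self):
        self.seen = []


def client_to_server(message, alice_server_key, bob_server_key, server):
    # Each link is encrypted, but the server holds both keys, so it
    # decrypts Alice's traffic and re-encrypts it for Bob.
    plaintext = xor(xor(message, alice_server_key), alice_server_key)
    server.seen.append(plaintext)
    return xor(xor(plaintext, bob_server_key), bob_server_key)


def end_to_end(message, alice_bob_key, server):
    # Only Alice and Bob hold the key; the server just relays ciphertext.
    ciphertext = xor(message, alice_bob_key)
    server.seen.append(ciphertext)
    return xor(ciphertext, alice_bob_key)


msg = b"see you at 5"
hop_server, e2e_server = Server(), Server()
client_to_server(msg, secrets.token_bytes(32), secrets.token_bytes(32), hop_server)
end_to_end(msg, secrets.token_bytes(32), e2e_server)
print(hop_server.seen[0])  # the readable message
print(e2e_server.seen[0])  # scrambled bytes
```

In both cases Bob receives the message, but only in the first does the company in between get to read it.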

We are not using it right now. It actually makes certain features harder to use. So there, too, there's a trade-off. But generally speaking, if you're having the most intimate or private or personal or financial or medical of communications with people, this is another feature you should start to look for and listen for and expect of the tools that you're using. And especially when it comes to censorship in various countries and communities, this is the kind of software that's increasingly under attack by governments because they often want backdoors so that the USA's NSA or FBI or some other entity can get into these communications. That's made much more difficult, in a good way, by using end-to-end encryption so that your communications are indeed secure.

Well, in our final moments together, let's focus ultimately on Zoom, the very technology we're using. Because they've taken some flak, certainly beyond end-to-end encryption, which you might not have even heard of, as just being insecure. And a lot of school systems, a lot of users decided some months ago to stop using Zoom for this reason, even though their business is still booming. So is Zoom secure? Let's ask one final question of the group here, keeping in mind that we've now just spent the past hour discussing topics of security.

Let me go ahead and ask this final question here, which will appear on your screen in just a moment. It is quite simply, is Zoom secure? All right, let's see how the responses are coming in. I'm seeing 55% no, 16% yes, 28% unsure. So a reasonable spread there.

Let's take a couple of comments here. Among those of you who think Zoom is secure, why do you think it's secure? Would anyone be comfortable raising a virtual hand so we can call on you, or maybe commenting in the chat as to why you think Zoom is secure? Let's see, over to, how about, Sam? What do you think? SAM: Two days ago, Zoom offered end-to-end encryption to all the users. DAVID J. MALAN: Yeah, so it was, in fact, that timely.

Zoom began rolling out, on a trial basis, essentially, end-to-end encryption with all users. So if you are using that, and-- and this is key, too-- and Zoom has implemented that concept correctly, then, yes, maybe Zoom is secure in the sense that your video conversation with someone else is in fact private between you and them. With that said, if you're in a coffee shop or in a library, at least in healthier times, and someone's looking over or listening in on your conversation, arguably even that technology is not secure. You can imagine there being other threats. Maybe you have accidentally been vulnerable to a virus, some kind of threat on your own computer.

And even though, yes, your data is encrypted between you and that other person, that doesn't mean there's not malicious software running on your own personal Mac or PC or the other person's, recording everything you say and uploading it to some third-party adversary. So there, too, whenever you ask or answer questions about security, take into account those kinds of qualifications, those conditionals, because security should never be discussed, really, in a vacuum. So those of you who said no, I think we could come up with even more reasons. But at least let me dispel just a few because I do think some of the flak Zoom took was overstated because those criticizing didn't really understand some of the issues that were being touted in the media.

So for instance, all of you today, to log into this meeting, for instance, followed a URL, most likely, that you had been emailed or that you saw on your screen. And that URL probably looked a little something like this-- https://, which is good, zoom.us or something like that, followed by a number, the meeting ID-- for instance, 5551112222. But it was a different number for today's meeting. So if you received this URL after registering, is it secure? Well, even though all of you here right now have presumably registered, technically there was nothing stopping any of you from texting or emailing or DMing this same URL to anyone else on the internet. And they could therefore join, perhaps, without registering.

So maybe that's a threat, though, Zoom typically sends you not a URL that's as simple as this when you register, but a longer one, indeed. And there's another detail that some URLs have, too, which might look like this-- a question mark at the end, and pwd for password, and then some kind of password. And indeed, the URLs you clicked today looked a little more like that, still different because they were special registration URLs. But here, if your URL has this password, now you need to know both the meeting ID and the password in order to join that particular Zoom meeting.
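The two pieces of such a URL can be pulled apart with Python's standard library, as in the sketch below. The exact path layout (`/j/` before the number) and the password value are assumptions made up for this example; only the meeting ID at the end of the path and the `pwd` parameter come from the description above.

```python
from urllib.parse import parse_qs, urlparse


def parse_meeting_url(url: str):
    """Return (meeting_id, password or None) from a meeting URL."""
    parts = urlparse(url)
    # The meeting ID is taken to be the last segment of the path.
    meeting_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    # The password, if any, rides along as the "pwd" query parameter.
    passwords = parse_qs(parts.query).get("pwd", [])
    return meeting_id, (passwords[0] if passwords else None)


print(parse_meeting_url("https://zoom.us/j/5551112222"))
print(parse_meeting_url("https://zoom.us/j/5551112222?pwd=example"))
```

Note that a password embedded in the URL travels with the URL: anyone the link is forwarded to receives both secrets at once.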

And if you're not running big classes, like we are today with this meeting, but rather you're having one-on-one or smaller scale meetings, typically you are receiving or generating a URL that looks like this, or better yet, that looks like this, so that it doesn't suffice for an adversary to just guess the meeting ID. And that's what was happening early on. Zoom typically did not require that people choose passwords for their meetings, which meant the only thing between you and some adversary Zoombombing you, so to speak, hacking into your meeting, was that they just had to guess the meeting ID. And we've seen already it took me, what, like 1 minute, 30 seconds to write a Python program that just generated all possible numbers of length four or eight or whatever.
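A program of that kind might look like the following Python sketch, which only generates digit strings locally and never contacts any service. It also compares the size of the search space with and without a password; the 6-character letters-and-digits password is an assumption chosen purely for the arithmetic.

```python
from itertools import islice, product
from string import digits


def all_ids(length: int):
    """Yield every digit string of the given length, starting at 000...0."""
    for combo in product(digits, repeat=length):
        yield "".join(combo)


# Peek at the first few 10-digit IDs without generating all of them.
first_three = list(islice(all_ids(10), 3))
print(first_three)

id_space = 10 ** 10             # every possible 10-digit meeting ID
password_space = 62 ** 6        # assumed: 6 characters, letters and digits
print(id_space)                 # guesses needed with the ID alone
print(id_space * password_space)  # guesses needed with ID plus password
```

Requiring a password multiplies the number of guesses rather than adding to it, which is why it raises the bar so much more than a longer meeting ID alone.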

So people with too much free time are writing code that just tries all possible URLs. And so if you've ever been Zoombombed, maybe that's because someone shared the URL with someone they shouldn't have. Or maybe someone with a bit of programming experience or just luck guessed your meeting ID. So this was a feature in the sense that, honestly, having to type in a meeting ID and a password is just annoying. It starts to hurt the usability of the system.

And a lot of people in the corporate world, they're going to choose another product if another product is easier to start the video conference with. So arguably, it was a conscious decision on Zoom's part. Now universities and companies have started requiring this or another feature called a waiting room, which some of you might have experienced today.

But that just, again, raises the bar to someone attacking the system. So is Zoom secure? Yes and no. It really should be considered not in a vacuum, but in the context of what kinds of threats are you worried about and what kinds of defenses are you willing to put up? So just like in the real world, you might have your own home or apartment or the like, on which you might have locks and bolts and bars on the window. At some point, if it takes you five minutes to unlock every lock on your door just to get into your home, it might be much more secure, but you're probably not going to enjoy going home because it takes that long to get in.

And you might put bars on the window to keep that person physically out, but it's not going to look particularly nice. And there's nothing stopping them from going one floor up. So there, too, there's this trade-off. And so among the takeaways, we hope, from today, are one, just better thought processes when it comes to what does it mean for your phone or your computer or your homes for that matter to be secure, and to recognize that there's always going to be some trade-off. And we would encourage you, ultimately, to ask these kinds of questions.

If any company, if any app, if any website just says on their website, "we are secure," that's nonsense. That means nothing in and of itself until you start asking questions, like, what are you secure against, and how? Well, thank you so much for joining us here. Let's officially wrap here. But folks are welcome to stick around for some more time if you'd like to ask questions in the group. But if you have to take off, please feel free to head out. [MUSIC PLAYING]

2021-01-02
