Phishing detection, despite the fact that large language models exist, doesn't seem to work very well; it's constantly flagging the wrong things. We've got so much more to do in this area. And as an aside, I'm a big fan of learning about more than one topic, because it gives you that broad experience of how this might apply. Just knowing how to train a deep network, with no knowledge of who could use it or how it might apply to something like cyber security, is going to be limiting. You want to know a little bit about how things work as well.

It was such an obvious pushback from someone like me or someone like you, and yet Microsoft didn't seem to see it coming: that we wouldn't want to just take all our desktop screenshots and throw them straight up to a server. I think there are a lot of privacy concerns, and I don't think they've been adequately addressed.

Everyone, it's David Bombal, back with the amazing Dr Mike Pound. Mike, welcome.

Thanks for having me back.

Really important topic, I think. It's 2025, and I've written this question down for you to answer: how would I learn AI in 2025 if I was starting over? But perhaps before we get there, can you give us an update on the state of AI? I've heard all kinds of things, but perhaps you can tell us: is it worth me learning AI? What's the state of AI?

It's still worth learning; it hasn't gone away, put it that way. My take is that this has been the year of starting to build systems with AI. We had an era, let's say from about 2022 to 2024, where we were focusing very much on just making the language models bigger, and the idea was that sooner or later they'd just start solving all the problems. That hasn't been the case, as many people thought anyway. So now what these companies are doing is building slightly smarter systems, with things like retrieval augmented generation, bringing data into the language model to make it more accurate; combining models together, so one can draw images but also write text; and letting them use tools, so they go off and actually fetch a data source for the weather rather than just making up a forecast. What's changed in the last year is that the way we design systems is moving towards building LLMs and other AI into tools that do other things as well, to make them more performant overall. That's a really exciting time, because these things just work much better now that we have actual data being put into them.

Is it hype or is it getting real? Because I've heard Sam Altman say that AGI is coming in a thousand days, or a few thousand days. He likes to say that.
I think that Sam says that; I don't know whether he actually believes it or whether he says it because it helps OpenAI's stock price. I honestly don't know. I don't think that, and I think quite a lot of researchers agree with me, or at least that's the opinion I see when I read around. The fact that we've seen this move towards building and using LLMs as tools, as part of a larger system, is in a way an admission that they're not just going to solve everything on their own, and that the ultimate solution may be one that uses some of these technologies within other things as well. I feel like they're really cool, and they work really well at what they're specifically trained to do, so use them for that purpose. Don't start assuming we can just stick a chatbot in a car and off it drives; it doesn't work that way.

Do you call it math or do you call it maths? Is it aluminum or aluminium? It doesn't really matter from a math point of view; the great thing is that Brilliant can help you understand the mystery and depth of mathematics. With Brilliant you can turn your confusion into confidence, step by step. You're going to learn so much more by doing, and that's where Brilliant really shines: rather than passively watching someone lecture, you'll be solving problems and answering questions and quizzes. Brilliant makes learning to code intuitive and fun, with lots of hands-on interactive quizzes and labs, and it breaks down math concepts visually, so you're not just memorizing, you're actually understanding what you're doing. Build formulas, solve real-world problems, and challenge yourself every step of the way. Whether you're brushing up on mathematics fundamentals or tackling advanced topics like calculus, Brilliant has a course for you. I recommend you check out their vectors course; they make even abstract concepts easier to understand with simple interactive examples. Try Brilliant for free for 30 days: go to https://Brilliant.org/DavidBombal, and you also get 20% off your annual premium subscription. Learn something new every day and change your life.

When you spoke to me a year ago, you were talking about how people were hyping AI as exponential, and then some months ago I saw on the Computerphile channel that you looked at a white paper where it was plateauing. So what's your opinion now: is AI going to suddenly be exponential again, is it a plateau, or is it somewhere in the middle?
I don't think anything has changed from that. I think that trend, exponential or, should we say, probably logarithmic actually, is still happening. GPT-5 hasn't been released with some massive increase in performance; what we've seen is things like o1, which introduces slightly different ways of using these models, but the actual model itself is much the same as before. There's been another really interesting paper recently, by Apple actually, showing that you can easily confuse language models by giving them extra information. Suppose you ask a mathematical question like: you've got five watermelons and you have another three watermelons, how many watermelons do you have in total? Large language models do really well at this kind of task, and the assumption has always been that it's because they know how to add up, which is not really the case. What this paper did so beautifully was create variants of all these questions with superfluous, nonsense information in them. So they'd say something like: I've got five watermelons that are really small, by the way, and I've got another three watermelons, how many do I have in total? And the LLM will just completely break; it will say, oh, because they're so small, you only have three. It just won't work, and it's because they're not trained on that kind of data, and they don't actually reason their way through a problem in the way we might assume they do based on the text. So I think using them in a smarter way is the way forward, and you can see that's what o1 is: it's using an LLM in a way where it has this monologue with itself, which is a way of coercing better results out of it without having to transform the way the model performs.
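A minimal sketch of the kind of robustness check the Apple paper describes: add an irrelevant clause to a word problem and see whether the model's answer changes. The prompt wording is illustrative, and ask_llm() is a hypothetical stand-in, not anything from the paper.

```python
# Sketch: probe whether an LLM's arithmetic survives an irrelevant distractor clause.

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to whatever chat API or local model you use.
    return "<model answer goes here>"

base = ("I have 5 watermelons and I buy another 3 watermelons. "
        "How many watermelons do I have in total?")
distractor = ("I have 5 watermelons that are unusually small, and I buy another 3 watermelons. "
              "How many watermelons do I have in total?")

expected = "8"
for name, prompt in [("base", base), ("distractor", distractor)]:
    answer = ask_llm(prompt)
    print(f"{name:10s} -> {answer!r}  contains correct answer: {expected in answer}")
```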
Are we in the middle of a hype cycle? Because every vendor seems to have AI as a product; everything is "this, plus AI". It feels like it's just hype.

Yeah. The new iPhone came out fairly recently and all of the marketing was basically about AI; the difference between the 15 and the 16, it seemed to me, was the introduction of AI. We very much are in a hype cycle, in the sense that companies now feel they have to introduce these features to compete in a market where everyone else is already doing it. And when you actually use these features, your mileage may vary: other people might really love some of them, but I find they work some of the time and other times they don't. Yes, I could ask my phone to rewrite my text message, but I just wrote the text message and I'm going to send it. So I'm not dismissing it; I think some of these things are really good, but we shouldn't be throwing AI at something just to say that we have, which is what's happening at the moment.

AI on routers and switches and firewalls, it just seems to be everywhere.

Yeah, it's everywhere. Next-generation firewalls and all these things that include machine learning: I can understand why they do that, and maybe they perform way better. I'd be more interested in the metrics, whether it actually finds these things, than what it says on the tin. I think we're going to see this for a while, but as people get more used to what AI does and doesn't do, I hope and predict the hype will die down a bit, as people recognize that just because you've written "AI" on something doesn't make it good. Let's see if it's actually good, and then use it.

So coming back to the original question of how you'd get started: first, is it too late? Should I get into this now, or have I missed the boat?

No, not at all, it's not too late. I'll give you an example: I trained some large language models using an efficient technique just a few months ago, mostly just to learn how to do it, because I didn't know how. I don't use large language models most of the time, I use vision models, and I thought, let's add another string to my bow, or whatever the phrase is. So I went off and trained one, and it took me a few hours to look through the code, work out what was going on, and get something training. Yes, I have some expertise, but ultimately it's mostly just Python code, so I think you can pick these things up pretty quickly. The other thing is that the technology is moving unbelievably fast, and that's both good and bad. It can give you the impression that you're going to be left behind, but what it really means is that everyone is having to relearn everything every six months anyway; it's a bit like JavaScript libraries, isn't it? In a way, I think it's a benefit for people coming on board that a lot of what we're doing now isn't what we were doing two years ago, because it allows you to say: we'll just pick up what we're doing now and go with that.

So I'm not too late, but what should I do then?
What would you do now if you were starting again in 2025?

Let's think about where people would be. If you're an undergraduate, or you've been working in the industry and you already know how to code, then the best thing you can do is take a quick introductory machine learning course, get that done, and then just start playing around and trying these things out, particularly with something like Google Colab. Lots of the big libraries that do things like retraining large language models come with Colab or other examples that you can literally just run. The first time you run it, it's a complete black box: you don't know what's going on, you just click play and it runs, and you think, oh, I'm training something, this is great. Then you start to dive into it a bit and slowly realize what's going on; you start opening that box. If you already know how to code, for example in Python, the best and quickest way is to jump right in and start training some things, even though the first time you'll feel out of your depth and like you don't quite know what's happening. Over time you'll pick it up pretty quickly.

Do you have a course, or do you recommend courses?

I still always recommend Andrew Ng's Coursera course for machine learning, because I'm a firm believer in the idea that if you know how machine learning works, which is to say things like what a learning rate does and what a loss function should do, then the rest of it kind of comes out in the wash. Large language models sound fancy, but they're just big versions of the same kind of networks that were being trained many years ago, and they train in exactly the same way: you have a loss, and it goes down as you get better at predicting your outputs. If you know how to interpret that, if you can look at those numbers and decide what to do next, that puts you in a really good position. If you jump in and run the code, you'll see the loss going down, and that's great, but when it doesn't go down you won't know quite what to do, and that's where a little bit of the fundamentals helps. You can do these things at the same time. There are different levels of machine learning: there's what we might call standard or classical machine learning, things like support vector machines and small artificial neural networks, and those are the topics covered in something like Andrew Ng's course. Above that you've got convolutional networks and the vision networks, then transformers, and above that the really big large language models. They're a progression: if you understand what one does, it's a level of abstraction, and you can understand what the next one does, and the next, so the leap going upwards isn't as big. Almost all of my research is AI-driven now. I do sometimes do traditional computer vision, which is to say manipulating pixels, basically, and I can still do that; I sometimes force myself to, just to remember how. But often, when someone presents me with a problem, my first response will be: we probably need to train a deep network to solve that.
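To make the "watch the loss go down" idea from the course discussion above concrete, here is a minimal PyTorch training loop of the kind an introductory course builds up to. It is only a sketch: the tiny network, the synthetic data, and the chosen learning rate are all illustrative, not anything Mike describes using.

```python
import torch
import torch.nn as nn

# Synthetic data: learn y = 3x + 1 with a little noise.
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = 3 * x + 1 + 0.1 * torch.randn_like(x)

model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()                                   # the loss: how wrong are we?
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # the learning rate: how big a step?

for epoch in range(200):
    optimizer.zero_grad()
    pred = model(x)
    loss = loss_fn(pred, y)   # this is the number you watch go down
    loss.backward()           # compute gradients
    optimizer.step()          # nudge the weights
    if epoch % 50 == 0:
        print(f"epoch {epoch:3d}  loss {loss.item():.4f}")
```

If the loss stops falling or explodes, the learning rate is usually the first thing to look at, which is exactly the kind of judgement the fundamentals give you.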
It's both true that we're in a hype cycle and true that these are incredibly performant techniques, and knowing how to use them is going to help. Getting your first models trained is not a big ask. It sounds really hard, it sounds intimidating, but I feel that's a bit of gatekeeping, people saying, this is super hard, look how smart I am.

Yeah, exactly.

I don't like that approach. I prefer to say: they developed these libraries to make them easier to use, specifically so you can have a go. And if you're willing to get a little bit stuck in, you learn a lot. One of the things I did when I was learning how to train large language models was to look at the data: I actually looked at the rows of data going into the model, to see how they were structured and how that data was shaped so the large language model could use it. That let me shape my own data to match, and then I could train on my task. A lot of the time you're learning by seeing what's been done before, manipulating it into something that works for you, and then going again. Once you know how to interpret the results, you're all good.

I'm glad you mentioned gatekeeping, because it feels like unless I've got a PhD, or I know math, or maths, really well, or all this data science stuff, I've got no hope.

I'm a firm believer in an undergraduate degree as well, but it serves a slightly different purpose. There are loads of benefits to going to university to study computer science for three years, because you learn a huge, broad range of topics; AI is only a small part of what you learn in an undergraduate degree. You'll also learn how firewalls work and how object-oriented programming is done. If you already know some of those things, then a degree is only going to add a small amount; maybe a masters would be more targeted. For people who already have a degree, it isn't really reasonable to go back and pay all those fees to learn a second time. That's when you use the skills you've already got: the fact that you know how to work independently, you can learn on your own, you're willing to have a go. Those things are going to help, and there are loads and loads of resources online.
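As an illustration of the "rows of data" Mike describes inspecting earlier, instruction-tuned models are typically trained on prompt/response records something like the JSON lines below. The exact field names and chat template vary by library and model; this layout is an assumption for illustration, not the specific format he used.

```python
import json

# Hypothetical instruction-tuning rows: one JSON object per line (JSONL).
rows = [
    {"instruction": "Reply to this email, declining the meeting politely.",
     "input": "Hi Mike, are you free to meet Friday at 4pm?",
     "output": "Hi, thanks for the invite. Unfortunately I can't make Friday. Best, Mike"},
    {"instruction": "Summarise this paragraph in one sentence.",
     "input": "Convolutional networks apply small learned filters across an image...",
     "output": "CNNs slide learned filters over an image to build up features."},
]

with open("train.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Most fine-tuning libraries then flatten each row into a single prompt string
# using the model's chat template before tokenising it.
```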
So learn Python, that's the first step, right?

That's 100% the first step. For better or worse, Python has emerged as the language of choice for most machine learning, and that's true of pretty much any type of machine learning. Lots of people work in other languages, like MATLAB and C, but for the most part people work in Python. All of the big libraries, things like PyTorch, and large language model libraries like Unsloth that we can talk about later, are based around Python, and they all operate in a similar way, so you very quickly get used to how they work, the ins and outs. Learning Python is the first thing to do. I'd also say that Python has a lot of very advanced language features that hardly ever come up: you might see a function call that uses some strange lambda expression you weren't expecting, but in general that doesn't happen, and fairly standard Python will get you through most of the day.

So Python, then Andrew's course from Coursera, and then?
Yeah, there are going to be lots of courses; if you're on a different platform like Udemy, there will be perfectly good introductions to machine learning there too. I would advocate not getting too excited and jumping straight to the large language model on day one, because if you do, it will be a black box for you: you'll be training or using a very large model that you don't understand. That's fine, you can build a product based on that, and it's what a lot of startups do, but you'll be able to use it better if you have a little understanding of what's going on underneath, and to get that you need to build up a bit from the ground up. It doesn't take that long: we have undergraduates who start projects without any particular AI background, and they're training things up within a few weeks, to the point where they actually know what's going on. It does require some effort; it's not something where you can just go on GitHub, open the Colab, run it, and consider the problem solved. You are going to have to look into these things.

Python, Andrew's course or some other course: anything after that? Do I need to buy books?

There are loads and loads of books on machine learning, and I wouldn't say that's a bad thing to do. One of the problems is that because everything moves so quickly, it's difficult for books to stay on the cutting edge of whatever library you're trying to use. My personal experience of learning these techniques is just to run them: the first time you run them they're hard, and then they get a bit easier, and a bit easier, until it's pretty routine. Most of the big libraries have tutorials; they're not all great, to be fair, so your mileage may vary. The other thing you can do, and this is a bit like learning to code: if you want to learn to code, I've always thought the best thing is just to do lots of coding. Yes, you can be taught concepts, conditions and booleans and things like that, but ultimately you get better by spending time answering questions and solving problems with code. It's very similar with machine learning: you train up a simple classifier, then you move on to a more complicated topic, and so on; that's a nice progression. So maybe we can talk about that progression. If you do something like Andrew Ng's course, you'll find yourself pretty well versed in low-level, standard machine learning, which is to say things like support vector machines and neural networks. The next thing might be to look into something like a convolutional neural network, or maybe a transformer; they're a bit of a step up, or at least conceptually a bit different. There are thousands of Colabs and GitHub repos that will download the dataset for you, classify some things and learn, and you can watch the numbers go down; then you can go into the code and ask how that actually happened, and start to build up the complexity.
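A minimal sketch of the "simple classifier, watch the numbers go down" exercise described above, using PyTorch and the MNIST digits that torchvision downloads for you. The architecture and hyperparameters are arbitrary illustrative choices, not a recommendation from the interview.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download a small standard dataset (handwritten digits).
train_ds = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
train_dl = DataLoader(train_ds, batch_size=128, shuffle=True)

# A tiny convolutional network: two conv layers, then a linear classifier.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2):                      # a couple of passes is enough to see it learn
    for images, labels in train_dl:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.3f}")
```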
If your aim is to go down the modern generative AI route, the LLM route, then at some point you'll need to start making use of some of those tools. There are libraries I quite like: LangChain, which is good for putting models together into working systems, and Unsloth, which is really good for fine-tuning large language models, which are prohibitively costly to tune normally. There are lots of these libraries, and they all come with pretty good documentation and examples you can actually run.

Do you have to pay a lot of money to run this stuff, or is a lot of it free?

No. Some of them have a kind of commercial tier, but everything I've used so far has been under a free license, and I think that's fine; you can decide later whether something requires a bit of payment. Google Colab is a good example. Google Colab essentially runs a Python notebook. A notebook, for those not familiar with Python, is where instead of having one code file you have blocks of code interspersed with images, text and other bits of information. Notebooks are very useful for documenting things, but also for looking at the output as you go, and you can run the blocks in different orders, so you can go back, run one bit, see the output, then run something else and see what changes. Google Colab is that, but a little bit posher, shall we say, and backed by Google's GPU environment, so you can actually train models without it being really, really slow. If you want to do more than token training, it's going to need a monthly fee of eight or nine pounds or so, otherwise you'll be waiting a few hours for a GPU to free up. You can make that call at the time; you certainly don't need to pay to begin with. Just get started, run a few things, try them out, see how you're getting on, and if you decide you actually need to train something for a few days to see whether it works, that's when you could consider investing in some resources.

We often do this list of top skills for 2025, and on our list AI is number one of the skills people need to learn. Do you agree with that?

Well, I work in AI, so maybe I'm a bit of a cheat answering that. I think there are two answers to the question. One is yes, AI is the thing most people are discussing. At the university I get emailed a lot by people who want to do some AI within their research; they don't have that background and they want to know how to get started, so I get these questions quite a lot, which is great for me, because I get contact with lots of cool collaborators in different areas of science. There's no doubt that people want to learn more; they don't just want me to do it for them, they want to have a go themselves. When we recently asked lots of academics what they would like us to do, the first response was: is there some training we can put out around the university to get people skilled up? I think that's going to be a really important thing over the next few years, and I'll be thinking about what I can offer as well.

We need you to make a course, Mike.
Right, I know. I'm sure you said this last year, and I've been stuck in meetings. There are loads of good courses; the difficulty, of course, is which to pick, and that's always the thing. I wouldn't worry too much about that. Any content you're consuming that explains how deep networks work and how you train them gets you closer to the goal of knowing how it works, and then training something yourself gets you closer still. The only thing I'd discourage people from doing is just running a notebook, pressing play all the way down to the bottom, and then going to bed satisfied that you've learned some machine learning today. You've got to read the code, see what it's doing, and then hopefully you've learned a little something.

Yeah. Mike, the reason I ask is that Dave Kennedy, who's really well known in the cyber security space, put out a tweet in early December 2024 saying that cyber security automation with AI and LLMs is starting to become, and will be, one of the most desired skill sets in the next three to five years in all of security. I'm assuming you agree with that, because you nodded, right?

Yeah. I'm an interestingly placed researcher, because I research computer vision and AI, but I've also been involved in security and cryptography for many years, so I have at least a pretty good knowledge of the subject even though it's not my main area of work. I would say they're both incredibly hot topics. Security of networks and the internet is getting more and more important. In some ways I feel like we've got it so good now, we have encryption on every channel, but ultimately we also have huge state actors trying to get at our stuff, and people get phished all the time. Phishing detection, despite the fact that large language models exist, doesn't seem to work very well; it's constantly flagging the wrong things. We've got so much more to do in this area, so I think it's a very useful thing to know about. And as an aside, I'm a big fan of learning about more than one topic, because it gives you that broad experience of how this might apply. Just knowing how to train a deep network, with no knowledge of who could use it or how it might apply to something like cyber security, is going to be limiting. You want to know a little bit about how things work as well.
Yes, and that example from Dave Kennedy, someone who's very well known in the industry, with years and years of experience, who owns a cyber security company and does a whole bunch of things: I think if someone in the field says that, and someone like you, doing a lot of research, says the same thing, that's a real indication to anyone who's younger, or wanting to change career or better themselves, that they should learn this. The reason I ask "should I learn it" is that cyber security people clearly need to learn AI, but does the same apply to other niches or fields?

Everyone is trying to build AI into almost everything right now. Sometimes that's a silly idea: sometimes the thing already works without AI, we didn't need it, and it just makes things more complicated. But there are lots of times where automation, or just doing something you don't want to have to do yourself, is exactly where AI fits in, and that's a huge amount of the work we do every day. Knowing how to do these things is going to be a real bonus, even if you end up not directly researching or working in AI, just using it in other areas. We already know one of the Nobel prizes went for protein folding, which is an application of AI to a completely different area of science. That area of science will be completely transformed, even if that AI isn't AGI, isn't superhuman, and can't seem to drive a car straight; on that particular task it has been pretty transformative. I think we're going to see a lot of that: a lot of places where there's been an absence of AI, and suddenly it comes in and starts solving problems people had assumed were not solvable and had given up on.

I love what you said there. Two fields I really find interesting are cyber security, obviously, and I have other loves like networking, but if you were starting today, cyber plus AI seems like an amazing combination.

It's a great combination, and they're also both really interesting. I find most of computer science interesting, not all of it, I would say, but if you find something interesting and you're passionate about it, it's so much easier to learn. If you enjoy the fact that this network is training, that you can't believe it's actually doing the thing you asked it to do, or that you've installed this software and now the network is working so much better than before, if you're pleased about that kind of stuff, you're exactly the right person to take on this kind of challenge and go off and learn all this. I'm always trying to learn new things; I probably should just take a break and have a nap, but I think, oh, I could learn about this, and I go off and learn about it, and it's so exciting. AI and cyber security are those two topics for me: they change so much, there's so much interesting news, weeks don't go by without something happening in cyber security, and I think AI is just going to make that more exciting as well.
I love that. I think if I was starting again, this is what I'd advise myself, and you seem to be saying the same thing: get into cyber, get into AI, two big fields with lots of potential, and what you learn today isn't going to be lost tomorrow; it's going to last you for the rest of your life.

Yeah, and also don't overestimate the level of AI expertise required for a lot of the platforms we see. Yes, training up something like Sora is a challenge, and probably requires someone who's spent many years training these models at scale. Training a network to solve some task you have, or to write some of your emails for you, doesn't take as much expertise; it just takes a little bit of knowledge. So you don't have to start with the big one; you can work your way up. And there are loads of roles for people doing work around the edges, not quite the fundamental research or development in AI, and we need a lot of those people as well.

It seems like the world's going to be split into two groups: those who have the power of AI behind them, and those who don't and are going to be left behind.

As I say, there are going to be some situations where people are putting AI on a problem where they would have been fine without it, and we can all recognize that. I'll give you an example, just for fun. A few weeks ago, the reason I trained a large language model was that I trained it to write my emails for me.

Oh, nice.

I downloaded all of my emails, and it turns out I send quite a lot of emails. I gave a large language model a task: here's an input email I received, this is what I wrote in reply, learn to do that for me. And it did it, with a few caveats, I should say. It wrote emails exactly as I write them, because one of my bugbears with something like ChatGPT is that it doesn't write like I write: I have a quite specific style, and it writes much longer and, shall we say, fluffier sentences. Everyone would know immediately it wasn't from me, so if I want to actually take a holiday and not send my own emails, I'm going to need to cover it up much better than that. So I trained this thing, and it wrote emails exactly as I would phrase them and signed off exactly as I would. The only problem was that it still made up the answers, so it would say something like, "Absolutely, I can mark your courseworks for you, please just send them over, I'd be happy to do it," and I'd never send that to anyone. But built into a system where it's prompted with my actual calendar information, or roughly what my actual response would be, I can imagine a situation where I'm writing an email and I just give it a bullet point saying this is roughly what I want to say, and it turns that into a fully fledged email. That kind of thing would be transformative for my work, because I spend a lot of time typing out words that could have been produced much more easily some other way, when in fact there was just one key message I needed to get across and the rest was completely superfluous. Those are the examples where I think AI is really going to help.
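Mike doesn't say which library or settings he used for the email model, so the sketch below is only an illustration of the general technique: parameter-efficient (LoRA) fine-tuning of a small causal language model on prompt/response pairs, here using Hugging Face transformers and peft. The base model name, hyperparameters, prompt format and data file are all assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "gpt2"                       # placeholder small model; swap in whatever you can run
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train a few small adapter matrices instead of all the weights.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Each row of train.jsonl is assumed to hold an incoming email ("input")
# and the reply you actually wrote ("output").
ds = load_dataset("json", data_files="train.jsonl")["train"]

def to_tokens(row):
    text = f"### Email:\n{row['input']}\n### Reply:\n{row['output']}{tok.eos_token}"
    out = tok(text, truncation=True, max_length=512, padding="max_length")
    out["labels"] = out["input_ids"].copy()   # causal LM: predict the next token everywhere
    return out

ds = ds.map(to_tokens, remove_columns=ds.column_names)

Trainer(model=model,
        args=TrainingArguments("email-lora", per_device_train_batch_size=2,
                               num_train_epochs=1, logging_steps=10),
        train_dataset=ds).train()
```

Libraries like Unsloth, mentioned earlier, package the same LoRA idea with extra memory and speed optimizations so much larger models become affordable to tune.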
We aren't there yet, because everyone is just trying to interface with ChatGPT as it is, and that means everyone ends up writing the same kind of waffly text. But as these tools start to become a bit more bespoke and a bit more tailored to an individual, that's when I think we're really onto something, and being part of that wave, knowing how those techniques work, is only going to help you deploy those systems for yourself. Everything I did, I did locally on my own machine, partly just to prove a point, and you can deploy these things on your own PC if you want. Llama, which is Meta's large language model, runs perfectly happily on a mid-range graphics card, and a little more happily on a big graphics card, but you can train it, talk to it, coerce it into solving tasks for you, and you can do all of that locally, just for the initial investment in the hardware.
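As a rough illustration of running a Llama-class model locally, here is one way to load an open-weight chat model with 4-bit quantization so it fits on a mid-range GPU, using Hugging Face transformers and bitsandbytes (which assumes a CUDA card). The specific model name is an assumption, and Llama weights require accepting Meta's license; any similar open model would do.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"   # assumed; gated behind Meta's license
tok = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # shrink to fit a mid-range GPU
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [{"role": "user", "content": "Draft a two-sentence reply declining a meeting."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=80)
print(tok.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```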
The world's going to change for people who leverage AI, just like people who had a phone had an advantage over people who didn't; an iPhone or an Android phone gives you such an advantage in life.

I don't want to give away how old I am, but I remember when the iPhone came out, and of course there were people who said, I don't need this nonsense, I'm happy with my Nokia. And, well, what happened there? Ultimately people went, hang on, this is better. As I say, I think we're in for a couple of years, or a few years, of finding our feet, of learning what's actually useful and what's just a fun aside, before we settle down and say, okay, this works, let's use it, and that didn't work. One of the things I've found really useful: I don't really use AI to code now, partly because I don't write as much code as I used to, because unfortunately I have lots of meetings to go to, but what I do find it useful for is zeroing in on the bit of documentation I need to see in order to write that code. When I trained this large language model and I was using LangChain to do some retrieval augmented generation, and I didn't know how to do a certain thing, I asked it which function in LangChain would do this, and within about five seconds I was off in the documentation in the right place. It didn't write me good code that would have solved the problem, but it did point me to what I needed to read to then write the code myself within a few minutes. So I found it a really nice productivity tool, just as a guide that gets me a little closer to the problem, rather than trying to solve the whole problem end to end, which is where people assume AI will be useful. If you know about AI, you're going to be able to use it in little ways dotted around your work, rather than you simply not going to work because an AI does it all, which I just don't see as very plausible. So I think it's not going to have quite the impact people think; it's going to be different, but it will nonetheless be really, really big.

It's interesting; the flip side of that is something a lot of people hate, and I find really irritating: when you do Google searches, or whichever search engine, you get the AI answers at the top, and the information isn't even correct.
Which is interesting, because that's retrieval augmented generation. To give you a definition, retrieval augmented generation, or RAG, is a technique where you inject live data into the prompt to help the LLM write correct text. That assumes, first of all, that what you've injected is correct; it assumes you've injected something useful; and it assumes the LLM can correctly interpret what you've injected to answer the question. And it has the same issues that other large language models have. I don't know if you've ever noticed this: you say something to ChatGPT along the lines of "why is it that this happens?", but the thing you've said is complete nonsense, and it still gives you a proper-sounding description, "oh, that happens because of this", because your question was wrong. You might ask "why is the sky green in the mornings?", and sometimes it will say the sky isn't green in the mornings, but sometimes it will give you an explanation for why the sky is green. They're getting better at this, but they still do it. The point is that because the question wasn't quite phrased correctly, it just ran with it, and you don't want your search engine doing that. So we're in this period where it's been shipped as a product, but we haven't quite got a way of checking: is what was asked actually something we can answer, is the source we found to answer it actually correct, and has the LLM converted that text properly into a response?
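A minimal sketch of the RAG pattern just described: embed some documents, retrieve the ones closest to the question, and inject them into the prompt with an instruction to stick to that context. The sentence-transformers embedding model named below is a common choice but still an assumption, and ask_llm() is a hypothetical stand-in for whatever model you call.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Our office is closed on UK bank holidays.",
    "Support tickets are answered within two working days.",
    "The VPN uses WireGuard on port 51820.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")      # assumed embedding model
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2):
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q                                # cosine similarity (vectors normalized)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to whatever chat API or local model you use.
    return "<model answer goes here>"

question = "Which port does the VPN use?"
context = "\n".join(retrieve(question))
prompt = ("Answer using only the context below. If the context does not contain the answer, "
          f"say so.\n\nContext:\n{context}\n\nQuestion: {question}")
print(ask_llm(prompt))
```

The three failure points Mike lists map directly onto the three steps: the documents themselves, the retrieval, and the model's reading of the injected context.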
Microsoft Recall sounded like a good idea to some people, but there was a lot of pushback about it, real worry about AI and privacy. What's your point of view on that?

I thought it was fascinating that it was such an obvious pushback from someone like me or someone like you, and yet Microsoft didn't seem to see it coming: that we wouldn't want to just take all our desktop screenshots and throw them straight up to a server. I think there are a lot of privacy concerns, and I don't think they've been adequately addressed. Companies are going to have to start thinking a lot more carefully about this. I don't think everyone is intuitively happy with "here's all my data, please upload it to a language model and do what you want with it", particularly because a lot of these companies have very vague information about what your data will be used to train. Look, do I mind if Windows does character recognition on my documents? Not particularly, although some of my documents are confidential student marks and records, so there's that. But do I want them being used to train a language model? Absolutely not, because they're confidential. That hasn't been solved, and I think we need to think about it a lot harder.

Yeah, I mean, Adobe hid some stuff in their terms and conditions so that suddenly anything you create can be used for the AI. It's a huge intrusion.

And on X, or Twitter as I like to call it, the terms and conditions changed recently to allow training on tweets, and that wasn't there before. So you've kind of got the rug pulled out from under you: you were under the impression your data wasn't going to be used a certain way, and then suddenly it is, and at that point you're already using the platform and it's difficult to disengage. I think it's a bit of a bait and switch, and they should be careful doing this.

Yeah. Even Apple, even though Apple say they're a privacy company, so-called, and a lot of people would push back against that, the chips are directly in the phones, right?

If the chip is in the phone, then I suppose theoretically that's a good thing, because the data stays on the device. I train a lot of my language model stuff on my personal computer, firstly because I want to see that I can, and also because I think there's a strong privacy argument for it: if people are using data and they're not sure whether they should be uploading it to the cloud, the obvious solution is don't, and just use it locally. I think keeping things on the device is great. Apple's policy is basically that if the phone can't do it, it goes up to the cloud, and it's secure, and they've got white papers describing how secure it is. Time will tell, but they are at least trying to answer the question a little, which is more than some of these companies are doing.

Mike, what's the difference between a GPU, an NPU, a TPU and a CPU? Why do I care?

Mostly I think it's just names people give things to sound good. A GPU is the hardware in our computers that we usually associate with playing video games; they were developed as a way of having many, many cores doing small vertex and shader tasks in parallel, all at the same time, really quickly. AI ultimately just boils down to mathematics, and what it usually involves is large multiplications: you have an array of numbers, let's say 2,000 by 2,000, you have another huge matrix of numbers, and you have to multiply the two together. That isn't actually difficult; it just takes a long time, and it's something that can very easily be done in parallel if you have a lot of cores, which is exactly what a GPU has. A TPU, or Tensor Processing Unit, is just hardware specifically designed to do that, and it's what Google calls theirs. An NPU, depending on which company you're talking to, is a Neural Processing Unit; for example, the NPU, or its equivalent, in the new iPhone is essentially the part of the chip responsible for crunching these numbers really quickly, and it lets them deploy these models. You can imagine a setup where Apple trains a model behind the scenes on all their data, freezes that model, and then deploys it onto the device, and the NPU is the thing responsible for running and executing that model on the input. Because it's in hardware, it can be done really, really quickly.
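A small illustration of the "large matrix multiplication" point: the same 2,000 by 2,000 multiply run on the CPU and, if one is available, on a GPU, using PyTorch. The sizes and the timing approach are just for illustration.

```python
import time
import torch

n = 2000
a = torch.randn(n, n)
b = torch.randn(n, n)

t0 = time.perf_counter()
c_cpu = a @ b                                    # one big matrix multiplication on the CPU
print(f"CPU: {time.perf_counter() - t0:.3f}s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    c_gpu = a_gpu @ b_gpu                        # same multiply, spread across thousands of GPU cores
    torch.cuda.synchronize()                     # wait for the GPU to finish before stopping the clock
    print(f"GPU: {time.perf_counter() - t0:.3f}s")
```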
Mike, didn't Sora just get released recently?

Yeah, Sora was released a month or so ago, to great excitement, but we'd already seen it before: we'd seen all the examples, and now we're just allowed to use it. Sora is a really interesting one, because once we got our hands on it, we could see that it does pretty much what we thought it would do, which is that it works pretty well, but also doesn't work a lot of the time, and does slightly weird, uncanny-valley things, where a dog transforms into a slightly different animal and then defies the laws of physics. And I think this is entirely expected: it's exactly the same thing we see in the text-based large language models. These are the same kind of technology underneath; there are differences, but you're still talking about a very, very large model trained on a lot of data. They still have a problem grounding their output in the real world: there's no physics model running. In a computer game you have a physics model that defines how an object accelerates, and that at least makes things look semi-realistic, depending on the game; that doesn't happen in Sora. There's no concept of gravity; the fact that someone stays on the ground is only because the training data has that happening most of the time. It doesn't stop someone turning into a bird and flying off, which makes for some pretty cool videos, but also videos that are kind of weird, and I'm not sure I'd actually use them for anything.

Mike, the biggest concern for a lot of people is: I'm going to lose my job, AI is going to replace me. Is that true? Do you see that happening? I have heard instances where some jobs have been taken, or companies are trying to replace humans with AI.
I think there's always going to be a desire for companies to try to streamline and save money; that's almost by definition what they do, because there's more money to be made that way. In actual fact, I suppose people have to reflect on what it is they do in their job and whether they can currently foresee an AI replacing those things. For example, in my job I might write some boilerplate code, and I could probably see Copilot or something replacing that aspect of my work at some point, but that's only a very small part of what I do. There are lots of other things, like writing formal documentation, that I don't think ChatGPT does anywhere near as well as I can, because it writes very differently, so I can't see ChatGPT, or any of these models, coming close to what I do as a whole. There are maybe some administrative tasks that could be made easier or streamlined away using some of this AI, but I don't think it's realistic to say we're going to replace huge swathes of the workforce. What we might get is marginal efficiency gains. I think Google recently predicted that AI could be saving people 100 hours a year by 2030. Okay, that's great, but I'm contractually obliged to work about 1,500 hours a year, so unfortunately I'm still going to have to turn up. So yes, companies are going to try to streamline, and sometimes they're going to make mistakes: they'll replace a bunch of people with a chatbot that then turns out, accidentally, to be racist, they'll completely embarrass themselves, and those companies will learn that lesson pretty quickly. And I still think we as humans want humans involved. If I ring a company because I need help, I actually want to speak to a person, because I think a person might actually want to help me. So I think there are reasons to believe humans will still be around for a while longer.

So you don't think, as the Nvidia CEO said, that we won't need developers anymore? You don't think that's going to happen anytime soon?
I mean, Nvidia has plenty of developers that they haven't laid off, so I suppose I'd ask why he feels that Nvidia needs its developers but no one else does. Is it because he thinks Nvidia developers are better? I don't know; maybe they are, maybe they're not. I don't see it happening. I've met a lot of developers quite recently at different events, and I think the consensus has been that when these models came out, there was an immediate reaction of, oh, they can do this quite well, is that a problem? But over time and with use, some people don't use them, some people build them into their workflows, but no one, I think, is realistically replacing themselves with one of these models, because they just don't do that end-to-end process covering all the different aspects of software development at once. You can try, but I just don't think we're there. Someone said to me, and it was a fantastic point, that a very quick way of getting legacy code you can't understand is to get an AI to generate it, because if you don't have ownership of that code, if you haven't written it yourself, how can you possibly debug it, or come back to it in six months' time and fix an issue? So in software development, for example, I think we're going to be around for a while longer.

Yeah, I mean, the cynic would say there's a reason Nvidia says that, and it's to push the stock price up.
Yeah. Nvidia's stock price is primarily tied to their GPU offerings now, and their GPUs are bought mostly by about four companies. My own Nvidia GPU, that one card, is not the reason Nvidia has a high stock price. So it will be really interesting to see what happens. Nvidia and OpenAI and all of these companies are pushing AI really hard, because they want to be seen as the ones pioneering it, and it is affecting their stock prices. To their credit, Nvidia have been producing GPUs that can do this number crunching really fast for a number of years; they've been ahead of the game, and that's allowed them to basically monopolize the market in terms of how we train a lot of these models. Will that always be the case? I don't know. Will we always want to train models of the size we're training now, or will more efficient mechanisms come along, or will we get bored and decide, you know what, this small model is doing just fine, we don't need the 200-billion-parameter model? If any of those things happen, if the consensus from the companies becomes "this is big enough", that's when Nvidia will have to change strategy and start offering different kinds of products. Just assuming these companies will keep buying 100,000 or 200,000 GPUs at tens of thousands of pounds apiece is not necessarily a long-term strategy that will hold for many years to come; we can't predict that.

It's interesting, because I saw the Devin announcement, where they had this AI that was writing and debugging its own code and doing everything, and it sounded like a lot of hype, but it was that whole idea that developers are going to get fully replaced by AI.

It's a nice idea, but I don't think it's going to work. I saw a GitHub repo, I forget what it's called, where they're trying to replace every aspect of the software development pipeline: they've got the software development AI, the project manager AI, the senior developer AI and the junior developer AI. I don't know what the difference between those last two is; presumably one just wasn't trained for as long. I can see why they're trying to do this, and it's kind of fun, but a lot of what software development is isn't the actual writing of code. I remember when Elon Musk took over Twitter and renamed it, and then fired most of his workforce, there was a rumor, and I don't know whether it's true, that they were letting people go based on how many lines of code they wrote. If you talk to an actual software developer, we know that the number of lines of code is not indicative of code quality; if anything, it's inversely proportional to it. So it's a bit naive to think that just because this thing writes code so fast, that must make it better. It's an interesting way to go, but I don't think it's true.

Mike, I really want to thank you for sharing and giving us perspective, because there's so much hype and so much noise out there, and I really appreciate you being willing to come on and share your perspective on all of these things.
As you know, I mostly like to do online videos because I just like telling people about computers, and I think this is one of those times where there's a lot of hype and momentum around AI, and not all of it's bad; some of it's hype because the performance really is that good. It's nice to be able to take a step back and reflect on some of these things, and ask which of them are going to be transformative and which we can broadly ignore and not worry about. When something moves as fast as AI does, it can be very intimidating for someone trying to take it on, or someone who maybe hasn't been doing it for a year and is now starting to panic. But actually I think that's working to our benefit in a way, because there are new libraries coming out that everyone is learning for the first time, even people who've been working in AI for years, so we're all kind of in the same boat.

Mike, I really hope we can convince you at some point to create your own YouTube channel, or create a course, so that you can share with all of us.
Yeah, it's on my mind. I think there are loads and loads of training courses out there, but as you know, a lot of my videos are just put out there, and if people like them, that's great. I do think some sort of fundamental "how stuff works", especially with AI, could be something I'd think about, so watch this space.

So for everyone who's watching, put comments below; we've got to have a vote, we've got to get Mike to create content. Mike, thanks so much.