So Fidelma started her keynote by saying that, you know, people went to the fiesta and what have you. I'll start by saying that this is the last session, and if you're here, you're not going to eat, because lunch finishes at 1 and I'm planning to finish my session at, well, 1. I'm kidding. It's the last formal session of Discover, and hopefully I'll keep it short so we can actually go and have some lunch.

I know some of you. I've seen some familiar customer faces here, and HPE people: customers I've been working with for years, and some of you I've been upsetting for years as well. What I'll do today, in maybe 30-something minutes, is go through my latest gig at HPE. I met you when we were designing CFD clusters, data pipelines for autonomous driving, and a couple of other things. For the past year or so I've been in the CTO office, and I'm driving a team that is looking at how we change a bit the way HPE looks at innovation by bringing in something that we call a customer-driven approach. I'm going to walk you a bit through the process. It's going to be a bit boring, so bear with me. Then at the end I'll try to unpack how we ended up building Ethan. If you were not in Fidelma's keynote: it's called Project Ethan, and basically we're looking at how we can build this gateway app within the GreenLake platform that allows you to use any model with the right guardrails and policies. I'm going to go a bit through what our thinking was and how we engaged customers to actually do it, so I'll use that as an example.

Okay, good, so let's get started. Innovation is basically invention, which is a fantastic idea that you might have when you run in the morning or when you have breakfast, but it's also commercialization. It's not enough to think about it; you actually need to make it real. At HPE we had a lot of inventions, we have a
lot of patents, but we also have a lot of innovation that actually goes through, because you need to make it real for customers and we need to make it real for ourselves. So basically you have invention, innovation, and commercialization: research, product, and then you bring it to customers.

Now, the path is not always easy. If you believe that you can go from A to F in a straight line, it doesn't happen. We've been going through the stages, and we actually go up and down, turn around, change directions, until we get something that makes sense. So the reality is an intricate line that takes us to F, but hopefully when you arrive at F you have a product that everybody loves, or something you're bringing in front of people that everybody loves.

Good. So it is not easy. Not to simplify the line, but to make sure that F represents a product that everybody loves, at HPE we have something called the four pillars of innovation.

The first is exploring, and you're going to see this in a moment. We are trying to understand what comes over the horizon, let's say in 12 months, in 24 months. Not only that: within each area that is interesting for us and for our customers, we are trying to identify the ideas and the emerging trends where we need to apply engineering resources.

The second one, which is new for us, is a phase called experimentation, where we go every quarter in three-month sprints. You have an idea, we put some engineers in place, we run it for 90 days, and at the end you decide what you want to do. You can move it onwards to mature the idea, you can kill it, or I can pass it to somebody who says, "Fantastic report, I'm going to read it tonight and then I'm going to use it tomorrow." We do this experimentation over and over again, because ideas
come and go. We need to test them and figure out if we need to de-risk them.

The third pillar is acceleration, where we are taking something that we've done as a PoC and trying to bring it to an MVP. We spend six to nine months, with more engineers, trying to reach a state where somebody in engineering says, "Fantastic idea, let me go and pick it up." And finally there is adoption, where basically we are integrating them within engineering. So, four stages.

Let me give you some examples. This is what we call an innovation map. As you've heard all over the conference today, you have AI, hybrid cloud, and edge, which for us are the directions where we are pushing in order to deliver products that our customers actually use. Underneath you have something that is top of mind for everybody: sustainability, observability, and security. These six areas are our focus areas, where we are looking for new ideas.

So we create the innovation maps. We go out and talk to customers, we talk to analysts, we talk to our HPE product managers, and so on. In each section we figure out the ideas, we look at what is behind them, and we try to figure out what the trends are. You see that the bubbles are bigger or smaller, meaning what impact it will have for our customers and, obviously, for HPE. A bubble can also be empty, full, or half full, which shows who the players on the market are: are there any startups, any open source, any companies that we might leverage to deliver this to customers?

Out of that, we later try to figure out where to apply resources. Obviously, the bigger the bubble: if it's empty, I need to go out and apply engineers. If it's half full, it means that I need to go out,
find the open source project or the startup, and try to enhance it. If it's half empty, sorry, if it's full, there is a product, and I need to go out and figure out how I onboard it into my roadmap.

Good. What do we do next? We have the trends, but where are the ideas coming from? We have events, which are internal. We have TechCon, one of our biggest internal conferences, for technical abstracts, ideas, and so on. Then we organize hackathons, and so on and so forth, where basically we are streaming all of those ideas into the areas that are critical for our customers. Then we run technology deep dives with partners. From all the ideas, we figure out which startups and which open source programs we are interested in, and we run very deep technology deep dives. Why do we do that? Because later on I need to go to experimentation. I need to present to my peers the main ideas that I'm going to pursue in the next quarter, and then they prioritize them.

This is how we built Project Ethan. This is how we're looking at a security dashboard. This is how we figured out that sustainability cannot stay only in reporting: I need to be able to tell you which containers, which workloads, which VMs are actually consuming what, so we ran a project to understand how to do it. Don't think that these are products, because Tony here is going to kill me. They are not products. They are ideas on which we go out and run experiments in order to make sure.

Then I'm feeding the next stage, which is acceleration. By this point I'm reaching out to Tony and asking him, "Tony, what does this need to look like to make it real for customers?" I define a prototype, an MVP, and then I go out and execute for six to nine months. This is why we've been figuring out, for example, how we do a data access manager for GLP, how we improve it, and how we deliver data services to our customers
like intelligent data inspection, an open data catalog for metadata, and, for example, data search and recommendation. All of this is real. We have the projects. Tony doesn't have them; I have the projects. But slowly we move to a phase where we think about moving them into engineering. So, adoption.

Good. At this stage, you've heard about it: the sustainability dashboard actually went from a PoC to innovation engineering, and now Fidelma just showed it on the platform, and you've seen just now in L's presentation that we actually demoed it. So it went through the entire journey, and finally you have it in the product.

Another example I want to talk about is M&A. We acquire companies, but we don't acquire them without running the technical due diligence and figuring out where they play in the market, and then later on we integrate them. It goes down the same path, but it's not an organic innovation, it's an inorganic one. We run a due diligence, we run a deeper due diligence, then we integrate. That is adoption.

Now, why am I telling you this? Because these are our four pillars of innovation, but at every step of the way we are discussing with our customers, as I was saying earlier. This is a busy slide, but in the exploration phase we understand the challenges and map them into innovation maps. When we are running the 90-day PoCs, we talk to customers to get an early view on what we are developing. Before we move to acceleration, we sit with customers: we understand whether this is feasible and what is required to move it into a product. Only for ideas that actually make sense do we then spend six to nine months, and we meet customers again. We run the tech preview. They want to understand where you are. It is not a product yet, it's
a tech preview. Then we run betas with customers. Every stage has the customer involved: in some cases just asking questions, in some cases showing them some capabilities, in others letting them test on their own, and finally engaging customers with a release candidate. So, all the way.

But it's a nice story. So, earlier today Fidelma talked about something called Project Ethan, and she used the word "obsessed". And yeah, I was obsessed about it, because we talk about GenAI, everybody wants to use GenAI, but everybody is shutting it down. Especially the Italians, right? Italy was one of the first countries not to use ChatGPT: Italian companies, and so on and so forth. So I was thinking, as an idea: what if we go out and create an app that allows customers to use any model, anywhere, with the right guardrails, policies, protection against injection, and whatever else? Then slowly we ask for feedback, and, why not, magically it appears as an app.

So why GenAI? Well, first we start with the problem. Let me walk you through the entire reasoning behind Ethan by pit-stopping a bit around GenAI. You've heard the term in Antonio's keynote and in Fidelma's keynote. I'm going to do a three-minute breakdown of how this works. If you know it already, just bear with me; if you don't, hopefully you'll understand afterwards.

At HPE we believe in advancing the way people live and work. I had the privilege to be in presales, which is the coolest job in the world, for many years. I've been doing this firsthand: working with customers, traveling, pulling cables, discussing, engaging, deploying software, whatever, for many years. Again, it's the coolest job in the world. If you ever think about a career change, presales is the best. Now I'm doing something else, because I'm not smart
enough anymore. But at HPE we believe in advancing the way people live and work, and I've been doing this for many years: anything from trying to understand how Formula One works to how autonomous driving works, because I see my colleagues from Zenseact here.

Now, the world before November 2022 was very easy for me. I was driving AI for the company, I was looking at computer vision, I was contemplating GPT and talking to NVIDIA about Megatron. It was very, very easy: wake up in the morning, have my coffee, have my run, everything cool. After November 2022, I can't freaking sleep anymore, because these guys introduced ChatGPT. I have many friends in the industry, at various startups, anything from your Aleph Alphas to whatever other companies in the world, and they can't sleep either. Why? Because every week somebody is announcing something. Every week you have a new model popping up, anything from Claude from Anthropic to whatever else. Every time, something happens. So staying aligned with innovation in order to influence the way people live and work became really freaking complicated. It's complicated as a consumer, because you're putting stuff in and stuff comes out, and then you try to understand what came out. But secondly, as somebody who works at an enterprise, the question is: how do I make this simple? Obviously, they are trying to make this very complicated, but that's a different story.

So why did this change? Let's take an example. Do you remember when homework was a thing? You're all as old as I am, so you used to do your homework. Writing about Shakespeare is not a thing anymore; you can get it quickly. I had an English major at school. You can't hear it from my English, but I did. I remember a professor who was taken with this
idea of D. H. Lawrence. He was crazy about him. He had his favorite line, which was also used in a movie: "I never saw a wild thing sorry for itself," and so on. Man, I never managed to get an A in that class. On a scale from 1 to 10, where 10 is the best, it was always a five. This is easy now: you go in there, you put it in, and then you translate it to German in a heartbeat. My kids are not allowed to touch GPT or any other LLM, and this is why.

But even more importantly, it changed the way we consume information. You want a summary of the election in 2020? There you go, your favorite paragraph about Donald Trump. So it changed the way we consume information at school and day to day, and it changed the way information is being fed to us, or sorry, handed to us, from all possible directions. My keynote was not written by an LLM, as you can see, because it doesn't flow. But even more importantly, you can actually have fun with it.

But what are generative AI models? You get two different answers. If you ask ChatGPT, you know, 3.5, they are advanced AI systems that generate human-like content, and they can generate anything from code to text to pictures, if you're thinking about bigger models. On the other side, I asked DALL-E, which generates images from text, what large language models are, and this is what you get. So, still a bit to go.

Large language models are a type of deep learning neural network that basically generates content after being fed a lot of information. Okay, but let's clarify. NLP works with something that we call n-grams. What does that mean? It means that models look at a construct of words at one time in order to generate the next word. If I take the example "[name unclear] delivers one fantastic keynote in Barcelona", you have various types of
n-grams, depending on how big n is. A unigram is a group of one word, a bigram is a group of two words, a trigram, you get the picture, a 5-gram, you understand as well. Why is this important? Because models look at a certain number of words to generate the next one in the context.

Now GPT-4: it's a 20,000-gram model. It's looking at 20,000 words of context to generate the next one. This is why these models are so good, and this is why, as my friend Dennis said, they have a problem trying to make a smaller paragraph. They love to generate words. It has about 1 trillion parameters, we can debate that, and it has seen, attention, between 20 and 30 trillion words of training data.

You can say that's not enough. Okay, good, let's compare. One book is on average 80,000 words. All the books that have been published, and if you don't trust me, check this with Google, are 130 million books. So the number of words is close to 11 trillion. I'm talking about real books, you remember, the ones with paper. I'm not talking about going on Wikipedia or whatever. So GPT-4 saw 20 to 30 trillion words, and the number of words that we have published is 11 trillion. Even if somebody reads all the books that were ever published, he's still not as smart as, okay, apologies, he still cannot generate better text than GPT-4. As simple as that.

But we've taken this a bit further. As you've also heard from Fidelma, we do something called prompt engineering. Even after reading 20 to 30 trillion words it still doesn't make sense, so we spend more time trying to teach it what it's supposed to do, what it's not supposed to do, what is good, what is bad, and so on. And then, magically, you do your homework with it.

Okay, good. Why am I telling you this? Because I wanted to level-set a bit for the next five questions. When we thought about Ethan, I personally, and then my team as well, went to talk to customers: why would they need
something like Ethan? Why? Because they need to ask their users these questions. Which model do you want to use? Is it production ready, does it work? What about governance: do I need anything in place to check their freaking answers (I'm being recorded, okay, sorry), or to figure out whether the entire process can be hacked into, end to end? Then, which use cases are ready? Fidelma covered this one, so I will not cover it today. And then, what do I need to get started? So we went out and asked all five questions, and even in the past couple of days I was discussing with friends from Qualcomm and others exactly how to answer these five.

Now, why are they important? Well, which model do you want to use? You have two sides. On one side you have the models that you can go out and buy or pay for, anything from your OpenAIs, your Aleph Alphas, your Salesforce, and whatever: commercial models on the left. But wait, on the other side you have open source models. When Llama was launched on February 24 and leaked a couple of weeks later, open source exploded. Now you have Llama 2 and Falcon and whatever other open source models, and you have them to use. Some of them are trained; you tune them for your use case and you're ready to go. Obviously you then need to think about support and whatever else, but still, you have an alternative. So which one do you want to use? Do you want to use a public one on the public cloud, or do you want to put it on-prem? A lot of questions.

And I was not the only one who had these problems. A lot of customers had these problems, a lot of you, I've heard. Together we have this problem: which one do I want to use? Maybe I want to use more than one. Why? Because there are models that are specialized. You want a model that tells you correctly how to read a legal document. We had stories about how this failed,
but anyway, there are models that do this. I want to use them already, while in parallel getting another model for chat, for example.

Good. The second question was: are they production ready? Well, it seems that they perform very well. If they are prompt-engineered and tuned to certain information, they can be very good, like the chatbot that was answering patient questions as well as, or even better than, the doctor. So it can be done. Or ChatGPT passing exams at law and business schools. Fantastic. So they can, if they are properly tuned.

But at the same time they make terrible mistakes. This was done a couple of months back, when I was giving a keynote in Germany. I asked: which is bigger, an elephant or a cat? Obviously an elephant is bigger, and as my friend Dennis says, the answer is flowery: it gives you a long answer and an explanation. But then I asked another question: which is not bigger, the elephant or the cat? And the model says an elephant is not bigger than the cat. Stupid mistakes, reasoning mistakes, and we can go on and on. "I've left my five shirts to dry in the sun and it took 5 hours. How long would it take for 30 shirts?" Can you guess the answer? 30 hours. Simple reasoning mistakes.

And this is one of my favorites since I was at school. I'm not going to go through it, but remember the problem where you have a fox, a chicken, and corn on one side, and you need to move them to the other side without one eating the other? I'm Romanian; we don't have this one, we have something with the cow and the cabbage, but I'm pretty sure you have a similar problem in your countries. The moment the model was launched, I ran this, and at one point it actually says that Steve goes to the other side, picks himself up, and then rows back. Look at number six, there you go. You ask a couple of months later, and they fixed the problem;
now they solve it. So they make mistakes, and while it's fun and we're all laughing, if you're not careful you can get fired: "US lawyers fined for submitting fake court citations from ChatGPT." You can actually get fired. Which raises the question: are they ready for production? You want to make sure that you're putting the right guardrails in place for this not to happen.

So what about governance? Can these systems actually be hacked into? Yes, they can. "Researchers poke holes in safety controls of ChatGPT and other chatbots." You put in this sequence, and then magically it answers every freaking question. You remember the story where you were asking, "Can you build me a nuclear bomb?" and ChatGPT was going out and telling you how to do it. They put guardrails in place, but with the right combination they don't work anymore. They fixed that, by the way. With the right combination of whatever you put in, you can open it up, and then it can answer all sorts of questions. So we need to put the right governance in place, and the right regulation.

Then we've been watching this happen. Samsung banned it, Italy banned it; if you Google it, you can find out which other countries banned it. It's strange that one country specifically banned it, as I was saying earlier: Italy. I would have expected some other country, but in Italy they banned ChatGPT, I don't know, maybe because of the pizza with pineapple or something. But they banned it.

Okay, good. So then the world wakes up and realizes that guardrails are needed, and between March 14th and June 7th everybody announced a summit. This is a bit out of date, because in the meantime the UK PM got everybody together, they shook hands, Sam Altman was kicked out of OpenAI and brought back in, and whatever. But a lot of things are happening and a lot of meetings are taking place. The guardrails need to be
created, need to be regulated: White House, EU, China, UK. You've seen what I've done there, right? EU, China, US, UK. Okay, all right, if you're not from the UK you didn't hear about Brexit. But anyway, the bottom line is that everybody woke up and everybody wants to put guardrails in place.

Finally, which type of guardrails do you want to put in place? Well, anything from using it safely, to being able to inject my own business conduct policies, to enabling the LLM to access external resources to stay up to date, to making sure that whenever somebody makes an update to a model somewhere, you're not surprised. All of those things need to be put in a safe haven for people to use.

Good. What do we need? You've seen this in Fidelma's keynote. Depending on where you are, you're spending more money and fewer people are doing it. At consumption, we all consume ChatGPT or whatever other model. Fewer of us write use cases to actually use it. Even fewer of us are fine-tuning the models. A handful of us are doing retraining. Only a few people are part of the teams that are designing the next-generation models. The higher you are in the pyramid, the more money you need.

Okay, good. Oh, there you go, I had a slide for that. At the bottom: few dollars needed, few resources, maybe some monthly subscription. Higher up you need large GPU systems and very expensive, well-paid engineers, and you truly need to understand how everything works. So it goes from very few dollars up to billions. If you've heard what all of these companies are raising, it's billions and billions of dollars.

So, asking customers: where do you want to be? Let me go back, because I have the power now, it seems. Our customers are in the range from consumption and orchestration to fine-tuning. How do I enable that? So it's not only how I put the governance in place, or how I enable more models for our customers to use, but how I make sure that I'm enabling that lower part. This was the question I was obsessed the most
about. Why? Because if you're in the public cloud, you need this in the public cloud. If you're on-prem, you need this on-prem. If you're at the edge, you need this at the edge. So my worry is not only how I bring all the models together, but how I enable that so that you're not thinking about why Falcon 40B needs one thing and the Luminous base model from Aleph Alpha needs something else. That needs to be seamless.

Good. Finally, and look at this very carefully: why do I need infrastructure? Why do I need to make sure that I have done everything right underneath? Let's see. "If this cat were elected president, the first order of business would be to make sure the economy is strong." But the other answer, after we train it a bit more, is: "He declared war on the dog." What's the difference? The first model was trained with 64 GPUs for 37 days. The second model was trained with 256 GPUs for 22 days. You do the math: three times the compute power, the better the answer. Unless you like the first answer, in which case we are not in a very good place. So in order to get the right answer, you need more compute power. This is the problem we need to solve. I'm not saying that I want our customers to use infrastructure, but I want us to have the right answer, from the models you use up to how you're using everything underneath.

So this is why we created this. We looked at Project Ethan as one of the solutions after we figured out all the requirements. We are not there yet; I can't tell you exactly to declare war on the dog, but we are getting there. We looked at a GenAI app in the middle that is able to take open source models, third-party models, and also HPE tools and models that we are training ourselves, for example from open source, and put them out there. But it's putting the guardrails in place, it's exposing everything, it's API-ing everything into one
place with the right, I can never say it, authentication, authorization, audit, and telemetry. Then I'm exposing to my customers, to you, the way to consume it via chat, as Dennis rightfully showed you, and also the way to orchestrate it and include it in your use cases, no matter where the model lives. This is why Project Ethan was introduced.

This is a bit of an eye chart, and I apologize for that. When we introduced it, we wanted the UI service, the chat service, on one side. From the chat service you can bring in skills. Skills are use cases: you can say, "I want to do a Q&A on my documents that show me how this part works." Put in the documents, pre-train everything, run a RAG if needed, then expose this as a skill for all your users to use. That's on this side. On the other side you have the model service, which has the governance underneath and the vector DB, allowing you to fine-tune on the side, but it exposes an API that you can channel into any of the models that are exposed. We've been trying to put this together as quickly as possible, because, rightfully so, I was obsessed, and maybe I still am. We need to make this available.

So what are some of the threats that we need to guard ourselves against? Not only does whatever goes in need to be according to your policies; I'm pretty sure that all companies have standards of business conduct. Even more importantly, all the customers want to keep their IP on their premises, so we accept that. We also need to figure out how I secure myself against prompt injection, how I secure myself against misdirection that can be embedded into the model, and how I avoid prompt capture or leaking, because whenever you see the output, people can still see what is being generated. And then obviously you have all the requests in yellow there that are actually
coming through. This is why we are developing the guardrails: to make sure that I'm able to filter the response, to have a governance model that understands what is coming in and what is coming out, and to validate the accuracy of the model. I don't want you to believe that elephants are smaller than cats; we should actually check that, and so on and so forth. So the right guardrails are important, with the freedom for customers or users to inject more guardrails if needed. Not all companies have the same beliefs.

Because of that, we created Ethan. It delivers the chat. Again, it's an internal project. If you're an HPE employee, you log in to GLP, as L and team were showing earlier, and you can see the app, go in, and start using it. User history, copy-paste, user feedback: everything is delivered as a chat.

But even more importantly, we are running the orchestration underneath. There is the ability to create and delete API tokens that allow developers to access all the models provided by the service, and for which you've deployed the infrastructure. You don't just put a model somewhere and say, "Oh, fantastic, I put a model here and I'm going to have 100 users using it." No, you need the infrastructure underneath. Show user quotas and service rate limits. Show the history of all communications. One token to access all available models. That was the idea, and this is what we've done.

Obviously there is work to be done. We are working on analytics and audit, and on triggering the retraining. Why analytics and audit? Because you want to know what your users are doing. I don't believe in Big Brother, but you need to understand what is happening: which models are being used, how they are being used, and so on and so forth. So we are looking at consumption analytics, auditing, and understanding which guardrails were influenced, which were active and how,
and so on and so forth. All of that is in the works in the CTO office today.

That being said, I'm coming back to this slide again. Why? Because we couldn't have developed Ethan, I mean, we could, but it wouldn't have looked like it does today, if we hadn't gone out there and talked to customers. Is this useful? Do you want to use models? Do you want the right guardrails in place? Do you want to use public, private, and on-prem models? Why? And so on and so forth. So we've run the discovery part, we will be starting to work with customers between experimentation and acceleration, and slowly, magically, at one of the future Discovers you'll see it appearing in Tony's platform.

Thank you very much.
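
The gateway idea described in the talk (one API token for every available model, per-user quotas, guardrails on what goes in and what comes out, and a history of all communications) can be illustrated with a minimal Python sketch. Nothing below is Project Ethan's actual code or API: the `Gateway` class, the `block_prompt_injection` check, the quota value, and the toy `echo` model are all hypothetical names chosen for illustration, and a real service would presumably sit in front of networked model endpoints rather than local functions.

```python
import secrets
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A guardrail inspects text and returns a reason string if the text should be
# blocked, or None if it is acceptable. (Hypothetical interface.)
Guardrail = Callable[[str], Optional[str]]


def block_prompt_injection(text: str) -> Optional[str]:
    """Toy input guardrail: flag a few well-known injection phrases."""
    suspicious = ("ignore previous instructions", "reveal your system prompt")
    lowered = text.lower()
    for phrase in suspicious:
        if phrase in lowered:
            return f"blocked: possible prompt injection ({phrase!r})"
    return None


@dataclass
class Gateway:
    """Hypothetical single entry point in front of several models."""
    models: Dict[str, Callable[[str], str]]
    input_guardrails: List[Guardrail] = field(default_factory=list)
    output_guardrails: List[Guardrail] = field(default_factory=list)
    quota_per_token: int = 100
    usage: Dict[str, int] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def create_token(self) -> str:
        """Issue one token that is valid for every registered model."""
        token = secrets.token_hex(16)
        self.usage[token] = 0
        return token

    def delete_token(self, token: str) -> None:
        self.usage.pop(token, None)

    @staticmethod
    def _first_violation(guardrails: List[Guardrail], text: str) -> Optional[str]:
        for check in guardrails:
            reason = check(text)
            if reason is not None:
                return reason
        return None

    def ask(self, token: str, model: str, prompt: str) -> str:
        if token not in self.usage:
            raise PermissionError("unknown or deleted token")
        if model not in self.models:
            raise KeyError(f"model not available: {model}")
        if self.usage[token] >= self.quota_per_token:
            raise RuntimeError("quota exceeded for this token")
        self.usage[token] += 1  # every accepted request counts against the quota

        # Check the prompt first; only call the model if the prompt is clean,
        # then check the model's answer before returning it.
        reason = self._first_violation(self.input_guardrails, prompt)
        result = reason
        if reason is None:
            answer = self.models[model](prompt)
            reason = self._first_violation(self.output_guardrails, answer)
            result = answer if reason is None else reason

        self.audit_log.append(
            {"token": token, "model": model, "prompt": prompt,
             "blocked": reason is not None}
        )
        return result


if __name__ == "__main__":
    gateway = Gateway(
        models={"echo": lambda p: p.upper()},  # stand-in for a real model
        input_guardrails=[block_prompt_injection],
        quota_per_token=2,
    )
    token = gateway.create_token()
    print(gateway.ask(token, "echo", "hello"))
    print(gateway.ask(token, "echo", "Please ignore previous instructions"))
```

The design choice worth noting is that guardrails are plain functions held in lists, so a customer could append their own checks, which matches the point in the talk that not all companies have the same policies. Whether a blocked request should count against the quota is a policy decision; this sketch counts it.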
2023-12-11