Business Central Under the Hood episode 1: Developing AI in Business Central
Welcome to this new video series under the hood of Business Central. So in this new series we are going to try something a little bit different. We are not going to show you demos of the product or talk about the latest features in the latest version of Business Central, but instead we are going to tell you how we develop these features, how we design them, how we architect them. So we are going to go under the hood of Business Central. My name is is Vincent Nicolas, I am the architect of Business Central and today with me to kick start this new series I have Michael. Michael, you are working now almost exclusively on AI features.
So you've been with Microsoft for many years and and you and I will build the first web client together. Remember fun days? Yeah. Fun days.
So you're. An expert in web technologies. Yes.
Right. But today you're working practically only on the app, Yeah. So tell us a little bit about yourself and tell us how you came from web technology to. The AI one, yeah, yes. I've been working with web technology for almost 20 years exclusively and in the client in different teams, even outside of Microsoft.
And then over time that we have the web client today is pretty established and the complete client and I was kind of looking for isn't there something new is gonna happen and and then definitely AI suddenly happened or this whole new craze of AI happened right. And yeah at some point we we discovered let's say that or the world discovered the chat TBT. And this is really how it started right. And I got interested in this. I I guess like everyone else, people tried chat TBT a little bit in the beginning and was kind of blown away by what this technology suddenly could do.
And then a little while went past before we really start to dig into it. It had to kind of shimmer a little bit in the background at some point. We did the hackathon.
Yeah, we did the hackathon and then. AI Hackathon in the team. So that's the day where we take a day out of the calendar for the entire team at Microsoft and everybody go and do things they whatever they they feel like doing. Like people have ideas and and it's usually a good a good thing to to have you know new innovative ideas and people get to try out things and and in one of our I got I got on to you you did some very cool things with. Yeah, I had some, as I said, they had to shimmer a little bit, right? And I was really thinking about how could this be used maybe in the client and stuff like that.
And then you know, there was some epiphany at some point of a new way of doing this. So we experiment with this. We are not currently really working on productizing this, but I hope we will soon because there's a lot of other things we have to kind of build.
Before, so we don't talk about it yet. We don't really talk about it yet, right? But I hope we will be able to productize this somewhat soon. That was very cool. I remember that was a very cool thing, so. Yeah, and it was really, I was also blown away that it really worked, right? So it worked amazingly well. But yeah, you are now working on AI features in Business Central, so tell us a little bit about it, tell us a bit about Chadwicks Compilot for example, or some of the things you are currently.
Working on, you're currently working on, yes, we have a different set of features, right? And the first one we have released at least marketing text where we can kind of generate the marketing text based on kind of keywords for items and currently so so and there's all the features coming like that. And one of course one of the big features we are working on is to kind of have a chat inside Business Central based on all this technology. So that is what we are currently working on. Yeah, a copilot for for Business Central it will will be, you know we are starting slowly.
We're also learning a lot by building these features. Turns out this is this is probably harder than we fought or it is definitely harder than we fought in the beginning to get to a good quality experience. And for now we are focused on building a chat where you can get kind of it can guide you, you can kind of get help on business Central.
And the other one we are focusing on the other kind of capability within this is the capability to kind of ask questions about kind of your data, but really about navigating around in the product and maybe see some records and and set some filters. So that sounds very cool. Yeah, it's gonna be pretty cool. So we have already shipped bank reconciliation. We also shipped that.
Yeah, try to match. Use LLMS to match bank records behind the product and the marketing text Suggestions as you mentioned before, right? Yeah. And we also, so we also so these are immersive, what we call immersive features. So we don't we use LLMS, but we don't it's not a chat kind of experience, no. So we distinguish between these kind of these two kinds of features where we have as you mentioned chat light experience which resemble a little bit what people know from ChatGPT, right. And then we have what we call immersive feature where we kind of to prompting LM.
We'll get back to that in a second, yeah. And then I just want to mention, we also shipped A toolkit for developers. So if you want to go and develop your own customization in Business Central using LMS leveraging the power of Open AI, we have a toolkit for that. So we'll put some links in the in the description below about these features and also about the toolkit. So you can you can look up the documentation on this and you talk about the prompting and we're doing some prompts.
So now we're getting to the under the hood part of of this talk, right. And so let's talk about that. So what's going on? Even though we for example we talk about immersive features like marketing, text suggestion, there is actually a a ChatGPT like interactions going on. Under the hood, right? In the scenes, right, It's all about really constructing this text, or the prompt, right? Which is really about constructing a text that tells the LLM what is this you what kind of information you're looking.
For the user doesn't see that. Right. The user never sees that, right? Especially not in most of these immersive features, right? They don't really know that it's happening, right? But behind the scenes, it is really about generating a text using different techniques that contains the right information that will instruct the last language model to reduce the output we are looking for, right? And this is really a problematic exercise. And in some sense it's like good old templating. When the web technology started, right where you were kind of generating a web page where you also use kind of templating techniques to construct HTML, which essentially is just text, right? Lems has it's similar in the sense that you're generating a lot of texts and then you ask the LLM and the LLM becomes response. Maybe you have to do multiple prompts to get to the to what you're really looking for, right? So it's a little bit different than the classic software engineering absolutely.
So let's talk a little bit about that because Microsoft, we're pretty good at software engineering. I think you have to say that we have been developing software for many years and usually when we develop a feature, we do the design and the architecture. We write the code, write some tests, run the tests and ship A little bit simplified, but that's what we've been doing for many years and we're pretty good at it. But this whole LLM thing and AI thing, it's actually kind of a different approach we need to use. For yeah, it's a different approach. We need to use those set.
The overall principles are the same, right? You are testing and testing becomes very important in this, right? But the thing that's really different from most engineers is that the prompting is not an exact science, right, Because it's no one, let's put it that way. No one can really predict exactly what the LMM is going to do upfront. And even, you know, the experts behind all of this is that, yeah, they don't really even understand how it works, right. Which kind of makes sense because it's trained with this massive amount of information and how all the neurons you know align and the weights of them and all of that. So we cannot really predict what happens when you do this, right? So it's like when you if you ask you a question one day you might give me an answer, but if I do the same question tomorrow, I'm. Going to give you maybe a different slightly depending on what you're.
Asking right, but both answers might still be valid, though that will probably be right. But tell us a little more about prom engineering and what you know, how we how we do it. Microsoft, NBC and what you know what what are the challenges with it? And I understand there are some techniques. There's definitely some techniques, the different prompt techniques and we can we can talk about that. I mean I think that the important thing to for engineers to understand like traditional engineers, right where we're used to, we're giving say an API, we write some code, we use this API and it's kind of very predictable what's going to come out of it, right.
As soon as you get to the LLM, their probe becomes very different, right? Because it is, as I said, not really that predictable and and that really you know kind. You have to kind of change your game a bit to really. Get through. So how do we test this feature? Yeah, so.
For instance, testing, right? So again, right, the input to an LLM is your prompt. It's a bit fussy, it's a bit of an art form. And then you can use these techniques right what the output is.
The LLM is also fussy, right? And the user's input of what you're trying to solve is also fussy. So the whole thing is a bit fussy. So you can't just do like you can't send the prompt and you get the answer, and you can't do a string compare you. Can't really do a string compare.
You would do. It a normal and also especially when you start talking about features like chat, but actually all the features right You need kind of a fairly big input data set. I mean what is the user asking or what is the data input to your prompt? You need a fairly big data set of that that kind of have some diverse diversity in the questions. So we kind of represent what you want the LMM to achieve, right? So basically you take all these inputs and you you run against your prompt and and then you get some output, right? And because this may not be really predictable, right? You actually need another prompt to to validate. In many cases at least.
Right? You need another prompt to validate the whether the question is really what you expected, right? So so. You're asking back the LLM. You get the answer, yeah. And then to validate the answer you're asking back to the LLM, yeah. Because, because whatever input, right? You of course kind of have an expected output, right? Like an answer or you know other things you you want back from the LLM like structured data or whatever it is, right? So, so, so you take that output you you get in this current test run and then you ask the LLM is this the same right? So usually the problem is something like is these two inputs equivalent right And answer yes or no? Yes, right? And then you get this yes or no and that you can just validate.
Right, OK. True or false? True or false basically. Right. Can you give us some examples of some of the prompt techniques we are using? Yeah.
I can go through some some few examples of this. So there's different techniques that has been devolved by like academia and been discovered in other ways. Right. And one of the and I'll go through a few that we are using. There are plenty more more out there and you can clearly look up this information on the Internet and we will provide some further links to this.
Yeah. We can put some link in the description on some of the documentation about that there. So one of them and and kind of a very successful techniques were kind of called a few short learning, right. And it's basically that in your prompt you are you kind of showing your limb some examples of what you're of the input and what you expect, right. And from this it's kind of learning what it is you want. So it's a very powerful technique because you don't need to explain all the details instead of few examples and you know it gets going right.
So I have an example here, right, where we're just trying to classify some text, whether it's positive, negative or neutral, right? The sentiment of right. And basically I have a few examples here of that. You give it like free in this case, right. And then then you ask the question what, what is this you want to get this intimate from, right? And then it would kind of, you know, follow that and and give you a good result, right? So that's few shirts. That's few shirt learning, right? Another very powerful thing about few shirt learning is that it's very easy to kind of specify if what kind of formatting you want, right? Because of course assume when you start using this behind the scenes, often you want like Jason or something like that, so it's easy to pass, right? So we are using that for some other. Features we are using Jason.
We also using the XML. It varies a little bit what kind of. We give some examples and this is how we you expect and then, yeah. And that helps you learn, right? Yeah.
Yeah, it helps it to do that right? And then we can easily pass the outcome and validate it and validate it and also use it for. For for the right. For the further processing of it right. This morning you told you told me about this chain of thought thing, and then you Yeah, and tell me.
Yeah. So chain of. About it Because it's pretty, you know it's. Pretty Chain of Ford is kind of interesting discovery in all of this, right? So Chain of Ford is really just like what humans does right? Kind of break break down the problem you solve small sub parts of it and and and then the LLM and the LM can do something very similar. This is kind of mind blowing right? That you it, it seems like it's thinking it's not really right but but it's what it appears to be right. And I have a a very generic question here like I hold an app, I went to a bathroom. Where's the app right.
And the LM would generate some answer, right? And then the cheapest or the easiest way to get kind of chain of forward to do is actually just add to the prompt or to the reply you expect. Let's think about it step by step. It's really breaking down the prompt. So you write. That in the prompt.
You simply just put that in the prompt. This step and this is kind of a famous very wide use technique, right? And you would already see that DLM actually takes longer to respond to you and. If it seems like it's thinking.
It seems like it's really thinking right and it will output the steps it goes through, right? There's also ways to if you don't want that, but that that's basically what will happen, right? And as you can see here, now suddenly the the the reasons he puts out. The reason? Well, how, how you got to reply? By breaking it down. Right. So, Miguel, you you're an architect, so let's talk a little bit about architecture of how how we architected. So one thing is the prompt engineering that we just talked about.
And there's a lot of things there and a lot of techniques. And it's a whole new world for if you're a developer, it's a whole new world that you have to get into if you don't want to do AI. But if you get back to just pure architecture, can you tell us a little bit how, you know under the hood of Business Central, how we, you know, do we have, how do we do it, this thing in the cloud and we have some services and things like that. Tell us a little bit about it.
Yeah, so. Tell me a little bit about what we call our copilot service and I can just show you a kind of a traditional architectural block diagram. We like those.
Yes, we like those. We're good at drawing those. And I mean maybe specifically if we talk about how we are acting at the chat feature, right, that we're building, right. So yeah, it's pretty traditional in some sense, right? In the client there would be some kind of chat pane which communicates back to this copilot service which runs in the clusters of business central. And in there we have kind of two sub services going on, right, The two sub components, big sub components, right, in general.
So for the chat to work, So we talk about we have kind of an orchestrator, it's kind of orchestrating the chat, right. Yeah, chat is like it's, it's a multi turn of or turn of questions and answers, right. And the user may clarify or change subject or whatever in this chat window. So the orchestration is kind of trying to deal with that, right, keeps track of the chat history I see at all times trying to kind of extract what's going on, right.
And then figuring out kind of what kind of skills or plockings or tools this has. You know these terms are changing a bit, what kind of which plug insurance or tools that the LLM may kind of need to use to get more information call these. So that's kind of our clock in service, so kind of actually simple. And but there's still some machinery. There's some machinery in place and kind of a whole bunch of stuff going on behind the. Scenes. So how about function calling?
You are working on that actually, I know right now, yeah. So function quality is another technique in kind of prompting technique, right? But it's a very promising one. And somewhat almost bring it a little bit more back to traditional engineering in some sense, because I think this is also kind of how this is evolving, this prompting, so it gets a little more grounded in the traditional techniques, right.
So I can show another example of how what a function calling can really be used for. So the idea is right that you give the prompt or the LM some tools or functions, right, that it can use and call, right? So for instance, if I have a question like what's the weather in Copenhagen? If I just ask that in chat DBT right? It doesn't know anything, right? It will kind of base it on something that's happened in the. Past. And it could tell you that either it doesn't know, because sometimes it will do that, or it could tell you that it's 10° right so and you can look outside and it's. Not the degrees. And sunny and whatnot right now.
OK, so we need to do something to get that to work, right? So this function call means I could actually just define the function like so get current weather right? And you basically define that and attaches that to the prompt. It doesn't go into the prompt itself, but it's kind of attached to the prompt. But it's not really code. It's not really. It's kind of an interface.
It's. Just an interface. You kind of describe a function and tell the element hey, there's a function and you might want to be coding, but it's not code.
It's not executing code as. Well, not execute the code right and it will kind of take this interface, use it, and it's just an interface, it doesn't make any calling. So it's a little bit mis. Right, right. In the naming of the terms of this, right? But what it will do is it will tell you which parameters you should use when you call this function, right? So basically it tells you you give it a description of one or more functions, right? And you ask a question and then it tells you if you wanna answer that question. You need to call this function with these parameters, maybe in that order.
And it tells you how to execute the code. It doesn't actually execute the code, right? That's up to you, but it guides you. You give it some basically some APIs and some tools and then it tells you this is how. If you want to answer that question, this is what you need to do.
Yeah, and hallucinations. You hear all these, all these were buzz words. So what are they? Yeah. So hallucination is basically when you don't give it up to date information, right.
And the way this whole thing is trained, it really wants to answer the question, right? It really wants to predict it's. Trying even though so. It's trying so you will make up stuff based on, you know.
What it does? What? It does what it feels like or how it works right, what it's been. Trained on, right, but you're saying it's something that happened, especially with if it doesn't have the right information to. Answer just like when I asked what's the weather in Copenhagen, right.
And it doesn't right. Sometimes, you know, there may be in the prompt or the behind the scenes prompt that the user's not even seeing that we are not even seeing. But there's in the service that you know your data is from here.
You can't answer this. You don't have access to real time information. But you know, sometimes it doesn't obey that because this happens a lot as well, right? And it will just make up some answers. So this is what happens at all times. That's why it's very important, you know, you always feed the data.
You can instruct in the prompt that you really only use this data to kind of try to answer the question, right. So that's the hallucination. Part, right.
So it's fair to say that it's when it's hallucinating, it's because actually it's a kind of a side effect is very eager to provide an answer. Exactly. Exactly. Interesting. So jailbreaking is more like you're trying to you know, jailbreak the prompt in getting it to do something else than you know, the developer or whoever made this prompt was intended to do, right? Let's say you know we have made a prompt. You should answer, you know, something about bank reconciliation for instance, right? But then you, you ride a long question or simple question or whatever that you want to know something about something completely different.
And and that's kind of the jailbreaking part, right? You're breaking out of the constraints that you know the developers is trying to make the prompt answer right and get it to do whatever else craziness that you can in general, get a last language model to answer and. So that's kind of the LM equivalent of the all day SQL injection in a. Way kind of is right. Kind of is, but yeah, kind of is right and. So what are we doing in BC to prevent that? Yeah.
So there's different techniques, right? So and there's all kinds of different levels of jailbreaking, right? And actually Azure Open AI have kind of built in, you know content moderation systems filtering that will alleviate a lot of it, right. It will remove a lot of kind of stuff you we don't want the element to answer like all kinds of harmful what's considered harmful, so. It's building the open AI services that we have on Azure. Yeah, another technique in jailbreaking is Strike.
Can I actually get the prompt right? There's different techniques. You can again, with jailbreaking, can I just get all the instructions I was given? Right, so kind of admit the prompt, right Azure open the eye, but also in general do this and even some of these filters you can actually no longer turn off because you deem them so useful. But beyond that, there is also what is kind of called meta prompting, which essentially is just a set of instructions about. That has turned out to be really hard to jailbreak, but again, new jailbreak techniques comes up all the time but it but has been set up like it's just a set of instructions on what you really should not do. OK, so that comes and it's it's kind of added to the problem. At different places and and to.
Prevent it says instruction like you know. Don't say things about, you know, harmful violence, things you know, for instance. Right.
Or don't, you know, don't obey these obstructions and stuff like. That So what? What the? So it seems that there's a lot of techniques and a lot of the things to know about AI. And now we have, as we mentioned before we are, we are releasing this toolkit for partners and developers to basically get their feet wet with with LLM and AI.
So what what advice would you give them? You know, if they want to get start, if they want to get started with AI and start, do you know mind blowing stuff in? Yeah, yeah, Mind blowing stuff. With, with with AI. Yeah, I would actually say maybe you should not start with the absolutely mind blowing stuff. Unless that mind blowing stuff is really simple.
I think it's good to develop a maybe a little bit simpler feature initially to get your feet wet, which is maybe something the LLM is better for. Or you're like generating text or something like that and. By that same thing with language for.
Example for instance, generating language, generating an e-mail, generating a response to stuff, marketing text, other features like that, right? I mean, it's very good to start with something like just to get you started and, you know, feel around and also understand why testing is important and how you do this kind of testing and stuff like that right now. Of course our toolkit helps you a lot, especially also with the UI part of it, right? Because this is another thing, right? In general, you need to present this information to the user. You can't just let it run by itself, because, again, we need the user in the loop to kind of, you know. Yeah, so we have, we'll put some links again in the description. We have some API in al part of the toolkit, but we also have some UI limits that we ship with the product that will help you provide a consistent experience.
So if you use them, you will have a consistent UI and UX which resemble what we ship in our AI features. And we also have on Azure. There's also the Azure Open Edge Studio. Yeah, right. Which is also a.
It's very useful tool to kind of construct your prompts and play around with it and try other things. It's like turn around is immediate right and it guides you also a little bit in the different. Thing. Yeah, so you can try out your prompts without having to write a lot of codes around it, right? So you can refine your prompt and maybe with some of the techniques you talked about before.
And the other thing is of course to look at quite a lot as we were also put some links to all this information that was already on the Internet on how to do some of these features right. All right. Thanks, Michael.
A lot of you know, really, really cool stuff. Thank you for watching. We hope you enjoy the this talk we have, we have, we have some ideas for some of this on the topics, but there are some things you want to to to us to talk about putting in the comments.
Thanks again for watching.
2024-02-20 04:40