Hello everyone, thank you for joining us today for this Oracle Enterprise Manager Technology Forum. My name is Wim Coekaerts, I'm Executive Vice President of Software Development, and with me is Mughees Minhas, Senior Vice President of Product Management for Enterprise and Cloud Manageability. This is the third annual edition of the Technology Forum, and we have many sessions planned over the next three days. On the first day, today, we will talk about Enterprise Manager 24ai; the introduction is part of this keynote, and we're very excited about that, along with seven technology sessions. Tomorrow we will follow that up with nine technology sessions focused on enterprise monitoring, cloud management, and DevOps. On the third day the focus will be on database performance and lifecycle management, and, very exciting, we will have United Airlines as the customer spotlight, along with eight technology sessions.

The main topic for the forum this week is really the announcement of Oracle Enterprise Manager 24ai, and it's a privilege for me to be able to do that here in front of all of you. Oracle Enterprise Manager 24ai is our next-generation platform. It follows 13.5, which is what many of you are using today in on-premises deployments of Enterprise Manager. We renamed it to 24ai for two reasons: first, to align with other Oracle products, and second, there has been so much change internally from a platform point of view that it really warranted showing there was a lot more to it than just another version bump. As we started working on 24ai, we created a set of buckets of focus areas: platform modernization, operational continuity, AI insights, and performance and automation. I'll go into detail on all four so we can get some more insight into each of them.

When we look at the first bucket, platform modernization, what does that mean? One of the things we've done is look at the internals of Enterprise Manager, which in EM 13.5 is a WebLogic container holding all of the code in one monolithic environment. One of the drawbacks was that any time we had to do an upgrade or an update, we would bring down the entire environment, patch it, and bring it back up. During that time, a number of features and a lot of functionality that is critical for many of our customers would be unavailable for 15 or 20 minutes, or however long the upgrade takes. So one part was: how can we improve on that side? The second part is really a security element: a focus on ensuring that we no longer use legacy components that are difficult to maintain and difficult to keep current when security vulnerabilities, or potential security vulnerabilities, need to be fixed in our product. So we brought the entire Enterprise Manager stack, all the different libraries and third-party components it uses, to something that's current, modern, much easier to maintain, and certainly a lot easier to keep secure for a long time to come.

Another thing we did was take a good look at the user interface. Much of what Enterprise Manager used in the past was a technology called ADF, and we switched to a more modern technology that within Oracle is called Oracle JET, our JavaScript-based UI toolkit. We also use it in Fusion Applications and in Oracle Cloud, so it's a really efficient way of doing UI development, with nice templates and a great interactive model.
It's single-page based, so it's very smooth in operation: it's easy to go from one page to the next, it's easy to change the widgets inside it, and it just looks and feels very modern; it's a really nice upgrade from what we've done in the past. One other advantage is that we are now able to share technology, share code, with Oracle Cloud. When we work on Enterprise Manager functionality that has an equivalent in Oracle Cloud, we can take a widget there and plug it into Enterprise Manager locally. We have developers working in one place and we can reuse the code in another, so the process of adding new functionality to Enterprise Manager, like we've done for the last several years, will certainly continue, and likely at an even faster pace than before.

The third point is remote agent support. The prior architecture meant that you had to install an Enterprise Manager agent on a particular server, host, or target, which would then monitor that environment. The drawback is that if you have thousands of servers and databases, you would have to install and manage that many agents as well. By providing remote agent support and the ability to create small pools, Enterprise Manager 24ai now lets you say: I have four agents installed on a server, and these four agents are able to manage a thousand other servers. That really reduces complexity, it makes it easier to do updates, and it also makes it a lot easier to deploy new targets, because there is a lot less to install on a given target; you just say this remote agent can talk to it.

One of the things that's nice about upgrading to the new Oracle JET user interface toolkit is the interactiveness and responsiveness of all of this. As I mentioned, it's a single-page model: you go to the next page, you go back, and it's just very seamless; there are no page reloads happening because it's a single-page application. One of the advantages that comes along with it is customizable dashboards. With Oracle JET we will allow you to have your own home page with your own dashboard, with the widgets you want to see, customized to your environment. You can use SQL statements to build up a widget and say, for example, I want to see certain types of I/O for certain types of databases, and it will graph them directly onto the screen. You don't have to do any special coding; a simple SQL statement creates a nice UI component in our customizable dashboards.
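To make that "a SQL statement becomes a widget" idea concrete, here is a minimal sketch. It assumes the standard EM repository view MGMT$METRIC_DAILY; the metric name, metric column, and the little widget-definition dictionary are illustrative assumptions, not the actual dashboard builder API, which takes the SQL through the UI.

# Hypothetical illustration only: EM's dashboard builder accepts a SQL
# statement through the UI; the "widget_definition" structure below is not a
# real EM API. Verify view and column names against your EM release.

WIDGET_SQL = """
SELECT target_name,
       ROUND(AVG(average), 2) AS avg_phys_reads_per_sec
FROM   mgmt$metric_daily
WHERE  target_type      = 'oracle_database'
  AND  metric_name      = 'instance_throughput'   -- assumed metric name
  AND  metric_column    = 'physreads_ps'          -- assumed metric column
  AND  rollup_timestamp >= SYSDATE - 7
GROUP  BY target_name
ORDER  BY avg_phys_reads_per_sec DESC
FETCH  FIRST 10 ROWS ONLY
"""

# Conceptually, a custom widget is just "a title, a chart type, and a SQL statement".
widget_definition = {
    "title": "Top 10 databases by physical reads (last 7 days)",
    "chart": "bar",                      # rendered by an Oracle JET chart component
    "sql": WIDGET_SQL.strip(),
}

if __name__ == "__main__":
    import json
    print(json.dumps(widget_definition, indent=2))

The point is simply that the query result drives the chart; everything else (rendering, refresh, layout) is handled by the dashboard framework.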
Now, one of the things that was really important to a lot of our customers for many years: if we do an upgrade of Enterprise Manager, all the monitoring pauses. Notifications still get saved, but no notification is sent to the admins or users, no tickets are logged, no alerting happens; everything is paused for the period of time that Enterprise Manager is in an upgrade state. In 24ai, we basically split the internal architecture of Enterprise Manager, of the OMS, into two pieces. We have two WebLogic containers: one contains the standard logic for the user pages and the standard administration and management functionality, and a second container is focused on monitoring, logging, and the ability to create tickets, so that in real time, during an upgrade, it can provide you with alerts if something happens somewhere in your environment. We're really excited to finally introduce that functionality into Enterprise Manager; this is what we call zero downtime services.

Then one small technical item is the API gateway. Even though we have multiple containers running now, the API gateway still makes sure there is one endpoint that applications and SDKs talk to, and we internally route requests to the right place, so if something needs to change we do it behind the scenes and it's not something you have to worry about.

In summary, this gives you a very simplified lifecycle management solution, whether it's the agents that can run remotely or the ability to run upgrades of Enterprise Manager while the actual monitoring continues. It allows us to continue innovating at an even more accelerated pace because of the changes in the UI environment, it gives you a better user experience because everything is smoother, faster, and certainly nicer looking, and it provides better scalability and performance, because with multiple containers we can separate a lot of the workloads into multiple environments.

The second bucket is operational continuity, and this is still related to zero downtime monitoring. First off, we have the job system: if something was scheduled to run while there's an upgrade, we will continue to execute those jobs. That's very important if, let's say, a database event happens where a database server runs out of storage and you want to execute a job to expand the storage on that database, but in parallel you're in the middle of doing an upgrade; we can still continue doing that storage expansion. Or if a critical production database were to have an issue, whether that's a performance issue or potentially a crash or a hang, the zero downtime container will still be able to talk to Jira or Slack or other services that we support, to create a ticket, send an instant pager alert, or send a text message alert. From an admin point of view, you keep your SLAs, so you don't have to worry about when you're going to do an update of EM, because the critical pieces you need EM for will continue to run.

And of course the agent pools are HA as well. If you say, here is a server with three agents installed to do monitoring remotely, they're set up in an HA configuration: one agent will be the active agent doing all the work, and the other two will sync its state, so that if something were to happen to the main agent, we fall back onto the second one, or potentially the third one, or however many you have in your pool. So there's an availability component built in as well.

In summary: maximized uptime, both for the critical Enterprise Manager components and for all the targets you're monitoring; operational efficiency, because you don't have to worry about dealing with other things while you're doing an upgrade, and the remote agents give you a much more efficient agent installation; and in both cases, higher business continuity, with continued fast incident response times even while you're doing upgrades, so you keep your SLAs.
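As a purely conceptual picture of the active/standby pattern just described for the remote agent pools, here is a small sketch. None of these class or method names are real Enterprise Manager interfaces; it only illustrates the idea of one active agent with standbys that sync state and take over on failure.

# Conceptual sketch only; not an actual EM API.
import time
from dataclasses import dataclass, field

@dataclass
class RemoteAgent:
    name: str
    healthy: bool = True
    state: dict = field(default_factory=dict)   # synced monitoring state

class AgentPool:
    def __init__(self, agents):
        self.agents = agents                    # first healthy agent is active

    @property
    def active(self):
        return next(a for a in self.agents if a.healthy)

    def monitor_targets(self, targets):
        agent = self.active
        for t in targets:
            agent.state[t] = f"polled at {time.time():.0f}"

    def sync_state(self):
        # Standbys keep a copy of the active agent's state so a failover
        # can resume monitoring without losing context.
        for standby in self.agents:
            if standby is not self.active:
                standby.state = dict(self.active.state)

pool = AgentPool([RemoteAgent("agent1"), RemoteAgent("agent2"), RemoteAgent("agent3")])
pool.monitor_targets(["db01", "db02"])
pool.sync_state()
pool.agents[0].healthy = False                  # simulate failure of the active agent
print("new active agent:", pool.active.name)    # agent2 takes over with synced state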
The third part I want to talk about is AI insights, and what we've done here is basically two things. The first is that we've created and integrated a GenAI assistant into Enterprise Manager. EM itself, the user interface, has a chatbot interface where you can ask questions in natural language. When you ask a question, we send it to Oracle Cloud (OCI), where it gets processed, and we send a response back, which can come in one of two forms. One: we've indexed all the Oracle documentation for the database, Enterprise Manager, and other components, so if you have a question about what to do or how to do something in Enterprise Manager, where you would typically go online, search Google, or open the documentation and try to find it, now you just ask the question in our chatbot and it generates an answer in text, inline in the chatbot, for you. The other form: you want more insight into targets you're monitoring, and instead of trying to find which page in Enterprise Manager has the right data or the right widget, you can just ask our chatbot, say, hey, can you show me the top three databases with the following statements. We send that to our cloud service, and what comes back is a set of dashboards and widgets that visualize the answer to your question.

One thing I want to point out here, though, is that we are not sending metadata or data from your Enterprise Manager repository to Oracle Cloud. We send the question that you ask, in text form, but we do not send actual, potentially proprietary or confidential, data from your repository to the LLM running in Oracle Cloud. What comes back is either a response based on our documentation, or a set of widgets and components that visualize the data you want to see, and that data then comes from your local instance. From a security point of view, I think that's important to point out.

The slide shows a little example screenshot: on the left side you have the assistant where you ask the question, and then the workflow that happens on the other end. We create a WebSocket connection to Oracle Cloud, we send the question over, we do the searches against the documentation or other sources, we turn the result into a more human-readable, concise answer, and then we send it back, whether it's an answer in text form or a set of widgets and dashboards. So that's the first part.
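Here is a minimal sketch of that round trip; the ask() helper, the message shapes, and the URLs are hypothetical stand-ins, not an actual EM or OCI API. The point being illustrated is the one above: only the question text leaves your environment, and any widget that comes back is rendered against your local repository.

# Conceptual sketch only; the service call is stubbed out locally.
def ask(question: str) -> dict:
    """Pretend call to the OCI-hosted LLM service."""
    if "how to" in question.lower() or "how do i" in question.lower():
        return {"type": "doc_answer",
                "text": "Summarized answer drawn from indexed Oracle documentation...",
                "doc_link": "https://docs.oracle.com/..."}       # placeholder link
    return {"type": "widgets",
            "widgets": [{"title": "Top 3 databases by active sessions",
                         "chart": "bar",
                         "sql": "SELECT ... FROM <local EM repository> ..."}]}

def handle(question: str) -> None:
    response = ask(question)                      # only the question text is sent
    if response["type"] == "doc_answer":
        print(response["text"], response["doc_link"])
    else:
        for w in response["widgets"]:
            # The returned SQL runs against the local EM repository; no
            # repository data was shipped to the cloud to produce this answer.
            print(f"render {w['chart']} chart: {w['title']}")

handle("How to troubleshoot buffer busy waits?")
handle("Show me the top 3 databases by active sessions")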
The second part is AI Cloud extensions. Over the last several years we've built Ops Insights, Oracle Database Management, and other cloud services in Oracle Cloud, and because those services sit on the back-end database, we can run a lot of processing on the back end. Many of these services have extensive ML and AI capabilities: they can look at historic data and at things we've learned from various use cases to provide great answers for SQL insights, capacity planning, and similar things. So in Oracle Cloud, if you use Ops Insights today, you have these dashboards and components for SQL insights and capacity planning. What we've done in 24ai is take those components and embed them in the EM console as well. If you're using Enterprise Manager on a daily basis, and that's really where you live, so to speak, all you do is connect EM with the Ops Insights account or service you've deployed in OCI, and you can use those widgets embedded within your console. You don't have to go and say, oh, I need to start a separate browser session, I need to log into Oracle Cloud to see those things; they're embedded inside EM, and they give you exactly the same types of data.

First off, we do sync data from your EM repository into Oracle Cloud to be able to do these things anyway, so in this case there is a transfer of data. But secondly, because of the modernization that lets us embed JET components, we were able to do this, and over time we will add more dashboards and more of the capabilities we've built on the OCI side and enhance the EM 24ai components the same way. This is the first step, a first foray, into delivering some of our AI capabilities from the cloud into Enterprise Manager itself.

So what does that bring you? A much improved user experience, of course, because you can just ask a question in natural language. We also make it easier to send data from the repository to OCI, where we do a lot of the decision making and ML processing, and bring the result back to you locally, so you don't need any special GPUs or special hardware in your own data centers to do some of these things. And it makes you, the operator and the DBA, a lot more efficient, because finding something is just asking a question, not digging through different user interfaces trying to figure something out. With that, we're now going to do a demo of these two: Mughees will come up and demo the GenAI assistant and the cloud extensions, and then he will continue with the rest of all the cool stuff we have in Enterprise Manager 24ai. Thank you.

Thank you, Wim. Let me now quickly show you two quick demos of the AI features that Wim just talked about, starting with the GenAI assistant. This is the EM home page, a page I'm sure you're very familiar with, and on the top right-hand side is the icon for the new chatbot. Let's click on that, and it launches the interface. Here you can see we have two different tabs: as Wim explained, we have one LLM model that has been trained on EM metrics and EM telemetry, and another on the documentation. We're going to start off with the EM telemetry LLM.

I have a bunch of Exadatas on this system, so instead of going to the Exadata page and looking at the different machines, I'm just going to interact with this new smart assistant and see what it has to say. I'll enter in the message box: how is the health of my Exadata systems? It says, I have some findings for you; let's open the findings. This launches a dashboard with four charts on it: the first is database server CPU and memory, the second is disk IOPS and throughput, then database storage usage growth, and flash IOPS and throughput totals. Some of the charts, you might say, are more pertinent to the question you had in mind, and some may not be; that's where the fine-tuning of the LLMs comes in, it doesn't get everything right perfectly all the time, but in this case most of the information here is quite relevant. For example, this first chart is showing me the CPU and memory consumption of each database node that I have in my Exadata machines. It appears I have at least three machines: one in Boston, one in Chicago, and one in Phoenix. The size of the bubble corresponds to the number of cores on that particular node, so bigger bubble, more cores; smaller bubble, fewer cores. It seems I have some older Exadatas and some newer ones. And if you're wondering what these things mean, you can always click on the information icon.
It gives you more information about the chart. OK, so what we see here, on this particular chart at least, is that I have two nodes that are somewhat high on CPU and memory consumption. Let's click on this one; it gives you more information about the node: what the node name is, what database machine it's on, in this case x8mco1 is the name of the machine, and what the CPU and memory utilization is. Another bubble here is apparently on the same Exadata machine, x8mco1, and its CPU and memory are also high. So if I wanted to go further and ask, why are CPU and memory high on this particular Exadata machine, I can explore that.

Let me first pin this chart, so that the next time I ask a question this information stays where it is. I've pinned it; now I'll go back to the message box and enter: what is the performance of my databases running on x8mco1, the machine that seemed to be high on memory and CPU utilization? Let's take a look. This shows information about that specific Exadata machine, x8mco1. Right on the first chart I can see a spike in average total CPU usage per second on at least two instances; each line here corresponds to a database instance, and it looks like DBSOE1 and DBSOE2, so one database, two instances, and both instances have a spike in their CPU. The CPU time per second, which is essentially average active sessions, spikes as well; the DB I/O has gone down a little bit for these instances; here I can see my average active sessions, again with a spike; and this chart shows me the incidents and events on that particular Exadata machine.

Now, since we've seen the spike, let's explore further and see what's happening with this particular database, DBSOE, on my system. Before I do that, remember, I'll pin this chart again so it doesn't go away; I've pinned it, and when I ask the next question this information will stay where it is. So here you go: show me the performance stats of the database DBSOE. Let's take a look. Now I've gone from all my Exadatas, to one particular Exadata, down to one particular database on that Exadata, and I can see that indeed the average active sessions for this database have spiked, my database CPU time percentage is up, database CPU usage per transaction is also up, and this shows me a list of my top wait events by time.

If I now wanted to investigate further and see why I have these wait events, for example it seems like gc buffer busy waits are the top two wait events, I could use our other LLM model, the one trained on our documentation, to see how to fix those problems. So I'll click on the documentation assistant and enter: how to troubleshoot buffer busy waits. When I do that, it does the same thing: it matches my question, searches the documentation, and tries to give me the information most pertinent to my question. Here it's telling me exactly how to troubleshoot buffer busy waits, and it also points me to the link in the documentation, so I can go to the doc directly; it takes me to the right section, and I can see exactly what causes buffer busy waits and how to fix them.
So this gives you an idea of how these GenAI smart assistants completely transform the way you interact with EM: you don't necessarily have to go to each specific page and look at different charts and different tabs and so on; you just interact with the assistant, and it brings the information to you, as opposed to you going to the information. That's the first demo.

Next I'm going to show you the AI Cloud extensions demo. This is the database home page, a page I'm sure you've seen many, many times. What's new here is this tile called Capacity and SQL Insights, with a cloud icon next to it; that icon indicates that in this particular case the information is actually coming from our cloud services, right inside EM. Let's click on this tile. As we explained, there are two sets of cloud capabilities we have brought over to EM: one is capacity planning, the other is SQL insights. Here I'm on the capacity planning tab; I can see what the CPU trend for this database is, what the storage trend is, and what the memory trend is. And because we are able to forecast, we can tell you in how many days or weeks you will hit certain capacity limits; in this case, in two days you will be high on CPU, where high CPU is defined by default as average CPU utilization above 80 percent. When you're hitting your storage limits, that's shown here too, as well as any findings regarding SQL: it seems I have two degrading SQL statements in this particular database, and one has a plan change, which is also flagged here.

Let's click on the degrading SQL to see what's happening there. When I click on it, it shows me exactly what those two SQL statements are, what the database name is, what the command and the SQL text are, and the statistics: average latency, DB time, CPU, executions, I/O reads per execution, as well as plan changes; it seems this one has two plan changes and the second SQL has three. If at this point I wanted to drill down further into the SQL, it takes me to the cloud service directly: when I click on it, it brings up the interface to connect to my OCI cloud service, I enter my user credentials, and then it's a punch-out to the right part of that cloud service, showing me the information about that particular SQL statement. Here you can see a lot more detail: what the DB time per execution has been over the last few months, what the average latency per plan is (it's quite clear that when the plan changed, to this new plan beginning with 281, the average latency went up), what the I/O trend has been, and so on, and all the performance stats of the SQL are shown on the left side. So these are the AI Cloud extensions, where we bring information and capabilities from our cloud services directly into EM, so you as an EM user can benefit from them.
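To give a feel for the kind of projection behind "you will hit high CPU in two days", here is a deliberately naive sketch: a straight-line trend against the 80 percent default threshold mentioned in the demo. The actual Ops Insights forecasting models are far more sophisticated; this is illustrative only, and the sample history values are made up.

# Illustrative only: linear projection of when average CPU crosses 80%.
def days_until_threshold(daily_avg_cpu, threshold=80.0):
    """daily_avg_cpu: list of daily average CPU% values, oldest first."""
    n = len(daily_avg_cpu)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(daily_avg_cpu) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_avg_cpu)) / \
            sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None                              # not trending toward the limit
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - (n - 1))      # days from "today"

# Example: CPU creeping up roughly 2% a day over the last week
history = [66, 68, 71, 72, 74, 75, 77]
print(days_until_threshold(history))             # about 1.6, i.e. high CPU in ~2 days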
Now let's move on to the fourth key focus area in this release, which is performance and automation, and the first things I want to talk about are the Exadata and ZDLRA packs. Let's begin with the Exadata management pack. This pack has net new capabilities to monitor and manage a single Exadata or a fleet of Exadatas at scale, and with it you can optimize resource allocation and performance. We provide interactive dashboards at the fleet level, the Exadata level, and even the component level (VMs, hosts, databases, and so on) that show you resources consumed versus allocated, and that let you quickly pinpoint the specific servers across your entire fleet that need attention. We also have the Database Impact Advisor, which, as the name suggests, helps you identify noisy-neighbor problems on Exadata shared infrastructure, where one database may be impacting the performance of another database due to CPU or I/O competition. The Database Impact Advisor also recommends the optimal configuration for each database to minimize those problems.

Next we have advanced Exascale monitoring. This shows IOPS allocation across vaults, pools, and databases, and visually pinpoints the bottlenecks using Sankey-based data flow charts. We have also integrated with the IORM advisor in order to provide analysis of flash I/O resource utilization across all databases and determine whether the system has capacity for additional I/O load.

Another area where this pack helps is fleet-scale automation. EM now allows you to patch the Exadata infrastructure, including hosts, KVM guests, Xen domU, storage servers, InfiniBand, and so on. The pack also helps with AHF management, that is, Autonomous Health Framework management: it deploys the latest AHF release across the entire fleet of Exadatas, it schedules and configures compliance checks, and it automates Exachk runs and allows on-demand health checks for a streamlined maintenance process. It provides full hardware and software visibility across your fleet, with the ability to drill down into specific components or analyze the entire fleet. These configuration visualizations help you analyze and detect outliers, like outdated software or hardware, and together with fleet maintenance they also help simplify patching to ensure security and compliance. Finally, we have support for all deployment types: Exadata on premises, Oracle Exadata Database Cloud@Customer, and the Oracle Exadata Database Service, or ExaCS. And here's more good news: we've also backported the functionality of both packs to EM 13.5 RU23.

Let's talk about the ZDLRA management pack now. This pack automates the onboarding and backup management of databases to Recovery Appliances; things like configuration, backup scheduling, and validation are fully automated. As soon as a database joins an EM group, it inherits all the properties from the group, including backup settings. This approach not only enhances efficiency but also ensures consistency and reliability. If you have advanced backup topologies, for example you're backing up a primary database to one appliance and the standby to a different appliance with replication between those units, the new pack handles these configurations seamlessly. There are more capabilities that provide fleet-scale automation and hardening, such as AHF monitoring that is automated for ZDLRA systems, and fleet-scale backup and archival management that is also built into this pack and based on Oracle-recommended best practices, so by using the pack you are automatically following Oracle blueprints. The ZDLRA management pack also offers workflows designed to efficiently execute and oversee archival to cloud or tape.

We have done a lot of enhancements in the monitoring area, as you can see in the word cloud on the right; let me just mention a few of them. First, enhanced dynamic runbooks. Dynamic runbooks are used to encapsulate best-practice procedures to resolve incidents. In EM 24ai, dynamic runbooks are enhanced to cover not just incidents and metrics, which were already supported, but all other EM artifacts, like jobs and deployment procedures, as well; so now you can create a runbook on any EM artifact. We have also modernized the Incident Manager UI: in this release it has been redesigned for a more modern look and feel and enables more efficient triage and resolution of incidents.
It highlights things like critical incidents on production systems, long-running incidents, or escalated incidents, so you can quickly identify and address them.

Another major enhancement is seamless PDB relocation. Monitoring of PDBs relocated from one CDB to another CDB now continues uninterrupted: the target GUID in EM stays the same, and all history and monitoring settings, such as metric extensions, thresholds, and so on, are carried over when the PDB relocates. So now you can relocate PDBs without any disruption in monitoring. The last thing I want to mention here is Globally Distributed Database: you can now monitor the availability of all shards and replication units in a Raft-based globally distributed database, which was not available before, and you can monitor things like transport lag and apply lag times between the RU leader and its followers, and so on.

Moving on to database management, there is much to talk about here, but let me cover three specific things. First, AWR for Active Data Guard support in EM. Many customers run read-only or reporting workloads on Active Data Guard standby databases; this lets them fully utilize idle resources and take advantage of the complete data set on the standby servers, maximizing their ROI. Before Oracle Database 23ai, managing AWR in Active Data Guard environments was, let's say, a bit challenging: it required setting up bidirectional database links, SPFILE changes, and manual intervention to switch AWR snapshot collection every time an ADG switchover happened. In 23ai we introduced a new implementation that automates all of that configuration and manual work by leveraging the existing, and very robust, redo log channel. That means no more setting up or managing database links during failovers, and best of all, the feature is enabled by default. EM 24ai supports AWR for ADG environments using this new Database 23ai implementation, and as a result, the same performance diagnostics methodology you use today for primary databases can be applied to read-only workloads running on ADG databases.

The next feature I want to highlight is session SQL history. Before Oracle Database 23ai, DBAs and application developers could only track long-running SQL statements, typically five seconds or more of CPU or I/O; the session SQL history feature addresses that limitation. In EM 24ai, session SQL history is shown as a tab in Performance Hub, so if you are a DBA or application developer, you can now see the recent history of all the SQL statements issued by a particular user or session for performance troubleshooting.
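For readers who want to poke at this outside Performance Hub, here is a hedged sketch of the kind of query that could sit behind that tab. It assumes the V$SQL_HISTORY view and the SQL_HISTORY_ENABLED parameter associated with the 23ai SQL history feature; both names, the connection details, and the session ID are assumptions to verify against the 23ai documentation for your release before relying on them.

# Hedged sketch only; requires: pip install oracledb, and a 23ai database.
import oracledb

conn = oracledb.connect(user="perf_user", password="...", dsn="dbhost/freepdb1")
with conn.cursor() as cur:
    # The feature has to be switched on; parameter name assumed, verify in docs.
    cur.execute("ALTER SESSION SET sql_history_enabled = TRUE")

    # Recent statements issued by one session; selecting * avoids guessing
    # individual column names, which may vary by release.
    cur.execute("SELECT * FROM v$sql_history WHERE sid = :sid", sid=123)
    columns = [d[0] for d in cur.description]
    for row in cur.fetchmany(20):
        print(dict(zip(columns, row)))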
Now, a couple of words on AWR Explorer. This feature provides an interactive, visual interface for DBAs to conduct in-depth analysis of AWR data: visualize performance trends, detect anomalies, and compare AWR data across different time periods and even across databases. The AWR deep-dive analysis that the Explorer enables covers a great many performance statistics and wait events, and gives you much more detailed insight than what is available in an AWR report today.

So that gives you a quick rundown of the new features we've added in EM 24ai. There are many benefits to the features I've just described: first of all, they'll help you improve your SLAs; with them you can harden your security and compliance; you can troubleshoot faster; you can improve the consolidation density and performance of your database fleets; and you'll get higher operational efficiency. That brings me to the conclusion of my presentation. I hope you have enjoyed listening to what we had to say, and I'm happy to answer any questions that you have.