Dell DPA Analysis Engine
Good day, everyone, and welcome to DPA Academy again. In today's edition we'd like to talk about DPA's analysis engine and how you can adapt it for something like cyber threat anomaly detection, simply by using what's out of the box, what's already within the DPA UI.
DPA has always had an analysis engine. It is used to analyze the data that has already been collected in the DPA datastore, and it can act upon the data as it is received from the DPA agent or agents. The analysis engine uses analysis policies, which you can assign to a single object in DPA or to a group of objects, whether that is a standard group or a custom group that you create in the DPA UI. An analysis policy is made up of rules, and these rules contain the logic and data variables that the analysis engine acts upon when they change state. Thus, the analysis engine is a stateful engine: it acts upon data changes, that is, state changes in the variables called out within the rule or rules. It does not run reports continuously. It alerts you when a rule detects a threshold breach or when the logic in the rule or rules gets triggered. So the analysis engine does not produce reports by default, but it can do so if a report is part of an alert or trigger.
You do not need to visually inspect a report, check emails, or inspect a dashboard in order to detect an anomaly or threshold breach. The analysis engine takes it one step further: it alerts you only when it happens. When it alerts, the alert is automatically displayed in the Alerts tab of the DPA UI, so there is a record of it, and it can also trigger the external alerts defined by the rule or policy.
These can be one or more of the following; that is, you can have all of these or any combination of them for a particular rule or policy. You can send an SNMP trap to your favorite event management system. You can send an email to someone, to a group, or to a distribution list. You can execute a local script provided by the customer or the user, for example a script that takes the data and sends a text message to a distribution group. Or you can create a Windows event log entry, for those systems that continuously scan Windows event logs for actions.
So, as a use case example, one could set up a cyber threat anomaly detection policy with rules to detect data changes that could indicate a cyber-attack in progress. Now remember, when I say data changes, we're talking about the metadata about the job, not the actual data within the backup job. So one could look at various attack vectors and select the appropriate DPA rules to detect those types of changes.
For example, take the cyber-attack vector of ransomware. It usually works by encrypting your data, and when your data is encrypted it becomes very unique. Therefore, when you write to a deduplication target like the PowerProtect DD, there will suddenly be a lot of unique data that has to be pushed through to the Data Domain. You would want to pick that up by comparing the average amount of data sent to the Data Domain with what was sent last night, or in the last backup. If there is a huge deviation, then something is going on. There is a possible anomaly.
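To make the idea of a deviation from the average concrete, here is a minimal sketch in Python. It is not DPA's implementation; the function name, the sample sizes, and the units are assumptions made for illustration. The two-week history and the 50% deviation mirror the rule defaults quoted later in this walkthrough.

```python
from statistics import mean

def is_size_anomaly(history_gb, latest_gb, deviation_pct=50.0):
    """Return True when the latest backup exceeds the historical
    average by more than deviation_pct percent."""
    if not history_gb:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(history_gb)
    if baseline <= 0:
        return False  # avoid dividing by zero on an empty baseline
    growth_pct = (latest_gb - baseline) / baseline * 100.0
    return growth_pct > deviation_pct

# Hypothetical nightly sizes in GB over two weeks (the average is 100 GB).
history = [100, 104, 98, 101, 99, 103, 100, 97, 102, 100, 101, 99, 98, 98]

print(is_size_anomaly(history, 105))  # False: only 5% above the average
print(is_size_anomaly(history, 400))  # True: 300% above the average
```

A check like this only looks at job metadata (sizes), which matches the point above that the analysis is about the metadata of the job and not the backed-up data itself.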
Or you could look at the file system utilization on the Data Domain, because all the user data on a Data Domain gets written to the data file system. If this file system utilization, that is, your PowerProtect DD utilization, suddenly hits 80%, then something could be writing a lot of unique data to your PowerProtect DD, which could indicate a cyber threat in progress.
Another possible cyber-attack vector is an insider attack or remote execution, where someone takes control of your backup appliance and disables it or misconfigures it somehow so that you're not able to take backups. In other words, you don't have a backup to fall back on. So we could use a rule to detect backup application configuration changes: whether some tape drives were disabled, whether some backup devices were disabled, whether some schedules were disabled, or whether clients were disabled or removed.
Another cyber-attack vector is unavailability, or exposures. Here the modus operandi is to restrict your ability to do DR. You want to make sure that your data is being backed up and that you are able to recover, so you look at your exposures. For example, three strike failures: do you have a client that has been failing for three days in a row? Are many backup devices unavailable? Or perhaps, in the case of NetWorker, has no NetWorker bootstrap been generated in the last 24 hours? Trigger an alert and tell me when these things happen.
So let's move straight into the UI. How do we do this? So in the DPA user interface,
you go to the analysis policy section and you create a policy. Let's give it the name Cyber threat anomaly detection. That's just naming the policy. You can see that at policy level I can set these action triggers if I wish, but I can also set them at rule level.
Now let's look at what types of rules we want to add into this policy. You go to add/remove rules and you'll see a list of all the standard rules out of the box. On the left-hand side, you can filter using the filter funnel icon if you wish. For example, I'm looking for something that's related to full backup. There we go. I'm looking for the rule for a full backup larger than the average for the time window, and I will add it into my policy. If I say OK to that, you'll be prompted with the variables that are associated with this particular rule. The days of history is two weeks. So in other words,
have a look at the last two weeks and see what the average size of this full backup is. A deviation of 50% means it has grown significantly, so we'll leave that as the default. If you wish to tweak these values higher or lower, you may do so. That's entirely up to you. You can see here that we've added this rule in.
I can also add another example, another use case scenario for cyber threat analysis. Let's look at the bootstrap. And there we have it. Same thing as before: you can select it and add it in. The variable here is how many days you would consider as having no bootstrap. In this particular case the default is two days. So a minimum of 48 hours,
Anything over that is a problem. And of course you can tweak it up or down if you wish. I'll just leave it at the default. You can repeat this process to add more and more rules with the use case of cyber threat anomaly detection in mind.
Just to give you an example of how you can set actions for when such an alert or breach is detected: you can highlight the rules and click on edit actions. If you say no actions, it'll do nothing except log the alert in the Alerts tab of the DPA UI. Otherwise, you choose policy-based actions or rule-based actions. Rule-based means the action is particular to this rule. You can send an email, provided you have an SMTP mail server defined. You can type in actual email addresses, but of course the better way to go about it is to email an Exchange or Outlook distribution group. You can run a script, but obviously you'll have to copy that script onto the DPA application server. The alert passes predefined variables to the script that it can use. You can see here that it passes at least 11 variables through to the script.
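As a rough illustration of what such a script could look like, here is a minimal Python sketch. The variable names, their order, and the idea that they arrive as positional command-line arguments are all assumptions; this talk does not list the variables DPA passes, so check the product documentation for the real ones.

```python
import sys

# Hypothetical field names. The real variables DPA passes, and their
# order, are not given in this talk.
FIELDS = ["policy", "rule", "severity", "node", "message"]

def build_text_message(argv):
    """Map positional arguments to named fields and build a short
    notification text. Missing fields are reported as 'unknown'."""
    values = dict(zip(FIELDS, argv))
    for name in FIELDS:
        values.setdefault(name, "unknown")
    return "[{severity}] {rule} on {node}: {message} (policy: {policy})".format(**values)

if __name__ == "__main__":
    # A real script would hand this text to an SMS gateway or a
    # ticketing system instead of printing it.
    print(build_text_message(sys.argv[1:]))
```

Keeping the message building in its own function makes it easy to test the script by hand before copying it onto the DPA application server.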
So you can create a script to parse the variables and do with them as you please: generate a support ticket within a ticketing or event management system, call a mobile application to send a group text, et cetera. It's a very generic way of doing it. You can also generate an SNMP trap and send it to your favorite event management system in your environment. You have the ability to send it to multiple SNMP destinations and to customize each one. And of course, you can write an event to the Windows event log if you wish.
As you can see here, I have ticked all four. In other words, whenever this rule is triggered, trigger all of these actions at the same time. You could do that if you wish. I'll leave this one out so that I can save it for now. Then you can see that the action is rule-based, and you can see all the rule parameters, when it was last updated, and all these good things in the DPA UI.
So the aim here is to create a collection of rules in the analysis policy to detect a cyber threat that may be in progress. You can look at deduplication rates and compression ratios within the PowerProtect Data Domain. You can look at the job data, you can look at the Data Domain data, et cetera. So there are various aspects that you can ask DPA to analyze, and to trigger on where there is a state change.
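The idea of triggering on a state change, as opposed to reporting continuously, can be sketched as follows. This is an illustrative Python model only, not DPA's internals; the class name and the sample values are made up, and the 80% threshold is borrowed from the file system utilization example earlier.

```python
class StatefulRule:
    """Fires an alert only when the rule's condition changes from
    not breached to breached."""

    def __init__(self, name, condition):
        self.name = name
        self.condition = condition  # callable: sample -> bool
        self.breached = False       # last known state

    def evaluate(self, sample):
        """Return an alert string on a change into the breached state,
        otherwise None."""
        now_breached = bool(self.condition(sample))
        fired = now_breached and not self.breached
        self.breached = now_breached
        return f"ALERT: {self.name} (value={sample})" if fired else None

# Hypothetical utilization samples in percent.
rule = StatefulRule("File system utilization high", lambda pct: pct >= 80)
alerts = [rule.evaluate(pct) for pct in [62, 71, 83, 85, 78, 81]]
print([a for a in alerts if a])
```

With these samples the rule fires at 83 and again at 81, but not at 85, because the state had not changed since the previous sample.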
Now that you have created an analysis policy and assigned some rules to it, you also have the ability to go and disable rules within that policy. If you are new to the DPA analysis engine and new to setting up this kind of thing, the recommendation is to enable just one rule at a time. Enable the first one and, if it works as you wish, then start enabling the second one. Don't enable them all at the same time, or you may get flooded with emails and alerts, because not every single rule is applicable to every single environment. Some environments have more failures than others, which may be an accepted norm. Some customer environments have a lot of backup application changes per day. So it really depends on the customer environment itself. The recommendation is to enable one rule at a time and then test it.
So in this particular example, I've deselected all the others and just left this first one on. Once you have done that, there's one last step. Although you've defined the analysis policy with the rules in it and selected which rules you want to enable, you still need to let DPA know which objects to run this policy, the cyber threat anomaly detection policy, on. That's when you go to the applied analysis policies tab. On the left-hand side, DPA will show you the group structure, the tree structure as I like to call it, of everything that you have in your environment right now.
As always, DPA by default creates a configuration folder with everything in it. This is ideally meant just for DPA configuration purposes. For reporting purposes, customers are recommended to create their own folder structures, their own tenancy groups. So you could create a folder like company group, and then put a folder in there with all your Data Domains, your PowerProtect DDs, and all your NetWorker servers. You can split it even further and create a group to say, these are my production critical hosts, another folder for mail servers, and another folder for development servers. It doesn't have to be servers. It could also be a list of clients, so the policy doesn't have to scan the entire backup server.
You can actually group hosts together, or backup clients together, and have the anomaly detection run just on those. If your analysis policy has a mixture of backup application metadata analysis and Data Domain metadata analysis, you could apply the policy at this level, because you've got Data Domain objects and NetWorker objects below it, in which case it'll apply to both. Or you could split the cyber threat anomaly detection into two policies, one for analyzing Data Domain and one for analyzing NetWorker servers. Both ways will work. Once you've decided that this group is eligible for cyber threat anomaly detection, it's just a matter of turning on the policy and applying it. You don't have to restart DPA, you don't have to restart engines, you don't have to do anything of the sort. It happens on the fly.
If you want to switch policies, you can do that as well. If there's a policy currently assigned and you want to change it, you can change it to a different policy. And if at any time you want to switch it off, it's just a matter of switching it off.
The key thing that I want to point out here is to be cognizant of the rules that you are using. If the rule relates to backup metadata, where you're actually analyzing job data, you'd apply it to backup application objects like NetWorker and Avamar. If you're looking at a rule for Data Domain, then you can just scan Data Domain. And when you ask DPA to look at backup application data, for example NetWorker, there is also a prerequisite: we have to get the data from the backup application that enables us to do that level of detection. In the example where I spoke about size and size scanned, and detecting changes in the amount of changed data going to the Data Domain, that's only possible if we get size and size scanned from the backup application. Now, we do get that from NetWorker and from Avamar, and we're busy testing it for PPDM as well. But third-party backup products like NetBackup either do not expose that level of detail to us, or DPA doesn't collect it just yet. So that kind of rule may not work for third-party backup applications.
You can still use the file system utilization high rule for Data Domains in a NetBackup environment. So you could set up cyber threat anomaly detection using DPA just for the Data Domains and look for anomalies from the Data Domain perspective. You can also still do backup application configuration changes and three strike failures on NetBackup, as an example, but most of these rules are for Avamar and NetWorker.
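For the three strike failure check mentioned here, which was described earlier as a client failing three days in a row, the logic can be sketched in a few lines of Python. The function name, the client names, and the result labels are hypothetical, and this is not how DPA stores or evaluates job results.

```python
def three_strike_clients(results_by_client, strikes=3):
    """Return the clients whose most recent `strikes` daily results
    are all failures."""
    flagged = []
    for client, results in results_by_client.items():
        recent = results[-strikes:]
        if len(recent) == strikes and all(r == "failed" for r in recent):
            flagged.append(client)
    return sorted(flagged)

# Hypothetical daily backup results, oldest first.
daily_results = {
    "client-a": ["success", "failed", "failed", "failed"],
    "client-b": ["failed", "failed", "success"],
    "client-c": ["failed", "failed"],
}

print(three_strike_clients(daily_results))  # ['client-a']
```

Only client-a is flagged: client-b recovered on the last day, and client-c has only two days of history.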
So that said, let's go through a couple of tips. Start with one rule and build up the policy. Don't enable all the rules at the same time to begin with, otherwise you may be flooded with alerts. Not every single rule will work well for a particular customer environment. If there's a high rate of change or a high rate of failure that is considered normal, that may produce a lot of false positives. So test these rules out one by one first, and add one rule at a time to build the policy that is right for your environment.
Just to reiterate what I said earlier, this is not new functionality in DPA. It already exists. The rules related to backup metadata primarily apply to NetWorker and Avamar, not to third-party backup products yet, and PPDM is still being tested for this use case at the minute. If you enable a lot of rules and apply them to a lot of objects, please also note that this may increase the DPA application server's resource requirements, both in terms of CPU and memory.
So you may want to review your DPA server sizing before you embark on this. Thank you.
2022-12-10 22:20