How AI Is a Game Changer for Software Testing ASTQ Tech Track


So the first session today, which I will welcome you to join, is Igor Kirilenko talking about how AI is a game changer for software testing. So, Igor, I'm going to change presenter to you and let you share your desktop.

Hey everyone, it's great to be here as always. Let me share my screen, and I hope you can see it now. Yeah, we gotcha. Okay, today's a very exciting topic: we're going to talk about AI and how it can be applied to software testing practices to bring value right to the customers. When I'm talking about AI, I usually mean a broad definition of it: the ability of computer systems to exhibit human behavior in the form of reasoning, thinking, and learning. Learning is usually based on observation and collection of data, and the most important part here is the ability to change the behavior of the system based on what the system has learned. It's usually associated with machine learning, and the AI and machine learning terms are used together very often.

Before we get into that, I want to spend a few words on the Parasoft continuous quality platform. That's our offering, which provides a lot of different tools and testing practices to help our customers improve the quality of their software and assess the quality of their software. They can use static analysis for that purpose, running it against the code even before it's been compiled. They can use unit testing, obviously, and for functional and non-functional testing we offer our SOAtest solution, which focuses on APIs; we also support mobile testing, UI-based testing, and so on. Service virtualization is a big part of that, because it allows you to decouple your software from external services to improve your testing capabilities. You can read all about our products and offerings on our website.

Today I want to emphasize three attributes of the platform. It's open, because our tools integrate with open source frameworks and you can run them as part of CI/CD pipelines through Jenkins or, for example, cloud-based solutions such as GitLab, GitHub, or Azure DevOps. It's intelligent, and that will be the focus of today's discussion about AI and how the system can behave in a smart way. And it's actionable, because at the end of the day the system helps developers and testers improve the quality of their software through recommendations, through fixing some problems, and through suggestions for how the quality can be improved.

Probably the best way to look at all these different testing practices, since we're talking about test automation, is through CI/CD pipelines; that's where traditionally you want to see all your testing happen, as part of your pipelines. Traditionally we start on the very left, or, if you're talking about the test pyramid, at the very bottom, because this is the foundation of quality: execution of static code analysis. It helps detect security issues and a lot of different reliability issues, for example race conditions and error handling problems. It helps software development organizations enforce security standards such as OWASP, CWE, and CERT, and another interesting side effect is that it pushes development teams to follow the best development practices; in general it's a great technique for enforcing conventions and detecting errors. It has become standard for safety-critical industries to adopt static analysis for compliance with different standards.
But you can see that it's becoming very common in other verticals too, the financial sector for example, or e-commerce, because security has become a big concern, and static analysis helps detect those problems in a proactive manner.

The challenge, though, when software development organizations try to use static analysis is not the integration and how to start using it; it's really easy to integrate into your CI/CD. But when they turn on those rules, and very often that happens in the middle of a development cycle, maybe even closer to release time, the developers get overwhelmed when, for example, a thousand different violations suddenly start popping up. Now they might be confused: how are they going to fix them, and which of those violations should they start working on first? That's where we have multiple techniques from the AI perspective that help them address that particular problem and start working on those violations in an efficient manner.

The first investment we made from the AI perspective was to apply classification techniques to the violations reported against the source code. We use classification here to observe the violations that were previously fixed by developers inside the system, and based on that we can predict which violations coming into the system in the future need to be fixed or suppressed, either way. Let me give you an example. If you have a developer in your organization who doesn't follow the best development practices, and he introduces a lot of bad code that produces many different violations, at the end of the day those violations get fixed, and the system learns to associate with that developer that any future violations produced by that particular person most probably need to be fixed as well, so it highlights them with a higher probability that they need to be addressed. At the same time, if you have some rule that is regularly executed, and the development team doesn't care much about its violations and mostly suppresses them, eventually the system learns that violations coming from that particular rule are probably low priority and might preferably be suppressed in the future. In this way, the classification technique allows us to filter out what we call noise and highlight those incoming violations that most probably need to be addressed by the development team.
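To make the classification idea concrete, here is a minimal sketch, not Parasoft's actual implementation, of learning fix-versus-suppress decisions from historical triage data; the feature names and records are invented for illustration:

```python
# Minimal sketch: learn from historical triage decisions which new static
# analysis violations are likely to be fixed vs. suppressed.
# Rule names, authors, and data are illustrative, not a real rule set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Historical triage data: one record per past violation.
history = [
    {"rule": "NPE.CHECK", "author": "alice", "severity": 1},
    {"rule": "NAMING.STYLE", "author": "bob", "severity": 4},
    {"rule": "NPE.CHECK", "author": "bob", "severity": 1},
    {"rule": "NAMING.STYLE", "author": "alice", "severity": 4},
]
labels = ["fixed", "suppressed", "fixed", "suppressed"]  # what the team did

vec = DictVectorizer()
X = vec.fit_transform(history)          # one-hot encode rule/author features
clf = LogisticRegression().fit(X, labels)

# Score incoming violations: surface likely "fixed", de-prioritize the rest.
incoming = [{"rule": "NPE.CHECK", "author": "carol", "severity": 1}]
proba = clf.predict_proba(vec.transform(incoming))[0]
print(dict(zip(clf.classes_, proba.round(2))))
```

With enough history, low-probability "fixed" predictions are exactly the noise the talk describes filtering out.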
The next AI technique we use is called matrix factorization. Basically, we again observe which violations each developer is fixing, and then we create a profile associated with each developer, capturing the types of violations they work on; you could say we learn about their skills. Now, when new violations come into the system, we can automatically suggest which developer, based on his past experience, is most suitable to address them. Think about this approach like Netflix: when you watch movies, you like some of them and not others, the system learns from that pattern, and it suggests new movies for you to watch in the future. It's a very similar approach here. The idea is that the flow of violations coming into your system can be distributed across your development organization based on skills and on what people did in the past.
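As a rough illustration of the matrix factorization idea, the sketch below factors a hypothetical developer-by-violation-type fix-count matrix with a truncated SVD and scores an incoming violation against each developer profile; all names and numbers are made up:

```python
# Minimal sketch of recommending an assignee via matrix factorization:
# factor the developer x violation-type "fix count" matrix and score an
# incoming violation type against each developer profile.
import numpy as np

developers = ["alice", "bob", "carol"]
rule_types = ["null-deref", "resource-leak", "naming", "concurrency"]

# How many violations of each type each developer has fixed historically.
fixes = np.array([
    [9, 1, 0, 7],   # alice: strong on null-deref / concurrency
    [0, 8, 6, 0],   # bob: resource-leak / naming
    [2, 2, 1, 3],   # carol: generalist
], dtype=float)

# Low-rank factorization (truncated SVD) captures latent "skill" factors.
U, s, Vt = np.linalg.svd(fixes, full_matrices=False)
k = 2
profile = U[:, :k] * s[:k]      # developer embeddings
item = Vt[:k, :]                # violation-type embeddings

new_violation = rule_types.index("concurrency")
scores = profile @ item[:, new_violation]  # affinity of each dev to this type
print(developers[int(np.argmax(scores))])  # -> likely "alice"
```

A production system would use richer features and implicit-feedback factorization, but the shape of the computation is the same: developer and violation-type embeddings whose dot product ranks candidate assignees.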
Another enhancement we made helps users group violations according to root cause analysis. We analyze the huge space of violations that exist in your product, and we detect common problems that might be associated with one single line of code, but where that particular problem is exposed in tens or even a hundred violations. To give you an example: suppose you have a problem with the initialization of one of the objects in a class. Then everywhere in your source code where you reference that particular object, you'll probably start getting null pointer exceptions or some other problems. So the exposure of that problem is visible through multiple different violations, but the root cause is just one single line of code. What we do here is detect those hotspots and assign them to individual developers. The benefit is that instead of people jumping in and trying to address hundreds of those violations at once, they look at this one particular root cause, find that particular line of code, and address it, and that fixes the entire class of violations associated with the hotspot. Obviously that gives a high return on investment for the development organization.

Another technique, which we consider very advanced and which is still work in progress, is based on leveraging neural networks. What we're trying to do here is build clusters of violations based on the similarity of the code surrounding those violations. The idea and principle is that when developers work on a violation, they always try to fix it in the context of the code, whether that's the context of the method or the entire class, so the developer learns about the semantics of the code around it. When the problem is fixed, we want that developer to start addressing another violation, but within the same mental context of the code he just learned. So we try to find similar code in the entire project space; it doesn't have to be exactly the same, because obviously we're not looking for exact duplication, but semantically close enough, so that when we suggest the next violation to fix, it will be within the same understanding of the paradigm and semantics of the code itself. It's a very difficult problem to solve. We have to perform vectorization of all the methods inside the project, and each method gets associated with what we call a signature, a representation of the semantic meaning of the code. Now we can compare all the vectors with each other and find those that match the closest, and that's how we find the next violation to recommend, based on the similarity of those vectors. Today we're using an open source neural network for this, but we're looking into other transformers that might give better signatures, better vectors, or improved quality, maybe with a smaller network, because we're sensitive to how much we need to deploy on users' machines, and based on performance we're looking for something that will do it very fast.
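Here is a toy version of the similar-code recommendation, using TF-IDF bag-of-tokens vectors as a crude stand-in for the neural method embeddings described above; the method bodies are invented for illustration:

```python
# Minimal sketch of "fix the next violation in similar code": embed each
# method body as a vector and pick the candidate whose surrounding code is
# closest to the one just fixed. TF-IDF stands in here for the transformer
# embeddings described in the talk.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

methods = {
    "parseHeader": "if buf is None: raise ValueError; return buf.split(',')",
    "parseFooter": "if buf is None: raise ValueError; return buf.rsplit(',')",
    "renderChart": "for p in points: canvas.draw(p.x, p.y)",
}

names = list(methods)
vecs = TfidfVectorizer(token_pattern=r"\w+").fit_transform(methods.values())

just_fixed = names.index("parseHeader")  # developer fixed a violation here
sims = cosine_similarity(vecs[just_fixed], vecs).ravel()
sims[just_fixed] = -1                    # don't recommend the same method
print(names[int(sims.argmax())])         # -> "parseFooter" (similar context)
```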
If I combine all those techniques together, specifically to address this problem of handling a big number of violations coming into the system, we can consider a workflow like this. Let's assume five thousand violations come into the system. The first thing we do is analyze them from the classification perspective, trying to filter out those we consider lower priority. Then we cluster them to detect hotspots, and we perform the matrix factorization to assign the violations we treat as higher severity to developers based on their skills. Now, every time a developer goes and fixes a violation we recommended, we perform the code similarity analysis and find similar code with violations (the violations themselves may differ) for the user to focus on next. All these techniques working together help us optimize developer performance, so they focus on fixing the issues considered highest priority. So that's what we're doing at the level of code analysis.

Now, the next testing technique that is widely adopted is obviously unit testing, and its value is well recognized: it helps identify regression issues, protects your code from unexpected changes breaking behavior, and reduces the risk of those changes. And very often we know that unit tests help you write a better code design, right? If you follow good patterns and focus on making the code testable, it usually results in simplification of the structure of your code and your architecture, and generally produces a better design. Very often unit tests are also associated with a specific code coverage level you need to reach, and at a higher level with tracing those quality characteristics back to your requirements.

Unit testing is common practice, but it's not an easy one, because creating meaningful unit tests means the test probably needs to be isolated, you'll probably need to use some mocks, you need assertions to assess the result of executing the test, you need to reach a certain level of code coverage, and then there's the problem of maintaining those tests over time as the code continuously changes. Overall it requires a lot of effort on the developer's side, it's time consuming, and it can be overwhelming; we know it can be up to 40% overhead for developers to continuously invest in unit testing.

So here we apply another set of AI techniques, specifically targeting unit testing. It doesn't matter whether developers are focused on creating brand new unit tests, targeting a specific line of code in terms of coverage, mutating existing tests and adding more to cover additional blocks, or even creating a bunch of new tests for legacy code, generated all at once. First we build a sophisticated model that understands all the sources of your project, and we combine it with our flow analysis engine, which has the capability to discover all possible execution paths through your code. Associated with every path are constraints; think about if conditions, switch conditions, and so on, which control the flow of execution. With every path, which we call a recipe, we associate a number that represents the complexity of the code, what coverage can be achieved by going down that path, and maybe how much mocking you would need to implement, and based on that scoring system our intelligent engine picks the recipes we think will give the biggest benefit to the customer. We resolve all the constraints associated with those selected paths: we discover which variables need to be initialized as input parameters to your methods, or, for example, how your mocks need to be initialized to specifically route execution of the test through that chosen path. Through that, we generate the unit test with all the initialization values, and then in the background we automatically run it; the purpose of that is to measure coverage, do some test optimization, and auto-generate assertions. So the end result is an automated way to create unit tests.

We strongly believe that autonomous unit test generation is the way to go, but we know that state-of-the-art AI today might not give you 100% code coverage; it really depends on the complexity of your code. If you rely only on autonomous test generation, you might hit a threshold above which you cannot go unless you, as a developer, start investing in writing unit tests yourself on top of what was already generated. That's why we believe the end game here is a combination of autonomous unit test generation with some kind of assisted test creation technology that can guide the user: help them clone tests, create new mutations, generate mocks in an automated fashion, and help generate assertions. At the end of the day, the combined approach allows developers to achieve a higher level of code coverage; it might not be 100%, but it could be 80-90%, depending on the policy of the development organization. And we don't only focus on unit test generation: as I mentioned, we measure the coverage associated with the tests and then apply optimization techniques, because some of the unit tests might not bring much value; they may be exact duplications of execution pathways that were exercised before. So we try to optimize by removing those redundant tests. That's one of the ways to increase the maintainability of your testing infrastructure and reduce the cost associated with maintaining your tests, by removing the ones that duplicate others.
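That redundancy-removal step can be pictured as a greedy pass over per-test coverage data: keep a test only if it covers at least one line that nothing kept so far covers. This is a simplified sketch with invented coverage data, not the product's actual optimizer:

```python
# Minimal sketch of pruning redundant generated tests: greedily keep tests
# that add new line coverage and drop those whose covered lines are already
# covered by the kept set.
coverage = {
    "test_parse_basic": {10, 11, 12, 13},
    "test_parse_empty": {10, 11, 14},
    "test_parse_again": {10, 11, 12},   # adds nothing new -> redundant
    "test_render":      {40, 41},
}

kept, covered = [], set()
# Consider the widest tests first so small duplicates fall out naturally.
for name, lines in sorted(coverage.items(), key=lambda kv: -len(kv[1])):
    if lines - covered:          # contributes at least one new line
        kept.append(name)
        covered |= lines
print(kept)                      # test_parse_again is dropped
```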
The next level of testing we promote, shifting now to end-to-end or functional testing, is API testing. API testing is very good in terms of how fast it can be executed, you get a quick response from the system, and it's very reliable, and here we offer AI and machine learning techniques that help create those tests in an automated fashion based on observation of usage patterns. I'll talk about this in a little more detail. Of course, part of the value of API tests is also that you can reuse them for load and performance testing, or even for penetration testing. I'm not going to cover that part, but if you stay with us, the third session today will be about this particular area of functional testing, where we'll talk about APIs, performance, and penetration testing, and about service virtualization, which works really well together with it because it helps decouple your program under test from external services by simulating the behavior of those services. Right now I just want to focus on how we help developers generate those API tests, and for that purpose we use smart API test generation technology.

The way it works: first of all, here's where the problem is. Traditionally, when you're testing your application through the UI, you don't really think about which APIs are used by the front end underneath; the business logic of that API usage is built into the front end, you simply execute actions in the UI, and all the magic happens behind the scenes. But when you're focused on creating API tests, you may read the documentation and understand the description of each API method, yet still be puzzled about how to combine them together into a functional flow, and that information might not be straightforward or obvious, especially if you're talking about very complicated systems with hundreds of APIs. So what we do here is help users by intercepting the traffic that comes from the UI; it doesn't have to be the UI, it can be another external system, as long as we have access to the traffic. We build models that understand all the inputs and outputs, all the parameters that are sent to the system under test and the results returned back. We can do parameterization of those parameters, understanding their meaning and how they've been used, and based on that information we know how to generate those API tests, totally decoupled from the UI. At that point you have tests that can be executed independently, and from the application's point of view they exhibit the same behavior as when you drove it through the UI. After that, you can change the behavior as much as you want, because you have your base of API tests created for you.

In practice, we have a plugin for your Chrome browser that detects the traffic coming from the browser during UI test executions or manual test executions, but again, it doesn't have to be just UI. We build the models that capture the behavior of the APIs and automatically generate those tests for you inside SOAtest. Obviously, the more you use the system, the more we train our models: we capture more advanced scenarios and can then generate more advanced tests as a result. We've also made some enhancements specifically tuned for testing packaged solutions such as Salesforce and Guidewire, so you can take advantage of that. The overall concept behind this is record and replay, where AI is focused on building the tests for you, and then you can replay them as many times as you want.
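Conceptually, the generated API tests boil down to replaying captured traffic directly against the service, decoupled from the UI. Here is a minimal sketch under the assumption of a simple captured-request format; the endpoints and format are invented for illustration and are not the actual SOAtest recorder format:

```python
# Minimal sketch of turning recorded UI traffic into standalone API tests:
# replay each captured request directly against the service and assert the
# status. The capture format and URLs are hypothetical.
import requests

recorded = [  # what a browser/proxy recorder might have captured
    {"method": "POST", "url": "https://api.example.com/login",
     "json": {"user": "demo", "password": "demo"}, "expect": 200},
    {"method": "GET",  "url": "https://api.example.com/orders?user=demo",
     "json": None, "expect": 200},
]

def replay(entry):
    """Re-issue one captured call, decoupled from the UI that produced it."""
    resp = requests.request(entry["method"], entry["url"], json=entry["json"])
    assert resp.status_code == entry["expect"], resp.text
    return resp

for entry in recorded:
    replay(entry)   # each entry is now an independent API test
```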
Even though we promote API testing as one of the fundamental functional testing techniques, you still cannot move forward without testing your UI, right? It gets de-emphasized because, as you know, UI tests can be flaky, they're sometimes not stable, and they take much longer to run, but because there's a lot of business logic in the front end, you kind of have to test it as well. For that purpose we have Selenic, our web UI testing solution built on top of the Selenium framework. It helps automatically create Selenium tests for you, but it provides another level of value: it actually self-heals your tests during execution at runtime, and that healing can be based on broken locators or, for example, on wait conditions. On top of that, it doesn't just mitigate problems during execution; at the end it generates a very advanced, sophisticated report telling you exactly what went wrong during execution and how the problem was fixed, and it gives recommendations to the developer for how to change and fix the tests to address those problems in the code.

At a high level, the way it works starts with the same plugin inside your browser, which records all the user actions executed during a manual test. Based on that information, we can automatically generate Selenium tests following best practices, such as the Page Object Model. Then you run those Selenium tests, and they don't have to be only tests generated by Selenic; it works with your own Selenium tests too, assuming you already have, say, hundreds or thousands of them. Let's assume you run them as part of a CI/CD pipeline, from Jenkins for example. We observe the execution of those tests, and as the tests run we capture information about your DOM model and learn the relationships between the UI elements on the screen. If a test fails, our engine goes back to the history of previously executed tests that were successful, makes an assessment to figure out what changed in the UI, where elements moved, whether maybe their IDs changed, and it automatically recovers the test, applying what it learned from the previous successful executions, and reruns it so the run can continue. It can heal wait conditions as well. The benefit here is that when a user comes back to the office in the morning after launching a long session of UI tests that executed overnight, he won't discover that the run broke five or ten minutes after it started, leaving him with nothing and an entire wasted twelve hours. Instead he'll see results for all the tests; the generated report will highlight which tests were self-healed, and we'll send those recommendations back to the IDE, where the developer can fix the tests.

An interesting point here is that you can reuse the same recorded actions and traffic captured by the recorder in the browser to automatically generate API tests as well, as I mentioned before. So once you record one manual session through the browser, you can produce your UI-based tests and automatically create your API tests and run them together. And as I mentioned before, these are tuned for Salesforce and Guidewire systems as well. So again, the same record-and-replay concept applies here, in this particular case with self-healing capabilities.
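A bare-bones picture of locator self-healing might look like the following sketch, where fallback locators learned from earlier successful runs are tried when the primary one breaks. The page and locators are illustrative, and a real engine scores candidates against the previously recorded DOM rather than walking a fixed list:

```python
# Minimal sketch of locator self-healing: if the primary locator fails, fall
# back to alternates learned from earlier successful runs, and report which
# one healed the step.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

def find_with_healing(driver, locators):
    """Try locators in order of historical confidence; heal on failure."""
    primary, *alternates = locators
    try:
        return driver.find_element(*primary), None
    except NoSuchElementException:
        for alt in alternates:
            try:
                return driver.find_element(*alt), alt   # healed
            except NoSuchElementException:
                continue
        raise

driver = webdriver.Chrome()
driver.get("https://example.com/checkout")        # hypothetical page
element, healed_with = find_with_healing(driver, [
    (By.ID, "submit-order"),                      # locator from the recording
    (By.CSS_SELECTOR, "button[type=submit]"),     # learned fallback
    (By.XPATH, "//button[contains(., 'Place order')]"),
])
if healed_with:
    print("step healed with fallback locator:", healed_with)
element.click()
```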
Overall, if you look at the testing pyramid, we've gone through it bottom-up and talked about all the different automated testing techniques and how we can enhance them with these different AI and machine learning methodologies. What's left, at the very top, is manual test execution. We know manual tests take a lot of time to run, and even though we promote automation everywhere, sometimes you have to run manual tests as well. Here we try to help testers and developers optimize how they spend their time using what we call smart test execution. It's based on test impact analysis, and this technique is applicable not only to manual tests but to all the testing techniques I mentioned before that execute inside, for example, your CI/CD at runtime. The way it works is that for every test we have, we measure the code coverage associated with that test's execution, so we have a correlation between the test and every single line of code it actually touches. When developers change the sources, we can perform the analysis at the source level, or for Java, for example, we can analyze your binaries; we detect exactly which lines of code changed and trace them back to the tests associated with those lines of code. Now the system is smart enough to recommend, based on what changed, the subset of tests you need to re-execute to validate those changes, and that will be sufficient. So instead of running thousands of tests, which can be very time consuming, you can spend a fraction of the time running only a small subset of those tests and still get confidence in the quality of your changes. As I mentioned, this technique works with the execution of automated tests, whether unit tests, API tests, Selenium-based UI tests, or manual testing as well.
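The core of test impact analysis is a simple intersection between per-test coverage and the changed lines; here is a minimal sketch with illustrative data:

```python
# Minimal sketch of test impact analysis: from a per-test line coverage map
# and the set of lines touched by a change, compute the subset of tests
# worth re-running. File names and coverage data are hypothetical.
coverage = {                      # test -> lines it executed last run
    "test_login":    {("auth.py", 10), ("auth.py", 11)},
    "test_logout":   {("auth.py", 30)},
    "test_checkout": {("cart.py", 5), ("auth.py", 11)},
}

changed = {("auth.py", 11)}       # lines modified in the new commit

impacted = sorted(t for t, lines in coverage.items() if lines & changed)
print(impacted)                   # -> ['test_checkout', 'test_login']
```

In practice the map would come from instrumented runs and the changed-line set from a diff against the last analyzed build.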
Okay, overall, it doesn't matter which testing techniques you're using: we have a centralized system for analytics and visualization that collects all the information coming from those different testing practices. We collect information about your code metrics, your static analysis violations, the results of test executions, code coverage, and so on, and the system can visualize this information for you and at the same time make an assessment about the overall quality of your application. So it's a kind of holistic view of the quality of your projects. I'm not going to talk much about that; again, you can visit our website and read more about it.

This is probably one of the last slides, and I just want to summarize. We've focused on the application of AI and machine learning in testing through automated test creation, optimization of test execution, self-healing as part of that, and noise reduction when we're talking about static analysis violations, trying to optimize users' and developers' time so they focus on fixing and testing what matters most and still produce high quality products. You can read more about our products and offerings on our website, and you can scan the QR code here to get access to our Continuous Testing for DevOps white paper. At this point I'm ready for any questions, if you have anything right now we can talk about.

All right, thanks Igor. We do have a few questions here. One is: when should I choose between UI and API testing?

When you're looking at these choices, honestly, you probably want to optimize for and emphasize API testing, because again, it works much faster, it's more reliable, and it does full validation of your application under test. Some systems don't even expose any UI, so you don't have a choice; you have to test the REST APIs. Nevertheless, I would not say that's the only testing technique you need to use. The UI on its own exhibits business behavior that's built into that layer, and obviously you want to validate that it functions properly. So there's a certain amount of tests you need to put on the UI, not for the sake of testing the functionality of your system as a whole, but mostly for validation and verification that the business logic built into your front-end application works properly. The rest I would delegate to your API tests.

All right. Nathan, I see you've popped up here, and I've neglected to introduce him: he's our director of development, so he's here to help answer questions; so again, if you've got any others, get them in now. Another one was: what's the best way to make use of automated unit test generation alongside existing manual unit test efforts? Nathan, do you want to take this one?

Sure. Well, I think it depends a little bit on where you're starting from in your code base. If you have a code base that has very few unit tests, you might want to start by generating a whole set of unit tests to cover your untested code. That would be particularly important if you know you have, say, a large refactoring you need to do in the code, or some new updates coming; you want to have a good safety net of unit tests before you start making changes. So if you don't have any tests at all, you should probably generate some; we find that we can get very high coverage in very short amounts of time by doing that. If, however, you already have a nice set of unit tests and you're just doing your normal development, then you would use it as you go, to speed up the time it takes to create unit tests. One of the key places, and I think our next session is going to talk about this, one of the places where unit testing takes a lot of time is creating mocks, if you're using mocking. The assisted workflows that Igor talked about can help you build unit tests faster even if you're just building them as you go. So I think there are two different ways you can apply it, but both of them save you time in the creation of the tests.
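As a tiny illustration of the kind of mock scaffolding such assisted workflows can produce, here is a self-contained sketch using Python's unittest.mock; the classes and stubbed values are hypothetical, not generated output:

```python
# Tiny illustration of generated-style mock scaffolding: isolate the unit
# under test by stubbing its collaborator, then assert the interaction.
from unittest import TestCase, main
from unittest.mock import Mock

class OrderService:                     # hypothetical unit under test
    def __init__(self, gateway):
        self.gateway = gateway
    def place(self, amount):
        return "OK" if self.gateway.charge(amount) else "DECLINED"

class OrderServiceTest(TestCase):
    def test_place_charges_gateway(self):
        gateway = Mock()                      # auto-generated mock
        gateway.charge.return_value = True    # stub value derived from a run
        self.assertEqual(OrderService(gateway).place(42), "OK")
        gateway.charge.assert_called_once_with(42)  # generated assertion

if __name__ == "__main__":
    main()
```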
Right. We have another question that says: how helpful are AI-augmented tools today? Do they really work?

I believe they do. Again, it depends on what level of AI you're talking about and what your expectations are. If, for example, you expect that AI techniques and technology will automatically perform your entire testing for you at the click of a button, and magically you'll get, say, 100% coverage through unit tests created by AI, or that it will explore and discover all possible execution paths through your UI without any input from you and be smart enough to understand the expected behavior of your system to test it properly, I don't think we're there yet. That's the long-term goal the entire industry is moving toward. But as practical applications that help you become more productive, take the boring tasks away from developers, and let them take care of the more intelligent, creative work while the AI handles the rest, that's where I think AI really works today.

Yeah, I mean, take the techniques we applied for unit testing. We had unit test generation a few years ago where we would automatically generate tests, but the way we generated them, the style, they were brittle; they looked like a machine wrote them. Now, with the techniques we have today, we can generate them in a way that looks like a human wrote them: they're well organized, they're clean, they cover what they intend to cover. As Igor said, we're not getting 100% coverage, but we can get a large percentage. So I think the AI techniques are working; people are finding value. Same thing for the API tests: stringing together a set of APIs is something people can do by hand, but if I can record that, identify the data values and how they pass through, and the AI can put that together for me, even if it's only 80 or 90 percent right and I have to tweak it, I just saved myself a lot of time. And that's what we find: we save a lot of time using these techniques.

Excellent. There was a question about these AI models for static analysis: can I share them across different teams and projects?

That's a great question, and a great idea, why not, right? We actually did exactly that: we ran experiments trying to build the models against the code base of one particular development team and apply them to other projects or different team members. What we discovered is that the predictions don't transfer very well, because the model captures the behavior of those particular developers and the patterns associated with them, and when you try to apply it to different projects or different development teams, it's not really transferable. So we found it's much more efficient to simply rebuild and create new models specific to that environment, that project, and that team; the model, based on these classification techniques, will learn from that experience and will be much better at predicting outcomes.

And there's a corollary question to this, right? When people hear "AI training" they get scared, because now they're thinking about massive numbers of CPUs, and hours, days, weeks. How long does it take to train these static analysis models?

It really depends; well, there are different ways you can train them. Through a manual process, for example, we give you a subset of violations to review, and you tell us whether you want to fix or suppress them, and we learn based on that; we discovered that about 20-plus violations are sufficient to give you a reasonably good model. Or you can train automatically on history: if you've already accumulated a lot of violations in your system, we can train based on that, and it might take from one or two to several minutes, depending on how far back into the history you want to go, based on our experimentation. So you don't really need to go through everything; as I said, even with manual examples, 20-plus violations are enough to create a good quality model to start with. And as you continuously work through your violations and fix or suppress them, the beauty is that the model learns as you go; it improves by observing your behavior. So it's not something you do once and forget about; as you use the system, it adjusts and becomes smarter about what you're doing.

Okay, there's a question about whether the AI writes only happy path tests, or whether it takes negative and boundary conditions into account as well.

Yeah, if we're talking about unit test generation, it's both. The goal of the algorithm is to cover as much of the code as possible, and to cover all of the code you have to handle the negative cases. So basically, yes, we're looking at both when we're writing the tests.

Okay, well, thanks everybody, it's been a great session.

2023-03-07
