TBS - 02 – Nicolas Darchis – WiFi Troubleshooting


Hey, are you tired of manually analyzing line after line of dull text while troubleshooting Wi-Fi? And you really want to save that time by using awesome tech tools? Then you really want to meet the guy that created those tools. Keep watching. And they want to test your network. They want to bring it down.

Why should I care about troubleshooting at all? Anyway, he went way off script so we can actually throw away all... All right, good to have you back, guys. It's great to have you back with us in The Basement Studios. Indeed. Hi.

Hello, everybody. So our episodes wouldn't be complete without touching on an extremely important topic, which is troubleshooting. Well, some people tend to claim that, OK, I've already invested in the latest and greatest wireless solution in the industry.

Why should I care about troubleshooting at all? Well, probably over time, those people look like this, right, Steven? Yes, because making the right investment and then sticking to all the design rules takes you far. Don't get me wrong, but it takes you only to about 80%. And when you leave 20% open, well, Murphy's Law still applies. Yes, so to touch upon such a serious question, we asked one of the most experienced leaders in the wireless technical delivery center to join us and to talk through it. So please welcome Nicolas Darchis. Hi. Hi, Sofya. Hi, Steven.

Hi, Nicolas. Thank you so much for joining us. So, Nicolas lives and works in Belgium. He's not just an active CCIE. He was one of those brave guys who jumped into this challenge when this certification was just released. He is also an active CWNE, and he has rich experience in TAC.

He mentioned that he has filed over 800 bugs in his career. So no doubt such rich experience in TAC, as well as a rich educational background, made him an author of the CCIE Wireless study guide as well as a Cisco Live speaker. Wow, that is impressive. You use a lot of tools, Nicolas.

And one of the tools that you developed, actually, and that became very popular, is the Wireless LAN Controller Config Analyzer (WLCCA). Can you tell us about the background, what it is used for, and how to use it? Absolutely. So that started actually maybe close to 10 years ago, because for us as TAC engineers, the show run-config of a controller is really the show tech of other technologies. So it has all the information, but it's a lot of information and you always need to... it's a lot of text to read in the first place.

Then people had several controllers, and then you need to cross-reference: what if you configured something in one controller and it is configured wrong in another controller? It's virtually impossible to find that as a human unless you spend hours. And the tool, which was a script to start with, was just crosschecking several configs of several controllers to see whether there were any inconsistencies. And then we added to it. We added to it. We added more. And then at some point it's like, why not let customers benefit from it, right? There's nothing secret in it.
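The cross-checking idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not the WLCCA code: it assumes each controller's configuration has already been parsed into a simple setting-to-value dictionary, and the function name, controller names, and settings are all hypothetical.

```python
from typing import Dict, Optional


def find_inconsistencies(
    configs: Dict[str, Dict[str, str]],
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return settings whose values differ between controllers.

    `configs` maps a controller name to its already-parsed settings
    (an assumption; real controller output needs a parser first).
    A setting that is missing on a controller is reported as None.
    """
    # Collect every setting name seen on any controller.
    all_keys = set()
    for settings in configs.values():
        all_keys.update(settings)

    mismatches = {}
    for key in sorted(all_keys):
        values = {name: settings.get(key) for name, settings in configs.items()}
        # More than one distinct value means the controllers disagree.
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


# Hypothetical, hand-written data; not real controller output.
example = {
    "wlc-1": {"mobility_group": "campus", "country": "BE", "fast_ssid_change": "enabled"},
    "wlc-2": {"mobility_group": "campus", "country": "BE", "fast_ssid_change": "disabled"},
    "wlc-3": {"mobility_group": "Campus", "country": "BE"},
}

for setting, per_controller in find_inconsistencies(example).items():
    print(setting, per_controller)
```

With three controllers the mismatches are easy to spot by eye; the point of automating it is that the same loop works unchanged for thousands of settings.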

So it just got published. I can actually show you the page we have on the DevNet website: developer.cisco.com. There is Wireless Troubleshooting Tools. And there you have all the tools that we have published for you to use. There are actually two config analyzers by now: the classic one, which is called the Wireless LAN Controller Config Analyzer, WLCCA, which is a desktop application. It started as a desktop application for Windows. It is cool because it adds some graphical visualization of AP proximity, to visualize the RF neighborhood. But we have also now moved to a cloud version, which is the Wireless Config Analyzer Express (WCAE). So this cloud one is a lot more up to date and supports the 9800 controller.

So we have all the latest and greatest catches and gotchas in there. Every time we find a couple of TAC cases on a specific topic and there's a catch, we actually go to it and update it with an additional check, if I can say so. So using it is very simple. You just upload your show tech wireless in the case of the 9800, or a show run-config in the case of AireOS, and you will get the results and a list of alarms, a list of top issues for your controller and what you need to look at. On behalf of hundreds of our customers, thank you. To everyone, I think.

Yes, indeed. So do you basically use it in TAC on a daily basis to navigate through the... [Yes.] Oh, that's cool! Every time we have a case, basically, even if we believe we know what the problem could be, we run a check because we're going to give side advice to the customer like, "Hey, this is not why you opened the case, but watch out for this and watch out for that. It might explode, you know, in the future." Or "Maybe it's normal, but you might want to take a look at it." So we always run it. Yes.

Well, that's really cool. By the way, what do customers need to get this tool? Is it free of charge? Absolutely free of charge. You just go to developer.cisco.com and you can use it.

Yeah, well, actually, that's one of the really cool tools that you can use in order to actively and proactively troubleshoot your network, which is really nice. And the fact that it is free, for me as a Dutch person, that's an extra plus. Nicolas knows it's true. Yeah. As Belgians we're cheap as well. So yeah. Cheap people for the win.

Oh, that's really funny. OK, so on the paid side, I also wanted to ask you about another tool that our customers can use in order to find a root cause when they are troubleshooting their Wi-Fi. And obviously Wi-Fi is not something simple to troubleshoot because there are so many reasons why infrastructure might not work correctly. And not only infrastructure: it can also be the client device, applications... So many things. So basically DNA Assurance can be a really nice tool to intuitively navigate our customers through the troubleshooting process.

So could you please give us a little sneak peek of that tool? Yes. So, I have to say, and I have to admit to come clean, historically I was never a fan of any NMS, network management solutions. Oh, really? I understand they were useful when you have a large network to manage and you have a lot of devices, because you want to push the config to all devices at once and check for consistency. So they have always been helpful for that. But for troubleshooting, because they bring to you all the issues from every network device without any kind of sorting, it was unusable. You would have thousands of alarms per day.

And unless you have a full team dedicated to browsing through it and even knowing the device and knowing, you know, if it's real or not, it was just unusable. So I know customers tended to ignore those alarms. Right? So where DNA comes to the rescue is that Assurance brings intelligence to this process. So, you know, all the config part is still there and it was redesigned and all of that. But I'm focusing on the troubleshooting part. Right, because I'm biased, obviously. So I was never using an NMS tool before to troubleshoot.

I was like, give me access to the device because I don't need another layer to get my information. I want to get the real information immediately. And now it's not the case anymore. So when the customers have, I would say, kind of a blurry problem description, and in wireless that happens a lot, as you know. Like, "We have some issues at some remote locations and some clients are reporting, you know, some disconnection." And you're like, "OK, could you be more specific?" And the poor engineer on the other side is like, "No, I cannot. This is what end users are telling me."

So what are we going to do? In the past I would be like, OK, let's enable hardcore debugs on specific controllers, specific APs and hope to catch the problem. And that could last a very long time and that would frustrate both the customer and TAC as well. And we would not get very far. Now, if you have the chance of having DNA Center, Assurance will actually help you to do this narrowing down in the first place. So if I can show you, I have a couple of [examples from a] real story from just last week. That would be lovely.

So I have redacted it so I've hidden the customer information. Don't worry. The first thing is that you get the logs on a controller or an access point, but they're always or nearly always instantaneous logs. You do not get the history, right? Or very little of it. With DNA Assurance you can actually see in the event viewer, you can see the history of channel changes, power changes of an access point.

So if you're suspecting an access point to be problematic, to detect interference and constantly change channel, you will actually see if that's the case very easily, without running commands or anything. So to me, that's the first thing I really love about DNA Assurance: that it brings history into the picture. We call that the "network time travel", right? Is that it, exactly? Yeah. Because typically when you look at it, the problem is not there anymore. But it was there just before you were looking. Right? So this would improve it.

OK. And I had this other customer with this very specific, unspecific, very blurry type of problem, which is some users are reporting some issues on some access points. So just by going to the issue tab on DNA Center, on DNA Assurance, we have a list of hot issues.

And because it is actually counting each issue and how frequently it comes back, it is going to say this AP is throwing more alarms than the others. So the top AP is this one. It's the most problematic one. It raised 12 issues today because of high channel utilization. And then you have a ranking of which APs are the worst ones to look at.
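The ranking described above boils down to counting issues per access point and sorting by that count. A minimal Python sketch, assuming the issues are available as simple (AP name, issue type) pairs; the function name and the sample data are hypothetical and this is not how DNA Center itself is implemented:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_aps_by_issue_count(
    issues: Iterable[Tuple[str, str]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the `top_n` APs that raised the most issues, worst first."""
    counts = Counter(ap_name for ap_name, _issue_type in issues)
    return counts.most_common(top_n)


# Hypothetical issue log for one day: (AP name, issue type).
todays_issues = [
    ("AP-lobby", "high channel utilization"),
    ("AP-warehouse", "high channel utilization"),
    ("AP-lobby", "high channel utilization"),
    ("AP-floor2", "client onboarding failure"),
    ("AP-lobby", "high channel utilization"),
    ("AP-warehouse", "high channel utilization"),
]

for ap_name, count in rank_aps_by_issue_count(todays_issues):
    print(f"{ap_name}: {count} issues")
```

The AP at the top of this list is the one to open first, which is exactly the narrowing-down step that a vague "some users, some issues" report does not give you.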

So that's the first really cool thing. Then we're like, OK, apparently high utilization is a problem. Remember, the customer said "I just have SOME clients reporting SOME disconnection." Right? So, OK, high channel utilization. But the customer assures me that they don't have a lot of network devices, not a lot of traffic.

So we take a look at the AP 360 view and we see the channel utilization history. There again, you can see peaks, regular peaks of the problem. So you can go on site at any given time and check the utilization. It will most probably be low unless you really hit that moment where it was very high. Right? So there is something that causes it to go very high. So now we're looking at possibly interference. Right? And remember, it's a remote site.

No one technical on site, and it is not something you can easily afford to do otherwise without DNA Center. Wow. So we kept narrowing down further. So the question was like, those peaks are very short, but how short? On this graph, the default graph, the data points were every five minutes, so we took it to the next level and enabled RF stats active collection.

So this is with the 9115 access point, which doesn't have the RF ASIC chip, but it is still able to do telemetry and narrow down to a 30-second reporting period. So we could actually see that those issues, those high channel utilization peaks, lasted for around 30 seconds, because we see the peak lasting for only one reporting interval. And Assurance goes as far as telling us that it is not an external interferer; it is the AP itself having a busy receive and transmit. Wow. So we're looking at possibly a client heavily retrying to the AP and forcing the AP... doing a kind of denial of service randomly for a few seconds, up to 30 seconds, every two or three hours or so.

So we know that's what we're looking at. So that's really as far as we could go. If you have a 9120 or a 9130, because of the RF ASIC, we can actually take an hour of spectrum capture while the AP is still servicing clients and identify if this is really a client or if there is something, some attack, causing it. Yes, this is what Min Se Kim was actually talking about: being able to capture packets from your access point the moment it happened.
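Why the reporting interval matters for such short peaks can be shown with a small Python sketch. It assumes, purely for illustration, that a coarser data point is the average of the finer samples it covers; the transcript does not say how the five-minute points are actually computed, and all numbers below are made up.

```python
from statistics import mean
from typing import List


def downsample_mean(samples: List[float], factor: int) -> List[float]:
    """Average every `factor` consecutive samples into one coarser sample."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]


# Hypothetical channel utilization in percent, one sample per 30 seconds.
# Ten minutes of data: quiet at 10%, with a single 30-second burst at 90%.
thirty_second = [10.0] * 10 + [10.0] * 4 + [90.0] + [10.0] * 5

# Ten 30-second samples make up one 5-minute data point.
five_minute = downsample_mean(thirty_second, 10)

print(max(thirty_second))  # 90.0
print(five_minute)         # [10.0, 18.0]
```

Under that averaging assumption, a burst that saturates the channel for one 30-second interval shows up as only 18% on the five-minute view, which is why shortening the reporting period helps to measure how long a peak really lasts.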

Yeah, Min Se talked about that. That's great. Great innovation. That must be making your life way easier from a troubleshooting perspective. Absolutely. I mean, we're saving up to months of getting down to the problem, because once you know the problem, you know exactly what action to take. It's like, OK, you have an external interferer problem. You probably need to go to the site. And we have the tools. We know where it is, more or less, etc. Or you have a client acting crazy, and then we can narrow down where and when. It just helps.

You could be spending months or years otherwise. Wow. That's impressive. When I explain to one of our customers what Assurance is, I always explain it as putting a thermometer into your network that tells you the health of your network and your devices and your clients and your users and your applications, and not just now but, as you say, even being able to go back in time and do that network time travel, to really nail down the problem when it occurred, even if the problem isn't there anymore. And another great thing that I like about the wireless Assurance is that it actually gives you actionable items.

Right? So you can see why the issue was there and you can also hit the "solve the problem button" and I know that you as an experienced CCIE, you don't need that because you can actually do it yourself and do it in hard coding. But a lot of us, we actually do like the graphical user interface to tell us what to fix. Yeah, a lot of issues are not coming to us because people [can] fix it before-hand. Right? So I'm very happy about that. I just do not have a perception of how many that represents. That's nice.

But by the way, since we're talking about those actionable insights, maybe, Nicolas, you know some kind of secrets: does TAC somehow influence the kind of advice that we see in DNA Center? Yes. Yes. So at two levels. So there is... DNA Center is one thing. Then you have the AI Cloud analytics behind it, which is a separate subscription.

And we actually were involved in the machine learning of that. So there was real data being fed, and then the system needs to learn what is a problem and what isn't. Right? And sometimes you go over a threshold but it's not really that big of a deal, or it could be explained, so a lot of TAC engineers were actually looking at the data and saying, "No, this would normally seem like an alarm, but it's not, because of this other metric. And this is actually your problem." And we helped the machine to simply learn. And now it's autonomous and it will do it automatically and much better than we do, because it takes a long time to do manually.

So I'm happy about that. Wow. Wow. So with that, you're actually able to sometimes prove that a complaint that is coming from the colleagues or the department, you can prove it's not the network, right? Exactly. Exactly. That's very handy. So Nicolas, we always ask two questions of all our guests, and I'm pretty sure you will have some amazing stories here, because I would love to hear from you about one day in your career that you will never forget.

I really have trouble picking only one, to be honest. The one I will really... It's not just one day (so I'm really not answering the question) was the Rio Olympics in 2016.

A bunch of TAC engineers had the chance to go there to support the event. And it was just a memorable experience. Like the best people from Cisco, a real network, a great organization. Those were the best memories for me. Unfortunately, I don't have any funny anecdotes because everything worked as expected. So as a support person, I actually didn't have a lot of work. So that was a brilliant experience just because of that. Well, that must be amazing, that you are actively participating in such a big event that is broadcast throughout the world and it works flawlessly. That must be... I can imagine you're very proud of that. Yes. Yes.

Absolutely. And if I can just give another one: the year I passed my CCIE, in 2009. I was extremely young when I joined Cisco and I looked even younger. So I had just passed my CCIE... You still do. You still do. Thank you. And exceptionally, there was a customer in the Netherlands, in Amsterdam, that was really having...

It was one of the first all-wireless offices, which was a new concept back then. Right? Right now it's business as usual. Right? And there were some issues and we didn't have the right person on site to troubleshoot because it was, you know, no DNA Assurance, and it was some dodgy issue. So we needed to narrow it down. And then it was decided that I would go on site because, you know, I had just passed my CCIE and I was the man for it. And as a youngster, it was the first time I would go to a customer site.

And they saw me and they were a bit like, you know, "You look a bit young", like, "Are you sure you're the right guy?" And I was like, "Well, how many Wireless CCIEs do you have around? I don't know. I guess it's me, right?" But I was just as confused, and it worked out great. We found the problems, but yeah... That is hilarious.

The chance we have at Cisco is that, you know, you learn very quickly: you work with the best and you just learn so fast. I didn't see that anywhere else. So that is brilliant. Yeah, I pictured them opening the door like, "Does your mom know you're here?" That was pretty much it. Wow. That is amazing. I love that story. Thank you. Thank you. When we're off the call, I want to know who the customer is, because as you know, I'm very close to Amsterdam. Yes. OK, OK.

So our second traditional question is actually about your home Wi-Fi setup. With Cisco people especially, we are super curious what it looks like, because it looks so different. So no exception for you, Nicolas.

We would also like to see your wireless setup at home. Could you please share it? I will show you. First is my office. Aha. This is my lab. I love it already. Yes. I have four access points, one of each model pretty much, and a 1900 small controller and a 3504 PoE switch. And it's pretty much all you need to do any kind of setup. There you go. So here it's a bit clean, but otherwise the APs are always like upside down with wireless antennas and components testing it and so on. Love it. Love it. My desk, which is right behind it, so I have my DX80, an external screen for my laptop and my home PC, which is a gaming PC at night, but also a virtual machine host for some Cisco, IEs and others.

And a detail there is the picture of the San Francisco bridge, the Golden Gate Bridge. I love that. Exactly. Cisco. I love that. My three passions, basically. So you have a Star Wars poster there, but it's pretty dark. San Francisco. And this "Rock" that my mom gave me.

That's pretty much... Brilliant. And I have a black office. Yes, I painted it black. And my wife keeps saying this is a stupid idea and it's really bad for the light, but that's what it is. So Steven, can you spot it?

Noticed. Noticed. 3700, yeah. Yes, I love it. So it's connected to a Cloud 9800 in Amazon. So that's really nice.

We use it for lab and I use it for home as well. This is my best placement because it's central. We're in the corridor and you have all the bedrooms around so everyone gets the signal. AP in the hallway is not usually the best design for enterprise but for home when coverage is what you expect... and we're not so far from the AP. So it's working pretty good.

Yes. And that's one of the first recommendations that I make to people for their home setup. Actually, you know what? If you're experiencing problems, get your access point out of your wiring closet, where the electricity and your internet provider come in, and simply put it in the hallway. Already you're going to have better performance. So I'm 100 percent with you. For home? Do it. Now you're going to see a funnier one. So let's go downstairs.

So you would expect that at the same location I would have an AP downstairs, right, but I don't. I decided this is the hallway downstairs... You have mistletoe there. Exactly. And we have the garage there.

And I didn't really need coverage there. So I decided to put the AP where the users were in the living room. Right. I love that. Yes. But now let's take a look at it. You have the kids playing.

This is so exciting. Oh, I love it, I love it. Brilliant. So obviously, the AP was there before the chimney, that's the first thing, and second, I didn't really have any other way to put it, because the cable has to go upstairs in the wall and I couldn't really place it somewhere else. So, yeah. I was wondering, are you taking the wire through the chimney? But I'm guessing the answer is no.

No, no, no, no. The wall next to the chimney. Love it. It's brilliant. Probably in this room you expect high density.

Yes. So you have the TV, the Nintendo Switch, the kids are there, phones... Yeah, pretty much everything. And the demanding users being your kids. Exactly. Exactly. And, you know, I said before, it's a controller I use for lab and home. Right? And then you can already realize that I had complaints.

I was testing some stuff and I had unhappy users at home and now I'm not doing this anymore. That sounds very familiar. I was doing the same thing and at a certain point my wife was like, OK, so either you take care of the kids or you're going to stop doing testing like, OK. Oh, that was amazing.

Thank you so much for sharing, it was lovely. It was brilliant. With that, we're coming to the end of the episode, I want to say thank you very much, Nicolas. I really enjoyed this session.

Loved it. Thank you. Sofya? Yes. Thank you so much. Thank you so much for spending time with us. And it was really a pleasure to have this kind of conversation. Thank you for having me. See you in the next one.

2021-04-19
