Dani Grant from Jam talking about Building Better Bug Reporting


Dani: There's a lot of focus on engineering teams about how to improve productivity, and there's a lot of talk about tooling. These things are awesome, but I think that the bug reporting process has huge inefficiencies, and engineering teams, if they were to spend some time improving that, would see gains beyond the investment.

Richard: Hi, and welcome to Loving Legacy. I'm your host, Richard Bound, and this time I have with me Dani Grant from Jam. Dani, thanks for joining me today.

Dani: Thank you so much for having me.

Richard: Absolute pleasure. Now, perhaps you could introduce yourself a little bit for our listeners and tell us what you do and how you do it.

Dani: I'm Dani, I'm the co-founder and CEO of Jam. We help people report bugs to engineers in a way that is perfect for the engineer. What most people don't realize is that while most things in software development have totally changed in the last 30 years, to the point that it would be unrecognizable to us in the 90s, the way we do bug reporting actually hasn't changed at all. It slows teams down, and it's especially painful in today's new remote, global world.

Richard: Yeah, I'm already feeling that. I'm nodding my head along as you say these things. So you reached out to me, and you said you're a fan of the show, which is great to hear. So what gave you the inspiration for Jam, and what was your journey to the situation you are in now?

Dani: My co-founder and I started as product managers together at Cloudflare, and at Cloudflare we worked on a very interesting team. We were this sort of skunkworks team, and our job was to launch net new businesses for the company. So we had to move really, really fast and get great stuff out the door right away. Time and time again, what we found holding us back was all of the miscommunication and back and forth about bugs and fixes with engineers. In the in-person world, what we would do is we'd repro the bug on our laptops and then we'd go run around
the office until we found the engineer, and then we'd give the engineer our physical laptop so they could debug it there, because otherwise how can they see, you know, DevTools? But when the world went remote there was no way to do that, and so that is why we started Jam. We wanted to build the tool that we needed as product managers.

Richard: Okay, right. So either the laptop has to be there, or you could copy and paste. I mean, screenshotting is a part of the application that you have now, so Jam does that, I understand. So presumably, before, that's another way you could have reported bugs: you could have just screenshotted it, but it's just not as immediate. Is that the idea?

Dani: A screenshot is such a low-fidelity way to communicate something happening on the web. In cyberspace there's all this stuff happening in the background that the developer needs and that a screenshot does not encompass. Sometimes when people take screenshots they don't even include the relevant part of the page: they include the error message, but the developer actually needs what's around it, like what account you are logged in with, what version of the software you are using, what country you are in, what language is shown, which experiments you're a part of. So with Jam we try to get all of the relevant information for a developer packaged into one link, so they have everything they need to debug right away, no back and forth needed. It's one click to screenshot, record a video, or instant replay a bug that just happened, and we grab the cropped screenshot that you take plus the full screen, so all the context you may have missed, plus console logs, fully inspectable network requests, all the specs of the device, the browser and the operating system, and even what your network speed was.

Richard: So you've been going a while, and you've got plenty of users all over the place. My background is mainly working in corporate land, and I know what it's
like, because everything's locked down. So how does it play in that kind of scenario? Do you have to be completely open, or can you work behind firewalls? Is there even an on-premise solution that you offer?

Dani: There's no on-prem, but we do offer advanced access controls for organizations that need that. And actually, a lot of the people using Jam are at these large corporations, because the problem is exacerbated the more people you have on the team, the more countries different people are working in, and the broader the surface area of the product. More legacy companies just have a really broad surface area that they've built over the last 20 years. So some examples of companies using Jam are Dell, T-Mobile, Autodesk, these large corporations too.

Richard: Yeah, okay. So backing up a little bit, then: what got you interested? Are you from a software engineering background in the first place? What's your path to where you are now?

Dani: I studied human-computer interaction at NYU.

Richard: Oh cool, nice.

Dani: And I went from there to venture capital, became a VC. My co-founder was an engineer turned PM, and we met when we were both PMs at Cloudflare. So I bring the business side, he brings the engineering side, and together we're tackling the problem.

Richard: Okay, cool. How do you even start on it from a product mindset, but having technical understanding?

Dani: When we started Jam, the first thing we wanted to do was validate that this was not only a problem that we experienced as product managers at Cloudflare, but a problem people experienced industry-wide. Having experience as PMs, we said, okay, let's do a bunch of user interviews, that's what we know how to do. So the first thing my co-founder and I did, even before we quit our jobs to start Jam, was 45 user interviews, validating that this was a problem people faced at companies big and small across the industry. And the emotion that people expressed
on these calls was so palpable that we knew we needed to solve the problem. Actually, one of those early user interviewees tried to pay us and set up an onboarding for his whole team, and we had to explain to him that there was no product yet, we were just researching. So we really started from the product mindset, but then we needed to figure out how to build a product solution to a problem that had not been solved yet with a product solution. Both my co-founder and I knew how to write code. He is a former mobile developer, so he took a React course online to teach himself React, and I was sort of a hobbyist developer and I picked up some skills, and together we built the first prototype of Jam just to see if there were any legs there. We were really inspired by the story of Foursquare: the CEO bought himself a learn-PHP book and taught himself PHP to build the first version. We loved that story and wanted to emulate that.

Richard: Yeah, because when you're launching a product from an engineering perspective, it's always about how we're going to build it, and it's very difficult to jump out of that mindset and say: rather than just exercising what we think is the important muscle, which is building stuff, how can we actually think about solving a problem? That seems to be the perennial thing.

Dani: 100%. And actually, when we were early on the journey, we were very excited about the new technologies we could use to solve this problem, and so we chose things like Kubernetes and GraphQL. Let me tell you that a couple of years later, if there's ever any issue in our product, it is caused by one of the not-boring technologies that we chose, like Kubernetes or GraphQL.

Richard: Yeah, exactly. Non-boring, I think, is the operative word here, because you have to think about scale, you've got to think about the future of your product. So I think the approach you took is very sensible, especially just learning a
technical solution off the bat. So do you still run on Kubernetes, if that's a question I can ask?

Dani: Still use Kubernetes, still use GraphQL. But one of the interesting things that I had not realized in our first weeks building Jam is that the technical decisions we made so early on, before there was even a team, stayed for much longer than I would have imagined. The last remaining one of those is from weekend one: when I built a first prototype I used MongoDB, because it's very easy, it's very flexible, it's great for prototyping. We ran on MongoDB until a month ago, and it was causing scaling issues for us at our scale. It's a great database, and it's great for production data for an app of a different type, but we just needed a SQL database. Those queries were just taking so long, and when we did that migration, gosh, immediately the app sped up three or four times. It was awesome, users noticed, it was great.

Richard: DB hype cycle! But it's interesting, isn't it? Boring old SQL comes to the rescue again. So obviously then you've got a legacy problem, right? You've got stuff in MongoDB and you've got to migrate it. How did you cope with that?

Dani: Two engineers on our team took this on. They spent six weeks working on the migration. By the time they deployed it, they had practiced on staging, they had a runbook of what to do, a runbook of what to do if things went wrong, and a list of scenarios that they thought could happen. The whole migration took 20 minutes, and there was not a single bug report after. I was on the Slack huddle where the two engineers were during the migration, and they worked with each other with such respect and such collaboration. It was like a soccer team that's passing very well. It's sort of practice for the bigger problems that may come along: you're learning how to execute on the smaller problems so that when you
have to do big and difficult things, you can. And I just felt like, okay, we did a big and difficult thing together, and now we can do anything. So that was a moment of pride.

Richard: And that was pretty recently?

Dani: Yeah, that was one month and one day ago.

Richard: Okay, congratulations on that. So how big is the team now?

Dani: Jam is a small team of seven. The one thing we learned at Cloudflare is that when something is important and you want it to go faster, it's actually better to have fewer people on it, not more. We really live by that. We're very inspired by Instagram: there were 13 people at a billion-dollar valuation. My dream is for Jam to be 13 people at a billion-dollar valuation.

Richard: Okay, so what does the future look like? You're very excited to get almost to 50,000 users, that's brilliant. Are they all paying users? How does the licensing model work?

Dani: We're students of Cloudflare, so: large free tier. We love that we can build something that everyone can use no matter their ability to pay, and then build features on top that only companies need, and companies have the ability to pay. We're so determined to bring bug reporting out of this archaic, manual, outdated phase that it's in and modernize it, bring it to the 21st century of how we do things in software development today. Wherever there are bugs, we will be trying to build product solutions for that.

Richard: So it sounds like you're very aligned with the developer experience. Obviously it's great from a product experience as well, from a company's point of view, but the emotional words that you're using there about pain seem to be aligned very much with the things that developers feel when they are confronted with a bug report which doesn't have enough information in it. Is that fair?

Dani: Yes, and to be honest, the product managers, the support team members, and the QA testers who are reporting bugs to engineers are just as frustrated. It's a
communication problem, so it really has to be solved on both ends.

Richard: Okay, so it's about that feedback loop again, which is the important part of any development or product process. I've tried the tool out and I really like it, but I'm just on the free tier, of course. So what's the value-add for the next stage? What do you get?

Dani: Most of the value-add under the paid tier has to do with access controls that companies of a certain size need. We want to give the value-add away for free, which is: spend less time debugging for engineers, spend less time manually reporting bugs for PMs, work together better as a productive, fast-moving team for everyone. And then under the paid tier we have advanced access controls, privacy, and security that only corporations are really asking for.

Richard: Okay, that's the bit I'm interested in. I noticed a couple of sites, for example, won't let you record. Google won't let you record stuff on it. Is that something that's configured within the tool itself? How does that work?

Dani: In the extension we allow people to set all sorts of settings, like if they want to disable Jam on any site for any reason, or if they want to do the opposite and enable it only for specific sites. Engineers especially have all sorts of different preferences around their browser and the extensions that they use, so we just try to accommodate that and make everyone feel really in control when they're using Jam.

Richard: Okay. But would this also be suitable for customers? Would you be providing it to customers, or would a solution be provided for customers?

Dani: When we started Jam, we wanted to fix the internal communication problems between product, support, QA, and engineers. But actually, what we're seeing is that a lot of teams are sending Jam to their customers and saying: we can't reproduce the bug you just reported, please log it with this tool and then we'll be able to action it. And so we're really
excited to see that happen, but we're definitely also excited to build something more focused on customers in the future.

Richard: Okay, and it integrates with other tools as well, so you can use it with Jira, or what?

Dani: We integrate with all the major issue trackers. For example, if your team uses Jira, we want to make it really fast to get all the relevant debugging information into a Jira ticket. So Jira, Asana, GitHub, Slack: wherever you're reporting bugs, we want to make that faster and better.

We have this idealized version of what might happen, which is: someone spots a bug, creates a Jira ticket, an engineer fixes it. But nine times out of ten that's actually not what happens. Typically someone spots a bug and they want to get more information before they create the Jira ticket, so they put something vague into a Slack channel, like "login isn't working". The vagueness of any issue creates a lot of fear, because it sounds like a big deal without more details. So a lot of engineers will see that message and think, okay, I need to stop what I'm doing and focus on the critical thing. And because there was no orchestration, many engineers will now stop what they're doing to focus on this, versus one person picking it up as a ticket.

From there, there's a lot of back and forth with the original bug reporter. These engineers are trying to repro it on their machines, they're not seeing it, they're asking: what environment are you in, what browser are you in, what device are you using, can you send a screenshot? If they're able to resolve it with that, great, but usually they ultimately need to hop on a call with the original bug reporter. Everyone's busy, so you're orchestrating schedules, and maybe at the end of the day you context switch back to this and get on a call with the bug reporter. They share their screen, and you have to help them open up DevTools and click where it says Console, click
where it says Network, and guide them through something that's quite confusing, and reproduce the bug. And at the end of the day it could have been something like: oh, the person had the wrong password, but that front-end error wasn't bubbled up to the user. So login wasn't broken, it's just that a front-end error was not bubbled up. And that's just one bug, and it's something that could have been identified and fixed in five minutes and instead took an afternoon for several engineers. For that to be the case in 2023, when we have amazing other technical tools and software development has come so far, is bonkers.

Richard: Yeah, I'll contend that a little bit, because it depends where you work, I would say. Sometimes you can't get a bug looked at by a developer without going via a PO. You can't get a bug looked at even in the same day, week, or month, no matter what it is, no matter how well it's reported, because that isn't the process, right? Sometimes there will be a lot of information added to a ticket, maybe in, I don't know, ServiceNow, before it even gets to Jira, and it will then have to be translated into a separate tool. So whilst I get that it's great if everyone's moving at that kind of pace, it sounds a little bit chaotic as well, I'd say. At that kind of pace, I can see there's a real need for it. However, where I've worked, it's more that you can't get something looked at by a developer because they're not available.

Dani: ...taking longer to solve, which means you can solve fewer issues, which means customers have to suffer with more issues in the product for longer. And so by streamlining the process you are also able to solve more bugs, which is really exciting.

Richard: Does it also help you categorize the type of bug? Is there anything there, or would that come in something like Jira or Azure DevOps?

Dani: Oh, this is something we want to build so much,
which is: you use Jam to record a bug right on the web page, so we have a lot more information than Jira has. So we've been thinking about ways we can help with deduplication, categorization, and also routing, because a lot of the inefficiency around bug reporting is that someone reports an issue and they just route it to the wrong team.

Richard: Yeah, anything you can do to have deduplication, especially in Jira, because that's what a lot of people's time is spent doing: looking at content in Jira which either has not enough information, or has been reported before, or has been there for two years already and should have been thrown away. Could you also not just categorize, but give a time-series view as well, almost like an observability platform, where you could say these bugs are popping up at certain times? Is there any way you can mine that information from the tool?

Dani: That's so cool. We're actually working on something related to observability that we'll announce in a couple of weeks, and I'm really excited about it. If the goal is to give developers all the information they need up front, then doesn't that need to dig deep into the stack and help them see what the error was across the infrastructure? So we're working on something there.

You know, there's a lot of focus on engineering teams about how to improve productivity, and there's a lot of talk about tooling, there's a lot of talk about collaboration with other teams. These things are awesome, but I think that the bug reporting process has huge inefficiencies, and engineering teams, if they were to spend some time improving that, would see gains beyond the investment. I just think it's one of those neglected areas that, with a little bit of effort, has a huge impact, both on speed and productivity and on collaboration and trust within the team.

Richard: Definitely. But how do you fix that? That's the
thing. I mean, everything is so embedded in the way that we work. Firms have spent a lot of money, billions, probably hundreds of billions, on digital and agile transformations. We've had agile coaches, we've had Scrum, we've had DevOps, all this kind of stuff's been going on, and we've ended up with a fixed tool set. This is the core of the problem: we put a sticky into Jira or Azure DevOps or whatever, and then we forget about it. So how do you flip it around? Because I want to say, solving bugs faster is kind of not the point. We don't want to have bugs in the first place, we want to have good software. So how do you flip it and make it not about bugs, not about stacks of bugs, but about good software?

Dani: It's funny you say that the tools and processes are stuck in place and we've already invested so much time and effort and resources in agile and these new practices in DevOps. When we started Jam, we started with a different solution to the same problem: we thought we were going to be able to replace some aspects of Jira with something better. But what we found at that time is there was a lot of resistance to that, because it's like, well, you already use Jira, so why take a risk on something else? It is good enough. So we realized we needed to build something that would improve the Jira experience from the inside out. Whatever process you are using to run your team, at the end of the day you get these tickets, and the tickets are the things that you're working on, and if the tickets are of low quality it makes the job much harder. So we wanted to build something very lightweight that anyone can pick up and start using, that would improve the quality of each one of these tickets. But what you're saying, that it's not about
fixing bugs so much as it is about having a great customer experience, I completely agree. I think you have to do both. As a remote startup, we spend a lot of time thinking about how to move faster with high quality, because as a startup speed is the biggest advantage we have against giants. So we really care about moving really fast and still delivering great product, and we've figured out a couple of things that work really well for us that I think could work well for other teams too.

One is a culture of code comments. A lot of issues arise when an engineer is stepping into a file they have never been in, or a part of the codebase that they're unfamiliar with, and so they make a change without understanding the ramifications across a complicated product. But if there is a culture of code comments, where engineers don't approve PRs unless there's sufficient context given in each new part of the code, that really helps engineers keep the quality up and move faster, because they can get context really quickly when they step into a new part of the codebase.

Another is doing an on-call program. We do it a bit differently than other teams. In most on-call programs there's an engineer who's expected to be on call in addition to whatever it is that they're currently working on. For us, when you are on call, you are only on call, and your job is to make improvements that you want to see, unless there is something urgent that comes up or you need to help someone release some code. That way there's always one person on the engineering team, which for us is meaningful, it's 20% of the engineering team, working on background fixes and cleaning up technical debt.

And the last thing is something we found: there is a misconception about moving fast, where people think that if you stop and take too much time to plan, you won't move very fast. But the not-moving-very-fast that we're afraid of is
deadlines slipping by weeks; it's not taking an extra day to plan. We found that if we spend about half a day to one day before starting to write code on any project, planning how it will work and having a round of feedback with the engineering team on ramifications we may not have thought of, and half a day to one day at the end cleaning up things we found along the way, then we don't have to stop and do a bunch of technical debt cleanup later, and that really helps us. So there will be bugs, and having a streamlined process to report and fix them is super important, but having practices that support that is important too.

Richard: Hmm. So would you describe Jam, or the mission of Jam, as not just supplementing the existing tools but hopefully replacing them in some ways as well?

Dani: I mean, we're in love with the problem, not the solution, if that makes sense. So we will do whatever it takes to modernize the communication between all the different cross-functional teams of an organization and help engineers catch and fix issues faster.

Richard: Well, I look forward to hearing about the journey as you go forward. It sounds like it's been a very exciting ride so far, and it's only going to get more exciting. So what are the plans? You've got something exciting to announce soon. Looking beyond that, you've already touched on some potentially quite exciting things for the future. You have large ambitions, is it fair to say that?

Dani: Yeah. We were talking as a team the other day and someone said, this is very ambitious, and we were like, yes, but everything we do needs to be ambitious. We are just getting started. There's so much to be done, and I'm really excited about the future. We have a really special team. They are awesome, they care, they work so well together, they're so supportive of each other, and I can't wait to see what we
can all do together.

Richard: Awesome, well, I look forward to seeing it too. Dani, thank you so much for joining me today. It's been an absolute pleasure talking to you, and best of luck in the future.

Dani: Thank you so much, thanks for having me on.

Richard: Thank you for joining me today. I hope you enjoyed this episode; it was lots of fun to make. If you would like to come on to Loving Legacy, then please feel free to get in touch. You can find me via social media, bound RW on Twitter, or via LinkedIn, or via my website, richardwbound.com. I'd love to talk to you about anything to do with software delivery, and of course anything to do with legacy systems. Until next time, this is Richard Bound saying goodbye and good luck.
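
A recurring idea in the conversation is that a bug report should carry its own debugging context rather than a bare message. As a rough illustration only, the Python sketch below models that idea. Everything in it is hypothetical: the `BugReport` and `NetworkRequest` classes, their field names, and the example URLs are assumptions made for this sketch and are not Jam's data model or API. It mirrors the kinds of information mentioned in the interview (page, browser, operating system, console logs, network requests) and the example of a vague "login isn't working" report that turns out to be a rejected password.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NetworkRequest:
    """One captured HTTP request (hypothetical shape)."""
    method: str
    url: str
    status: int


@dataclass
class BugReport:
    """Hypothetical bundle of the context an engineer needs up front."""
    title: str
    page_url: str
    browser: str
    operating_system: str
    console_logs: List[str] = field(default_factory=list)
    network_requests: List[NetworkRequest] = field(default_factory=list)


def failed_requests(report: BugReport) -> List[NetworkRequest]:
    """Return requests whose HTTP status is a client or server error (>= 400)."""
    return [r for r in report.network_requests if r.status >= 400]


def to_ticket_body(report: BugReport) -> str:
    """Render the report as plain text suitable for pasting into a ticket."""
    lines = [
        f"Title: {report.title}",
        f"Page: {report.page_url}",
        f"Environment: {report.browser} on {report.operating_system}",
        "Failed requests:",
    ]
    failed = failed_requests(report)
    lines += [f"  {r.method} {r.url} -> {r.status}" for r in failed] or ["  (none)"]
    lines.append("Console:")
    lines += [f"  {entry}" for entry in report.console_logs] or ["  (empty)"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example data is invented for illustration.
    report = BugReport(
        title="Login isn't working",
        page_url="https://example.com/login",
        browser="Chrome",
        operating_system="macOS",
        network_requests=[
            NetworkRequest("GET", "https://example.com/app.js", 200),
            NetworkRequest("POST", "https://example.com/api/login", 401),
        ],
    )
    print(to_ticket_body(report))
```

With a report like this, the rejected login request is visible in the ticket itself, which is the kind of detail that would otherwise take a round of questions or a screen-share call to uncover.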

2023-03-15
