I Really HATE Brittle Python Functions

This is okay, and this is not okay. I know this looks like I'm contradicting myself, but there's a reason for it: it's all about the details. If you make the wrong design decision, even just a small mistake, you'll end up with brittle functions that easily break. I'm going to show you the main principles I use to write code that's less likely to fail, with, of course, plenty of Python examples. This video is sponsored by Localize; more about them later.

Because of the dynamic nature of Python, you don't need to use type hints. So you might wonder: if you don't use type hints, should you actually check that an object or value is of a certain type, and even do different logic based on that? It's something you see quite a bit in older Python code. Let's take a look at a few different scenarios. There are basically four of them.

The first option is more or less YOLO: no types, no checks, nothing. There's no indication whatsoever of what the function does or when it might go wrong. That's what you see here. We have calculate_average; it expects numbers, but we don't know what that is, and it returns the sum of the numbers divided by the length of the numbers. If I run this, even though we get all these little red squiggly lines, it actually works, because Python ignores type annotations at runtime, so a script runs whether or not they are there. In this case it simply calculates the average. Of course, this is pretty brittle, because if we pass an empty list we're going to get a ZeroDivisionError. Here we do that explicitly, but it's possible to pass an empty list by accident, and then we might want to give some more helpful information.

The second option is to do runtime checks before we actually call the function, and raise an error if something is wrong. That's what you see here. We have my list, which in this case is a list of integers like before, but then we're just going to check
that it's actually a list instance and that everything in the list is an integer. In this case, of course, that is true, but if I change one of the values to, say, a floating-point number, we get an AssertionError telling us it's not an integer. So that's another thing you could do.

The third option is to do the runtime checks inside the function, which can then raise an error. For example, here we have calculate_average again, the safe version, which still gets numbers. I didn't add any type annotations here; I'm just doing some checks. If it's not an instance of list, I raise a ValueError. If not all of the elements in the list are, in this case, either ints or floats, I also raise a ValueError. Then I return the actual computation. So here I have a safe version, and it's going to fail if I do something like this. Now you see we get a ValueError, but at least that's a bit more meaningful information.

The other option, of course, is to use type hints, and that's what you see here. Here I have calculate_average_typed, where I've added type annotations: an input type, numbers, which is a list of integers or floats, and a return type of float. Then I perform the calculation. Of course, this works in exactly the same way as before. Perhaps I should add an actual print statement here, but if we call calculate_average_typed, Jupyter automatically prints the last value, so that's the result.

When you take a look at these options, in general the types are either checked within the logic of the code, or they're not checked but at least indicated to the developer, except in the very first version. Now, if you check types as logic, which is what's happening in the second option, where we do it before the function, or in the third option, where we do it inside the function,
that is actually problematic. In this case it bloats the code with assert statements, which is just really annoying and doesn't provide any additional functionality. Also, it's not really that precise. What happens, for example, if somebody passes a tuple, or a list subclass while the check compares exact types? That might break these assert statements, because it's an instance of another type. And thirdly, by doing this outside of the function, every time you call the function you have to do these checks, which in my opinion just makes your code a lot harder to use.

So my recommendation is to not do any of this, and simply rely on type annotations, because those are kind of an implementation of the checks you have right here. Of course, a type annotation is not going to raise an error, but types are mostly useful in the development phase anyway; they don't really matter during execution, at least in the case of Python.

A final thing to be careful of when you add type hints like this is that you don't accidentally make them too restrictive. For example, in calculate_average I pass a list of int or float, which is already something I did to make it a bit more generic. But when you look at what the function does, it also works for, say, a set or a tuple. So you could try to come up with a type that is a bit more generic. That's what I did here: I have a type Number, which is int or float, and I've defined a type NumberList, which is a list of numbers or a set of numbers. That's probably not a very accurate name; I should maybe call it NumberCollection, like so, and then of course also change the type annotation right here. Now, because I'm allowing both lists and sets, I can also call calculate_average_typed with a set, like I'm doing right here.

In my opinion there's also a trade-off here: how much time do you really want to spend making it as generic as possible? I guess if this
is just an internally used thing, you don't have to spend too much time on it. However, if it's part of a package that's used by thousands of people all over the world, you might want to spend a bit of time making your type annotations more precise.

I want to show you another example of how I sometimes see types being used. Here I have a function, process_content, that determines its logic based on the type of the input. What process_content does is detect offensive words in a piece of content. If it's a string, I do this; if it's a dictionary, I do this; if it's a list, I do something else; and so on. You can expand this function as much as you want, of course. This is another example of a function that's very brittle. If you decide logic based on the type of your arguments, those should normally be different functions instead of putting all of that in a single function. It's also quite cumbersome to work with, because it can easily break due to Python's dynamic nature, and depending on how strict the type checking in your IDE is, you will either be flooded with red marks, the stuff that you see right here, or you just get some very vague help in terms of typing.

There are several ways to solve this. One is to simply split this into separate functions: you could create a process_content_string, a process_content_dictionary, and a process_content_list, and then call the right function depending on the type of thing that you have. Another thing you can do is what we call function overloading, where the type or the structure of the parameter determines which version of the function is going to be called. You can do that in Python with something called singledispatch from the functools module. If you want to do that, I recommend you check it out; it's pretty easy to do. But that doesn't have my preference. I would probably turn
this type of thing into separate functions with slightly different names that indicate what kind of thing each one does. Alternatively, you can use a design pattern like Strategy, for example with a dictionary of functions, so you can point to the right function depending on the content type.

Another thing that can make your code quite brittle is relying too much on hardcoded values. A good case study of this is handling translations in multi-language software. This can quickly turn into a mess of hardcoded strings, translations scattered all over the place, and painful manual updates. That's where today's sponsor, Localize, comes in. Localize is a modern translation management platform that helps developers and product teams streamline localization. It integrates seamlessly with your codebase, allowing you to manage translations more efficiently without interrupting your development workflow.

The API is actually really simple. Let me show you. Here's a basic Streamlit data dashboard that displays Uber pickups. When you take a look at the code, you see that this dashboard uses Localize to provide the translated text in the interface. I'm using the Python SDK that Localize provides. Setting up the connection with Localize is really easy: you just need to create an API key in the interface and then create a client in your app. I simply supply the API key, which I take from the environment, and I also have a project ID. Then I have a couple of helper functions to quickly get a translation from a key. Once you have this basic boilerplate code, it's really easy to get translations in the interface.

As you can see in the dashboard, one line of text hasn't been translated yet, so let's add it. Here I have the Localize interface, and I'm going to define a new key called NB pickups hour and give it a default English translation. In my app I don't need to do anything; I can simply restart the app, and as you can see the new
translation is now right here. It's also really easy to add other translations, for example the Dutch translation. AI is directly integrated, so I can simply click that, save it, and now I have my Dutch translation ready to go. Switching to another language in my dashboard is now really easy: I simply replace the value of this language constant with Dutch, in this case, and now I have a translated dashboard. This is all being handled by Localize.

So if you're dealing with complex localization, Localize makes things way easier. With AI-powered translations you get results in seconds, literally 10 times faster than doing it manually. And it's not just fast: Localize AI keeps things accurate with glossaries and context-aware translation, so your content actually makes sense in every language. Faster, better translations mean you can launch in new markets without the usual headaches. On top of that, you can assign, review, and approve translations in one place. No more messy spreadsheets or endless back and forth, just a streamlined workflow. And when it comes to tracking progress, Localize gives you real insights, like how much of your content is AI versus human translated and predictions on when translations will be ready, so you can launch on schedule. Try Localize for your project today using the link in the description.

Now, back to the video. So far I've talked about type constraints: you want your inputs to be of a particular type. But how about value constraints? For example, an integer that needs to be positive, a list that must not be empty, only even numbers, and so on. Basically, things that are not encoded in types. Where do you put these kinds of constraints? Clearly you can't use type hints for that. One thing you could do is perform checks outside of the function, just like I showed you with the type constraints, but I don't think that makes a whole lot of sense. In my opinion, these checks should be inside the function, so that whenever you call the function, the function can
check that the contract holds, that the things it gets are valid. For example, here I have an initiate_client function that creates an OpenAI client. It needs an API key, and it wants to check whether that API key is actually valid. So there's some code here that sets the API key and checks whether it can read the list of models using that key. If that doesn't work, it raises a ValueError; otherwise it creates the OpenAI client. This is a clear example of a value constraint that you put inside the function before you actually do something, and that's a typical use case. The other good thing about putting it inside the function is that inside the function you have the implementation details of, in this case, the OpenAI library. You want to encapsulate that, so that somebody who calls initiate_client doesn't have to think about how to check the API key.

In this case it's also pretty clear that we want to raise an exception if the API key is not valid, but it's not always that clear. For example, here I have another version of calculate_average that also has a value constraint: it checks whether numbers is an empty list. But what should you do if the list is empty? Should you raise an exception? Return zero? This is basically a design decision, and both approaches, returning zero or raising an error, have merits depending on the context. Here is how you can think about it. If an empty list is considered an invalid input or a logical error in your application, then raising an exception is more explicit, and it's the safer choice. If an empty list is a valid input, then I'd say returning zero is reasonable. This is often used in systems where no data defaults to zero, like in some statistical or financial calculations. The problem with this, though, is that it can mask logic errors, and that could lead to unexpected behavior. In my case, I prefer stricter error handling, so my recommendation is
actually not to return zero in this case but to raise a ValueError. In my opinion, raising a ValueError here is a better solution than relying on a ZeroDivisionError, because when I run this code, you can see that we get the ValueError, and it gives me more information, namely that the list is empty. If I didn't do that, I would get a plain ZeroDivisionError and I'd have to figure out why there is a ZeroDivisionError. So doing these checks allows you to give more specific information to the developer, so that they can solve the problem much more effectively.

How about optional values, things that can be None, and default values? How do they affect whether functions are brittle or not? In this case, I'd say you need to do checks as soon as possible. That's also called the fail-fast principle, and that's what you see here. I have a new version of calculate_average, which has numbers, an optional list of ints or floats, and a precision, which is again an optional int that can be None. In this case there is a check at the beginning of the function. One thing you could do is raise an exception if numbers is None or empty. That doesn't make a whole lot of sense to me, because then why would you make it optional? Alternatively, you can assign it a default value, like so, and then probably put some default values inside the list so it doesn't fail here. It depends on what you want to do if a certain value is not provided. In the case of numbers, I honestly don't think it's a good idea to make it optional, because it doesn't make a whole lot of sense. In the case of the precision, it might make more sense to make it optional, because if it's None the function simply returns the average, and otherwise it rounds it to that particular precision.

My recommendation, though, is that if you don't have any logic that's tied to a None value, simply avoid optional values, and if you can let type
annotations help you pass correct values, then use them to your advantage. So in this case, what I would do is avoid the optional values altogether. Here I have my numbers list and a precision, which I set to two by default. Of course, if you want to have the option to not specify a precision, and in that case you want infinite precision or whatever, then you could still use optional, but overall I would typically avoid it. Another nice thing about this is that if you don't have optional values, you don't have to do any checks for them. In this case it means the function is much smaller and more concise, which I like.

These are pretty typical things to consider when you're designing software. If you'd like to learn how to design a piece of software from scratch, check out my free design guide at arjan.codes/designguide. It teaches you the seven most important things you need to do. The link is also in the description of the video.

Now I want to go back to the example I showed you at the start of the video, which was about optionally returning None. In that example I had a function to get a user by ID, which is also the function you see right here. You could decide to return the user and raise an error if the user is not found. Or, if there is no user, you could return None; in that case the return type annotation here would be User or None, I would remove this part, and I would have a function that returns None if there is no user. Those are the two typical scenarios that you see.

I have a preference for raising the error, again because I prefer stricter error handling: I prefer to raise an error instead of returning some value that you may not find very useful. In this case, in my opinion, it also makes sense semantically to raise an error. You pass an ID, so you expect that you are probably going to find
an object with that ID, meaning that if that user doesn't exist, something is probably wrong, and you want to know about it instead of getting None. That's the general rule I would follow: if you're looking for an object by ID and it's not there, raise a not-found error. If you're looking for an object that you might expect doesn't exist, then it could make sense to return None. For example, here I have another function, find_user_by_email, to which we pass an email address. There I can imagine that it makes sense to return User or None, because we pass an email address and we're not sure there is a user, so it could make sense to return None if that user isn't there. That being said, I would still raise an error in this case, just because I prefer raising an error over passing a None value around. I like to avoid None as much as possible.

Now, if you're looking for several users instead of one, for example here I have a get_users function that takes a filter, I don't think it should return None. If there are no users, I think it should just return the empty list. That's actually what this code does: if the for loop never finds a user that matches the filter, nothing is added to the list.

Ultimately, though, it comes down to the semantics of the function. If it's perfectly reasonable that nothing gets returned, then sure, return None. If you're expecting to find something, raise an exception. Overall, though, in case of any doubt, simply raise an exception; don't return None, because returning None is going to create a more brittle situation in your Python code.

One thing I do want to mention here is the Maybe monad. This is not built into Python, but you can use the returns package, which has a Maybe monad. However, I'd be careful with this, because it's not a standard Python feature and might be confusing for some developers.

So what have I talked about? Well, first, don't do deep
checks of types before or in a function, because that's going to lead to a lot of refactoring work if you need to change things around. Instead, use type hints, optionally with a static type checker like mypy if you want to. If you're dealing with value constraints, like only positive integers or non-empty lists, raise ValueErrors or custom exceptions. Avoid optional values if you can, and supply a reasonable default value instead. And avoid returning a None value, except in a few specific cases where it might make sense. Overall, in case of doubt, simply raise an exception; it's the best way to go about it. And finally, fail fast: do these checks early on and fail immediately, instead of trying to ignore errors or returning some value that is hopefully usable to the caller of the function.

Now I'd like to hear from you. How do you approach these kinds of things? Do you think about them when you write your Python code? Do you have any tips or anything you'd like to share? Please post them in the comments below. I did a full video a while back covering the fail-fast principle in more detail; you can check that out right here. Thanks for watching, and see you next time.
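
For the process_content example, here is a rough sketch of the two alternatives mentioned in the video: separate functions with names that say what they handle, and functools.singledispatch to pick one by argument type. The word list, the function names other than process_content, and the exact behavior are all assumptions made for illustration.

```python
from functools import singledispatch

# Hypothetical word list, for illustration only.
OFFENSIVE_WORDS = {"darn", "heck"}


def find_offensive_in_text(text: str) -> list[str]:
    """Return the offensive words found in a single string."""
    return [word for word in text.lower().split() if word in OFFENSIVE_WORDS]


def find_offensive_in_items(items: list[str]) -> list[str]:
    """Check every string in a list."""
    return [word for item in items for word in find_offensive_in_text(item)]


def find_offensive_in_mapping(mapping: dict[str, str]) -> list[str]:
    """Check the values of a dictionary."""
    return find_offensive_in_items(list(mapping.values()))


@singledispatch
def process_content(content: object) -> list[str]:
    """Fallback for types that have no registered implementation."""
    raise TypeError(f"Unsupported content type: {type(content).__name__}")


# Register the dedicated functions as the per-type implementations.
process_content.register(str, find_offensive_in_text)
process_content.register(list, find_offensive_in_items)
process_content.register(dict, find_offensive_in_mapping)

print(process_content("what the heck"))
print(process_content(["darn it", "all fine"]))
```

Callers who already know what they have can call the dedicated functions directly, which is the option the video prefers; the dispatching wrapper is only needed when the type really is unknown at the call site.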

2025-04-01 23:49
