Understanding the technologies that power “AI”—a product-owner’s guide
Hi, and thanks for having me. My name is Lindsay Silver. I'm the Head of Platform Technology at Conde Nast, and I'm really excited to be here today to talk about data, AI, and product development, and to walk through how we think about them at Conde Nast. First, a bit of context. Conde Nast is best known as the home of brands like Vogue, GQ, The New Yorker, Allure, and Architectural Digest. We're a 100-year-old magazine company that over the past 20 years has had to transition from a company that produces written content and photography around the world to one that creates dozens of digital products. By the numbers: we have about 72 market brands around the world; that would be Vogue in France or GQ in Russia. We have about 185 web, mobile, and OTT products; those include things like The New Yorker crossword app or Self magazine's website and web properties.
Across that network of products, we have about 350 million monthly active users, roughly 450 to 500 million dollars in annual digital revenue, and between five and six thousand editorial users. So, let's talk a little about how we build products at Conde Nast, because regardless of what the features are, it's worth discussing how you think about products in an ecosystem like ours. At the top of this chart are our audiences, and we have audiences of all kinds: audiences in specific markets, with specific interests and specific utilities in the products we need to serve. What we find is that the products those audiences use have to be different, but the tools, the APIs and technologies, the data and infrastructure often don't. A lot of the time, when we think about feature development, we're asking how to build a tool, an API, or a technology that serves the largest number of end products. So we've built a two-dimensional model: our portfolio of consumer products runs across the top, with teams that support each of those individually, and as you go down the stack, we've built different tools, teams, and organizations focused on each level. We have teams within what we call our global platforms that are responsible for things like content management or ad systems, and teams lower in the stack responsible for data architecture and how we manage our APIs and technologies. What we've found is that we generally need a smaller number of teams per organization as you go down the stack: our platform teams, the four levels below, are each roughly the same size as a single consumer product team in terms of the number of people needed to develop them. Okay. So, let's talk about developing AI-driven or data-driven features.
At the core of any data-driven feature is a feedback loop. Actually, at the core of almost anything that involves thinking, whether artificial or human, is a feedback loop. So assume that any AI-driven or data-driven feature you're building is a feedback loop; I think that's a really important thing people don't boil these down to. Basically, there's some action, some feature on the page that a user sees; some signal comes out of that feature; that signal is synthesized by a machine; and that leads to a decision about how to change the feature or improve the experience. Then the cycle starts over: there's a test, some conversion is measured, that's synthesized again by a computer or some sort of model, a decision is made, and an action happens. Now, this is of course very general. This is how you live your life, probably: you're constantly trying out new foods, deciding if you like them, applying the tastes to your historical knowledge of the foods you've eaten, making a decision about whether you'll eat them again, and then eating them or not next time. That's almost universal in terms of thinking. What's interesting about building AI-driven features is that you have to think about it in terms of how a computer thinks, or how to build these things in a way that allows computers to think about them.
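To make that loop concrete, here is a minimal sketch in Python. Everything in it, the variant names, the click probabilities, and the greedy decision rule, is invented for illustration; a real system splits these stages across serving, analytics, and modeling infrastructure.

```python
import random

# Minimal feedback loop: act -> signal -> synthesize -> decide.
# All names and numbers here are invented for illustration.
variants = {
    "blue_button": {"shows": 0, "clicks": 0},
    "green_button": {"shows": 0, "clicks": 0},
}

def act():
    """Serve a feature variant (picked at random in this toy version)."""
    return random.choice(list(variants))

def synthesize(variant, clicked):
    """Fold the observed signal into running statistics."""
    variants[variant]["shows"] += 1
    variants[variant]["clicks"] += int(clicked)

def decide():
    """Pick the variant with the best observed click rate so far."""
    return max(variants,
               key=lambda v: variants[v]["clicks"] / max(variants[v]["shows"], 1))

for _ in range(1000):
    v = act()
    clicked = random.random() < (0.12 if v == "green_button" else 0.08)  # fake users
    synthesize(v, clicked)

print("current winner:", decide())
```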
So, starting with that action: systems like your multivariate testing platform, your ad servers, and your personalization engines are all built so they can serve features this way. We use a multivariate testing platform called Wasabi that was open sourced by Intuit. We've modified it a little to let us run tests that are self-optimizing, but it effectively serves a lot of our feature variation. If we want to test two button colors, or test different recirculation algorithms, systems that decide how to rank content, we can use Wasabi to do that.
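As a rough illustration of that "act" step, here is a hedged sketch of fetching a variant assignment from a Wasabi server over its REST API. The host, application, and experiment names are invented, and the URL shape should be verified against Wasabi's docs and your own deployment.

```python
import requests

# Hedged sketch: ask a Wasabi server which variant a user is assigned to.
# Host, application, and experiment names are invented; verify the URL
# shape against Wasabi's REST documentation for your deployment.
WASABI_HOST = "https://wasabi.example.com"

def get_assignment(app_name, experiment_label, user_id):
    url = (f"{WASABI_HOST}/api/v1/assignments/applications/{app_name}"
           f"/experiments/{experiment_label}/users/{user_id}")
    resp = requests.get(url, timeout=2)
    resp.raise_for_status()
    return resp.json().get("assignment")  # e.g. "green_button", or None

# bucket = get_assignment("vogue-web", "button_color_test", "user-123")
```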
Once we've deployed a feature with Wasabi, the next step is for a set of systems to synthesize the information that comes out of it. We synthesize data within the domains we care about, which I'll get into in a second. For our content, we extract things like keywords, entities, and topics. For our users, that might mean building models of propensity, like likelihood to subscribe. And for our advertisements and monetization options, that might mean deciding whether to show, say, a house ad, a marketing ad, or a commercial ad to a specific user at a specific point. All of this data is stored in various data systems, and we're doing this in real time, as articles are published or as users come to our sites. When we need to make a decision, usually while you're visiting our site and we're deciding what to show you, we pass all of that data into some sort of predictive model. This is what people often mean when they talk about ML: models or systems, usually with an API on top, that make predictions based on the data you pass into them.
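As a sketch of that content-synthesis step, here is what keyword and entity extraction might look like using spaCy. This is a stand-in for illustration, not our production NLP stack.

```python
import spacy

# Stand-in for the content-synthesis step: extract entities and keywords
# as articles are published.
# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_traits(article_text):
    doc = nlp(article_text)
    return {
        "entities": sorted({ent.text for ent in doc.ents}),
        "keywords": sorted({tok.lemma_.lower() for tok in doc
                            if tok.pos_ in ("NOUN", "PROPN") and not tok.is_stop}),
    }

print(extract_traits("Tina Fey and Amy Poehler hosted the awards in New York."))
```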
In the recirculation example, getting people to click on pieces of content that send them around your website, we use predictive models that try to guess what content you're most likely to click on. We have a lot of different models, and we'll run a few dozen of them at the same time, depending on the type of user you are. We pass those models data about the context, like what article you're viewing; data about what we know about you historically, whether you usually click on articles about kittens or sports or fashion; and data about the advertisement and monetization options we have, like whether we're in Q4 pushing a lot of commerce, or pushing top-of-funnel, brand-lift campaigns with some of our partners. All of that goes into the predictive models, and those send back a set of articles, or a ranking for those pieces of content.
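Here is a hypothetical sketch of that decide step: features from all three domains are posted to a model-serving endpoint, which returns a ranking. The endpoint, payload shape, and field names are invented for illustration.

```python
import requests

# Hypothetical decide step: POST features from the three domains to a
# model-serving endpoint and get back a content ranking. The endpoint,
# payload shape, and field names are all invented.
def rank_recirculation(user, context, monetization, candidate_ids):
    payload = {
        "user": user,                  # e.g. {"topics": ["fashion"], "visits_30d": 12}
        "context": context,            # e.g. {"article_id": "vogue-123", "platform": "mobile"}
        "monetization": monetization,  # e.g. {"quarter": "Q4", "goal": "commerce"}
        "candidates": candidate_ids,
    }
    resp = requests.post("https://models.example.com/v1/recirc/rank",
                         json=payload, timeout=0.25)
    resp.raise_for_status()
    return resp.json()["ranked_content_ids"]
```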
We pass that ranking back to our multivariate testing systems or our personalization engines, and those actually render the page based on what the model said. That happens over and over again, millions of times per hour across Conde Nast. What's cool about this, and why I really fell in love with this type of feature, is that if they're set up correctly, every time you go through this feedback loop you improve the system just a little bit. When we deploy a new feature, we have zero data and signal about how it will perform, but as these loops run, we're building a better and better system. Historically, this was done manually: when you hear people talk about retraining models, or pulling down and redeploying models, they're talking about manually improving that feedback loop.
More recently, a lot of our models, the ones we use for recirculation, for advertising, and for defining user propensity, are auto-updating. That means that every time someone views something, or in a lot of cases at a fixed interval, say every hour, the models are retrained, so the improvement happens behind the scenes. The models get better and better without us actually doing any work ourselves. I think that's the key to this type of model in the context of companies like Facebook or Google or a lot of the big media companies now: these models are designed so they add value to the company incrementally without much additional human interaction. You have cooks in the kitchen adjusting things and making sure the levers are pulled the right way, but you don't have incremental development going on the way you would with a completely human-driven feature. This is the core of AI-driven models. I won't go too deep into the technicalities of building these, but basically we model them within specific domains, which I'll talk about.
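A minimal sketch of that auto-updating pattern, assuming you already have functions to load recent signal, train, and deploy; a real pipeline would use a workflow scheduler rather than a sleep loop, and the hour interval just matches the example above.

```python
import time

# Sketch of an auto-updating model: retrain on a fixed interval from the
# freshest signal. Real pipelines would use a workflow scheduler rather
# than a sleep loop.
RETRAIN_INTERVAL_SECONDS = 3600

def retrain_forever(load_recent_events, train_model, deploy_model):
    while True:
        events = load_recent_events()  # clicks, scrolls, opens since last run
        model = train_model(events)    # fit on the newest signal
        deploy_model(model)            # swap into serving; no human in the loop
        time.sleep(RETRAIN_INTERVAL_SECONDS)
```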
So, the really interesting thing happens with feedback loops when you turn them on their side, and I think this is the core reason AI-driven features have taken such focus over the last few years. When you've built a feedback loop like this, just like when you've built a really high-performing team developing features in general, every time you go around the loop, you get slightly better at what you're doing. Take the example of our Vogue recirculation models: every time we go from act, to synthesize, to decide, to act, we know a little bit more about our audience and our users. That's not because our teams know more, or because people have become more seasoned in their jobs; it's because we have a stronger signal and more data to train our models with. We're constantly retraining our models, and that adds incremental value that makes them better and better at their jobs. If you look at Google or Facebook, you'll see they've built their whole businesses on this model of incremental improvement, automated because of the way their ML systems are set up. It's really cool; it's the root of my curiosity about these features and the reason we've been driven to build so many of them within Conde Nast.
Okay. The second thing to know about AI-driven feature development is that it relies on an understanding of your domains. Just like the feedback loops from before, understanding your domains is obviously important to developing any feature, but when you're building an AI-driven feature, knowing your domains matters because you need to translate them into things that the computers or servers making your recommendations can understand. At Conde Nast, we look at this in terms of three major domains. The names fluctuate when we're discussing things, but roughly, the first domain is our content. Content is represented by a content identifier or an image identifier of some sort, plus traits: who wrote it, what it's about, which keywords are important to it. When we look at our content model from a data standpoint, there are hundreds and hundreds of traits for each piece of content. Our second domain is obviously our user base, and for almost any business, users will be one of the domains you need to understand. Users are usually represented by some sort of ID, whether a session ID for anonymous users or a hashed email address for users we have seen before. That's important because we need to be able to link an individual user to the individual pieces of content they've seen, or to the propensities they have, a propensity to subscribe or a propensity to view additional content. There are dozens of other traits beyond that: we derive a lot of first-party traits about our users, like what types of content they're interested in and how often they come to our sites, and we use those when we're building AI-driven features.
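A small sketch of that identity scheme; the normalization step and the choice of SHA-256 are illustrative assumptions, not a statement of Conde Nast's actual policy.

```python
import hashlib
import uuid

# Sketch of user identity: anonymous visitors get a session ID, known
# users a hashed email. Normalization and the choice of SHA-256 are
# illustrative assumptions.
def user_id(email=None):
    if email is None:
        return f"anon-{uuid.uuid4()}"  # per-session anonymous ID
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```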
It's really common to have a model that associates a piece of content with a user who might want to read that type of content. The third domain we sometimes call monetization, sometimes our ads business; here I'm calling it our experiences. These are the actual contexts, the actual applications, that tie our content together with our users. They matter because content doesn't stand alone: if these were just pieces of content shown in one place, we wouldn't need this domain, but a lot of the time we'll show a piece of content to a specific user on a mobile version of our sites, on the desktop versions, possibly within a mobile application, sometimes within our videos or other media like our set-top-box applications, and other places. And I apologize for that low-flying airplane above me right now. Basically, our first goal, and something we tell any product manager in this space who's thinking about these types of features, is that they need to understand the domains they're working with. Not every feature requires you to understand all three of these domains, but you definitely need to understand any domain related to your feature.
A good example here is a feature we released fairly recently that auto-recommended editorial tags to our editors. In that case, we needed to understand the individual editor: what tags they'd used in the past and what types of content they'd written. And on the other hand, we needed to understand the piece of content they were writing, so we had our NLP systems extracting keywords and understanding that content. It's important for the folks building any of these features to understand both of those areas and how they might relate to each other; that's what lets you come up with a hypothesis.
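As a toy sketch of how such a tag recommender might combine the two domains, scoring by editor history and boosting matches with the extracted keywords; the weights are invented.

```python
from collections import Counter

# Toy tag recommender: score candidate tags by how often this editor has
# used them, boosted when they match keywords the NLP system extracted
# from the draft. The weights are invented.
def recommend_tags(editor_tag_history, extracted_keywords, top_n=5):
    scores = Counter()
    for tag, uses in editor_tag_history.items():
        scores[tag] = uses
        if tag in extracted_keywords:
            scores[tag] += 10 * uses       # strong boost for on-topic tags
    for keyword in extracted_keywords:
        scores.setdefault(keyword, 1)      # also propose brand-new tags
    return [tag for tag, _ in scores.most_common(top_n)]

print(recommend_tags({"fashion": 40, "met-gala": 12}, {"met-gala", "tina-fey"}))
```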
That brings me to step two, which is how these things relate. It's obviously extremely important for anyone building a recommendation feature to understand what they're recommending and on what basis you might recommend something, and something that happens a lot is that we come up with false causation. Food content is a great example. When we started doing a lot of food content recommendation, we looked hard at what a user might have looked up before, what ingredients they wanted, all kinds of really deep information about our users, to build our recommendation model. On the content side, we needed to understand what ingredients were in our content, all kinds of things. We came up with a pretty complex model that tried to do personalized recipe recommendation. What we found over time, after testing a few variations, was that there were other contextual signals way more important than a user's past behavior. The biggest were time of day and day of the year. When we looked at what content worked best, we found that specific recipes performed very differently at different times of the year. Time, and often geography as well, are what we'd consider experience or contextual attributes. So we started building models for recipe content that took day and time into account as attributes, along with which times and days a piece of content had performed well in the past, and that substantially increased the click-through rates on those recommendation models.
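A sketch of that lesson: fold contextual time features in alongside the user history. The feature names are invented.

```python
from datetime import datetime, timezone

# Sketch of the lesson from the recipe models: contextual time features
# alongside (and often outweighing) user history. Feature names invented.
def recipe_features(user_traits, now=None):
    now = now or datetime.now(timezone.utc)
    return {
        **user_traits,                           # past behavior still helps...
        "hour_of_day": now.hour,                 # ...but these contextual
        "day_of_week": now.weekday(),            # signals carried most of the
        "day_of_year": now.timetuple().tm_yday,  # lift for recipe content
    }
```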
You can obviously take this to as many degrees as you want, but I think step three is the interesting one: asking yourself, when you're building a feature, how that feature relates to all of the other domains involved. A good way to look at this is to start with your subject. Say it's your experience: how does this specific experience relate to a specific user with specific content? That's a question we ask. If we create an inline link-serving module, which we do a lot, we do automatic hot-linking within our sites, how does that behave when the content is about a certain subject and is being served to a different audience group? That's the foundation for a lot of the hypotheses behind the features we build. We rarely claim something will be so powerful that it affects all users across all content and all contexts, but we do focus on as many user groups, as many content types, and as many experiences as we can, and that's what gives you the ubiquity we talked about earlier when we were discussing how to build a platform.
Obviously, the broader, more general concept in AI is generalized AI, a computer's ability to handle a variety of situations, and the holy grail for machine learning engineers somewhere like Google is to build systems that solve really general problems. The same goes here: the wider the context and the more problems the model can see, the better off you are. We strive for personalization models, say for our pages, that are as broad as possible, but usually they end up much more segmented. Usually we'll find a model that works really well for a set of people in a specific context, and that's okay. But you need to understand deeply what you can do with each of these: how users are identified, how content is identified, how you know where you are in an experience, and what data the computer has to work with in that context. Cool. So, I mentioned these already; this will be a quick one. I think it's important to specify the key ingredients for building an AI-driven feature. As I mentioned a couple of times, everything we have in any domain has an identifier, and it's extremely important when you're building anything related to data that you can identify each domain entity uniquely. A user always needs to be uniquely identified; a piece of content needs to be identified; and what we call the context, the page of a website, the screen of an application, or even the specific spot within a page, needs to be identifiable. That allows you to attach traits and information to them. In our case, the traits number in the hundreds for our users, in the tens to hundreds for a piece of content, and at least in the tens for our experiences and advertisements. And last, at the end of that slide, are the relationships. Every piece of content has a relationship to a user who's looked at it: they've either scrolled or they haven't, they've bounced or they've clicked through to another page. Those relationships help you understand a user's connection to a piece of content and allow you to build a feature off of that.
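A minimal sketch of those three ingredients as data structures: unique identifiers, traits attached to them, and relationships between the domains. The field names are illustrative.

```python
from dataclasses import dataclass, field

# Sketch of the key ingredients: unique identifiers, traits, and
# relationships between domains. Field names are illustrative.
@dataclass
class Content:
    content_id: str
    traits: dict = field(default_factory=dict)  # author, keywords, topics...

@dataclass
class User:
    user_id: str                                # session ID or hashed email
    traits: dict = field(default_factory=dict)  # interests, visit frequency...

@dataclass
class Experience:
    experience_id: str                          # page, screen, or slot on a page
    traits: dict = field(default_factory=dict)  # platform, placement, format...

@dataclass
class Interaction:                              # the relationship between domains
    user_id: str
    content_id: str
    experience_id: str
    event: str                                  # "scroll", "bounce", "click"
```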
With these ingredients, and thinking in terms of those domains and those feedback loops, you can build a whole myriad of features, and next I want to take you through what these AI-driven features look like at Conde Nast. So, let's talk about some of the features Conde has built. The first is personalized recirculation. About three years ago, we started doing personalized recirculation. That meant looking at things we knew about our user domain and about the context, time of day, as I said, but also things like what platform you were on and where you were viewing the content, and using those to personalize our recirculation recommendations. When we talk about recirculation, we include things like hot-linking, and I think one of the more interesting applications was a system we built to extract names and entities from our content, Tina Fey and Amy Poehler in this case, then look for the content within our site about Tina Fey or Amy Poehler that was most relevant to that user, and auto-link to it. At the end of the day, a user who clicked on that link would be directed to another article about Tina Fey that also took into account things we knew about them. We saw some really interesting things happen, where the linked content related to something else the reader had been reading: if they were interested in the Met Gala on Vogue, this Tina Fey mention might link to an article about Tina Fey at the Met Gala. Now, obviously there are a lot of parameters to this. You don't want to link to an extremely old piece of content, and you don't want to link to a piece of content where Tina Fey's name might be mentioned a lot but which is really about someone or something else.
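Those guardrails might look something like this sketch, with invented thresholds and trait names; it assumes each candidate article carries a published date and a traits dict.

```python
from datetime import datetime, timedelta, timezone

# Sketch of the auto-linking guardrails: only link an extracted name to an
# article that is recent and genuinely about that person, preferring overlap
# with the reader's interests. Thresholds and trait names are invented.
MAX_AGE = timedelta(days=365)

def pick_link_target(entity, candidate_articles, reader_topics):
    now = datetime.now(timezone.utc)
    about_entity = [
        a for a in candidate_articles
        if now - a["published"] < MAX_AGE                 # not too old
        and a["traits"].get("primary_subject") == entity  # about them, not a passing mention
    ]
    if not about_entity:
        return None
    # prefer articles overlapping what the reader has been into (e.g. the Met Gala)
    about_entity.sort(
        key=lambda a: len(set(a["traits"].get("topics", [])) & reader_topics),
        reverse=True)
    return about_entity[0]["content_id"]
```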
So when you're building these models, there's a lot to think about, and that's why this deep understanding of your domains is really important. The other thing we think about when we create features like this is how to validate them: how do you watch and make sure that feedback loop is actually closing and improving each time? With this feature, we saw it improve a fair amount over time, although we found it kind of plateaued after a while; you couldn't get these straight personalization features, with a single or universal model, to work more than about 10% better than when they started. So what we did, and what applies to a lot of the features we've built since, is treat them as families of models. When we try what we'd call a personalization model, we may have five or six variations of that model running at the same time, using what's called a multi-armed bandit. We run those models in parallel and test them against different cohorts of our audience, so we're actually experimenting with multiple models at the same time. For some people, the most popular article about Tina Fey might always be the one that wins; for other people, we may test a contextual model that really targets what they've read before; and at different times, different models will work for different audiences. That layered approach is really important to making this type of system successful.
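A compact sketch of that bandit layer, using Thompson sampling over a family of models; the model names and the uniform priors are invented.

```python
import random

# Sketch of the bandit layer: Thompson sampling over a family of models,
# with Beta posteriors on click-through. Model names and priors invented.
stats = {name: {"clicks": 1, "misses": 1}
         for name in ("most_popular", "contextual", "collaborative")}

def choose_model():
    # sample a plausible CTR for each model and serve the optimistic winner
    return max(stats, key=lambda m: random.betavariate(stats[m]["clicks"],
                                                       stats[m]["misses"]))

def record_outcome(model, clicked):
    stats[model]["clicks" if clicked else "misses"] += 1

model = choose_model()
record_outcome(model, clicked=False)
```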
Another area where we've really honed this is experience optimization. We know that different users, in different contexts or experiences, respond to placements of content in different ways. There was a whole human-driven study maybe five years ago that found slide-ups from the bottom of pages had a pretty significant positive impact on users. What we found over time was that depending on the situation, the time, or the type of device you're on, those can actually be detrimental: they increase bounce rates or exit rates. So we've built a set of systems that let us run those experiments in parallel, and then use that same multi-armed bandit, or auto-optimization algorithms, to change where these things are placed based on the person and based on the device. A lot of the time those are sort of black boxes: we put in a lot of information about the context, what a user is using to view the content, what content is on the page, what time of day it is, what we know about the user from the past, and the models spit out versions or responses for that placement without actually giving us a full explanation. Over time, we've honed and improved these models, but they're constantly getting better on their own: as we retrain them, we get a stronger and stronger signal, and we're seeing increases in click-through rates in response to these types of units. The third area where Conde Nast's AI-driven features have had a lot of impact is our pushed content. In addition to changing the order of content in our emails and notifications, we're now using AI to make decisions about when we send email, what ads are served in that email, and in some cases whether we send the email at all. We're timing out users, or decaying the number of emails we send to users over time, based on their likelihood to open them, and also based on things like what content we know they've enjoyed in the past. So a lot of the time, in some of our bigger newsletters now, we're making decisions based on the content in the newsletter about which audience will get it, and those are all dynamic. If you've never clicked on an article about politics, let's say, and we have a newsletter that's heavy on politics, we'll actually adjust your likelihood of getting that email downward.
It's still probabilistic, though: we still include audience members we think have a low likelihood of opening, mainly to test and validate those models, and every time we do that and they improve, we've got something that helps the AI-driven feature improve. We're always thinking about how to build these features in a way that gives us that signal back and improves it.
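A toy sketch of that decaying, probabilistic send decision; the exploration floor is an invented number.

```python
import random

# Toy sketch of the decaying, probabilistic send decision: send probability
# tracks predicted open propensity, with a floor so low-propensity users
# still get occasional sends that keep validating (and retraining) the
# model. The floor value is invented.
EXPLORATION_FLOOR = 0.05

def should_send(predicted_open_propensity):
    return random.random() < max(predicted_open_propensity, EXPLORATION_FLOOR)

# A reader who never opens politics newsletters might score ~0.01, but the
# 5% floor keeps enough signal flowing to check the model against reality.
```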
The next feature is less an explicitly AI-driven feature and more one that allows us to power more complex AI within our applications, and that's what we call advanced interaction. The label is a little abstract, but the concept is basically that when you're building features that take data about your users or their context, synthesize it, and make decisions, the more proprietary and specific to your experiences that information is, the better. With brands, especially brands that are specific in their outlook, things like Golf Digest or Brides, we need information that's specific to those users. So over the years, we've done a lot to gather information about people beyond the real basics like their age or the time of day. Golf Digest is one where we got a lot of information early on that I thought was really cool: we were able to find people's favorite club types, the courses they enjoyed, and the specific shots that were difficult for them, and then actually surface content and information that was really relevant to their level of play and their interests. And that's it. I know this was pretty high level, and I'm happy to delve into more detail in the question-and-answer section, or feel free to reach out to me by email or on LinkedIn. I look forward to hearing from you. Bye.