Serverless SQL: The future of database technology


So this is a very exciting moment for me. Post-pandemic, this is the first time I've stood in front of a big crowd like this, and it's excellent to do it in our offices, so thank you all for coming. Jim gave an excellent welcome, but I did just want to really thank you all. It's amazing to see this kind of attendance.

For those of you that don't know me, my name is Spencer Kimball, and I am the CEO and one of three co-founders at Cockroach Labs. We've been building this database now for about eight and a half years, if you can believe that. Time does fly, though it really seems like only yesterday that we were three people, and we're now at 478. We're approaching 500, and that scale for the company is fundamentally tracking the scale of our customers and the usage of the database. So that's extremely exciting, and I like that there's some proportionality in those numbers.

We also have over 30,000 clusters that we're currently actively managing across our cloud platforms. That means both our serverless, where there's quite a volume and that is increasing all the time, but also our dedicated, the cloud database that we have that's fully managed. That, of course, started as a very white-glove service where everything was really done manually, and now it's fully automated and very self-service. The state of the art is moving constantly with both of those products, so that's very exciting.

We also have many clusters, in fact a somewhat unknown number, out there that are self-hosted by our customers. Many of our customers started as self-hosted, maybe expanding into the cloud, so there's some hybrid usage there. And there are still some customers that are actually taking a different path than the public cloud and building their own private clouds, sometimes with incredible scale. So it's really interesting to see the wide diversity of deployment models in companies that are actually employing
Cockroach. Jim already showed this list of logos. It's a great set of presenters, and I wanted to thank all of these folks in particular. Some have come from all the way around the world; Devsisters is a great example of that, so thank you. And of course some of these are the most well-known logos. I think what Jim already said bears repeating, and that is that if there's a single thread that runs through all of these companies, it's that they're leaders in their respective verticals. They're pushing the boundaries in terms of how they use technology, and we're happy to be part of that journey with them. So I'd love it if everyone could just give a round of applause to the presenters from these companies. Thank you.

Great. So I'm going to talk a little bit about the current, and then what we believe is the future, state of databases. This is really how we decide what to build and with what priority. It's kind of our model for how reality looks, how folks are using databases, and how we expect them to evolve their usage of databases.

Everyone is aware of the traditional database market. It was roughly split into OLTP, the transaction processing, and OLAP, the analytics processing. That's a fairly simple formula for how this market works. It's changing rapidly, and the forces that are changing and evolving it are many, but these are three to call out.

One is that there's just this modern era of data. There's more data than ever before. There's this data intensity. You can collect it all, but increasingly there are tools in the market that allow you to use this data, to draw insights out of it, and to provide a fundamentally richer user experience. That in turn drives user expectations. In your vertical, if there's a competitor that has provided a new service that actually uses data to great effect and pleases or even delights the end user, that's something that everyone also has to accommodate in their products in order
to really compete in the market. We have customers, and this is kind of mind-blowing, because I got my start in the client-server era in the 90s, maybe even in the late 80s when there wasn't even client-server, that have applications that, in the course of a single operation, can touch the database more than 5,000 times. If you had expected that to work 20 or 30 years ago, you couldn't have run very many concurrent operations or had very many concurrent users. But these services are handling tens or hundreds of millions of these requests a day. So it's incredible scale out there, and that of course is driving those end-user expectations, which need to be fulfilled.

And then there's this third part here, about the infrastructure expertise that's necessary to run these increasingly complex data architectures that underlie these new applications and services. It's hard to even hire these kinds of people. These are the SREs and the DBAs, and also the software engineers. Just in 2021, I believe, the amount of money that went into venture capital was twice the total in any year prior to that. So there's this incredible influx of money that is fueling this evolution, and that is fundamentally creating a massive supply and demand imbalance in the labor market. So as companies look to satisfy these end-user expectations and really use data to the maximum impact, they also need to find ways to build with the scarcity in the labor market for folks that have expertise in running infrastructure. That's really what is driving evolution here.

We see that in the next five years this traditional market is going to continue to evolve into a much more unified market. We're calling it here a single cloud data market. But think about all of these user expectations, the usage of data, the storage of the data, and how it all works in
practice. It's many different vendors, all of which are bringing different services and tools to market. All of those have to interoperate, and increasingly they all have to be run as services. So fundamentally what you're doing here is combining new capabilities through these tools and through data, and you have to balance that by also trying to reduce the complexity. If you don't reduce the complexity, it only gets harder to run, and you actually have to move in the opposite direction.

So we see that services fundamentally have to be the default. Everything has to be delivered as a service, and as we'll get to in this presentation in some detail, we also believe that that's actually moving towards serverless: higher levels of abstraction to help manage complexity. And crucially, when you think about everything as a service, it's not just one vendor. It's not just the big cloud vendors. It's a lot of different vendors working together to build any competent data architecture. So you really have to think about the services that you use being cloud agnostic, and not just cloud agnostic but public-versus-private-cloud agnostic, as there's just an increasing heterogeneity out there in terms of how people need to build.

And then these capabilities: we think it's all about distributed systems and, interestingly, also about SQL. Folks kind of predicted the demise of SQL not too long ago, and that's been proven quite incorrect. The beauty of SQL is just how elegant it is. You create a simple declaration of what you want to accomplish, and then you rely on the system that you're querying to make the best optimized plan for how to satisfy that declarative query. As you move up the levels of abstraction and have more and more services that are connected together, that becomes very important. You want the lower levels, the things that are hidden from you that are managing the complexity, to be
able to iteratively find better solutions, and SQL is a very elegant way to do that. So we actually believe that SQL is not just going to continue to be around in the next several years, but probably your grandchildren will use it. We'll see if that's true, but it's almost like our grandparents were using it. Not quite; it depends on how old you are, I guess.

Great. So where does this go? Well, we actually think that in addition to managing complexity and increasing capabilities, there's also an opportunity to change how consumption works. The reality is that serverless is actually something that adds capability and helps to manage complexity, so it's really an extension of these two, but it's a specific thing that we really think is going to affect all services. When you think about what the point of that is, it's kind of like: what do we want to do with every customer, with every use case? We want to be able to build better, with more capabilities, and to build faster, and to build cheaper. All three of those things are what we're trying to accomplish here, and consumption is really how you can fundamentally change the cost equations, especially in the cloud with services. And what's this all about? It's about builders getting more autonomy and being able to iterate more quickly.

So this is a great step back to just understand what the opportunity looks like and why Cockroach Labs exists in the first place. Just in 2020 we had almost 65 billion dollars of spend in the database management systems market, and just in a year that's increased to almost 80 billion. Of that almost 14 and a half billion dollars of market growth, 90 percent is cloud. So you can see how these trends are shaping out: massive growth in data, but also massive growth in cloud, because it's truly a superior way to build architectures.

We've had this graph for quite some time, and it's amazing how prescient it actually has proven. It's really
why Google fundamentally built Spanner, and built Megastore, and built Bigtable, and so forth. They were moving up this curve, and this curve is about developer productivity over time. How do you actually measure developer productivity? It's those two factors that I mentioned before. It's about the capabilities that you give to developers: what can they build, and how quickly can they build, using higher and higher levels of abstraction? But also, how do you manage that against the complexity? If you're going to go to multi-region, as many of you in this room know, it's not trivial. There's additional complexity there that has to be understood and managed, so how do you make that as tractable as possible? You also have to work on the complexity part of the equation. So we're constantly moving up this curve, and that's really where serverless distributed SQL takes this next.

I like this metaphor. It's somewhat imperfect, but I don't think that it's incorrect. As you can see here, I don't know if anyone remembers that black phone, but it's hard to dial to zero. Then we moved to that cool brick in the early 90s or late 80s, then to the first little handheld devices that could fit in your pocket, and then eventually to the iPhone. Fundamentally this is about managing some of the complexity, but in the process of managing complexity you open up the playing field to accommodate increasing capability. The fact that you can put one of these things in your pocket changes the game in terms of what you can expect it to do, and that's really where the iPhone has taken a huge leap forward. We'll see what's next, but I bet this will continue to evolve.

Everyone has seen these photos. I don't know how many of you have frozen yourselves in one of these server rooms trying to install these things. That was how it worked in the dot-com days. Interestingly, even though it's rare, except for, you know,
some large companies and folks that are building their own private clouds, to actually interact directly with these kinds of servers, we all still do it through orchestration and a lot of different tools. That's sort of the idea behind serverless: let's move to a higher level of abstraction, so that just the ideas of nodes, and how big they are, and even potentially where they're located and what cloud they're in, and so forth, are something that we can move away from.

So if we're going to try to define serverless and what we mean by it, asking what the benefits are is, I think, the right way to think about it. Really, it's moving away from the orchestration of individual servers, so there's very little that's manual. But it's actually a little bit beyond that. I would say it's moving away from even thinking about servers. If you don't have to think about individual servers, how many of them there are, how they're connected, how a load balancer goes in front of them, what the networking looks like, and so forth, that leaves a lot more time to iterate on the actual line-of-business use case, and that's what I think most companies are looking to move towards.

Then, within the concept of serverless, because it's a higher level of abstraction, you can get better and better at doing things that used to be done manually by the application, or at the application level or layer. So this is really about truly automatic scalability. It's not about saying, okay, we need this many nodes in our capacity planning and we can grow up to that, and then, well, we're getting to 80 percent, we might need to add some more nodes, let's throw them in. Well, Cockroach automatically balances that, but you don't have to make those decisions anymore. And by the way, I'm using Cockroach as an example, but this applies across the entire stack, from the lowest levels to the highest levels, and you want that all to be completely automatic. And fundamentally, when we're talking about scale here, it's not
just about how big you can get. It's also about how small you can get, even down to zero. Certainly a very fractional use case shouldn't demand an entire server, much less three servers in order to, for example, have geo-replication across availability zones.

You also expect that you can continually improve the resilience. It's inherent in CockroachDB, but you can always make it better, and by abstracting above the servers you can continue to improve that in a fashion that's transparent to the applications and services.

And this is my favorite one: the programmatic management. You can think about doing things manually with SREs and teams, even having developers of smaller projects manage these things themselves, and that's a lot. It's a lot even for a reasonable team with good skills. If you want to actually manage 10,000 of these or a hundred thousand of these, and let's say what we're talking about is databases, imagine that, a hundred thousand databases, you have to do that programmatically. Nobody wants to touch that with a 10-foot pole if there's any kind of manual intervention necessary. So you want to be able to start these things, scale them, restart them, and scale them to zero. All of that of course should be coming from automated signals, and all of that needs to be managed through APIs, programmatically.

And then, fundamentally, with serverless you have this opportunity to pay for exactly what you use. That can drive a lot of efficiencies in cost, and you can actually do that to an arbitrary level of granularity, right down to the single request, or the single byte stored, or the single query done. So that's excellent.

So fundamentally what we're talking about is building, running, maintaining, all the day-one-plus maintenance activities, without thinking about the underlying servers. And right now, if you look at today, it's mostly that when
people hear the word serverless, they're thinking about execution layers. If you're running a big application server, that might be Fargate in AWS or Google Cloud Run. Microsoft of course has their own, and there are plenty of open-source and homegrown solutions to this. There's also an even finer level of granularity, which is actually decomposing those application servers into sort of nano-services or functions. That's an interesting approach. I don't think it's one-size-fits-all, but this whole ecosystem is exploding.

When you actually think about really deriving the true benefits of serverless, though, it's not just at that execution layer. It's all the way through the stack. Cockroach is playing down at a low level of the stack, and so we really want to bring the promise of serverless to databases. But it's a lot more than that. If you think about what you need just to run CockroachDB as a serverless database, there's plenty of concern in the infrastructure layers that surround CockroachDB. For example, there are regions, there are different clouds, there are the availability zones. You might need global load balancers and regional load balancers. You need Kubernetes clusters in all those different availability zones. It's a lot of stuff, right? You want all those things to be serverless as well, and many of them are moving in that direction.

So today I think it's important to think about where we are, what we've built, what we're building in serverless, and what that implies. It is important to note that these base capabilities, the things that we've worked on for the better part of eight or eight and a half years now, are actually required to truly build the serverless capabilities. So we're building on a foundation, and I think there's plenty more beyond that we plan to build on that foundation. But what are some of those things? Well, we set out to build a distributed database, and
we wanted to do that in a way that was familiar for developers, so we made it Postgres compatible, which I think has paid significant dividends. We've automated the scalability of it. It can find equilibrium as you add nodes and as you take away nodes. There's of course active-active resilience for business continuity. And we've tackled multi-region.

Again, multi-region, like serverless, is an example of something that we did not set out to build in the beginning, because that wasn't really the state of the art yet. But as we started to build our distributed foundation, we realized that there were customers that were very, very interested in accommodating users all over the planet. So we said, you know what, this distributed architecture that can do geo-replication, originally intended just to be close by in availability zones, we can think about this as splitting out the data and partitioning it across geographies, across continents even. So that's something that has accrued because of the foundation that we built.

And of course, when you start thinking about these high-latency links between your users, you actually start to say, well, we'd love to just do transactions within a region, but if you have to do them across regions, how do you build a transaction model that can accommodate that as efficiently as possible? How do you build global tables, for example? How do you ensure that data is domiciled only in a single region close to a customer, whether that's for data sovereignty or just the customer's preference around where their data is stored? All these are interesting problems, and again it's about the distributed core capabilities of CockroachDB.

Then when you actually think about serverless, it's essentially a consumption modality on top of that core database. One Cockroach physical cluster can accommodate potentially tens of thousands of virtual clusters. So this is really a page out of VMware's book, right? They saw all of these use
cases, each of which was demanding a single physical, sort of bare-metal machine, and many of those would use 10 or 15 percent of that underlying machine. Virtualization actually allows you to parcel that physical machine into lots of virtual machines and to use them far more efficiently. It's a very similar situation here.

But I think the really key thing when you think about databases, and this is an almost impossible capability until you see it built, is that you want to spin clusters down to zero. When you think about lots of different use cases, and I'll cover some of them, there are plenty of things out there that we create that are fractional or that are ephemeral in their usage. There's a transient utilization, and you want those to be run as efficiently as possible. You don't want something that runs for 10 minutes and then goes away to continuously demand that, for example, a VM be live. At that cost, if you have 10,000 of those or 100,000 of those, it will become absurdly prohibitive, right? So you want to be able to embrace the contours of your use case, and if you've got lots of ephemeral usage across your business entities or customers, that's what serverless is actually extremely good for.

But if you can spin down to zero, you have to spin up too, and you need that to be extremely fast. Just as an example, some of the serverless database products on the market right now can spin to zero, but they can take 30 seconds to restart. That use case doesn't work very well. You don't want to have a user latency of 30 seconds; the user will probably go away over that, or think the system's not working. So instant start is an incredible capability here. And then multi-tenancy is what underlies the whole system and makes it work.

So what does this lead to? I think the best description is something Jim came up with, and it's pretty brilliant: it's like a SQL API in the cloud, not a database that's
traditionally defined in terms of how big the node is. Oh, it's a Sun Microsystems thing that costs five hundred thousand dollars, and we want to spend 100 or 1.5 million in order to scale it up by a factor of two. That was how things used to work. We've moved away from that and moved to commodity hardware and so forth, and that hasn't made it much easier, fundamentally. But this idea, I think, actually can change the game, and it's in the process of doing so.

So when we think about our vision, it actually maps to our mission here, which is to enable every developer out there who's intentional and wants to build world-changing applications. I think that's a great evergreen vision, and we've just got to keep pushing the state of the art to make this possible. It is fundamentally about meeting developers where they are and leading them into the future, making it easier for them to build the next generation of apps and services. But we also want to make sure that we're creating efficiencies today.

When you think about the developer, it's really about getting your time back. Many developers, across many companies I'm familiar with, including ones I've worked at, have actually been tasked increasingly with running their services and scaling their services, because SREs are difficult to hire. So you scale your software engineers, and you give them the responsibility of running these services, which is great, actually. I think engineers should probably hold the pagers, at least some of the time, for what they build. It certainly aligns incentives, right? You want everyone to really build bulletproof services, not to hand them off and have somebody else suffer with them. So that's great, but you want to give them their time back. How do you minimize how much effort they have to put into these tasks that are operational, and let them build and really focus on what matters,
right? Which is the business use case. It's always the business use case. How do you delight your users? And yeah, part of that is making sure that the thing scales for them, and has low latencies, and so forth. So these operational characteristics, the data architecture, the design of it, are crucially important, but you kind of want to envision that once, and then you want it to work. When I think about the real vision for a developer, it's: what can they build on their laptop, and then can they launch it into the cloud and have it globally scale? So build on your laptop, but have that scale the way that, for example, Google launches applications. They have platforms. You build on the platform, and then when you launch you get all the benefits. That's really what we're looking to build here.

I mentioned reimagining what the future could be with these new capabilities. There are some very exciting use cases that are opened up with serverless and multi-tenancy. One that appeals to me quite a bit: imagine a big software-as-a-service application, like Workday, or Atlassian's Jira as an example, where you have potentially hundreds of thousands of end customers. Think about the ticketing system in Jira. You can create filters in order to select tickets and take a look at what's going on in your queues. Those can be arbitrarily complicated, and they can create noisy neighbor problems, as you can imagine, if someone creates a really big query across many, many tickets. So you really would like to give every one of those tenants, or end customers, their own isolated database. Now, doing that for a hundred thousand would be absurdly cost prohibitive, and no one would ever think to do it. So what do people do instead? Well, they pack a bunch of these customers into multiple Postgres servers, as one example. That can become very difficult to manage over time. Those costs are cumulative, and probably
non-linear, and of course you get these incredible noisy neighbor problems, because there's no isolation built into that kind of a system. And again, you have many, many of these Postgres shards, essentially, so you're back to that problem of shards again. But you would like to build those kinds of architectures. I mean, how many SaaS companies are out there? Probably many folks in this room are building that kind of an architecture. It would be excellent if it were trivial to simply create what feels like a completely isolated database that is specific to each customer. So that's a pretty cool one.

We also see that development teams are typically spinning up kind of toy clusters, even in pre-production and staging environments. You might have, let's say, three regions running in production, but it can be so absurdly expensive to spin up a three-region cluster in order to do pre-production work, or to do regression testing as an example, that people don't do it. So then when you actually launch into production, there are some pretty interesting surprises that happen: latencies that you didn't expect, transactions that time out and blow up the system in other ways. There are all kinds of things that can be done to work on that, but it's kind of hard to beat the fidelity of a pre-production environment that's exactly like production. That's something that, with virtualization and something like serverless, is very easy to do. You have a dev/test cluster. It could be quite big, let's say a 10-node or maybe 20-node physical cluster, and that can support many virtual clusters, as many as developers need. They can spin them up almost instantaneously, and if they don't use them, those resources are reclaimed very quickly, and if they come back after vacation, their cluster's right there for them. You're potentially hosting thousands of developers in your organization and helping them to move more quickly, to build
more quickly, and to build better. So that's another very exciting use case.

There are also fractional use cases. This is quite common. Of the companies I've worked at, some have had, say, 70 externally facing applications or services, and other ones have had about seven thousand. I think there are customers in this room whose organizations are hosting tens of thousands of externally facing applications. Many of those can be fractional, and even big applications start very fractionally. They might require just a tenth of a server, but you may want that to be multi-region. You might want to mirror wherever you have customers, and if just a little bit of usage is in Australia, that's great. It should just use a little bit of the resource, not a full server. So think about combining fractional use cases such that the peaks and the troughs can balance, and you're not stuck at that base load where you have to have an ability to scale up to the peak load. You still need a little buffer, so you're really operating at about 20 percent of your server capacity. It'd be much better if you could operate at 50 or 60 percent, because you're aggregating use cases and letting some of the peaks and the troughs balance. So these are just some really interesting new ones.

There's probably one more I'd call out, which is particularly interesting to developers, and that's new developer platforms. There are a couple that I'm very interested in: Vercel and Netlify. These are platforms that allow developers to build and deploy Jamstack applications. So it's really about the edge; this is our multi-region, and so all these things are really coming to fruition in the market out there. These companies are allowing developers to build something and then to deploy it, exactly like what I was saying: can you build on the laptop and then have that deployed globally and scaled globally? For those kinds of platforms, it makes a lot of sense to create a system of
record for every one of those deployments. As everyone knows, when a developer hits a button that connects to their GitHub and deploys some Jamstack app out somewhere, it's pretty unlikely they're actually going to use and scale that. Maybe one in a hundred of those, because a lot of that is testing and iteration and so forth. But eventually you get to some golden thing that actually starts to get some usage, and that can scale. For all the rest of those, that next-generation developer platform doesn't have to pay for all of those use cases beyond the 10 minutes they were run. So these are really incredible efficiencies, and this applies really across the ecosystem.

So we really envision this as a global service, fundamentally. It's an extension of our cloud platform, and it has all the same benefits that we've been building, the geo-partitioning, the multi-region, and additional ones as well. But again, this is fundamentally consumption based. We actually have a free tier now in serverless, and we're going to extend that, but the big goal here is that once you go a little bit past that free tier, you're paying for exactly what you use. That's the consumption basis.

And then when you take it a little step further, you realize there's a huge ecosystem out there, and it's not enough just to build the serverless database. We have to integrate with that ecosystem. We have to make it work well from top to bottom, so there's an entire architecture that can be embraced. We think of this as distributed data functions. Think about stored procedures reimagined for this new world, where the database spans many regions and many availability zones. Can it run the application logic right next to the data? Well, the database certainly knows how to do that, so that's one interesting idea. But more fundamentally, it's about how you integrate with all of the tools out there. I mentioned all those application execution layers
that are serverless now, that the clouds are providing, and many third parties beyond that. Cockroach fundamentally has to integrate with those things, and increasingly integrate with them in an automated fashion, so that that dream of going directly from what was built on the laptop to a global deployment is as simple as a single click.

We also extend this vision to the enterprise, and the right way to think about this is that it's an aggregation of all those developer experiences. You really want to allow organizations to build and to manage more, using fewer of the resources that are increasingly scarce. The database as a service fundamentally can capitalize on these economies of scale, leaving businesses increasingly, and this is quite a departure from the way things have continued to work even to the present day, to focus on the lines of business and on iterating on use cases. Faster time to value: these are great things. We also want to push the big areas of concern, like business continuity, scale, global replication, and geo-partitioning, into the database and away from the applications. And I mentioned before the excellent example of VMware and their virtualization technology. We're really taking a page out of their book, and it's about efficiently utilizing resources that are already there.

This final point here, I think, bears mentioning. It's a good one, because I know that of the customers in the room, there are quite a few folks here that are building their own database as a service internally for their developers, and serverless is an excellent extension of that effort. For the same reason we're putting serverless on our public, shared cloud platform, this is something that many of our customers will want to build internally within their organizations.

Great. So thank you all. I hope that was an interesting preview, I guess, of what we're planning to build and
a summary of the things that we've been tackling over the last eight and a half years. These next two days, I really want to stress, are about engagement and also alignment across us as a vendor and you as our customers and prospects. The best way to do that is of course communication, and it's bi-directional. So I hope everyone is coming here to really engage and to learn from each other. There are experts everywhere, and it's not just the folks at Cockroach Labs. Although if you want to find out how the database is built, those are probably the right experts, we want to find out how the database is used, and you're all the experts on that. And fundamentally: what's working, what's not working, where do we invest more, where do we fix and iterate? So please come up and talk to me, talk to Nate, talk to everyone on the team. We're fundamentally building based on the signals we get from you, and so I think this is an excellent forum, the best we've ever had. So thank you all for attending.
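
The point in the talk about combining fractional use cases so that peaks and troughs balance can be made concrete with a little arithmetic. The following is only a rough sketch in Python, not anything from Cockroach Labs: the tenant shape (one busy hour a day), the 25 percent headroom, and every function name are assumptions invented for illustration. It compares sizing capacity per tenant for that tenant's own peak against sizing one shared pool for the peak of the combined demand.

```python
def dedicated_utilization(workloads, headroom=0.25):
    """Average utilization when each workload gets its own capacity,
    sized to its own peak demand plus a safety buffer (assumed 25%)."""
    hours = len(workloads[0])
    capacity = sum(max(w) * (1 + headroom) for w in workloads)
    used = sum(sum(w) for w in workloads)
    return used / (capacity * hours)


def pooled_utilization(workloads, headroom=0.25):
    """Average utilization when all workloads share one pool,
    sized to the peak of their combined demand plus the same buffer."""
    combined = [sum(values) for values in zip(*workloads)]
    capacity = max(combined) * (1 + headroom)
    return sum(combined) / (capacity * len(combined))


def bursty(peak_hour, hours=24, base=1.0, peak=10.0):
    """Hypothetical tenant: low base demand with a single busy hour per day."""
    return [peak if h == peak_hour else base for h in range(hours)]


# 24 made-up tenants whose busy hours are spread around the clock,
# e.g. customers in different time zones.
workloads = [bursty(peak_hour=h) for h in range(24)]

print(f"dedicated: {dedicated_utilization(workloads):.0%}")
print(f"pooled:    {pooled_utilization(workloads):.0%}")
```

With these invented inputs, the dedicated case comes out at about 11 percent utilization and the pooled case at about 80 percent. Real workloads overlap far more than this idealized example, so the gap would be smaller in practice, but the direction matches the argument in the talk for multi-tenant, serverless consumption.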

2022-10-27
