Hello, everyone, and welcome to our talk on persistent memory. Now, I'm gonna ask a question, and I want to see a lot of raised hands or you're gonna hurt my feelings. I love this guy. All right, so: how many of you remember a world before the cloud and before big data? Thank you.

All right, so our world is changing, and I'm gonna tell you a personal experience: going to the Grand Canyon with my brother, father, mother, summer vacation when I was growing up. We went to go to the Grand Canyon, and Dad says, "Hey kids, let's get in the station wagon with the wood-paneled sides and let's go get a TripTik." How many of you know what a TripTik is? Oh. Oh, now I'm really old, okay. So you would go to Triple A, they would print out the route that you were gonna go, and they would help you map what you wanted to stop at along the way, and you'd get this book of maps for the whole trip. We thought that was so cool, and that's what we were doing. In that world, we were not generating mountains and mountains and mountains of data like we are today.

Fast-forward to MapQuest. I thought that was the best thing since sliced bread, right? I could print out, on demand, where I wanted to go. So I was starting to consume data, but I was just at the tip of the iceberg. And then we had turn-by-turn directions; for those of you that are directionally challenged, that was life-changing. I don't go anywhere without my phone. Add to that now the world of Waze, where we're crowdsourcing the data: we're actually generating data as we drive for the rest of the world. So you guys know where I'm going with this story.

So let's look at the amount of data that's being produced, as we speak, in today's day and age. I quote these numbers over and over again because they give them to me in a slide to quote, but let's think about what we're doing with that data. There are zettabytes of data that businesses are producing, but we're not able to take advantage of it because it's just sitting in repositories, and it's a liability rather than an asset. We believe we have a technology that can help us on the journey to changing that, so that when we reach 175 zettabytes in 2025,
the world is able to process every inch of that data, make smarter decisions, and do more with that data.

So I'm going to talk to you about the most loved memory-storage hierarchy. We've all seen it a million times. I've had it drawn this way in particular because we tend to think of it as a continuum, but it's not really a continuum: there are big gaps in performance and in cost as you jump between DRAM and storage, from the fastest tiers on down to hard drive and tape. And for 50 years, the architects of the world have had to adjust the way that they program to minimize their memory usage, because it's very expensive and the capacity is limited. We also see that DRAM isn't scaling at the same rate that our data is. So what if we could have this amazing technology where we could have, you know, double, triple the capacity, give up very little of the performance, and do it in an affordable way? That seems like a dream come true.

And today I'm here to announce: we just went live last week. This is half a terabyte in my hand. Who ever thought we would see that day? And this is now broadly available with Intel Xeon Cascade Lake CPUs, and we believe that we're going to be changing the world with this technology. We're just beginning on the journey.

So I'll tell you a little bit about it. It comes in modules of 128, 256, and 512 gigabytes. To help level set: you know, 32 to maybe 64 gigabytes is kind of the sweet spot today, so this is much higher capacity. It also has the ability to be spoken to in either memory language or storage language, so it's byte-addressable and block-addressable in 4K blocks. It has multiple modes: we have a Memory Mode and an App Direct mode, and I'll talk a little bit about those, because I know several of you in the room are developers and you want to know what this means for you. And it's natively persistent, so the data is at rest right there next to the CPU; that changes the way we think about our data. It's also highly secure: we have hardware encryption. It fits in DDR4 DIMM slots, and it's also physically the same form factor, so it's not terribly intrusive to the system designs that we have today.

So let me touch a little bit on the modes. We have an App Direct mode, and that just basically means that the application is able to smartly place the data where it belongs. If it's hot data that's being used on a regular basis, it can reside in the DRAM; if it's colder, cooler data, it can reside either in the Apache Pass... I should have just started by calling it by its real name... in the Intel Optane DC persistent memory, or in storage. And we have a whole range of ISV partners that are helping to redefine and redevelop the architectures of their software to take advantage of this new capability. The application and the OS actually see two separate, distinct pools of memory, and that's what is encompassed in App Direct.

When you look at Memory Mode, we have this mode for applications that haven't yet been optimized to take advantage of the capability. And so what we like to say is that it's fairly easy to get in and try the technology. The application and the OS see a single pool of memory, like they always do, and they don't realize that the DRAM that's installed alongside the Optane DC persistent memory is actually being used as a cache to help speed things up. So it's fairly transparent.
You don't always get the maximum benefit of the technology when you're in Memory Mode; you know, to get that best performance, you want to be App Direct optimized. But this allows you to get in and try the technology, and still take advantage of the capacity and the affordability that the product brings.
So when you look at the new way that we envision the memory-storage hierarchy, you can see that we put the Optane DC persistent memory right in that large gap that existed between storage and memory. We also have lots of improvements that Intel's been making in our 3D NAND SSDs, and this same media is available in a solid-state drive form factor as well. So we're working on really giving the architects and software developers of the future that full range of products to design around.

All right, so when I'm talking to people about this technology, sometimes it takes a little while to grasp what it means when you move between these different types and tiers of memory and storage. So we like to think about it in terms of what problem we're trying to solve. If you've ever said to yourself, "Wow, DRAM is so expensive, I wish I could afford to put, you know, more data in memory, but I just can't," then this has the ability to help save money when you pair it with DRAM. You also have the ability to improve your memory-to-core ratios, so you can do more with existing infrastructure. And then in the App Direct mode that we talked about, if the solution is optimized, you also have the ability to really help break the performance barriers that you were seeing if your application was memory limited, and you can see a lot more performance.

So I have one final slide. I want to talk to you about the partners that we're working with in the cloud to provide ecosystem support for this technology. Yes, I said it: Intel is innovating on the software side. It's really important, because we need the ISV partners to be on board, we need the service providers to be on board, and the OEMs have to be on board; you need all of those elements for the solution. We've been working over a decade to make sure that we have all of the pieces pulled together for this solution, and these are some of the key partners that we've been working with over the last several years. You hear us talk a lot about SAP HANA, but we have a lot more beyond that: we're working with SAS, all of the ISV and OSV providers, as well as our partners here at Google.

So I'm gonna bring up on stage next, and thank you guys for your time today, our next speaker, who is gonna talk to you about what Google is doing with us.

So, I hope you guys can hear me, because I can't hear myself, and I hope I can keep your attention while the band is playing back there. So, Christie talked about the memory technology and the landscape in general. I am hoping to let you know how this technology is actually being realized in Google Cloud, how you can go and use it, and what different use cases we've been experimenting with.

So we partnered with Google around three years ago on this technology, to bring it to Google Cloud, and they announced it back in March... I'm sorry, November of last year. That was actually a few months before our own product launch, so a lot of work went in from both sides, engineering work, to bring this product to market, and they were the first cloud service provider to bring this into market.
If you look at what this technology translates to in terms of the VMs that you have in Google Cloud, basically this is providing more memory for you. If you look at the VMs that are enabled by this Optane memory versus the previous generation, there's actually 4x more memory per VM that this enables. So: more memory, that is faster than SSD, that's sitting closer to your compute, and it's basically persistent, right? So just putting all of that together, it creates a lot of different use cases that are interesting to enable.

Some of the use cases use the persistence of the memory; some don't. So if I were to talk about some of the use cases that we've been exploring: one of the main ones is, basically, because you have this memory per VM that's close to your cores, you can actually use it as a sort of cache for your I/O-intensive applications, right? This keeps you from going back and forth to the I/O, so the I/O latency is avoided. That's one of the use cases. We've actually done some experimenting with that use case: we've done some Spark SQL analysis inside GCP, and we've seen about 4x better performance for large data sets.

Another use case that we explored was looking at the persistence of the memory. The persistence allows you to basically keep some of the key values for your data inside this memory, so when your VM is restarting, you have the data available close to you and you don't have to go through the whole loading of the data. Basically, that translates to faster restart times. We've looked into that application for different in-memory databases, and for one of the ISVs that we've worked with, we've seen the load times go from a little more than 20 minutes to less than 20 seconds. So that's the kind of improvement.

And then the last thing that we've explored in this environment, the last value proposition, is looking at the memory size and how you can condense more instances into one VM. So, you know, if you have a database with multiple instances, you can basically put everything into one single VM, and especially if it's not very core intensive, if it's more memory intensive, you can potentially have some cost savings there, because you're not paying for all the VMs that you would normally need to get that much memory. So those kind of summarize the value propositions in the Google Cloud environment.

Now I'm going to tell you a little bit about what shapes of VMs these are offered in. As you can see, there are two different VM shapes that this memory is offered in. The first VM is a smaller VM shape: it offers 96 vCPUs, and it has 624 gigabytes of DRAM, which is tied with 3.6 terabytes of Optane memory. Then there is a bigger-size VM that has 1.2 terabytes of DRAM, and it ties that with 5.5 terabytes of Optane memory.
The totals of the memories, as you can see, are 3.6 terabytes and 6.7 terabytes. Both of these are running on two-socket Intel Xeon processors, the next generation, basically the Cascade Lake codename. And they're both single tenant today, the way they're offered, meaning if you get the instance, you basically have control over the whole Optane memory.

The two instances are both in App Direct mode. Christie talked about App Direct mode, and Andy is going to talk more about the details of how to program for App Direct mode. But basically, at a very high level, App Direct mode is the mode where you can take advantage of the persistence of the memory. You have the address space of the Optane separate from the address space of the DRAM, and the application gets to choose where to write what. So you can have an architecture where you store your hot data in the DRAM versus your cold data in the capacity tier, the... I said the code word again... the Intel Optane DC persistent memory.

So basically these VMs are enabled in App Direct mode, and, you know, we've been working with different partners to enable their applications for this mode, and we'll get into some of the details of what we've been doing.

Just a quick note on how you can get access to these VMs. That link on top is basically where you can request access for an alpha instance; these offerings are in alpha right now. Basically, this is the first phase for Google products: when they launch, they go into alpha, and then there's general availability. And this is the command line that you can use to instantiate these instances; there's a sketch of it below. If you notice, there's a new flag, for those of you who are familiar with the CLI: that flag basically lets you, when you instantiate the VM, set the local NVDIMM size. This one is set for a 5.5 terabyte instance, which basically gives you the 6.7 terabytes of total memory. The instances are initially available in Iowa, that's us-central1, but they're going to be rolling out to other zones. So if you want to experiment with this today, that's the zone you want to pick. And then there's always support from Intel and Google: you can go to that website for the different, you know, programming models, if you're bringing your own application to these instances or if you're trying some of the open-source software we've been working on. That's the place to go to get some support.
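For reference, here is a hedged sketch of what that alpha command line looks like. The flag follows what's described on the slide, but the instance name, machine type, and zone below are placeholders, not verified values, and the exact flag syntax may have changed since the alpha; check the alpha signup documentation.

    # Hedged sketch of the alpha CLI invocation described in the talk.
    # <MACHINE_TYPE> is a placeholder; the --local-nvdimm size flag is
    # quoted from the slide and its exact syntax is an assumption here.
    gcloud alpha compute instances create my-pmem-vm \
        --zone us-central1-f \
        --machine-type <MACHINE_TYPE> \
        --local-nvdimm size=5.5TB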
So now I want to talk to you about some of the use cases that we've been running on this technology. There are a lot of use cases, with different partners, that were talked about during our launch, and, you know, the collateral is available for you to go review. I would actually encourage you to go take a look if you're thinking about your own application and running it in the Google Cloud environment. But here I just want to focus on a couple of use cases that we have experimented with and enabled in the Google Cloud environment.

We've had access to these instances for a couple of months now, so we've been running different applications. One of them has been a Spark SQL use case, which our teams have been running in the Google Cloud environment. If you look at the architecture and how we've modified it for Spark SQL: we've actually created that OAP cache that sits between your storage, your Google network storage, and your compute. And the way that works is, when you send a query from your compute to the I/O, the first time it goes to the I/O and you pay all the I/O latency; the result is then stored into the OAP cache, so for the next iterations you're basically going right into the Optane persistent memory and you're not seeing the latency of the I/O. That's how this gets the performance. And if you look at the experiment that we've done, the performance is actually showing, you know, around 4x better performance. This is not the final picture; we're still working on some optimizations, so we're expecting this number to grow, even to two times what you're seeing here. But basically, what I wanted to emphasize is that with that package, the OAP package in Spark SQL, you can get up to 8x performance improvement on your queries.

What I wanted to highlight: this is for a case where your data is really big. It's a big data set; it doesn't fit in your DRAM, but it fits in the cache. I mean, if you implemented that same OAP cache on DRAM, the data wouldn't fit on it, but it fits on the, you know, Optane memory. On the other end of the spectrum is the case where the data set is small and doesn't really require as much memory. For those cases, you still have the advantage of the cost savings, because your per-gigabyte price of Optane versus DRAM is lower in the Google Cloud environment. So across the spectrum you can see there are different value propositions for the memory: this is one side of the spectrum, and the other side is the cost saving.

And then one more note: this package, the OAP package, is now open source. It's available on GitHub; the address is down there on the slide. If you want to download it and experiment with it, feel free. It's there.
And then another use case that we've been looking at is a similar one, with Hadoop HBase. Same thing: we have introduced this bucket cache between the storage and the compute. When you get a client request from the HBase client, if you follow that number one, that's the flow of your client request: it goes into the, you know, network storage. And then the next time around, when you make a request for the data, it's actually going to be in the bucket cache. So it's very similar to the Spark use case, just for a different application. And if you look at that, it's actually around 33x better performance. Again, this is the case where your data does not fit in a bucket cache running on DRAM, versus when your data actually fits into the Optane memory. So, 33x better performance, and then on the other end of the spectrum you still have the cost-saving benefit for the memory.

So again, this is gonna be open source. You can go to that, you know, link at the bottom of the slide and get the package.

So basically, these are some of the use cases. Like I said, there are many other use cases that we've been working on, and we're going to be bringing those, enabling those, in the Google Cloud environment. We're also working with different ISVs. You saw that on the slide: there are ISVs who are taking our optimizations and putting them inside their applications, and those will become available as we, you know, roll this product out and as we roll the partnerships out.

I think that's it for me. Now I invite Andy to talk more about the programming model.

All right, thank you. So we heard from Christie that there's this new Optane DC persistent memory, and we heard that it's available on GCP. So I'm here to talk about, you know, if it's on GCP, how do you use it? Especially if you're a developer: you want to develop, say, a nice cloud application that takes advantage of Intel Optane DC persistent memory. So, I'm a software engineer, and I can prove it, because they told me to dress up and this is what I wore.

So I'm really going to talk to you about software. The first thing that a software engineer has to decide is, well, how do I want to use this stuff? And really there are kind of two broad categories here. Do you want to use it as volatile? It is persistent memory, but maybe you don't care that it's persistent; you just want to use it for its capacity. And so on the left side, I've sort of shown you, you have this choice. If you decide just to use this for its capacity, you can do the lowest-impact thing for your application: you can use Memory Mode, which, as Christie explained, just makes the system memory look like it's huge, right? Your application is not modified; in fact, the OS doesn't even need modifications for Memory Mode. It's completely transparent to software.
Okay, so that's one way to use it. But if you want to do a little better job on where data gets placed, instead of hardware deciding what gets placed into DRAM and what gets placed into Optane memory, if you want the application to decide what gets placed into DRAM and what gets placed onto the Optane media, you can use the App Direct mode. But still, on the left-hand side here, you don't care that it's persistent; it's called a volatile App Direct usage. And actually, a growing number of our use cases use this method. It's a pretty easy type of programming: you just decide, when you allocate memory, do I want to allocate from DRAM, or do I want to allocate from this other tier that happens to be a little slower than DRAM but is a lot cheaper, right? So you put the big data structures into the Optane media, and then you put the really latency-sensitive, hot data structures in the DRAM. (There's a small sketch of this style of allocation just below.)

On the right-hand side is the set of decisions you make if you decide you do care about persistence. Maybe you want to use this to store some in-memory data structure that is persistent: it sits there even if the machine gets rebooted, or even if it loses power. So again, you have kind of two choices to make. You can say, well, okay, I want to use the persistence, but I don't want to modify my application for that. We just have the normal storage APIs; it looks like a very fast SSD, right, the world's fastest SSD. You just use the normal file APIs that you're used to using. They work; they just happen to be mind-numbingly fast. On the other hand, if you want to squeeze every bit of value that you can out of the Optane media, then you modify your application, and this is what I'm showing on the right here: you modify your application to go ahead and map this persistent memory right into the application, and then you get to use loads and stores just like you do with DRAM. So this thing on the right here, it's the highest lift, it's the most change that you might make to an application, but it's also the most value that you can squeeze out of the Optane media.

So when I started this project a long time ago, I knew that we would need a programming model to expose all the things that I just told you about. We formed a working group in SNIA, the Storage Networking Industry Association; it's essentially a standards body. And we got a bunch of companies to come and join us in the working group. That was about five years ago now, and there are over 50 companies in this working group today, a lot of them that you would recognize as operating system vendors and as ISVs. What we came up with is this model, and I've updated the model a little bit to use the term App Direct, which is what Intel calls the persistent memory that we used in this model. You can see at the bottom here I'm showing all the persistent memory installed in the system. Some of it might be in that Memory Mode that I mentioned before; that's transparent to software, so I don't have to talk about it anymore. The rest of it is persistent, and you can see we've added a driver to the operating systems. This is all true today in Linux and Windows and other operating systems; I'll give you a list in a minute.
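Here's the small allocation sketch I promised for the volatile App Direct style, assuming the open-source memkind library and persistent memory exposed through a file system mounted at /mnt/pmem; the path and sizes are illustrative assumptions, not a definitive recipe.

    /* Volatile App Direct sketch using memkind: big, colder data goes to
     * persistent memory; hot data stays in ordinary DRAM. Link with -lmemkind.
     * Nothing here is persistent -- we're only using the capacity. */
    #include <memkind.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        struct memkind *pmem_kind;

        /* Build an allocator whose heap lives on the pmem file system
         * (max_size of 0 means no fixed maximum). */
        if (memkind_create_pmem("/mnt/pmem", 0, &pmem_kind) != 0) {
            perror("memkind_create_pmem");
            return 1;
        }

        /* Large, capacity-hungry structure: allocate from the Optane tier. */
        double *big_table = memkind_malloc(pmem_kind,
                                           100000000 * sizeof(double));

        /* Small, latency-sensitive structure: allocate from DRAM as usual. */
        double *hot_index = malloc(4096 * sizeof(double));

        /* ... use both buffers here ... */

        memkind_free(pmem_kind, big_table);
        free(hot_index);
        memkind_destroy_kind(pmem_kind);
        return 0;
    }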
And that driver exposes things through this middle path here as storage. This is what I was telling you: applications can use this programming model without being modified; they just think they're talking to a really fast SSD. But that's not the cool part. The cool part is what happens on the right-hand side of this diagram. On the right-hand side, an application opens up a file on a persistent-memory-aware file system and maps it in, and you get this far-right blue arrow that I'm showing here, where the application has mapped the persistent memory directly into its address space. This means, just think about this, an application does a load instruction and it gets a value off persistence without any kernel code running. Right? There's no context switching, there's no I/O, there are no interrupts; no kernel code runs. One instruction means that you're fetching from persistence. That's pretty cool.

So that all sounds great, but it does come with some challenges; I don't want to make it sound like it's just the easiest thing in the world. For example, if you put a data structure in persistent memory, you want it to be something you can get to after the machine crashes; if you have a crash or a power failure, it needs to be consistent. Now, file systems do this all the time, right? If you're making changes to a file system and you lose power, when you come back, the file system has a journal or something that makes it consistent. But programmers are not used to doing this with data structures that they just created in memory. So that's kind of a new kind of programming: consistency in the face of failure. What you really want are some ways of transactionally updating your data. And how about keeping track of which pieces of persistent memory you've allocated and which you haven't yet allocated? Again, that's a little trickier than it is with normal DRAM, because it has to be a persistent heap. So what you want is a persistent-memory-aware allocator to help you with that.

Also, you know, there are some algorithms that are different in DRAM than they are with persistent memory. Like, what's the fastest way of copying bytes to persistent memory? Well, we have different instructions at Intel that we've gone through and benchmarked and measured to figure out the best way of doing a lot of these operations. So you want these tuned libraries so that you don't have to reinvent this all the time.
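To make that direct load/store path concrete, here's a minimal sketch using libpmem, one of the tuned low-level libraries from the development kit I'm about to describe. The /mnt/pmem path is an assumption: a file on an ext4 or XFS file system mounted with the dax option.

    /* Direct load/store access to persistent memory via libpmem (PMDK).
     * Compile with -lpmem. */
    #include <libpmem.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;

        /* Map a file on the DAX file system straight into our address space. */
        char *addr = pmem_map_file("/mnt/pmem/hello", 4096, PMEM_FILE_CREATE,
                                   0666, &mapped_len, &is_pmem);
        if (addr == NULL) {
            perror("pmem_map_file");
            return 1;
        }

        /* An ordinary store: no kernel code, no I/O path, no interrupts. */
        strcpy(addr, "hello, persistent memory");

        /* Flush the stores to the persistence domain using the fastest
         * instructions available, or fall back to msync if the mapping
         * turned out not to be real persistent memory. */
        if (is_pmem)
            pmem_persist(addr, mapped_len);
        else
            pmem_msync(addr, mapped_len);

        pmem_unmap(addr, mapped_len);
        return 0;
    }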
And then finally, when you're programming in your favorite language, for example you're writing a web-based application in JavaScript, you want something that gives you access to persistent memory in a way that makes sense to a JavaScript programmer. You don't want to be going through some weird API to call some C routine that doesn't sound like JavaScript at all, right? You want it to be idiomatic; you want it to make sense. If you're writing in C, it should look like C; if you're writing C++, it should look like C++; if you're writing in Java, it should look like Java. So we want language bindings.

So you all know what I'm leading up to. We took all of these things, all of these requirements, and we decided that we could build a tuned, vendor-neutral (not just for this product, but for any persistent memory product) set of libraries to make persistent memory programming easier. Those libraries are called the Persistent Memory Development Kit, or PMDK; there's the URL up there, pmem.io. The slide has a lot of information on it, but all of it is on pmem.io; you can go find it. You can see where the libraries fit into that architecture that I showed before, on the right; see where the application is now getting direct load/store access. I'm not adding another software layer; I'm pulling in just what you need. We call it "shrink to fit": pulling in just the amount of software that you need. You need transactions? Pull in the library for transactions. You need something that looks like a JavaScript key-value store? Just pull in that library. So I've shown you on the slide what several of the libraries are. There are some low-level ones that hide instructions from you; they abstract away some of those hardware details. There are some libraries in the middle that handle transactions; that's pretty tricky programming, so we get a lot of use out of those. And then we've built a lot of language bindings on top of these, like I'm showing you: C, C++, some experimental Python bindings, some Java bindings we've been working on, and, like I said, there are even some JavaScript bindings which are so new they didn't make it onto this slide. So this gives you a general idea of what the Persistent Memory Development Kit is.
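To give you a feel for those transaction libraries in the middle, here's a minimal hedged sketch with libpmemobj; the pool path and layout name are illustrative assumptions.

    /* Crash-consistent update of a persistent counter with libpmemobj (PMDK).
     * Compile with -lpmemobj. */
    #include <libpmemobj.h>
    #include <stdint.h>
    #include <stdio.h>

    struct root {
        uint64_t counter;
    };

    int main(void)
    {
        /* Creates the pool file on first run; a real program would fall
         * back to pmemobj_open() when the pool already exists. */
        PMEMobjpool *pop = pmemobj_create("/mnt/pmem/counter.pool", "example",
                                          PMEMOBJ_MIN_POOL, 0666);
        if (pop == NULL) {
            perror("pmemobj_create");
            return 1;
        }

        PMEMoid root_oid = pmemobj_root(pop, sizeof(struct root));
        struct root *rootp = pmemobj_direct(root_oid);

        /* Everything inside the transaction happens atomically with
         * respect to crashes and power failures: all or nothing. */
        TX_BEGIN(pop) {
            pmemobj_tx_add_range(root_oid, 0, sizeof(struct root));
            rootp->counter += 1;
        } TX_END

        pmemobj_close(pop);
        return 0;
    }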
In the ecosystem, like I say, we created this SNIA working group, and we had a lot of the operating system vendors there, so everything that I've been talking about is available upstream in Linux already; I'm showing you the Linux kernel version that we recommend up there, 4.19. There are two file systems in Linux that are persistent memory aware: ext4 and XFS are both already persistent memory aware. In Windows Server 2019, all these changes went in, and the file system that's persistent memory aware there is NTFS. VMware vSphere 6.7 and later also supports persistent memory, so everything that I just said is true in a VMware guest. Right? Including a program... and just think about what it takes to make this work... a program running in a virtualized guest does a load, and it thinks it's talking to persistent memory, and it is, because VMware actually put all the plumbing in so that it's talking directly to the hardware inside the guest. That's pretty cool. There are the common distros, RHEL and SLES I put up there, and Ubuntu. Java has some pretty exciting changes coming in the upcoming JDK release, and so on.

And what else did I put on this slide? I just thought I'd show you what it looks like when you put all these pieces together. So we have a proof of concept that we did with Cassandra, and you can see it here. At the bottom, it starts with exposing the persistent memory using the persistent-memory-aware file system that I mentioned before. You can see we use PMDK here, first to abstract away the hardware details and then to provide some transactions. On top of that we have the Java bindings; the low-level persistence library for Java is there. And then we made a persistent-memory-aware version of Cassandra. So what's kind of interesting about this picture is that the app at the top has not been modified. The app is just talking to Cassandra the way it always has, but Cassandra and below knows about persistent memory. This is really a common pattern that we've been seeing when we modify applications for persistent memory: the knowledge that there's persistent memory in the system only goes so far up the stack, and at some point everything above is unmodified; it just thinks it's operating as usual. So did I modify the app? Well, some people would say no, because the app in my picture hasn't changed; some people would say yes, because the app is Cassandra. So decide for yourself. But you can see the knowledge of persistent memory only goes so far up the stack.
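One practical footnote on that bottom layer of the stack: on Linux, exposing persistent memory through one of those DAX-aware file systems looks roughly like this. The device name and mount point below are assumptions that will vary by system.

    # Hedged sketch: create and mount a persistent-memory-aware file
    # system on Linux (run as root; device name is an assumption).
    mkfs.ext4 /dev/pmem0                  # or: mkfs.xfs /dev/pmem0
    mkdir -p /mnt/pmem
    mount -o dax /dev/pmem0 /mnt/pmem     # dax enables direct load/store mapping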