Dr. Kaizhong Gao: "How to Define the Digital Universe" | Talks at Google
Welcome, everyone. Today we introduce Dr. Kaizhong Gao from Argonne National Lab. He is currently a principal scientist and group leader at the national lab and a senior fellow at Northwestern University, and was previously CTO at IBT, International Business Technology Service. He is going to talk about how to define the digital universe. Thank you.

Thanks. Thank you. So this is the title of my talk. My name is Kaizhong Gao. Let's start with the kind of slide everybody uses: for ten years we have talked about how the amount of data changes our world. Everybody now has a cell phone, has a tablet; they record everything and upload everything to the cloud. Coming from fundamental storage technology, we like to look at it a little differently. We want to talk about how we define the data, how we define the digital universe, and then come back to see, from our perspective, what kind of data storage technology we need moving forward in the next five to ten years.

So we go back to the projected amount of data from IDC. This is several years old, but the projection is that every year we get about a 30% increase in data, and this is the amount of data we actually need to store. So by 2020 people project about 40 zettabytes of data. At least until now, over the past 10 years, the trend has followed almost the same path; we have not seen any slowdown yet.

So how do we define the digital universe? When we talk about the digital world, we talk about the amount of data that we create: that we generate, store, and utilize. We can typically categorize it into computation, storage, and communications.
When we talk about this, we use the basic unit of bits to measure it, and then we map it to fundamental physical units like time and length: we define bits per second for communication and bits per square inch for storage.

Time and length are basic units for the physical world; in order to define the physical world we need some basic units. How about the digital universe? Historically, when we talk about data, the bit is the basic unit of information, mostly based on the definitions people have posted. You can see they talk about computing and communication, so people lean heavily on those words in the definitions, but most of the time they also mean the storage part. These definitions are all right, but we want to add what it means when we include the storage aspect.

So when we add in the storage aspect, instead of having only the bit and mapping it to time, or frequency, we have bits per second and bits per square inch to define anything related to communication and storage. Now, as all the data migrates to the cloud, what we find is that we want to add another data property, which we call the data temperature. That is the new definition we are trying to put forward for a property of digital data, and it is specifically tailored towards cloud data storage.
So what exactly is that definition? This is the only formula I'm going to show today: the data temperature formula. The data temperature is defined as 10 times the log of some time multiplied by the file access frequency. The file access frequency is typically just the number of accesses divided by the time, and most people measure that time in seconds. If we choose the unit of time to be one year, that just adds a constant, and then we get a linear response: the x-axis is the logarithm of the number of accesses per second, and the y-axis is the temperature in degrees G.

Why are we using this specific unit and this specific formula? On the next page you will see some very interesting aspects that map onto people's understanding of the physical world. Pick a typical data storage device like a hard drive. The highest-rpm hard drive runs at 15,000 rpm, so it can access data about 300 times per second. With this formula, that corresponds to the highest data temperature capability, about 100 degrees, which maps to our physical-world feeling of hot, the boiling temperature. For everyday use, people using their laptop or going through daily work meetings, you access your data a few times a day, and that comes back to a data temperature of about 25 degrees, which maps almost exactly to room temperature. And then there is data that I store and use maybe once a year: family photos, tax documents. Those come back to about zero degrees, which is just the freezing, zero
temperature, where water freezes. You can extend this further. With this definition, when we talk about all the data storage technologies, DRAM, flash, HDD, Blu-ray, and tape, they can all be mapped onto the same chart, and you can see exactly how each hardware technology differs in its capability to handle input, output, and rewriting of data. And consider the first digital data ever created and stored, on the first generation of hard drive in 1956: if that data has never been read back, it corresponds to the temperature of your freezer at home, which is minus eighteen degrees.

So this definition is not only interesting in how it maps to people's everyday feeling, it also has a lot of impact. In the next few minutes I'm going to talk a little about how it impacts data storage technology and the market, and what kind of future technology solutions may exist. At the same time, we also try to see how the evolution of the digital universe compares to the physical universe, and how it compares to the evolution of human society. Okay.
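The data temperature formula and the reference points quoted in the talk can be checked with a few lines of Python. This is a minimal sketch, not the speaker's own code. It assumes a base-10 logarithm (inferred from the quoted reference points), a 365-day year as the reference time, and "once a day" as a stand-in for "a few times a day". The names `data_temperature` and `SECONDS_PER_YEAR` are illustrative.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed reference time: one 365-day year


def data_temperature(accesses_per_second: float) -> float:
    """Data temperature in degrees G: 10 * log10(access frequency * one year)."""
    return 10 * math.log10(accesses_per_second * SECONDS_PER_YEAR)


# Reference points from the talk; the exact access rates are rough assumptions.
examples = {
    "15,000 rpm drive, about 300 accesses per second": 300,
    "working file, about once a day": 1 / (24 * 3600),
    "family photos, once a year": 1 / SECONDS_PER_YEAR,
    "written in 1956, one access in 60 years": 1 / (60 * SECONDS_PER_YEAR),
}

for label, rate in examples.items():
    print(f"{label}: {data_temperature(rate):.1f} degrees G")
```

Under these assumptions the four cases land close to the values quoted in the talk: roughly 100, 25, 0, and minus 18 degrees.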
Before we do any of that: the primary reason we give this definition is to help people have a better definition when they do their business. This data temperature definition gives a very accurate description of how you charge for your cloud service, so this is where the money goes. If you look at IT cloud service providers, typically they say: I give you a service, you can have X amount of data storage. A lot of the time they give you this X amount of storage for free. What they charge for is how many gigabytes or terabytes of data you can access, and those accesses come in a random manner. So they talk about bandwidth, which corresponds to, or is proportional to, your data temperature. Every company uses different terms, and there is no universal rule. By putting forward the digital data temperature definition, we can give them a universal definition.

So let's look at how that maps to the hardware technology. This is the roadmap from the HDD industry. You can see different technologies helping to improve the areal density; the projected roadmap tries to keep areal density increasing at around 30 percent. If you compare that projection to the SSD, you can see solid-state memory actually grows at a slightly higher rate. But when you look at volumetric density, HDD, NAND-based storage products, and tape are at roughly the same volumetric density. So why do people sell these different products at different prices? Because the access rate, the frequency of doing input and output, is drastically different, and that corresponds to the data temperature difference. So the actual amount of data stored is not as important compared to how
high a temperature you can tolerate, in terms of data temperature.

Another interesting aspect: if you look at all these data storage technologies, HDD in particular, the actual areal density capability has slowed down. That means that as we try to pack in more data, we have to build more hardware. Look at the last 50 to 60 years. This is the average seek time for the hard disk drive: at the very beginning it starts to drop, but since the introduction of the personal hard drive the access time has plateaued, flat at around the millisecond level. On the other hand, areal density keeps increasing. So as soon as the seek time becomes flat, the time you spend to write a full drive starts going up, almost linearly. That also means it takes more time to put more data on the same drive. If you try to access all these files at the same time, you experience longer delays. That means the average data temperature of the drive starts to drop as you put more files on it.
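The effect described here, a fixed access rate shared by a growing number of files, can be illustrated with a small calculation. This is a hedged sketch under simple assumptions: accesses are spread evenly over all files, the drive sustains a hypothetical 300 random accesses per second, and the logarithm is base 10 with a one-year reference time. The function name `average_temperature` is illustrative.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed reference time: one 365-day year
DRIVE_RATE = 300  # hypothetical: random accesses per second the drive sustains


def average_temperature(accesses_per_second: float, num_files: int) -> float:
    """Average per-file temperature (degrees G) when a fixed access
    budget is shared evenly by num_files files."""
    per_file_rate = accesses_per_second / num_files
    return 10 * math.log10(per_file_rate * SECONDS_PER_YEAR)


for num_files in (1_000, 10_000, 100_000, 1_000_000):
    temp = average_temperature(DRIVE_RATE, num_files)
    print(f"{num_files:>9} files: {temp:.1f} degrees G on average")
```

In this model every tenfold increase in the number of files on the drive lowers the average temperature by 10 degrees, because the access rate of the device does not grow with its capacity.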
We ran a standard simulation. What we find is that, regardless of how you assume the users access the data, ultimately, if you pick any given storage device and then increase the number of accesses and the amount of data you store, you just keep stacking more and more files on it. And as the number of files stored on that device grows, the average temperature of the drive starts to drop. You have more files there, but you have limited bandwidth to do the read and write operations, so on average the frequency at which each file can be accessed starts to drop.

And if you think of all the storage devices we ever created and put in service as one big device, what you find is that over the past 20 years the amount of data stored in the digital universe kept increasing by 30 percent per year, and the number of files increased at about a similar rate, but the bandwidth didn't increase as much. That's why you also see a monotonic decrease for the whole digital universe. This is the trend for the hard disk drive, and you can use the same analysis for all the other technologies.

When we look at the different ways of using data: from the 1950s to the 1980s, primarily all of this data was in the enterprise space; only larger businesses could afford it. Then you get to the PC era, and every family started having their data stored in their own home. Then after the year 2000 we started to migrate to the cloud, and people put more and more information in the cloud, until now many families don't even store much information on their local storage devices.

As this changes, what's really happening is this: for a long time there was a fixed number of users, but as you migrate to the cloud, each storage device in the
Google servers or Amazon servers has to host data from many different users, and this number grows very rapidly. That creates a different requirement in terms of data temperature.

This blue line illustrates hard disk drive capacity over the last 25 years. The hard disk drive has increased at about 43 percent per year in terms of capacity. I don't have the new data for this year yet, but up until last year, for the last 30 years, it is almost 43 percent per year. And as we migrate to the cloud, the file size doesn't necessarily increase by a lot, so what you see on the right-hand side is that the number of files grows monotonically with the drive capacity, and the average data temperature of the drive starts to drop, simply because it takes the same amount of time to randomly seek to read and write data, but now you have to go over a lot more files. This is the case that, even ten years ago, people here at Google already found had become a problem, because you were viewing
the hard disk drive as an archive solution rather than a real-time service solution. So this is nothing new. But typically the average data temperature of a drive and its peak temperature value are different. They can vary by as much as 40 to 50 degrees, and 40 to 50 degrees means you're talking about up to a five order of magnitude difference. The reason people can tolerate that is that, in reality, when we use a hard drive or any other storage device for data storage, only a very tiny percentage of files are hot and need very frequent access. We pay 99% of our money for data we access very infrequently.

But think about it: when we put up a service, we charge our customers by the frequency of access, so that is where we make the money. In that regard, when we put data storage into the cloud, our internal storage scheme is very inefficient, because we use a high-cost, more expensive hardware solution to store a lot of data we almost never use. On the other hand, it also means that, because these hot files exist, we use a high-temperature-capable hardware setup for the whole service, because it's very hard to predict which particular piece of information will become hot and need much more frequent access. But here is the thing: when you increase the file size, you can slow down the drop in data temperature capability as well.

As we move forward, when we look at the HDD and SSD, if this is the average data temperature capability of a hard disk drive, we know that as people migrate to the cloud, you now have more users. So even if only a small percentage of the data has a high temperature, since
you have more users, the number of accesses also increases. So the actual demand starts creeping up as more users access the same storage device. But at the same time, as capacity increases, your data temperature capability drops. So what you really see is that the gap between the market demand and your hardware capability becomes larger and larger, and that gap will be filled by a new technology. In the past ten years, solid-state drives started to take over part of the hard disk drive market for this particular reason. It's not that they took over the hard disk drive market; it's that hard disk drives can no longer fulfill the new requests from the market, because of the migration to the cloud.

As we can see, five or ten years from now we will probably have more centralized storage, with more users trying to access a certain set of data. That means you need an even higher data temperature capability at the device level. That is where new emerging technologies can potentially come in and start taking over some of the NAND SSD space.

Let me skip this.

Now let's look at this from a different angle. Think of the whole universe as one big, gigantic drive. We know the total amount of data increases monotonically. So the whole digital world is just one drive, but we have all the possible ways to access the data. And we know that if we increase the amount of data faster than the bandwidth with which we can retrieve it, or do the read or write operations, there is a monotonic relation: the data temperature keeps dropping. You have more data created and hosted, and you have more requests, but the ratio keeps dropping, so your data temperature starts to drop. If
you compare that to the physical world, you see a very similar thing after the Big Bang.
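The cooling argument can be put in numbers. The sketch below is an illustration, not the simulation mentioned in the talk: it assumes the average temperature follows 10 * log10 of the ratio between total access bandwidth and total stored data, takes the 30 percent yearly data growth from the talk, and uses a hypothetical 10 percent yearly bandwidth growth. It also converts a gap in degrees into a ratio of access frequencies. Both function names are illustrative.

```python
import math


def yearly_temperature_change(data_growth: float, bandwidth_growth: float) -> float:
    """Change in average data temperature (degrees G per year) when stored
    data and access bandwidth each grow by the given yearly fraction."""
    return 10 * math.log10((1 + bandwidth_growth) / (1 + data_growth))


def temperature_gap_to_ratio(delta_degrees: float) -> float:
    """Ratio of access frequencies that corresponds to a gap in degrees G."""
    return 10 ** (delta_degrees / 10)


# 30% data growth is from the talk; 10% bandwidth growth is a made-up value.
per_year = yearly_temperature_change(data_growth=0.30, bandwidth_growth=0.10)
print(f"change per year: {per_year:.2f} degrees G")
print(f"change over 20 years: {20 * per_year:.1f} degrees G")
print(f"a 50 degree gap is a factor of {temperature_gap_to_ratio(50):.0f}")
```

If bandwidth grew as fast as the data, the change would be zero; any slower bandwidth growth gives a steady yearly drop, which is the monotonic cooling described above.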
Now, as we observe our universe, we find all the galaxies moving away from us ever faster. We know the universe expands, and the direct evidence is that when we measure the actual physical temperature of the universe, the background radiation temperature is dropping over time. So these follow similar paths. There are a lot of similarities between all these large-scale systems.

Let's also look a little at how, as we migrate to the cloud, the behavior of cloud services compares to the evolution of human society. There is another very interesting aspect here. When we have just a small number of people, people create a village. They live in isolated spots, so they build very small paths, and people walk to each other and connect with each other. In the beginning, with the PC, everybody has their own storage unit, and then they try to have a cable to connect to each other.

Later on, society starts to build towns, and we start to build multi-family houses. People live closer to each other, and more highways and wider roads are built. In the storage space, we put the drives together, we build a storage server, and we have it host data that multiple people, or tens or hundreds of people, access on that enterprise storage server.

Another 10 years later, in our digital world, we have the cloud. When you look at human society, people concentrate more, the population density gets higher, and you start building skyscrapers; you have all of these multi-family homes and condos. And at Google and Facebook, you are building all of this: the concentration of drives, the density of drives, gets to a much higher level. So when you compare these kinds of large-scale systems, you see a lot of similarity between them. So,
given that, we can learn from each other. We have lots of smart ways to manage our data centers, but we can also see that there might be some problems
that human society is facing, or our physical universe is facing, that map to our digital world. Those are things we need to anticipate, and find solutions for, right now.

A very simple example is our data archive systems, on the hardware side. For an archive system you want the lowest-cost solution, so anything that costs less than hard disk drives: Blu-ray and tape. Those are examples of data archive systems. It is still a billion-dollar market, and it primarily hosts data lower than 10 degrees G, which roughly means you access the data once per month or at a lower frequency. In reality, most people put data in an archive system that they expect, they hope, they never need to access. That's why they try to drive to the lowest cost.

But here's the thing. When you look at all this data, you still want the storage to be really reliable, to keep the data for a long time, to have low cost for long-term maintenance, and hopefully to be rewritable. Well, Blu-ray is not truly rewritable, which is why I try to highlight that rewritable is important as well.

When we look at it from the storage perspective, people say: I take the file offline. Or, in other words: I remove the data, I delete the data, I erase the data, I completely erase the data. These all have different meanings. Lots of people use those terms interchangeably and don't differentiate them, but to me, to people working in the storage industry, they have very different interpretations. Even when we talk about erasing data, there are two different ways. Lots of times in the cloud space, you store your data, you
archive your data, and when you actually remove it, what you do is this: a lot of these systems have an index file, or compress everything into a large file, and you remove the index or the security key. Blu-ray, tape, and HDD all use this method. But in reality, that piece of information still physically resides in that location. There is no way to change it unless you have another approach to modify it. On the other hand, when you do a physical removal of the data, you actually write other information, changing the state of the hardware, so that it reads back as something else.

In our case, the current Blu-ray approach has a lot of limitations, because typically it is write-once media. With tape, in theory you can do that many, many times, but practically speaking you have a lot of limitations, because with tape you have to physically go to your library, find a particular tape, get it to your drive, rewind for a long time, find the location, and take the file out. Then after you remove that file, you compress the rest of the files and put them back. So removing one data file takes a long time. How about the hard disk drive? Well,
it's typically manageable: you're talking about microseconds to get that done. So it is feasible and practical. People can say: from the hardware perspective we can clearly see the differentiation, but why do we need to do that? Why shouldn't we think we can skip this?

Let me pick one example. Among large IT companies, I know Facebook, for example, has been very public about Blu-ray as their storage archive system, and I think Amazon uses a lot of tape for their archive, for the Glacier storage. So these are the two primary very-low-cost solutions. I think Google uses a lot of drive-based storage, but you probably also still use quite a bit on the tape side as well. You can put a lot of information into these systems. But suppose somebody gets caught up in a really important event, and they want to remove 30,000 emails scattered randomly in the cloud.

What you see is that the steps for writing that information into your archive system are very similar across the technologies; there is not a lot of overhead in time, effort, and cost. But when you try to physically remove something like 30,000 emails randomly scattered in the cloud, these three technologies have to go through completely different steps. Generally speaking, when you talk about physically removing that information, assuming only one copy of the 30,000 files scattered in the cloud, it takes about two months for a one-drive tape archive system to remove them. It takes about forty years, if you assume a hundred drives running in parallel, to remove them from Blu-ray. And it takes about eight hours to remove all of them in a hard-drive archive system. Okay.
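The difference between removing an index entry and physically removing the data can be shown with a toy model. This is a hypothetical sketch, not how any real archive system is implemented: the class `ToyArchive`, its methods, and the file names are invented for illustration, and the medium is just an in-memory byte buffer that can be overwritten in place.

```python
class ToyArchive:
    """Toy rewritable medium: a byte buffer plus an index of file locations."""

    def __init__(self, size: int) -> None:
        self.media = bytearray(size)  # stands in for the physical medium
        self.index = {}               # file name -> (offset, length)
        self.next_free = 0

    def write(self, name: str, payload: bytes) -> None:
        offset = self.next_free
        self.media[offset:offset + len(payload)] = payload
        self.index[name] = (offset, len(payload))
        self.next_free += len(payload)

    def logical_delete(self, name: str) -> None:
        """Drop the index entry only; the bytes stay on the medium."""
        del self.index[name]

    def physical_erase(self, name: str) -> None:
        """Overwrite the stored bytes with zeros, then drop the index entry."""
        offset, length = self.index.pop(name)
        self.media[offset:offset + length] = bytes(length)


archive = ToyArchive(64)
archive.write("a.eml", b"secret-a")
archive.write("b.eml", b"secret-b")

archive.logical_delete("a.eml")   # index removed, data still resident
archive.physical_erase("b.eml")   # data overwritten in place

print(b"secret-a" in archive.media)  # True: the bytes are still on the medium
print(b"secret-b" in archive.media)  # False: the bytes were overwritten
```

Physical erase in this model needs a medium that can be overwritten in place, which is the property the talk attributes to hard disk drives and finds missing or impractical on write-once Blu-ray and on tape.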
So those are the kind of comparisons we can make. As we can see, if you think that at some point we will need to remove unwanted data, then the choice of data storage technology may make a huge impact.

So the next question is: can we, or should we, be able to completely remove data? There are a lot of arguments. On one side, the dominant argument says the data are important and we should never remove them. They also say that in a lot of fields federal law requires you to archive data for 10 years, for 30 years. Those are also true. The argument on the other side is that a lot of data is actually created in ways that violate people's privacy, and there is something there that should be removed. Or, you know, no data will probably last for a hundred million years, maybe longer than the lifetime of human society. So what we are saying is that at some point, certain types of data may be required to be removed. All kinds of things could happen. One thing I want to say is that in reality, even right now, we
actually generate a lot more data than we keep, and we erase and throw away a lot of data without ever actually storing it. That happens a lot in scientific research. It happens a lot even in things like the autonomous cars Google is doing: your sensors can capture a lot more data than you actually collect. And in our daily life, everybody takes photos of their kids, but are you sure you actually go back to check every single photo? After 10 years, after 20 years, are you still doing that? Eventually it is limited by your own time, so you want to see the most important ones, not every single one.

Second, look at the physical process of how we recognize things. In the physical world, when our eyes are open, the amount of data coming in is a much larger set of information than our brain actually processes. So in reality we don't really need to store every piece of information that ever existed.

And the third thing is, as we argued earlier, the more data you store, the more space you take and the more files you have, the more the data temperature capability drops. That means it takes a longer time to process the data. You want to make sense of that data, and it slows you down. In order to compensate, you have to add a lot more processing power, adding a lot of cost, a lot of new hardware and software, and new effort, just to eliminate all this redundancy and make sense of the same set of data.

From the DOE perspective, now that I have joined a national lab, they care about the cost, in particular the energy consumption. If we store a lot more data and don't really use it, that's a big waste. From a national perspective, it's a big waste of energy.

So with all of this, we say that at some point people may need to consider how
we can identify and remove that data, how we can actually physically remove it so that it no longer costs us to hold it. How much is the cost? Let me give you an example. This GAO is not just my last name; this is actually the Government Accountability Office. From their report, the US government spends more than 70% of its IT spending on maintaining old and legacy systems. As a result, last year alone there were billions of dollars less investment in new technology, because they had to move those funds to maintain the very old systems. That is information you stored: some of it is still important, and you simply have no way to transfer it to the new system, but some of it may be obsolete.

When we look at Facebook, it's a very young company, about 10 years old. Right now they try to store all the information people have ever uploaded there. Some of it may be important, some of it may be less important. But what will happen 10 years later? What will happen 20 years later? That data has to be put somewhere in their data archive system. Let's say you put it on Blu-ray. Five years later you find that five percent of the data may no longer be needed. You have no way to physically remove it. What do you do? You let it sit there. Ten years later, that becomes 15 percent. As time accumulates, at some point you realize the majority of the data is not that important anymore. So at some point there is a trigger point. If you don't find a solution where you can easily replace that unwanted data with new data,
it drives up all your costs.

Let's make another comparison. When we talk about unwanted data in the digital world, that corresponds to garbage. With garbage, we have recyclable and we have non-recyclable. The things you cannot remove are like non-recyclable garbage: it sits there forever, takes your space, pollutes your environment, and takes your resources, and that is not good. The things you can remove become recyclable: they become space you can overwrite with new information.

As we migrate to the cloud, we now concentrate a lot more data. That is very similar to human society, where we now have metropolitan areas. This is a city I used to live in; I grew up there. Now there are 20 million people there. You can see that as the population density grows, this is the number of garbage collection sites surrounding the city, just in order to maintain the function of the city itself. If we can make all of this recyclable, the number of sites can be reduced significantly. In the same way, if in our data center we can better identify the data that is actually useful, remove the data we no longer need, and replace it with useful data, then the efficiency of the data center as a whole will improve.

There are a few different ways to do this. What we propose is this: for data archive, we take the hard drive platter and use that to replace tape or Blu-ray; at least those are rewritable. Of course there are a lot of technical details that people on the hardware side are working on, but
this is something to get things started. In reality, in the data storage industry there are a lot of ways to continue reducing cost while making rewritability an option for these data archive systems. There are things we are currently working on, and with the technology we implement, we think we can lower the drive cost enough to be appealing in the same fields where people currently use tape or Blu-ray.

And look at what this means. For Facebook, I think they would benefit directly, because over time they are probably seeing the fastest growth of useless data as a percentage of the data they store. They would not have to build an additional data center, because a lot of the information in these data centers can be removed, and the space can be replenished and used to store new information. Companies like Seagate no longer need to shrink, because now they can build an archive system rather than just hard disk drives. The same goes for WD: now they have a new product line. And for Amazon and Google, for you guys, you host a large amount of data. If you use this, you will also realize a lot of savings in your data center operations and your data archive systems, as well as for your service. A lot of these hardware technologies can provide this kind of service without introducing additional cost.

And for society as a whole, from our perspective, implementing these things will lower your energy cost, lower your energy consumption, and reduce your carbon dioxide emissions.

So this is, from the hardware perspective, how we define the digital universe, how we store the data, and what the future perspective looks like. I think I will stop here. I think
These are the backup slides. As I said, this is a summary of my talk: we try to give a standard definition of digital data temperature, and then use that data temperature to study all the hardware technologies. Yeah, I think that's it. Thank you.
Thanks. OK, now it's Q&A. Does anyone have questions about the talk?
Sure. One question I have: normally data is divided into active and passive data, and you also have redundancy nowadays, not only a single copy of the data.
Yes, yes. So here is the thing: you can consider every single file you create as independent. When you have multiple copies, you have a group of copies of this piece of information, so each individual copy is now accessed less frequently. Instead of keeping one very hot piece of data, you are now creating multiple low-temperature pieces of data. That's exactly the same thing we talked about: as you create more copies, the size of the digital universe expands and the temperature drops. So what people in the cloud data storage industry try to do is this: for example, when you have breaking news, everybody
rushes to search for some piece of information. But regardless of what kind of SSD or HDD they have, it has very limited bandwidth. They cannot keep looking up a couple of pieces of information back and forth as if the bandwidth were unlimited, so they just create more and more copies. The nice thing is that, typically, after a short period of time far fewer people are searching for it, so they remove that redundancy. If you keep all of this information on HDD or SSD, you overwrite it with other information. It's like you eliminate the cold data, recycle that space, and eliminate that redundancy.
But what happens is that, no matter how you do this process, you always try to keep an archive copy. The archive copy is not a problem at the beginning, but if the digital universe keeps expanding, then after 10, 20, 30 years, after generations, the aggregated amount of data becomes larger and larger. By the time you reach that point you have to say: I realize I am paying a lot more effort to keep all of this data of the past. And then you have to make a choice.
Think about it: two generations ago we didn't even have all of this digital data, and people still lived fine. Of course it's better now, but that doesn't mean having unlimited data will make you better. Nowadays probably more people are watching their cell phones than watching the real-life person in front of them. People go on dates and, instead of talking to each other, they text. That may not be the right thing.
What you describe is very applicable to the kind of services we provide. But for us as a cloud service provider, sometimes the level of criticality of the data, or the data retention, that
requirement comes from the customer. So do you see any opportunities there where we can provide these data temperature suggestions to them?
Yes. So this is actually the interesting part. For example, take people like you guys, or like Microsoft: when their salespeople talk to their customers about cloud service, they basically just use a formula. You have X terabytes stored on our server, you
have this bandwidth, then you pay this amount of money. So when those people come to talk to you, in reality they pay a premium for elevated data temperature capability. The concept here is that we do this logarithmically, so you can say: we have a few tiers, and when you change by five degrees you double your price. This can be easily translated. Every company is probably using a very similar formula, but there has been no way for them to talk to each other and try to standardize it. So that is an interesting aspect. That's why I try to use this formula: people who are not working on the technology side can now correlate this degree with their daily life. But at the same time, when you actually look at cloud services, that's exactly the kind of formula all these companies are using.
Yeah, it's a very interesting perspective, to compare this with real...
Yeah, thanks.
A follow-up question: you said other companies are already using a similar concept, but it is not standardized yet. How is the progress of standardization, and what are the companies' reactions when you bring up this concept?
Good question. The standard was the formula that I presented before you came in; it's on an earlier slide. I just started publishing this a few months ago and started talking to people. It is a very interesting aspect, because in reality every company just needs to have a formula that they show to their customers. So from a practical aspect, if they think they can sell it to their customers, that's probably good enough. But for the people working on the storage side, over the last couple of years I started looking at, you
know, I used to work for Seagate, the hard drive company, looking at one set of data storage technologies. But I tried to jump outside and look at how the different technologies interact with each other. For example, why has SSD been taking over some of the HDD market over the past few years? So I tried to ask: is there some universal relation there?
Then I realized that people in the IT industry have this discrepancy: they all follow a similar trend, but nobody has really tried to put something there. So I hope people will start buying into this, and I'm starting to try to persuade different data storage companies to start using something like this. Last year I first presented it at the magnetic recording conference. It was very interesting, because normally only hard drive companies go to those conferences, but I did see people from companies such as Verizon and Facebook go there and listen to the talk. So hopefully they will find it interesting.
And I think, how to say, because I myself am from the hardware side, from the storage industry, if I get more opportunities to talk to people in the cloud space, or maybe in the Open Compute space, a lot of these people will be able to start adopting it. I definitely think that's very useful, because if you look at the storage industry, there are all these different technologies: tape, Blu-ray, hard disk drives, solid state, and a lot of emerging technologies such as flash, phase change, 3D XPoint, all of this. But now, with one formula, all of them fall within the same chart. We talk about 160 degrees, and that covers everything included in data storage technology. Nothing goes outside this boundary right now.
That is what makes it very attractive, because people on the hardware side are interested in what kind of new technology, new design, or new approach we can have to take a new market, or to take a market that exists today. My starting point is that you never really argue that something can take over another technology's market. The rationale should be: there is a new market. What
is the best technology for that? If you have the same solutions meeting the same market, it is always the lowest-cost one that wins. But people have a diversity of needs. As you migrate to the cloud, some of these criteria actually become more well-defined, so it is easier for people to look at this evolution. That's why you see SSD booming for the last few years. But SSD taking over the high-end hard drive is not because of cost; SSD is more expensive. It's because the demand for that particular cloud service requires
high data temperature capability, and there the hard drive actually creates more of a gap as its capacity increases. So you lose the market by yourself; it is not the other technology taking over. That is my standpoint.
Thank you.
Yeah, thanks. I think that's it.
OK, thank you for being here.
Thanks.
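The tier pricing rule mentioned in the Q&A (data temperature on a logarithmic scale, with the price doubling for every five degrees) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the base price, and the reference tier are hypothetical, and the talk's own temperature formula is on an earlier slide that is not reproduced here.

```python
def price_multiplier(delta_degrees: float, degrees_per_doubling: float = 5.0) -> float:
    """Price factor for a tier that is `delta_degrees` hotter than the base tier.

    Assumes the rule of thumb quoted in the Q&A: +5 degrees doubles the price.
    """
    return 2.0 ** (delta_degrees / degrees_per_doubling)


def monthly_price(terabytes: float, base_price_per_tb: float, delta_degrees: float) -> float:
    """Hypothetical bill: stored capacity times base rate times the tier premium."""
    return terabytes * base_price_per_tb * price_multiplier(delta_degrees)


if __name__ == "__main__":
    # Illustrative numbers only: 10 TB at a made-up base rate of 2.0 per TB.
    for delta in (0, 5, 10, 15):
        print(delta, monthly_price(10, 2.0, delta))
```

Because the scale is logarithmic, a fixed step in degrees always corresponds to the same price ratio, which is what would let providers with different base rates quote comparable tiers.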
2018-07-14 13:01
High-tech company but doesn't know how to record a lecture. Very low sound.
I find the concept of a temperature for the data to be interesting, and possibly quite useful. I think that the long term worry about having more than 1/2 of your data not being useful in the long term deserves some attention, but only if the costs of storing said data stop decreasing over time faster than the rate of storage growth / dollar. I'd like to draw your attention to the Von Neumann architecture widely used in general computing... most transistors in such a computer are in the RAM, and aren't being changed, and thus have a low temperature. If you want to increase computing performance, you have to redesign to increase the average data temperature. This is what parallel processing, graphics processors, FPGAs do, and it seems to be highly successful. You can compute far more with the same relative number of transistors by making each one more likely to change on a given clock cycle. Thanks for an interesting talk.
With nanomagnetic memory, it is possible to store all of google's data centers in a sugar cube. Zettabytes of data could be stored on your smartphone.
I would encourage the presenter to continue to improve his ability to speak English. His current proficiency is a barrier to effective communication.