Allen School Distinguished Lecture: C. Mohan, IBM Almaden Research Center


Hello. So, welcome everyone to the Distinguished Lecture Series today. It is my great pleasure to introduce C. Mohan from IBM Almaden. Mohan has spent 36 years at IBM Almaden. He is very, very well known in the database community, and way beyond, for inventing the ARIES recovery management system. If you haven't heard about ARIES, you should: this is the standard recovery management system used today in all relational databases. Mohan has received numerous awards. He received the ACM SIGMOD Innovations Award, which is a top award given to a database researcher, and he was one of the earliest recipients of this award. He is a member of the National Academy of Engineering. He is a Fellow of IBM, a Fellow of IEEE, and of ACM; I think I mentioned all the fellowships. Not a long time ago, Mohan became interested in this new technology, blockchain, where IBM has a very strong presence. Today it is really super exciting to have somebody talking about a new technology who is a leading expert in the entire technology of database systems. So, here is Mohan.

Thank you. Thanks a lot, Dan. It gives me great pleasure to be here. Believe it or not, in these 36 years this is my first formal visit to U-Dub. And obviously he's a masochist, because he suffered through 90 minutes of this presentation at VLDB and he still insisted that I come here and give the talk. So hopefully you all don't blame him for making you suffer through the next 60 minutes. As it says over there, I originally gave it as a keynote at the ICDCS conference in June, and then the longer version at VLDB in Munich. The URL that you see at the bottom is where you will find different versions of these slides, and even links to videos and so on. The good thing is that this is being videotaped, so soon I will post the link to this presentation also. I have lots of slides, but let
me try and get through as much as I can. This is the full-blown version of the presentation, as I presented it at the VLDB conference. I've tried to make the presentation be of use not just to geeky people but even to people who are totally new to the whole concept of blockchain, as well as non-techie people who are interested more in the use cases than in the technical details. At the same time, I try to make even the techies get something out of it, so that they don't start yawning and all that stuff. So hopefully I live up to some of these expectations I've set myself.

But I also wanted to make a point. I got involved in this big time about a year ago, so by no means am I an expert in this space. I noticed that the mainstream database people, as well as the distributed systems people, have not really been paying attention to this space. Yeah, there are distributed systems people who are more, if you like, fringe people, who do these things in, you know, very specialized kinds of conferences. But in the general community, like at IEEE ICDCS, there were like two papers or something like that, even though there were a gazillion papers overall. So that was the other intent: to make the mainstream people get involved. Even in IBM, none of the mainstream database people were involved in this, even though IBM started working on it like two years ago. As a result, you will see some non-relational stuff being used, things like NoSQL, for example. We invented relational and SQL, but here it is supporting a key-value store and things like that. So I'm trying to fix that as part of my much deeper technical contribution with a group of people, and I'll refer to it later.

So, in terms of the history: of course, you know, a lot of people credit this anonymous person, Satoshi Nakamoto, as having come up with this whole concept of Bitcoin and so on. But believe
it or not, just this morning I was reading an article in one of the recent issues of ACM Queue, by Arvind Narayanan and somebody else, who have given a history of where all the ideas of Bitcoin came from. So I highly recommend that paper; I forgot to put a reference to it today. But that work was done in what's called the open or permissionless environment, where anyone, anywhere can join the fray without having to reveal their identity and things like that. As I say at the bottom, I don't really care for that. And by the way, even though I'll be making strong statements and so on, and I've been a longtime IBMer, whatever I say is all my opinion, not necessarily IBM's opinion, although, you know, it might be IBM's opinion, but I don't want to...

There are various organizations, especially in Silicon Valley, where I've spent 36 years of my life, that are salivating at the idea of making money through ICOs and this and that, and, you know, all sorts of cryptocurrencies and such. I don't really care for any of that, and I'm not the, you know, make-money-quickly kind of guy; otherwise maybe I wouldn't have stayed at IBM for 36 years.

But this technology has taken the world by storm. In fact, there's a book that a father-and-son team, the Tapscotts, have written, where they characterize this as the second coming of the internet, and they are really, I would think, talking more about the private blockchain than the public blockchain. What's the difference? In the private one, you have to be admitted to the party before you can take part in the transactions of such a network, whereas in the public one you don't have to be explicitly admitted, and hence you don't have to reveal your identity and so on. That means that in the private one, Byzantine behaviors shouldn't be that much of a concern. Yeah, there could still be bad players and all that, but you know who they are, so you can go after them. This is just like Uber drivers or other such people in the world today: their reputation will be affected because their identity is known.

In terms of practical deployment, the private blockchain was deployed as early as February of this year, for private equity fund management on the island of Guernsey, off the British coast. IBM worked with a company in Chicago, Northern Trust, to produce the application software. The important point there is that the value of the transactions is high but their rate is low, because it's private equity. IBM and its partners, working together in the context of the Hyperledger consortium, came up with this one project among the many projects in that consortium, the Fabric. As of July they have released version 1.0 of that piece of software, which is open source and free for you to download, and that's considered ready for prime time. There were two prior releases, 0.5
and 0.6. There are some significant differences, especially between the 0.6 release and the 1.0 release, that I will hopefully get to later. Separately from this free software that you can download and do whatever you want with, IBM has a blockchain-as-a-service offering, which is now called the IBM Blockchain Platform, which obviously you have to pay for because it's running on the IBM cloud. Believe it or not, it's running on the mainframe, running the Linux operating system, which has got super-duper features compared to Intel SGX in terms of secure execution environment and things like that. That's the one that previously used to be called HSBN, and it's available in many places, as listed there. I'm not going to, you know, talk through every bullet point; I'll let you look at it.

Meanwhile, a few months ago Microsoft released what they call the Coco Framework. I will say a few words about it; I don't claim to fully understand the very high-level claims that are being made there. Oracle joined the Hyperledger consortium back in August, and in October they announced, similar to the IBM Blockchain Platform, an Oracle Blockchain Cloud Service that they have said will be available sometime next year, based on version 1.0 of the Fabric. Also in October, one of China's largest companies, Baidu, announced that they have joined the Hyperledger consortium as a premier member, which even Oracle didn't choose to do, which means you have to pay more money and things like that. The market is estimated to be like 7.74 billion by 2024. Who cares about the exact number; it's the order of magnitude that you should take to heart.

As I said, I care more about the private or permissioned blockchain systems, and not all this other hoopla that's going on with Bitcoin and all the speculative stuff, where people think the world's hunger problem will be solved, and all sorts of, you know, exciting things are being mumbled by people who are spending a lot of time working on stuff like that, trying to fly under the radar in terms of bypassing governmental controls, this and that. And maybe you'll have a disaster like the 2008 financial disaster, thanks to many other people trying similar tricks before.

This is just a laundry list of topics that people have worked on, including myself since I joined IBM: Byzantine
agreement protocols and such, which we were working on at Almaden in '82 with Barbara Simons and Ray Strong and Danny Dolev, who was a visitor, and people like that. I even worked on combining the two-phase commit protocols, presumed abort and presumed commit, with the Byzantine agreement protocols that were developed in the lab at the same time. Then there are many other things that have happened, and a lot of those things have relevance to what I'll be talking about. In a similar sense, a bit before I joined IBM is when the System R and SQL work was done. I joined the R* project, which is the distributed version of System R, when it was midway through its existence. Then a whole bunch of things happened: replication, OLAP versus OLTP, shared-nothing versus shared-disk and all the debates that went on there, as well as stored procedures and object-oriented databases, and then the more recent hoopla with NoSQL and MapReduce and Hadoop and all this kind of stuff, and of course the cloud and such, which all have relevance in this context, as you will see.

So what's the problem we are trying to solve with these private kinds of blockchains? Look at a scenario like what's depicted here, where you are shipping something from one country to another. There are many parties involved, and even if they are all using computers, there's a lot of point-to-point communication that goes on, but there's no single source of truth. Each one has its own database, and they might exchange messages, but nobody is able to get a good grip on what all is being worked on, what the current state is, and things like that. Also, even now there are a lot of paper documents in use, and if something goes wrong, attempts are made to go and fudge the documents to shift the blame to somebody else, and it's hard to prove that people went, post facto, and changed the original documents. Many such things exist, and this leads to a lot of inefficiency, expense, vulnerability, lack of transparency, and such.

So the attempt with the blockchain way of doing business is to do things in a more methodical way, where the programs that are executing on the different nodes will all be collectively developed and agreed upon, and there'll be, you know, stamping of the different transactions by the parties that are agreeing to the transaction, done in a way that later on they can't claim that they didn't agree to it. Also, the databases that are going to be used will be shared and replicated,
and, you know, the state changes will happen in a very thoughtful manner and all that. It won't be the case that it's all some random set of programs, independently developed, running all over the place. Of course, this means that there has to be more discipline with respect to how people go about their jobs. They can't just leave things the way they are and expect that, just by installing the blockchain software, magically everything will suddenly improve.

So, on the left you see what happens in many situations: something like a clearing house being an intermediary. This happens even with house purchases, right? You have the escrow company and things like that. In the right-hand-side way of doing it, you now have digitally signed and encrypted transactions, and there is a ledger, which is like a bank ledger or a transaction recovery log, where you make append-only entries that talk about who agreed to what, which transactions were done when, which transactions committed, which didn't commit and why they didn't commit, and things like that. This is not to say that the databases are immutable. It's only the ledger that's immutable, and this has to be clearly distinguished. The database can be updated in place and all that. It is just like today: you have the recovery log, which we don't do any in-place update on and which is always appended to, but the database is updated in place. Similar kind of story.

There's a lot of, you know, exploitation of cryptography and this and that, sometimes even one-time usage of keys and things of that nature, depending on the particular use case scenario. I don't know anything about cryptography and all that, but I know enough to tell you that there's exploitation of various technologies going on, which is similar to what I had in the laundry list at the beginning.

IBM's Blockchain Platform, which is the service I mentioned before, not only has this Hyperledger Fabric, the free software, but it also has some additional tools that are not available with the free version. Plus there is another Hyperledger project, which is open source, called Composer, which is a higher-level tool, if you like. It's somewhat like a 4GL on top of a database system, which allows you to more easily define certain things, and I'll get into some of the details later. Then on top, of course, you have the blockchain application. So there are a bunch of things like this that you get by
going to a platform, as opposed to just downloading the free software, doing your own bolting together, and worrying about certain failure scenarios; automatically bringing up backups and all that is not part of the open source version of the software.

OK, so there are various roles in this context, which are different from, let's say, something like an application programmer or database administrator, system administrator, and things like that that you know of in the database context. Here you have the blockchain developer, the blockchain architect, the regulator (these are like the SEC and such people), and also membership services. Remember I said you have to be admitted to the party before you can participate. It's those sorts of things that are dealt with by the membership service, which issues you certificates and public key, private key, and all this kind of stuff. Then, of course, you do need to hook up this blockchain system with your traditional back-end systems, where you do order fulfillment and things like that: SAP, whatever, IBM's transaction processing systems, and things like that. And then, of course, the network operator is also there, which in the case of the IBM blockchain network will be IBM. But in China, for example, this Fabric is now available on the Ali cloud, the Alibaba cloud. Microsoft also, on the Azure platform, has many of these sorts of blockchain platform software available as a service, not necessarily with, you know, fancy features attached; they're just letting you run it there.

There are many things that go with a blockchain solution. I won't spend too much time on them. The smart contract, which is a generalization of what was done originally in Bitcoin, which was just managing this one thing called the bitcoin, to something more general, which is managing any kind of asset, whether it's a digital asset like a music recording or a movie, or even physical assets like diamonds and such, or packages in that scenario that you saw before (the digital representation of the physical things, obviously, right?), was due
to the Ethereum guys, who came up with that generalization. But as we will see later, Ethereum had its own problems, which were inherited from the Bitcoin way of doing business, and we will discuss that later. There are other things over here, like systems management, the notion of events, and things like that. Composer, as I mentioned before, provides you higher-level tooling to define various aspects of a blockchain application plus the server-side software: things like defining the assets that are being managed (there is an asset registry); the kinds of transactions that are going to be executed, buying and selling and transferring ownership and all that; and then the business network participants, which are the organizations that are part of this network, and, within each of those organizations, who the users are and who can do what, all the access control and authorizations for invoking transactions and things like that, which are really not different from a database system, where you have similar things also.

So, was that a question? Yeah.

It's a little hard for me to understand where this runs. I mean, in classic database terms, when I actually commit a transaction, there's some, like, single event.

We will come to that. OK, so the question being asked was: the gentleman has a problem figuring out, compared to a traditional database system, exactly when a transaction commits and when all that gets finalized. Hold off on that question till I get into more of the details later. I was just trying to give you a very high-level view, and then I'll get into some of the details.

So, just to delve into a bit more of the detail, look at just one of the peer nodes, not a whole bunch of nodes. There is the application, which runs on the client side and deals with the GUI and so on, and which is also developed by the blockchain developer, in addition to the server-side software, which is the smart contract, which you can think of as being like a stored procedure. When the client starts a blockchain transaction, it does so by providing some input parameters to one of the smart contracts. Each smart contract in turn might invoke other smart contracts, but ultimately, when control goes back to the client, that is the end of the blockchain transaction. I would like to believe that each blockchain transaction maps to a corresponding database transaction, but we will come to that. As the smart contract executes, there are events being emitted saying this smart contract got invoked, that smart contract got invoked. Also, the smart contract, in addition to invoking other smart contracts, makes database calls; here it's shown as a sort of key-value store API, get, put, delete, and so on, which is mucking around with this database that's listed here. In addition, you have this blockchain, which is, as I said, like the transaction recovery log, but it's got a lot more in it than a typical recovery log, and it looks very different.
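
As a rough illustration of the stored-procedure analogy, here is a minimal Python sketch of a client invoking a smart contract with input parameters, that contract invoking another contract, and both making get/put/delete calls against a key-value store while events are emitted. The names `KVStore`, `transfer`, `credit`, and the event strings are hypothetical and chosen only for illustration; they are not taken from the Fabric API.

```python
# Hypothetical sketch: a smart contract as a stored procedure over a
# key-value store. None of these names come from Hyperledger Fabric.

class KVStore:
    """Minimal key-value database exposing get/put/delete."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


events = []  # events emitted while contracts execute


def credit(db, account, amount):
    """Inner contract: add amount to an account balance."""
    events.append("invoked credit")
    db.put(account, (db.get(account) or 0) + amount)


def transfer(db, src, dst, amount):
    """Outer contract: move amount from src to dst, invoking credit()."""
    events.append("invoked transfer")
    balance = db.get(src) or 0
    if balance < amount:
        return "rejected"
    remaining = balance - amount
    if remaining == 0:
        db.delete(src)          # drop accounts that become empty
    else:
        db.put(src, remaining)
    credit(db, dst, amount)     # one contract invoking another
    return "ok"                 # control goes back to the client here


# Client side: start a transaction by passing input parameters.
db = KVStore()
db.put("alice", 100)
result = transfer(db, "alice", "bob", 100)
```

In this sketch the transaction ends when `transfer` returns to the caller, which mirrors the point that the blockchain transaction ends when control goes back to the client.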
In a typical recovery log, multiple transactions' reads and multiple transactions' writes will be interspersed, whereas here each transaction is described by itself, and the description includes not just the writes but even the reads. We will see the details of how that happens later.

So, the way this works: the initial execution results in what's called the simulation of the transaction execution, and the client gets back something which it can use later on to figure out whether the transaction ultimately committed or aborted. It gets back a handle, and we will see details of how that happens a bit later. But it is this blockchain, which is a set of sequential blocks, that is the immutable part. Unlike in the open blockchain, where blocks can go in parallel and things like that because there are multiple people who are successful in adding blocks, here you will see that it's very deterministic, and there is no ambiguity about what the next block is that gets added, and things like that.

In terms of application areas, they are all over the place. They are not restricted to only the financial industry. There are diamonds being managed; there is supply chain and global trade digitization; food safety, where Walmart and IBM have been working in China, farm to fork; IoT; you name it. Everything under the sun is now being dealt with via blockchains. And here is the reference to the book, by the way. On that page that I referred to at the very beginning, there are many papers that I refer to, white papers and so on. Like I said, there is low-volume stock trading also, like the Japanese exchange, which is different from the one that went into production in the early part of this year on the island of Guernsey. And what
else do I want to tell you? There is also asset management that IBM itself is doing using the blockchain, where you have to look at many things like manufacturing records, shipping records, finance records, order management records, and things like that. It's a big mess, really, and the hope is that by using blockchain a lot of it will get streamlined quite a bit. There are more details here about the kinds of actions and people involved.

In the Middle East, believe it or not, Dubai and Abu Dhabi are really competing with one another to leverage blockchain, especially in the government context. Dubai has declared that by 2020 it will make all government operations paperless. Essentially, they want to put on the blockchain all the things that an Emirati citizen is eligible for. As long as you can prove who you are, using biometrics and such with a national identity scheme, then everything else will be there. You walk into some hospital, or a social security service, or whatever the equivalent things are, and they will know exactly: you are eligible for a fifty percent discount, you are eligible for free service, blah blah blah. So that's the sort of thing, and there are various parts of the government that are very actively prototyping these things and so on. This has not gone into production, of course.

All across the globe, it's also believed, even in countries like India, that with all the corruption and this and that, the blockchain way of doing business will in fact bring about a lot more transparency than the hanky-panky that goes on with traditional ways of doing business. It's for that reason also that I get really annoyed when people want to enable the hanky-panky with all this cryptocurrency and this and that, where they want to go under the radar and not be subject to some of the regulations that fiat currencies are currently forced to deal with.

There are many consortia in the context of which a lot of this work is done. It is still chaotic, because there's no such thing as a SQL standard in this space. It's still not as bad as the NoSQL arena, because in NoSQL essentially each of the companies has its own way of doing database stuff, whereas here at least it's a bunch of consortia that are doing their own thing, and the number of consortia is a lot less than the number of companies. As happens with many of these sorts of things, companies want to hedge their bets. It's not like JP Morgan is only in the Enterprise Ethereum Alliance; guess what, they're also in the Hyperledger consortium. They are hedging their bets because they don't know which horse is going to win. So be careful when you choose which software to deploy, because if you bet on the wrong horse you may get messed up: there is no standard, and you would have to rewrite your app and all that. In any case, the latest kid on the block, as this American Banker article said as of February, and I believe that's still true, is the Enterprise Ethereum Alliance, and I'll get into some of the details later.

So, there was a very nice paper in this year's SIGMOD by the National University of Singapore guys and some of their colleagues elsewhere, which
tried to come up with the first-ever benchmarking framework for, they said, private blockchains. So far they've covered only one private blockchain, namely the Hyperledger Fabric, but they've also done the evaluation of Ethereum and one other thing, which is still an open blockchain. The model is kind of nice to look at. There is a consensus layer, where there are different ways of doing things: proof of work, the expensive and, in this green-conscious world, wasteful approach that Bitcoin chose and that Ethereum followed with; and then there are the improvements like proof of stake and PBFT, practical Byzantine fault tolerance, which is what Barbara Liskov and company did, and which the Fabric used in the previous release. In this release the Fabric people have gone with just Kafka, which is pub/sub. The contract itself may be written in some specialized language; people have come up with even specific languages for the smart contract, and even for the execution environment, in terms of the VM and so on, they've defined specialized ones. And then, of course, there is the database layer, where different kinds of systems are being supported.

In terms of the various architectural choices, at this point it's fairly chaotic. There isn't a systematic study of why some group chose one way of doing business versus another way, with respect to many things. Do you keep all the data in the blockchain, or do you keep only the hash of the data in the blockchain while the actual data is kept in an off-chain database? What does that do if the data gets messed around with in the off-chain storage? Yeah, you can detect the fact that it's been messed around with, because the hashes don't match, but you can't get to the original data, so how good is that?
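
The trade-off of keeping only a hash on the chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 is assumed as the hash function, and `on_chain`, `off_chain`, `record`, and `verify` are hypothetical stand-ins rather than any real blockchain API.

```python
import hashlib

def digest(data: bytes) -> str:
    """Hash of a document; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).hexdigest()

on_chain = []    # stand-in for the ledger: holds only (id, hash) pairs
off_chain = {}   # stand-in for the off-chain database: holds the data itself

def record(doc_id: str, data: bytes) -> None:
    """Store the data off-chain and only its hash on-chain."""
    off_chain[doc_id] = data
    on_chain.append((doc_id, digest(data)))

def verify(doc_id: str) -> bool:
    """Return True if the off-chain copy still matches the recorded hash."""
    recorded = dict(on_chain)[doc_id]
    return digest(off_chain[doc_id]) == recorded

record("bill-of-lading-1", b"10 crates")
ok_before = verify("bill-of-lading-1")

off_chain["bill-of-lading-1"] = b"9 crates"   # someone tampers off-chain
ok_after = verify("bill-of-lading-1")
```

After the tampering, `verify` reports the mismatch, but nothing in `on_chain` lets you get the original bytes back, which is exactly the concern being raised.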
Whereas with the blockchain way of managing everything, you have so many different copies that even if one copy or a few copies get messed up, you still have the pristine version somewhere else. And there are many other things, you know, assumptions about what kinds of faults you are going to deal with, and so on and so on. A longer version of the SIGMOD paper, by Beng Chin Ooi and co-authors, the National University of Singapore guys, is like a survey of these things.

In terms of comparison, there are many hours I could spend talking about the different ways in which replication has been done, even in database systems, in the past. One is the primary-copy update, where you do the update in one place and then use the log produced at that site to apply the changes in another place, as if you were doing restart recovery. But these two systems have to have the exact same schema, the exact same version of the database, and so on. The second one says you capture from the log the changes that were done, then you create SQL statements out of them and re-execute them. So now the schemas can be different; it can be DB2 here, Oracle there, and things like that. The third one is what my dear friend Mike Stonebraker thinks is, you know, the greatest thing since sliced bread, which is this whole silliness of each transaction being executed serially, one at a time, on the primary, and then you capture the SQL statements on the primary and redo them. It has all sorts of problems; he never bothers to tell you what all the problems are. Things like: it can't handle non-determinism. If a long-running transaction is executing and a short, high-priority one comes along, there's no such thing as interrupting that longer one, blah blah. Many, many problems that we all have had to deal with over the decades, goal-oriented scheduling, parallelism up the wazoo being leveraged, all of that is thrown to the winds, because within a partition he does one transaction at a time.

It turns out that in the blockchain way of doing transaction ordering there is some amount of randomness deployed: transactions, as you will see when I give the details later, get ordered fairly randomly, without any method to the madness. If you have conflicting updates being performed, this could result in the first of the transactions in a block that updates an object succeeding and all subsequent ones getting blown away. If you had done it the traditional database way, all of them could have succeeded: if they all started at the primary copy, then even if they all conflicted on everything, they would have been done one at a time and then redone in that same order elsewhere. But by doing up front, in parallel, what we call the simulation in the Fabric, and then putting them all into a block, you lose out in terms of concurrency. These are some of the issues that I would like to see dealt with.

So this is the benchmarking thing that the Singapore guys did; I won't dwell on it. It is the one where they have done something like a TPC benchmark for the blockchain environment.
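
The concurrency loss described a moment ago, where the first transaction in a block to update an object succeeds and later conflicting ones get blown away, can be sketched as follows. This is a sketch under an assumption: conflicts are detected by comparing the version each transaction read during simulation with the current version at validation time, a mechanism the talk does not spell out at this point. All function and variable names are hypothetical.

```python
# Hypothetical sketch: simulate-then-order validation vs. serial execution.

def simulate(state, versions, key, delta):
    """Simulate an increment: note the version read and the value to write."""
    return {
        "key": key,
        "read_version": versions[key],
        "new_value": state[key] + delta,
    }

def validate_block(state, versions, block):
    """Apply transactions in block order; reject any whose read is stale."""
    outcomes = []
    for tx in block:
        key = tx["key"]
        if tx["read_version"] == versions[key]:
            state[key] = tx["new_value"]
            versions[key] += 1
            outcomes.append("valid")
        else:
            outcomes.append("invalid")
    return outcomes

# Three increments to the same key, all simulated up front against version 0.
state, versions = {"stock": 10}, {"stock": 0}
block = [simulate(state, versions, "stock", d) for d in (1, 2, 3)]
outcomes = validate_block(state, versions, block)

# Serial execution: each increment sees the latest state, so all succeed.
serial_state = {"stock": 10}
for d in (1, 2, 3):
    serial_state["stock"] += d
```

Here only the first transaction in the block survives validation, while the serial run applies all three increments, which is the gap between the two approaches that the talk is pointing at.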
Know various, queries, if you like that, exercise, different components, of the system namely the, application, layer the execution, engine the data, model the consensus, layer and so on and they come to some conclusions, but for the fabric, they use the previous release the points its release and they, are revising, these numbers, and I'll find out more when, I'm in Singapore in the next, four weeks so. They have compared, in the, paper Sigma, paper a 3m parity, and hyper ledger and they, have a framework, in which you can plug in other. Systems. Of this kind and so, that's their contribution. And this paper will get cited forever because, the first such thing so this is nice kind of paper to write you know, so. Now let's come to aetherium so as I said Athenaeum, guys are the ones who, generalized. The, Bitcoin to the notion, of managing. Any kind of asset using the blockchain but, it still at these problems, because of using proof of work which, is the solving a math problem bla bla and wasting. Lots of energy and, also this possibility. Of things going in, parallel, and, multiple, branches. Being there and so on they. Also have, this notion of gas for charging, and, of course you, know the, the, JP, Morgan guy said internally. Done some work called quorum, which, they were using. As a closed source within, JP, Morgan, which. Tried. To take aetherium to the private, domain and they. Then early this year decided, to contribute, it to open source and that forms. The basis for the enterprise at 3m Alliance it has some significant, differences compared, to how. The hyper, ledger fabric does its thing so. Just to give you an example they, have a single blockchain. They. Have the notion of public state and then private state the private state involves. Only subsets. Of the nodes within your network the, public state is visible to everybody in the network. So. We, have, something called the channel, concept, which, is you, have a network of n nodes subsets.

of the nodes can form a channel, which is like a Venn diagram, and they can be overlapping and all that. In our case the database as well as the blockchain associated with a channel is completely disjoint from those associated with every other channel in that network. Whereas here they have this single blockchain in which all the transactions are being recorded, but then they say only the public state's data will be in its full-blown glory in the single blockchain; for the private one they have only some hashes here. In other words, the private state is not redundant, so if you blow away all those private state copies, you are messed up, because the blockchain doesn't have it. Whereas in our case the database is merely a cache of the latest version of the data that's talked about in the blockchain, so you can blow it away, start from the beginning of the blockchain, and you will be able to reconstruct the entire state that got blown away. So I still don't quite understand how this thing is intended to function in a similar kind of context, but anyway, this is what they've done. This is what I meant by saying there are all these different approaches being taken; it's high time somebody evaluated these things in a more systematic way. Yeah?

Why exactly... My understanding was that the private state is still stored on the log in some fashion, maybe encrypted, but you're saying they actually have some out-of-band method?

No, no. OK, so the question was, isn't the private state data in the log itself? But I'm saying if you have, like, a bomb being blasted on that private state version of the database, and you don't have the traditional recovery log and all that to recover, or somebody clobbered that data in whatever way, they took over that site and so on, the blockchain itself doesn't have the data explicitly, because the claim was that they
wanted the private thing. In this case, for example, if you look at the green private state, it's participant 2, participant 1 and the regulator; these other guys are not part of that private state for green, right? They didn't want the actual data to be in the blockchain, because then these other guys would have visibility to that data. In

our case this is a non-issue, because the blockchains are separate for the separate channels. The log is something orthogonal, don't worry about that for the time being. In the blockchain, in which case, the hash is what they said is OK, yeah. They don't have the actual data but the hash; but for the public state they say the actual data is there. This is what I have gathered.

This is another problem with this whole open source blah blah blah space: documentation is pretty pathetic. People think they can just say, hey, go look at the source code. That's like an excuse for not doing proper documentation. So I'm really tired of not being able to get proper answers in many cases, and even the documentation, if it exists, is not necessarily matching the code, and it's not even clear, if the code is documented within it with comments, that it's necessarily consistent with the actual code as it evolves and so on. So, yeah?

So in the case of Hyperledger, when you are saying the channels are separated [unclear], right?

For each channel, the blockchain and the corresponding database are disjoint from any other channel's blockchain and database. Like I said earlier when I described channels, it's like a Venn diagram: I can be in a channel with Dan, and I can be in a separate channel with Edie, and then there can be a third channel which involves all three of us. But as of now we don't allow transactions to span channels in terms of update transactions, because that'll need two-phase commit and all this kind of stuff. At this point we are only allowing read access to other channels' data under appropriate authorization conditions, but we don't allow updating one channel's as well as another channel's data in a single blockchain transaction, if that's the sort of thing you are getting at.

OK, thank you.

OK, so Hyperledger Fabric. IBM
initiated this early last year by contributing code to open source, and it's gone through significant changes, especially between 0.6 and 1.0, by the introduction of this concept of channel, which didn't exist before. Once you got admitted to the party, you got to see everything that's happening within that network, and people said, hey, that's not good enough. So imagine, this is a much more restrictive thing than the open blockchain kind of notion where anybody anywhere in the world can see everything. And also it turned out that the introduction of the channel concept allowed for greater scalability and things like that. I don't have the time to go through all that.

But like I said, previously we used PBFT for consensus, but for performance reasons and, I don't know, whatever reasons, because I wasn't party to this decision, Kafka is being used in this release, and so that's what this is talking about here. Of course it improves performance, clearly, because PBFT is still expensive. But we are allowing for a pluggable consensus layer as well as database layer, so if you want you can plug in these other things.

The smart contract is also called, for whatever reason, chaincode in this Fabric software. There are system ones and regular ones, which are not too different from catalog tables versus user tables in relational. So you can imagine the system ones are the ones that are used to deploy user chaincodes. And then a key-value store is being used, and optionally in this release CouchDB is also allowed, which is a document database. There are more things here; I'll let you look at this.

I want to cover this important thing, which is the three-stage execution of a transaction in the Fabric. The client, as in that original picture I showed, sends the transaction to a few endorsing peers, a subset of the peers in that channel. Exactly
how many and all that is, you know, an application design consideration. They execute this chaincode, maybe potentially more than one. They track the reads that were performed; the updates are not actually sent to the database system but are trapped. This is like an optimistic execution of a transaction. They capture the read set and the write set and they send it back now to the client. The client compares these different ones it's got for that same transaction. It's the same chaincode being executed everywhere; because

of concurrency and all that, they could wind up giving different results, or because one party agrees, the other party doesn't agree and so on. It's really up to them now, the application semantics, as to what this all means. If they all agree, then the next step is: the client sends all this stuff, including the signatures of the endorsing peers and so on, to the ordering service, which in this case is Kafka, remember, which just takes many such transactions that were given to it and puts them into a block. It orders them randomly; this is where no intelligence is being applied in terms of analyzing the read and write sets and trying to minimize the number of aborts or any of that kind of stuff, or the amount of wasted work and so on. And then it sends now this block of transactions to everybody. The original guys who executed some of these have amnesia: they don't record persistently anything of that simulation that they did.

Now each guy steps through one transaction at a time, without communicating with one another. They check whether the read set is still in the same state in their version of the database. If it is, then they say the world hasn't changed with respect to that set of data, so without executing the smart contract they can just take the write set information and plunk it into the database. In this case validation has succeeded and the transaction has committed. If the validation fails for any reason, some data has changed... Originally you asked for all of the [unclear] employees, you got eight of them; those
eight might still be in the same state, but guess what, two more [unclear] employees have shown up in between, so now the answer is different. So again validation fails, in which case this is like a transaction getting aborted. It's really up to the client to figure out what the heck to do next; typically they'll resubmit the transaction. But the point is, it takes a long time before you figure out that some transaction is getting booted, and that's the inefficiency of this whole thing. And this is where I said, if any subsequent transaction after a particular one that changed an item changes the same item, you are guaranteed that these following ones will get blown away, because as part of performing the update during commit you are upping the version number or whatever that's tracking the state of the different items. More details... if you really want the details you can also go look at some of my other videos. Yeah?

I'm trying to understand what's new here.

Well, this particular way of doing it is definitely new.

I mean, in particular, what's new? Presumably in databases in the past there had been a write-once log that you can never delete. If you wanted to be able to reconstruct your database you would have some...

No. OK, OK, so I didn't realize that's where you are headed. OK, hold on. So if you compare this with traditional workflow management systems or business process management systems, where multi-party transactions have been done, first of all it's not as systematically done as here, in terms of the same chaincode running in different nodes, the same work being done in parallel in multiple... no, not the database, it's the code, it's the smart contract. Don't get confused between the database, which is being used in a fairly dumb way here, versus the logic that encapsulates the business rules of engagement, which
is what the smart contract is. It is what, traditionally in workflow, the high-level diagrams we all draw in a workflow management system are, or you can think of it as stored procedures. Those were more theory than real; I mean, we can have a long conversation. Believe me, the languages being used, as I said much earlier, Kotlin, all sorts of languages have been defined to express the semantics. They don't let you, for example, look at where the moon is and what the current time is and so on, because that will show up as different values in the different endorsing peers. So all sorts of restrictions are placed on what you can do in that smart contract, because they are trying to get deterministic execution. OK? That's why I said even that Stonebraker way of doing business is a problem, because if you had a rand function call invocation in your application's invocation of SQL, and you capture that and re-execute it in another place, the results are not guaranteed to be the same. Whereas in the log way of doing replication, which I discussed, it's the after-effect of any randomization and so on that you did that gets captured: what actually happened is in the log, and that then becomes the source for cranking out SQL, which gets re-executed accordingly. But still it's all whatever you can express in SQL and such, whereas here there's a lot more in terms of signatures and this and that that comes into play that's not in our traditional database way of doing business. In any case, it's the higher-level specification

of these rules of engagement that is very different from different organizations writing their own application programs and exchanging messages and communicating and so on, which is the old way of doing business.

Let me just quickly run through it, because he's getting nervous. So there are more things talked about here: how the fabric layer between the chaincode and the DBMS is the one that is tracking the read and write sets, not the database system. In particular, the reads are sent during simulation to the database system and the writes are not; they are trapped in the fabric layer. Essentially this means that the fabric layer has to be all-knowledgeable about whatever database primitives are being supported in the chaincode, and that can get really messy once you start allowing SQL there, which is what my project is doing. OK, so this is from various perspectives a bad idea, making this layer on top of the DBMS behave like a DBMS, and it introduces other complications also. You can look through the details that I have got here.

So I just wanted to flip through a few more slides. The R3 alliance is yet another one of these consortia. They do allow a relational language to be used. They do have some higher-level concepts that they talk about: contract execution is deterministic, and its acceptance of a transaction is based on the transaction's contents alone. A transaction is only valid if the contract of every input state and every output state considers it to be valid. So they have this notion, if you like, of something like referential integrity and other kinds of constraints that you have in a database system, associated
with these objects and so on that are being managed using this. The Fabric, as far as I know, right now doesn't have such higher-level concepts. And they also have some other things that I just don't have the time to talk about, but in the extended version of the slides you will find more things. They even keep track of the history of transactions, which transaction followed which transaction, and they even keep hashes of all this in a way that you cannot tamper with. There are notions of notaries for transactions and things like that, just like you have notary public people being made to certify transactions, that are, you know, third parties and so on.

Intel has a project which they initiated called Sawtooth, which is one of the many projects in the Hyperledger consortium, and this is the one where they are trying to leverage SGX, you know, the secure execution environment thing that I referred to before, where I said the IBM mainframe one is even more super-duper and so on. In the IBM one, for example, the whole Fabric peer and everything can be run within the secure execution environment. It turns out Intel has a lot of restrictions on how much memory and storage you can associate with this secure execution enclave, so you are not able to run everything over there. What that means in the mainframe is that even the operator cannot look at the data or muck around with the functions and so on, the code itself, because it's that much hidden away from such people. But they do have this notion of transaction families, transaction dependencies and such.

Anyway, more things. Microsoft, as I said, announced something called the Coco Framework. For whatever reason, they say this is like a layer on top of any blockchain
system you like, and for whatever reason they've chosen to leave out Fabric. They have Sawtooth but they don't have Fabric, maybe because they think, oh, IBM is associated with it, let's not put that there. I'm not sure, because they claim they can accommodate anything, but in the picture that they have in this August report

they put out, they show Ethereum, Quorum, Corda and Hyperledger Sawtooth. I still haven't quite figured out what added value this provides. It's not like it's actually a common language on top of all the weird languages each one of these things has. But they make some tall claims, like, oh, this will allow you to have super-duper transaction rates and so on. If you haven't really mucked around with the insides of any of these things, I don't know how, on top, you can do something that suddenly produces magic. So I have to admit I only recently realized this report had come out in August, so I haven't done my homework digging through the details of the report. I don't know if anybody from Microsoft is here. Anybody? No, or nobody wants to admit. Oh, OK.

So this is what my current project is doing. I'm not able to give you the details right now. Essentially, where it said get, put, delete, we will be allowing SQL statements. You might think, big deal. Guess what, it has lots of ramifications, because we didn't want to make the layer in between the smart contract and the ledger become all-knowing about SQL. And it's non-trivial, by the way, because it's not just the SELECT statements that need to be sent to the DBMS if you follow the same old pattern. Even IUD statements, insert, update, delete, might embed a subselect: insert from subquery, delete where blah blah blah condition, update where blah blah blah condition. Imagine, now you have to carve out the read portion of these IUD statements and you have to send it. This means you had better understand fully the semantics of all sorts of complicated SQL. And also you cannot do this if you are going to allow a stored
procedure, which runs on the database side, to be invoked, because then the Fabric is not there to police all the database calls that will be made. So on and on; I can spend many hours talking about this.

But in terms of futuristic topics, there are many things that are open. As I say at the bottom, there are numerous research possibilities for database and distributed systems people in this, what I call the new era of distributed computing. As I mentioned earlier, when it comes to this channel notion, we are not allowing multi-channel transactions, especially not updates that touch more than one channel.

Then, in terms of how many endorsing peers should be there in a network, in a channel, and for any given transaction execution what subset of those endorsing peers should be involved... By the way, the major distinction between an endorsing peer and other peers is that the endorsing peers have the smart contract installed in them; the other peers are merely tracking the database changes, they don't have the smart contract in them. So which subset of the nodes should be involved in a given transaction's execution? What should be the logic? Oh, Dan is an important guy, he has to bless every transaction, whereas amongst the three other people who are involved it's enough if a majority agree. Things like that. We don't have a cookbook that we provide to you today on what the right things to do there are. These are like how many indexes to have, how to partition the data, all the typical database things that we postponed working on for years and years, long after relational was invented originally, right? Same sort of thing. Then how to deal with non-deterministic actions.

As I said, that gentleman asked the question about, hey, can't you do all this within the context of the database system? There are people even in Microsoft who have done this in a more single-database kind of context, adding
certain features, and there are other people in IBM looking at this also.
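The endorsement rule used as an example in the talk (one party who must bless every transaction, plus a majority of the three others) can be written down as a small predicate. This is only an illustrative sketch: the class name `EndorsementPolicy`, its fields and the peer names are made up for this example and are not the Fabric's actual policy syntax or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndorsementPolicy:
    """Hypothetical policy: every 'required' peer must endorse,
    and a strict majority of the 'others' must endorse as well."""
    required: frozenset
    others: frozenset

    def satisfied_by(self, endorsers):
        endorsers = set(endorsers)
        # Every mandatory endorser has to be present.
        if not self.required <= endorsers:
            return False
        # Strict majority among the remaining endorsing peers.
        agreeing = len(self.others & endorsers)
        return agreeing * 2 > len(self.others)


# "Dan has to bless every transaction; amongst the three others a majority is enough."
policy = EndorsementPolicy(
    required=frozenset({"dan"}),
    others=frozenset({"org_a", "org_b", "org_c"}),
)
```

With this policy, endorsements from `dan`, `org_a` and `org_b` are enough, while endorsements from all three others without `dan` are not. What the right rule is for a given application is exactly the open question raised in the talk.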

Adding features to the database system itself to introduce a notion of something like the blockchain, where tables can only be appended to. There are things like that with multi-version tables, what we call bitemporal tables, but they don't have all the other attributes of keeping the signature of who modified it and so on. We might have something in the log, but we don't, as part of the database itself, keep track of these sorts of things.

But guess what, over the decades even logging of reads is almost never done; if it's done at all, it's done in a separate audit log, not in the traditional recovery log. And this in spite of all the fancy things that the Orange Book and this and that that the DoD guys have, right? Oracle forever supported many of these Orange Book kinds of levels of security and so on. Still, many of the things that we are talking about in this context, in spite of the four decades of relational work that's gone on, have not been done. Again, this is not to say we cannot, now that we think of all this, go and add these things, and as I say, people are actively working on this, even the IBM India Research Lab guys. In fact my goal was to first do what we are doing now, which is leave the overall execution framework the same in the Fabric but just allow for relational, and do the things more efficiently with respect to the Fabric not having to become very knowledgeable about SQL and so on, and learn from that what the right things to do are, and then talk about not keeping the overall model sacrosanct and actually do more of it within the DBMS, and not have to have a whole lot of other stuff on top. I think I will stop here. [unclear] OK, thank you very much.

So, I know time is up, but maybe we can take two questions before people [leave]. Yes, Paul?

In this process of endorsing, there's computation and participating
and being part of the consensus, and if you've got many different players, in your example, what is the incentive for one of the players in a fairly complicated blockchain channel to endorse things that don't directly involve them? In Bitcoin, you understand, you get a little bit of Bitcoin back when you...

OK, so first of all, the question was: in the way this whole thing works in the Fabric, this notion of endorsement, or the first phase, which is simulation, what's the motivation for somebody to do anything at all? First of all, remember, this channel concept is based on those parties being part of this multi-party thing, the business process, whatever. So there isn't such a thing as some random guy being asked to do something. These are organizations that are working together; this is not some Joe Blow off the street that you are trying to make do something. So I don't think there's any comparison with this whole open blockchain way of doing business, with all this notion of incentive and all that. This is part of your job, effectively; these programs have been written by all these parties that are in that contract jointly. OK? So really that's basically it. If you look at, for example, the scenario I had at the very beginning, these are the customs agents and the shipping guy, the forwarding agent, the exporter, the importer. This is their mainline job; it's not like they are doing something for their entertainment or some such thing.

There could still potentially be bad actors in there. In the public blockchain you need 51% of the total compute power to modify any transaction. With this, how many endorsing nodes do you really need before things start getting messy?

First of all, bad players, as I said at the beginning, with Uber or whatever kind of environment, will ultimately

get nailed and they'll get kicked out, OK? So it's not like people can behave badly and hide under some rock or so. That's the first order of business here: we know who you are, so if you misbehave we'll come after you. And these are typically organizations; like I said to this gentleman, it's not some guy off the street that we are talking about, right? These peer nodes are owned by different organizations: exporter, importer, as I just rattled off. So if they do misbehave, remember, for the transaction to be accepted there is this endorsement condition that can be specified. That's where I said today we give you the tooling to say that, but we don't give you a way of figuring out what the right thing to say is. So you can say things like, as I said, Dan is an important guy, he has to bless every contract; amongst the three others it's enough if the majority agree to it. Those kinds of specifications we let you provide, but we are not giving you right now a cookbook to say this is the recommended thing in this kind of application, and in this other application this is the right thing to do. But in any case, the whole point is we will do the things only if there is enough agreement, and exactly what that means is left up to the application designers.

So this whole 51% and all that... I mean, there are people who are talking about, you know, the server farms up the wazoo in China can go and, you know, this whole assumption that the longer chain will keep getting extended, they can try to screw that one also and extend the shorter chain. All sorts of nonsense is going on with all this open blockchain. I was complaining about all this with some of the people I met earlier in the day. I can spend many hours; I've been debating all this with various people in various fora. And so
I'm not convinced that any of that is such a foolproof thing, because there are bugs in that software; people's money has been stolen. Sometimes people make it appear like it's somehow completely error-free and there's only goodness that can come from it, all these sorts of assertions that are typically made. But there are enough disagreements

even amongst the Ethereum guys about how to extend the block size and what to do with forking and this and that.

Yes, I have questions too, but we're running out of time to get to the reception right now.
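The validation phase described earlier in the talk, where each peer steps through a block one transaction at a time, checks that the read set is unchanged and then applies the write set without re-running the smart contract, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Fabric's implementation: the function name `validate_block`, the dictionary layout of a transaction and the per-key version counter are all invented for this example, and range reads (the case of new rows showing up in a query result) are not modeled.

```python
def validate_block(state, block):
    """Validate and commit a block of simulated transactions in order.

    state: dict mapping key -> (version, value); mutated in place.
    block: list of dicts with 'id', 'read_set' {key: version seen during
           simulation} and 'write_set' {key: new value}.
    Returns a list of (transaction id, 'committed' | 'aborted').
    """
    outcomes = []
    for tx in block:
        # The read set is valid only if every key still has the version
        # that was observed when the endorsing peers simulated the transaction.
        unchanged = all(
            state.get(key, (0, None))[0] == seen_version
            for key, seen_version in tx["read_set"].items()
        )
        if not unchanged:
            outcomes.append((tx["id"], "aborted"))
            continue
        # Apply the captured writes directly and bump each version number,
        # which is what invalidates later transactions that read the same key.
        for key, value in tx["write_set"].items():
            old_version = state.get(key, (0, None))[0]
            state[key] = (old_version + 1, value)
        outcomes.append((tx["id"], "committed"))
    return outcomes


# Two transactions simulated against the same version of "x" land in one block.
example_state = {"x": (1, 10)}
example_block = [
    {"id": "t1", "read_set": {"x": 1}, "write_set": {"x": 11}},
    {"id": "t2", "read_set": {"x": 1}, "write_set": {"x": 12}},
]
example_outcomes = validate_block(example_state, example_block)
```

In this example the first transaction commits and the second is aborted, because committing the first one bumped the version of `x`. That is the effect noted in the talk: within a block, only the first updater of an item survives, and the rest have to be resubmitted by the client.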

2018-03-23
