C++Now 2019: Arthur O'Dwyer “Trivially Relocatable”
All right good. Morning everyone welcome, to trivially, relocatable, I'm, Arthur O'Dwyer. And. Yeah. Let's get started because we have a lot of stuff to cover and I'm already two minutes into my time apparently so you're, all here to hear about trivially relocatable. You're. Gonna hear everything about it assuming we have time to cover it we're, gonna start with what is relocation, and motivate the. Applications. For it I've got five applications, that I promised, in the abstract. We're gonna talk about some friar art how it's currently being done by real, people in the field right now at Facebook Electronic, Arts and Bloomberg and. Why that's fragile and error-prone and I'm gonna pick on one of them, then. I'm gonna tell you how I propose to solve it gonna do a little nutshell of the proposal, but, there's a very small part of this talk because, in part two I'm gonna be comparing it to past and current proposals, and. Versus things you might think it is but it actually is not I'm. Gonna present some open questions that we probably won't have time to discuss here but get me at lunch and. Then bonus slides if we have time, there are slide numbers down there sometimes, there are URLs in the upper corner that if you have a. Connection, the god bolt you can go look, at these examples live, as. Of. A few days ago I've broken dwarf debugging. So. You. May have to add dash G 0 to some of these examples to get them to compile if you're doing it live if you're watching this on YouTube it probably doesn't matter it's already fixed all. Right. Motivating. Relocation consider. What happens when we resize or reefs Reserve and a state vector and, it's already at capacity so. I have a vector of 4 elements I try to add a fifth element there's, no more capacity I need to reallocate, the dynamic, buffer here right so I, go, to my memory, allocator whatever it is I got a bigger chunk of memory I am place a into there and then I need to get ABCD, from where they are down into the vector right. So. Today, the relocation of objects ABCD, from the old buffer to the new buffering, is accomplished, by four calls to the move constructor, and then four calls to the destructor on, some libraries, these calls will be interleaved and some they won't doesn't, really matter you're still making eight function, calls to move constructors, and destructors, and. Is, likely that neither of them is trivial, here's. An example where they are trivial int star is trivially, copyable. All libraries. No you know the people who wrote the libraries put in an optimization, to say if I'm just relocating. In star which is trivially, copyable right I can take it from one place and just call mem coffee to, get it down the other place because it doesn't have ownership semantics, right so. I just called mem coffee to get the stuff down where it is, really. Efficient assembly really, efficient assembly also on lips, did C++, and. So. That's great this. Is cool. However. If I, am relocating, a shared, butter sherry putter is does have ownership semantics, it is non-trivial emove constructible, it is non trivially destructible, those have to do work so, when I call the move constructor, in the destructor of shared putter in a loop that. Generates a whole bunch more code, and. You'll, notice it is not trivial code either I mean I see calls in.
GCC. It's making. Function. Calls, it's. Doing lock X ad l whatever that is on the other side I see a call to pthread key create like this is non-trivial code in every sense. However. When. I take. Those shared, butters from one place owner. Shipping you know he exists. In that one place and I, move him down the other place and I destroy the originals, I haven't. Really changed anything I could have just used mem copy for that taking the bit pattern from one place to the other as long as I didn't call either the move constructor or the destructor I just need to get the bits over there and as soon as I start interpreting them as short butters it's going to work fine so. The. Operation, of calling move constructor destructor, I call, that relocation. And. In, this case not. Just for int star but even for shared butter and unique butter and a whole bunch of types, their. Relocation operation. Taken as a whole is equivalent to mem copy I call, this a trivially, relocatable, type just. Like a trivially, copyable, type is one when you can copy it with mem copy a trivially, locatable type is one where the relocation operation, could be accomplished, with mem copy it is safe. Now, if you relocation. Operation is not tantamount to mem copy we call that non-trivial abhi locatable so there do exist non-trivial, relocatable, types some of them are pretty Commons did list as an example, and. Let's. See an example of that so we have two, types here for word list the singly linked list and we have stood list the doubly linked list. One, of them is trivially relocatable the other isn't here's. For word list, and, all. That stuff on the right side of the arrow is that's all heap-allocated. But my for word list itself holds a next pointer to, the first element of the four word list it, doesn't even know it's size and, it. Turns out that if I have a four word list object, which, remember is only, this. Part right. Everything, else has heap-allocated, I can, take that footprint of that forward list and I can mem copy it somewhere else and it. Is still a valid forward, list in every, sense right. So it's trivially relocatable, I left. This gray arrow the. Footnote says the. Memory. At the old address will. Still hold the bit pattern of a pointer to the first element the list but. Because. Relocation. Is taking. The object from where it is ending. That objects lifetime and starting a new lifetime somewhere else that C++, objects, lifetime, is over, we're never going to look at it again it, would be undefined behavior look at it again we've already destroyed, it right. Now. Stead list on the other hand is actually, a circularly, doubly, linked list. Inside. The footprint of the list object itself we have a sort, of truncated. List. Node it has a pre even the next pointer the next pointer points the first element the list and the priests the, last element of the list so, that we have an efficient way of call of getting end out of on a doubly, linked list right. And when. I call end on a plus the list. I actually get a pointer to this truncated list note that's. So that when I - - said I follow its pre pointer I get, the last element of the list this, is cool. However, if I were to take this list and mem, copy it. Oops. These. Pointers would get up to you know these these pointers for example if this pointers, add. Value. Was 0 X 42 all right memory, address 0 X 42 and I, take that bit pattern now I'm M copy it down here it's still 0 X 42 it still points the same place awesome. The, problem, is by. Justin I'm copying the footprint of the sted list those. Pointers, out there on the heap still point back to the old object, I didn't update them. Right, so, this. Operation is not trivial if I want to relocate a sted list I have to do something more than just copy the bits I have to actually go out and fiddle, those two pointers. So. Stood list is non-trivial, II relocatable. Alright. So, we. Have these two classes of pipes the trivial ones and the non trivial ones and if we could reliably, distinguish.
Them In the library, then. We could use mem, copy for the former class and we could use a. Non trivial move and destroy for the latter, class and we, would get correct code all the time we would only use the optimization, when it was safe all, right but only using the optimization when it's safe requires that you know when it's safe, so. Let's, teach Lib C++, that shared put er is trivially relocatable and see what improvement we get on our vector example. Again. There's a URL up there you can go just test that this code should still work. So. On god bolt there is a custom branch here p1 4 4, and. Here. I have a vector of shared footers and I. Am reallocating, it and, that. Reallocation, can be accomplished, with. Mem copy. In. Just a little bit more code than it took to do the trivial example the in, star example, over, here on a stock Lib C++, with a stock clang. It's. About three times as long so. I've, got faster code and you notice it still got calls to Pete Frankie create and all sorts of crazy stuff so, I. Can, do faster code and I can do three times smaller on this toy example. But. Wait there's more this. Is the stuff that's new if you saw my talk last how many people saw my talk last year a lot. Of people in this room all right welcome, everybody else. Let's. Talk about some other applications, so if we have a reliable, way of detecting trivial, relocatable, types we. Can optimize the move operations of fixed capacity vector I fixed. A pasty vector is being proposed as static vector it's also in boost container as static vector I don't like that name there's nothing static about it but its capacity is fixed that's the important thing. So, here, I have my fixed. Capacity vector with a capacity, of 3 its, current size is 2 it's only got 2 blobs in it they, are but within some state and I, am move constructing. Into, W that, is I am NOT relocating. Here I am NOT I don't want to end the lifetime, of V. But. I do want to move. Its contents, into W well, what boost container actually does is call. To move constructors, of each element. That. Means that what I am left with on the right side is a. A. Two. Element, static. Vector, in. The boost world and, it has two moved from blobs in it and so, when I destroy, V now I'm gonna have to call the destructors, of those two blobs this, is inefficient and it's kind of unsafe because if anyone tried to use that vector they'd say oh it's got two elements but they're in a move from state this. Is not useful behavior, and it is inefficient behavior. What. I would like to do is relocate, those two blobs that is move and destroy them in one operation to, relocate. Them over to their and their lifetimes in V these size goes down to zero so I have to manually set that in the move constructor, right, here's the move constructor, up here right. That's. What I'd like to do that is more efficient, when I go then to destroy V there's. Nothing in it I don't call any destructors and that. Operation, of moving those two blobs over for a vast majority of types like shared putter and unique put her that's. A mem copy. Let's. Use did promise stead promise is trivially be locatable for the same reason shared butter is I, have, a test function here it takes, a fixed capacity vector of.
A Hundred capacity. So very large one and it returns it by value, it took it in as a parameter it needs to return in the return slot so this is actually going to make a copy by implicit move, so, it's going to call the move constructor, the. Move constructor, of our, fixed capacity vector is. Going to use relocate. Slow, is not defined in either case I think here. Relocation. With the p1 for for compiler, just. Calls mem copy right, I just mem copy those promises, over and then, also I don't have to destroy anything so, I don't even call the destructor a promise I don't need to on the. Other side it, is calling the destructor of promise and the move constructor, of promise, and. Those. Are non-trivial and. They, could even result in calls, to operator delete and so on now, that. Code will be on reachable but. The compiler apparently, doesn't know that because, again I get a factor of two savings, on my code size. Application. Number three for relocate ability move. Constructor, again move, constructor, of a type erased type such as did function or stood any, here. Again very similarly, I have inside, the footprint, of RHS, I have a blob object, that I have wrapped up and I have forgotten everything about blob right, because I'm type, erasing it all I know is in the case of state function, you. Know how to call it how to copy anything that I put into my. List. Of affordances. Which. Usually corresponds, to something like either explicitly. A V table or something. That I like to call a manual, V table which is what we've got here v PTR, here is not the actual V pointer RHS is not a polymorphic. Type it's a state function. But, it has this member called V PTR that points to a table of function pointers that that, records everything that we need to know right type eraser. So. What options do we have to get the data from our adjust to LHS, in the move constructor, one. Thing we could do is we could type erase the move operation, and just say okay hey. Blob move, yourself from here to here we. Go through the beat butter to our move operation. And it leaves RHS in an engaged, state, but a moved from state that, means it's dangerous to call and. It's not very efficient either because now we have to destroy the blob when we're done with RHS. Lip. C++ function will. Actually, make copies, of things that fit in the SBO even. If they're non-trivial. So. I would say watch out for that, so. Moving is inefficient in bad what else can we do.
Well. We, could move and destroy right, but so rather, than leave a, move. From state function, in an engaged, but moved from state let's, actually leave it in the disengaged State that is more honest. To our users it's been moved out of we might as well make it empty so, we'll do that that, looks like that we move just, like before then, we destroy, the blob and we, reset the V Powter to point to the B table for an empty state function instead, of a full state function. Instead. Of a state function holding a blob I should say, move. A bear is inefficient, and bad notice, we're now making two virtual. Calls through function pointers to move and to destroy but, we're calling them one after the other on the. Destination and a source. Now. You're thinking with relocations. Right. We combine those into a single function pointer called relocate. And. We. Can do it in one call, and that. Relocate is probably trivial, Lib. C++, did any does, this because. It was had the benefit of being designed you know a decade after it's dead function. So. Now we can do something cool look at this line right. We're, we're calling the relocate, function of a. Blob or of some T that has been type erased all, we care about is hey T how do you relocate yourself, if T, is trivially, relocatable, we, don't have to actually call its move constructor, and its destructor that. Would produce for every for every T it would produce a different relocate, function, however. For, all the T's of the same size, which. Are trivially, relocatable. We. Can just point all of those V pointers to the exact same function, that does, a mem copy for every T of that size any. T is trivial, we, can have its relocate pointer point right there that, means we can use the same single piece of code for double which is eight bytes trivial capable in star eight bytes trivially B locatable unique put our event eight bytes trivially B locatable all of those things I can make my I, can put any of those in a state any the. Relocate, pointer of that V table will point to that exact piece of code instead, of having many different pieces of code that the linker has to resolve and then they're all cold at runtime I can have a single piece of code and it can be hot at runtime relatively. Speaking so. Let's work in a linker less work for the instruction cache good. Stuff. Application. Number four for relocate ability with a reliable way of detecting trivial, relocate ability we can avoid the small buffer optimization.
For, Map ease that are not trivially, relocatable, so. If I'm implementing, stood any I. Can, say okay I. Have. A big small, buffer it, actually can put a state vector right there and I'm gonna do that in my small buffer now a stead list would also fit, into my small Balter but. I'm gonna choose not to do that I'm actually gonna opt out and say you know what because. List isn't trivially, relocatable I'm just going to heap-allocated and keep a pointer to it just like I would do for a bigger object, what. This means is that now my ne itself, is trivially relocatable, no. Matter what it holds if. It holds an object in its SPO that object itself is trivially locatable and it's, not harmed by moving it like that if. The, object we're not trivial early locatable i would have it out on the heap and i have a pointer, which has ownership but. Is trivially relocatable, as long as I'm moving and, destroying, in the same operation, I can. Mem copy it. Application. Number five for relocate ability consider. This naive user-defined, class type it's. Got a string and it's got a vector. I call, it naive because it follows the rule of zero. Now. That is great I wish all of my code were this naive right it doesn't need a, copy, constructor, a destructor you can use the defaults, but. When you follow the rule of zero you're, probably also not providing, things like well. Equality. Comparison, for example, but also a swap if I. Take. To, capital, P pairs and I call swap on them and it doesn't here I'm just for the sake of the line I'm using stead swap even if I did this did swap to step you know usings, did swap swap baby there is no ADL overload, for, pair and so what's it gonna do it's gonna call the template version from the standard library that swaps, two T's in this, fashion we. Have a and B coming in we. Make a temp and we move constructive, from a we. Then move a sine B into, a leaving. Being a move from state we. Then move. A sign from temp back into be leaving temp and a moved from state and then we destroy temp okay. Simple. However. Pairs. Are trivially, relocatable, what. Should actually happen, in this case we. Don't need to do all those move constructions, and destroys none of those are trivial that's very complicated, instead, knowing, that these are trivially locatable why, don't I just swap the bytes alright. So I take a and I relocate, it down into buff that's, a mem copy I take, B and I relocated into where a used to be I take buff and relocated, into where B used to be I don't need to destroy buff there's nothing there. So. With the swap that's currently in the standard library the. The unconstrained, one there. Are four operations, happening because I'm counting that closing curly brace where the destructor, of temp gets called for operations. There. Are three different things going on move construction move assignment and destroy none, of those are trivial for shared butter none of those are trivial for promise none of those are trivial for a lot of types. However. If. I know that my tea is trivially, relocatable, I can. Use the one on the right hand side three. Invitations, of one operation that one operation is mem copy. Michael. I. Don't. Have to destroy the pair I am I am destroying it three times, each. Relocate, there is doing a move construct and a destroy the constructors, and destructors are, balancing, out when, I get done there's nothing in buff. Previous. Example with the Tim death. Yes. That's the last step, that. Curly brace is bolted on this slide you just can't see it good. Price.
It's. The use of char instead of by significant, here I think, you could use byte it would be fine I prefer, char because why. Would I use, stood byte when I could use a built in type. So. Oh yeah. Could. I use an XOR based swap, yes. I'm not sure that that would be a win on any actual architecture, I, do. Think that there is room in the C library for a memo op right, we have mem move we have mem copy, why not mem swap then, the, implementation. Could use you know sse or whatever it wants to do. Trivially. Locatable does imply that I am allowed to do this definitely. With, minor caveats, about, partially. Overlapping objects, which are already kind of messed up. So. Benchmark, or I say, potentially, overlapping sub objects that's why should have said anyway. Benchmarking, stead rotate, so. When. I rewrite, swap, for. Trivially below table types that, actually has massive, effects on any algorithm. That is based on swap and as Shawn parent will tell you most algorithms are eventually based on swap so, I didn't, touch the code first it'd rotate you know I don't want to touch the code for stood rotate but. It looks like it must use swap at some, point because. If I call stood rotate and with you know random inputs, and and K and I look at what kind of assembly Lib C++, generates. With. The with. My branch here I got 131. Lines with, the master. I've got 176. Lines and, notice. In 176. Lines we've got calls to operator delete why. Are we calling operator delete from stood rotate well. Because. It thinks that it might need to destroy, something, right. It doesn't move construct, and then it does a destroy and a destroy says compare that put her with no putter and if it's not no motor delete it now. I happen, to know as a human that, this does, code here this this operator delete that can never get called right, it's definitely never gonna deallocate, any vectors as a result of rotating, a single vector. But. It doesn't know that the. Compiler has to generate these calls operator delete and so on even though as a human I know they're just going to get skipped that. Code must be dead because. I'm never deallocate, in anything as a result of her doing this rotation. Those. Typically swappable and flight regularly locatable um I don't propose trivially, swappable I do have a blog post about it that you may remember from oh I'll back. But. For. The purposes of this talk yes. So. By this point having, seen all these applications you're probably thinking I, really. Want this thing, so. Let's talk about how you can get this today. In. Various, existing, code bases we're gonna talk about Facebook, Bloomberg, and Electronic Arts. So. EA, is the ones who, I think, I've had this the longer they did a proposal. Not. Including. This but including a lot of other stuff from EA STL, way back in these people's +11 cycle. This. Was not included in that proposal, and. I don't think it would've been accepted at that time anyway because we got move semantics, around the same time. This is their documentation, for it, what. They do is they give you this macro ei STL declare trivial relocate, you. Annotate all your types with that and. Then when you use the ei STL, vector that's not a state vector that's an EI STL, vector. It's. Erased. Insert, reallocate. Etc, we'll be able to use mem copy to, relocate. Objects, and says it's very useful and allowing for the generation of optimized object moving operations, alright.
Categorizing. Objects. As trivially relocatable, or not is something that no compiler, can do as only the user can know whether your type is true if you'll be locatable or not. Bloomberg. DSL, has something very similar it's. In the BS lmf or I think meta function namespace I hope it's for meta function, it's, called is bitwise movable, and they spell movable the way that I would love the standard to spell it but they don't. Allow. Generic code determine, whether, type can be destructively, moved using them copy notice I'm not using the term destructive, move but. We will see later in part 2 we'll see a paper that uses that term. So. Again anytime you have the following pair of operations, a placement, new which is the move constructor, in this case the copy constructor, either. One, I guess they say and I immediately destroy the source that's. A relocation, or what they call a bitwise. Movable, operation. If I can replace it with a mem copy right. And this, would be how I would use it with my naive. Class pair here, pair. Contains a BS l string and a BS l vector those are both I happen to know that those are both bitwise movable, types and therefore. I can use that this is unmeet syntax it's still a macro, but, it attaches a member, member. Type def to class pair that. Enables that trait and, again I there's a type, trait that I can status right so we're seeing there's a pattern here. The. Warrant and then, the, type trait. Facebook. Folly has, by far the most quotable, explanation, of your location semantics, but I'm just going to put it up here and I'm gonna let you read it for about a minute, this. Is from their their readme. What. We want is a teleporter. We don't want a cloner, and then you shoot the original, we. Want a teleporter, and. So. Foley here so Paulie is the same thing and I'm gonna pick a little bit on folly I'm, gonna show you how to break it, I'm. Going to imply. That its developers, have written bugs. This. Is not because folly is poor quality code in. Fact it is relatively, high quality code certainly. Above average, but. It's it's big selling point for me is that is highly comprehensible, it's gone github and you can go read it and it doesn't use a lot of crazy you. Know underscores, and macros and things it's, pretty readable and that means it's easy to find it's bugs, with. Apologies to Tony Hoare there, are two ways of constructing a software design one way is to make it so simple that it has obvious, deficiencies and, the other way is to make it so complicated that, its deficiencies, are not obvious, right. Everything. Has deficiencies, folly. Has the advantage, that it's, easy to find. So. Follow FB vector is just like a state vector vs L vector ei STL vector. But. Whereas. All of those will, be quietly, Pesa mised for, non trivial e relocatable, types in Folly you're, actually only ever, allowed, to use FB vector with trivial relocatable types they didn't bother to write the slow code path they, will actually static assert if you, try to put something in them that's not trivial to be locatable unfortunately, they haven't got a reliable way of detecting that so again we fall back on providing, a warrant right either through a template.
Partial Specialization. Or, they. Have a macro that wraps that partial specialization. Right just like EA STL, de this. Is what a warrant looks like according to Folly Docs, but, there are downsides of, explicit, warrants, when. You're using these explicit, warrants, right, when you're when you're telling the compiler something, about your type explicitly. Saying I promise. That this is trivially relocatable, there, are two downsides, the first one is you have to know how to communicate that to the compiler you need to look in the docs and read the the. Name of that macro and figure out how to use it or, it says you need to specialize this template and then you have to figure out how to write a proper template, specialization by, the way did you notice the, poly docs don't know how to write a partial template specialization because they they forgot the template brackety brackety bit, so. That's. A problem you know most people, you. Know people in this room probably know how to write a template specialization people, outside this room who would benefit super, greatly from a 3x code size reduction, don't, know how to write partial template, specializations, or explicit, specializations. So. It would be nice to have a syntax that was usable by normal people in. Fact that syntax would be really nice if it were nothing right capital, P pair should just be trivially locatable I shouldn't need to tell anyone anything. And, with, explicit morons you need to know when to write one you need to know whether your type actually is trivially, relocatable and again developers, have a history of getting this wrong. Here, is the folly traits, thought H header so far when you include folly you get this for free and there's no way to turn it off. They, don't promise anything about MSC. Microsoft. Compiler but, they. Say yeah factor, is trivially locatable Dec is trivially relocatable unique butter is trivially relocatable shared, butter is trivially relocatable as we've seen and stead, function is trivially relocatable so if you compile it on Lib C++, where state function is not trivial EB locatable, everything. Breaks, at. Runtime silently. Also. They are using, the. Template. Specialization for, two parameters, with vector deck and unique footer so if you pass an alligator or a deleter which is not standard, alligator or standard default the leader that could also cause some problems. So. Let's break FP vector so. Here. We have my. Little main function. Does. Anyone have the capability, right now to live test, this on a poly distribution, because. That's something I wasn't able to do online all right well, I promise. I give, you my explicit warrant, that this code will will break. Let's. See how so. I create a stood string right it allocates that string out on the heap because I'm deliberately, made it too big for the small string optimization, and then I'm gonna take that string and I'm gonna capture it in a lambda I'm.
Gonna Capture it in a lambda so, it becomes a member of the lamb the lamb that has one member named S which is a string that, has a pointer out to the heap I'm gonna take that lambda and I'm gonna wrap, it up in a state function now web C++, is state function looks like this right just like we saw earlier with my for example, it has a pointer which I called vtable earlier it calls it MGR manager, it has a pointer, to where, the object is stored normally, pointer would point out on the heap right we put the lambda in it pointer points out the lambda on the heap but, in this case the. Land is actually small enough it's only 24. Bytes right because that's the size of stead string, it. Can actually fit in the 24, bytes small buffer optimization, of Lib C++ state function so, we're gonna put it in the SPL uh-oh all. Right, now we're going to take that state function and I'm going to emplace it into a vector so. There's my FB vector and. It's. Got room for one state function I'm gonna put my state function in there and then. I'm going to reserve capacity plus, 1 it doesn't matter what the capacity is this is gonna trigger a reallocation, for, more capacity let's. Say capacity, is now to a, heap. Allocate space, for two items and now. FB vector believes that, stood function is trivially relocatable, because of that explicit, warrant which was wrong so. At this point it's gonna do mem copy that's. All that knows how to do, it's. Gonna mem copy manager, and pointer down there if pointer had the value 0 X 42 earlier, it has the value 0 X 42 now it points the same place that, same place is to the small buffer of the now destroyed old, object. Now. I get the first element of my vector and I try to call it that follows. Manager to find out how do I call a thing and it follows put her to find out what is the value of the string s. That's. Been moved from. Now. We try to call the stat function D references pointer and boom. And. Actually it's re on this slide I should say I said s was moved from not, really it still has the same bit pattern on but. Of course that memory is D allocated, so if someone else has been calling malloc you, know or operator, knew they might have gotten that memory and put something else there so we get garbage.
Alright. So handwritten warrants are bugged thrown folly, FB factor relies on handwritten, warrants, written by the folly developers, they get them wrong if. They didn't help you then. You would have to write it yourself and you would get it wrong even more surely I guarantee. You. Know the. Folly developers, are smarter, than you. Gennaro. Like average person. They. Get it wrong average person's gonna get it wrong too so, handwritten warrants are usually, wrong because generally, if, you're writing a warrant for code you didn't write that means you have to know all the internal details I'll be library, vendors implementation. Right. Just. Because you're a domain expert on, Follies, trades that age doesn't mean you're a domain expert on Lib C++ is state function you can't necessarily warrant. It for somebody else you're. Gonna get it wrong if you try. And. They don't scale right, my, capital, P payer I can't, put that in an FB vector because. I didn't say it was trivially relocatable and remember FB vector alone out of these implementations. Rejects. Types that are not trivial relocatable, says I don't think I can them copy that now of course it can mem copy it but I haven't told it so so I would have to add the warrants and. Suppose. I change that vector to a list so, that my type is not trivially relocatable the symptom would be the same Polly would say I can't, put that in the nefb vector give me a warrant and I would say all right here's your warrant I would. Be wrong I would have undefined behavior at runtime when I tried to iterate over that list. So. Adding, warrants is error-prone, and it is just too much work for the, developer I don't, want to have to warrant explicitly, for all of my code or, risk in not compiling, or pessimism. Enter. P11 for for object relocation, in terms of move plus destroy. This. Is a proposal, that. Is, on our a3 now although don't read too much into that the, most recent version is after code after I got a little bit of feedback there the. Target, for this proposal, is C++, 23, it's certainly not going to be in 2020 3 is the optimistic, like. If, anything happens with it it might happen that. It. Is full, package batteries included, core, language feature type traits library, algorithms, all this stuff to. Give, you the full package, by. Design it it preserves, the correctness, of all your existing code we. Don't want to start mem copying around things even, your own things when. They can't safely be mem copied so it will be very conservative, and try not to break your code also, by design a design goal is to perform preserve. Conformance. Of all, C++, 17 library implementations. Write a library vendor has to add type trade and a couple of algorithms. But, it is perfectly valid for the type trade to always return false, I'm. Not saying, that lib, stud C++, has to go and make there stood string trivially, relocatable, that, would be the second ABI break for there stood string and a couple of years and. They're. Not gonna do that I know that so, the. Proposal is written to make it so that library vendors do not have to make stewed string trivially relocatable they don't have to make unique, put are trivially relocatable I don't know why they wouldn't but that is a quality of implementation issue, not, a conformance, issue as, long as the type trait returns a correct. Or at least conservatively. Correct answer it, could just always return false that would also be fine and what. Matters is that we have that type trait that can tell us the, optimization, is safe versus. The optimization. Either, isn't, safe or I don't know right. I don't know if it's safe we'd better not do that optimization, but, it enables the optimization, in these cases. So. That proposal in a nutshell trivially. Relocatable, as i've been using in this talk becomes the term of art similarly, to trivially, coffee a bowl trivially destructible, these, are operations that your type has and, they can be trivial or not. Just. Like for those properties the, compiler exposes, a built-in wrapped up by the STL into a type trait, called. Stood is trivially, relocatable, of t. Just, like trivially copyable trivially, destructible, the compiler, will automatically, propagate this. Trait the this property, of the type. So. If you have a class that follows the rule of zero and all of its bases and members are trivially locatable than that class itself will be trivially relocatable. Also. Any class that is trivially copyable is obviously also tribute locatable. All. Primitive types are trivially relocatable, in general although. Of course again that's quality of implementation if there is a wacky, exotic platform where int star happens do not be trivially relocatable, okay. You can detect that at compile time using the, type trait and act. Accordingly. Non. Rule of zero class is any class with any special members that always assume to be non trivial e relocatable, the users doing something weird they haven't given us a warrant saying that they know what they're doing so, we got to assume that this class is doing something that might not be, tantamount, mmm copy when I relocate, it.
There. Is a way to give an explicit warrant, expert users can use the attribute, to give an explicit warrant, but the main purpose of this proposal is to. Eliminate warrants, from the vast majority the 99%, of user code all, right if you follow the rule of zero or, you use standard library, types you're not gonna need an explicit weren't anywhere in your code yeah. Why. Do I use an attribute for this we will get to that later. But. Short. Answer is people don't know how to write template specializations, does that help all right moving on, people, on one four four is as I said implemented, on God Bowl, it. You. Can go test it right now the key word is p1 1 4 4, it's, built from a stock LLVM back-end there's, a branch of the client front-end that I maintain that, implements. This stuff there is also a client pull request out, if. Anyone would like to help improve this code that would be awesome. And there's a branch of lips people's paws that implements, p1. 440s new library features the algorithms, making, sure all the types are properly annotated, giving. The optimizations, that we've been seeing in this talk, so, it has all the p1 1 4 4 new library features it. Also has trivially, relocatable stood Annie I took that ABI break I took an ABI break to give a trivially copyable stood vector bool iterator, because, that's not the case instead. Live. C++. And, the entire memory resource header and seat a deduction guides for associating the world order in containers, and marshal I'm looking at you. Priority. Queue replaced top which. I have a blog post about the operation missing from stood by already queue, container. Adapters which are conditionally trivial destructible, and. Spring. View support instead Rex thanks to mark de paper, so. All this stuff is maintained in this branch, and. God. Bolt pulls from that branch. So. If you want to use any of those things that's the playground for it or send me an email and I'll put your thing in there too so, you already know the entire feature alright, I don't I gave, you the nutshell version and, now you are experts because, there is no non nutshell version, ok, there are two more little wrinkles. Conditional, trivial relocate ability suppose I have this class that it now a class template, zero, rap is a template a pattern for stamping out classes, some of those classes are trivially relocatable and some of them are not if, it follows the rule of 0 I get, conditional, trivial relocate ability for free 0. Wrap a vector, is trivially, relocatable, because it follows the rule of 0 and all of its members are trivially relocatable. 0. Wrap of list is not trivially, relocatable, because although it follows the rule of 0 it, has non-trivial e relocatable, members, so, it is not trivially relocatable right, that that property propagates, automatically, using. The rule of 0 so, I get it for free but, what. If I'm an expert what, if I decide, that I really want to do my own memory management and. So I'll take a pair of int, star and T. Now. That, pair itself will be either trivially, relocatable or not trivial relocatable depending, on how t is however, I'm gonna add more semantics, on top of this type I actually want the in star to be owning so I'm gonna call new in my constructor, and delete in my destructor and I'm gonna have to do some manual memory management in, my move constructor, so. Here. My. Type is. Actually, not trivially, relocatable. It's. A, move, constructor, and destructor are non-trivial and. The. Compiler, will say ok. In. That case because. It, never, follows. The rule of 0, I. Don't. Know anything about it it, doesn't follow the rules 0 I have to assume it is not trivially relocatable so the, line in green is saying expert. Wrap of stood list is not trivially locatable that is true and so it is green it is good that the compiler figured, that out at. The line in yellow well, that's not actually true I could use mem copy but, at least it is not using, them copy when it shouldn't this is merely a pessimism, it is gonna generate totally, safe code but. Wouldn't it be nice if I could teach the compiler that expert, rap was trivially, relocatable, or, expert, wrap a vector, was trivially relocatable wouldn't that be nice ok. Following. The use case I hope. So. What i could do is i could add the warrant to the template that says all my specializations.
Of This primary template are trivially locatable this would be absolutely, wrong all. Right this would say, expert, rap a vector is Tripoli B locatable I've taught the compiler that so it knows that and it will use mem copy on that type awesome, I get fast code that is correct however, expert, rap of stead list also now has the attribute, and the compiler is gonna think that it's trivial you're locatable it's gonna use mem copy I'm gonna break all my list pointers, and, so that is read that is actually undefined. Behavior at runtime that is the worst thing that could happen therefore I should not do the thing that I wrote in red what. Should I do instead, all, right before I get that let me just read that box out loud if. You give the wrong warrant, your code will be wrong if, you write a bug your code will have bugs. This. Is still the chainsaw this. Is the chainsaw that is in Folly it's BSL, es DL they all have this if I give a wrong warrant I have bugs. The. Primary, goal of this proposal is to let. The compiler deduce, trivial. Relocate ability often enough in the 99% case that, you don't have to pull out that chainsaw for every single problem. Usually. You keep the chainsaw locked up and then. When you need to use it you need to pay attention. We're. Trying to teach you not to use the chainsaw but it needs to be there because sometimes you have to cut down a tree. So, here's one way you could do it this was the only way that was proposed in our zero of the paper I just, said I like metaprogramming everyone, should like metaprogramming we. Make partial, specialization. Here for expert. In pull so. Usually. It's not trivially, locatable but, in the case that the bool parameter, there is true I'm, gonna make it trivially relocatable using the warrant using the attribute and I'm just gonna cut and paste that exact same code that's alright I like cut and paste in code and, then, I can inherit from that follow the rule of zero I inherit, the trivial relocate ability, of my members and bases I don't have any members I do have a base and my, base will be trivially relocatable or not depending, on that template parameter which, is just saying if T is triple EB locatable I want, to be trivially relocatable otherwise, I don't write. So. Up, to R - I supported, only this mode of metaprogramming and, I did the initial loops equals post patch entirely in the style. No. Nope. Moving, on conditional. Trivial relocate ability to, this.
Is The one that I propose in r3 in you, can omit the, boolean parameter there, if you do it's just as if it were true just like no accept or, explicit. Bool which is coming in 20 this is the same idea sometimes. I want to be trivially relocatable sometimes I don't so I take a boolean parameter if T is trivially locatable I want to be trivially relocatable so I no longer need, the metaprogramming, I do, have, really long lines if you 80-column minds this might not work out so great for you that's why I originally, resisted, it but. What. It extends. Your lines but, it written or extends, your lines this way number of columns goes up number of lines goes way, down so, this was totally a win i reimplemented, my lip C++ patch using this it is much cleaner much easier, to DIF much easier to review I think this is probably what. People need. So. We're proposing both, and. Again, this does the right thing. You're. Forced to put the attribute between the struct name of the class yes, that. Is just where it goes for classes unfortunately. There's. Another option this was proposed by John McCall on the clang review I gave. It the somewhat pejorative, name maybe trivially, relocatable, the idea of this attribute is it saying I don't interfere, with trivial relocate ability if my members and bases are trivially relocatable, I'm trivially relocatable, even though I provided, special member functions my special member functions will promise, to propagate, the, trivial relocate ability of my members and basis. So. This, is a much, cleaner in this particular, example for, expert map however it is completely unusable for certain things like optional, or, things that we'll see later in the talk I'm. Gonna skip these questions and move on and we'll come back to this. So. First question isn't this all undefined behavior right, I'm mem copying objects around without ending their lifetimes isn't this a problem well, it. Is a problem in the status quo that's why there's a proposal. In. R0, r1 what, I basically said was yeah it is undefined behavior let's change that proposal. Says it's no longer undefined, behavior to mem copy between, trivially, relocatable types problem.
Solved. However. It was pointed out that that does allow a user to make a copy of unique litter by having a unique putter mem copying, into another unique butter and. Then continuing to access the original so, I need some way of communicating, to the, compiler and to. The user they need to agree that the lifetime of the source has ended, the way we're going to do that is we're just gonna add a library algorithms. Did we locate at it works just like construct, at and destroy at it. Just says I do both of those things I begin a lifetime here and I end a lifetime over here a life. For a life. And so. For non-trivial ok table types, real. Ok dad is simply move construct plus destroyed for, trivially, relocatable, types this can be implemented as a memo as long as we surround it with the appropriate compiler, magic to begin and end life times appropriately. In, practice, that compiler magic will be a no op it will certainly compile, down to no code but. How do we achieve this well, it's similar to stood bless this. Is a proposal that was just accepted I think into the working draft, by. Richard Smith and. It's, motivating. Example, is something like this with a list. Node something, like stead list an allocating, list node I get, some raw memory from my memory allocator, and. Then, of course I'm going to placement new a tea, into, the tea element. Of the list node and then I'm going to set my next pointer to null pointer. But. If you're a language lawyer, and you look at this there is something wrong somehow, I am able to start the lifetime of the tea but I'm calling the assignment, operator of next. There. Must already be a next object there how, did the next object get there for. Me to assign to it why, didn't I need to start its lifetime well. If I insert a call to stood bless or if I assume that stood. Blast is already magically, called by operator new etc, then. I can solve this problem so I've added an explicit called us did bless here what, is did bless do. It's. Declared, but it's never actually defined, its, defined out of line in another, translation unit the compiler, when it sees my example, code has no idea what stood less does so, it must be conservative, it is not allowed to do optimizations, that might break valid, code, so. It has to think of all the possible, things this. Did less might look like and it might look like this, compiler. Doesn't know maybe. What does is placement, new an entire list node the T and the next and then a placement destroys the T, if. Bless, did look like this. Then. This. Code would be perfectly well formed when, and have well-defined behavior the compiler has to optimize as, if that's the case the. Compilers not allowed to break code that might actually look like this now, of course in practice blesses an OA but the compiler doesn't necessarily, know that or at least it can pretend it doesn't know that right we're, not actually going to generate a call to an out of line bless function the compiler can pretend that it doesn't know what bless does right. So. We can do the same thing for relocate at here's. My easy implementation of CID relocate at we. Test, the pipe trait if the T is I'm. Sorry there should be uh not on this slide if T is not trivially locatable, then. We do, placement new and we. Destroy. Otherwise. We can use an out of line helper function which I've called free--free lock add to fit on the slide, when. The compiler sees this line it sees the declaration, it doesn't know what true free LOC at does so it must assume that it does the exact same thing as that other branch all, right as far as the compilers, feverish imagine is that imagination, is concerned, it. Might just you know be, able to relocate any type by calling its move constructor and its destructor for all the types in your program it, might look like that it doesn't in practice but it might the compiler, has to optimize as if it looks like that now. What's. In practice, the. Linker symbol for free LOC at is just an alias from M copy and you gotta call the mem copy, or. Even the compiler is just pretending it doesn't know what that looks like and it it, generates a no op, but. Yeah, so that that's the implementation, strategy for, a very dumb implementation, that requires no, smarts at all on the part of the. Implementer, the implementer has smarts they can do smarter things but, it is the very low bar to entry to actually implement this. End. Of part one time, check I have 40 minutes left oh dear. Questions. So, far. Yes. Yes. Is this a property of the class with respect to a particular alligator. It. Is, you're, talking about like does the memory model affect this and I would say yes but the standard the standard memory model only has one thing if you're using a fancy alligator. Or some story that's baked into your type such as a fancy vector, it, probably won't be trivially relocatable, under this definition because here we're talking about what, mem copy does anything.
Where You can mimic you know if you're saying I've stuff in a different address space and you try to mmmm. The address from one address base but not only that probably doesn't work either we're. Working in the domain of these things like my own copy but we'll come back to that question, there. Polybius. Oh and the ast L do they take care of the object lifetime problem no they pretend it doesn't exist because in practice it does not exist. Once. We get stood bless and instead wander and all these things like maybe problems, will start existing, because compilers will start optimizing and then they'll have to do something, on. The other hand if compiler started to optimize based on that daybreak folly, and they Drake Bloomberg and they'd break a STL, so you. Know there's, a a quality of implementation issue there. All. Right so let's talk about things that relocation, is not. My. Proposal is not Pablo Halperin's, in four one five eight from a few years ago destructive. Move the, central question here and I wrote a blog post on this that you can search for the, central question is can I define my own relocation. Operation, to be used by, reallocation. And so on which is not quite mem copy but it's also still more efficient, than calling the move constructor, in the destructor so, my proposal says definitively, no you cannot I call. Move and destroy, through. Relocate, at relocate. At has certain special, optimizations, that can do for trivial, relocatable. Types. But. That's it those, are your options it's either like mem copy or it's not, Pablo's. Paper said definitively yes, there, is an ADL customization. Point that you can overload to say here's, how you relocate, my type. This. Lets to support some use cases that p11, poor for does not, consider. GCC. Is stood deck for example it has a never, completely, null invariant, it can the container, itself can be empty but it will always have a heap allocation for, that table of pointers to arrays. MSB. C's did list is similar it heap allocates a sentinel node so it is never completely empty it always has some state even in the move from state the, move constructor, needs to actually allocate a new node, so, because. The move constructor needs to do a heap allocation it's not no throw, these. Objects. Are trivially, relocatable, but they are not node for a move constructible. Vice. Versa GCC stead list is no throw a move constructible, but because you have to go fix up the pointers on the heap as we saw earlier it's not trivially locatable therefore.
If I have a struct capital, P pair containing, a GCC, deck and a list. It. Is neither trivially. Relocatable, nor no throw move constructible, and again you can go test this right now. But. You can't test that type trait because it's not lipstick c++ my fork is of lib c++. But. What that means is that if you make a vector, of these capital P pairs, then. It. Is actually going to do copy and destroy copy and destroy copy and destroy it can't use mem copy because it's not trivial early locatable and it can't even use move construction, because of the vector customization, so. P1 for 4 says game over it's gonna make copies suck it up and for. 1 5 8 says you can customize, that customization, point, to, do. A relocate, on D and then relocate s non-trivial e but, because you are relocating, it. You. Don't it you're destroying the original you don't actually need to do that heap allocation so you can do it no except Li even, though it's non-trivial the spare can be made no throw relocatable, which, means that you don't. Need the vector customization, when. You reserve and you move all these items down you can actually do that with relocate, you don't have to make copies. So. Here's how Pablo's paper would do it you make an ad l friends. Uninitialized. Destructive, move is the name he gave to it and. You, just call an initialize destructive, move using, the state swap to step alright we don't like that these days back, then that, was how you did it it works just like swap that's. I read an ad I'll swap right I do the same thing for disruptive move now, it is no throw destructive movable now factor can do the optimization at once. How. Do I tell the compiler that it is. Trivially. Destructive movable if I if I have a pair here. That follows the rule of zero and has a string and a vector so it is trivially, relocatable, in, that case the only way to communicate to. The implementation, that this is allowed is to do that partial specialization, anytime. I'm having to write a specialization. Danger, Will Robinson, that is very tedious, also. It is dangerous. Right. Because if you have to write a warrant and, if you're writing a warrant you're doing it wrong I guarantee, it. So. TLDR. Pablo's, proposal could achieve no throw relocate ability in more cases therefore it could avoid the vector customization. In more cases but it was not simple it relied on ADL customization, point I relied on reopening name space did, you. Did not really have a type trait to ask whether a type was trivially relocatable, instead. What you had is a specialization, point, like did is error code you know that. You can specialize, if you can specialize it it's not really telling you about the properties of the type that the compiler knows about it's, telling you about some some library, thing that the compiler had nothing to do with. So. It's not a type trait it's. A specialization, point, all. Right what else is p1 for for relocation not, then. A spider had a proposal a few years ago p 0 0 to 3, that. Proposed relocator. This. Is the syntax for it it looks weird. It. Adds a completely new operation, called the relocator, a relocator, is like a constructor, it's like a destructor. But it's not either one it's not a relocating, constructor, it's. Not a constructor, it's a relocator, it's a new core language operation, with its own core language syntax it is not the same as move cloths destroy obviously, it should be just like move should be the same as copy. You. Know it's not the same thing except. By convention, but, it's its own operation. There. Is a higher level algorithm. Relocate, or move that says if your type has a relocate, or I'll use that otherwise, I will fall back to move + destroy just, like move if no accept says if you have a move constructor, I'll try to use that if it's no accept otherwise, I'll fall back to copy alright, the compiler can figure out trivial, reload ability, just the same way it figures out trivial destructibility, or trivial copy constructability, it says number one are you relocatable, do you have a relocator, if you, do is it, defaulted and trivial. So. In, p 0 0 2 3 I have my existing, C++ 17 code I, ask is this type trivially relocatable, no, it is not is it even relocatable, at all no it is not it doesn't have a relocator, it has a move constructor and a destructor so if I call relocate, or move algorithm. On it it will be able to get move and destroyed but I can't relocate it cuz it has no relocator it's not implicitly generated, for me because I didn't follow the rule of zero. If, I wanted a defaulted relocator I would do that that was the obvious thing it relocates. All my members and all my bases. And. The compiler can detect that it's trivial so. That was Dennis's proposal.
P1. One for for relocation, is not, miele. Douglas is p1. 0 to 9, move relocating, this. Is, not. Yet abandoned, I think it. Is a current proposal. So. Here's my existing, sepals + 17 code, he's. Got a type trait we can check right everyone follows the same basic pattern Warrenton type trait his. Type fraid is is move construction, relocating, and since I didn't change my code at all I don't get anything for free if I want to get anything I have to add an attribute he's using an attribute just like me he's putting it somewhere different he's putting it on the move constructor, and now, I can check that the compiler knows move construction. Is relocating, whatever that means what does it mean well. Good. Luck figuring it out I do think that this paper is a little bit confused about what the issues are, but. It adds an attribute just like mine you. Put the attribute. On the move constructor, and when, you do so it indicates a property, that the entire class has a relationship between the move constructor, the destructor and the default constructor in this case as well if your class is not default constructible, then you can't use this attribute the. Relationship, that we're indicating with this attribute is that, when I move construct, from, s to D leaving, s in a move from state the move from state of s is exactly, equivalent to, the default constructed, state. And. Additionally. When I destroy, that default, constructed, state it's a no op even. Though it's not trivial if I had a trivial destructor and the compiler would already know it was in no op so I have to warrant that destroying. A default constructed T has no side effects, and. Also possibly default, constructing its he has no side effects that one is really unclear but it somehow. The compiler, knows what a default constructed, T should look like and it can like make one on demand I I, don't really understand this and, also. Implicitly. No operation, on key cares where the T is physically located in memory because one of the effects. Of miles paper in fact its main motivating. Reason for existing, is to be able to pass these move, relocating, types in registers. If. I have a widget, and. I, call its constructor, and I find out where it was constructed, and later I destroy, a widget it had better be destroyed, at the exact same place unless I said something to opt out of that so in his case the, thing to to, opt out and let the widget be constructed, over here and destroyed over here is you put this attribute on it you say I move relocating, therefore it's okay to construct. Me over here and then sort of mem copy me all the way over to here. My. Paper again does something similar using an attribute as well. But. My paper does not change the ABI again, because of the requirement. Of backward compatibility. So. Niles, paper proposed the property warranted by an attribute it's very complicated. It needs default, constructability, it. Deliberately explicitly, has ABI implications. No. Vendor, would. Be able to put this into their standard library without an avi break, that. Is you couldn't make for, example I move relocating, stood unique butter event. If. You did it would be returned directly in our ax instead of being returned on the stack, and. That would be an ABI break, so, in my opinion Myles, paper forces, vendors to choose they want the status quo where. Nothing. Is move relocating, and the ABI stays the same of it but, users, get absolutely no benefit or. They. Could take the ABI break pass unique butter and registers really, annoy all of their users who care about ABI compatibility.
But. Make their other users happy p1, 144 on the other hand I am trying to get the best of both worlds no, ABI break and as we've, seen 3x, reduction in code size on a lot of examples, p1. 1 for relocation is not something. Related, to Niles, proposal, clan, has a very similar thing for allowing. You to pass classes, in registers and I have a blog post if you search our throat why are trivial, ABI you'll find a blog post that goes into more detail on trivial ABI it's a clang specific, attribute. JCCC. Might get it at some point because it is standardized, in the Itanium ABI or at least there's a proposal to standardize that if it's not already in the. Itanium C++. ABI defines, what it means to be trivial for purposes of ABI basically. It means you followed the rule of zero and you fit in a register and you know that kind of stuff maybe a register pair but. If you don't follow the rule of zero then. It doesn't matter how small you are you're gonna get passed on the stack just like any other large class if. You want to be passed in registers which something like unique butter could be, then. You just put this attribute on there that, communicates the compiler that this. Class for purposes of ABI should, be treated as if it followed the rule of zero it. Still, has all, of, these non-trivial, member, functions that must get called at the appropriate time someone, is responsible for destroying the unique butter but, to get a unique butter from a caller to a colleague we. Can pass it in a register. So. Without trivial, ABI if I have a unique butter like type, it's. Gonna get passed on the stack its address gets passed in Rd I you. Know I load. It up I do some, stuff with. Trivial ABI it. Gets passed directly in RDI and. I can just you, know move, 42 directly into whatever my code gets a lot shorter but, this is definitely an ABI break that. That's nice the attribute has ABI right in the name it tells you what it's doing. So. With, trivial ABI, it. Goes in the attain iam ABI it affects the calling convention since the pick parameter, is passed in a register, that means the caller sort, of gives it up to the call a it, just puts it in a register and now it's gone off and the register has been trashed when I come back so how do I destroy it well as the caller I cannot destroy it anymore that means the responsibility, for destroying parameters, must lie with the colleague, MSB. C already does this in every calling I