What is a Digital Twin? (with Colin Parris, CTO, GE Digital) - CXOTALK #685
What is a digital twin? How can we use it? What are the benefits? How do we implement it? What are the challenges? Colin Parris is the chief technology officer of GE Digital. GE Digital is a company inside of GE, and our focus is on putting industrial data to work. We operate in roughly four industries. We operate in grid software. We operate in power generation and in oil and gas. We operate in aviation, and we
also operate in the manufacturing space, so we have four segments we go after. We produce industrial software that's designed to help our customers deliver value from the industrial data. I am the CTO of GE Digital, but I have two roles. One role is
developing technology that's deployed in the GE Digital products. But the second role is helping digital transformation inside of the GE businesses. Those are the two roles that I share right now and it provides for an interesting time for me.
Colin, you're chief technology officer and digital transformation has a technology aspect as well as (very often we talk about) a cultural dimension. Shed a little more light on the technology dimensions that you're involved with. When you think about technology dimensions, I think about three potential dimensions that I get involved in. One is identifying new technologies,
technologies that we believe could be used to give our customers an advantage. That is more me sitting in the role of advanced development or research. The second is deploying current technologies where I'm thinking about, well, this technology, I think it's hardened. It's mature. It can be used in our products to advance those, and it should be part of the roadmap, so I spend time there. The third is an interesting combination
of business transformation tied to the technology itself because, in many cases, customers will come and say, "Well, I'm doing digital transformation, but I can't see the business value I get from it." I spend a lot of time finding ways to integrate business process transformation with digital transformation. In this way, I create (or help you create) a digital tool, a digital capability that goes inside your business process and gives you business value.
Those are the three dimensions I tend to work on from a technology perspective. Layer the concept of digital twins on top of that for us, please. In GE—and a lot of this was started in the GE businesses—we have these very, very large assets. The assets have dimensions of them that make a ton of sense
but also provide us a mass of complexity. These assets cost multiple millions of dollars. They deliver things that are at the foundation of the world. One-tenth of the electricity that's generated comes from GE assets. We have a significant share, 60% to 70%, of the engines that fly in the world, so we have to take people out and bring them back safely. We also do a lot of the healthcare. We have a lot of the large healthcare machines that take care of the health of the people. When we're looking at those assets,
you're thinking to yourself, "How do I run them with an increased amount of availability? I want it to fail the least, and I want to do it in the most cost-effective manner." This is when we sit back and we think, "How do we find a way to do that using the data we have?" What we have now created is a digital twin. A digital twin is a model, a special type of model. It's a living-learning model. If I have the model of that asset
and that model changes with the corresponding state of that asset, then I can predict. Can I get an early warning on a failure? Can I predict what I should have ready so that when I bring in that machine for it to be maintained, I can do it very fast? Can I optimize it? Can I use it with the least amount of fuel? Can I use it in a way that it delivers the most amount of value with the least amount of labor? All the twin does, it's a living-learning model that allows you to deliver business value by constantly making sure that twin is the exact replica of the asset and then getting insights to take an action. What are the components that go into creating a digital twin? When I think about the components that go into creating a twin, I first start with—most people will say it's not a technology component but it's the most important component—what's the value that you're trying to deal with? Am I trying to increase the availability? Do I want that gas turbine and that powerplant to run the most, so it could generate the most electricity? Do I want to take out cost in terms of fuel cost of a jet engine that flies? You start with what's the value, the business value you're going after. Then secondly, you back into, "Can I get the right data and domain knowledge?" to see if I can get some insights that allow me to do that. Can I get the right insights that allow
me to know when a failure is going to happen? That's domain knowledge. Somebody says, "Well, usually we see these fails happen and that's what causes the failure, and here's the data associated with it." That's the second thing you get. Then the third thing is the models. Then I actually put to use physics models, AI models, or a combination to use that data to try and see, can I predict when that failure is going to happen clearly enough in advance that I take an action? Think in terms of the business value you're looking for. Think in terms of the data and domain knowledge, and then the models that we have to build. Then you get into the complex things. You have to figure out, how do you deploy it in a way that you can test it so that you're sure it's not going to damage your equipment. Then we
figure out when does it actually work as accurately as possible. When do you need to tune it? Those are the major things you think about: value, data and domain, insights, and then models. Why do you call it a living model? In many cases, many people are familiar with models. If you're a good designer—before you design a gas turbine or a steam turbine or a wind turbine—you create a model because the model lets you figure out what components will I build, what materials will I use, how would I design it. Then we use models a lot in design. Also, if you have a problem in services,
somebody goes out and builds a model so that you could sort of find a way to emulate what's going on or simulate what's going on so that you can take an action. Usually, after we do that, we put those models aside. We leave them alone. What you do with a twin is you create that model, focus on a specific problem, and you make it a living model in the sense that you bring data in continuously and you let the model evolve as the state of something evolves. Let me give you one good example. Think about a gas turbine. While this gas turbine is running (over months, over years) the materials inside break down because you have a lot of heat. You're
exploding fuels in order to spin this turbine. If you have a way of saying, "Whenever I do this, that level of heat changes the material structure and wear and tear occurs. Ball bearings get degraded," the model begins to measure that. The data that you are taking from all of the sensors begins to say, "Well, I think these materials are wearing down. The oil is wearing down." That's why you want it living. It constantly updates what it is that's
happening inside that complex machine so that it gives you an accurate view of what's going on. You also want it to be living because what happens then is that your state changes. In some cases, maybe you're operating something in winter or you're operating something in summer. The conditions are going to change. Coming into the digital twin, it's not just the information from the sensors about the machine but it's the sensors from the environment, the sensors from how they operate it. All of that changes and what you want
is your model to reflect that change. If you make a decision on how to change the performance, get it better, or how to get the maintenance done or when to do maintenance, it reflects the actual state. That's why it's a living model. Colin, you were talking about this living model that's adapting and that is taking data from the environment as well, so it sounds like the construction of a digital twin is quite complex and quite involved. It can be. It all depends upon the type of
problem you're trying to solve. In many cases, we have been doing this for a while. Everyone I know that actually runs complex machinery (or even simple machinery), they usually have, at times, an M&D center, a monitoring and diagnostic center, so you've been capturing information. The idea is, can I take the information I'm capturing? If it's enough for me to understand the state of the machine, then I can use that directly.
Often the case, though, especially when these machines have a much larger length of time—a jet engine lasts 40 years, a gas turbine lasts 30 years, a wind turbine lasts 20-30 years—with those long-lived systems, the environment changes. The way people operate changes. Sometimes, the maintenance changes, or people slip up on the maintenance – for a variety of reasons. With those changes in state, you often have to not only take the sensor data, but you've got to reflect the data from the environment and from the operations. That's the new data you have to collect. In many cases, that data is already there. It may be in a different system. It may be in your
MES system. It may be on your SAP system. You may have to fuse the data in order to get the information you need to give to the twin. It all depends upon the outcome you're going after, and so I always advise people, "Start small. Start at what you have. Tune your
skills using that. Then expand to look at other things that you can put into place to get the model more accurate or to reflect the cost in a much more adequate manner." Ultimately, what are the benefits of digital twins? If I think at a high level, we tend to think about three of them. Can I have a digital twin
give me the early warning about a problem that's about to happen? In aviation, for instance, can I get a view of when a #4 bearing – this is a bearing that is inside the turbine itself – would fail, because usually, these things fail. You're at a gate and there's a light that pops up for the pilot to see, and so you have a problem. Now, you've got to be deplaning all the people. Can I give you that information 30 days in advance? You need 30 days because you need enough time to allow the airline to get a new aircraft in that slot, to get a new crew in that slot. Crews have to be certified. To get new support in terms of the right things in terms of fuel or food. You want to get an early warning. That's the first thing. Can I get enough of an early warning so I make a small tweak in my business rather than a large mishap or catastrophe? The second is, can I do continuous prediction? Can I predict when something would fail? Can I predict the type of wind I have? With wind turbines, if I can predict the wind a day ahead, I can know what I bid in order to sell my electricity into a utility.
Can I predict when something will fail? The lead time for parts for these large turbines may be six months. If I can predict way in advance, I can have the suppliers build that so, when it comes, it's there right away. Then the third thing I do is dynamic optimization. Can I optimize how a system runs so that what I have is the maximum electricity with the lowest fuel cost or the maximum electricity with the lowest carbon emission? Those are the three things: early warning on a problem happening, continuous prediction so I can better position myself, and dynamic optimization, optimizing the way it runs so I do it at the least cost or most performance. Those are the things we tend to focus on. Colin, how is this different from historically building up models in spreadsheets, for example, or with software, to do these kinds of analyses and predictions that you were just describing? There are two major things we look at here. Everyone asks that question. We've always built these models. What's different right now? What you find out is that the models we have built have been fairly complex models. First of all, what I'd like to do is a simplified
version of that model because I have a model of a jet engine. We tend to build these before we build any jet engines. But to run that model, it takes something that looks like a supercomputer and it takes a couple of hours to run the entire model. That's not what I'm looking for.
I'm looking for a surrogate, a smaller version of that model that's focused on a particular problem. I want to run that and that should run fast. In some cases, we need it in a half an hour. In some cases, it might be a day or two. But it has to run within the right timeframe. What I'm looking at, first and foremost, are models that are specifically focused on things that cost me money. I'm not trying to look at the entire performance. I'm looking at a specific point of what is costing me money right now: the blade is a problem; the nozzle is a problem. That's the first thing. The second thing is, I am now looking at complex systems. Before I can have a model of my jet engine, I want to know that jet engine,
in the environment it's flying in—and it flies in different places—operated by the pilots who operate it in this way, given the cost of the fuel right now, so I am bringing in multiple sources of data: financial with the fuel, operational with the pilots, sensors from the actual monitoring, and environmental data. I can do things like wind, temperature, and dust contamination levels. When you look at all of that data coming in, that is a big data problem. It's different. Your model is now in the context of all these things and you're trying to optimize it. Now, how do you pull that data in? That's the second way I look at it. The first one is all about a surrogate model
focused on the exact thing you're trying to solve, not an overall high-level model of something. Then the second is taking in all that other data that gives you context because that's how business value is realized in the right context. Those are the two things I would say that make a big difference at this point in time. That's really interesting. You said that digital twins are fundamentally a big data problem. In many cases, it is. But again, it all depends on what you're looking at. Let me give you an idea here of how it could be, at times, a small data problem but equally as valuable. For instance,
I'm a data scientist, so when I think about things in my normal role, if I look at the consumer world, I have lots of examples. In 2014, if you look at the number of packages Amazon sold, it's like two billion packages. As a data scientist, I love that. That's two billion examples of people buying something. At Google in 2014, I think it was 10 million or 12 million ads a day that got selected. I have 10 million or 12 million examples. Now I go back to GE. GE has a fleet of engines that are the GE90 model. That GE90 model, in a year, we do a little over a million flights, some airlines.
You think, "How many failures do you have?" because what I predict are failures. Well, the answer is in the low 20s, so that's not big data. That's very, very sparse data. Now, where it becomes big is that I have got to take that from multiple fleets, put it together, and then I want to take that data in the context of the environment, in the context of how it's operated by all these pilots, in the context of the cost of me not having that airline, that plane travel. Then the other sources that pile on become big data around it.
Initially, I'm doing things with sparse data. That's why I have to use physics models to aid that sparse data. Then I put it in the context of much bigger data to figure out how that affects me financially and what action I should take. These are somewhat complex problems. It's both big data and sparse data at the right time.
Would it be more accurate then to say that the digital twin is a data problem more than anything else? It is a data problem, but it is also getting that physics right. I need the big data, but at times I have to go back on the physics because I can't wait to get all that data; I can't wait for thousands of failures. By that time, we would have put lives at risk. I've got to use the physics models and the simulators that I've developed. I can use the simulators to build a huge amount of data, and then I combine that with the data I currently have. It is a data problem all around. How can I get that data that allows me to do it? Some of my data come from empirical data that I take from the environment and some of it may come from the models I have and the simulators I've built on those models. That combination of physical plus digital
gives me something unique right now. You've got this set of data together with the physical calculations. Exactly. The model of physically how it works. I'm not expressing this well, so please. You are. You are. You have the perfect example because I have all the data and I have the insights of when I designed it, I designed it using certain physics. That physics of how I designed it, combined with the data, is what comes together to make this all work. The reason we have to use the physics is that
I can't get a lot of examples of things at the extreme end. If you do things like commerce, you can get examples of things on the extreme end. You can put blue jerseys up for sale and they never get sold. You can put blue jerseys up for sale and you run out in two minutes. There are extreme ends. When I'm dealing with a jet engine, a gas turbine, or a wind turbine, I don't want to go to an extreme end. I can't afford to have it break. I can't afford to do anything that would impact safety. It's really interesting that the only way I get to the extreme ends is to use the physics models we know and the simulators we have that can go to the extreme end. I can run that physics model
on a system, on a supercomputer, and I can go to the extreme end. Then I can take the data that I have from normal running and tie it together. I use these things to put the pieces together. You were perfectly right, Michael. I'm using both to give myself a breadth of data through a breadth of experiences—some created real, some created artificially—that allow that twin to express itself. Correct me if I'm wrong. The quality of the digital twin must be based on a combination of the quality of the data as well as the quality of the physical models. Exactly. That is the perfect way to describe it.
What we do is, even if you don't have the quality perfect at the start, we have a technique we call Humble AI. What that technique does is ask, "Can I figure out, based upon the quality of the physics I have and the quality of the data, what is my zone of competency in which I am very, very competent?" In that zone, use the twin. Outside of the zone, go back to the usual deterministic model or go back to the human process by which we evaluate things. Then give that twin more data and get it more competent. We spend a lot of time understanding the zone of competency. That zone of competency tells us what is the confidence interval around the answers that this twin is giving you. In that zone, the twin works well and people like that because what you have are customers saying, "That's the right zone for me to be in because I have the least business risk and I can get the most business value."
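The zone-of-competency idea can be illustrated with a short Python sketch. This is not GE's Humble AI implementation, which the conversation does not detail. It assumes, purely for illustration, that the zone is the range of inputs seen in the training data, and the names `HumbleTwin`, `learned_model`, and `fallback_model` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class HumbleTwin:
    """Wraps a learned model with a fallback used outside its zone of competency."""
    learned_model: Callable[[Sequence[float]], float]   # hypothetical data-driven twin
    fallback_model: Callable[[Sequence[float]], float]  # hypothetical deterministic model
    lower: Sequence[float]  # per-feature minimum seen in training data (assumed)
    upper: Sequence[float]  # per-feature maximum seen in training data (assumed)

    def in_zone(self, x: Sequence[float]) -> bool:
        # Assumption: the twin is "competent" only when every input
        # lies inside the range covered by its training data.
        return all(lo <= v <= hi for v, lo, hi in zip(x, self.lower, self.upper))

    def predict(self, x: Sequence[float]) -> Tuple[float, str]:
        # Return the answer together with which model produced it.
        if self.in_zone(x):
            return self.learned_model(x), "twin"
        return self.fallback_model(x), "fallback"


# Toy models and ranges, invented for the example.
twin = HumbleTwin(
    learned_model=lambda x: 0.9 * x[0] + 0.1 * x[1],
    fallback_model=lambda x: x[0],
    lower=[0.0, -10.0],
    upper=[100.0, 40.0],
)
print(twin.predict([50.0, 20.0]))   # inputs inside the zone: answered by the twin
print(twin.predict([150.0, 20.0]))  # first input outside the zone: answered by the fallback
```

A range check is the simplest possible stand-in for a competency estimate. A real system would more likely derive the zone from the model's own uncertainty, but the control flow (use the twin inside the zone, fall back outside it) would stay the same.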
Outside that zone, I do what I normally do. I understand the risks there. Then I feed more data so this thing gets better. That's what we spend a lot of time doing. How do I enlarge that zone? Can I build better simulators to enlarge that zone of competency or can I get more information like, for instance, from the fleet? Your jet engine may only fly in one environment. Your gas turbine may only work in one environment. But at GE, I have access to these huge fleets. Can I not take that data and compare it and say,
"Well, this turbine, is the model similar to yours? Is it operating similarly to yours?" Maybe I could transfer some of that learning, bring some of that data to bear, and give me more data in a way that I can create a better model that more accurately reflects what's going on with our assets, and so I can get better business value for you. That's exactly what we do. You called it a living model earlier, the digital twin, a living model. It's improving as the data gets better and as the physics gets refined. Exactly. A lovely way to say it because we also say it's a living-learning model. It improves because it actually learns. One way you learn, for instance, you learn by actual experience. Some of our twins,
we predict the damage that you will see inside parts of an engine. Then what we do is, whenever an engine comes in for repair, we go to those parts and we use computer vision to take pictures and detect not only where the damage is, but the size of the damage. Did we predict that the crack would be this length? Did we predict that the damage would be this widespread? If we did and if we were correct, we're good. If we're not, we take that information and feed it back into the model. The model learns from real, actual
data that comes back. That's one way to learn. It's learning from itself. It predicted it would be here. We actually took it. We got it inspected. It's a little bit off. It learns from itself. That's one way of learning. The second way is from the fleet. It can actually find other things that are similar to it and we then take that data and bring it in. It's almost like medicine. In medicine, what you do is you have a section of
the population that you have used this medicine or this drug on and you see how they react. Then you compare that to the person. This person is this age. This person has these genetics. We have given this drug to this section of these people that are the same age, the same genetics. It works well. You bring it in and you say, "Well, let's try this drug." The same thing we do. We look at engines that look the same. They operate in the same way,
configured the same way. Can we bring that learning in so that you learn from the fleet? Then you can also learn from humans. We learn from simulators the humans run or we have huge test sites. We have a huge test site in Greenville, one of the large turbines. We have a 200-megawatt turbine there. What we do is we have that instrumented.
We can run extreme scenarios there and see what happens. We can learn from that and, from that learning, send that to the model. We could use simulators. We work with Oak Ridge National Lab and we have very powerful simulators there. We take that data and bring it in. You learn from yourself. You learn from the fleet. You learn from simulation. You learn from humans doing lab experiments. All of that learning makes this model a living-learning model.
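The first of those learning loops, where the twin predicts damage and is corrected by what an inspection actually finds, can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than GE's method: the linear crack-growth model, the `CrackGrowthTwin` name, the `learning_rate` parameter, and the numbers.

```python
class CrackGrowthTwin:
    """Toy living model: crack length is assumed to grow linearly with operating cycles."""

    def __init__(self, growth_per_cycle: float, learning_rate: float = 0.5):
        self.growth_per_cycle = growth_per_cycle  # mm per cycle, initial physics-based estimate
        self.learning_rate = learning_rate        # how strongly one inspection shifts the estimate

    def predict_length(self, cycles: int) -> float:
        # Predicted crack length in mm after the given number of cycles.
        return self.growth_per_cycle * cycles

    def learn_from_inspection(self, cycles: int, measured_length: float) -> float:
        """Move the growth rate toward what the inspection implies; return the prediction error."""
        if cycles <= 0:
            raise ValueError("cycles must be positive")
        error = measured_length - self.predict_length(cycles)
        observed_rate = measured_length / cycles
        self.growth_per_cycle += self.learning_rate * (observed_rate - self.growth_per_cycle)
        return error


# Hypothetical numbers: the twin is compared against a measured crack at 2000 cycles.
model = CrackGrowthTwin(growth_per_cycle=0.001)
before = model.predict_length(2000)
error = model.learn_from_inspection(2000, measured_length=3.0)
after = model.predict_length(2000)
print(before, error, after)
```

After the update, the prediction for the same number of cycles sits between the old estimate and the measured value. How far it moves depends on `learning_rate`, which stands in for how much trust is placed in a single inspection.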
Where is the model located? Do you run the model on behalf of your customers? Do they connect to your system? Do they have it in their data center? How does that work? It all depends upon the model you're running. For instance, if you take a gas turbine example, some of the models actually run at the customer on the turbine because in some cases, if you're doing a digital twin for performance, the latency – I've got to make a decision and I've got to talk to the control system within milliseconds – that can't be at an M&D center remotely. That is actually on the physical machine itself that has models running in the control system and it's adapting right away. Now, if you're thinking about the life of a part, and we do twins that measure the life of a part, that's much longer-term horizons. These parts live four, five years. You can have that. I can send that data all the way back into the cloud.
Those are much more complex calculations in many cases, and so you'd want the computing power. It all depends upon what you're trying to evaluate. In many cases, I have it on the site running with the asset. Then in some cases, it's on the cloud. It's all based upon the value you're trying to pull up. Colin, we have a question from Twitter from Arsalan Khan. Arsalan is a regular listener to CXOTalk. He asks great questions.
Thank you, Arsalan. We appreciate that. Arsalan is making the point that you're chief technology officer, yet you're speaking about technology but very much also about the business problems that are being solved. It then leads me to the value that's being provided to the customer. What's the value for customers and improving the customer experience of the digital twin? Let me give you two examples. Let's start with the first one in wind. When a customer buys a wind turbine, usually they're buying multiple wind turbines and they put it in a wind farm. Usually, the contract is based upon you have set me up with a machine that, given the wind is going to be blowing this way on average because we've spent a year looking at the way the wind blows in that location, I am going to produce this amount of electricity for you. That's how they make money. This amount of electricity is produced. They actually bid it
in or they have these things called power purchase agreements where somebody has said, "For the next ten years, I am going to buy this electricity at this amount from you," so they buy these machines based upon the fact that we can deliver this amount of electricity for them. But then things occur. The wind changes slightly. Maintenance problems. What you have is a digital twin that's constantly monitoring the state of this asset, making sure that we are providing the level of performance we said we would provide. We know some days the wind is going to blow harder. Can I generate more electricity and maybe save some in a battery? You know some
days it's going to blow less. But on average for the year, we have a contract that says this is the amount that's going to be produced. The twin is monitoring that. The twin is then tweaking things and tweaking the yaw and the pitch and understanding the damage so that I can predict what's the best way to align the turbine so I get the most performance. It is also figuring out when is the best time to maintain it because we only have so many days scheduled for maintenance. If we do more days for maintenance, it means you have
less days where you're generating power, and so that affects the contracts we have. I am monitoring to the point where I want to catch the repairs as early as possible so I do a one-day repair rather than a five-day repair. It's okay if I do three one-day repairs. It's better than waiting a while and then having to do a five- or six-day repair because I lose money.
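The arithmetic behind that trade-off is simple enough to write down. The Python sketch below uses a made-up daily revenue figure and counts only lost generation, ignoring repair costs and any contract penalties. The function name and the numbers are hypothetical.

```python
def lost_revenue(repair_days: int, daily_revenue: float) -> float:
    """Revenue lost while a turbine is down for the given number of days."""
    return repair_days * daily_revenue


DAILY_REVENUE = 10_000.0  # hypothetical revenue per generating day

# Three early one-day repairs versus one late five-day repair.
early_total = 3 * lost_revenue(1, DAILY_REVENUE)
late_total = lost_revenue(5, DAILY_REVENUE)
print(early_total, late_total)
```

Under these assumptions the three early repairs cost three days of generation while the late repair costs five, which is the gap the twin's early warning is meant to capture.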
The twin is the thing that's managing the performance as well as managing the life and the maintenance. That's one example. In jet engines, you see the same thing. People only make money when they fly, so many of the contracts are written such that I want that engine to be highly available, so if we can predict when there would be an early warning on a failure. A failure of a jet engine at an airport is recognized as something that's bad, so we're only allowed so many. Once more, I'm trying to predict what's the failure rate. How do I determine an early problem? Once I could get to that early problem and do a minor repair, I could increase the availability. People make money, customers make money on having the jet engines more available. People make money on having their electricity meet the contract
requirements. That's what the twin is used for. Ultimately, then, it's a matter of delivering on the promise of the product or the service that's being modeled. Yes, it is. But here's where it gets hard, Michael. The world changes, so wind profiles change. Sometimes, there are weather problems. There are climate shifts. There are burns. So, those profiles change but the contract didn't change, so what do you do? Sometimes, things occur. You are flying a jet engine in a harsh environment. In many
of these emerging nations, they are doing lots of building. They're building skyscrapers. There are different dust and contaminants in the air we never suspected. These get on the blades of the engine. These wear the blades in a different way, erode the blades in a different way, so you didn't anticipate that. Now, you've got to have these twins give you
enough early warning on these problems and enough ways to mitigate because when that builds up on the blades of an engine, you find quickly that that disturbs the airflow and the performance is not the same, so you increase the amount of fuel you use. I've got to find ways to balance that. While it's meeting the commitments, it's meeting the commitments in an ever-changing world with ever-changing operators. That gets hard. The benefit accrues both to the operator as well as to the customer. Exactly. Exactly.
We have a question from Sal Rasa on Twitter who asks about the cultural changes that are necessary when you implement a digital twin. Maybe we can start there and you can walk us through how do we build a digital twin and get it running. The twin, for us, as I mentioned before, starts always with where is the business value because, again, these are assets that are operating in a certain environment. Unless you can talk about the business value, very few people are interested. We figure out where is the value. Do I want to get more yield out of the factory, more availability of the asset, more performance? Now that I have that, I go backward from that. All right. Given that I have this,
what's the data I need to collect? What is the insight I want to get? That tells me the model. Do I have the right insights at the right time? Like I mentioned before, when I am predicting a failure on a jet engine, I need to do it 30 days in advance for a bearing failure, if I can, because it takes the airline 30 days to get a spare engine on that aircraft, 30 days to get a new pilot, a new crew, because all of these things are tied up or planned. I need that domain knowledge. Then, based upon that, I have to build a model. Then I have to put it in the process. Now is when you do business process transformation. It's not enough for me to tell you to do something. I have to know what's your
normal process by which you switch engines. What's your normal process by which you get information? In many cases, we find people just wait until it breaks. Now I've got to slip this into your business process. That's business process transformation. All of these pieces need to come together because only when you do the business process transformation do you see the value. In fact, if you have the tool and you never use it, you don't see the value. Think first the value story, the data and the domain story, the models you build, and then slipping it into the process.
Now, let's come to culture. One of the hardest things is to actually think through the cultural changes. Again, in my history, I spent the last 6 years with GE, 20 years in IBM. IBM, and other companies that work in that evolution and that revolution, have a data culture. What you find in some of the industrial companies, the data culture is not so pronounced. It is really a product culture. What I build is I build the most aerodynamic engine. It runs fast with the best fuel.
Yeah, the data is a way towards an end. I don't have to keep the data once I've gotten what I have. I may keep some of it for FAA requirements, but the rest of it is not really there. You first come with understanding the data itself is a golden asset. Right now, at GE, that helps because we have a huge services backlog, over $300 billion of services. Having the data helps me predict what I can do for customers better with services. The culture has to start with a data culture. The second aspect of it is that you've got to
make believers of the chief engineers that are there, the financial leaders that are there, the CEOs, and general managers that the data can make a difference. You have to go in and do specific pilots, usually on their hardest problems that can't be figured out, using just the knowledge they have and using that as a way to say, "Look at what has happened and look at what the money is we've saved." In many of the industrial companies, you have two things: safety and money. Once you get the safety right and you get the money right,
people take note. People now say, "Wow, this data could really help me." Now you've got to build a financial equation that says, "All right. Now, if I do this with the data, I can save so much money," or "I can gain so much money." Ah! Now it becomes truly interesting. I think it is all starting off with understanding and getting that data culture going by hitting a few key problems in which you can show the value of it. Then building that business case in which you could clearly see it on a problem they're focused on. Then coming back into the data culture that actually says, "Well, how do you respect your data? Do you store it? How do you clean it? Where is it kept? How long is it kept?" Then reminding people. The people who use data respectably, Amazon and
Google, spend $4 billion or $5 billion a year (or have spent $4 billion or $5 billion a year) for five years to get this right. They don't do it on $100 million or $200 million or $5 million. It's not something that you flip a switch on. You really get that data culture because you have won some battles they didn't think you could win, you've built the right heroes, and they see how you can help them. Then you go into the cultural war of changing things.
Arsalan Khan follows up saying, "It sounds like you think like an enterprise architect." With pride. Thank you. [Laughter] I mean I don't think I'm that good, but I have to because the challenge we have when you're dealing with these larger assets (or even smaller assets) is that you're thinking about the design phase where the engineers are. You're thinking about the manufacturing phase where it's manufacturing. You think about the services phase in light of the money you make then and the response to the customers. This is an enterprise-wide play.
If you don't do the right things, if you don't understand what occurred in design and manufacturing, chances are you're not delivering the right value in services. If in services you're trying to make changes and you don't deal with the actual life, the as-designed capabilities, or the problems inherent in the materials and manufacturing, it doesn't work. This is a systems problem so, every time you look at a systems problem, you have to come at it from that enterprise-wide view. Yeah, I would love to call myself that. That
would be an honor for me, but I am just here dealing with, how do I actually make sure that I deliver value for the customer, then value for the company, and it does fall across the entire enterprise itself – make no mistake. Also, recognizing that quickly and working at the enterprise level will get the fastest results and build the greatest sustainability. Colin, you mentioned earlier that the cultural, the human dimension is the hardest part. Why is
that given the obvious complexity and size, scope, scale of both the data and the physical models that you're creating when you build digital twins? I think there are two things here when you think about the human dimension. One is, in many cases, especially when you meet some of the people who have been working there for many years, there's a notion that, "I've seen that and I have a gut instinct." In many cases, they do. I wouldn't decry that at all. There is an instinct built upon many years of doing something.
The challenge we have in many of these environments, though, is that the environment is not the same. You may have built that gut instinct in an environment that wasn't that dynamic a few years ago, but now it's very dynamic. I'm in the energy space. With the rate of change of renewables coming in (large-scale solar panels, large-scale wind turbines, or even small-scale on people's houses), that industry is so dynamic and it is changing. Regulations are changing. There is a variety of electric vehicles showing up. The gut feel that you have right now, which made sense in
its era, is now being reshaped because there are so many new, dynamic things. You are not going to be able to understand the relationship between all these things and the impact that those things could have on you in a way that makes any sense, especially when it's rapidly changing. I think that's a hard thing for humans to get in mind because it says two things to them. One, it says, "Am I less valued?" That's not true at all. You are valued in directing what the AI and the data do, but that's a feeling we have and it's a personal feeling we all think about.
Then the second thing that tends to happen is that there are these new solutions that come in that say things like, "Well, I can replace talent by this AI solution." Again, maybe in some jobs where it's now due to interest, but that is not true in many of the environments I'm in because what you find is that there is a lot of data you have not captured. In many cases, when you think about data that's been captured, people seem to capture a lot of data. But we generally capture data to solve a problem.
That's why databases have schemas. I had a problem. I use a database to capture it. If the problem is different, you may not have captured all the data you needed. The notion that, "Oh, I have all the data I ever need to solve every problem I have," is not true.
You may need different data or you may have captured data, but you may have captured the wrong quality of data. It may not be synchronized in the right way, so you still need the human there saying, "Here's the way you should capture this data. Here's the extra data we need. Here's the value of that data." Those judgment calls still need to be made. Everyone worries. Well, I shouldn't say everyone, but quite a few people worry about
the fact that this replaces me. It does not. Those are the things that you think about. That gut feel that you worry that you've lost that edge. No, it's not true. The thing is just more complex. Or that I will be replaced because this thing already knows more than I know. Again, not true because we may not have the data that reflects what we need to do.
Then the last thing is the business process itself. A lot of my business process is still human. You still need to call somebody and do something. You still need to get the engineers to get something done, the field folks to get something done. Then you still need to explain it to them. There is still that part of it that we've got to work through in which it's the humans and the AI coming together that makes sense for productivity. It is not one or the other. That is part of the cultural revolution we need to have. On this topic, we have another question from Twitter. "To create a data-driven culture,
do you think about specific incentives that you can provide to individual contributors or, alternatively, how do you drive and create that kind of data-driven culture, as you were describing?" In that data-driven culture, I think about first of all high-level purpose and motivation. If you don't have the purpose right in what you're trying to do and you haven't laid that out there, people don't get on the bandwagon. That's the first thing you get right. Then the second thing you get right is,
can I get the right standard work or the right process? If the process doesn't include the data, and if decisions aren't made on the data, then almost nobody wants to collect it because it's the process itself. You've got to change that business process where the business process is reflecting the fact that you are making a decision based upon that data and everyone will be rewarded (customers as well as your employees) based upon the right decisions to that process. Then the third is motivating and incenting people. Now, again, to collect the data, sometimes it's not just the employees. I have to incent my suppliers to give me that data. I have
to incent my partners to share the data. I have to incent the customers to give me the data on when they're doing things to make the twin more accurate. There's an incentive on that dimension. In terms of people, yes, there is an incentive. There are a variety of ways that we look at doing that. One is, we have these innovation metrics that talk about how many key—I should say—hard
problems that you solved using the data. How much was your solution reused, you know, your modeling solution used? Again, those are incentivizing people to build these models. Then there's another set of incentives in which we talk about how much reuse did you have? One way I can get a lot of my data scientists and my engineers more productive is to have them reuse things that are done. In many cases, you talk to them and they say, "Well, no. I'm brilliant. I'm the only one who can build this." Well, you know, you have other brilliant friends and colleagues. Can you not use what they use?
In some cases, we've begun incenting people to reuse things because, if you reuse things, you actually do more. You get more productive. You can maybe tackle four problems instead of two, and so you make more money because you've tackled four of the harder problems. Yes, we have done a variety of very (I would say) targeted experiments to help us do that. But we are still in the early stages and we need to emphasize that more. That is a big factor in our innovation metrics:
all our own reusability, the model generation, and also the collection of gold data. It's amazing. There are people who collect gold data. Gold data is data that's reused and of high value. If you collect that, everybody shows up to run models against your data sets. They say, "Well, Colin has collected this great data set. I want to run my model
against it. People keep giving him good data, he cleans the data, and he has it ready for us." That's value. We would reward that. Yeah, we're doing it now, but it is very early stages and we do not have it right. Again, it's an evolution. Colin, as we finish up, what advice do you have for organizations who are listening, folks listening, and saying, "You know this sounds pretty good"? How should they start? What are the types of problems that are most amenable and make most sense to begin with when it comes to thinking about creating a digital twin? The first thing I'd do as a technologist is don't think about the technology. Think about the business problem. The business problem is the one
thing that will galvanize everyone to give you the data you need to do the work you need to do. I would think, "What is the biggest business problem you have that I think that I can solve? Can I express that in a way that your finance people, your sales folks, your engineering folks, the services folks, and manufacturing folks would understand it?" That's the first thing I would suggest. Look at the actual business problem because you will get advocates once that happens. They'll understand your purpose.
Then the second thing I would suggest—and again, I've made the mistake so many times versus me telling you how to do things right because of all my failures—I go after cost problems. The problem I have is, every time I go after revenue, if I say I can grow the revenue base, everybody is not sure. Was it what you built or was it the way we sold it? Cost is really easy to find. Why? Because I can show up inside my manufacturing plants, inside my
engineering organizations, and I can say, "Which costs do you need to remove?" More than that, after I use the data to remove the cost, you can measure it. You can remove enough things so that I can go after removing that cost. Then what I say is, "Well, if I can save you so much money, can you not invest that money in me to save you more?" The great thing about looking at costs is that by doing that, I create a pool of funding for myself, which is good because then I can say, "Now we'll save some more costs, but then we'll put the rest of it in going after new revenue." That's the other thing I would say: find a well-known cost pool where it's defined so that you could show the value of it. The third is, make sure you have the data. Okay, I'm a data scientist. Whenever people tell me, "Oh, I have captured all that data," if you have never used the data, make sure you have the data.
You go into their databases and you realize, "Well, oh, a lot of the sensors at that time weren't working, so I have data that's corrupt. Oh, I had the time stamps wrong, so, oh, man, this thing doesn't jibe at all. I have inconsistencies. I took the same things in four databases and they all look different." Make sure you have the data. Eighty percent of the task is getting that data right once you know the problem. Spend time and really ask them that.
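[Editor's illustration] The three data problems described above (dead sensors, bad time stamps, databases that disagree) can be checked for before any modeling starts. The following is only a minimal sketch under stated assumptions, not a method described in the interview: the function names `audit_readings` and `cross_source_mismatches`, the `(timestamp, value)` record layout, and the sample values are all hypothetical.

```python
from datetime import datetime
from math import isnan


def audit_readings(readings):
    """Count basic quality problems in a list of (timestamp, value) pairs.

    Hypothetical layout: values are floats, or None/NaN when a sensor
    was not working. Rows are checked in the order they were stored.
    """
    # A reading is "missing" if the sensor reported nothing usable.
    missing = sum(1 for _, v in readings if v is None or isnan(v))
    # A time stamp is "out of order" if it does not advance past the previous row.
    out_of_order = sum(
        1
        for (t_prev, _), (t_next, _) in zip(readings, readings[1:])
        if t_next <= t_prev
    )
    return {
        "rows": len(readings),
        "missing_values": missing,
        "out_of_order_timestamps": out_of_order,
    }


def cross_source_mismatches(sources, tolerance=1e-6):
    """Return the time stamps at which supposedly identical sources disagree.

    `sources` maps a (hypothetical) database name to {timestamp: value}.
    Only time stamps present in every source are compared.
    """
    if not sources:
        return []
    shared = set.intersection(*(set(s) for s in sources.values()))
    mismatched = []
    for ts in sorted(shared):
        values = [s[ts] for s in sources.values() if s[ts] is not None]
        if len(values) >= 2 and max(values) - min(values) > tolerance:
            mismatched.append(ts)
    return mismatched


# Made-up sample data, for illustration only.
t0 = datetime(2021, 1, 1, 0, 0)
t1 = datetime(2021, 1, 1, 0, 5)
readings = [
    (t0, 71.2),
    (t1, None),                                # sensor not working
    (datetime(2021, 1, 1, 0, 3), 70.9),        # time stamp goes backwards
    (datetime(2021, 1, 1, 0, 10), float("nan")),
]
report = audit_readings(readings)

plant_db = {t0: 71.2, t1: 70.9}
fleet_db = {t0: 71.2, t1: 68.4}                # same reading, different value
mismatches = cross_source_mismatches({"plant": plant_db, "fleet": fleet_db})
```

Running an audit like this first gives a rough count of how much of the "captured" data is actually usable for the business problem at hand.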
Then the last thing I would say is that after you build that model, make sure you go back into the business process and transform it. If you can't show how that model you created with the data delivers money, everybody is going to say, "Well, I'm not sure it changed anything." This is the way you look at it. Really think about it from looking at the money,
understanding, going after things like cost and then growing from there, making sure the data is there, and then looping it back in so that you surely deliver the money. That's been tried and proven. I've made many mistakes not following those things. That's the advice I would give, Michael, at this time. Wow. That was awesome. Thank you so much. We've been speaking with Colin Parris. He is the chief technology officer of GE Digital. Colin,
thank you so much for the education today. Oh, Michael, it's been my pleasure and honor to be here. I'm delighted to have this conversation with you. I would love to come back in any way and help whenever I can. Thank you again. Everybody, thank you for watching, especially the folks who contributed questions. We have great shows coming up. Check out CXOTalk.com. Subscribe to our YouTube channel and hit the subscribe button at the top
of our website, so you can subscribe to our newsletter. Do it now and tell a friend. Thanks so much, everybody. We will see you again soon. Have a great day. Bye-bye.