The real-time environmental impact of 250 000 ships CON018


[Music]

Hello everyone, and welcome to this Microsoft Build session. My name is Isabella Havrulco, and I'm a solution specialist within data analytics and AI at Microsoft Norway. It's my pleasure to introduce the speakers for this session: Jo Øvstaas, who is chief technology officer with C4IR Ocean, and Eirik Stavelin, senior data scientist at NoA Ignite. Jo and Eirik will tell us more about their work on unlocking the power of ocean data through the Ocean Data Platform and show us the environmental impact of 250 000 ships sailing now across the world.

The Ocean Data Platform is an open and collaborative data platform built on Cognite Data Fusion technology, and it is a project supported by the Microsoft AI for Earth program. It is also one of the key initiatives of C4IR Ocean, a non-profit foundation established by Aker Group and the World Economic Forum last year with the goal of connecting industry, academia, governments, and the public to create solutions that will restore the health and ensure the future productivity of the ocean. Microsoft joined C4IR Ocean as a co-founding partner and, together with the foundation, took on a leadership role in the High Level Panel for a Sustainable Ocean Economy, an initiative of 14 heads of state, including the prime ministers of Norway, Australia, Japan, and Canada, who joined forces to build momentum towards a sustainable ocean economy. The panel put forward ambitious plans to protect the ocean for future generations and made commitments for which the development of solutions like the Ocean Data Platform plays a crucial role. With this short introduction, let me hand over to Eirik and Jo.

Thank you, Isabella. We are so happy to be here today presenting our topic at Microsoft Build. We are going to talk about the real-time environmental impact of 250 000 ships. This is a project that we have been running in our team for the past three months, and we are aiming at showing greenhouse gas emissions from the entire ship fleet in hourly intervals. We're using what we
call AIS data to do this. That is position tracking of ships, and it is quite a lot of data. We have combined this with vessel particulars, meaning knowledge about each ship's dimensions, fuel type, and so on, and we feed this into an emission model. We are going to deep dive into this later in this presentation.

A few words about our foundation. We are C4IR Ocean, and that is based upon more than 180 years of history in Norway with responsible ocean management and maritime technology innovation. Last year Aker and the World Economic Forum decided to set up this affiliate center, C4IR Ocean, where we have a global mandate to focus on ocean sustainability. We are developing and operating the Ocean Data Platform, and we also have the lead on one of the action coalitions coming out of the High Level Panel for a Sustainable Ocean Economy, the Ocean Data Action Coalition. We do this together with Microsoft and other partners.

The ocean is big. It is huge: 360 million square kilometers, which is 36 times the size of the United States. It is also deep, going down to 11 000 meters in the Mariana Trench, and if you look at volume, we are talking about 1.3 billion cubic kilometers of water. So where to start, and where to focus? All the projects we are running in our center, C4IR Ocean, are focused around these three missions or impact areas: better ocean use, zero emissions to air and water, and zero plastic in the ocean.

The ocean has been a very important part of my childhood, education, and working career, and like many other people on this planet I am passionate about the ocean. I have been working in the maritime space for many, many years, and I have seen the beautiful in the ocean, but I have also seen how humans are impacting the ocean in a negative way. This makes us think: what can we do to enable better insight through data and technology? How can we get a better fact-based decision-making process where we bring together ocean
industries, scientists, policymakers, the general public, and citizen science? That is what we are doing with our work on the Ocean Data Platform in C4IR Ocean, and now we are going to deep dive a little bit more into one specific solution that we have made. As you all know, many industries, specifically the ocean industries, are causing some of the challenges we see in the ocean today. But it is also important to realize that the ocean is a significant part of the solution to climate change, and that is why we are working with progressive industries and thought leaders and focusing on the positive impact stories. With that, I give the word to Eirik, who will deep dive a little bit more into the technical solution.

Hello, I'm Eirik, data scientist with NoA Ignite, working with the Ocean Data Platform. The Ocean Data Platform wants to track emissions from shipping because what gets measured gets managed, and enabling sustainable and productive management of the ocean is a key factor in ensuring a healthy planet. So it comes down to figuring out how greenhouse gases are calculated. I'm sure you have heard about CO2, how much CO2 something emits, or about the greenhouse gas footprint as a variable people are concerned about. The method for estimating greenhouse gases is not a secret. It is well known and described in method papers, typically accompanying larger emission studies. The implementations, though, are not readily available, and the big deal for us became to make an implementation that matches those used in the global shipping industry.

Nobody can tell you exactly how many ships sail the seas. The latest International Maritime Organization greenhouse gas study accounted for about 237 000 ships. One of the world's largest ship registries, IHS Fairplay, has 127 000 ships for 2020. Global Fishing Watch has 1.5 million ships in their database, out of which we found about 500 000 on the seas in 2020. If we take the latest global emissions inventories, there
are about 200 000 active ships that constitute what we could call global shipping. Out of these, the top 20 percent produce about 90 percent of all the emissions. It is the big ships that produce the biggest emissions, but at the same time it is because the ships are so hugely large that shipping is the greenest mode of transportation: so many goods are transported by comparatively small combined engine power.

If you saw a shipping vessel on the news this year, it was probably the one that got stuck in the Suez Canal, the Ever Given. It is almost 400 meters long and was the 42nd longest ship registered in this huge database we use, twice the size of the average of the largest category in this database. It can transport 20 000 standard shipping containers, two of which, on the road, make up the load of a semi truck, so that is 10 000 semi trucks' worth of stuff. Most ships are smaller than this, much smaller. The average ship registered in this database is 86 meters, and the average of the smallest, shortest category is about 27 meters. That is the length of a basketball court, so that is still way larger than the typical boat owned by a private person. These are ships for professional use.

Science progresses one funeral at a time. It is an old saying from Max Planck, and the principle also applies here. The idea of green shipping relies to a large extent on new, technologically superior ships taking over the jobs of old clunkers. That means new designs, new fuels, and new engines, and there sure is a lot of cool new tech coming to the world of maritime transport. Like this: there is a company here called Evoy that wants to electrify smaller vessels, so think of what the car industry is doing right now. Or here, there is a totally new way of thinking about ships. This is a fully electric, fully autonomous container ship called the Yara Birkeland. It is not built with the limitations of having a large crew of humans as a core feature, and can thus optimize for other efficiencies than keeping a
crew warm and happy. Here is a wind-powered vessel that is under development. The ship's design aims to lower emissions by up to 90 percent. The masts here, the things that stick up and look like chimneys, are called rotor sails, and they give wind-assisted propulsion. These rotor sails can in many cases also be retrofitted to existing ships.

So, back to our problem. This must be a machine learning problem, right? And since we are at Microsoft Build, this is Azure ML to the rescue, right? Unfortunately, no. The signal that we would like to learn here, how much greenhouse gas goes up the chimney of the ship, is not readily available to us unless we own a fleet of ships and can measure how much fuel we use or how much black soot comes out of the chimney. We can't measure that, so we can't feed it into a machine learning algorithm that would aim to learn what moves this variable. But in the paper trail of the greenhouse gas studies we found this one study, the International Council on Clean Transportation's detailed methodology from 2017. This is one iteration of the methodologies that the IMO also uses. These are bottom-up methodologies, where you start with an individual ship and sum up over the whole fleet. The ICCT made a big difference to us because their methodology paper is clean, it is easy to read, and it explicitly prints the equations. Writing good method papers is a lot like writing good documentation: it makes the next guy's life much easier.

So here we go. Let's follow an AIS datum from a ship all the way through our emission model. Ships of a certain size emit messages to their surroundings in order to avoid collisions. These are called AIS data, for Automatic Identification System. The data vary in quality and are not super reliable, and they come in large quantities. Many ships report every few seconds; some report only once in a blue moon. Most professional ships emit enough data to get a good
overview of what they are doing. AIS includes things like speed, direction, the ID of the vessel, and so on. AIS is picked up by other ships, by towers on land, or by satellite, and it is shared to do its primary job: avoid collisions. After that, AIS is sold as a commodity, from vintage batches all the way up to streaming data. Our data point has now traveled from the ship's radio, it has been picked up by a satellite, it has been relayed to an AIS data broker, and it is now for sale. We buy access to this data in partnership with our NGO friends over at Global Fishing Watch.

Here we see a map view of AIS data showing where the ships have traveled, in the northwestern part of the Gulf of Mexico. We have added different phases to the data, indicated by plot color, showing what kind of operation the ship is performing, such as cruising, maneuvering, or staying at anchor. AIS time series data shows how ships move, and if we know a bit more about what kind of ship this is, we can estimate how hard its engine is working in order to make this movement. If we know how much energy is needed and what kind of fuel and engine the ship has, we can estimate emissions. We use AIS data from Spire, curated by our friends over at Global Fishing Watch, and anyone who has worked on data cleaning and quality will probably chip in on our thanks for the effort Global Fishing Watch puts into washing billions of AIS signals.

Our methodology works at a resolution of one signal per hour, resulting in only 581 million initial data points for 2020. This is large, but still manageable by fairly conventional data processing methods. About 50 percent of global AIS data is missing and has to be filled in. Linear interpolation is inaccurate for long periods of missing data, and we need better ways of filling in missing points. What we see here is a classic problem with linear interpolation for missing data, visualized. A ship is traveling from the English Channel, emitting nice, dense AIS signals, before it disappears from the data
and pops up again down by Gibraltar. Traditionally we would linearly interpolate this and draw a straight line cutting through land, in this case Spain and Portugal. The method adjusts for this by penalizing these interpolated points a bit, as ships normally do not travel in such straight lines. On average, for all ships over a year, these probably don't make much of a difference in emissions when accounted for, but it sure is sore on the eye. We want our emissions data accessible through a map service, and we did find it worthwhile to deviate from the ICCT method on this matter. So we are filling in missing AIS data with a pathfinding algorithm based on big data statistics. We build a huge graph of paths, broken down to a suitable resolution, for passenger, tanker, cargo, tug, and fishing vessels and so on, and we find the optimal path with Dijkstra's shortest path algorithm. All the routing is performed in Azure with a Postgres database and the pgRouting extension.

This is what it looks like. There is a ferry going from Kiel, Germany, to Oslo, Norway, a 20-hour journey, and we see multiple journeys here. On the left we have filled in the gaps using linear interpolation, drawing a straight line from the last known point to the next, and the results, in orange, are all over the place. On the right we have the same journeys with the same missing points interpolated using the routing, and we see that it fits the green points much more nicely.

Now we need the vessel particulars: ship types, engine types, fuel types, sizes, and various indicators for load capacity. Ship registries have existed for hundreds of years and contain exactly such details. We use the IHS Fairplay database for this. As you can imagine, due to both human nature and the variety in ship construction and measurements, such databases are not one hundred percent row-and-column complete. In order to fill the gaps, averages for ship types and capacity bins are used. This is probably an excellent area
to improve using machine learning, but we stuck with the written methodology here.

The model used to compute emissions is based on empirical work. I have highlighted the three modes of energy consumption on most ships: in yellow we have the main engine, the green bits cover the auxiliary engine, and the pink bit covers the boiler. Each of these is affected by what phase a ship is in, representing what kind of work the ship is doing. The phase a ship is in depends on the distance to shore, the distance to port, the weather, whether the ship is in a river or on the ocean, and of course the speed and the class of the ship. The main factor affecting emissions is how much oomph the skipper is giving, how hard he pushes on the gas, giving the ship its speed, adjusted for weather, the state of the hull, and how heavily the ship is loaded. Having all these factors figured out, the rest is lookup tables of emission factors for each ship class, subdivided into capacity bins and fuel types. So it is simple but complicated: many accumulated small facts.

We implemented the model in Python with the expectation that it would depend on horizontal scaling for emission computation. Since the model has many ifs and buts and adjustments for special cases, we did a naive implementation first, doing every ship in sequential order and iterating over each hour we compute emissions for. This made the whole thing much easier to understand, but we soon realized that we would run out of Azure credits and time before we had any big results. Python is never chosen for speed, and I'd like to demo what that looks like. Here we see a for loop. We are iterating over all the AIS data from a ship at a one-hour resolution, and there are lots of lookups and ifs and buts inside the compute emissions function here. So I start that. First we fetch 61 demo ships, then we iterate over those, then we fetch the AIS data and the ship's particulars, and we iterate over the
hours. What is counting up now is the hours we have calculated emissions for, for this first ship. I stopped the recording before it finished because it takes about two minutes per ship, clocking in at about two hours for our 61 test ships.

So we rewrote this emission calculation to rely on pandas data frames and NumPy, doing most operations on vectors and eliminating loops. Lookups are removed by merging AIS data with lookup tables into data series, similar to how joins work in SQL. If-else statements can be removed using NumPy select statements. The math is now applied to vectors instead of individual values, so we can do all the hours for one ship at the same time. So now we start that. I'm sure many of you have experienced this for yourselves, but wow, Python does not need to be slow. From hours and minutes to run a ship's data for a year, we managed to remove all the slow Python code, and our program is now fast enough that Python becomes secondary to the speed of our databases. Now the loop is just over the pollutants we want to estimate, and we clock in at about 2 minutes 37 seconds for the 61 ships.

But we can do better still. My laptop here has eight cores, so let's run eight ships in parallel. Now the printing comes out all scrambled, since we have eight processes printing with no regard for each other, but that's fine. We now clock in at about 33 seconds for all 61 ships. I'm sure there are many places where this can be optimized more, but we are now at the place where Python's speed is no longer our main concern.

So now our AIS data has traveled from a ship into our database and has been combined with details about the ship, we can compute greenhouse gas emissions fairly quickly, and these can be stored back to the ODP data store. Since there are so many ships in the world, and we only know for sure about those registered in large ship registries, we compute those first. Then we start filling the holes for missing data. For ships with no
hits, we use Global Fishing Watch as a fallback, as their database is much larger but has much less detailed information. For ships where we lack proper vessel particulars but have an estimated size and class, we use averages from the good data, the first bit, and we break this into quartiles to account for the bias in these ship registries: they contain really large ships, and ships not registered there are likely to be much smaller. We are deviating from the methodology's intentions when we compute emissions from ships and look at these individually, but the world needs to see these ships traveling in places it cares about. And if the world needs better regulations for reporting, or better data processing pipelines, so be it. The world needs to see the data we have, not the data we wish we had.

What we see here is a web app built on Streamlit for prototyping a possible front end, easily deployed on Azure App Service. We see our huge ship again, the one that got stuck in the Suez Canal, the Ever Given. We see it had a slow May and November, but it operated all the way through the year. We see that it mainly traveled between Europe and China, and it did indeed prefer the Suez Canal route in 2020 as well, so I can see it go through Egypt there. This is a version of the Ocean Data Platform, and this is a use case for our emission service, where industry, academia, and the general audience can view, inspect, and download data, such as by drawing a polygon on the map for a geographical area, like this one I drew here in the Gulf of Finland in the Baltic Sea. The data can also be accessed through a software development kit, perhaps the preferred method for many in the audience here at Microsoft Build.

Thanks a lot, Eirik, for going into the details. It is a very exciting solution, but let's zoom out a little bit. You have learned a bit about our solution, the Ocean Data Platform, an open and collaborative data platform where we are liberating ocean data by means of open
APIs and SDKs and enabling developers and data scientists to develop solutions for a productive and healthy ocean. Let me give you some ideas about what types of use cases and services we are thinking about going forward.

One obvious opportunity is all the regulatory drivers you have out there, and I have just listed a few of them here: the EU taxonomy and Green Deal, the enhanced transparency framework from the Paris Agreement, and the greenhouse gas strategy from the International Maritime Organization. Emission reporting and climate and environmental impact reporting will move towards a more real-time way of reporting, needing more high-fidelity data, and that is something you can find in the Ocean Data Platform.

This is another example, coming from our close partner Global Fishing Watch, which we have mentioned previously in this presentation. They are doing some pretty exciting stuff using AIS data, something else called VMS data, satellite imagery, and satellite radar to illuminate illegal fishing. The blue dots you see on this map are fishing activities detected by machine learning algorithms, and the red areas are so-called transshipment areas, where small boats are lying next to a big boat and offloading catch and so on.

Another issue that is becoming more and more important is the impact of underwater noise. We distinguish between sound and noise, but underwater noise is impacting marine mammals, and there are also wildlife collisions. We can use the data in the Ocean Data Platform and our solution to enable solutions in this field. This is one example from the Ocean Data Platform, where we are looking at specific areas along the Norwegian coastline where you have different ocean industries operating, like oil and gas, seismics, and also ship traffic. You can imagine how this type of analytics can go forward to show noise heat maps and so on.

Yet another example is the fact that ships, sailboats, and pleasure craft are
traveling all over the ocean, including to remote places where we lack data. This is one example from the latest Vendée Globe race, where Boris Herrmann Racing had highly automated sensors on board their sailboat, capturing very important and relevant climate data in the upper level of the ocean across the Antarctic in more or less real time. So there are endless opportunities here with ships, sensors, and big data to follow trends in what is going on in the ocean.

None of us can do everything, but everyone can do something, and we at C4IR Ocean and the Ocean Data Platform are encouraging you to join our mission of connecting people, data, and technology for a healthy and productive ocean. Thank you for listening in. Here are a few more contact details for our organization, platform, and community. Thanks a lot.

[Music]
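
The rewrite from a per-hour loop to vectors that the talk describes (lookup tables merged in like SQL joins, if/else chains replaced by NumPy select) might look roughly like the sketch below. It is not the speakers' code. All column names, thresholds, and numbers are invented for illustration, and the cubic speed-to-load relation is a common simplification assumed here rather than something stated in the talk.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly AIS rows for one ship (values invented for illustration).
ais = pd.DataFrame({
    "ship_class": ["container"] * 4,
    "speed_knots": [0.5, 4.0, 18.0, 20.0],
    "km_to_port": [0.2, 3.0, 250.0, 400.0],
})

# Hypothetical lookup table with per-class engine data and an emission factor.
particulars = pd.DataFrame({
    "ship_class": ["container", "tug"],
    "engine_kw": [40000.0, 3000.0],
    "max_speed_knots": [22.0, 12.0],
    "g_co2_per_kwh": [600.0, 650.0],
})


def classify_phase(df):
    """Replace an if/elif/else chain with one vectorized np.select call."""
    conditions = [
        df["speed_knots"] < 1.0,
        (df["speed_knots"] < 5.0) & (df["km_to_port"] < 10.0),
    ]
    # The first matching condition wins; rows matching none are "cruising".
    return np.select(conditions, ["anchored", "maneuvering"], default="cruising")


def hourly_co2_kg(ais, particulars):
    """Compute main-engine CO2 for every hour of a ship at once."""
    # The merge plays the role of a SQL left join against the lookup table.
    df = ais.merge(particulars, on="ship_class", how="left")
    df["phase"] = classify_phase(df)
    # Assumed simplification: engine load grows with the cube of relative speed.
    load = ((df["speed_knots"] / df["max_speed_knots"]) ** 3).clip(upper=1.0)
    # kW * load over one hour gives kWh; grams per kWh divided by 1000 gives kg.
    df["co2_kg"] = df["engine_kw"] * load * df["g_co2_per_kwh"] / 1000.0
    return df


result = hourly_co2_kg(ais, particulars)
print(result[["speed_knots", "phase", "co2_kg"]])
```

Because each ship is handled by one independent function call, running several ships in parallel, as in the eight-core demo, could be done by mapping such a function over ships with a process pool.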

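The gap filling described in the talk routes a ship through a graph of known paths with Dijkstra's shortest path algorithm instead of drawing a straight line over land. The talk does this inside Postgres with pgRouting; the plain-Python stand-in below only illustrates the idea. The node names and distances are hypothetical.

```python
import heapq


def dijkstra(graph, start, goal):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if none."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical sea-lane graph: edges follow the coast around Spain and Portugal,
# so there is deliberately no direct edge crossing land. Weights are invented.
sea_lanes = {
    "english_channel": {"bay_of_biscay": 300.0},
    "bay_of_biscay": {"english_channel": 300.0, "cape_finisterre": 280.0},
    "cape_finisterre": {"bay_of_biscay": 280.0, "lisbon_offshore": 250.0},
    "lisbon_offshore": {"cape_finisterre": 250.0, "gibraltar": 300.0},
    "gibraltar": {"lisbon_offshore": 300.0},
}

cost, path = dijkstra(sea_lanes, "english_channel", "gibraltar")
print(cost, path)
```

The missing hourly positions can then be placed along the returned path rather than along a straight line between the last and next known points.
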
2021-05-30
