NODES 2020 Extended with Doulkifli Boukraa and Siddharth Karumanchi
Hello everybody, good morning, good afternoon, good evening, wherever you are. Thank you so much for joining us today for what is the last NODES 2020 Extended session, and we've got another two fabulous talks for you to enjoy. I am joined today by my colleague Jesús Barrasa: we have a talk on neosemantics, and I thought, who better to bring along? There was no one else, right?

Well, first of all, thank you very much for inviting me, Lju, it's a pleasure. I'm really excited to hear what our guests will be talking about today, but yes, just a couple of words on neosemantics. I've been leading the efforts on this project. Neosemantics is a Labs project; Labs is the incubator that Lju's team manages, and it includes very popular extensions to Neo4j like APOC, the Kafka connector and the GraphQL extensions, plus a bunch of others. It also includes neosemantics, our plug-in, our toolkit for integrating Neo4j with RDF and the semantic technologies stack. I've been leading that project for the last few years, and that means writing some code, writing blog posts and examples, and basically interacting with the community and with teams like the ones you're going to be hearing from today. The latest news is that we're doing great: the community is growing, and this week we hit the 20,000 mark in downloads, so there's clearly some interest out there. There are lots of conversations in the community portal, we're just about to release a 4.2 version, and we keep adding features. So do come and have a look at the Labs page, you'll find out more about what neosemantics can do for you, and join us in the community. That's me; let's get started, because I think we've got a little bit of delay, but I'm happy to answer any questions on that.

So, the order of play. For those of you who are new, and for those of you who have been here before, you know the drill: we've got our two presenters, Siddharth Karumanchi, who is going first, followed by Doulkifli. Our two guests will do their talks, and at the end we have our panel discussion, so if you have any questions, drop them in the chat; we're picking them up both from YouTube Live and on Twitch, and we will pick them up at the end during the panel conversation. Without further ado, I will hand over to you, Sid.

Great, okay. So, when we look at any organization, if we break down what actually happens, it is essentially about conversing with data. It can be a conversation between an organization and a user, or an intra-organization conversation, and all of this happens through queries. If we are able to understand what the user is asking, and if we are able to prepare the data to give a consolidated answer, this brings great value to the entire ecosystem. For instance, if a user is interested in a product, or is querying about something, then if we understand the intent as well as the context of the query, we can always answer in a better way.
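For anyone who wants to try the neosemantics toolkit described above, a minimal getting-started sketch in Cypher might look as follows; the RDF URL is a placeholder, not a dataset from the talks.

    // Resource URIs must be unique so that repeated imports merge instead of duplicating nodes
    CREATE CONSTRAINT n10s_unique_uri ON (r:Resource) ASSERT r.uri IS UNIQUE;

    // Initialise the neosemantics configuration with default settings
    CALL n10s.graphconfig.init();

    // Fetch and import an RDF file (placeholder URL; other serialisations such as RDF/XML or JSON-LD also work)
    CALL n10s.rdf.import.fetch("https://example.org/some-dataset.ttl", "Turtle");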
Prepping the data is central to the entire process: converting raw data into qualified, organized data reveals hidden information, which helps both in making better sense of the question and in giving users what they are seeking. And connecting various data sources, especially structured and unstructured ones, and preparing them in a unified form lets us query all of the data at the same time.

We are a semantic middleware based on semantic technologies; this layer sits on top of the databases, between them and the users or their queries. What we do is build ontologies, in which subject-matter experts' domain information is codified: the concepts, the rules, and how the concepts are arranged into a particular hierarchy. Then we build knowledge graphs based on the available data, connect them with these ontologies, and reason, or rather infer, information on the instances with the help of a reasoner. This enterprise knowledge graph is the base on which most of the exciting activity takes place. For instance, there is a very interesting interplay between knowledge graphs and text mining and NLP: extracting the entities of interest, and not only extracting them but also disambiguating them and linking them to the right concepts; then document tagging; then enrichment of the data using the semantic structure found in the raw text. We believe this interplay brings a unique value to enterprises. On top of this we also manage the data and build semantic search, where a natural-language question can be understood, and not only the answers matching the keywords in it but also semantically related results can be found.

For today's talk we will briefly go through the construction of the knowledge graph we are using to explain what we are trying to do, then show an ontology-based inference, then demonstrate the dynamic nature of the data, in the sense of how the inferences change when a particular data attribute changes, and finally show an application powered by the electoral knowledge graph, to show how connected data can be represented so that it can be navigated and interpreted easily.

For the sake of this demonstration we will stick to a knowledge graph we constructed around the Indian electoral context. This is the basic outline of how we constructed it. The electoral data comes from various sources: the electoral information itself, that is, the real elections that happened, the parties contesting, the candidates, and the parliamentary constituencies from which they contest. Not only that, we also integrate this data with demographic data, which usually comes from the census, and with GIS data. Combining all these sources and defining a graph schema based on them, we store it in the form of a labeled property graph.
The reason is that the labeled property graph is quite useful for us in exploring algorithms, and because we rely on attributes on the relationships, which are central to the labeled property graph model.

Now, to generate the knowledge graph, we built a custom ontology based on the electoral concepts: the Indian electoral concepts were first identified and then codified as an ontology, in which the relationships between the various concepts as well as their hierarchical structure are present. Once we link the instance data present in the labeled property graph with the ontology, that is when we generate a knowledge graph.

Let me first give you a glimpse of the ontology we have constructed. If you look at a concept like ParliamentaryConstituency, it is a subclass of ElectoralDivision, which is a subclass of the generic Division: the way you can divide India is through either administrative divisions or electoral divisions, and parliamentary constituencies are one kind of electoral division. Furthermore, we have linked these concepts to the schema.org and DBpedia concepts, so we are not only assembling our concepts into a hierarchical order but also relating them to existing, well-known concepts. That is just a glimpse; it's better to have a look at it in Neo4j.

So we start with a query that returns the thing we are interested in, a constituency: the Hyderabad constituency. If you look here, it has a lot of data. We have the electoral data, which is the name of the constituency and the kind of settlement (it is an urban constituency); the literacy rate, which comes from the demographic data; and its geo-location and polygon, which come from the GIS data. Everything has been stored. We also store information about this constituency on the relationships themselves, which is precisely why we felt the labeled property graph would be ideal for us: for instance, the number of electors in a particular year, the percentage turnout and so on are all stored on the relationship.

Now, we have exported the data, imported the ontology, and then linked the two. It is a time-consuming process, so for the sake of the demo we are just going to show the kind of linking we have done, using namespace matching. These are all the prefixes we have loaded, and all of this importing of the RDF data and the ontology is done using the neosemantics package. This is the namespace for our political-ontology concepts, and in order to identify the specific elements we also need to add mappings.
The full list of mappings has been added here: essentially, each entry pairs an element in the ontology with the attribute that is used in the Neo4j graph. So far so good. Now we are also going to show you the linked data; sorry, just a second... yes. This is the configuration we used to load the ontologies: we prefer node types instead of labels, because, based on the URIs, the linking then happens naturally in that setting.

Now, to see whether this particular parliamentary constituency has been identified with the ontology, let's have a look. What has happened is that, once we loaded the ontology with the configuration I showed, the constituency Hyderabad has two labels, ParliamentaryConstituency and ElectionUniverse, and these two labels have been identified with the corresponding concepts in the ontology. Labels like these, which already exist in Neo4j, we store as the primary type: they carry an attribute "primary", so this is the primary kind of instance identification we have done.

Now, if we look at the ontology, there are other concepts built on top of these. For instance, a MuslimMajorityConstituency is defined as a parliamentary constituency whose dominant religion has the value "M". This kind of inference is not present in our graph at this moment. So what we have done is export this entire node as RDF using the neosemantics package and send it to an Apache Jena endpoint, where we have loaded the ontology and used an OWL reasoner to compute all the inferences we will be using. We get the output, the inferred data, back as RDF, and we then enrich our knowledge graph by loading it back in with neosemantics.
OK, so that many triples have been loaded; just for this particular constituency I have added all the other inferences. If I repeat the same query and check, you can see that a lot of other concepts have now been identified: it is a type of Region, it is a general constituency, and now it is a Muslim-majority constituency. The reason for that is visible if you look at Hyderabad. After importing, we get a copy of the node, so we need to merge it with the existing node, the existing concept called Hyderabad, and that merging happens here; I will go through it again after this step so that it becomes clear. If you look closely, it is of type ElectionUniverse and part of the ParliamentaryConstituency concept, which already existed; these also have the primary relationship set to true. The other inferences, that it is a Muslim-majority constituency, a general constituency, an urban constituency, are the inferred types of relations that have been added.

Just to recap the procedure: we had a particular node, and we exported it through the neosemantics package. Apart from the primary relations we had already established, we now use Jena to infer new facts based on the ontology, and this inferred data, produced as RDF, is imported back into the knowledge graph. Once the inference has run, an instance of the node is created, we merge it with the existing Hyderabad node along with all its relations and properties, and the final inferred knowledge graph is created.

After this, since the nature of the data is dynamic, if we update any particular attribute we would like the inferences to be updated automatically. The procedure works like this: once an attribute is updated, a trigger is fired, the data is exported as RDF, it is again ingested into the Jena inference engine, and the newly inferred data comes back and is ingested through the same process. I will show that by changing one of the attributes. For Hyderabad, the dominant religion was stored as "M", and hence the Muslim-majority constituency type was inferred. Now I am changing the dominant religion to the Hindu majority. If you look at this... it is actually taking a bit of time, so meanwhile I will explain: this happens through an APOC procedure in which the trigger fires. As an intermediate step we have something called "to be inferred": as part of the procedure we have written, the Hyderabad constituency is flagged as to-be-inferred because we changed this value. That to-be-inferred node has been created, and its RDF will be sent to the Jena endpoint where the inference takes place; the new inference should be that this is a Hindu-majority constituency, and it should then be merged back and stored in our knowledge graph. Let me see if it is done. Yes: if you look at Hyderabad now, it has been identified as a type of Hindu-majority constituency, which I can show you here, and the other inferences remain the same because we have not changed those values.

So what we are essentially doing is running a pipeline in which any dynamic change in the data, especially on the attributes the inferences depend on, is immediately identified: a trigger fires a procedure, the RDF is exported and sent to an inference engine that works off the ontology, the inferred data comes back as RDF, and we load it back into the knowledge graph using the neosemantics package.
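The trigger part of that pipeline could be wired up with APOC roughly as below; the property name being watched, the flag label, and the hand-off to the external reasoning service are assumptions based on the description, not the team's actual code.

    // Requires apoc.trigger.enabled=true in neo4j.conf
    // Whenever dominantReligion changes on a constituency, flag the node for re-inference
    CALL apoc.trigger.add(
      'flagForInference',
      "UNWIND $assignedNodeProperties['dominantReligion'] AS change
       WITH change.node AS n
       WHERE n:ParliamentaryConstituency
       SET n:ToBeInferred",
      { phase: 'after' }
    );
    // A separate job then exports the flagged nodes as RDF, posts them to the Jena endpoint,
    // and re-imports the inferred triples with n10s.rdf.import.*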
OK, so on top of all this, where are we using this electoral data, especially this electoral knowledge graph? In this ever-exploding digital universe, where content keeps growing, and especially in this so-called post-truth world, battling disinformation is one of the important problems, along with how we navigate news or stories across time and space, that is, how each story is connected to another. To come up with a solution, the best way, we feel, is to leverage the power of connected data, which we think helps solve the problems just mentioned.

This is the basic flow on which the application, which we call Watching, is built. We again have various sources of data: we have our electoral knowledge graph, and the entities of interest, say the politicians, keep occurring in various Twitter feeds, newspapers, articles, the Lok Sabha debates, the parliamentary debates and so on. All of this information can be processed using natural language processing techniques and stored in the knowledge graph. Similarly, we can perform traditional machine learning tasks, like classifying the text, identifying the sentiment in the text and, more importantly, generating a vector based on the content in order to run numerical algorithms on it. Apart from this, we also use some of the Graph Data Science library: community detection methods, centrality of the nodes, and similarities. All of this sits on the base of the knowledge graph: our domain knowledge graph, the knowledge base, along with a community knowledge base such as Wikipedia. All of this information can be brought into one source, and that is what powers the application called Watching, which helps us navigate through connected data in a seamless way.

Just to recap: we have our electoral knowledge graph, built from various sources of data (the GIS, demographic and electoral data), we built a custom ontology, and there is a community knowledge graph. All of this is the base on which the NLP processing takes place, for instance identifying the entity in question, disambiguating it from other candidates based on the context, and then linking it with the existing nodes in the graph. All of this is powered by the knowledge graph and the text mining processes.

Let me quickly show the application, Watching. This is how the landing page looks: basically, each bubble is a story, and for now these are about forty stories that are important. The importance of the stories is evaluated according to an importance index, which is based on trending data (for instance, whether particular entities are trending; COVID is a very hot topic right now), on the importance of the stories within the network itself, on the content of course, and also on the topics, the genre, of the stories.
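One common way to implement the entity-linking step mentioned above is a full-text index over entity names in the knowledge graph; this sketch assumes Entity and Story labels and a name property, which may not match the team's actual model.

    // Build a full-text index over entity names (Neo4j 4.x procedure)
    CALL db.index.fulltext.createNodeIndex('entityNames', ['Entity'], ['name']);

    // Link a mention extracted by the NLP pipeline to its best-matching entity node
    CALL db.index.fulltext.queryNodes('entityNames', 'narendra modi') YIELD node, score
    WITH node, score ORDER BY score DESC LIMIT 1
    MATCH (s:Story {id: $storyId})   // $storyId supplied by the ingestion job
    MERGE (s)-[:MENTIONS]->(node);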
So all these important stories are displayed, and if I look at one, the story has been tagged: the entities have been extracted and linked to the correct nodes, and we also give the genre of the story, which in this case is world news. If we click on a particular entity, we go to the entity page. In essence, every entity is an atomic concept inside the story; instead of searching for keywords or for topics, we think that entity-based search, once entities are treated as the atomic units, makes everything else quite seamless: the relations between them become more explicit, and navigating how these things connect becomes interesting. So this is a micro-site for the person Narendra Modi; here we display all the articles or stories Narendra Modi features in, along with the entities related to him (Manmohan Singh happens to be the ex-Prime Minister of India) and the genres displayed over here. Apart from this, since it is built on top of the electoral domain knowledge, we have all the information regarding the candidate of interest and the other political information, which we are planning to present in the form of data art that can be consumed easily, and which would not be a boring way to explore data, especially when you have so many important and diverse electoral concepts.

I will quickly go through one of the algorithms that features centrally in most of the tasks we are doing: community detection. It would not be fair to give a demonstration in such a short time, so I will just go through the methodology. Say we are looking at two particular entities of interest; in our graph they are connected via other relations, for instance the type relations based on the concepts we get from the imported ontology. Because this is a multipartite, multi-label graph, and the graph algorithms in Neo4j are supported only on a monopartite graph, we project the graph, based on these relations, onto the space of entities, and then we perform a centrality measure. This is not the only possible projection: we can do other projections based on different relations between the entities. Finally, we get vectors of these network-based features, which we store and scale, because the measures can be on very different scales and scaling makes the approach more robust. Then we compute a similarity graph over the entities based on these vectors, and once we have the similarity graph we run community detection. The resulting communities can then be used as a seed to improve the process when we iterate it.
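With the Graph Data Science library (1.x syntax, current at the time of the talk), the projection-then-centrality-then-communities methodology just described might be sketched like this; the projection name, labels and relationship types are assumptions, and this compresses several intermediate steps (feature vectors, scaling, similarity graph) into a single simplified pass.

    // Project a monopartite entity-entity graph out of the multipartite knowledge graph
    CALL gds.graph.create.cypher(
      'entityProjection',
      'MATCH (e:Entity) RETURN id(e) AS id',
      'MATCH (e1:Entity)<-[:MENTIONS]-(:Story)-[:MENTIONS]->(e2:Entity)
       RETURN id(e1) AS source, id(e2) AS target, count(*) AS weight'
    );

    // A centrality score over the projection, one of several network features per entity
    CALL gds.pageRank.stream('entityProjection', { relationshipWeightProperty: 'weight' })
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS entity, score
    ORDER BY score DESC LIMIT 20;

    // Community detection over the same projection
    CALL gds.louvain.stream('entityProjection')
    YIELD nodeId, communityId
    RETURN communityId, collect(gds.util.asNode(nodeId).name) AS members;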
What we are currently continuing to do is build a full-fledged semantic search on top of the layers we have been developing, and then we would really like to start exploring knowledge graph embeddings. For instance, Neo4j recently started supporting vectorization, so we can embed graph nodes into a low-dimensional space, but unfortunately this is again restricted to a monopartite graph. If we want to consider the full scale of the semantic space, as well as the relationship space and the node space, we would like to explore knowledge-graph-based embeddings for the downstream algorithms and see how much of a difference they make. Finally, we would also like to implement SHACL validations in order to standardize our graph against the W3C standards.

We are open to discussions about proofs of concept in the domains of healthcare, finance, banking, media and governance; this is our website and this is our Twitter handle. I am good to take questions, and I would like to once again thank Neo4j for this opportunity.

Well, thank you very much for that; that was really interesting, and we've got a bunch of questions. If you've got any questions, pop them in the chat and we will pick those up at the end during the panel. We're now going to go straight over to Doulkifli's talk. Just give us a couple of seconds whilst I set up your screen... over to you.

Okay, thank you. I'm Doulkifli Boukraa, an associate professor at the University of Jijel in Algeria, and I'm going to present a project that I led a couple of months ago with two of my master's students, about mining the content of question-answering systems. To begin with, we as humans always ask questions; we never stop asking questions and facing problems for which we need solutions and answers, and technology has helped us answer our questions and solve our problems by means of dedicated systems like Quora, Stack Exchange, Yahoo Answers and so on. On these websites people ask questions, they get answers, they solve problems, but they also interact with each other: by commenting on posts, by voting on the questions, by upvoting or downvoting the answers and so on. The data about the users' interactions in these systems can be made available. For instance, if we take the case of... I just wonder if you can see my slide? It's blank, I don't know why; sorry, let me launch it again. Can you see the screen now? Okay. So, as I said, on these websites, like Stack Exchange, people ask questions and get answers, and the data about the user interactions can be made available for people to analyze and to run data mining tasks on, such as trying to find communities of users. You can see here that, in the case of Stack Exchange, the website whose data we used, the site offers the possibility to analyze its data through a set of pre-built queries: when a user wants to get some insights from their use of Stack Exchange, they simply click here, for instance on a question, enter their user ID, and get some insights. However, as you can see, the list of pre-built queries that lead to the data is limited.
That's why Stack Exchange has also made available for its users an interface on which they can write SQL queries from scratch: they enter the query here and get the answer, and the schema of the database is displayed on the right-hand side of the interface. However, as you know, writing a query from scratch requires some database and SQL skills, and not everybody has these skills.

What motivated the project is the fact that, if we want to get really useful insights, for instance to find communities or to understand how the users are actually interacting in these systems, we need to get the data offline so we can analyze its schema and write some complex queries. And if we want to answer advanced queries like the ones you can see here, for instance ranking the users based on their interactions, or detecting communities of users who interact in the same way, who comment on the same people, who answer the same posts, who ask the same or similar questions, then the data made available by the Stack Exchange website is not appropriate, because it comes as a relational model. As you know, the relational model is a set of tables linked to each other, and to perform such queries, for instance to see how the users interact or to rank them, we need to make extensive use of joins, or self-joins on the same table, here the User table. That is why we were motivated to use graphs instead of the relational model.

I don't know what is happening; in fact it's the preview, because I'm on a Mac and I think I'm getting an overhead on my PC, so maybe I should relaunch it... let me just check... as long as I don't get the blank slide again. Okay, I'll try to be quick here, to avoid the blank slide, and give the most important elements of my presentation. Can you hear me? Okay.

So, as I said, the data made available by Stack Exchange comes in a relational format, which is composed of a set of tables, and some advanced or complex queries require a series of loops over the same table. If we take a close look at the relational model, we can see that the main, central table is the User table, and any analysis on it requires extensive self-joins on that same table. Our solution was to transform the data into a graph. Here you can see the model we used for representing the interactions between users: we have two labels, User and Tag, and we have six relationship types between users; for instance, a user who comments on another user's post, a user who downvotes another user's post or comment, a user who upvotes another user's comments, questions or answers, and so on. Using the graph, we could avoid the problem of joining the tables of the relational model.
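To illustrate the point about self-joins: a question like "which users interact with the most other users?" stays compact in Cypher because all the interaction relationship types simply point from one User node to another, whereas the SQL version needs repeated self-joins on the user table. The displayName property below is an assumption.

    // Rank users by how many distinct users they interact with, across all interaction types
    MATCH (u:User)-[r]->(other:User)
    RETURN u.displayName         AS user,
           count(DISTINCT other) AS contacts,
           count(r)              AS interactions
    ORDER BY interactions DESC
    LIMIT 10;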
We did the same between the tags, because we needed to know which tags co-occur within the same post, and whether we can, for instance, draw a set of communities of the most-used tags that appear together. Our approach was first to export the data from the Stack Exchange website as CSV files, then load it into a Neo4j graph database, and then we developed a tool on top of that database. Here you can see some statistics of the database: we have about 50,000 users and nearly 570 tags, and the other figures are instances of the relationships. And... oops, I have the blank slide again. Can you see my slide? I'm very sorry for this trouble.

I think it's okay; we're down a camera at the moment, so just give me a couple of moments and then we'll be good to go.

Okay, thank you. Thank you very much, Sid, for the presentation; it was really interesting, and it's good to see neosemantics in action in a real project like yours. I have some more specific questions on the inferencing iteration that you run, but first I wanted to ask whether you could elaborate a little on the type of use cases you work with, because you mentioned briefly at the end that you're open to projects in healthcare, finance and government. Maybe you can give us a high-level idea of the kind of situations where you find your platform being used, or adding special value. Does that make sense?

Oh yes. At the moment we have developed it fully using the electoral data. For instance, we are looking at these non-trivial inferences we are getting, and at things that help us examine the sources of influence: a politician and his interactions with various parties and businesses, and the sources of money flowing into politics. Those are the areas we have actually looked into. We have also made small cases in the financial setting and with clinical text data; they are not yet at the proof-of-concept level, but we do have use cases around extracting information from raw clinical text into qualified data, and in that process we would like to use these inferences. In the financial sector, of course, fraud is a ubiquitous use case for this kind of scenario, and we would also like to power our semantic search with these inferences.

Right, that makes sense. I guess this kind of investigative use case is similar to the Panama Papers work we've been talking about for many years now, and that's where the inferences can add value, enriching your graph and deriving new information. So, we've seen how you create your ontology and then bring it into Neo4j, and you mentioned you're using standard web standards; you've mentioned OWL. Is that the kind of construct that you use?

Exactly, yes. We built our custom ontology and we integrated it with the DBpedia ontology as well,
which we found available in OWL form, so we built on top of it so that the common concepts could be easily identified as well.

Right, that's great. This is where I go a bit deeper into the way you're using neosemantics, because I see there's this loop where you build your knowledge graph in Neo4j, but at some point you have to export it into the Jena OWL engine, run the inference, and then bring back the results. One direction the neosemantics project is going is developing more of these inferencing capabilities, and one of them is bringing, I don't know whether OWL, or maybe OWL DL, or RDFS, but bringing some of this inferencing into neosemantics itself, so it is something you can run inside Neo4j. Is that of interest? Because I see it would potentially simplify your architecture if you could run this inference without having the data transit outside and back in.

Exactly, yes. That would be a huge advantage for us, because as far as we checked there were only hierarchical inferences in neosemantics, the simple hierarchical inferences, and we are now making inferences based on node properties; furthermore, though we have not shown it, we want to do the same based on relationships as well. For those kinds of things we were forced to move out of the Neo4j framework, go to Jena, and come back.

That's interesting, because as you know that kind of inferencing can be powerful and quite rich, but it is also, in a way, a can of worms sometimes: you can get into situations where it either takes too long or becomes too heavy a process when you're working on large volumes. So we have intentionally been a bit cautious in that area, just adding small bits of inferencing capability, but it looks like that's an area where there's definitely interest, so I would say watch this space, because we will be working on that front.

Oh, absolutely.

Well, Lju, I could go on; I don't know if you want to ask something, otherwise I'm happy to continue.

Unexpected, in the sense that at least the explainability has improved a great deal. Let me be cautious and say that, for instance, the role of money in the Indian electoral system, the main sources, the builders, these kinds of concepts that were always part of the hypothesis, has become much clearer now; we found concrete grounding for that understanding. So the explainability has improved greatly, but to your question of whether we found unexpected results: no, not yet. I think Lju's question is really in the direction of whether we are going to have another Panama Papers in the Indian news soon, another case of investigative journalism. Unfortunately, we are only going through the digital content of the last ten years, so...

Yeah, I see the point. No, not even the funky stuff, but okay.
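For reference, the hierarchical inference that neosemantics already ships with, which Jesús alludes to here, is exposed through procedures along these lines; the class name comes from the electoral ontology discussed earlier, while the relationship type names passed in the options map are assumptions that depend on how the types were imported.

    // Return nodes whose class, or any subclass of it in the imported ontology, is ElectoralDivision
    MATCH (cat:Class { name: 'ElectoralDivision' })
    CALL n10s.inference.nodesInCategory(cat, { inCatRel: 'TYPE', subCatRel: 'SCO' })
    YIELD node
    RETURN node.name
    LIMIT 10;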
So, the labeled property graph we originally designed: the intention was to run algorithms on it as well as to do a lot of aggregation and similar queries. But when exporting it in RDF form, we did that just for the nodes, because we knew we were going to infer only based on the nodes, and we linked all the nodes with the ontologies. So in a way we have preserved the structure of the labeled property graph, imposed this linking with the ontology on the nodes, and then we infer based on the use cases. Does that answer your question?

Let me put it this way: we created the labeled property graph based on the kinds of aggregations and queries we are generally interested in, and then, for the inferencing, we exported just the nodes and their attributes in RDF form. When we bring the results back in, we are simply linking those elements with the ontology we created for this purpose, so essentially the labeled property graph structure is still intact.

Yeah. And one good question, just to add to the understanding, because mostly, I guess, we're talking to developers here: it might be good if you could give us a little overview of the stack you've been using, because there's Neo4j on one side and Jena on the other. Is this whole loop automated? You're probably using Java, since...

Yes, it is based on Java, and on the Spring framework for the ingestion and everything else.

Right, so you have a process that basically exports the data out of Neo4j, brings it into Jena, runs the inference, and then brings it back, and that's basically Java and SDN?

Yes.

Good, that's useful, thank you. That's me; I could probably take the rest of the conversation offline, because I'm interested in how you've implemented the importing of the results of the inferencing. I like this idea of flagging the primary types and the derived types, because I guess you then have to be able to retract some of the derived facts when the environment changes, and that's an interesting thing to keep in mind. But I think I'm good with questions. Thank you very much for your presentation, really interesting, and looking forward to hearing more.

We can definitely connect offline; we would be extremely happy to discuss further and try to streamline the process if we can find any other Neo4j tricks.

Well, thank you very much. See you. Bye bye.
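As a closing aside on the primary-versus-derived bookkeeping discussed just above, one simple pattern is to retract the previously inferred type relationships of a flagged node before re-running inference, leaving the primary ones untouched; the ToBeInferred label, TYPE relationship and primary flag follow the naming used in the talk, but the exact identifiers are assumptions.

    // Drop non-primary (inferred) type relationships for nodes flagged for re-inference,
    // keeping the primary types that came straight from the source data
    MATCH (n:ToBeInferred)-[r:TYPE]->(:Class)
    WHERE coalesce(r.primary, false) = false
    DELETE r;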
2020-12-22 21:11