Okay, welcome everyone. We are very happy to have Sofía Celi today to give a talk on attestations over TLS 1.3 and zero-knowledge proofs. She is a senior cryptography researcher at Brave, and she has been involved in developing privacy-enhancing technologies, post-quantum cryptography, and zero-knowledge proofs. She is a co-author of MAYO, the PQC signature scheme, and she contributed to the OTR protocol. Apart from being a prolific researcher, I think she is a very significant contributor to the community: she strongly advocates for making the community more diverse, and for human rights in technical standards. She is on the advisory council of the Open Technology Fund, she has various roles in different standardization bodies like the IETF, IRTF, and W3C, she is a co-founder of CriptoLatinos and of WiC, Women in Cryptography, and she is a co-editor as well; a lot of things. Super excited to have her. Take it from here.

Yeah, thank you so much for that really warm introduction. So, today I'm going to be talking; well, I will not be seeing you anymore because I have to share my screen, but I can switch every now and then to see if there are any questions. Anyhow, as I was saying, essentially what I wanted to present to you today is a paradigm that is currently rising in different parts of the industry: how to create private attestations over TLS-encrypted data by leveraging zero-knowledge proofs. I was hoping that at this point I would show you some new work that we have, but we found a bug, so I'm just going to give you some pointers to what we're looking at, because we couldn't finish it completely on time. Just to give you a little bit of an overview: essentially, what we're going to be talking about today is the design of a very complex and distributed system, and later you will see why I call this a distributed system. I hope also that at the end of the talk you will agree with me that it's kind of
like very complex. The reason why we want to do this is that on the web, and in certain other internet applications, when you want to access a certain kind of service, you are asked to provide a proof of something. The most famous one is obviously the proof of humanity, for whatever it means to be a human, and that is usually shown by asking the users to solve a CAPTCHA. But there are other kinds of proofs that we are sometimes asked for in order to access services: think, for example, of having to prove that you are older than 18, that you are a national of a specific country, or that you currently live in a specific geographical location. The thing is that many services are asking us for those specific proofs several times. So one of the core points that some people in the industry have made recently is that instead of having to submit these proofs all the time to different services, we can leverage the fact that we already created one of these authorizations, and then use it to obtain authorizations to other services. If you think about it, this is a little bit similar to what Privacy Pass is currently doing, because it is the same kind of idea: taking the fact that you already passed one successful authorization mechanism, for example you already solved the CAPTCHA, and using that fact to generate further authorizations without having to solve or provide any proof any longer. But one of the things that these researchers also realized is that the majority of the time, the specific proofs that we send to different services are locked inside TLS-encrypted data. So what they asked is: how can we actually produce zero-knowledge proofs from that specific TLS data, so that it can give us authorization, or access, or however you want to define it, to other services? But this is not such a simple system.
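The reuse idea just described (pass one expensive check once, then present tokens that stand in for it later) can be sketched as follows. This is a deliberately simplified, hypothetical model: real Privacy Pass issues blinded tokens via a VOPRF so that redemptions are unlinkable to issuance, whereas this MAC-based toy is linkable and only illustrates the issue-once, redeem-many flow.

```python
import hashlib
import hmac
import secrets

class TokenIssuer:
    """Hypothetical issuer: after one successful CAPTCHA solve, it hands
    the client a batch of MAC-based tokens it will later accept as
    authorization, with double-spend protection."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # issuer-side secret
        self._spent = set()                  # nonces already redeemed

    def issue(self, n=5):
        """Called once, after the client passes the expensive check."""
        nonces = [secrets.token_bytes(16) for _ in range(n)]
        return [(nc, hmac.new(self._key, nc, hashlib.sha256).digest())
                for nc in nonces]

    def redeem(self, nonce, tag):
        """Later requests present a token instead of re-solving a CAPTCHA."""
        if nonce in self._spent:
            return False  # replay: each token authorizes exactly once
        ok = hmac.compare_digest(
            hmac.new(self._key, nonce, hashlib.sha256).digest(), tag)
        if ok:
            self._spent.add(nonce)
        return ok

issuer = TokenIssuer()
tokens = issuer.issue()
nonce, tag = tokens[0]
assert issuer.redeem(nonce, tag)      # first redemption succeeds
assert not issuer.redeem(nonce, tag)  # replaying the same token fails
```

The point of the sketch is only the shape of the flow: one proof, many later authorizations, no re-proving.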
One of the first things to realize is that, of course, TLS 1.3 connections happen between a client and a server, and if you want to prove that some data comes from an honest execution, that it was honestly generated in a specific TLS 1.3 transcript, you first have to commit to that data, so that you can show that you didn't modify it or fabricate different data. So the first thing you want is the ability to commit to this specific encrypted data. Once you have it committed, there are many things you want to show about that data. First, you want to prove statements over the encryption itself, for example over AES, or over ChaCha20-Poly1305. Then you also want to prove that this specific data corresponds to a correct format: that it is properly constructed JSON, or properly constructed HTML, and that you are proving this over the AES ciphertext. And after that, you also want to prove that certain statements hold over that data: for example, that this specific block of the AES ciphertext belongs to a field that shows that I am over 18. And you want to prove all of this in zero knowledge.

In this talk I'm going to mainly be talking about the first and the fourth points: the first one, how to commit to this TLS 1.3 encrypted data, which is very daunting in itself, and the fourth, which is discussion and applications. For the future work that I have, I'm just going to give you some pointers to what we have been looking at, and hopefully later I can show you the actual result of this work once we put it on ePrint.

Okay, so again, just to set the scene of what is happening: essentially, what we want is to prove something by leveraging the fact that those kinds of statements were already shared over TLS 1.3, over some TLS data, and then use them to be able to access other, further services, given that we already have this zero-knowledge proof in place. There are two types of proofs that you will see in the literature that people want to provide. The first one is what is called proof of provenance: that, for example, this specific blob of traffic that was sent over TLS belongs to a true connection with a specific server that I assume I am talking to. Let's say, for example, that you want to prove that you indeed talked to the correct bank; then you also show that this TLS connection was performed with the specific server that hosts the website of that specific bank. And there is a second one, which is the one we are more interested in, which is just saying: there exists this specific statement over the encrypted data sent over a TLS connection, and it corresponds to a true connection to a server; hence I am leaning on the statement rather than on the provenance of that specific data. In this talk we are going to be interested in the proofs of statements, because we think that is the one that is more aligned with privacy-preserving technologies, while the other one is less privacy-preserving: you are still revealing that you are talking with a specific server, and hence over time you would be revealing the browsing history of the users. So we are less interested in that, but you will see many papers that are mostly interested in these proofs of provenance.

Okay, so let's give a simple example. Let's say that you know that this TLS 1.3 traffic, encrypted with AES-GCM in an AEAD setting, has the format of JSON and is the result of an HTTP request, a query and response. And you know that in this JSON there exists a field, a key, that is called "age", and the value related to the key "age" is an integer that is over the number 18. What you want to show, then, is the following:
that it was properly committed as transmitted data over TLS; that the data conforms to a JSON context-free grammar, and I say this with an asterisk, because JSON is actually not context-free, it is context-sensitive, but if we assume an abstraction of JSON as a context-free grammar, we can indeed prove this; that the field "age" exists at the top level of this specific JSON document, so that it is the first object of the JSON document and not, for example, the third object, which assures you that it is in the correct context; that the value associated with "age" matches the grammar rule for integers, so whatever value follows in the pair of "age" plus value, you know that value is indeed an integer; and then you show that the integer is bigger than something, whatever you want. In this talk we are going to be talking mostly about the first point, but I will be giving you pointers on how to do all of the next ones. Just looking at this, maybe you now agree with me that this is indeed a complex system.

Okay, so the first point that we have is: how can we indeed commit to this specific data that is sent over TLS? The first protocol that proposed this is a protocol called DECO; I don't specifically remember in what year, but they are the ones that first proposed this, and what they propose is indeed to add a third party to the TLS handshake, so that this third party can be committed to that specific encrypted traffic that you are sending over TLS. You will see in the literature that this is one of the reasons why these kinds of protocols are called three-party handshake protocols, or 3PHS, because you are adding a third party. Sometimes they are also called zkTLS, which is a name I personally don't really like, because you are not proving TLS, but rather proving that certain statements exist over data carried by this protocol that is TLS.
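To make the target concrete, here is the statement from the "age" example above evaluated in the clear on the plaintext, as a small sketch; the zero-knowledge machinery would establish exactly this predicate without ever revealing the JSON itself. The function name and payloads are made up for illustration.

```python
import json

def check_age_statement(tls_plaintext: bytes, threshold: int = 18) -> bool:
    """The predicate the zero-knowledge proof would establish, checked
    here directly on the plaintext: the payload is well-formed JSON, the
    key "age" exists at the top level, its value matches the grammar
    rule for integers, and that integer is over the threshold."""
    try:
        doc = json.loads(tls_plaintext)
    except ValueError:
        return False  # not well-formed JSON at all
    if not isinstance(doc, dict) or "age" not in doc:
        return False  # field absent at the top level (wrong context)
    value = doc["age"]
    if not isinstance(value, int) or isinstance(value, bool):
        return False  # value does not match the integer rule
    return value > threshold

assert check_age_statement(b'{"age": 25, "name": "x"}')
assert not check_age_statement(b'{"age": 12}')
assert not check_age_statement(b'{"user": {"age": 25}}')  # nested, not top level
assert not check_age_statement(b'not json at all')
```

Each `return False` branch corresponds to one of the proof obligations listed above: well-formedness, correct context, the integer rule, and the range check.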
So you are more interested in the statements than in TLS itself, which is why I disagree with the name, but you will see in the literature, or in some activity that happens over social media, that they are called zkTLS. What these protocols give us is a modified TLS handshake that allows exporting statements over the TLS channel to a designated verifier. There are many other follow-up works, like the ones I have put on this slide, and you can check those papers if you are interested. But the problem with all of these protocols is that they have limitations. For the majority of these protocol designs, you will see that they are fixed to TLS 1.2 or below, and we actually wanted to provide something for TLS 1.3. Why did we want this? Because TLS 1.3 is the newest version of the protocol, the one with the strongest security properties, and the most efficient one; and indeed, recently the IETF TLS working group has frozen TLS 1.2, meaning that there will be no new extensions added to the protocol, with the aim of eventually deprecating TLS 1.2. So we think that if we are going to provide any kind of system, it should target the latest version of TLS, which is TLS 1.3. These prior works also lack a formal security analysis, which is something that we did provide in our work: we essentially reduce the security of our construction to the core security of TLS 1.3, which has itself been formally analyzed and also verified. There were efficiency concerns with previous work, because they use some really slow algorithms, and sometimes they even use algorithms that have by now been proven broken. They provide no support for AES-GCM or ChaCha20-Poly1305, which are the most popular cipher suites used, at least for TLS 1.3. They also don't provide client privacy: in the aim of providing these proofs of provenance, they eventually reveal to other parties which specific server a specific client is talking to, which we don't like, because it essentially means that you are revealing the browsing history over time. We solve that by using a ring-signature-style zero-knowledge proof that I will introduce later in the talk. And they have no open-source implementations. So we wanted to solve all of these, and we did, with this work that is called DiStefano. If you are eventually interested in seeing a really brief talk on this specific protocol, I invite you to attend NDSS, where we will present the paper.

Okay, essentially, how do they work? We usually call these protocols designated-commitment TLS, or DC-TLS, protocols, and essentially they have three phases. But it is important to note that these phases really don't happen sequentially; many of them happen in
parallel. I know there is a lot of confusion in the community, where sometimes these phases are thought of as happening sequentially; you have to think of them in parallel. So bear in mind that we have these three phases: the handshake, the query, and the commit. Prior to introducing all of the phases, let's start by remembering how TLS 1.3 works. Essentially, TLS 1.3
has two phases. It has the handshake phase, in which the aim is essentially to generate shared secrets: one that is going to be used to encrypt the messages of the handshake itself, and another that is going to be used to encrypt the messages of the record layer, and also to achieve authentication and integrity with the server that you are supposed to be talking to. Then there is a second phase of the TLS 1.3 protocol, which is the record layer: using the shared secret that you just derived, you are going to send encrypted messages, and eventually decrypt messages, at the record layer. What is really interesting about the TLS 1.3 design decisions is that, when they proposed the protocol, they wanted to achieve the highest level of security properties they could, and one of the decisions the TLS working group made at the time, with the aim of providing stronger security properties, was to encrypt as many parts of the handshake as possible. You will see in this diagram that from the EncryptedExtensions message onwards, all of the messages of the handshake are encrypted. The reason they decided to do this is so that no adversary looking at the network is able to see, for example, the Certificate message from the server; they wanted to hide as much of the TLS handshake as possible. Indeed, there is even an extension for TLS 1.3, called Encrypted Client Hello, that encrypts the messages that are otherwise not encrypted in TLS 1.3. It is also a protocol that aims for efficiency, and there is even a mode of TLS 1.3 called 0-RTT, because they wanted to provide really immediate connections from a client to a specific server. I am noting all of this because these are the core differences with TLS 1.2.
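Since the handshake-phase key derivation comes up repeatedly in what follows, here is a minimal sketch of the start of the TLS 1.3 key schedule from RFC 8446 (HKDF-Extract, HKDF-Expand-Label, Derive-Secret), using only the standard library. The DH output and transcript bytes are placeholder stand-ins, not a real handshake.

```python
import hashlib
import hmac

HASH, HLEN = hashlib.sha256, 32

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # RFC 5869: HKDF-Extract is HMAC keyed with the salt.
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        out += block
        counter += 1
    return out[:length]

def hkdf_expand_label(secret: bytes, label: str, context: bytes,
                      length: int) -> bytes:
    # HkdfLabel structure from RFC 8446, section 7.1.
    full = b"tls13 " + label.encode()
    info = (length.to_bytes(2, "big") + bytes([len(full)]) + full
            + bytes([len(context)]) + context)
    return hkdf_expand(secret, info, length)

def derive_secret(secret: bytes, label: str, messages: bytes) -> bytes:
    return hkdf_expand_label(secret, label, HASH(messages).digest(), HLEN)

# Sketch: early secret -> handshake secret -> the two handshake-traffic
# secrets. The DH value and transcript are fake placeholders.
shared_dh = b"\x11" * 32                  # stand-in for the (EC)DHE output
transcript = b"ClientHello||ServerHello"  # stand-in for the real transcript
early = hkdf_extract(b"\x00" * HLEN, b"\x00" * HLEN)
derived = derive_secret(early, "derived", b"")
hs_secret = hkdf_extract(derived, shared_dh)
c_hs = derive_secret(hs_secret, "c hs traffic", transcript)
s_hs = derive_secret(hs_secret, "s hs traffic", transcript)
assert len(c_hs) == HLEN and c_hs != s_hs
```

In the protocols discussed next, these same derivations are what has to run inside secure computation, because neither the client nor the verifier alone is allowed to hold the full secrets.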
So if you are trying to design something that was made for TLS 1.2, you will have to redesign it for TLS 1.3; it is nontrivial to take something that works for TLS 1.2 and just casually move it to TLS 1.3. It doesn't really work that way, because the handshakes are really different.

Okay, but essentially, from the side of these DC-TLS protocols, what is it that we want to do? What we want is essentially to add this third party, which we are going to call the verifier, so that the client and the verifier secret-share all of their session data: the keys, and also the encryptions and decryptions of the messages (mostly the encryptions, actually). From the server's point of view, everything remains the same. So what we do in practice is secret-share the session between the client and the verifier, so that the verifier never has a complete view of the whole keys or the whole plaintext; they only have their shares. But because of this, the verifier is able to commit to the specific encrypted traffic that is forwarded by the client from the server, both at the handshake phase and at the record layer phase. And once you have a verifier that is able to commit to this specific traffic, you can create any kind of zero-knowledge proof statement over the commitments that you just created.

So, as I said at the beginning, these protocols define three phases, which again are not sequential: the handshake phase, the query phase, and the commitment phase. What is this handshake phase? It is essentially what I just told you: how to make the handshake secret-shared between the client and the verifier. Contrary to what you actually see in a TLS 1.3 handshake, where the client derives the key shares that you see in the ClientHello and ServerHello messages, here we are creating what is called in the TLS 1.3 standard a KeyShare. What is that? It is just the private and public part of an ephemeral key, where the public part, at least, you are going to share with the server, so that both the client and the
server can generate a shared secret. In this case, instead of just having the client generate its own KeyShare, both the verifier and the client hold shares of that specific KeyShare, so that the eventual public part is the combination of the additive shares held by the client and the verifier. This can be generated in a preprocessing phase, so it doesn't have to happen at the moment of the handshake; the only thing we have to be sure of is that the client sends the combined client KeyShare to the server. Once the server receives these key shares, the server sends the ServerHello, which essentially just fixes which specific cipher suites are going to be used in the TLS handshake, the signature algorithms and the encryption algorithms, and the server also generates the shared secret. So from the server's side, everything looks completely business as usual. From the client's and verifier's point of view, they now have to derive a lot of keys: the keys that are going to be used to encrypt the messages at the handshake layer, and eventually also some keys that will be used to encrypt the traffic at the record layer.

Yes, you can ask. "Can you explain a little bit, in terms of implementing something like this: does that mean that you would have to modify web browsers, or the TLS stack on the server side? And would it be possible, for example, to implement this using only a web browser extension?" Yeah, that is a great question. Later I have a slide, a very small slide actually, on the implementation. Essentially, what we did is that we implemented this completely in BoringSSL, and then we compiled BoringSSL into the Brave browser to be able to use it. You can also do it via an extension, but it will be way more
difficult, because the TLS stack doesn't really give you the ability to modify many of the TLS messages, for security reasons, which makes complete sense. But I know of some companies that have indeed tried very hard to modify TLS via an extension. It is easier when you have access to the browser and to its TLS stack. For the server, you don't need to modify anything, because everything looks business as usual to the server. But yes: you either modify the browser so that you have the modified TLS library, or you try really hard to call only some of the public TLS APIs so you can still provide this kind of protocol. Does that answer the question? "Yeah, yeah, thanks."

Okay, so as I was saying, right now what we are aiming to do is to derive all of these shared secrets. But obviously, if you have ever worked in MPC, you know that this can be very expensive, so we added some optimizations. The first optimization is that, instead of operating, if you think of TLS in terms of elliptic-curve cryptography, over the whole point, meaning the x and y coordinates, we know that a lot of the operations that are already optimized for TLS 1.3 only happen on the x-coordinate. So instead of having shares of the whole point, we have shares of the field element, which allows us a lot of optimizations, because we are now only working over the field. Because of that, we are also able to use a binary garbled circuit for all the two-party functionality: for the handshake key derivation, eventually also the record-layer key derivation, and for the encryption and decryption over AES.

Okay, so now the client and the verifier, and the server on its side, have derived the handshake secrets, which means that from now on, any messages sent by the server or by the client are going to be encrypted.
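Before moving on, the additive sharing of the key share described above can be illustrated with finite-field Diffie-Hellman in place of elliptic curves, since the additive structure is the same: the client and the verifier each hold a share of the secret exponent, the value sent to the server combines the two public parts, and the server notices nothing. One caveat: in the real protocol the parts are recombined inside the secure two-party computation, never in the clear as done here for checking, and the parameters below are demo values, not a vetted DH group.

```python
import secrets

# Demo parameters only: a well-known prime as modulus and a small base.
# These are NOT vetted Diffie-Hellman group parameters.
p = 2**255 - 19
g = 2

# Client and verifier each hold an additive share of the secret exponent.
a = secrets.randbelow(p)  # client's share
b = secrets.randbelow(p)  # verifier's share

# The public key share sent to the server combines the two public parts:
# g^a * g^b = g^(a+b) (mod p). Neither party ever learns a + b.
pub_joint = (pow(g, a, p) * pow(g, b, p)) % p
assert pub_joint == pow(g, a + b, p)

# The server behaves exactly as usual: one exponentiation with its secret.
s = secrets.randbelow(p)
server_pub = pow(g, s, p)
dh_server = pow(pub_joint, s, p)  # the server's view of the shared secret

# Client and verifier each raise the server's public value to their own
# share; multiplying the parts reconstructs the same shared secret.
dh_parts = (pow(server_pub, a, p) * pow(server_pub, b, p)) % p
assert dh_server == dh_parts
```

The last assertion is the whole trick: the DH output itself can stay split between the two parties, which is what lets the key-schedule derivations run as a two-party computation.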
Now we go to the second part of the handshake phase of TLS 1.3. This part is essentially the one used to achieve authentication and integrity: mainly, the server wants to show the client that it really is the correct server, and not a fake server pretending to be someone else. How is that usually done in TLS 1.3? The server provides a proof that it knows the private key associated with a public key, in the form of a signature; so you essentially provide a proof, which is the signature, and your public key really belongs to that specific server because it has been attested, via the PKI for example. One of the things we notice here is that, of course, the client can directly receive the Certificate and CertificateVerify messages from the server, but then the client also has to forward these specific messages to the verifier, so that the verifier is assured that the client is not lying and merely pretending to have a connection to a server. But we didn't want to forward them in the clear, because again, for us, that would be a privacy violation: you would be revealing the client's history over time to this third party, the verifier. So what we did is that we created our own zero-knowledge proof, which we call ZKPVS, which essentially shows that this specific signature has been properly created with a public key that belongs to a set, to a ring, without pinpointing which specific public key it is. Say, for example, that you have a set of public keys that belong to the trusted consortium of European banks; I don't know if that entity exists, but let's say it does. Then you have the public keys of these banks, and what you show is that whatever signature was created verifies under one of the public keys that exist in that set, but you do not pinpoint which one.
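The one-of-n idea can be conveyed with a classic Schnorr OR-proof (the Cramer-Damgård-Schoenmakers trick): simulate the branches you cannot prove, and let the real branch absorb the remaining challenge. To be clear, this is not the ZKPVS construction from the talk, which works over the server's actual certificate signatures; it is a toy discrete-log version, with deliberately tiny, insecure parameters, that only shows how a verifier learns that one key in the ring matches without learning which.

```python
import hashlib
import secrets

# Tiny Schnorr group for illustration only (p = 2q + 1, g has order q).
p, q, g = 23, 11, 4

def H(*vals) -> int:
    """Fiat-Shamir challenge derived from all commitments and keys."""
    h = hashlib.sha256("|".join(map(str, vals)).encode()).digest()
    return int.from_bytes(h, "big") % q

def or_prove(pubkeys, k, x):
    """Prove knowledge of x with pubkeys[k] == g^x, hiding the index k."""
    n = len(pubkeys)
    t, c, s = [0] * n, [0] * n, [0] * n
    for j in range(n):
        if j == k:
            continue
        # Simulated branches: pick challenge and response first, then
        # compute the commitment that makes the verification hold.
        c[j], s[j] = secrets.randbelow(q), secrets.randbelow(q)
        t[j] = (pow(g, s[j], p) * pow(pubkeys[j], -c[j], p)) % p
    r = secrets.randbelow(q)
    t[k] = pow(g, r, p)  # the one honest commitment
    c_total = H(*pubkeys, *t)
    c[k] = (c_total - sum(c[j] for j in range(n) if j != k)) % q
    s[k] = (r + c[k] * x) % q
    return t, c, s

def or_verify(pubkeys, proof):
    t, c, s = proof
    if sum(c) % q != H(*pubkeys, *t):
        return False  # challenges must add up to the hash
    return all(pow(g, s[i], p) == (t[i] * pow(pubkeys[i], c[i], p)) % p
               for i in range(len(pubkeys)))

# Three "servers"; we hold the secret key of the second one only.
secret = 7
ring = [pow(g, 3, p), pow(g, secret, p), pow(g, 5, p)]
proof = or_prove(ring, 1, secret)
assert or_verify(ring, proof)
```

Note how the verifier's checks are identical for every index, so nothing in the proof singles out which branch was the honest one.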
So what we are doing here is essentially preserving the privacy of the identity of the server that the client is talking to, by just showing this zero-knowledge proof. That is the attestation you make to the verifier to say that the client is indeed talking to a correct server.

Okay, another thing to note here: as you see in this diagram, the client, which is the one in orange, is forwarding a lot of encrypted messages to the verifier. What does that mean? Essentially: any message that I have received from the server, I have to forward to the verifier. Why do I have to do this? Because the verifier has to commit to this specific data prior to revealing their own key share. And why does the verifier need to reveal their key share? Because the client eventually needs to decrypt that data. But because AES in the TLS 1.3 setting, which is the AEAD setting, is a non-committing cipher, we cannot just do this. We cannot just forward encrypted blobs to the verifier, have the verifier reveal their key share, and then have the client decrypt all of the messages. We cannot do this naively, because it means that the client can launch a non-committing attack, which I will show you later in some slides. So what we need to be sure of is that, first, the verifier commits both to the encrypted blocks of messages and to the specific keys that were used to encrypt those specific messages. You will see later in the slides exactly how we do it, how we transform AES in this AEAD setting into a committing cipher. As you will see, the last messages here are just about deriving the specific keys that will be used later to encrypt the record layer, plus a lot of HMAC messages, as I call them, the Finished messages, which just give you the ability to say that this handshake has not been modified: the client and the server arrive at the same view of the messages, so no one has injected or modified any of them. This also happens business as usual, as in TLS 1.3.

Okay, so one of the things... "Can I ask another question, on the previous slide? Can you clarify a little bit this public set of keys? These are the server keys, right? So the idea is that you don't reveal exactly which server the client communicated with?" Yeah. "Okay, so wouldn't it be possible to also just prove something about a higher-level certificate, or key, in the cert chain, instead of doing something like this?" Oh, that would be interesting. Well, you would have to prove that that specific public key, in relationship to the PKI, was the one used in the connection; that is the thing you would have to prove as well, I think. Yeah, I think it would be
interesting; if there were an easier way to do it, that would be nice. One thing this zero-knowledge proof does allow us is that we don't have to send it at this exact point of the handshake; we could also send it later, because you can heal the authentication from the verifier's side later on. I don't know; I would have to think about the PKI idea, yes.

"And this security model is kind of interesting here, because the anonymity really relies on this public set of keys being known, maybe decided ahead of time, and not mutable." Yes, so it is set anonymity, right? If the set is very big, then you have more anonymity; but if it is really small, because there exists only one bank in the European Union, then you are still revealing exactly which one. So it really depends on that. We thought that in the majority of these cases it makes sense, because you are already trying to prove statements over connections to servers that are usually authorized at the policy level: you will have people asking for the national ID or bank statements, and that already happens with servers that belong to entities that are part of consortia at the legal level. "Yeah. But it's true what you're saying. Thanks."

Okay, now for AES. As I said, one of the goals of these DC-TLS protocols is to have the ability to commit to these specific encrypted blobs, so that later you can show, in zero knowledge, that a specific statement exists in this specific traffic. But it turns out that AES-GCM in the AEAD setting is a non-committing cipher. What is the specific attack that you can launch? You can find another key, let's call it k′, and you can create another message, let's call it m′, and they can arrive at the same specific ciphertext block. That is horrible in our case, because, if you remember from the previous diagram, what we essentially do is first forward to the verifier the encrypted version of the messages, the ciphertext blobs. But then the client can lie: by finding another key and modifying the message, and then saying: hey, verifier, give me your key share, and I am going to pretend that this is the true ciphertext, because it turns out that I found something that encrypts to exactly what I sent you previously, even though I changed the message. That is not great; it would kill our whole protocol, because it means the client can create zero-knowledge proofs over statements that were really not there in the encrypted blobs forwarded by the client from the server to the verifier. So what we did, on the contrary, is that we modified AES a bit: prior to the verifier revealing their key share, what you also send is a commitment to the specific key. We do this commitment by adding a specific mask that we call V′, so that we create a value that we call E_i, which you see here, which is essentially a commitment to the key that you hold. Once you have this commitment to the key, the verifier reveals their key share, and now the client can no longer lie: if they try to change the message behind a specific ciphertext blob, they will no longer be able to pass the check against the commitment to the key. So, in a way, we modified AES so that we can use it as the commitment scheme itself.
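What "non-committing" means can be seen with a toy stream cipher standing in for AES-GCM's counter mode (the real attack also has to handle the GCM tag, which this sketch ignores): the same ciphertext opens to different plaintexts under different keys. The second half sketches the fix described above, a masked commitment to the key that must be opened before the verifier's share is revealed; the real construction is GCM-specific, so the plain hash used here is only a stand-in for V′ and E_i.

```python
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Hash-based toy keystream standing in for AES-GCM's CTR keystream.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key, nonce, msg):
    return bytes(a ^ b for a, b in zip(msg, keystream(key, nonce, len(msg))))

decrypt = encrypt  # XOR is its own inverse

# The non-committing behaviour: for the SAME ciphertext, a different key
# yields a different plaintext that "decrypts" perfectly well.
key, nonce = secrets.token_bytes(16), secrets.token_bytes(12)
ct = encrypt(key, nonce, b"age=19")
key2 = secrets.token_bytes(16)
msg2 = decrypt(key2, nonce, ct)  # another plausible-looking opening
assert decrypt(key, nonce, ct) == b"age=19" and msg2 != b"age=19"

# The fix, sketched: before the verifier reveals its key share, the key
# is bound by a masked commitment; swapping keys later fails the check.
mask = secrets.token_bytes(32)                    # stand-in for the mask V'
commitment = hashlib.sha256(mask + key).digest()  # stand-in for E_i, sent first

def open_ok(mask, candidate_key, commitment):
    return hashlib.sha256(mask + candidate_key).digest() == commitment

assert open_ok(mask, key, commitment)
assert not open_ok(mask, key2, commitment)
```

The order of operations is what matters: commit to the key, then let the verifier reveal its share, and only then accept claimed plaintexts.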
Something to note here is... "Sorry, can you tell a little bit more about this attack? How easy is it, and how easy is it to find some meaningful message so that you can actually pull this off?" Yeah, it's not really difficult, but in the TLS 1.3 setting it is considered completely fine, at least between a client and a server, because it is assumed that both already have the same set of keys, and therefore the client would never arrive at a different key. In our case, because we are forwarding this to the verifier, the client would be able to cheat the verifier, though not the server. "Oh, I see, yeah." And there is a great paper, by Thomas Ristenpart and coauthors, that describes this specific attack; I have attached it to this slide. Yes, it is very fun.

Okay, one thing to note here is that I am mostly centering all of these explanations on using AES-GCM as the cipher suite over TLS 1.3 and as the commitment scheme we are going to use, and this is because GCM is based on a block cipher, which allows us to explain this much more easily. But if you remember, TLS 1.3 has other cipher suites that can be used for the encryption algorithm, including ChaCha20-Poly1305, again in an AEAD setting. ChaCha is a stream cipher, which means that this modified scheme will not work anymore; but we currently have some future work extending this to ChaCha.

Okay, so now, that is great: we have a way to introduce this specific third-party verifier into the handshake, such that the majority of the keys are secret-shared; the verifier never has the full view of the keys; the client does eventually see the whole plaintext, because they are able to decrypt; and the verifier is able to commit to the specific traffic sent within TLS 1.3, and is able to know that the
client will not be able to cheat and create different plaintexts that arrive at the same ciphertext. So that is great; that is what we did, essentially, by modifying the scheme in this way. In regards to the implementation: we did it in C++, on top of BoringSSL, and it is around 14k lines of code. We also did some experimentation in LAN and WAN settings, and the times we arrive at are good, in that the online phases of the protocol take us less than a second to run the handshake. We do have some increased costs, especially in preprocessing: if you remember, a previous slide said there is a preprocessing stage at the handshake layer, so that is a little bit more costly, and we also do a circuit preprocessing for the AES optimizations, which is also a bit more costly. But in general we arrive at pretty good times, which is great because it turns out that we stay within the timeouts allowed for TLS, which are typically between 10 and 20 seconds; so we will not time out, even when we add this specific third party, the verifier.

Okay, great. So far we have explained the handshake phase, but as I told you, there are other phases; well, we explained the handshake phase and also the commitment phase, but just to repeat: the commitment phase happens in parallel with the handshake phase, and every time you send an encrypted message, you have to make the verifier commit to it, using the modified AES. So those phases are in parallel. You can also see that the query phase is not really sequential either: one could think that the query phase only happens at the record layer, when, for example, you send an HTTP request to the server and you get the HTTP response back from that specific server; but if you think about it, the request to open a TLS connection is a query in itself. That is why I am telling you that these phases
are indeed in parallel rather than sequentially you start with a query phase will you say hey server open a connection with me so I can talk to you that's the query um the ser opens the handshake um and all of the encrypted messages have to be committed by committed by the verifier um so that later you can provide proofs over them in ckp so as you see they all kind of happen in parallel rather than sequentially but here what we are mostly interested right now is in the record layer phase so now we're actually going to be sending some queries because it's true that maybe we can create some zero knowledge proofs over the handshake messages do I see less of an application to actually create zero knowledge proofs over the handshake messages what we're mostly interested from the application side of side of view what we're more interested is creating zero knowledge proofs over whatever is sent as traffic on the record phas what we currently have right now is already the ability to create commitments and this is different some from some Pro some some prior works so in Prior works that you read in the literature you will see that eventually they expect the client to open and the Crypt the plain text and then create a separate commitment usually by using for example pedest and commitments and use that P and commitment in order to generate Ser knowledge PRS we didn't wanted to do that we didn't we wanted to rely on the fact that you can modify AES to make it a commitment uh scheme to directly create zero knowledge proofs over the AES why we thought it was much more interested is because you don't have them to provide a proof that whatever hidden data you have on the pestan commitment is equal to the plain text that you have on the AES encryption and also provide this proof we didn't want to provide the second proof we Direct wanted to work over AES also also interesting here is that as I told you um AES is a block Cipher which means also that you can selectively reveal 
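As a toy illustration of that block-wise selective reveal: the sketch below uses a SHA-256 counter keystream as a stand-in for AES in counter mode (it is not the real committing AES-GCM construction from the protocol, just an illustration of the principle). Opening the keystream for one block reveals exactly that block's plaintext and nothing else.

```python
import hashlib

BLOCK = 16  # bytes, same block size as AES

def keystream_block(key: bytes, i: int) -> bytes:
    # Toy PRF standing in for AES applied to a counter (as in CTR/GCM).
    return hashlib.sha256(key + i.to_bytes(8, "big")).digest()[:BLOCK]

def encrypt(key: bytes, plaintext: bytes) -> list:
    blocks = [plaintext[j:j + BLOCK] for j in range(0, len(plaintext), BLOCK)]
    return [bytes(p ^ k for p, k in zip(b, keystream_block(key, i)))
            for i, b in enumerate(blocks)]

def reveal_block(key: bytes, ciphertext: list, i: int) -> bytes:
    # Selectively open block i by handing over only that keystream block.
    # The verifier XORs it into ciphertext block i and learns nothing
    # about any other block.
    return bytes(c ^ k for c, k in zip(ciphertext[i], keystream_block(key, i)))

key = b"example-key-0000"
ct = encrypt(key, b'{"name":"ana","age":42}_padding__')
print(reveal_block(key, ct, 1))  # opens only bytes 16..31 of the plaintext
```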
For example, say you know in which specific block of the AES ciphertext the encryption of your age field sits, and you reveal just those specific blocks. That is what the DECO protocol, which I talked about in a previous slide, pointed out. But we didn't really want to do that, because we feel it would still be too much of a privacy violation to actually reveal certain plaintext blocks from the encryption. So we don't do that; we went directly to proving everything over it in zero knowledge.

Okay. What we said at the beginning is that we finish with these really nice commitments, but then there are a lot of things that you still have to prove. The first thing to prove, for example, is that whatever traffic you are sending over the TLS connection follows the appropriate format, so for example that the traffic follows the HTML or JSON format. There are other kinds of traffic that are also sent over TLS 1.3, like XML, or you can send binary data, but the majority of the time you're going to be using HTML or JSON.

Okay, so let's give an example. Say I have this JSON blob, and what I want to prove is that in the top-level object we have this field age, and that the field age holds an integer that is bigger than or equal to two. So how can we do this: prove in zero knowledge that this specific AES commitment contains that specific field, and that it follows the correct JSON format?

Okay, so now we are going to talk a little more informally about these points that I introduced at the beginning of the presentation. One of the things we realized is that there is already some very similar initial work that we can build on.
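Just to pin down that running example in the clear (no zero knowledge yet, and with a made-up blob), this is the statement the client ultimately wants to prove about the committed JSON:

```python
import json

# Hypothetical server response; the contents are made up for illustration.
blob = '{"age": 3, "profile": {"age": 1, "name": "ana"}}'

doc = json.loads(blob)  # 1. the blob parses, i.e. it is well-formed JSON
# 2. "age" exists in the top-level object (not only in a nested one)
assert "age" in doc
# 3. its value is an integer that is bigger than or equal to two
assert isinstance(doc["age"], int) and doc["age"] >= 2
print("top-level age:", doc["age"])  # prints: top-level age: 3
```

Note that the nested object also has an age field with a different value; the point of the format proof later is exactly to pin the attested occurrence to the top level.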
This initial work is called Reef, and it's made by some amazing people, especially Elizabeth Margolin, who was my intern this summer. She's really amazing, so if people want to talk more about this work she introduced, you can always reach out to her. Essentially, what this work found out is the following. Say you have a document, which could be some encrypted traffic, an encrypted PDF, many things, and you want to show that in that document there exists a statement that conforms to a specific regex. Think, for example, that you have a document that contains a password, and you want to show that the password conforms to a specific regex. You of course cannot reveal to the server in the clear what that document is, because it's a password, but you still want to check that the password follows whatever policy is set by that server. What they realized is that you can indeed do this with zero-knowledge proofs, by interpreting the regex expressions in an interesting way: any regex expression can be converted to what is called a deterministic finite automaton (DFA). We see it here, for example, with this specific regex expression that you can convert to the diagram I have here: if you have a b, you stay in state zero; then if there's an a, you move to state one; then if there's a b, you are basically done, because you have finished. And in this case, say that we have the string "aaab": you say that a maps to zero and b maps to one, that's the mapping, and then you can represent all of that on a specific deterministic finite automaton.

This is great, because it turns out that if you interpret the regex as a deterministic finite automaton, you can also create this really nice match function that tests the different conditions the automaton walks through: given the current character in this document, does the document match the specific regex expression that I have just turned into a finite automaton, with its different states that you move along? And that is great, because it means that we can arithmetize this specific match function and later prove zero-knowledge statements over it; for example, you can use a zkSNARK for this. It is quite costly if you think about it, because you would have to go character by character, so what they actually present in the paper is an optimization that lets you use what are called recursive proofs: what you're really interested in is proving the last state, and if the last state is valid, then everything that came prior to it is valid as well. They use precisely that, and they also made some modifications to the definition of a deterministic finite automaton so it can be much more optimized, but the core idea remains: a regex, which is a specific set of instructions if you think about it, can be interpreted as a deterministic finite automaton. And thinking about it that way, you can already see that if we have a grammar, some specific format like JSON or HTML, we can also expand the idea and maybe interpret it in a similar way. They also use what are called lookup tables: when they eventually do the zero-knowledge proof over these specific statements, to see if they match the specific regex, they use a protocol called nlookup, which is really nice to use with any kind of recursive proof system. So with that, we already have a hint of what to do.
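To make the DFA idea concrete, here is a toy sketch (an illustrative DFA for the regex a+b, one or more a's followed by a b; this is not Reef's exact optimized construction): matching is a sequence of transition-table lookups, which is roughly what gets arithmetized, and only the final state needs to be checked.

```python
# Transition table of a DFA for the regex a+b over the alphabet {a, b}.
# Each step of the match loop is one table lookup (the kind of step
# that nlookup-style arguments make cheap inside a recursive proof),
# and acceptance depends only on the last state reached.
DEAD = -1
DELTA = {(0, "a"): 1, (1, "a"): 1, (1, "b"): 2}
ACCEPT = {2}

def matches(doc: str) -> bool:
    state = 0
    for ch in doc:                          # one lookup per character
        state = DELTA.get((state, ch), DEAD)
        if state == DEAD:
            return False
    return state in ACCEPT                  # only the final state matters

print(matches("aaab"))   # True
print(matches("ba"))     # False
```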
We now want to check that this specific traffic corresponds to, or is aligned with, a specific HTML or JSON format. But just to bring it home again: Reef is only for regex checking, only for specific regex statements; the same idea can probably be expanded, or thought about, when you're actually working over JSON or HTML. So, can we expand it to the web? One point to give you already is that you can still use Reef as-is for the web. For example, one of the things I said at the beginning is that you want to provide these statements over HTML and JSON such that you can prove that, I don't know, a field is over 18; but maybe you also want to prove to a server that whatever password you send over TLS conforms to a specific regex, and in that case you can use Reef directly: instead of using the separate vector/polynomial commitment that they use in the Reef scheme, you swap in the AES commitment scheme directly, and then everything should work to do regex checking over the TLS 1.3 encrypted data as well.

Okay, but now, how to expand to JSON? Let's just look at JSON, because it's the easier one compared with HTML. Let's say that JSON, in its most abstract form, is a context-free grammar. Again, it is not a pure context-free grammar, parts of it are context-sensitive, but let's assume that its abstract form is a context-free grammar. What we want to prove is that this is well-formed JSON: essentially, that it conforms to the JSON grammar, and also that the fields appear in the right context. So, going back to the example that we showed here, we want to prove that age belongs to the top level of the JSON, and that, for example, a nested object doesn't also have an age that is different: you actually want to prove that the first occurrence of age, the one we attest is bigger than or equal to two, is the right one.

Okay, so I'll give you another example. In this other example we have this ID, and we want to prove that whatever follows this ID is indeed what we're looking for, that it is indeed tx001. Then you interpret the JSON as a context-free grammar. Kind of, but not really, because as I told you it's context-sensitive, but let's try to think about it as a pure context-free grammar. The JSON is an object; the object has members, as we see here; the members are either a pair, or a pair and members; every pair is a string and a value, as you see here, a string followed by a value; a value is a string, a number, an array, or an object; if it's an array, then it has elements; and an element is a value, or a value and elements. So essentially we can interpret the JSON in this way.

What is also interesting to point out is that we can interpret this as a parse tree, and if we interpret it as a parse tree, something that looks literally like a tree, then it means we can also reason about this tree almost like a deterministic finite automaton. That way we can leverage the machinery we already have from Reef for realizing things as a deterministic finite automaton, and then expand it to JSON by leveraging the fact that I can interpret it as a parse tree. So, for example, say a verifier wants to verify that the ID of the first transaction is indeed that thing we saw here, the tx001. What you want to show is that a specific subtree of the parse tree is rooted at a specific pair node, where you have as the leaves the ID and the specific value that you're attesting, and you show that this specific path indeed leads to the specific value you're searching for, and you prove all of that in zero knowledge.
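A minimal sketch of that path check, using Python's own JSON parser as a stand-in for the context-free-grammar parse tree (the blob and the path are made up; in the real system, each step of this walk becomes a zero-knowledge statement over the committed ciphertext):

```python
import json

blob = '{"transactions": [{"id": "tx001", "amount": 7}, {"id": "tx002"}]}'

def walk(tree, path):
    # Follow a root-to-leaf path of object keys / array indices,
    # i.e. the subtree walk described above.
    for step in path:
        tree = tree[step]
    return tree

tree = json.loads(blob)            # parsing doubles as building the tree
path = ["transactions", 0, "id"]   # pair node of the first transaction
print(walk(tree, path))            # prints: tx001
```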
So essentially what you have is: given the parse tree of a JSON document, we check, first, is there a tree for that document? Second, does the tree contain the data that you're looking for, i.e., does it have the specific path that you walk? And third, is the tree valid for that specific document: does the tree conform to the grammar and the expected structure of JSON? And I will stop there, because I've already given you the realization that we can leverage what Reef found out, namely that a deterministic finite automaton can be easily proven in ZK, and that with the context-free grammar we can manipulate or modify things a little bit so that we end up at the same idea. There are many ways to do this, so expect the future work to actually show it. Another thing we realized, again, is that we don't have to re-commit to whatever data is presented by using Pedersen commitments or something like that; rather, we already have those specific blocks encrypted and committed to via AES. So the future work is both showing how the data conforms to the JSON and HTML grammar, and showing that we can indeed prove this over AES, which is known to be very complicated, but there are ways to optimize it.

Okay. So now, as I said at the beginning, this is a very complex system, and hopefully I have convinced you of that, but I have already given you some pointers on exactly what you have to do when you're creating these statements. Putting it all together: you use DiStefano to provide the commitments by leveraging AES; you show that the JSON is well-formed; you show that the key exists in the correct path of that specific JSON; and you show that the value associated with that specific key conforms to being an integer and, in our case, where we wanted to say it is bigger than or equal to two, that it is indeed bigger than or equal to two, by using any kind of range proof in zero knowledge.
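Those four steps can be sketched as a plaintext mock (here the "commitment" is just the string itself, and every check that would be a zero-knowledge proof in the real system is an ordinary runtime check):

```python
import json

def attest(committed_blob: str, key: str, lower_bound: int) -> bool:
    # Mock of the statement proven in ZK against the AES commitment.
    try:
        doc = json.loads(committed_blob)   # 1. well-formed JSON
    except json.JSONDecodeError:
        return False
    if key not in doc:                     # 2. key exists at the right
        return False                       #    path (top level here)
    value = doc[key]
    if not isinstance(value, int):         # 3. value is an integer
        return False
    return value >= lower_bound            # 4. range check

print(attest('{"age": 3}', "age", 2))      # True
print(attest('{"age": 1}', "age", 2))      # False
print(attest('not json', "age", 2))        # False
```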
Okay. Expect that as part of the upcoming work. I still have a lot of questions, though, because as I already told you it's a very complex system, and we can make it even more complex. For example, at the moment we're only attesting to statements about the responses from the server, but it would be great to create a protocol that attests to statements both about the queries sent from the client and about the responses from the server, so that you show that for a given query you get the correct response. That would be really interesting. Also note that this cannot be easily extended to other protocols: if you want to attest to statements to be used on, for example, the Signal protocol, or an encrypted DNS protocol, or whatever protocol that provides you confidentiality, note again that you will have to modify that specific protocol so that you can generate those commitments. What we did for TLS 1.3 cannot just be exported one-to-one to Signal, for example; you would have to do the whole protocol for it again.

And why am I interested in this for the web? Mostly for attestations of anti-fraud checks. For example, say there is a specific response sent from the server, and in that response there is what is called a honeypot field. A honeypot field is essentially a field where you tell the user "please don't fill this field of the form", but you know that automated bots will fill it, because they just do it automatically. Then you want to show to the verifier that this specific user filled the field, and hence it's a bot. You don't want to reveal that in the clear, because it turns out that these honeypot fields are sometimes proprietary to some companies, so you don't want to show exactly which honeypot field you are using.
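A tiny sketch of the honeypot mechanism itself (the field name here is hypothetical; real deployments keep theirs proprietary, which is exactly why you would prove the outcome in ZK rather than reveal the field):

```python
HONEYPOT_FIELD = "website_url"  # hypothetical; humans are told to leave it empty

def looks_like_bot(form_submission: dict) -> bool:
    # Bots auto-fill every field, so a non-empty honeypot flags them.
    return bool(form_submission.get(HONEYPOT_FIELD, ""))

print(looks_like_bot({"name": "ana", "website_url": ""}))            # False
print(looks_like_bot({"name": "x", "website_url": "spam.example"}))  # True
```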
You just want to show in ZKP that, for example, this user is not a bot, and then you can use that ZKP to later show to other websites or other services that this user has been deemed honest.

Okay, and just to end, because I'm running out of time, I want to give you a little bit of a discussion that I wanted to bring up, because sometimes the ethical discussion of these specific systems is lacking. I still consider this to be not a one-size-fits-all solution. What I mean by this is that it could be great, for example, for attesting that you are indeed older than 18, or that you are a human, for whatever definition of a human, but it is also restrictive. Think about it: there may be a user that has no bank account, or has no passport, or has no national identifier because they chose so, and there are many humans that indeed fall into that category. In that case we shouldn't build the system so that it only accepts these specific proofs, because that means it will be restricted for a big part of humanity. It will not be the solution for them; rather, several solutions have to be put in place, also for the cases of people who don't want to have bank accounts or don't have national ID documents. Even more worrying for me is when this is used, for example, to show financial liquidity: to show that you have a bank statement with a savings account holding more than 15,000. It might be great to actually create those proofs, but it should not be the only way to assess the financial liquidity of the people using the services; this should not be the solution that fits all of the cases.

And also just to note: this kind of system can enable surveillance, and there should be a way to enforce at the client-side level that whatever proofs you are generating are not made so that you can surveil your users, for example by checking which specific traffic or conversations they're having over TLS. Rather, they should be restricted to the specific proof that we want to make in order for users to be able to access other services. So I just wanted to bring this up as an ethical consideration: when we design these systems, first, they are not one-size-fits-all, and second, they have to be carefully analyzed so that we are not enabling surveillance of the users.

Okay, and with that: thank you. I think I'm on time for questions; I know we have like one minute.

Thank you. Are we going to be able to have a link to the presentation?

Yes, I can put it on my website, or send it as well.

Can I ask a quick question about what you said at the very end, about this not being appropriate for all situations? The example that you had, I thought, was a good application: say you want to show something about having the financial assets without having to be specific and say "I have at least this much, in this bank" and whatnot. You might be able to say, with a zero-knowledge proof, "I have sufficient assets" in some vague form, and be able to give less information. This seems to me like a good thing.

Yes, so I argue that it's both a good thing and also not a great thing, so maybe I should clarify. What I meant is exactly that it shouldn't be the only way: for example, that you can only provide proofs that you have more than $1,000 in your bank statement, and that you're only able to provide these from a specific set of banks, with any other banks not allowed. I've seen some discussions like "oh, we should only do this for these specific regions", and that already creates a kind of class system of the people that can get access to bank accounts at those banks. But the other side, actually not having to reveal the whole details of your bank statement, is also
great. Yeah, thanks.

Hi Sofía, hello. So I guess you're assuming that the schema of the JSON is known to everyone, that it is sort of a fixed thing?

Yes, or at least some parts of the context-free grammar are similar for all of the statements that you're sending over the TLS data. But we did have a long conversation on what we would be able to do if the server introduces some specific slight changes, because you can modify the JSON in that way. Currently we have the solution for the general case, but it would be great to also handle the context-sensitive part of the JSON.

Okay, thanks. Thank you so much.