Good morning everybody. Welcome to our machine learning seminar, which I co-chair together with Elisa. Elisa, if you could say a few words about the speaker and continue, that would be great.

Good morning everyone. Today we have a speaker who will be giving an exciting talk about voice biomarkers, which is a really hot topic in the area of digital health. She is a third-year PhD student at the Luxembourg Institute of Health, where she has been doing her research on identifying biomarkers from the voice, in the context of fatigue monitoring in COVID-19 patients and also type 2 diabetes. So the floor is yours; I'm looking forward to your talk.

Thank you so much, Elisa. I'm happy to be here and to share some of my knowledge. Today we will be exploring voice AI, as the title mentions: the technologies that exist, the algorithms, and the applications, with a special focus on applications in healthcare.
I won't present myself further; Elisa did the job, thank you so much. Before we start, we need to know: what is voice AI? I used the picture that you created for communicating the lecture series, by the way, thank you for it. Voice AI, as it is known, refers to artificial intelligence systems that understand and generate human speech, leveraging machine learning algorithms to interpret spoken language and respond, or take actions, based on that input.

It is also important to know that there is some confusion between voice AI and voice assistants: voice assistants are one of the technologies within voice AI, but they are not the same thing. Now, we are talking about the voice, and voice is sound. What is sound, what constitutes it, what are its domains? First, as we all know, it is a wave, a signal progressing over time, so it has a time domain, and here we can represent it with the waveform, which is the amplitude as a function of time. There is also the frequency domain, because the signal consists of many distinct frequencies and can be expressed as a sum of these frequencies; here we have the spectrum, which can be obtained by applying the Fourier transform to the time domain.
The spectrum gives the magnitude as a function of frequency. But the sound itself is more complex than the time domain alone or the frequency domain alone: it is the composition of both, time and frequency, since the spectrum itself varies over time. Here we are talking about the time-frequency domain, obtained by applying the short-time Fourier transform to the time-domain signal, and we get the spectrogram, which is like a fingerprint, a visual representation of frequency on the y-axis against time on the x-axis. The horizontal lines we see represent the harmonics, and the brightness of these harmonics represents the intensity of the signal at that frequency.
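For illustration, here is a minimal sketch of that step: computing and plotting a spectrogram via the short-time Fourier transform. It assumes the librosa and matplotlib packages and a placeholder file name "sample.wav"; it is not the speaker's actual tooling.

```python
# Minimal sketch, assuming librosa and matplotlib; "sample.wav" is a placeholder.
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load("sample.wav", sr=16000)            # time domain: the waveform
stft = librosa.stft(y, n_fft=1024, hop_length=256)      # short-time Fourier transform
spec_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)

librosa.display.specshow(spec_db, sr=sr, hop_length=256,
                         x_axis="time", y_axis="hz")    # time on x, frequency on y
plt.colorbar(format="%+2.0f dB")                        # brightness encodes intensity
plt.title("Spectrogram of a voice recording")
plt.show()
```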
From these domains we can actually extract thousands of voice features, and here I will present only some of the ones commonly used to describe the voice and the signal itself. I have to mention that nowadays the most commonly used packages and software to extract these audio features are openSMILE and Praat, with its Python interface Parselmouth. I won't go through all of the features, just a few. There is the pitch, which measures the average vibration rate of the vocal folds; we can use it to analyze the variability and control of the voice. There is the shimmer, for example, which measures the amplitude variation from cycle to cycle and assesses voice quality and potential vocal fold pathologies.
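For illustration, a minimal sketch of functional feature extraction with the opensmile Python package (assumed installed); the feature set shown is eGeMAPSv02, while larger sets such as ComParE_2016 yield thousands of descriptors. The file name is a placeholder.

```python
# Minimal sketch with the opensmile Python package (assumed installed);
# "sample.wav" is a placeholder recording, not data from the talk.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,       # 88 functionals; ComParE_2016 yields ~6k
    feature_level=opensmile.FeatureLevel.Functionals,
)
features = smile.process_file("sample.wav")            # one row of named descriptors
# Inspect pitch- and shimmer-related descriptors, for example:
print(features.filter(regex="F0|shimmer", axis=1).T)
```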
There is also the intensity, as the name says, which represents loudness and evaluates vocal strength and effort. We can use these audio features to feed machine learning algorithms and train them from scratch for classification tasks, regression tasks, and so on. But not only that: we can also use the spectrograms I have just shown to train deep learning algorithms, which are the most commonly used. We transform the signal into spectrograms and feed them into CNNs, convolutional neural networks, to get feature maps; then, if we want classification, for example, we add a fully connected layer (a sketch of this route follows below), and if we want to develop other technologies, like generating text from speech, we can use the long short-term memory (LSTM), which is a recurrent neural network. Now, I said that we can use these features and the spectrograms to build algorithms from scratch, but we can also apply transfer learning, which is typically used when, for example, we have a very limited sample size.
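For illustration, a minimal sketch of the CNN route just described, assuming PyTorch; the layer sizes and input shapes are illustrative, not the architecture used in the talk.

```python
# Minimal sketch: CNN feature maps on a spectrogram, plus a fully connected head.
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(            # convolutional feature maps
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(          # fully connected classification head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, x):                         # x: (batch, 1, freq_bins, time_frames)
        return self.classifier(self.features(x))

model = SpectrogramCNN()
dummy = torch.randn(4, 1, 128, 256)               # four placeholder spectrograms
print(model(dummy).shape)                         # torch.Size([4, 2])
```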
What is transfer learning exactly? As its name says, it is transferring the knowledge learned on a task A to a task B. This is done with a pretrained network: a network is trained on task A, and then we can either fine-tune it on a downstream task with our new data, so that it later makes the predictions on task B, or we can extract embeddings, features, from the pretrained network and feed them to machine learning algorithms that we train to make our predictions on the task; the second route is sketched below.
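For illustration, a minimal sketch of the second route (extract embeddings from a pretrained network, then train a classical model), assuming PyTorch, torchvision, and scikit-learn. An ImageNet ResNet-18 stands in here for the audio-pretrained models named next; inputs are placeholder spectrogram images.

```python
# Minimal sketch of "embeddings from a pretrained network + classical ML".
import torch
import torch.nn as nn
from torchvision import models
from sklearn.linear_model import LogisticRegression

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()                      # drop the head: 512-dim embeddings out
backbone.eval()

with torch.no_grad():
    X_imgs = torch.randn(10, 3, 224, 224)        # placeholder spectrogram images
    embeddings = backbone(X_imgs).numpy()        # shape (10, 512)

y = [0, 1] * 5                                   # placeholder labels
clf = LogisticRegression(max_iter=1000).fit(embeddings, y)
```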
Here I present the pretrained algorithms commonly used on audio signals, like VGGish, which is pretrained on YouTube; not on the videos themselves, they converted the videos to sound, millions of audio samples, and trained a VGG-based architecture from which embeddings with 128 dimensions are extracted. There is also BYOL-S, for example, an algorithm that is pretrained in a self-supervised way on speech recordings only: not all the audio you can find on YouTube, but only speech-related sounds and recordings.

Now, where can we apply these technologies, these kinds of predictions and methodologies? These are the common and booming applications and existing technologies: speech recognition, which converts spoken language into text; text-to-speech; and natural language understanding, where we interpret the meaning behind the spoken input and respond to it appropriately.
There is also voice cloning, which we discussed previously in the session on ethics and how AI is taking over: creating a digital replica of a person's voice from a small sample. The main objective for which it was created is to help, for example, people with voice impairments to communicate with their loved ones, or to communicate in presentations, not the malicious applications.

We can also recognize emotions, by analyzing vocal patterns to detect the emotions within a speaker's voice. And now that we have talked about emotion, which plays a part in health as well: these technologies are applied in the digital health field. What is digital health? It is the field of knowledge and practice associated with the development and use of digital technologies to improve health. Currently we have a lot of connected devices that we already use in our daily life, like the smartwatch or the smartphone, to track our sleep, track our heart rate, et cetera. These take digital measurements that are directly related to biological characteristics, like the heart rate or blood pressure, or, if we talk about diabetes, the readings of glucose monitoring devices. This is what we call digital biomarkers.

And this is actually the focus of our group, the Deep Digital Phenotyping research unit, whose main objective is to pave the way for modern, real-life healthcare research studies by being the interface between healthcare and digital phenotyping and digital biomarkers. There we are also working on voice as a biomarker.

Why voice in the first place? Because voice is natural, energy-efficient, and cheap to collect in real life: everyone already has a smartphone with which we can record our voices, and we can do that on a daily basis. This can then be used for remote monitoring and telemedicine, and, if we develop vocal biomarkers, as a proxy to monitor treatment and its effectiveness in real life, as a digital clinical endpoint. But before we go further into vocal biomarkers, we need to know how voice is produced. Sorry, I will just drink some water.
Alright. Maybe you noticed that my voice changed before and after drinking water. I am super stressed right now, because I am presenting my work in front of you, and with the stress my heart rate is speeding up, and then there is this kind of dehydration: I am losing fluids and my throat is dehydrated.

So we can already detect stress from voice; this is, in general, how it is done. Back to the voice mechanism: it is characterized by three different levels. There is the cognitive level, where the speech process starts to be planned; then the instructions are delivered to the physiological level, to the muscles, in order to make the vocal folds vibrate, and the sound is shaped by the vocal tract. Finally, it passes through the lips and the sound is produced: that is the acoustic level. Back now to voice biomarkers, or vocal biomarkers. I will start by defining them.
A vocal biomarker is a signature, a feature, or a combination of features of the voice audio signal that is associated with a clinical condition. We can use it, as I said earlier, to monitor patients, to diagnose a disease or detect a symptom, to assess the severity or stage of a disease, or to monitor the effectiveness of a treatment. Here I present the general pipeline for going from collected audio samples to vocal biomarkers integrated into digital devices.

First, we start by collecting voice recordings of specific vocal tasks, annotated with clinical data collected through built-in questionnaires from participants or patients. We then preprocess the data: we harmonize its quality and filter the noise. Next we do feature extraction and feature engineering, and we feed the extracted and selected features to machine learning or deep learning algorithms in order to identify vocal biomarker candidates.

Then, in order to validate a vocal biomarker, we need to replicate it on external datasets and validate the findings there. Once this is done, we can integrate the vocal biomarker into digital devices, for use in clinical research and epidemiology, for remote patient monitoring, or for diagnosis and severity rating. But currently there is no clinically validated vocal biomarker yet.
I will come back to the challenges we face in the field that explain this. Some vocal biomarker candidates have already been described, mainly for neurodegenerative diseases like Parkinson's disease, where the sustained vowel phonation, for example, was used as a vocal task, along with the diadochokinetic task; that one has a name that is super difficult to pronounce, and I don't even have a neurodegenerative disease. These vocal tasks were used to study the speech rate, the performance, and the way people articulate the sounds.

For voice disorders, they also used the sustained vowel phonation or a reading task, and they found that people with voice disorders have increased shimmer, a reduced harmonics-to-noise ratio, and fluctuations in pitch and intensity. Now, back to our work and what we are doing within the lab. Like everyone,
we started working with the pandemic, with COVID-19: we focused on studying the symptoms of people with COVID-19. We worked on symptom resolution, which means we checked whether people in symptomatic or asymptomatic stages have different vocal characteristics, which was the case. We also studied the fatigue symptom and the loss of taste and smell. Then we started Colive Voice, a project that is worldwide and multilingual; currently there is English, French, Dutch, and Spanish, and we just integrated Arabic and Portuguese as well, so you are welcome to participate. It is an audio bank annotated with clinical and health data, and it is meant to be a screening platform to identify vocal biomarker candidates for disease risk prediction and remote monitoring.

Here you can see the distribution of the different participations we have in Colive Voice. I will now present some use cases that we worked on with Colive Voice data. This first one is a use case by a colleague, who studied the vocal characteristics of people with bad respiratory quality of life against people with normal or good respiratory quality of life. Here is a representation of the spectrograms: on the left side we can see, so to say, the fingerprint of a normal sustained vowel phonation, how a sustained vowel phonation can look on a spectrogram.

It is very straight, the harmonics are there, the voice is clear, and the phonation is well sustained, compared to a person of the same age and same gender with a very bad, impaired respiratory quality of life, where we can see a discontinuity in the sustained vowel phonation. There is also a fluctuation in his voice: these variations in the harmonics are the fluctuations in his voice.
Next, this one is more special to me, as it is my own work: vocal biomarkers for type 2 diabetes screening. I need to say first that we don't want to use voice as a standalone screening tool for type 2 diabetes; it is more of a first line, an at-risk prediction for people who may have type 2 diabetes. For that, we wanted to benchmark a vocal biomarker based on a text-reading task against the ADA questionnaire, the American Diabetes Association risk assessment for type 2 diabetes. It has only seven questions, about age, BMI, family history of type 2 diabetes, and so on, and it is actually highly weighted on age and BMI: the older you are, the higher your risk of developing type 2 diabetes, and the higher your BMI, the higher your risk of developing type 2 diabetes.

But what is the link between type 2 diabetes and voice? Why would people with type 2 diabetes have vocal characteristics distinct from the rest of the population? First, there is neuropathy, which is closely related to diabetes, and which affects the production of speech and causes the acoustic parameters of an individual's voice to be altered. There is also chronic hyperglycemia in people with type 2 diabetes, which results in damage to the pulmonary capillaries; thus we get fatigue in the respiratory muscles that can prevent a person from inhaling and exhaling adequate volumes of air.

Not only this, but type 2 diabetes is a multifaceted disease that comes along with many comorbidities and risk factors, like hypertension, age, and BMI, which are themselves highly associated with voice, as well as lifestyle factors. So, what we did: from the Colive Voice participants we selected only native English speakers from US participants, and we worked on each gender separately, females and males, in order to reduce biases, because men and women already have distinct vocal characteristics, but they also perceive and experience disease in a different way.

We randomly matched a control group of people without type 2 diabetes, in order to reduce the challenge of working on imbalanced datasets, and from these participants we collected the text-reading recordings. We preprocessed them, harmonized and checked the quality, extracted features, and scaled them. Here we used, first, the handcrafted features from openSMILE, from which we extracted up to 6,000 descriptors, and we compared them to the hybrid BYOL-S algorithm, which I mentioned earlier: a model pretrained on speech-related AudioSet segments that combines openSMILE handcrafted features with the data-driven features of the pretrained BYOL-S network.
We then reduced the dimensionality: you can see that the number of openSMILE features and the dimension of the embeddings are both quite high, so we need to reduce the dimensionality to avoid overfitting and to improve generalization. We used PCA for the embeddings and selected the K best features for the handcrafted set. We used pipeline models to ensure no data leakage, fine-tuned different machine learning algorithms with grid search, used cross-validation to evaluate the algorithms, and selected them based mainly on sensitivity and specificity; a sketch of this setup follows below.
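For illustration, a minimal sketch of such a leakage-safe setup with scikit-learn; the feature sets, grids, and models in the actual study are richer than shown here.

```python
# Minimal sketch: dimensionality reduction + classifier tuned inside a pipeline.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([                  # scaler and PCA are re-fitted inside each
    ("scale", StandardScaler()),   # CV fold, so no validation data leaks in
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])
grid = {
    "pca__n_components": [16, 32, 64],
    "clf__C": [0.01, 0.1, 1.0, 10.0],
}
# balanced accuracy is the mean of sensitivity and specificity
search = GridSearchCV(pipe, grid, cv=5, scoring="balanced_accuracy")
# search.fit(X, y)   # X: voice features/embeddings, y: type 2 diabetes status
```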
And the most important part, at least for me, is that we performed a performance stratification by key subgroups. As I said earlier, type 2 diabetes comes along with comorbidities, demographics, and a lot of risk factors that can themselves affect voice, so we wanted to see the influence of these subgroups on the performance of the algorithm that classifies type 2 diabetes status. For this we used a bootstrapping technique with 1,000 resamples; a minimal sketch follows below.
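For illustration, a minimal sketch of bootstrapped subgroup sensitivity with numpy; y_true, y_pred, and the subgroup mask are placeholders, not the study's data.

```python
# Minimal sketch: bootstrap a confidence interval for sensitivity in a subgroup.
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_sensitivity(y_true, y_pred, n_boot=1000):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
        t, p = y_true[idx], y_pred[idx]
        tp = np.sum((t == 1) & (p == 1))
        fn = np.sum((t == 1) & (p == 0))
        if tp + fn > 0:
            stats.append(tp / (tp + fn))
    return np.percentile(stats, [2.5, 50, 97.5])          # 95% CI and median

# Restrict to one subgroup before bootstrapping, e.g. a hypothetical mask:
# print(bootstrap_sensitivity(y_true[has_hypertension], y_pred[has_hypertension]))
```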
Then, to benchmark the performance of the algorithms, we used Bland-Altman analysis to study the agreement between the new voice-based measurement and the ADA risk assessment. As for the results, here I present them; you know, accuracy alone is not perfectly reliable, and as I said, precision, specificity, and sensitivity are the more reliable metrics in this case. We found two good performances.

We can see in the box plot of predicted probability against actual type 2 diabetes status that there is quite good sensitivity in distinguishing between people with and without type 2 diabetes. For the subgroups, it was really interesting to see that the presence of hypertension, for both females and males, increased the performance of the algorithm in classifying people with type 2 diabetes. There is also an age factor for the females: females older than 60 years had better performance, meaning the algorithm was better at classifying females over 60 with type 2 diabetes. And finally, with the Bland-Altman analysis, we found good agreement for both genders: 96% for the female algorithm and 93% for the male one.

As a takeaway from this use case: the study only highlights the potential of using voice analysis as a first-line, non-invasive, scalable screening method for diabetes. It shouldn't be confused with diagnosis, and we shouldn't rely only on voice to screen for type 2 diabetes; there is the blood analysis that already does that job.
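For illustration, a minimal sketch of the kind of Bland-Altman agreement check mentioned above, with numpy; voice_score and ada_score are placeholder risk scores on a common scale, and this is not necessarily the exact computation behind the reported 96%/93% figures.

```python
# Minimal sketch: Bland-Altman bias, limits of agreement, and fraction within limits.
import numpy as np

def bland_altman_agreement(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    mean_diff, sd = diff.mean(), diff.std(ddof=1)
    lo, hi = mean_diff - 1.96 * sd, mean_diff + 1.96 * sd   # limits of agreement
    within = np.mean((diff >= lo) & (diff <= hi))           # fraction inside limits
    return mean_diff, (lo, hi), within

# bias, limits, pct = bland_altman_agreement(voice_score, ada_score)  # placeholders
```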
But it would at least help reduce the number of undiagnosed patients, which is currently half of the affected population: half of the people with type 2 diabetes are undiagnosed. Future studies need to target early type 2 diabetes, because here we worked on people with established type 2 diabetes; we first wanted to check whether we can classify people with type 2 diabetes against people without. So studies with early-stage type 2 diabetes and prediabetes, with longitudinal data, could extend these findings, and we could study this on broader populations as well.

While expanding these datasets, we also need to understand the effect of the comorbidities and how they influence the performance of voice-based clinical tools for predicting type 2 diabetes. Now, the challenges: why do we still not have any clinically approved or validated vocal biomarker?

There is a wide range of challenges. There are the biases we need to overcome: language bias, gender bias, even dialects can be a bias. There is the reproducibility crisis, where we cannot replicate or reproduce the findings on new populations. We still don't have access to high-quality data. We are not yet able to implement this in clinical research and practice. We lack specificity of the voice features to a given disease. And most of the algorithms are kind of black boxes: we cannot interpret or explain what is happening inside.

As for where we can see vocal biomarkers once they are validated and once we overcome these challenges: we can integrate them into medical devices, use them in telehealth systems for monitoring, and apply them in clinical research, for example for remote patient monitoring and decentralized trials. But voice alone won't have the performance needed to reach precision health and precision medicine; we need to integrate more dimensions, more modalities beyond voice, in order to have better performance for diagnosis, treatment monitoring, et cetera. And with this, thank you so much for your attention, and I really invite you to donate your voice to Colive Voice, so we have more data to work on. Thank you so much.
Thank you for your very, very interesting presentation. I think there may be questions from the audience, so anybody who wants to ask a question, just switch on your microphone and ask, or raise your hand if you like. Hello.

Hello, thank you very much for your very nice presentation. I wanted to ask a few questions. First of all, is it true, my understanding is, that you convert the voice to an image and then process it with neural networks, is that correct?

We can do both: we either consider the spectrograms as they are, or we can convert them to images and use CNNs on the images. We can also use algorithms pretrained on images, like VGG16, for example.
I see. Is there any loss of information when you convert from the audio signal to an image?

That's a tricky question. It depends on the resolution of the image that you are creating, and it also depends on the sample rate and the quality of the audio signal itself. If you have a poor-quality audio signal, you won't have good spectrograms no matter what you do, so you won't have any good images.

I see. Okay, so the audio itself needs to be of good quality in order to have better performance with the spectrograms or the images themselves. I see. And have you compared the two approaches, the one with images and the one that directly processes the audio signal, to see which one is better?
Yes. I didn't present it here, but, as I said, we worked on the fatigue symptom in people with COVID-19, and there we compared the performance of algorithms pretrained on images, using the spectrograms, against the features we could extract with openSMILE, and the image approach performed even better. I can share the paper later for you to read; there is also shared code for it. So yes, we found better results with the pretrained image models.

Okay, that's really interesting. I would like to see that paper if you could share it.

Of course, I will send it.

One more question then: you did mention fine-tuning at some point.

Yes.

Optimizing the hyperparameters of the neural network, I assume, like the number of layers and so on. In my experience this is a notoriously difficult problem. You said you used grid search to do that; how many parameters were you considering during that process?

Okay, so we worked with multiple machine learning algorithms, to compare them. I will name some: logistic regression, support vector machine, multilayer perceptron, from scikit-learn. We fine-tuned their hyperparameters, and we also tuned the number of components from the dimensionality reduction and the number of features to be selected, all within the grid search.

Okay, and that happened automatically, right? You did not do it through trial and error?

No, it was automatically done with grid search.

Okay, very good. And how long was that process; how many hours or days are we talking about?

Of fine-tuning, you mean? Well, you can see the number of participants we have; it is in the hundreds, so it doesn't take much; it didn't take days to fine-tune.

I see, that makes sense. Okay, thank you very much.

Thank you. You wanted to ask a question?
Actually, just a quick one: for the CNNs, you feed in the time-frequency spectrograms? It was my impression that the time-frequency domain spectrogram was the best option for this.

Right, yes, in our experience.

My final question, then: detecting diabetes, one disease against healthy, is definitely wonderful. Oh, sorry, thank you for the presentation as well. But, I mean, okay,
when it comes to diagnosis, we probably can't tell a lot if we can't say: this person is sick, this person isn't, right? Because you didn't check all the other diseases, to make sure that these kinds of biomarkers are specific to diabetes and not to something else. Does that make sense?

So, the people we studied here have type 2 diabetes, against people without type 2 diabetes. We didn't analyze the other diseases they may have, but we did study the comorbidities, lifestyle factors, and demographics, to see their effect on the performance of the classification.

But then how reliable can the results be, if we don't know whether some other health problem can produce the same biomarkers?

I totally agree, and we try to address this with the performance stratification, which includes hypertension, migraine, depression, fatigue, and so on. We try to explain what works best for this classification and what influences it most. But yes, a human being is multimodal, and that is what I said at the end of the presentation: we can't use only voice, we need to use all the modalities to characterize a disease.
Thank you.

You're welcome.

I have a question, I think related to the previous one. Did you study the effect of potential confounding factors, like medication, or other comorbidities, that may also be affecting the voice? It is some sort of vicious circle: the voice indicates the morbidity, but the morbidity, or its treatment, may also be affecting the voice. And similarly for medication.

Yes. Here I did not study the effect of medications; people with type 2 diabetes do undergo treatment for it, so they are already quite different from people without type 2 diabetes. I tried to reduce the biases as much as possible: when it comes to language, we took native English speakers from US participants; we stratified by gender; we used the same vocal task; and we studied the maximum of what we could collect alongside the voice to assess these factors and confounders and their effect on the performance of the classification.
I see. I'm just thinking it may be something interesting to do, because in the end you need to isolate these factors, so you are sure that you are predicting the disease and not the medication effect.

No, we haven't done that yet.

And then something else I'm curious about is the datasets you use for training. If I understood well, you use publicly available data, and then you fine-tune to generate the embeddings for the data you are interested in. So I'm guessing there would be a lot of imbalance between cases and controls; I guess most of the audio datasets out there are controls, where you assume they don't have the disease. So is this imbalance a problem, and how did you manage it? Did you apply any undersampling, or a data augmentation technique to compensate for it? How did you tackle this problem?

Okay. So, to get the embeddings we used BYOL-S, which is a self-supervised algorithm pretrained on thousands of speech-related segments; from that algorithm we extracted embeddings, which we then fed to the machine learning algorithms. That is one part. The other part is how we selected the participants, and you can see here on the slide where there is gender stratification: as I mentioned, we randomly matched the control group without type 2 diabetes to the same sample size as the people with type 2 diabetes, so as not to face the challenge of an imbalanced dataset. We reduced the complexity of the task by randomly matching the sample sizes.

Okay, so basically it's like undersampling: you just remove the extra samples.

Exactly, yeah.
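For illustration, a minimal sketch of that random matching (undersampling of controls) with pandas; the dataframe and the column name are placeholders.

```python
# Minimal sketch: randomly match controls to the number of cases.
import pandas as pd

def match_controls(df, label_col="t2d", seed=0):
    cases = df[df[label_col] == 1]
    controls = df[df[label_col] == 0].sample(n=len(cases), random_state=seed)
    # concatenate and shuffle so classes are interleaved
    return pd.concat([cases, controls]).sample(frac=1, random_state=seed)

# balanced = match_controls(df)   # df is a placeholder participant table
```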
Okay, makes sense, I think. Thank you very much.

Thank you.

Hi, thank you very much for this excellent presentation, very interesting work. I have a question regarding the data as well: was there any preprocessing that you had to do on the voice recordings?

Yes, we have a dedicated pipeline to preprocess our data. It is real-life data: people are in their houses, or wherever they are, and participants record themselves through their smartphones or laptops, so we built a preprocessing pipeline to check the quality and exclude the recordings with bad quality or a wrong vocal task. But I can't really detail the preprocessing much, because it is currently confidential, as it will become a spin-off project within the institute.

Okay, and what were the challenges that you wanted to overcome with this preprocessing strategy?

Mainly to check that the quality of the recordings is good, which means minimal noise, and that people performed the correct tasks. Sometimes people start recording, for example, before actually doing the vocal task, so we need to check whether the time they started recording is really the time they started the vocal task or not. That is the kind of assessment we do.

Okay. Another question: if I understood well, the participants are mainly individuals with type 2 diabetes. Was there any reason not to include individuals with type 1 diabetes?

Type 1 and type 2 diabetes are already super different, and people with type 1 and type 2 diabetes have different vocal characteristics and undergo different experiences. With type 1 diabetes, they experience diabetes from their childhood, whereas type 2 diabetes comes with lifestyle factors and habits that develop over time, so the experience is super different; that is one reason. Two: type 1 diabetes is usually diagnosed at a very early stage, as I said, in childhood, but type 2 diabetes is underdiagnosed: 50% of people with type 2 diabetes are not diagnosed worldwide.

Okay, thank you very much.

Thank you.

I saw someone raising a hand, but I'm not sure if that is still the case.

Yeah, hello, good morning everyone, and thanks so much for this great presentation. I'm not a health expert, by the way, I'm a data scientist, so my question is related to the two previous
suggestions or comments about the imbalanced dataset, et cetera. I want to talk about metrics: if you only consider overall true positives and negatives, it is not enough, it is the wrong metric. Why? Because if someone has diabetes or cancer and we miss it, that is a big problem. So, as the last person said, maybe there is a specific group that is very important; we need to evaluate our metric based on that target group, not in general, because in general we can reach 99.999 percent and look perfect, but if there is one person who had cancer and the model misses them, the performance for that class is 0%, and that is a very bad model. Overall it looks great, but for a specific group, our target group, the model performs really badly. This, I think, we should take into consideration. Then, for the imbalanced data, there are a lot of techniques already available, but I am not in favor of them: applying SMOTE and the like is not enough, because we manipulate the data and recalibrate the models, and when we test them on real datasets the model does not perform well. So there are newer techniques with which we can handle imbalanced classes; for medical issues, I think this is very important. What do you think about that?

I agree with you. Okay, so I will go one by one. For oversampling techniques and generating synthetic data, I am personally totally against that, because we are working in the medical field, and the augmentations might introduce biases and overfitting themselves, even if we perform them correctly and ensure there is no data leakage. The other thing is the performance metrics: you said you don't agree with the metrics, but I chose sensitivity and specificity.
Those are general metrics, but we also check the metrics within each class, and we mainly focus on the diabetes class, to see whether the algorithm is really detecting the true positive and true negative cases, which are both important. You can see on this slide the other metrics we computed from the confusion matrix, the true negatives and the false positives.

That is really the error term, the off-diagonal term; it matters as much as the on-diagonal term.

Yep, I agree; it is very interesting to see as well.
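For illustration, a minimal sketch of those per-class metrics computed from the confusion matrix with scikit-learn; the labels and predictions are placeholders.

```python
# Minimal sketch: sensitivity and specificity from a binary confusion matrix.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1, 0]    # placeholder labels (1 = diabetes)
y_pred = [1, 0, 0, 1, 1, 0]    # placeholder predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # recall on the diabetes class; fn is off-diagonal
specificity = tn / (tn + fp)   # recall on the control class; fp is off-diagonal
print(sensitivity, specificity)
```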
Thank you.

Thank you.

Sorry, just to follow up on what was being said: perhaps false positives and false negatives have a different flavor here, because if we make an error and qualify somebody as diabetic who is not, it is not as harmful as making the opposite error.

Exactly, because we won't be telling the person that they have type 2 diabetes; we will be telling them that they probably have a risk of developing type 2 diabetes. As I said, voice is meant to be a first line, a kind of triage: maybe you have type 2 diabetes, so go check with your doctor and do a blood analysis to see whether you really have it.

I see, yeah, that makes sense. And I have a question related to the machine learning algorithms themselves.
How do you feel they perform with respect to some of the existing methods? I can imagine, because you mentioned at the very beginning that you have some prior knowledge about what types of voice anomalies can exist within the population that is sick, so you could think of some rule-based approach that, you know, just looks at the spectrograms and tries to figure out who has some anomaly in the voice that is perhaps specific to that disease. And my question is: have you made any comparison between the two approaches? Let's say, take the simplest rule-based approach, in which you just detect some anomalies, and then take one of the machine learning algorithms that you tried, and see what the additional advantage of the machine learning approach is with respect to the rule-based approach.

We also tried rule-based approaches, of course, to compare, and we start with them before going into AI and black boxes in the first place. We also do statistical analysis on the vocal features we extract, to see whether we can characterize some vocal features associated with the disease itself or not, before going to advanced AI applications. So yes, we do that, and the machine learning performs much better, of course, and that is why I am presenting it.
Okay. I can see no more questions... oh, okay, go ahead.

Really, the topic is a very hot topic, as was said at the beginning. Again, I am not a health expert, but I am really curious. I know that diabetes can be genetic, or caused by the environment, like air pollution, or by lifestyle, because we are stressed, I don't know. So can we quantify the contribution of each of these dimensions to becoming diabetic in the future? Then you could build a system not only to classify, but maybe also to say: okay, your lifestyle is not okay, so maybe in 10 years you are headed toward being diabetic, or whatever. This kind of tool, I believe, will be needed.

Yes, of course. As I said, there is already the ADA risk questionnaire: it asks about family history of type 2 diabetes and relies on age and BMI; if you are older or have a high BMI, you have more risk of type 2 diabetes. For our context, by including early-stage type 2 diabetes and prediabetes, we could characterize the voice changes, check whether these three categories (early-stage type 2 diabetes, prediabetes, and people with established type 2 diabetes) show different vocal changes, and track them over time. Studying that longitudinally would be really interesting.

Yep, that is the important part.
Thank you.

I can see another question.

Yes, so you mentioned that right now you are in the process of collecting more data, right?

Right.

And I have a question: would the language of the voice recording affect the models? Because there are some languages that maybe have a higher pitch. Do you have a requirement there?

Okay, so we already collect multiple languages in Colive Voice, and the more languages we collect, the more reliable and generalizable the algorithms we can create. But language itself is already a very big bias for vocal biomarker development, as I said: vocal characteristics differ between languages and even between dialects. Currently we worked only on English speakers, so we need to validate the findings on French speakers, for example, or Spanish speakers. And then, culturally, people experience and live with type 2 diabetes, or any other disease, differently. So not only language, but also the cultural environment, sociodemographic factors, economic factors; yeah, there are a lot of biases we need to work on.
Yep, I see. Thank you.

I have another question. Sorry, is there someone else? No? Then I'll just go ahead. So you mentioned at the end that the perspective you see for the field is to be multimodal and not rely 100% on voice, and I agree with that. I am wondering whether you have tried to combine the vocal biomarkers that you identified as relevant with other types of data, like molecular data, or clinical data, or lifestyle and environmental data, to really prove that the models are better.

So, in Colive Voice we collect clinical data only; we don't collect molecular data or blood samples or anything like that; we collect clinical variables based on questionnaires, et cetera. We already tried this in my colleague's project on respiratory quality of life, where he also added voice to clinical features, to see whether voice can bring information on top of the clinical features alone, and it was the case: he got very good performance just by adding voice to the clinical features for detecting the respiratory quality of life. For type 2 diabetes I did not do that, but I did the performance stratification, to see how the classifier works across the comorbidities and lifestyle factors.
And by the way, if you like, you can send me the link, and maybe this QR code, and we can post them on the website of this seminar; that would be perfect, we can advertise the voice donation. Yeah, and I have one question related to the startup or spin-off you mentioned. It is about legal restrictions within the European Union: do you have any worries, or experience of how difficult it is in the EU to implement artificial intelligence systems for health? Because, as far as I remember, in some early drafts of the AI Act, which has now been accepted in the EU, there were some severe restrictions on the use of AI technologies related to humans. So how is it now, and how does it collide with the idea of the spin-off?

So, the spin-off relies mostly on the preprocessing pipeline, on how we ensure the readiness of the voice recordings to be fed to algorithms and used to develop vocal biomarkers; it is not really an AI-based technology that we are commercializing. Beyond that, I have very minimal knowledge about this kind of approval and the ethical concerns from the GDPR and the governmental bodies, so I don't have much to say here, I don't know.

I see. Sorry, I remember we had a presentation in this seminar on that legal part around two years ago, but the law has changed over time, so I was wondering whether there is any news. I guess they are now getting more and more accepting of the fact that AI will be implemented within different fields, including healthcare, so maybe it is becoming easier to implement such a thing.

Easier with time, but I can't say more.

Okay, thank you. I can see raised hands.

Yeah, I just have a comment. I guess the AI Act is, so far, only targeting emotion recognition in workplaces, really, because there are some companies now doing emotion and depression detection in workplaces. I guess, probably, when it comes to that, they will start to actually enforce the rule. But this is the only part that is explicitly restricted, and it will probably put a stop to emotion recognition research in Europe for a while.
That's it, thank you.

Good to know, thank you. Okay, it is eleven o'clock, so we should be slowly ending the meeting. Thank you very much, Abir, for this very nice presentation, and thank you all for the discussion. Thanks, Elisa, for co-hosting the meeting. We will keep you posted about the future seminar meetings.

Yes, and I can see a link in the chat, perhaps about those legal restrictions; if you want to use it, click it now, because after we end the meeting it will disappear.

Yes, that is why I didn't want to send anything in the chat; I will just send it to you, and you can post it wherever you think is okay.

Could you also send us your presentation? We can post it on the website.

Sure.

So, have a good day today, everybody. It was really nice to meet you. Bye, thank you so much, have a nice day. Bye.
2024-06-04