Vibe AI Transcribe Audio Through Your Computer

Today on Geekazine we're going to take a look at this. It's called Vibe: transcribe on your own. It is software that you put onto the computer to do a little bit of AI. We're going to talk about that next on Geekazine.

What's up, my geeks? Jeffrey Powers here from Geekazine (think magazine, put in geek). Today we're taking a look at Vibe. This is software you're going to download onto your computer, software that's going to run from your computer, and software that's going to help you with your AI creation as well. There are a lot of things that you can actually download and install onto a computer, especially a computer that has a little bit of heft to it. For instance, my machine is an Intel Core i9 processor with a 4070 graphics card from NVIDIA. In all reality, this card is not the top of the line and the processor is not top of the line; in fact, it's a few years old. I do have more than enough RAM in there to do the work, but it's more about that GPU for a lot of things.

So what I'm going to do is show you this program. It's called Vibe, and all it does is simply help you transcribe audio. You can talk into it and it will transcribe it; you can bring a video into it and it will transcribe it; you can bring audio into it and it'll transcribe it; and it'll give you different file formats. That's what we're going to take a look at next here on Geekazine.

First of all, I do have to let you know the software is absolutely free, so they didn't supply anything, and of course nobody's paying me for this video. I have a full review policy over at geekazine.com/review that talks about how I get products, how I review products, and how I send it out to you. All opinions are my own, like I said, and of course if you've got a product that you want me to review, all you have to do is go down there, go to Contact, and contact me. One more thing before we get started: the new
t-shirt is out. I don't have it yet. I do have the Today IM AI t-shirt, which you can definitely get, but the new t-shirt is "80s Jesus Saved Me", which is one of my original designs. You can get it in multiple colors. You can get it on Amazon over at amazon.com/shop/geekazine, or on Teespring; if you're on youtube.com/geekazine you'll be able to find the links right there, in the products area. Otherwise I'll have the links in the description as well.

All right, with that said and done, let's go ahead and talk about Vibe. What is Vibe? Vibe is absolutely free. It is up on GitHub and is produced by thewh1teagle. If you've ever gone up on GitHub, the first time it gets a little bit "what do I do, what do I do?" Then, after you start getting some of these files, these repositories, you start going in and using them. GitHub also has a downloadable file that will help you save your favorite gits, as they call them, so you can come back and use them again. I do have a GitHub, I think it's github.com/geekazine. I haven't been there in a while, but I do have a couple of projects up on GitHub as well.

So anyway, as you can see, this is what you first see when you get to this page. It's github.com/thewh1teagle/vibe ("thewhiteagle" with a one instead of the i), and as you can see, it's right there: thewh1teagle. If you go to thewh1teagle's page you can see all the GitHub projects. I don't have anything public. I do have a couple of WordPress plugins that I could put in here, and I've done stuff for some other organizations for sure, which is why I have the GitHub.

All right, so this is thewh1teagle's, and of course this is Vibe. It's totally free, like I said, so support is on your own. They have all the files here, which you can download, and then you can read on how to install. But there is a download link here to make it easy to install, which was super easy to install:
you just download it and install it. Of course, you've got to download it for the platform that you're on. I do have Vibe on this machine, but since I'm doing the switching and recording from this machine, I decided that I won't tax my machine too much with Vibe, so I switched it over to my Mac. This is on my Mac Mini, and this is a Mac Mini M1. My computer, like I said, is an Intel Core i9 11th-generation processor with a 4070 card in there. The Mac Mini is a standard Mac Mini M1 base model: 256 gigs of storage and 8 gigs of RAM. And then I've also installed it onto one of my Intel NUCs, which is an older i5 that came from, I don't know, 2012. I use it for things like my Slack and doing simple stuff, but it's still needed. It's a small computer; it's perfect. Both of these are remoted into from this computer, so I can access them.

So the requirements of using Vibe on your computer are relatively low. That i5 model, like I said, is a Core i5; it's got Intel graphics on it and not much else. It's not even a Skull Canyon. What it did do is transcribe, but it took almost all the resources and it took a lot longer to do. The Mac does a pretty decent job, this M1 does a pretty decent job, and my computer does a really decent job as well. So keep in mind, if you do transcribe, that what it's going to do is read the file and then go through the transcription.

When you first install it, it will bring down the language model files that you need, and of course you can choose many different languages here: we've got English, we've got Afrikaans, we've got Basque, we've got Chinese. There are two areas here. You have Select File, which can be an MP4, a MOV, an MP3, a WAV file; it can be a lot of different formats. And then of course you can talk directly into here. You basically choose your microphone, you choose your speakers, and you say, "I want to save the audio." So if I'm sitting here
and I'm going, "OK, I want to create a new show, but I want to script it," and I do this on my phone with Google Docs and Google Sheets, I'll use the microphone and I'll just talk into the phone and let it transcribe. The problem with that is every now and then, if I'm not watching, all of a sudden it turns itself off, which is a little bit frustrating, especially if I'm recording somebody speaking up on stage and trying to get all the words. It's tough enough that there's the reverb of the room, and maybe they've got a thick accent or something like that, which makes it tougher to transcribe. But if I can put it onto a computer and let it record and transcribe from there, then that's going to save me a lot. That's what I love about Vibe: it allows me to do that, save the file, and actually do different variations of the file.

So with Vibe, here we go once again. You can choose to record it straight from your mouth, or from somebody else's mouth, or you can pull it from a file. Now, where does a file come from? You can do pretty much anything. You could do a podcast episode. You could be recording somewhere else and then bring it home, take the recording, put it in here, and let it transcribe. You can use YouTube and anything, but I will say this: that's copyrighted material, almost everything, unless the YouTube video has a Creative Commons license on it. You're going to have to look in the description to see if it's Creative Commons. If it is, you should be able to pull from there, no problem. But if it is not, if it's on a standard YouTube license, that's copyrighted material. Downloading that and transcribing it, you can probably use it for your own stuff, but if you start plagiarizing somebody's work, that's going to be a problem right there.

So anyway, let's go ahead and go back here. This is Vibe. We've got the dark theme on there, and I've got one of the files here. I grabbed one of my old YouTube videos. In this one I was doing some
stuff with a video called "Inexpensive Retro Games in One Box"; it was for a retro NES device. So what I'm going to do is select the file, choose that, and open it up. There we go. Now it is loaded into the system. Fairly straightforward, no problems here. From here we could actually listen to what it is, or we could change the file to something else. We could change our language, and of course we have more options down here.

We can recognize the speakers and set maximum speakers, so if you've got two or three people talking, it'll try to determine who they are and give you indications here. It gives you a speaker recognition threshold, and if you hit the i, you can see it's a threshold for speaker recognition, or consider as not detected, so it helps you try to detect. You've got Prompt. Let's say "Geekazine". There we go, there's one word right there. Now every time it sees Geekazine, or hears Geekazine, it'll figure that out as Geekazine. So I'll do that, and I'll add "Jeffrey Powers", because that's another one that is always good to have in here. There we go. Put them in comma-delimited, although it tried, which was interesting.

Anyway, it asks for timestamps for each word instead of each sentence; we're going to leave that as it is. Maximum sentence length of one: this is for if you're doing SRT transcriptions, transcriptions for television or anything like that, possibly even putting them into YouTube or Facebook, because they do have options for SRT inserts. We want one-sentence lengths, that's perfect, but you can raise it up. Actually, we'll do two. Threads is for how many CPUs you can use for faster decoding. This one's saying four; we'll stay at four. Temperature: this is the temperature of the word that's spoken. If you've got a lot of acronyms, a lot of heavier words, like in sephra
graphic technologist then you might want to set the temperature up a little bit so it can try and really transcribe this. And then of course there's the maximum context: tokens from past text to prompt for the decoder. So, just different areas to go.

Basically we've got our settings set in More Options. Now it's time to hit Transcribe. As we do that, it changes down here, and we have this right here where you can watch the transcription happen as it goes. If I pulled up my processes, it's definitely going to be taking up a lot of memory and CPU time, but as you can see, with this Mac Mini M1 we're starting to see the first 1% of the transcription happen. There we go.

While it's doing that, I'm going to show you this: we can choose whether we want it in HTML format, PDF format, SRT, VTT, or a JSON file. We'll go to SRT and show you that. I think we can do that as we're going. Yes. As you see, this is what's set up for video, like I said: for YouTube, for Facebook, or if you're making a TV show and you need timestamps for closed captioning, this will actually help you create those closed captions.

So let's read the first lines here: "What's up, my geeks? Jeffrey Powers here from Geekazine, and we have another Geek Tech Talk Live." That's perfect; that's exactly what I said in that video. "We're going to show you this little thing. It's the TrueWire mini NES gaming console." That's pretty cool. TrueWire is an odd word, so I'm glad that it picked it up and didn't separate it into "true wire". But if I said "true wire" as opposed to "TrueWire", I would guess that it would then start to get a little bit confused on that. "It's not an NES gaming console, it's a mini entertainment center, as they say, but it has 621 games," and let's see, "621 games, and a lot of them I believe are from NES, and we're going to take a look at these, open it up and see what's in there, and play some more games." I'll link to that video down below. So, next on Geek
Tech Talk Live. So as you can see, it's doing a pretty good job creating my transcription. It's at 23% right now and I'm not seeing anything erroneous just yet, but I didn't do any real acronyms or, like I said, in seph graphic technologist. We'll see what happens when I say that a few times, how it's going to react to it.

But as you can see, there it is: we've got SRT, we've got Text. Let's see what HTML offers. This is a format that, once you copy it, will be in full HTML format. You can't see it there, but there are the timestamps right there. Let's go to PDF format. Some of these third-party programs that insert transcriptions will accept different formats like PDF and HTML and items like that, so this will create a PDF file. We've got a VTT file; as you can see, there's a different way to transcribe it. And then of course JSON, and you're going to see a lot of code here. I think Roku still does JSON files, so if you're creating a video that's going to be on a Roku or any type of desktop streamer, you might need to create a JSON file. This will help you create the transcription for that JSON file.

But we'll keep it as Text, because that's all we need right now. Basically what I'm going to do is take all this information and throw it into ChatGPT, and then I'm going to say, "Hey, create me a description, create me a title, create me a blog post," which I'll end up re-editing anyway, because that's what I do. It's nice to actually have the basics of that and create from there.

We're at 48%, 49% as it just changed. So as you can see, it's going from there. I think what it's doing is creating a new line when I create a pause. For instance, if I paused right there, where I'm seeing "I'm seeing it already" and then "a little bit of waiting". Look at that: "a little bit of wait". There we go, a little bit of wait, and it's creating a new line for that. So if I talked consecutively, like here, we've got Super Mario Brothers 3, 2, Mario 10, Mario 14, Mario 16, Mario 69,
Dr. Mario, it's going to keep going until it hears an actual pause, and since I was going through different games, that makes sense. Here we go: "All right, so what do we got? We're going to go down to the puzzle games, let's see what we got. Hopefully. All right, Angry Bird, Angry Bird 2, Angry Bird 3." So it's not really separating from there. But as you can see, lots of "there we go"s. It's an older video, and just like with this video, I'm pretty much talking, without too much post-editing on this, maybe a little bit. So you'll see a couple of jump cuts from here, but I wanted to keep this as quick and easy as possible.

So we're at 68% now and we're continuing on with the transcription. Let's keep going. We're at 76% now. You can't see that right now; let's bring this back down, and we can resize the window so everything stays in here. We'll scroll down and look at where it is. So it keeps going. I think what happened was, when I switched it back and forth, that's where these lines came in like that, so that'd be really interesting. But it's still keeping the timestamps as best as possible.

Now keep in mind, if you pull the audio from a video and then just re-encode it as an MP3, WAV file, or whatever, the timestamps might not be as accurate as if you just put the video in there. If you have an MP3 or a WAV file and you're bringing that in, it's definitely going to save on some of your resources, because it's a smaller file: it's audio only, so it doesn't have to worry about the video portion of anything and can focus on the transcribing from there.

All right, so we are at about 97%, so this is almost done, which is great. This only took about 8 to 10 minutes, and that was a 20- to 23-minute video. As you can see, I had some extra audio; I had some sound effects in there, so it said "gunshots". I think there were gunshots in there. And then of
course "blank audio", which basically was my ending Geekazine logo, which I put on a lot of my videos. So if there was any direction in there, you would see it in here, but I didn't do any other sound effects from there. Once again, if I want to put it in any other format, all I have to do is pick SRT, boom, just like that, and now I have a full timestamped transcription for sending to anything: Amazon, YouTube, Netflix, whatever. If you're making Netflix videos, this is a great way to do it. In fact, I just had a friend who submitted a movie, which I did play a role in, where basically he had to go through the transcription. And of course when you do that, you definitely want to go through and watch the movie, watch the video, and make sure all the words are correct and they're in the places that they need to be in, and then of course you can make alterations from there.

Let's look at the settings before we do this. We can change our language here. We've got dark or light theme, or special theme; I don't know, I'm guessing this is something you can customize yourself. So right now those are the themes that we have. We can play sounds, we can have the focus window, and we can customize our model link. As you can see, we're using ggml-medium.bin. If we need to change the models folder, we can do so; that'll take us to something we downloaded. And we can download models from here. I have the standard ggml model in there. We can do the large model for higher accuracy, or we can do a small model. It really depends on how much space you have on the computer.

All right, let's talk about the language models. At first I thought what this language area meant was that you could actually go and turn this into, let's say, French, and then transcribe it, and it would actually transcribe and translate. It does not do a translate. What it does do is it will
adjust the model for the language that's being spoken. So I tried to do this in French, that same video in French, and as you can see, it still said "what's up my geeks", but notice the weird inconsistencies: "Jafa Powers here from Geek". Maybe I have a French accent or something like that. If I was saying this in French, then the transcription would show in French. It does not do a language-to-language transcription; all it does is transcribe what it's hearing into words. So if you want to have more free models, these are the AI models that you can bring in.

One more thing I wanted to tell you here, and that is this: this is ChatGPT's answer if you try to put the file in and ask for a transcription, because it does not transcribe the file. But it will suggest other things, such as opening Google Docs, choosing Tools and Voice Typing, playing the audio file, and letting it transcribe, which is what I normally do at events. Otherwise they suggest Otter.ai, or Descript, or Microsoft Word, which will also do a dictate option, and then Google text to speech, or Whisper by OpenAI, which is a sound model. So ChatGPT does have an option, but it's not part of ChatGPT.

I went over to Gemini to check the same thing. First of all, they don't give you an upload option, so I actually had to use a Dropbox location, and it basically said, "I can't directly access or transcribe audio files from the internet." It basically says to use an online transcription service, do it yourself, or contact the owner of the audio. So it doesn't want to deal with the aspect of the copyright there. Keep that in mind when you are using, or going to try to use, ChatGPT and Gemini for it. Now, I have had some success with this on ChatGPT and Gemini if it comes from YouTube, simply because there's already a transcription; it's just reading the open API transcription of that and bringing it in so I can do a summary or anything like that. But keep in mind once again, if you're doing it
for somebody else's video, it is copyrighted material, so just keep that in mind when you do that.

We talked about Fooocus, with three O's, where you can download and actually create images off of your computer. This is doing transcription. You can get a version of ChatGPT and say things like, "Put this into a blog post," and it will then clean it up and do things like that, all from the computer, which is really great. The whole point is that if you want to get the transcription of the video that's in there, then you can use something like Vibe. It's going to be absolutely free; it's just going to cost you whatever it costs to turn on and run your computer, and it will give you a transcription. Then, like I said, you could do all the timestamps and put this into a program, or you can get the straight-up text and put that into ChatGPT or whatever to create a blog post, or just post it as it is. So it's a great little tool that you can put onto your computer, and it doesn't take up too much space, depending on the models that you have.

Let's look at the models really quick. If you go to the page, he suggests these models. There are a lot more just by clicking here; this is over at somebody else's GitHub, where they've collected a whole bunch of models. If you're talking a lot of programming language, you would probably choose one of these different ones. If you're talking about something that's all-inclusive, you want to get a larger LLM, that's a large language model database. So you have different options here, and you want to use one that's going to fit what you're trying to do. For me, I'm basically talking about a product, so the language is going to be average, high school to college level type words, so this is going to be perfect. If I say in seph graphic technologist, well, we'll see how that transfers onto the text after I'm done recording here, and then we'll post that, of course, down here. So we'll see
what happens from that. But the whole point is that you want a good starting base to transcribe your videos before they even hit YouTube, because YouTube will do some of this transcription, and some other programs will do transcription as well. But for instance, Geekazine is a very specific term, so how will it react to your version of Geekazine? Will you see "Geekazine"? Will you see "geek D A-Z"? Will you see "Jeffa Powers", or will you see "Jeffrey Powers from Geekazine"? I hope, you know, if you're also Jeffrey Powers from Geekazine, definitely use that, that's for sure.

But anyway, that is Vibe in a nutshell. It runs on Mac, it runs on PC, and there is an Android and iOS version being developed as we speak. Maybe by the time you're actually watching this, it'll be out there, so you can put it onto a phone and then do some of the transcriptions on the road. That'd be pretty cool, because then I don't have to rely on the internet to do my transcriptions.

What are you doing with AI to make your job a lot easier? Let me know in the comments below over at Geekazine, where you can like, subscribe, comment, and turn on bell notifications so you know when the next video comes out. Until next time, my name is Jeffrey Powers. Thanks a lot for watching, thanks a lot for listening. Until next time, you guys geek out and vibe on with the Vibe transcription app.
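
For anyone curious what the SRT export discussed in the video looks like under the hood: an SRT file is plain text made of numbered cues, each with a start and end timestamp in the form HH:MM:SS,mmm and the spoken text, separated by blank lines. Below is a minimal Python sketch of that layout. It assumes segments arrive as (start_seconds, end_seconds, text) tuples; that tuple shape, the function names, and the sample lines are illustrative assumptions and do not come from Vibe's own code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is a hypothetical input format chosen for this sketch.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: running number, time range, then the caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up sample segments, only to show the output shape.
    demo = [
        (0.0, 3.2, "What's up, my geeks? Jeffrey Powers here from Geekazine."),
        (3.2, 6.75, "Today we're taking a look at Vibe."),
    ]
    print(to_srt(demo))
```

A new cue per pause, as seen in the video, would simply mean more, shorter tuples in the list; the formatting step stays the same.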

2024-08-25
