Office Hours 05 – The Interesting Challenges Of Delivering High-Quality Transcriptions

Tim, Tyler, Vatsal and Lorne – Speak Ai Office Hours 05

In This Discussion:

We talked about the challenges of delivering high-quality transcriptions, including turnaround time, the complexity of human language, sound quality, the transcript editor and more.

What Is Speak Ai Office Hours?

The Speak Ai team is doing routine virtual get-togethers that anyone can join! We share updates, have lively discussions, answer questions, and figure out how to solve exciting and complex problems together.

Join us next week at our weekly office hours at 12:00 PM EST.

What Does Speak Ai Do?

Over 540 individuals and teams use Speak to automatically capture, analyze and transform media into incredible assets for growth.

Customers are harnessing Speak’s media capabilities to save time, increase productivity, improve research, optimize well-being, grow search engine rankings and more.

If You Would Like To Contribute

Sign up for free at (and share your feedback – we value it a lot):

Follow us on LinkedIn:

Follow us on Instagram:

Join our Slack Community:

Check out our Public Product Roadmap:

Follow our Indie Hackers profile:

1 – 0:00:00
Hello. I see that we're all in different backgrounds.

2 – 0:00:05
Haven't we switched the images? I think every week. I'm back in London, ON right now; I took a trip home from Toronto to London, which is wonderful. I'm currently sitting in Innovation Works, which was the home where Speak Ai was born and where we have all worked. It's a beautiful coworking space dedicated to social enterprise and innovation, and it's open back up, with everything, you know, sort of changing and improving here. It's just a wonderful place to be. I feel light, I feel happy, and there are so many good people in the space that I don't always get to see anymore.

As many of you may know, Vatsal and Tim are in Toronto right now, and it's an interesting time there. But we love London, and I miss London very much when I'm away. We might do this a little bit shorter today. Sorry, I'm talking a lot again.

This is Office Hours five. Office Hours five, yes. We have a lot on the go this week and starting next week, so this one will probably be a little short. But part of what we've talked about is wanting to build this habit and keep this.

Keep this going, and make sure that we're creating updates. I'm really happy that Vatsal and Lorne joined here today. I said if you guys don't join, I'm going to do it anyway, so I'd be talking to myself and hopefully you'd actually watch it back. So I'll shut up for a second. Lorne, I guess, any thoughts right off the top? Then let's just talk about this topic that's on our minds right now.

1 – 0:01:35
I helped out a bit with cleaning up the transcript this morning. I was actually quite impressed with the new features for editing the transcript; it was very clean. Yeah, good work, guys. I'm excited to use it some more. I know I enjoy editing transcripts; I don't know if other people really enjoy it, but

it takes attention to detail. It's data entry, which is a part of accounting, so I'm already in the swing of that anyway, and I'm looking forward to more of it.

3 – 0:02:06
Thank you, Lorne, I really appreciate that. It's thanks to the whole team, actually; everyone worked very hard to bring the system up, because there are many systems available out there. But how did you feel when you were trying to edit the audio or video using that new feature?

What we have at Speak here, you know, how did it feel? Was it easy, or hard to understand? Did anyone explain that mode to you before you went into the transcript editor, or did you read the instructions? Did you use the shortcuts we have?

How was that experience?

1 – 0:02:41
I can talk about it a little bit. I didn't really read the instructions, so I kind of went in blind. There are a bunch of shortcuts I know we've been talking about, and I kind of saw them in the video on the left, but I didn't use them. I just scrolled to where I had to go and did my thing. But yeah, it was good.

I was also surprised by how fast it all was. The PDF and Doc export was quite impressive as well.

2 – 0:03:09
OK, I don't think you've ever been forced to do too many transcripts, Lorne, but now that I've seen the new system and think back to the days of what we used to do, with the double-clicking and everything, oh my God, I just can't believe it. I guess I'll just tell this story quickly, because we don't share it publicly. We had a wonderful person on our team helping us out named Silas, and Silas had to edit a transcript once using the old interface, I think 45 minutes, maybe an hour long.

By the end of it, Vatsal, we were in the room, maybe you were there too, and we just heard him smashing the table. This is a nice, awesome guy, and at one point he just had to go for a walk. It was this horrible, frustrating experience. Dealing with media, dealing with transcription, dealing with all this stuff is not an easy thing, so I'm really glad to see some big improvements, and everyone's worked hard on them.

I would say the transcribers have worked really hard on it too. We have such wonderful people helping us out and, amazingly, giving us feedback in real time. Partly it's to make their own job better, obviously, but it's also because they deeply care about making our system better. And I agree about transcribing:

there's some art and science to it. And human language is so complex; that's something we've come across as we've moved into this. Of course, we've always known it from our automated transcription, but now, with this human augmentation, you see how many nuances there are in language and how much understanding you actually need to have.

That's why I've asked this a couple of times, and Vatsal, maybe you have thoughts; I haven't heard your prediction yet. When will automated transcription actually be capable of replacing human transcription? My belief, especially after seeing some of our work over the last couple of weeks, is that it's a lot further away than most people think, because human language is just so complex.

Yeah, that's very true, the complexity of the language,

3 – 0:05:10
the pronunciations, the accents, it's very tough. And if the audio or video clip is not that clear, you know, if the frequency of that clip isn't good enough to listen back to, identifying the transcription 100% is very difficult even for a human, and for the machine doing the same thing it's very difficult too. If it's a very standard US or Canadian English accent, then it makes sense sometimes, but if there's overlap between two people talking, then it's very difficult to identify those moments.

With some words, a couple of words, I think we've had those conversations: I'd hear something different and ask, Tyler, is it like this? Did you hear the same thing, or something different? So it's very difficult to understand

sometimes the pronunciations of those speakers if they're not, you know, coming from that background. So that is very difficult. And the prediction? I don't know. If the accent and pronunciations in the audio clip aren't clear, then for maybe at least the next ten years

you won't see 100% accurate transcription done by a machine. We need the human element to identify those gaps and get that accurate transcription.

2 – 0:06:25
I think so too. Lorne, any predictions from you? Well, what's that one law about technology advancing? Moore's Law? It's like it doubles every year and a half or so.

2 – 0:06:42
So, Moore's Law, which I believe actually came to an end this year or last year. I shouldn't say that without having 100% verified it; I love Moore's Law, it was such an inspiring thing, but I think they finally hit the spot where it stopped compounding in the same way, diminishing returns maybe. So, the part that was interesting, that stuck out in some feedback we got about a transcript, was context. That part is, I think, one of the bigger gaps. Again, there's the complexity of human language, but there's also the complexity of conversation, and understanding meaning within that conversation. What is very hard for a machine to do is to say, OK, mentioned above in this conversation was this topic or this item, and so now, when we look much farther down the conversation and something is mentioned that sounded like another word,

if we had the context, we would know that they may be talking about the word that was mentioned before, and we would then make that override. But that's very complex. That's a complex decision even for a human to make, especially with a not fully audible mention or pronunciation of a word, and it's going to be very difficult for a machine. What is the confidence at which it knows it should even make that override?

Or how does it signal that there should even be a suggestion of an override? So that was something that really stuck out to me: how do we account for context through machines, through speech-to-text, and then at the much deeper level that's actually needed to do this well without people being involved?

3 – 0:08:33
Yeah, I think so. Context is very important, but at the same time, understanding it is hard for the machine, and the other side of the story is that even for a human, when we listen to a one-hour audio or video clip, we might miss some context from the beginning of the conversation. Is this the same point, or a different point they're talking about now? It's very difficult to understand, in terms of the machine but also the human element. Like you said, it's the complexity and

the understanding of the context, because as humans we talk from a lot of different angles and different touchpoints, and we change very rapidly.

2 – 0:09:12
Yeah, when you're talking to another person, at least you try to keep them with you. Tim and I especially are very associative thinkers on our team, so as we start speaking, more thoughts come, and if you're looking at the person, you're hoping they're following you so you can loop it back around. But sometimes, as you guys have seen, probably anyone who's watched office hours has seen,

sometimes those loops take a long time to come back around, and it doesn't always hit home and make sense the way you think. So if I can lose Lorne and Vatsal and Tim when speaking, who are some of the smartest people I know and who are deeply involved in this, it's probably very easy to lose a machine during those thought processes as well. And one other part I was thinking about, Lorne and I were talking about it a bit, is just that idea of when you write and

you feel good in the moment, but then you go and review it the next day and realize how many mistakes you've made, or how incoherent it was. That's something really fascinating to me, both from the conversation side, where sometimes you're waiting to hear it back, and also from a transcriber's perspective. You go through, for example, an hour and a half of transcription thinking that you caught everything, but during that process your mind is under such a cognitive load that by the end you're pretty exhausted, and you can't necessarily keep up the same good decision-making throughout, or

process it from an overview. So I'll stop there.

1 – 0:10:39
I'm not sure if you guys have any thoughts on that. That's a big part of accounting too. Not so much with human language; it's more just taking a look back, auditing yourself, and looking at what the hell you were thinking two weeks ago when you put that there, and then you make your change. So all of this technology here is a great way to audit yourself, constantly improve upon yourself and think,

you know, why did I say it like this? Why did I write it like this back then? What was wrong with me? Which kind of comes into the context of

the emotional analysis that we have: what mood was I in at that time, what was the cause of me, you know, making this mistake here or there, and what can I do differently now?

1 – 0:11:26
Yeah, that's a very good point, because when we write,

3 – 0:11:29
let's say, accounting and finance, or transcription, or code, there are many nodes connected with each other. That can be how your day was, how your morning was, you know, how you reacted to something that happened on that day. It's the same story even in coding, right? Because, for example, if I personally write something and then check that code after a week, I'm like: why did I write it in such a way?

Sometimes it doesn't even make sense to me, right? Because you always try to improve on what came before. When you have more experience and you write, you think, I can do better than this, and there is always a way, always scope for that one person to improve, I think. So even in the transcription context, when they go over the transcription they might be exhausted,

but if they look at it the next day, those words or a couple of those pronunciations might read differently, and it's like, oh, I missed this here. So that makes so much sense, I think.

1 – 0:12:30
So yeah, I think, you know, we've talked about

2 – 0:12:33
how we ensure the quality. A lot of the people who are now coming to us to get transcriptions cleaned up, with automated not being enough, are either publishing professional content or sharing research. They expect a very high level of quality, so we need to figure out how we can ensure that. And I personally believe, especially related to transcription, but with coding and writing too, that sometimes you get what you give. You put so much effort into the initial process

that you need a whole other day to go back, or you even need someone else to review it. That's why, I believe, there are not that many people who do it all themselves. I mean, there are lots of people who self-publish, but a lot of people get other people to come and edit. You need that fresh mind, that other perspective, to come in and actually look and

take a little bit more of an objective or detached approach, because it's hard when you get lost in your own thing. And Vatsal, I'm actually just wondering: one of the things this made me think about is when you're fired up and you have an email to write. You know, if you've gotten a little upset, or emotional, or high-intensity, your email writing changes. The rule I learned through my career was always: wait 24 hours.

If you're angry, do not send that email. And I just wonder, because that was a great lesson, anyone who gets angry writing emails should wait a day, but have you ever angrily written code? What happens when you're intense or hot with code?

Has that ever happened?

3 – 0:14:09
I've actually had that question before. If I am in that sort of mood, I would rather write the flow, or draw the flowchart, or write the code in my notebook instead of writing the actual code. What I do in those situations is put it in the notebook, so I know exactly what I have to execute, because what happens if I put it straight into the codebase? You might miss the many, many conditions, the many situations where the code is connected to other code. So I never did it, and I don't know if I ever will. What I do is write it in the notebook, so I'm not touching the codebase.

I'm resolving the point, the actual thing I wanted to work on, and the next day I go back and see: OK, this is the flow I wrote when I was worked up, you know, when I had that source of anger or anything. So I think that's the thing I follow: don't touch the codebase unless and until you're confident that you can execute it very well.

1 – 0:15:12
First drafts are very important with pretty much any sort of work, really. I think, I guess, we'll maybe,

2 – 0:15:21
you know, our goal was to keep this short, and this has been a fun conversation already, which is kind of cool. But I guess I'll just touch on one more thing. My hope is that someday someone will be trying to figure out this process, or having struggles with it, and this will actually be a helpful resource.

So, a couple of things I've been thinking about: what are the challenges we've actually seen throughout this process of taking audio and video, running it through automated transcription, and then getting a human to come clean it up? It's so weird when you say "a human": get a flesh machine, a biological machine, to come clean it up and then produce a high-quality final output. I'll just touch on these quickly, and feel encouraged to add anything along this pipeline. The first thing we've seen is really important is high audio quality. At a core level, if there's low audio quality, the automated transcription will cause a lot of problems. We've seen a lot of happiness from high-quality automated transcriptions; transcribers come and look at them and say, holy shit,

whoa, this is really accurate, and that makes them happy. That's great. But with low quality, I bet you will spend way more time editing it than if you had done it from scratch, so I think that's a big one that we've seen. You mentioned it before, Vatsal: crosstalk. If people are talking over each other, it's almost inaudible. It is inaudible from a machine standpoint.

It's very hard to deal with from a people standpoint too, so that's another thing. In this case, I'll just point back to myself: I'm using a webcam microphone, so I'm guessing my audio quality today is degraded deeply from what I usually have, and when we run this through Speak afterwards, the transcription quality is going to be lower and it will take longer to clean up the transcription. So even the small shift from the microphone I usually use to the webcam has a significant impact.
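
As an aside for anyone building a similar pipeline: the effect described here, degraded input audio producing degraded transcripts, is something you can screen for before transcription ever runs. Below is a minimal, hypothetical pre-check; the function name and thresholds are invented for illustration and are not Speak's actual code.

```python
# Hypothetical pre-flight check: flag clips whose audio is likely to
# produce a low-quality automated transcript. Thresholds are invented
# for illustration; real values would come from measurement.

def audio_quality_flags(samples, full_scale=32768):
    """samples: 16-bit PCM sample values. Returns a list of warnings."""
    warnings = []
    if not samples:
        return ["empty clip"]

    # Root-mean-square level: very quiet audio transcribes poorly.
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    if rms < 0.01 * full_scale:
        warnings.append("very low level: speech may be inaudible")

    # Clipping: samples pinned at full scale suggest distortion.
    clipped = sum(1 for s in samples if abs(s) >= full_scale - 1)
    if clipped / len(samples) > 0.001:
        warnings.append("clipping detected: audio is distorted")

    return warnings
```

A quiet webcam recording would trip the level warning, and a hot, distorted microphone would trip the clipping one, giving the uploader a chance to re-record before a transcriber ever sees the file.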

Even the compression of the file when it comes in matters: is it a raw WAV file, versus a very heavily compressed MP3? Two other things I'll just add quickly. Speaker identification just doesn't seem to be there yet in speech-to-text. There are some incredible systems out there trying their absolute best, but it's still a very hard thing, especially when there are multiple speakers who sound the same, for example two female speakers or two male speakers, or just any speakers with the same tonality.

And then the final part of that process: OK, great, maybe you have high-quality audio, the speakers don't talk over each other, and they sound different, so you've got good speaker identification. There are still going to be inaccuracies in unique terms, you know, language-specific terms, names and such, and if the system for cleaning up the automated transcript to get it to a professional level is a difficult thing to work with,

it makes a miserable task for the transcribers, and that's what no one wants, because these are good people who just want to do their work and be in a flow state as they go through the transcription. So those are just a couple of things from me. Vatsal, Lorne, I don't know if you guys have seen anything that sticks out to you about how to improve this process, or what needs to be taken care of to do this right?

3 – 0:18:36
With the speaker labels, the interesting thing we implemented on the engineering side is identifying the channel. So, for example, we're talking here on three different channels, and that's quite a big improvement we've seen on the development and engineering side. But if there's crossover, or maybe the same frequency, or something happening in the background noise, then you might see a mismatch in the speaker labels. Still, it's pretty good on the engineering side at identifying the channel and understanding who is talking, where and when.
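
For readers following along, the channel idea can be sketched in a few lines: when each participant records on a separate channel, each stretch of audio can be attributed to whichever channel carries the most energy. This is a toy illustration of the idea with invented names, not the production implementation.

```python
# Toy sketch of channel-based speaker attribution: each speaker has a
# dedicated audio channel, and each time window is labeled with the
# channel that carries the most energy in that window. Real systems add
# voice activity detection, smoothing and crosstalk handling on top.

def label_windows(channels, window=4):
    """channels: {speaker_name: [samples]}, all tracks the same length.
    Returns one speaker label per window of `window` samples."""
    names = list(channels)
    length = len(channels[names[0]])
    labels = []
    for start in range(0, length, window):
        # Energy of each channel inside this window.
        energies = {
            name: sum(s * s for s in channels[name][start:start + window])
            for name in names
        }
        labels.append(max(energies, key=energies.get))
    return labels
```

Crosstalk is exactly where this breaks down: when two channels carry similar energy in the same window, the winner is close to arbitrary, which matches the label mismatches described above.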

That part is pretty accurate, I think. The point where it breaks down is the crosstalk, or anything like that; that is pretty much the issue with the speaker labels. And in terms of implementation, what we're trying to do is re-verify the transcription again, to make it a touch better and more accurate than it was before. That's what we're trying to implement in our current system, so we can give

1 – 0:19:40
a more accurate transcription at the end, yeah. So, we recently got rid of the video analysis just because it wasn't really worth it, but as far as speaker labeling goes, have you found that it's gotten better or worse since we've gone to straight audio? Or has it not made any sort of difference whatsoever? I don't think it has made a difference,

2 – 0:20:08
and I would probably agree, because the inference hasn't come from the speaker on screen. For example, there's no connection right now between who is speaking in the audio and who's on the screen. It would make sense if that connection were there, but that's just not happening from a technological standpoint. I would actually say we've seen improvement in the core audio analysis and speech-to-text accuracy; that's made it better, but there are still some pretty significant gaps there across all systems. One thing I'll just add, which I thought was very fascinating: we thought it was going to be helpful for the process, and we wanted to speed up the workflow for the human transcribers and also for ourselves, so when you replaced a speaker, it would replace across the entire transcript. You would save all this time and say, I don't have to keep labeling this person over and over again.

But what happened was that the transcribers, and us, relied too much on the machine to get the speaker identification right, and we then had multiple errors in the final output. It's like, oh, I rely on this technology, it told me it's this speaker, and because you didn't have to manually verify, you didn't do your full due diligence or pay full attention, and that led to several errors where, you know, an interviewer is asking a question but it's labeled as the interviewee. And if you're the end person who is actually doing the research, and all of a sudden you see the interviewee asking this question, that's going to be a big interruption to your workflow, and to your confidence in the final output of the system.

So that was, I mean, a wake-up call for us, and hopefully, again, for anyone who's trying to figure this out: there's a lot that needs to go into that. So in this new system we've updated it and removed the ability to change all speakers across the entire transcript, because that takes the onus off the manual transcriber and leads to these errors.
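
The trade described here, replacing the rename-everywhere shortcut with per-segment verification, might look something like the sketch below. The data shape and function names are hypothetical; the point is only that every relabel passes through an explicit confirmation step.

```python
# Hypothetical editor logic: instead of renaming a speaker across the
# whole transcript in one shot, each matching segment is confirmed
# individually, keeping the transcriber's attention on every label.

def relabel_segments(segments, old, new, confirm):
    """segments: list of {'speaker': ..., 'text': ...} dicts.
    confirm(segment) -> bool decides each candidate individually.
    Returns the number of segments actually relabeled."""
    changed = 0
    for seg in segments:
        if seg["speaker"] == old and confirm(seg):
            seg["speaker"] = new
            changed += 1
    return changed
```

Passing `confirm=lambda seg: True` recreates the old replace-all behavior, which is exactly the shortcut that let mislabeled interviewer and interviewee segments slip through unreviewed.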

3 – 0:22:01
And also, another thing we discussed: I think we found two major issues with the current system. One is the speaker label, and, you might disagree with me, but with the speaker label I personally only see problems when there's crossover between the speakers, whether it's two or three of them; other than that, it's pretty straightforward.

Because it's not like, oh, you jump in just for a second, "oh really", and then I continue my conversation again. That one or two seconds of frequent change might affect the speaker labels, but I think we've almost mitigated that issue, so that's good. Another issue we've identified is those very minor places where you miss the actual word,

2 – 0:22:51
or where you might miss the context around it. Yeah, I actually think that's been one of the things, and I hope whoever is watching this will find it interesting. We've had very intense, nitty-gritty discussions on this stuff, so I hope this is interesting to someone out there, whoever stumbles across it. But even that: when is it another speaker? For example, there's an interviewer and an interviewee, and the interviewee is making a point.

The interviewer says "yeah", and you're like, oh my God, I need to make that a whole new segment, because that's a new speaker now, and now I need to make sure it's timestamped properly and everything. That adds a lot of work, and in an environment where you want to be as efficient as possible, these micro-actions really add up. If you just avoided that every time an interviewer said "yeah", you'd probably save five minutes out of every hour you were doing, and that's a lot, especially if you're just trying to be efficient and productive and get through as many as possible.

So that's another thing that's really interesting: how do you manage that? And then, even when someone goes "yeah", do you add it as a "yeah"? There's this subjective element to when the whole "yeah" word is formed, and it's very fascinating. Sometimes the automated machine will say, oh, they said "yeah" there, and that's very likely to trigger the human to say, OK, if it's spelled out,

yeah, I'm going to have to, you know, make this a new segment. But I feel like if the machine didn't generate the word, they're not going to create a new segment, because it was so minimal. So there's a very subjective moment that sometimes comes out, and it's somewhat dictated by the technology and the results of the technology. Again, I hope that fascinated someone; it does me.
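
One mechanical way to soften the "yeah" problem is to fold very short interjections from another speaker back into the surrounding segment before an editor ever sees them, so no new segment has to be created and timestamped. A rough sketch, with an invented interjection list and duration threshold:

```python
# Toy cleanup pass: a sub-second interjection ("yeah", "mm-hmm") from a
# different speaker is dropped and its duration absorbed into the
# previous segment, so the editor never has to split out and timestamp
# a new segment for it. The word list and threshold are invented here.

INTERJECTIONS = {"yeah", "mm-hmm", "uh-huh", "right", "ok"}

def merge_interjections(segments, max_dur=1.0):
    """segments: list of (speaker, text, duration_seconds) tuples."""
    merged = []
    for speaker, text, dur in segments:
        is_minor = dur <= max_dur and text.lower().strip(" .,!?") in INTERJECTIONS
        if merged and is_minor and merged[-1][0] != speaker:
            prev_speaker, prev_text, prev_dur = merged[-1]
            merged[-1] = (prev_speaker, prev_text, prev_dur + dur)
        else:
            merged.append((speaker, text, dur))
    return merged
```

Whether dropping the interjection is acceptable is itself the subjective call being made here; a gentler variant could keep the word inline in brackets instead of discarding it.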

And I have nothing more to say. I think this was 30 minutes on how to transcribe things better, but just to make this plain: this is really important. This is important for data analysis, this is important for professional publishing of content, and if Vatsal

says we're looking 5 to 10 years away from getting there, which I fully believe, then this is going to be important for a long time: figuring out this process and doing it as best as possible. We really do care about this and see it as a valuable thing that needs to continue to grow in scope, efficiency and scale over the next 5 to 10 years, and probably even a little bit beyond that.

1 – 0:25:26
Just because humans are so complex, they're so intelligent, and there's so much context and so many things in language that we just can't comprehend yet. And it's only going to get more and more complicated as things change. Like, I've seen technology come out from Google here and there about two different people speaking two different languages with one another, and it transcribes and translates while they're talking with each other. I was almost going to ask that question.

3 – 0:25:59
That was my next point. I know we're talking about one language so far, but when it's two or three different people talking, maybe in two different languages, or maybe, you know, we're talking in English but there can be an element of Hindi, or an element of French. Then it can be a nightmare to translate and transcribe the whole media, because either you have to do it fully in English, or translate the whole media into English and then transcribe it, or, I don't know, just identify the one language

you want to do the transcription in. So I don't know.

2 – 0:26:39
I'm good with English right now; that is enough complexity, I would say. Even the part that I hear from you and, you know, from talking to friends too, is the constant switching back and forth, mid-sentence, between Hindi or Gujarati and English. And, Lorne, as you sort of indicated nicely, there's also the growing vocabulary that gets added every year. You know, like there's the diction,

the dictionary, whatever, the Webster's dictionary: every year they need to add new language that just emerged that year. And, this is silly, but this year I'm thinking of this AMC and GME stock thing, you know? These words didn't exist, and now all of a sudden you need cultural understanding and relevance to even be able to comprehend them.

There are probably a lot of conversations right now where people are saying these stock words and machines are like, what the hell is going on, and only the people who follow Reddit and all these things understand. So there's a very interesting point there. We've got some work to do.

1 – 0:27:52
We'll need a meme translator soon. A meme translator.

2 – 0:27:56
Yeah, exactly. A whole category of words that emerges all the time, and abbreviations, and all this stuff. So yeah, that's why you gave me fear when you were talking about other languages.

3 – 0:28:10
That's exactly what I'm trying on the research side right now. After seeing the English transcription issues, what I've been trying is this: I have a couple of video and audio clips that have elements of Hindi, English and Gujarati, and I'm trying to translate them and then transcribe them. You have no idea: it's a one-minute clip, but it might take at least 10 to 15 minutes to work on it, to identify one language and transcribe it into English, because the languages are so complex, and so is the grammar if you're using three languages. If it is English,

the grammatical sentence structure might be way different than in Hindi, or French, or almost any other language. "Hi, how are you" comes out way differently in Hindi

2 – 0:29:01
or French, so that's very... yeah. And then a big part is the speakers and the analysis, so you have to do multi-language analysis as well, and then sentiment, just as an example. And we talked about this, I think, one other time in office hours, but there are words in the German language that have no translation

in multiple other languages, right? And it's the same with Sanskrit; even with Sanskrit,

3 – 0:29:27
I don't know how many of those words we can understand. I mean, yes, obviously we have the Sanskrit-to-English dictionary, but if those elements are in the audio clip, oh my God, I don't know. Either someone has to understand the Sanskrit, then transcribe,

2 – 0:29:44
translate and then transcribe again. And then the analysis part you touched upon is a completely different story. Yeah, well, we might not be able to take this all on by ourselves; we'll need a little help. There have been years and years of work to get to where we are today from a technological standpoint, not from us, but from pioneers in language technology and innovation. It's truly mind-blowing, you know, what's been done and where we are. So I think I'm good on my side.

I think this is, you know... I was a little tired coming in here, and we've got lots going on, but this is so much fun every time, and I really enjoy what we talk about. I always feel energized afterwards. So, Lorne, Vatsal, is there anything that you wanted, I guess,

1 – 0:30:27
to try to close out this conversation with today? I'm excited to see where the technology takes us. I'm always excited to hear what you guys are researching, and this is fun to talk about.

2 – 0:30:43
Looking forward to more talking. And now it's time to execute. That's beautiful. OK, I'm going to shut up there. OK guys, thank you so much for tuning in. Appreciate this very much. Have a good rest of the afternoon, guys.

We'll see you a little bit later, but let's execute on this bad boy. Alright, thank you so much.
