Today somebody in a group I’m in, who has some accessibility issues, was yet again complaining that their Dragon speech software was not playing nice with Firefox. That led me to look for an alternative, and surprisingly I found nothing workable at the plain user level beyond Dragon. Upgrading might actually be costly for that person (from what they say it starts at nearly $200 but apparently can go as high as $700? Not clear yet).
So, obviously, now I’m checking the FOSS side of things. A search has been inconclusive: I see stuff for developers across multiple different projects (which is a marked improvement from a decade ago, when I last tried and failed to do this), but so far I haven’t found anything at the user level.
Have I overlooked something? Or are we, many years later, still at the “building libraries” stage, without actual user-level stuff people can just apt-get or download?
Quick edit: I must insist, is there something for USERS, not DEVELOPERS, that I have overlooked? APIs, command-line programs, or machine-learning models are not software I can hand to my non-programmer friend to install on their computer to replace Dragon and help them write in Firefox.
On PCs, Whisper can now be run on many common laptops and desktops (with reasonably good accuracy), thanks to projects like whisper.cpp and faster-whisper.
Meta has also been releasing a lot of models, like wav2vec2 and MMS (I haven’t seen them used anywhere, though).
I guess there are Python packages for the aforementioned projects too.
You could look into Whisper. It’s a neural-net model from OpenAI, but they’ve actually opened it up and it can run locally.
That was one of the various developer-oriented projects I saw and mentioned, and it’s not something my normal user-level friend can just install on their computer unassisted to start writing stuff for them in Firefox. I imagine somebody could develop something using it as a backend, but (again, unless I’m overlooking it) I don’t see anything at the moment, even in an alpha state.
It’s definitely not ready to hand off to a regular user to install and use, you’re right. I don’t think there is an off-the-shelf alternative.
People are recommending the proprietary FUTO Voice app. This app, and other FUTO apps like Grayjay, are non-free.
https://hiphish.github.io/blog/2023/10/18/grayjay-is-not-open-source/
Not saying this is FOSS or isn’t, but FUTO Voice exists for Android.
It is in fact non-free. (The article is about Grayjay, a product from the same company that uses the same license)
I use it and it works very well.
Maybe check out your browser’s Web Speech API.
Talon Voice is aimed at developers, but I’m using it as an alternative to Dragon right now. Not as user-friendly as Dragon, but very functional.
There’s Sayboard and some other Vosk-based tools, but only on Android, as far as I know.
Speech Note for PC. FUTO for Android.