Sharing (old) knowledge#1: Speech recognition and synthesis

23 Mar

The following information was written by me one year ago on the Intelligent Multimodal Interfaces course internal forum. Since there are new students in need of fresh info, I reposted it here, make the best of it:

[Recognizers]

Sphinx Speech Recognition (open source), a cool demo can be seen here (controlling PureData with speech) – also I’ve found this how-to interesting (using python+sphynx, from the same author)

Sphinx4 (rewrite of Sphinx into java – more cross platform), there’s also a pocket version for mobile systems (iphone and so on) – it’s all part of this project from Carnegie Mellon.

[Synths]

eSpeak (written in C, either Win or Linux)
FreeTTS (java, cross)
flite (written in C, once again from CMU)

Web version: At&t text-to-speech (not open licensed)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: