Speech & Audio

Bluetooth Audio

The mandatory Bluetooth audio codec for music is the subband codec (SBC), as described in
Appendix B of the Advanced Audio Distribution Profile (A2DP) specification.
SBC achieves high-quality audio at medium bit rates with low
computational complexity. It uses a polyphase filter bank of the same kind as MP3, but with only 4 or 8 subbands, together with adaptive bit allocation.
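
To give a feel for what "medium bit rates" means here, the following is a minimal sketch of the frame-length and bit-rate formulas given in the A2DP specification. The function names and the string labels for the channel modes are hypothetical, chosen only for this illustration; the example settings (44.1 kHz, 8 subbands, 16 blocks, joint stereo, bitpool 53) are a commonly cited high-quality configuration, not something stated in the text above.

```python
def sbc_frame_length(subbands, blocks, bitpool, channel_mode):
    """Length in bytes of one SBC frame (formula from the A2DP specification)."""
    if subbands not in (4, 8):
        raise ValueError("SBC uses 4 or 8 subbands")
    if blocks not in (4, 8, 12, 16):
        raise ValueError("SBC uses 4, 8, 12 or 16 blocks")

    channels = 1 if channel_mode == "mono" else 2
    # 4 header bytes plus 4 bits of scale factor per subband per channel
    header_bytes = 4 + (4 * subbands * channels) // 8

    if channel_mode in ("mono", "dual_channel"):
        payload_bits = blocks * channels * bitpool
    elif channel_mode in ("stereo", "joint_stereo"):
        # joint stereo adds one "join" flag bit per subband
        join = 1 if channel_mode == "joint_stereo" else 0
        payload_bits = join * subbands + blocks * bitpool
    else:
        raise ValueError("unknown channel mode: " + channel_mode)

    # round the payload up to a whole number of bytes
    return header_bytes + (payload_bits + 7) // 8


def sbc_bit_rate(sample_rate_hz, subbands, blocks, bitpool, channel_mode):
    """Bit rate in bits per second for the given SBC settings."""
    frame_length = sbc_frame_length(subbands, blocks, bitpool, channel_mode)
    # each frame carries subbands * blocks samples per channel
    return 8 * frame_length * sample_rate_hz / (subbands * blocks)


if __name__ == "__main__":
    rate = sbc_bit_rate(44100, 8, 16, 53, "joint_stereo")
    print("%.0f kbit/s" % (rate / 1000))
```

With these settings each frame is 119 bytes and the stream runs at roughly 328 kbit/s; lowering the bitpool trades quality for a lower bit rate.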

Sphinx4 Setup

Sphinx 4 is a Java speech recognition system in CMU's Sphinx ASR family.
To try out the system, download the binary distribution
of Sphinx 4, then set up your environment to support the Java Speech API (JSAPI).
Then run the demos:
java -mx312m -jar bin/HelloWorld.jar
java -jar bin/HelloDigits.jar
java -mx312m -jar bin/HelloNGram.jar
java -jar bin/ZipCity.jar [-continuous]

Open source speech recognition comparison

HTK
License: prohibits redistribution and commercial use, but using HTK to train models in commercial R&D is allowed (http://htk.eng.cam.ac.uk/docs/faq.shtml)
Development language: C
Latest release: 3.4 (13 December 2006) http://htk.eng.cam.ac.uk/download.shtml, registration (free) required.
Platform: cross-platform; can be built on Linux/Unix, Mac OS X, and Windows
Support: the well-known HTK Book and an active mailing list

Sphinx
License: BSD

Pitch Determination

Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio.
