Sharing (old) knowledge #1: Speech recognition and synthesis

23 Mar

I wrote the following information one year ago on the internal forum of the Intelligent Multimodal Interfaces course. Since there are new students in need of fresh info, I am reposting it here. Make the best of it:

[Recognizers]

Sphinx Speech Recognition (open source): a cool demo can be seen here (controlling PureData with speech). I also found this how-to interesting (using Python and Sphinx, from the same author).

Sphinx4 (a rewrite of Sphinx in Java, more cross-platform). There is also a pocket version for mobile systems (iPhone and so on). It is all part of this project from Carnegie Mellon.

[Synths]

eSpeak (written in C, for Windows or Linux)
FreeTTS (Java, cross-platform)
flite (written in C, once again from CMU)

Web version: AT&T text-to-speech (not open licensed)

FlashBuilder4 for Linux

18 Feb

Flash development on Linux is often left to a generic text editor used with the free Flex SDK. It is certainly possible to code this way, but you lose a lot of the functionality of a dedicated IDE. The FB4Linux project offers an Eclipse plugin that provides an environment similar to FlashBuilder 4. The only downside is that the installation instructions gloss over a few of the details required to get the plugin installed in Eclipse 3.5.2 (Galileo) or the most recent 3.6 (Helios), which is the version of Eclipse that is available in the Ubuntu software repositories at the time of writing.

Step 1

You will need to download the 4 files FB4Linuxaa, FB4Linuxab, FB4Linuxac and FB4Linuxad. When recombined using the command cat FB4Linux* >FB4Linux.tar.bz2 you end up with a standard tar/bz2 archive.
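The recombination can be sketched as a small shell function. This is only a sketch, assuming the four parts sit together in one directory; the helper name join_parts is hypothetical, and the narrower glob FB4Linuxa[a-d] is my own precaution so that re-running the command cannot pick up an already existing FB4Linux.tar.bz2 as one of its inputs.

```shell
#!/bin/sh
# join_parts is a hypothetical helper, not part of FB4Linux.
# Usage: join_parts <directory-with-parts> <output-archive>
join_parts() {
    parts_dir=$1
    out=$2
    # The shell expands the glob in alphabetical order (aa, ab, ac, ad),
    # which is the order the pieces have to be concatenated in.
    cat "$parts_dir"/FB4Linuxa[a-d] > "$out"
}

# Example (commented out so that sourcing this file has no side effects):
# join_parts . FB4Linux.tar.bz2
# tar -tjf FB4Linux.tar.bz2 > /dev/null && echo "archive looks intact"
```

Listing the archive with tar -tjf before extracting is a cheap way to notice a missing or truncated part.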

Step 2

Download the archive linked here and extract its jar files into the /usr/lib/eclipse/plugins directory. These plugins include Apache Commons Lang, Apache Xerces, the Adobe RDS Client, and the supporting plugins that they require. The actual file names are:

  • com.adobe.coldfusion.rds.client_1.0.266425.jar
  • javax.wsdl_1.6.2.v200806030405.jar
  • org.apache.commons.lang_2.3.0.v200803061910.jar
  • org.apache.xerces_2.8.0.v200803070308.jar
  • org.apache.xml.resolver_1.1.0.v200806030311.jar
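To double-check that all five jars from the list above actually landed in the plugins directory, something like the following could be used. It is a sketch only: missing_plugins is a hypothetical helper name, and the file names are simply the ones listed above.

```shell
#!/bin/sh
# missing_plugins is a hypothetical helper. It prints every jar from the
# list above that is NOT present in the given plugins directory.
# Usage: missing_plugins <plugins-directory>
missing_plugins() {
    plugins_dir=$1
    for jar in \
        com.adobe.coldfusion.rds.client_1.0.266425.jar \
        javax.wsdl_1.6.2.v200806030405.jar \
        org.apache.commons.lang_2.3.0.v200803061910.jar \
        org.apache.xerces_2.8.0.v200803070308.jar \
        org.apache.xml.resolver_1.1.0.v200806030311.jar
    do
        # Print the name only when the file is absent.
        [ -f "$plugins_dir/$jar" ] || echo "$jar"
    done
}

# Example: missing_plugins /usr/lib/eclipse/plugins
```

No output means every jar was found.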

Step 3

Extract the FB4Linux.tar.bz2 file you created in step 1 to a convenient location.

Step 4

In Eclipse select Window->Preferences->General->Capabilities and make sure that the Classic Update option is checked.

Step 5

Now click Help->Software Updates->Find and Install.

Step 6

Select Search for new features to install and click Next.

Step 7

Click the New Local Site button, and type in the location where you extracted FB4Linux in step 3 (you will need to specify the eclipse subdirectory to be specific).

Step 8

With the Adobe Flash Builder 4 site selected, click the Finish button. You will be asked which features to install. Tick the Adobe Flash Builder 4 feature and click Next.

Step 9

Accept the terms and click the Next button.

Step 10

Click the Finish button to complete the install.

Step 11

Click Yes to restart Eclipse.

Step 12

You will need to download and extract a copy of the Flex SDK from here. You can set Eclipse to use the new SDK by right clicking on your Flash project and selecting Properties. Select the ActionScript Compiler option.

You can then click the Add button to add the location of the Flex SDK you just extracted.

Step 13

Finally you will need to configure the external tools to run your Flash application. Click on the Run toolbar menu item and select the External Tools Configuration option.

Right click on the Program option and select the New option. Then fill out the Location and Arguments settings to point to your Flash standalone player (which you can download here) and SWF file respectively.

Step 14

Congratulations. You should now have a fully functional Flash development environment in Linux!

 

Step 15 – A breath of “AIR”

(some people suggest using Eclipse Galileo for better AIR development compatibility)

If you are going to build AIR applications you’ll encounter two errors:

1st: the New Flex Project wizard never completes (the dialog window cannot be finished).

This happens because a file called “descriptor-template.xml”, which is needed for Flex applications, is missing. You can correct it with the following trick, but it will trigger the second error anyway, which you will have to correct afterwards.

Locate the file (it is bundled with the Flex SDK):

pedro@io:~$ locate descriptor-template.xml

/home/pedro/Apps/flex_sdk_4.1.0.16076_mpl/templates/descriptor-template.xml

Copy it to the folder where Eclipse seems to be looking for it (I always start Eclipse from the command line so I can look at runtime exceptions and errors):

pedro@io:~$ cp /home/pedro/Apps/flex_sdk_4.1.0.16076_mpl/templates/descriptor-template.xml /home/pedro/Apps/flex_sdk_4.1.0.16076_mpl/templates/air/
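The same copy can be written without my hard-coded paths. This is a sketch under the assumption that your Flex SDK has the same layout as mine (a templates folder holding descriptor-template.xml); fix_air_template is a hypothetical helper name.

```shell
#!/bin/sh
# fix_air_template is a hypothetical helper; pass it the root of your Flex SDK.
# Usage: fix_air_template <flex-sdk-dir>
fix_air_template() {
    flex_sdk=$1
    src="$flex_sdk/templates/descriptor-template.xml"
    dest_dir="$flex_sdk/templates/air"
    if [ ! -f "$src" ]; then
        echo "descriptor-template.xml not found under $flex_sdk/templates" >&2
        return 1
    fi
    # Create the air folder if it is absent (assumption: in the SDK used
    # in this post it already existed).
    mkdir -p "$dest_dir"
    cp "$src" "$dest_dir/"
}

# Example: fix_air_template "$HOME/Apps/flex_sdk_4.1.0.16076_mpl"
```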

Now the wizard completes, but when you compile and run the debug program it will generate an “ApolloLaunchDelegate?.fileDoesNotExist!” error.

2nd: how to resolve the “ApolloLaunchDelegate?.fileDoesNotExist!” error.

This is very easy: you simply have to download the Adobe AIR SDK for Linux and copy it into the Flex SDK folder (selecting “merge” in the copy settings).

Done.
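If you prefer the command line to the file manager's “merge” option, the copy might look like this. It is a sketch, assuming the AIR SDK has already been extracted into its own folder; merge_air_sdk is a hypothetical helper name, and backing up the Flex SDK first is a sensible precaution since files with the same path get overwritten.

```shell
#!/bin/sh
# merge_air_sdk is a hypothetical helper.
# Usage: merge_air_sdk <extracted-air-sdk-dir> <flex-sdk-dir>
merge_air_sdk() {
    air_sdk=$1
    flex_sdk=$2
    # "dir/." copies the contents of the AIR SDK folder into the existing
    # Flex SDK tree: files with the same path are overwritten, everything
    # else already in the Flex SDK is left alone.
    cp -R "$air_sdk/." "$flex_sdk/"
}

# Example (both paths are placeholders):
# merge_air_sdk "$HOME/Apps/air_sdk" "$HOME/Apps/flex_sdk_4.1.0.16076_mpl"
```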

 

Acknowledgements and original post-source

Based on:  http://www.brighthub.com/hubfolio/matthew-casperson/articles/78818.aspx and my own changes

I also suggest reading: http://www.len.ro/2011/01/fb4-on-linux/

Many thanks to Eshangrao for developing this Eclipse plugin and to Matthew Casperson for the tutorial post.

 

Update on my publication list

11 Dec

Publications

Articles in International Conferences

1) Multitouch Djing Surface, Pedro André Santos Afonso Lopes, 2010, ACE ’10: Proceedings of the 2010 ACM SIGCHI international conference on Advances in Computer Entertainment Technology (BibTeX)

Multitouch Djing Surface (paper) ( ace2010-pedrolopes-mtdjing.pdf)

 


2) Battle of the DJs: an HCI perspective of Traditional, Virtual, Hybrid and Multitouch DJing, Pedro André Santos Afonso Lopes, 2010, NIME ’11: Proceedings of the 2011 New Interfaces for Musical Expression, Oslo (To appear)

 


Articles in National Conferences

3) Interação Multi-toque no contexto do DJing, Pedro André Santos Afonso Lopes, Alfredo Ferreira, Joao Antonio Madeiras Pereira, 2010, Interacção 2010 (BibTeX)

Interação Multi-toque no contexto do DJing (paper) ( interaccao2010-pedrolopes-mtdjing.pdf)

 


4) Trainable DTW-based classifier for recognizing feet-gestures, Pedro André Santos Afonso Lopes, Guilherme Fernandes, Joaquim Jorge, 2010, 16th Portuguese Conference on Pattern Recognition (RecPad 2010) (BibTeX)

Trainable DTW-based classifier for recognizing feet-gestures (paper) ( pedrolopesdtw-based_recopad2010.pdf)

 


5) Transhumance, Pedro André Santos Afonso Lopes, 2010, Future Places – Digital Media and Local Cultures (BibTeX)

Transhumance (page 169 of the Proceedings) ( futureplaces2010.pdf)

 


Thesis

6) Mt-DJing: Multitouch DJing Table, Pedro André Santos Afonso Lopes, 2010, Instituto Superior Técnico (BibTeX)

MtDjingPedroLopes.pdf ( MtDjingPedroLopes.pdf)

 


(updated from the IST official website)

ACE 2010: Multitouch Djing table

5 Dec

Here is the paper, presented in the Interactivity Session of the International Conference on Advances in Computer Entertainment 2010 (ACM & SIGCHI).

Add metadata information to a PDF file (to my thesis!) with pdftk

25 Nov

This uses pdftk (THE pdf toolkit), an open source toolkit to manipulate and do magic tricks with PDF files.

Most PDF files store meta-information (for example, about the author, the subject of the file or the software used). With pdftk you can print this information to standard output or save it to a file:

> pdftk example.pdf dump_data
> pdftk example.pdf dump_data output info.txt

The info.txt file now contains all the metadata of the document. It is composed of key fields and their values:

InfoKey: Creator
InfoValue: TeX
InfoKey: Producer
InfoValue: pdfTeX-1.10b
InfoKey: CreationDate
InfoValue: D:20041226182200
NumberOfPages: 4
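Since the dump is just alternating InfoKey and InfoValue lines, a single field can be pulled out with a few lines of awk. This is a sketch only: info_value is a hypothetical helper of mine, not a pdftk feature, and it assumes each InfoValue line follows its InfoKey line as in the listing above.

```shell
#!/bin/sh
# info_value is a hypothetical helper, not a pdftk feature.
# Usage: info_value <key> <dump-file>
info_value() {
    awk -v key="$1" '
        # Remember the most recent key seen so far.
        $1 == "InfoKey:" { current = $2; next }
        # Print the value that belongs to the requested key, then stop.
        $1 == "InfoValue:" && current == key {
            sub(/^InfoValue: */, "")
            print
            exit
        }
    ' "$2"
}

# Example:
# pdftk example.pdf dump_data output info.txt
# info_value Producer info.txt
```

If the key is not present, nothing is printed.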

Back up that file, because we will edit a copy of it and then pass it back to pdftk to write the metadata into the PDF file. For example, add entries such as:

InfoKey: Title
InfoValue: Mt-Djing: multitouch DJ table
InfoKey: Subject
InfoValue: Dissertation for Master degree
InfoKey: Keywords
InfoValue: DJing, NUI, multitouch, user-centered design
InfoKey: Author
InfoValue: Pedro Lopes

This file does not have to contain every piece of information a PDF file can store. Fields that are already set remain unaffected by the update if they are not included in the text file. You can also define additional key fields (e.g. “site”) and assign values to them. The meta-information is updated with:


> pdftk example.pdf update_info info_neu.txt output beispiel_metadaten.pdf

The output file and the input file must not have the same name. With a small shell script or batch file you can rename the output file afterwards.

MtDjing Paper presentation at ACE 2010 (tomorrow)

17 Nov

Here is the programme for ACE 2010. We will be presenting Multitouch Interface for DJing tomorrow in Taipei, Taiwan (see photo below).

(Taipei Hwahsi Jie Night Market)

Thesis Defence

30 Oct

My thesis defence is scheduled for 3rd November, 12h, IST – Alameda.

(Abstract)

Disc-jockeys have come a long way, through technological evolutions. This path led them to the status and recognition they have achieved in our society. But as impressive as those technological evolutions are, as far as DJing is concerned, there are still few applications that support hands-on interaction over virtual DJ applications, and those are typically reduced to the traditional input devices.

Furthermore, recent proposals apply new DJ metaphors, but are not successful in maintaining coherence with traditional DJ lexicon of gestures and concepts, thus requiring an extensive learning period.

Accounting for user feedback from an accompanying group of DJ experts, we devised a multitouch solution that maps physical core DJ concepts into a virtual application. In the proposed solution, gestural interaction from Traditional Setups remains coherent while inheriting advantages from the virtualization of the DJing domain. The system addresses typical requirements of contemporary DJing, as identified by our research, namely: robustness, low-latency, external control, adaptability, audio plugins and connectivity standard-compliance.

Additionally, we support task improvements, such as dynamic re-routing of music flow, seamlessly merging edit with DJ-performance mode and rearrangement of interface layout according to user needs. Our system allows DJs to exercise creativity with a natural interaction, creating scenarios that are not possible in the real world.

Comparison against Traditional, Virtual and Hybrid DJ setups showed how a multitouch DJing surface developed with DJ involvement can suit experienced users changing to Virtual setups.

A new foot controller for MtDjing @ RecPad 2010

30 Oct

Presenting our DTW-based classifier for gesture recognition @ RecPad2010

(Download the paper here.)

Next conference: RecPad 2010

22 Oct

I will be presenting a poster at RecPad 2010, alongside Guilherme Fernandes. We will present our trainable DTW-based classifier for feet-gesture recognition, which we built on a foot-controller device that allows controlling the Mt-Djing application, as shown below.

(Controlling Mt-Djing with feet gestures)

The program is available now here.

Presentation of paper at Interacção 2010

22 Oct

Pedro Lopes presents MtDjing

(download the paper)
