Tag Archives: NUI

#ITS2011: Augmenting Touch Interaction Through Acoustic Sensing

9 Nov

Pedro Lopes, Ricardo Jota, Joaquim Jorge
[Published at ACM ITS 2011, Kobe, Japan / paper]

Recognizing how a person actually touches a surface has generated a strong interest within the interactive surfaces community. Although we agree that touch is the main source of information, unless other cues are accounted for, user intention might not be accurately recognized. We propose to expand the expressiveness of touch interfaces by augmenting touch with acoustic sensing. In our vision, users can naturally express different actions by touching the surface with different body parts, such as fingers, knuckles, fingernails, punches, and so forth – not always distinguishable by touch technologies but recognized by acoustic sensing. Our contribution is the integration of touch and sound to expand the input language of surface interaction.

Update: some media coverage in the press [Portuguese press / Exame Informática]

Battle of the DJs: an HCI perspective of Traditional, Virtual, Hybrid and Multitouch DJing

16 Apr

To be presented at NIME 2011 (New Interfaces for Musical Expression, Oslo). The remainder of the program is very interesting, so please feel free to look around here.

What about DJing?

How does it fit within an HCI perspective?

What forms of interaction exist with DJ gear?

How can one classify those interactions/gear?

The following paper addresses these questions, trying to create a framework of thought concerning DJ interactions – a very idiosyncratic type of “music interfaces” – and proposes an evaluation of Traditional (analogue turntables, mixers and CD players), Virtual (software), Hybrid (traditional and software in synergy) and Multitouch (virtual and traditional in synergy).

The term multitouch here defines not a categorization but rather an implementation of a touch-sensing surface DJing prototype. Explorations of new interfaces for DJing have been happening for at least 10 years, but an evaluation from this perspective (interactions and devices) can help researchers and developers (as well as DJs) understand which tools fit which style.

Possibilities#1: Interface Manipulation

3 Feb

There are several ways I can go in implementing the interface for the proposed work (The Multitouch DJing Tabletop); here I take some time to consider them all:

a) OpenGL (THE realtime graphics standard!)

Advantages: Using OpenGL is very straightforward and it’s available in various forms (3D/2D engines, various object-oriented languages and so on); it is by far the most efficient way to manipulate visual forms (especially in native code). Some of the related work uses OpenGL extensively; a useful example is the hybrid solution Mixxx.

Disadvantages: Although it is not a disadvantage “per se”, in the short time we have for developing our work, OpenGL stands as a hard API to learn, so development would be slower than with the fast-prototyping graphics environments.

b) ActionScript (Flash, the “new” phenomenon)

Advantages: Flash is the new “de facto” standard for in-browser graphics (thanks largely to the social web and phenomena like YouTube or MySpace, which use Flash to deliver their media streams). Although I’m not a fan of closed languages (Flash is not Open Source!), it has gained a very special place in the world. One of its biggest advantages is that it bridges design and programming very easily, so an interface developed in ActionScript can easily be handed to a designer for restyling… which doesn’t happen that easily in OpenGL. Another advantage is the extensibility of the language, which is very oriented to prototyping interactive systems!

Disadvantages: As said, it is not Open Source. The official Flash Player cannot ship as part of a pure open source, or completely free, operating system, as its distribution is bound to the Macromedia Licensing Program and subject to approval. These are its main disadvantages. Also, it is NOT as fast an environment as OpenGL: it is interpreted and object-oriented. It serves smaller purposes (usually 2D and not heavily illuminated/post-processed scenes) but is suitable for interactive purposes.

c) Java Swing (interfacing with coffee beans!)

Advantages: Java speaks for Swing as far as advantages go; it inherits the good stuff from the Java beans! Java is also an interpreted, virtual-machine language, platform independent and one of the best object-oriented languages mankind has conceived. It is fully extensible and very flexible when it comes to coding. Swing is a GUI/windowing toolkit for Java developers and can easily be used with new standards such as OSC (through SwingOSC), which comes in handy for our project. Because it draws its power from Java classes, it can be highly complex (concurrency, listeners, etc…).

Disadvantages: The virtual machine makes Java perform slower than native languages. It is not directed at interactive systems (like ActionScript is) but has the expressive power to address them.

d) Pyglet (Python says: “Graphics!”)

Advantages: Python is one of those newer languages that simply does a lot; extended with Pyglet, it can easily produce graphics for smaller games or UIs. The advantages come directly from Python’s expressiveness and its full portability to various systems (it has also been ported to Java and .NET recently). It serves as a great bridge in projects that use multiple technologies.

Disadvantages: I don’t know how mature Pyglet is (simply because I’ve never worked with it). Also, as a non-native language, Python is not as fast as the more efficient solutions.

Platform Independence

All of the solutions proposed here are platform independent; this is of course a mandatory requirement that we must fulfill in order to achieve maximum portability/modularity.

All solutions are open and free, except ActionScript (more details on how to use fully OpenSource tools for Flash very soon!).

Computer Vision – Tbeta first analysis…

12 Jan

As previously noted on the blog, I’ve already built a prototype (with a very, very DIY look and components) of a multitouch table – using Tbeta as the vision software.

Community Core Vision, or CCV for short (also known as tbeta), is an open source/cross-platform solution for computer vision and machine sensing.

It takes a video input stream and outputs tracking data (e.g. coordinates and blob size) and events (e.g. finger down, moved and released) that are used in building NUI-aware applications.

CCV can interface with various web cameras and video devices as well as connect to various TUIO/OSC/XML enabled applications and supports many multi-touch lighting techniques including: FTIR, DI, DSI, and LLP with expansion planned for the future vision applications (custom modules/filters).
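
To make the “finger down, moved and released” events a bit more concrete, here is a minimal sketch of how a client application might derive them from the tracker output. It assumes the messages follow the TUIO 1.x 2Dcur profile (“alive” lists the active session ids, “set” carries id, x, y, velocities and acceleration) and that the OSC decoding has already been done by some library; the CursorTracker class and the event tuples are hypothetical names for illustration, not part of CCV.

```python
class CursorTracker:
    """Turns decoded TUIO 2Dcur bundles into down/moved/released events."""

    def __init__(self):
        self.cursors = {}  # session id -> last known (x, y)

    def process_bundle(self, messages):
        """Each message is the argument list of one /tuio/2Dcur message."""
        alive = None
        updates = {}
        for args in messages:
            command = args[0]
            if command == "alive":
                alive = set(args[1:])
            elif command == "set":
                # set s x y X Y m: only the session id and position are used here
                session_id, x, y = args[1], args[2], args[3]
                updates[session_id] = (x, y)

        events = []
        for session_id, position in updates.items():
            if session_id not in self.cursors:
                events.append(("down", session_id) + position)
            elif self.cursors[session_id] != position:
                events.append(("moved", session_id) + position)
            self.cursors[session_id] = position

        # A cursor we know about that is no longer listed as alive was lifted.
        if alive is not None:
            for session_id in sorted(set(self.cursors) - alive):
                events.append(("released", session_id) + self.cursors.pop(session_id))
        return events
```

Note that the protocol itself only reports state (which cursors are alive and where they are); the down/moved/released events come from comparing each bundle with the previous one, which is what the dictionary is for.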


  • Simple GUI – The new interface is more intuitive and easier to understand and use.
  • Filters (dynamic background subtraction, high-pass, amplify/scaler, threshold) – This means it’ll work with all optical setups (FTIR, DI, LLP, DSI). More filters can be added as modules.
  • Camera Switcher – Have more than one camera on your computer? Now you can press a button and switch to the next camera on your computer without having to exit the application.
  • Input Switcher – Want to use test videos instead of a live camera? Go ahead, press a button and it’ll switch to video input.
  • Dynamic Mesh Calibration – For people with small or large tables, now you can add calibration points (for large displays) or create less points (smaller displays) while maintaining the same speed and performance.
  • Image Reflection – Now you can flip the camera vertical or horizontal if it’s the wrong way.
  • Network Broadcasting – You can send OSC TUIO messages directly from the configapp for quick testing.
  • Camera and application FPS details viewer – Now you can see the framerate of both the tracker and camera that you’re getting.
  • GPU Mode – Utilize your GPU engine for accelerated tracking.
  • Cross-platform – This works on windows, mac, and linux.
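
To give a rough idea of what a filter chain like the one listed above does to a camera frame, here is a small sketch in plain Python. It is only an illustration under simplifying assumptions, not CCV’s actual code: frames are nested lists of 0-255 grey values, the background is static rather than dynamic, the high-pass step is left out, and all function names are made up for the example.

```python
def subtract_background(frame, background):
    """Remove the static background image; negative results are clamped to 0."""
    return [[max(p - b, 0) for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def amplify(frame, gain):
    """Scale the remaining signal, clamped to the 0-255 range."""
    return [[min(int(p * gain), 255) for p in row] for row in frame]


def threshold(frame, level):
    """Turn the image into pure black and white."""
    return [[255 if p >= level else 0 for p in row] for row in frame]


def centroid(binary):
    """Return the (x, y) centre of all white pixels, or None if there are none."""
    points = [(x, y) for y, row in enumerate(binary)
              for x, p in enumerate(row) if p]
    if not points:
        return None
    count = len(points)
    return (sum(x for x, _ in points) / count,
            sum(y for _, y in points) / count)


def track(frame, background, gain=3, level=100):
    """Run the whole chain and return the position of the bright spot, if any."""
    cleaned = subtract_background(frame, background)
    return centroid(threshold(amplify(cleaned, gain), level))
```

A real tracker would separate several blobs and follow them over time; this sketch only finds the centre of everything that survives the threshold.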

Software License: CCV is currently released under the GPL License; however, we are still considering alternatives that may be helpful in code reusability and development, such as LGPL, MIT or BSD.


Founder: Christian Moore, Seth Sandler
Developer: Artem Titoulenko, Charles Lo, citi zen, Daniel Gasca S., Ian Stewart, Kim ladha, Mathieu Virbel, Taha Bintahir
Observer: Anirudh Sharma, Boyan Burov, David Wallin, Davide Vittone, Gabriel Soto, Gorkem Cetin, Guilherme Sette, Paolo Olivo, Sashikanth Damaraju, Sharath Patali, Thiago de Freitas Oliveira Araujo, Thomas Hansen, Tiago Serra

Gesture Research #1 (multitouching…)

28 Oct

So I gathered a collection of gestures (which will later turn into my pool of tests to analyse user behaviour – prior to defining the gestures for my DJ system) by observing multiple systems and multitouch demos:

a) Reactable

(note: the Reactable relies mainly on tangible objects, but parameter changes are done via gestures/touch, so it’s an interface worth researching...)


Gathered info from: Luckily, I’ve used the Reactable once – after a concert/showcase integrated in a festival where I played with Whit – and got first-hand experience with the incredible interface. I’ve also seen it performed live twice in Portugal and there’s a bunch of videos out there for analysis. On a more technical note, you can access the articles on the Reactable project at the archive page from UPF.

b) tbeta demos

(e.g.: the very popular photo demo, Google Maps navigation, etc…)




Gathered info from: the demos package available on the NUI group page.

c) surface

(note: the surface has pretty much the same gestures as the rest of MT tables)


Gathered info from: Microsoft is a closed one, but I’ve tried Surface recently at the Mix event – where I saw the talk by August de los Reyes (director of the project) – and had my first-hand experience with the product there. Also, there are a lot of videos from which to analyze the types of gestures mainly used, but there’s a void of technical information available.

d) iPhone and similar PDA/mobile devices


Gathered info from: Recently I participated in the Future Places festival and on the final day I played with THE FUTURE PLACES IMPROMPTU ALL-STARS ORCHESTRA, which gathers many artists to improv with music – luckily there were two guys playing with iPhone apps – one using sounds to feed an analogue synth and the other using a touch app that sequenced music and sounds. Watching them interact with those portable “instruments” told me a lil’ something about the gestures.

And: Also, I’ve seen and tried for myself iPhone apps that use gesture recognition to extend the capabilities of touch.

Some thoughts on MT inspired by August de los Reyes (pt.1)

6 Oct

Last Friday (2nd October) I attended Remix 09 (many thanks to Andreia for the invitation), a conference day promoted by (and promoting) Microsoft, targeting the world of web design and future interfaces.

My interest was drawn by a talk about the Microsoft Surface project; especially because Microsoft tends to hide specs and implementation, I was wondering what it would be about. The talk was Predicting the Past: A Vision for Microsoft Surface, so I expected a lil’ more self-promotion.

Actually, it turned out to be a very interesting talk (congrats to August on that) with the emphasis being on the history of Computer Interfaces, rather than on the Surface. So I’ll lay out here some thoughts and notes that I took during the talk:

The evolution…

The evolution of hardware and software pretty much opened the possibilities for expanding the interfaces with virtual systems (i.e.: computers). First we had the CLI (command line interface), which is an all-time favorite for many but is not suited to most tasks nowadays (especially the ones that need interaction). With the invention of graphical stations for the computer, we arrived at the GUI (Graphical User Interface – which now features the word “User” in its name). The GUI still had a long way to go, from Douglas Engelbart’s early work at SRI and the research at Xerox PARC all the way to our sexy-looking-favorite-OS.

Why is GUI not enough?

The key here is to look at people (the users) and their tasks. Sure, the mouse isn’t a bad invention (it’s actually very good and precise once you learn it) but it sure isn’t intuitive, nor is it made for every task. Can you imagine a painter drawing a picture with a mouse? That’s the answer.

The NUI.

So the next generation of interfaces took everything to the physical level. Natural User Interface (NUI) is the term for the many forms of interacting with virtual systems using our own body or moving objects (called tangible objects, or tangibles for short).

We shouldn’t forget many inventions that cleared the path, and showed us that many tasks need more than a keyboard and a mouse: gaming controls (joysticks, guns, etc…), sketch pads (digital drawing tools), digital modeling pens (3d modeling tools of today), MIDI interfaces (we soon realized that we cannot make real-time music with a mouse), and so on…

But NUI set itself apart by creating a new metaphor that is almost non-existent. With all these devices (mouse, keyboard, pen, and so on) we are creating a metaphor for simulating a physical movement that will translate into a different virtual action. But NUI brings things closer together: an interaction in a NUI is much more familiar and real because the metaphor is embedded in your consciousness.

Mental Mode… Physical <-> Cognitive

A thing that August mentioned was the difference between the mental modes of these three interface levels:

CLI: disconnected, you have to type certain “codes” to make things happen.

GUI: indirect, you are indirectly making things move and go.

NUI: unmediated, nothing is between your mind and the interaction level, you have the commands in your mind.

*IU (next generation): The physical and cognitive will be together to assume a higher level of interaction.

Interaction type…

A very (very!) interesting thing mentioned at Remix is how these various layers differ when it comes to interacting with the system to perform a task:

CLI: directed, you express a very direct command (that you must be aware of) and specify everything by text.

GUI: exploratory, in graphical systems we have the possibility to explore the world and discover how to perform the tasks using the given metaphors (grab, drag, etc..) and the simulated physical tool (mouse).

NUI: contextual. Interaction in a natural interface is always bound to context. If you are trying to move a dot from one point of the screen to the other, you drag it by hand (intuition/familiarity/social-aspect one may say…) but if you are trying to draw a line from one point to the other you will draw a line with your finger (which is exactly the same action – but in a different context).
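
The “same action, different context” idea can be sketched in a few lines of Python. This is just my own toy illustration of the point above, assuming a scene made of dots and lines and an explicit context value; the names Context and handle_drag are hypothetical and do not come from the talk or from any toolkit.

```python
from enum import Enum


class Context(Enum):
    MOVE = "move"
    DRAW = "draw"


def handle_drag(scene, context, start, end):
    """Interpret one and the same drag gesture according to the current context."""
    if context is Context.MOVE:
        # Dragging a dot relocates it (nothing happens if no dot sits at start).
        if start in scene["dots"]:
            scene["dots"][scene["dots"].index(start)] = end
    elif context is Context.DRAW:
        # The very same finger movement now leaves a line behind.
        scene["lines"].append((start, end))
    return scene


scene = {"dots": [(0, 0)], "lines": []}
handle_drag(scene, Context.MOVE, (0, 0), (5, 5))  # the dot is now at (5, 5)
handle_drag(scene, Context.DRAW, (5, 5), (9, 9))  # a line from (5, 5) to (9, 9)
```

The gesture data (start and end point) is identical in both calls; only the context decides what it means.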

Final note

1) I still have some more on this, which will probably be posted tomorrow and will still revolve around the concepts behind NUI – especially regarding multi-touch on a table (this is just part 1 of probably 3).

2) Also, August showed an interesting way to learn/study how users would like to interact with a certain task performed in an MT environment: a series of workshop-user-tests with just plain paper. [very interesting for my work]