Eye-Balls: Computer Vision in the Circus

ABSTRACT

This proposal is for a discussion of the HCI and artistic performance issues presented by the use of interactive systems in live performances, with reference to a novel system designed to track a juggling performer and create audio visual accompaniments based on their movements.

SUMMARY

This project explores the use of computer vision technologies in artistic performance, with a particular focus on the circus arts. In order to test the ideas, a vision based tracking system has been developed, to track a person during a juggling performance, and create live audio & visual elements based on their movements.

Historically, many technical innovations, such as prerecorded music, and sequenced lighting effects, have reduced the interactivity and flexibility of performance, removing the performer’s ability to improvise. The main motivation behind this work is to allow performers flexibility in their performance, by driving external elements of the performance directly from the performer’s movements, rather than them having to perform in time to a fixed accompaniment. This flexibility has the added advantage of allowing the performer to interact with the audio & video, which arguably allows it to become more than just an accompaniment to the performance itself, and become integrated into the performance as a whole.

The use of vision technology removes constraints on the movement of the body that are inherent to traditional methods of interacting with computers or performance systems. This is very important in physical performances, where the body is used for the performance, but raises new challenges for interaction design, both in terms of the visual interface itself, and in terms of how to integrate the use of the interface into a performance. In order to enable the exploration of these processes, the interactions and output of the juggling tracker system are controlled by scripts which allow rapid prototyping of new interaction ideas.

This configurability allows a wide range of performances to be created, including multi-part performances which use different configurations of the system. This presents further challenges: how to orchestrate the configuration changes using the vision system, and how to integrate this control within the performance.

This technology has been developed in an iterative process, including workshops with professional and amateur performers, and a public performance. This has given us greater insight into the practical and conceptual issues relating to the use of this kind of system in a live performance setting. The workshops have also suggested possibilities for the system which we had not envisaged, and we believe they have in turn inspired jugglers to investigate new areas of their practice which became interesting when interacting with our system.

BIOGRAPHY

Joe Marshall is a second year PhD student in the Mixed Reality Laboratory, University of Nottingham, UK. The title of his project is Computer Vision for Performance.

Joe has a BA in Computer Science from the University of Cambridge. His BA project was a real-time audio processing tool for use in live performance. Prior to his PhD, he developed music software for Yamaha and Sibelius, including artificial intelligence inspired work on assisted composition & arrangement (part of which became the Sibelius Arrange feature). This experience led to an interest in designing systems targeted at artistic users, and the challenges involved in creating user interfaces for complex algorithms.

This project combines Joe’s technological interests with his interest in performance. Joe is particularly involved in circus skills, most notably juggling. He also briefly held a world record for long distance unicycling. When not doing circus related things, he is also a keen pianist.

JOE MARSHALL: GRADUATE SYMPOSIUM PROPOSAL

This document is in three parts. The first describes the technology we have developed, and the second describes the design process used. Finally, the performance and HCI related issues we have identified are discussed; this is the primary area we are interested in discussing in the symposium.

THE TECHNOLOGY: JUGGLING TRACKING SYSTEM

The juggling tracker system uses a camera, a laptop computer, a projector and a form of audio output, such as a P.A. system. The system uses the camera to detect the position of a juggler’s arms and head, and to detect the movement of multiple balls that the performer is holding or juggling. This is done using a Particle Filter / Condensation [7] tracker, with some custom modifications to allow it to track the multiple objects quickly (a detailed description of this algorithm is given in [8]). A sample input frame and the detected positions are shown in Figure 1.
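As a rough illustration of the particle filter idea, the following minimal sketch tracks a single 2D point using the basic predict / measure / select cycle. It is not the modified multi-object tracker described in [8]: the random-walk motion model, the Gaussian likelihood, the noise values and the function names are all assumptions chosen for illustration, and the "observation" is a ready-made point rather than a measurement taken from a camera image.

```python
import math
import random


def condensation_step(particles, observation, rng,
                      motion_sigma=8.0, obs_sigma=10.0):
    """One predict / measure / select cycle for a single 2D point target."""
    # Predict: diffuse every particle with Gaussian noise (assumed random walk).
    predicted = [(x + rng.gauss(0.0, motion_sigma),
                  y + rng.gauss(0.0, motion_sigma)) for x, y in particles]

    # Measure: weight each particle by an assumed Gaussian likelihood
    # of the observed position given the particle position.
    ox, oy = observation
    weights = [math.exp(-((x - ox) ** 2 + (y - oy) ** 2)
                        / (2.0 * obs_sigma ** 2)) for x, y in predicted]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed: fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))

    # Estimate: weighted mean of the particle positions.
    est_x = sum(w * x for w, (x, _) in zip(weights, predicted)) / total
    est_y = sum(w * y for w, (_, y) in zip(weights, predicted)) / total

    # Select: resample particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, (est_x, est_y)


def track_demo(steps=30, n_particles=500, seed=1):
    """Track a point moving in a straight line across a 640x480 frame."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0, 640), rng.uniform(0, 480))
                 for _ in range(n_particles)]
    truth = estimate = None
    for t in range(steps):
        truth = (100.0 + 5.0 * t, 300.0 - 4.0 * t)  # hypothetical ball path
        particles, estimate = condensation_step(particles, truth, rng)
    return truth, estimate
```

Calling `track_demo()` returns the final true position and the filter's estimate of it. Because the assumed motion model has no velocity term, the estimate lags slightly behind a moving target.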

These positions are then input into a script based display system, which creates interesting audio and visual output which may be controlled based on the movement of the performer or balls. A very simple example script is shown in Figure 2, which creates a motion blur effect of coloured streaks following the ball patterns. When only one ball is juggled, the script switches into a mode where the juggler can ‘paint’ pictures on the display by moving the ball.

Simplified Tracking System

As well as the main tracking system, a simplified tracking system has been developed, which uses ‘GloBalls’ (balls containing LED lights) [2]. These are juggled in darkness, which makes them very easy to track. Because it is dark, this tracker cannot see the body of the juggler, so the juggler’s position can only be inferred by the scripts. However, it is useful for performance work, as it does not require calibration for lighting levels and so is very fast to set up. It is also very efficient, which allows more processor intensive audio and visual effects to be produced.
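To show why glowing balls in darkness are easy to track, here is a minimal sketch of one plausible approach: threshold the frame for brightness, group bright pixels into connected regions, and report each region's centroid. The proposal does not say how the simplified tracker works, so the technique, the threshold value and the function name are all assumptions.

```python
def find_bright_blobs(frame, threshold=200):
    """Return centroids (x, y) of 4-connected regions at or above threshold.

    `frame` is a list of rows of grey levels (0-255).
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or frame[y0][x0] < threshold:
                continue
            # Flood fill from this bright pixel to collect the whole blob.
            stack = [(x0, y0)]
            seen[y0][x0] = True
            sum_x = sum_y = count = 0
            while stack:
                x, y = stack.pop()
                sum_x += x
                sum_y += y
                count += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and frame[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            centroids.append((sum_x / count, sum_y / count))
    return centroids
```

In a dark room almost every pixel falls below the threshold, so no lighting calibration or background model is needed and the per-frame cost is a single pass over the image.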

Related Art / Performance Technical Work

Video based interactive art has existed for several years, from early video camera installations [9], to David Rokeby’s interactive audio installations using primitive computers and custom built cameras [13]. Later, more sophisticated systems such as the MIT ‘DanceSpace’ [16] created active environments for performance. These artworks generally performed a pre-set translation from input to output, which limited their application to longer performances. Several recent systems, such as EyesWeb [4] and EyeCon [10], aim to create generic vision systems for use in performance which can be configured for a particular performance domain. Our system instead targets the specific domain of juggling, which allows the use of customised tracking algorithms that are more accurate.

if( !this.initialised)
{
    // initialise the display
    // draw circles where the balls are
    this.balls=display.Add("BALL_CIRCLE");
    // add motion blur
    this.blurrer=display.Add("MOTION_BLUR",this.balls,225,225);
    // output blurred ball display
    display.Add("output",this.blurrer);
    this.initialised=true;
}
if( ballCount==1)
{
    // only one ball in display:
    // blur length long so performer
    // can paint with the ball
    display.SetParams(this.blurrer,255,255);
}else
{
    // otherwise: blur length short
    display.SetParams(this.blurrer,225,225);
}

Figure 2 A very simple example script, which creates a motion blur effect following the ball patterns.
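The difference between the two blur settings in the script can be sketched as a feedback buffer that fades a little on every frame. This is an assumption about how such a motion blur might behave, not a description of the real display system: the script passes two numbers to the blur, whose exact meanings are not documented here, and the sketch models only a single persistence value out of 255.

```python
def blur_step(trail, ball_mask, persistence):
    """Fade the old trail by persistence/255, then stamp the current balls."""
    fade = persistence / 255.0
    return [max(old * fade, new) for old, new in zip(trail, ball_mask)]


def brightness_after(frames, persistence):
    """Brightness left at a pixel `frames` frames after a ball crossed it."""
    trail = blur_step([0.0], [1.0], persistence)  # the ball is on the pixel
    for _ in range(frames):
        trail = blur_step(trail, [0.0], persistence)  # the ball has moved on
    return trail[0]
```

Under this model a persistence of 255 never fades, so a single ball leaves a permanent line that can be used to paint, while 225 leaves a streak that has dropped to under a third of its brightness within ten frames.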

DEVELOPMENT PROCESS

Workshops & Demonstrations

An iterative design process has been followed for the tracking system. Workshops involving performers have been an important part of this process. Three workshops were run, and the system was also demonstrated to the public at the GameCity Lab event [1], an event exploring research into game and play related technology.

The first workshop was with a group of local jugglers, who are hobbyists rather than professional performers, and as such had variable levels of performance experience, from simply juggling for personal enjoyment, to large scale public performances. This workshop was run under lab conditions, in order to avoid any problems adapting to new environments, which were outside the scope of this preliminary testing. A video camera was set up pointing at a juggling space, with the projection screen output from the system displayed in front of the juggler, offset towards their left hand side. The performers were encouraged to play with the system and to explore a set of demonstration scripts, which were updated in various ways as new ideas came out during the workshop.

Secondly, some time was spent with a highly skilled and internationally renowned professional juggler, who has a large amount of experience of juggling performance. This session was run at an external studio, using a performer owned computer and video camera due to practical constraints. This allowed us to explore the issues involved when moving the system into a new environment, as well as further exploring the issues and ideas brought up in the first workshop. It also gave us a greater insight into the needs of a full-time performer.

In the final workshop, the system was taken to a local juggling club meeting for less controlled testing. This and the GameCity Lab event were both very different from the previous workshops, as they were both in very uncontrolled situations, where issues of ‘crowd control’ were more apparent. However, the positive side of this unconstrained use of the system was an exposure to more exploratory uses, and people interacting with the system in ways that weren’t anticipated.

Performance

The most recent test of the system was a short public performance at a local cabaret. Due to the need to set up very quickly (there was approximately 30 seconds for setup of the projector, camera and laptop), the simplified tracking system and glowing red balls were used.

The theme of the evening was science fiction, and a performance was created to fit into this theme, based on the story of a man flying into space, and the disillusionment that he feels on returning. The audio used a combination of sound effects from well known space arcade games, and short musical loops sampled from the song First Man In Space, by the band All Seeing I, which were combined to make a seamless soundtrack which automatically altered length depending on when certain parts of the performance were reached. The performance was in multiple sections, from an initial section where a rocket took off, to various sections in space, where the space rocket battled against aliens and flew over the earth, in response to the juggling patterns being performed, to a final scene, where the rocket on the projector crash-landed into the roof of the venue, and the performer used the tracking system to write ‘the end’ on the projection screen.

Audience Response to the Performance

After the performance, informal comments were sought from audience members, which revealed an interesting split in the level of understanding of the performance amongst the audience. The performance was designed to introduce the system gently, with the first scenes making it obvious that the projections and audio were responding to the balls. For some audience members this was clear from the start. However, many only understood that the two were linked interactively part of the way through the performance, and a small number only realised this in the last scene, where the performer wrote by moving a ball around in the air. Interestingly, this did not appear to be related to technical knowledge, but was more strongly related to the individuals’ experience of juggling and juggling acts. It was unclear, however, whether understanding of the interactivity affected enjoyment of the performance.

Figure 3 Two images from the performance. The red patterns on the left of the second image are the glowing balls.

HCI & PERFORMANCE ISSUES AND INSPIRATIONS

I have developed this system to a level where it is robust and usable for live performance, and have carried out testing with it. I am now exploring the HCI and performance issues involved in the use of this kind of system. This area is where I am most in need of feedback and I hope to focus on this in the symposium.

Several issues have come out of the workshops. As well as these, the system also inspired some interesting positive developments from a performance point of view. These issues and inspirations are discussed below.

Related HCI Work

Bellotti et al.'s Five Questions for designers of sensing systems [3] describes issues to be aware of when designing interfaces which do not have traditional input devices such as keyboards and mice. These questions are very relevant to our situation; however, they do not take into account the presence of the audience. Reeves et al. [12] extended this to consider the way in which the interaction with systems occurs when spectators are present. They define a taxonomy which describes how visible to the audience the performer’s interactions with the system are, and also how visible and understandable the mapping between the performer’s interactions and the visible system effects is. These two frameworks have proven particularly useful during the design process of the tracker.

When working with long performances with multiple different sub-performances, the background-foreground model as described by Hinckley et al. [6] was used as inspiration; background interaction is defined as the juggling performance, and movements within a part of a performance; foreground interaction is the movements which cause the script to change state, in order to move to a different part of the performance. In many performances this two level framework is over-simplistic, but it still provides a useful tool for framing the issues relating to these longer performances.

In terms of design of performances from an artistic point of view, as well as the artworks described above, several ideas and frameworks inspired the creation of scripts and performances. In particular, ideas relating to ambiguity in terms of the mappings between inputs and outputs [5] were relevant. The idea of ‘wonderment’/curiosity [11] was also particularly relevant to the design of the demos for the public to interact with at the GameCity Lab.

Mistakes

Juggling is a highly skilled activity, which means that even in professional acts, dropped balls are a possibility. This caused issues when interacting with earlier scripts that we designed. The avoidance of undesirable outputs in this situation (as described in Bellotti et al.’s fifth question [3]) becomes significantly more difficult when a system is responding to the performer’s movements. With the addition of an audience, this was a real issue, as our scripts often highlighted the fact that a drop had occurred.

In addition to this, mistakes made by the system also caused some issues. For example, when the system failed to see one ball, this sometimes led to scripts behaving as if a part of the performance with a different number of balls was being performed.

The use of these scripts in the workshops made these issues clear, and significant work was done in terms of making the scripts resilient to drops and system errors, for example by making the scripts only change state based on an action that was very unlikely to occur in error, such as throwing multiple balls up high, particular ball patterns, or the performer moving to a particular stage position.
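One way a script could guard a state change behind an action that is unlikely to occur in error is sketched below: the trigger fires only when several balls are above a height line for a few consecutive frames, so a single missed detection or a dropped ball cannot set it off. The class name, thresholds and frame counts are hypothetical and are not taken from the actual scripts.

```python
class HighThrowTrigger:
    """Fires once when enough balls stay above a line for several frames."""

    def __init__(self, height_line=100, min_balls=3, hold_frames=3):
        self.height_line = height_line  # image y; smaller y is higher up
        self.min_balls = min_balls
        self.hold_frames = hold_frames
        self._streak = 0
        self._armed = True  # stops the trigger firing again mid-gesture

    def update(self, ball_ys):
        """Feed one frame of ball heights; return True on the firing frame."""
        high = sum(1 for y in ball_ys if y < self.height_line)
        if high >= self.min_balls:
            self._streak += 1
        else:
            # Gesture broken or absent: reset and re-arm for the next one.
            self._streak = 0
            self._armed = True
        if self._armed and self._streak >= self.hold_frames:
            self._armed = False
            return True
        return False
```

During an ordinary pattern only one ball is near the top at a time, and a drop or a missed detection only lowers the count, so neither can satisfy the condition. The cost is a delay of a few frames before the change takes effect.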

System mistakes were also interesting. When the bare display of the tracking was shown, the mistakes that the tracker made inspired some playing with the system. For example, when balls were caught on the elbows, the system often decided that the elbow was one of the user’s hands, which gave an interesting effect of fooling the system.

Orchestration

Much of the existing analysis relates to interactive artworks, which are often only used in one configuration. Our system is designed for use in long performances, which necessitates the use of scripts with multiple different states or configurations, which alter during the course of a performance.

Most existing systems that use this kind of changing configuration use some external physical interface, such as an instrumented object [14], or use an external helper to control the system state. The aim of this project is to allow this control to be done within the performance itself, which, given the movements involved in juggling, rules out the use of extra physical interfaces.

Two basic types of control for this were identified. The first uses extreme actions, which are not part of a typical performance, as control actions. The second uses actions that are known to occur at a particular position in a performance as control, in a similar way to software that follows a musical score [15]. The first mode of control has the advantage that it allows the performer to improvise, knowing that they can alter the configuration of the system at any point. However, the actions are sometimes out of place in a performance, and break the flow of the act. The second method constrains the performer in some ways, because it tends to require that the performer do the control actions in a particular order. It is also less reliable in some cases, causing false or missed control changes. In practice, a mixture of the two control modes is used for different parts of current performances. In many cases attempts are made to make the extreme actions fit into the context of the surrounding performance, making them less extreme in context, so the pure forms of either control mode are used less often.
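A mixture of the two control modes can be sketched as a small state machine: each section advances only on the action expected at that point (score-following style), while a separate table of extreme actions can jump to a section from anywhere. The class, section names and action labels below are hypothetical, and the sketch assumes that recognised actions arrive as simple strings.

```python
class Orchestrator:
    """Moves between sections using expected cues and global jump actions."""

    def __init__(self, sections, cues, jump_actions):
        self.sections = sections          # ordered section names
        self.cues = cues                  # cues[i] ends sections[i]
        self.jump_actions = jump_actions  # extreme action -> section name
        self.index = 0

    @property
    def current(self):
        return self.sections[self.index]

    def on_action(self, action):
        """Process one recognised action and return the current section."""
        if action in self.jump_actions:
            # Extreme action: valid at any point in the performance.
            self.index = self.sections.index(self.jump_actions[action])
        elif self.index < len(self.cues) and action == self.cues[self.index]:
            # Expected action for this point in the "score": advance.
            self.index += 1
        # Any other action is treated as ordinary performance and ignored.
        return self.current


show = Orchestrator(
    sections=["takeoff", "space", "landing"],
    cues=["cascade", "shower"],
    jump_actions={"all_balls_high": "landing"},
)
```

Here a `"shower"` performed during `"takeoff"` is ignored because it is not the cue expected at that point, which reflects the ordering constraint of the second mode, whereas `"all_balls_high"` moves straight to `"landing"` whenever it occurs.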

Arguably, this kind of control over orchestration of a performance is a major part of what makes these augmented performances compelling to a spectator, and makes them more than just a standard juggling performance with pretty lights. It allows a wider range of narratives to be explored in an act, to make something more than a pure show of skills. Integrating this control into a performance is a real challenge, which is not yet fully solved.

Monitoring

In our system, the performer can see what they are doing by watching a monitoring screen. This screen allows them to interact with the system, rather than it just responding to them.

Whilst there was initially some doubt as to how possible it was to watch the screen whilst performing, after some interactions with the system jugglers were able to watch the screen, except when performing very complex tricks. This was interesting in terms of its effect on their performance. In particular, the simple script shown in the technology section of this proposal, which draws trails behind the balls, inspired the jugglers to perform some new actions, such as a new trick, where balls were thrown up together, to create heart and flower patterned shapes, which looked uninteresting without the augmentation but were interesting on screen (Figure 4).

EXPECTED BENEFITS OF PARTICIPATION

I hope to discuss the interaction and performance design issues in this kind of system with a wider range of practitioners. I would be particularly keen to develop further insight into my work from the perspectives of non-computer-science disciplines, and from those involved in interactive art in settings other than the circus arts. I am also interested in issues raised by the questions below:

What are the perceptual issues involved with designing for an audience, when performance and human computer interaction are based on the same movements?
How can we orchestrate interactive systems for use in long performances? What are the performance issues with this orchestration?
How can interactive systems be used to provide explicit and implicit narrative that is integrated with the core performance?
Can narrative structure be created in such a way as to become integrated into the vocabulary of the art-form that the system is working with, rather than being a performance of two parts?
Are there ways to explicitly design systems to inspire performers?

BIBLIOGRAPHY

1. GameCity Lab. [cited Dec 2006]; from: http://gamecity.org/index.php/events/detail/game_city_lab/.
2. Aerotech. [cited Dec 2006]; from: http://www.globall.com.
3. Bellotti, V., et al. Making sense of sensing systems: five questions for designers and researchers. in CHI. 2002.
4. Camurri, A., et al. Toward real-time multimodal processing: EyesWeb 4.0. in AISB. 2004.
5. Gaver, W.W., J. Beaver, and S. Benford. Ambiguity as a Resource for Design. in CHI. 2003.
6. Hinckley, K., et al., Foreground and Background Interaction with Sensor-Enhanced Mobile Devices. TOCHI, 2005. 12(1): p. 31-52.
7. Isard, M. and A. Blake, Condensation – conditional density propagation for visual tracking. International Journal of Computer Vision, 1998. 29(1): p. 5-28.
8. Marshall, J., S. Mills, and S. Benford. Using Object Interactions to Improve Particle Filter Performance. in BMVC. 2006.
9. Nauman, B., Public Room, Private Room. 1969-1970.
10. Palindrome. Eyecon. [cited 2006 Nov]; from: http://eyecon.palindrome.de/.
11. Paulos, E. and C. Beckmann. Sashay: Designing for Wonderment. in CHI. 2006.
12. Reeves, S., S. Benford, and C. O'Malley. Designing the Spectator Experience. in CHI. 2005.
13. Rokeby, D., Very Nervous System. 1986.
14. Schertenleib, S., et al., Conducting a Virtual Orchestra. IEEE MultiMedia, 2004. 11(3): p. 40-49.
15. Schwarz, D., A. Cont, and N. Schnell. From Boulez to Ballads: Training Ircam's Score Follower. in ICMC. 2005.
16. Wren, C., et al., Perceptive spaces for performance and entertainment: Untethered interaction using computer vision and audition. Applied Artificial Intelligence, 1997. 11(4): p. 267-284.