Invited Presentations

Keynote Talk: Ten Years of ISMIR: Reflections on Challenges and Opportunities

09:00-10:00 Tuesday (October 27)


  • J. Stephen Downie (University of Illinois at Urbana-Champaign, USA)
  • Donald Byrd (Indiana University at Bloomington, USA)
  • Tim Crawford (Goldsmiths College, University of London, UK)


The International Symposium on Music Information Retrieval (ISMIR) was born on 13 August 1999. This invited paper expresses the opinions of three of ISMIR's founders as they reflect upon what has happened during its first decade. The paper provides the background context for the events that led to the establishment of ISMIR. We highlight the first ISMIR, held in Plymouth, MA in October of 2000, and use it to elucidate key trends that have influenced subsequent ISMIRs. Indicators of growth and success drawn from ISMIR publication data are presented. The role that the Music Information Retrieval Evaluation eXchange (MIREX) has played at ISMIR is also examined. The factors contributing to ISMIR's growth and success are enumerated. The paper concludes with a set of challenges and opportunities that the newly formed International Society for Music Information Retrieval should embrace to ensure the future vitality of the conference series and the ISMIR community.


J. Stephen Downie is an Associate Professor at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign (UIUC). He is Director of the International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL). He is Principal Investigator on the Networked Environment for Music Analysis project (NEMA). He has been very active in the establishment of the Music Information Retrieval (MIR) community through his ongoing work with the International Symposium on Music Information Retrieval (ISMIR) conferences as a member of the ISMIR steering committee. He holds a BA (Music Theory and Composition) along with a Master's and a PhD in Library and Information Science, all earned at the University of Western Ontario, London, Canada.
Donald Byrd studied music composition at Indiana University in the late 1960s, and then became interested in computers and their potential to help musicians. After spending a number of years as a programmer and consultant at the University's academic computing support services, he received a PhD in Computer Science with a dissertation on music notation by computer. Since then, he has worked extensively in both industry and academia. He was one of the principal sound designers and sound-design software developers for the Kurzweil 250, arguably the first synthesizer to reproduce sounds of acoustic instruments convincingly. He was also the principal designer of the influential music-notation program Nightingale. His academic background includes research on music notation by computer (at Princeton University); work on information retrieval in text, especially visualization and human/computer interaction aspects (at the University of Massachusetts); and work on music information retrieval, digital music libraries, and optical music recognition (at the University of Massachusetts and Indiana University). Most recently, he has been working on the "General Temporal Workbench," a timeline-based system for visualizing, exploring, creating, and "playing" temporal phenomena: a system general enough for use on any timescale from fractions of an attosecond to billions of years. He is currently senior scientist and adjunct associate professor in the School of Informatics at IU.
Tim Crawford is a member of the Intelligent Sound and Music Systems group in the Computing Department at Goldsmiths College, University of London. He worked for 15 years as a professional musician before turning to academic research. He is active as a musicologist, being internationally recognized as a leading authority on the history and music of the European lute, and is currently Editor of the Complete Works of the lutenist Silvius Leopold Weiss (1687-1750). Otherwise he is mostly engaged in the application of computational methods to music-related research. He managed the UK effort for the original OMRAS project (Online Music Recognition and Searching, 1999-2003), the precursor of the OMRAS2 project on which he currently works. He also conceived, led and managed ECOLM (Electronic Corpus of Lute Music, 1999-2006), and is currently Principal Investigator of the Purcell Plus project, which is investigating the application of eScience in musicology and the longer-term methodological implications of technology for the discipline. He is one of the founders of ISMIR and a frequent contributor as author or organizer, and has recently jointly edited one of a series of books on "Humanities Computing: Modern Methods for Musicology: Prospects, Proposals and Realities," ISBN 978-0-7546-7302-6 (Farnham: Ashgate 2009), in which several ISMIR authors are represented.

Keynote Talk: Wind Instrument-Playing Humanoid Robots

09:00-10:00 Thursday (October 29)


  • Atsuo Takanishi (Department of Modern Mechanical Engineering, Waseda University, Japan)


Even though the market is still small at this moment, the fields in which robots are applied have been gradually spreading beyond the manufacturing industry in recent years. One can now easily expect that applications of robots will expand into the primary and tertiary industries, making robots one of the important components supporting our society in the 21st century. There are also strong expectations in Japan that robots for personal use will coexist with humans and provide support such as assistance with housework and care of the aged and the physically handicapped, since Japan is one of the fastest aging societies in the world. Consequently, humanoid robots and/or animaloid robots have been treated as subjects of robotics research in Japan, for example as a research tool for human/animal science, an entertainment/mental-commit robot, or an assistant/agent for humans in the human living environment. Over the last couple of years, some manufacturers, including famous global companies such as HONDA, TOYOTA, Mitsubishi Heavy and TMSUK, have started to develop prototypes or even to sell mass-produced robots for the purposes mentioned above. Waseda University, to which the author belongs, has been one of the leading sites for humanoid robot research since the late Prof. Ichiro Kato and his colleagues started the WABOT (WAseda roBOT) Projects and developed the historic humanoid robots WABOT-1 and WABOT-2 in the early 70s and 80s respectively. One of the most important aspects of our research philosophy is as follows: by constructing anthropomorphic/humanoid robots that function and behave like a human, we are attempting to develop a design method for humanoid robots that coexist with humans naturally and symbiotically, as well as to scientifically build not only the physical model of a human but also its mental model from the engineering viewpoint.
Based upon this philosophy, my colleagues and I have been developing the flute-playing humanoid robots of the WF (Waseda Flutist) series, as well as the bipedal walking robots of the WABIAN series, the emotion expression robots of the WE series, the talking robots of the WT series, and others. In particular, the purpose of the flute-playing robot research is to build a model of human flute playing and to clarify that model from the engineering viewpoint by reproducing human-like flute playing with a humanoid robot that has human-like respiratory organs. By using the robot, we are able to experimentally and quantitatively confirm the model of human flute playing. The flute-playing robot/model is useful for showing beginners how to use and move the organs involved, and it could also be used to evaluate flutes during instrument production in industry. We also recently started the development of saxophone-playing humanoid robots. In my keynote talk, I will introduce the research philosophy of my humanoid robots in general by showing examples, the technical aspects of the wind instrument-playing humanoid robots, and the other humanoid robots related to music.


Atsuo Takanishi is a Professor of the Department of Modern Mechanical Engineering, Waseda University and a concurrent Professor and one of the core members of the HRI (Humanoid Robotics Institute), Waseda University. He received the B.S.E. degree in 1980, the M.S.E. degree in 1982 and the Ph.D. degree in 1988, all in Mechanical Engineering from Waseda University.
His current research is related to humanoid robots and their applications in medicine and well-being. It includes the biped walking robots of the WABIAN (WAseda BIpedal humANoid) series for modeling human biped walking; the biped locomotors of the WL (Waseda Leg) series for carrying the handicapped or the elderly; the mastication robots of the WJ (Waseda Jaw) series, which mechanically simulate human mastication to clarify hypotheses in dentistry; the jaw opening-closing trainer robots of the WY (Waseda Yamanashi) series for patients having difficulties in jaw opening or closing; the flute-playing robots of the WF (Waseda Flutist) series and the saxophone-playing robots of the WS (Waseda Saxophonist) series, used to quantitatively analyze human flute/saxophone playing in collaboration with professional flutists and saxophonists; the anthropomorphic talking robots of the WT (Waseda Talker) series, which mechanically speak Japanese vowel and consonant sounds; and other robots/systems related to his research area. His interest in humanoid robots has extended to human emotion, leading him to develop the emotion expression humanoid robots of the WE (Waseda Eye) series and KOBIAN/HABIAN, which behave emotionally like a human based upon the "Equations of Emotion." His humanoid robot WABIAN-2R was exhibited at the 2005 World Exposition in Aichi, Japan to demonstrate knee-extended walking using a human-like pelvis and seven-DOF leg mechanisms. The emotion expression humanoid KOBIAN was developed based on WABIAN-2R. The latest model, WL-16, carries a human or virtually any heavy load weighing up to 80 kg. This project aims to develop a practical personal vehicle to support Japan's rapidly aging society. He recently developed the suture/ligature evaluation system WKS series, which shows surgeon trainees quantitative scores of their suture/ligature skills. This system is commercially available from a medical model and training simulator company, Kyoto Kagaku Co. Ltd., in Japan.
He is also developing an airway management robot for anesthetist/paramedic trainees in collaboration with the company. Refer to for more details.
He is a member of the Robotics Society of Japan (a board member in 1992 and 1993), the Japanese Society of Biomechanisms, the Japanese Society of Mechanical Engineers, the Japanese Society of Instrument and Control Engineers, the Society of Mastication Systems (a major board member since 1996), IEEE, and other medicine- and dentistry-related societies in Japan.
He received the Best Paper Award from the Robotics Society of Japan (RSJ) twice, in 1998 and 2005; was twice a Finalist for the Best Paper Award at the IEEE International Conference on Robotics and Automation (ICRA), in 1999 and 2006; and received the Best of Asia Award from BusinessWeek Magazine in 2001, the Distinguished Research Activity Award in Robotics and Mechatronics from the Japan Society of Mechanical Engineers (JSME) in 2003, the Best Paper Award - Application at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in 2004, the Excellent Research Award in 2005 from the Japan Society for Artificial Intelligence (JSAI), the Industrial Application Division Promotion Award in 2005 from the Society of Instrument and Control Engineers (SICE), the Best Paper Award in 2006 from JSME, the Best Conference Paper Award at the IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM) in 2009, etc.

Panel Discussion: Industrial Panel Discussion

15:45-17:15 Thursday (October 29)
In this panel practitioners from industry will discuss how their companies are currently using music information retrieval (MIR) techniques to solve problems for their customers. Panelists will also discuss emerging areas of MIR research that are particularly relevant for commercial applications. Audience members will have the opportunity to ask questions of the panelists.

Panel Organizer:

  • Paul Lamere (The Echo Nest, USA)

Panelists:

  • Tom Butcher (Microsoft, USA)
  • Norman Casagrande (, UK)
  • Òscar Celma (Barcelona Music and Audio Technologies, Spain)
  • Markus Cremer (Gracenote, USA)
  • Keiichiro Hoashi (KDDI R&D Laboratories, Japan)
  • Kunio Kashino (NTT Laboratories, Japan)
  • Malcolm Slaney (Yahoo, USA)

Biographies of the Panelists:

Paul Lamere is the Director of Developer Community at The Echo Nest, a research-focused music intelligence startup that provides music information services to developers and partners through a data mining and machine listening platform. Paul is especially interested in hybrid music recommenders and using visualizations to aid music discovery.
Tom Butcher joined Microsoft in 2006 to build large-scale web services for computing and delivering media experiences. His interests include digital media, artificial intelligence, the Internet, and various intersections thereof. Currently, Tom is a senior engineer in the Zune group at Microsoft creating data-driven media experiences, which include automatic playlist generation, social discovery, recommendations, and more. Prior to joining Zune, Tom's work encompassed automatic tagging, indexing, and recommendations at MSN Video. An avid music enthusiast, Tom records electronic music in his spurious free time using the moniker Codebase.
Norman Casagrande joined in 2006 as the head of music research. Since then he has been working on a wide range of problems, including collaborative filtering for user/item similarity and recommendation, scalability, dynamic playlist generation, user insight, audio and semantic analysis, fingerprinting, spam fighting, and many other related topics.
Òscar Celma is the Chief Innovation Officer at Barcelona Music and Audio Technologies (BMAT), a spin-off of the Music Technology Group (MTG). BMAT offers solutions for music discovery and recommendation, musical edutainment and music copyright detection. In 2008, Òscar obtained his Ph.D. in Computer Science and Digital Communication from Pompeu Fabra University (Barcelona, Spain). Òscar worked at the MTG from 2000 to 2008 as a Researcher and Program Manager. In 2006, he received the 2nd prize in the International Semantic Web Challenge for the system named "Foafing the Music", a personalized music recommendation and discovery application.
Markus Cremer joined the Fraunhofer Institute for Integrated Circuits (IIS) in 1996 after graduating from Friedrich-Alexander University in Erlangen, Germany, where he contributed to the design of embedded audio codec architectures and digital radio broadcast systems. In 2000, Cremer co-founded the Metadata department at the Fraunhofer Institute for Digital Media Technology in Ilmenau, Germany. Since 2005, he has been directing Gracenote's Media Technology Lab in Emeryville, California. Cremer is a member of IEEE, ACM, and AES.
Keiichiro Hoashi joined KDDI R&D Laboratories in 1997. His main research interest is in the area of content-based multimedia information analysis and retrieval, namely music, images, and video. Currently, he is working to implement multimedia content analysis technologies in practical applications and services. He is also working on research projects in data mining and recommender systems. He was a lecturer at Waseda University from 2002 to 2005, and received his Dr. Eng. degree from Waseda University in 2007.
Kunio Kashino is a Distinguished Technical Member and Supervisor, leading the Media-search Research Team at NTT Communication Science Laboratories, and a Visiting Professor at the National Institute of Informatics (NII), Japan. His team has been working on audio and video analysis, search, retrieval, and recognition algorithms. Its activities include the development of basic theories as well as their commercial applications, such as Internet content monitoring. He received his PhD from the University of Tokyo in 1995 for his work on "music scene analysis."
Malcolm Slaney is a principal scientist at Yahoo! Research Laboratory. He received his PhD from Purdue University for his work on computed imaging. He is a coauthor, with A. C. Kak, of the IEEE book "Principles of Computerized Tomographic Imaging." This book was recently republished by SIAM in their "Classics in Applied Mathematics" Series. He is coeditor, with Steven Greenberg, of the book "Computational Models of Auditory Function." Before Yahoo!, Dr. Slaney worked at Bell Laboratory, Schlumberger Palo Alto Research, Apple Computer, Interval Research and IBM's Almaden Research Center. He is also a (consulting) Professor at Stanford's CCRMA, where he organizes and teaches the Hearing Seminar. His research interests include auditory modeling and perception, multimedia analysis and synthesis, compressed-domain processing, music similarity and audio search, and machine learning. For the last several years he has led the auditory group at the Telluride Neuromorphic Workshop.