Barry Vercoe joined the faculty at MIT in 1971 and established its Experimental Music facility two years later.
He initiated research into score-following software at IRCAM in 1983 and the following year was a founding member of the MIT Media Lab, where he pioneered the creation of synthetic music with the development of the Csound software-synthesis language. Until his retirement, Vercoe was the head of the Media Lab's Music, Mind and Machine group, which developed technology later incorporated into MPEG-4, the world's first international standard for sound synthesis.
Barry Vercoe was interviewed by James Gardner on 23 March 2012.
James Gardner: What are you working on at the moment?
Barry Vercoe: Well, since I’ve retired from MIT I haven’t gotten involved in the studio there at this point. Instead I’m doing work in Australia to take laptops to the remote, mostly indigenous, communities. These laptops have some music programs since I was a co-designer of the XO laptop [1] for children. That laptop was actually developed for the children at risk around the world—those in Rwanda initially, six or seven years ago, war-torn Colombia, Palestine, refugee camps and so forth. And this has now expanded out. I brought 5000 of these machines to give away in the South Pacific, mostly to the little islands—Niue was the smallest, I guess. I gave a laptop to every child in Niue. There are only 500 children there, so that wasn’t too hard. But the biggest problem in this part of the world with bringing children into an educated environment is the overpopulation that you find in Papua New Guinea where there are about 1.3 or 1.4 million children between 5 and 12, say. The government is concerned mostly with just those who go to government schools, but that’s about 25% of the population. Another 25% are catered to by the missionaries and church schools, and the other 50% don’t seem to go to school at all. Not just this week: never. And that’s a problem. So I took my laptops away from the government when I first met them at a sort of education retreat and went up into the highlands.
So you bypassed the government.
Basically, yes. And then my next stop after that was in Australia, taking the remaining laptops to remote Australia, where I helped form a new entity there called One Laptop Per Child Australia [2]. We’ve now got about 7000 laptops into very remote communities. They do have music programs on there, since I wrote a lot of the software that’s on this machine. But that doesn’t really get into the kind of music that I’ve been involved with at MIT.
Could you talk a little bit about the music software that’s on these low-cost laptops?
Well we tried to get the cost of the laptop down to $100—it was known as the $100 laptop—and we worked very hard to do that. We were sitting for hours and hours discussing things. I managed to get, on the motherboard, two traces, two channels out, but only one channel in. I lost that little battle there—it would cost another 35 cents to run another trace. We decided we had to have a good camera, high-quality video camera. That cost us a lot. And we got the price down to about $188 I think it was. The next version of this machine, which will be more of a tablet, will hopefully come down below $100.
As for the software that’s on there, well, it’s running Linux so it’s all home-grown software, written by a bunch of academics at MIT and then supplemented later by a much larger community of people. It’s open source, so at this point there are a lot of people around the world developing software.
They’re not necessarily educators, they’re people who are fascinated with the challenge of getting applications, or what we call activities, for children running on a laptop. But it’s not quite clear what the procedures should be to induce children to learn. In Australia we find ourselves in a first-world country with a first-world education system and then a third- or fourth-world appendage that had been rather ignored. And so the issue there is how to bring those people into the fold of educated peoples so that they have a future to look forward to. For instance, in Papua New Guinea...the population there when I left New Zealand was about three million and the population of New Zealand was three million. Nowadays, the population of New Zealand is about 4.2 million and the population of Papua New Guinea is six million. That growth comes from the teenage and adolescent girls who have got no future except having babies. And you’ve got to stop that. The thing that will put a stop to that, and the terrorism that you find around the world, is education. So that’s why I’m devoting these two years to this particular course.
When did your interest in that start?
Well, it started when we started to develop the laptop. Of course I wasn’t bringing them to the South Pacific at that point. Eventually I persuaded my colleagues that we should bring some down here and the response was “but there’s no genocide down there” and I said “well, yeah, but there are a lot of people who are at risk”, particularly in some of the island communities. Places like Australia where you’ve got very remote communities—the remoteness in the desert is rather like the remoteness in the Pacific Islands. Whether it’s sand or water, it just separates people and keeps them isolated and not able to participate in the larger community and that’s a big problem.
So what sort of music software is on these laptops and how do you interact with it?
Well, we’ve developed something called Music Painter so that children can actually paint musical shapes and things. The little kids tend to draw faces, but when they hear that back it doesn’t sound very musical so they then begin to get a sense that it needs some rhythmic patterns and so forth, and pitch patterns. So without knowing traditional notation they’re able to work things out. And there are some neat things that the kids come up with. It’s an exploratory endeavour for them, just learning by doing and exploring.
We now have extremely powerful laptops that are able to run very sophisticated electronic music programs, and there is perhaps more electronically-generated music in the world than acoustically-generated music. Back in the ‘60s did you see this coming?
Well, my first experience of electronic music was in 1962 when I went over as a graduate student to study with Ross Lee Finney at the University of Michigan. Now Ross had been a student of Alban Berg, who was a student of Schoenberg, so that put me immediately in the lineage of twelve-tone music and I started out doing things like that. At the same time, in the second term there, Ross had invited a composer he’d met somewhere by the name of Mario Davidovsky—a wonderful executant of the traditional cut-and-splice tape technique. I just love his music. In fact I brought one of his pieces, Synchronisms No.2 for flute, clarinet, violin, cello and tape to New Zealand when I next came home in 1964 or so. We did a performance up at the University, at the main hall there, and that was probably the first electronic music and instruments performance in New Zealand. Ron Tremain played the tape recorder parts, switching it on and off. That was an interesting experience for me, bringing electronic music to New Zealand from the US. I didn’t know much at the time about Lilburn’s work.
But my concern about that music was that it wasn’t very controlled; it wasn’t notatable. You can see in the scores of Davidovsky that the electronic parts are just a kind of cross-hatching on the score. The instrumental parts are very detailed but the electronic parts are not very detailed. And even the word ‘synchronisms’ is a misnomer because the instrumental and tape parts don’t really maintain very tight synchrony at all. They sort of run along somewhat independently and about every minute or so they achieve sync by... for instance, one of the techniques is that the instrumental parts will just go (sings) ‘digga-da-dum, digga-da-dum’ until they hear something (clicks fingers) some little cue in the electronic part, like (sings) ‘brrrrrdum-pum-pum’ and then you’re back into sync again.
So it’s ‘vamp till cue’, effectively.
Right. So I was really interested in having things that were much more precise, and the first example that I did was done on the computer. In fact the computer was a big IBM 7094. A huge machine—it took up a whole room. And I was able to carefully orchestrate all the things that were happening there, and knew exactly when things were happening. That was part of a piece that I was doing when I was working as composer-in-residence in Seattle. I had access to all the ensembles from around that area—orchestras and choirs and so forth—and put together a large piece as a demonstration piece for the Music Educators’ National Convention. That was 1967, 1968. The object there was to demonstrate how you could have many different schools involved in this Ford Foundation-supported idea of composers in schools, but I was asked to demonstrate it graphically by having one piece that involved seven or eight different schools. I had fun writing that score, particularly since we had only one combined rehearsal. So I wrote the score so that there were many, many different overlaid parts that I could rehearse separately. And then we brought them all together. It was based on two texts—one of them was the Latin hymn Veni Creator Spiritus and the other was a poem by Toyohiko Kagawa from his Songs From The Slums book, so there was a contrast between these two things: ‘Now let us believe in the spirit’ and the conditions on earth. These were in opposition, of course, and that gave rise to a lot of ideas. So one of the examples that I then used was when these two things eventually do clash, and it’s then resolved by this electronic sound.
This was the first computer music, I would say, with an orchestra. It had a string ensemble, a wind ensemble, a percussion ensemble, a brass ensemble, six instrumental soloists, two choirs—one singing the Latin hymn text and the other singing the Japanese text in English translation—and electronic sounds. So it was a marvellous assembly of things. And at one point, all of these ensembles are running at different speeds. I was conducting, and you have to bring the various groups in and out, and out of this cacophony the computer sounds then emerge.
What’s the name of that piece?
The piece is called Digressions. What it does is to take...well, it was a signal event at the time because up to that point, computer music—as distinct from cut-and-splice tape music—was rather simplistic in its concepts. You were dealing with little sine waves and carefully programmed things. And there was quite a gulf between what people were doing in the studios—with all sorts of found sounds, and other recorded and modified sounds—and what we were getting out of the computer at the time. So I was determined in this example to actually make the computer sound like a tape studio.
What were the challenges at the time—what was the process?
Well, I was just using signal-processing techniques: filtering, and white noise modified by sine waves and things. So I pulled together lots of techniques that enabled me to get a sound very similar to that of the tape studio—very unusual for computer music at the time. And that was the sort of achievement that I made. That was my first computer music.
When you first heard Mario Davidovsky in 1962, were you aware of what was going on elsewhere, in Paris and Cologne and studios throughout the world?
Oh yes. Of course at the time, initially there had been Pierre Schaeffer...he was doing some very interesting stuff, cutting off the attack transients of notes and tacking them on to other things, showing that a lot of our instrumental recognition has to do with that. He did a wonderful piece I remember, where he took some oboe sounds and on to the beginning of each sound he spliced an attack shape that was that of a piano—in other words a plucked string or something like that, tacked on to the beginning of an oboe, and made little pieces like that and it just sounds like a harpsichord performance.
You went through the University of Auckland in the late ‘50s. Were you exposed to any electronic music at that time—had you heard any?
Not really, no. Nothing that sticks in my mind. It wasn’t really until I got to the US that I found myself in the midst...
And that’s where you heard the classics of the genre?
Yes.
What did you make of them at the time?
I thought that they were a new medium for expression, which I wanted to explore. But my way of exploring these was to get into something that would give me more control than the cut-and-splice studio. Also, I didn't really feel attracted to some of the early synthesizers that came out at the time, like the Moog synthesizer. I knew Bob Moog and Wendy Carlos and so forth but I wasn’t particularly attracted to those synthesizers because, as a stage performance thing, you would have to sit there tuning these analogue modules and so forth. And the idea of not having quite as much control—these things would float around during performance—wasn’t appealing. So that’s why I got into digital synthesis, and that was where I really found I could get the control that I wanted. But at the same time, of course, you have to create languages that give you that control from a composing standpoint. And that was the thing I next worked on.
You did a music degree and a mathematics degree simultaneously at Auckland.
Yes. I didn’t think they would ever meet—it was just fortuitous.
So at that time it wasn’t obvious to you to start learning computer programming, which in those days you had to do if you wanted to have the control you talked about.
I never really took a programming course in my life; one just sort of picks some of these things up. Even coming up with something called Music 360—which is a synthesis system that was very widely used for several years—I sort of backed into that. I didn't know I was writing a compiler, a language compiler, until I was in the middle of it and thought “gosh, I seem to be writing a compiler here”. So I wasn’t approaching this as a computer scientist might—basically as an artist just solving problems as the need arose.
So you learned programming out of necessity because it was the only way you could solve those problems.
Yes.
You’ve categorized people as being tool builders or tool users, and few people are both. As a tool user, what did you want from the tools you built? What was the goal of getting your hands dirty with programming?
Well, having access to a lot of technology at MIT I felt I was in a privileged position, and that other people would like to have access to a studio or something. I became rather devoted to making it possible for other people to use the technology that I naturally had access to. And that got me into running composer workshops and inviting lots of people in to use the technology. I then got into a series of summer workshops. They were typically four to six weeks long; there’d be a two-week section, where about 35 people from around the world would come in and get used to the tools of the medium. From those 35 I would pick out about 10 that I’d sort of commission—although I wasn’t paying them, they were paying tuition for the course. They’d be commissioned to write for a concert that was already scheduled, but one for which we just didn’t have any music yet. And that went on each summer. In fact John Rimmer was one of the students and Graham Hair was another. There were a lot of people who came in and took those summer courses in the ‘70s and early ‘80s.
That was my gesture to the field, I suppose, by making available the studio that we had at MIT—the first music-only computer studio. There were other people at Stanford in the AI lab and so forth, but this was the first studio devoted to music exclusively. And we had a PDP-11, which had been given to me by the Digital Equipment Corporation, and that gave rise to lots and lots of pieces from the studio in the ‘70s and ‘80s.
When it came to what software we would run, I found myself being challenged each time, mostly by a piece that I would be writing: I’d think “gosh, I need a better filter” or “I need something that will follow the envelopes” and so on. So all of the components for the language came out of necessity and there was a repertoire of things that just developed from need. That became the repertoire of Music 360, initially, and another thing called Music 11 and then ultimately Csound, which is what I’m mostly known for.
So the drivers for the innovations in those programs were compositional.
Oh yes, each time.
The first program you were involved with from a programming point of view was Music 360, which built on the work of Max Mathews’ Music N series of programs. What were you able to bring to Music 360 that previous versions hadn’t had?
Well, I’ll give you one example. In Music IV the envelopes tended to be just linear attacks and linear decays, and then there was another thing that the people at Princeton—Godfrey Winham, Jim Randall and Hubert Howe—added so that there was a shaped attack, and you could draw the shape of that attack etc., and then it would go into some sort of linear decay or something. What I found—when I was writing a piece called Synapse, for viola and computer—was that I couldn’t shape the phrases in the same way that my violist was shaping them. And I began to realize that we had designed envelopes, decays, in this language—in Music IV etc.—to sort of match what pianos do. And the performers who are wind players or viola players, who have control over the steady state and can actually crescendo in the middle of a note, had a very different idea of envelopes. In particular they could even make a note die away very quickly, almost a sforzando, so you could have an attack that quickly died away but then continued; it had a very different shape, where you could sustain the note and then tail it off at the end. That was an element of control we just didn’t have. So then I said “well, we’ve got to have envelopes that will do that” so that I could then match what Marcus Thompson, my violist, was doing.
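What Vercoe describes is essentially the move from fixed attack/decay shapes to arbitrary breakpoint envelopes. Below is a minimal illustrative sketch, in Python rather than the original Music 11 code; the function name and the breakpoint values are invented for the example.

```python
import numpy as np

def breakpoint_env(points, sr=44100):
    """Piecewise-linear envelope from (time_in_seconds, level) breakpoints."""
    times = np.array([t for t, _ in points])
    levels = np.array([v for _, v in points])
    t = np.arange(0.0, times[-1], 1.0 / sr)
    return np.interp(t, times, levels)

# Piano-like shape: fast attack, then a long decay to silence.
piano_like = breakpoint_env([(0.0, 0.0), (0.01, 1.0), (2.0, 0.0)])

# The shape a violist or wind player can make: a near-sforzando attack that
# dies back quickly, a swell in the middle of the note, then a tail-off.
viola_like = breakpoint_env([(0.0, 0.0), (0.03, 1.0), (0.25, 0.35),
                             (1.4, 0.8), (2.0, 0.0)])
```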
Would it be fair to say that a lot of your research, your drive, has been to enable a more spontaneous interaction between live performers and the electronic parts of a score?
Yes, that’s a good description, because most of my pieces have involved live instruments. I still love live instruments, of course. So when I was talking about getting an envelope shape that matched what the live instrument was doing, this was because I wanted them involved in what was my first combination—apart from the big orchestra piece that I was talking about, the Seattle Opera House thing—of a live instrument and a computer-synthesized part.
I found on the MIT faculty a fantastic viola player by the name of Marcus Thompson; he was a member of the Chamber Music Society of Lincoln Center. A wonderful chamber music performer. And I wrote a piece for him called Synapse. Now you can tell from the name of that that I was concerned—synapses being connections—with the connections between, in this case, a live instrument and electronic sounds. Of course in those days the computer wasn’t fast enough to synthesize the sounds in real time, so we’re still at that point talking about a ‘music minus one’ set-up. So the computer part would be synthesized very slowly and put on to tape and then we’d go along to the concert hall and start the tape. In fact Davidovsky’s pieces were faced with the problem of somehow synchronizing the two media, and that wasn't really being solved properly, as I said earlier. But in my case I was looking to something that would somehow connect the two. So my Synapse was an attempt to relate these two things. Now this was computer-synthesized music, so I then had very strict control over what notes occurred when. And I could have, in the computer part, a group of five and then a group of three and a group of seven; the viola part would have a group of five, a group of three and a group of seven, so these things could be synchronized. Even though it wasn’t simple ‘oom-pah-pah’ music, the violist, Marcus, was able to really pull that piece together, so I was getting really tight synchrony between the two forces in that case. And that was appealing, even though the electronic part wasn’t being done in real time.
The viola player had to lock in...
Had to listen very carefully, yes. Many years later—about 20 years later, I suppose—I actually did a real-time performance of that same piece where the computer was fast enough now to pitch-track the viola, to do score following, and know where the violist was in the score and also to synchronize the accompaniment.
Did that come out of your work at IRCAM?
No. This came out of a collaboration I had with a computer company called Analog Devices. They were a chip manufacturer and they wanted to put Csound on one of their DSP chips, so I was really using a battery of array processors. This is around 1990 or mid-‘90s. And that was really quite a new thing to get into because that got me into real-time synthesis. And once you pass over the boundary between non-real-time, pre-recorded electronic parts and something that’s being generated in real time, then you can get into something where the computer is able to listen to the live performer and follow the live performer. The live performer then retrieves the original freedom that they might have had as a player—freedom they’ve always had in chamber music and so forth—and is not locked into a pre-recorded part. And as both composer and performer, that is a big world of difference. You’re suddenly into the real musical world, because before that the pre-recorded stuff put the performer into a straitjacket. And it was very hard to make that a musical outcome.
The computer functioning as a live accompanist is one sign of computer music technology coming of age, isn’t it?
Yes. Well, I had done experiments in Paris when I was at IRCAM. I was working with Larry Beauregard, a young flutist from Boulez’s Ensemble InterContemporain, and we did a lot of things for flute and electronic sounds. Now that was not a standard computer. That was an array processor called the 4X, which Boulez had commissioned to be developed at IRCAM. It was one of a kind, and so you couldn’t take that piece anywhere else. But I did do a lot of experiments with Larry initially, with flute and pitch tracking. The pitch tracking I was doing on the flute involved actually instrumenting the flute itself with optical switches, so we had something like 15 keys—there are 15 keys on the flute, so we had 15 bits of information for every transition. I was then able to follow what the fingers were doing on all of those keys and get a sense from that of which actual note he might be playing. Typically on a flute you can play, say, three notes with a given fingering, depending on how hard you blow. And so I just laid down three filters at those possible frequencies to see where the energy was, and I’d have the pitch nailed down in about 30 milliseconds. Now that’s faster than you or I can hear, so the computer’s actually hearing all kinds of things in the attack transients—the little blips that occur in the onset of a note that humans don’t pick up knowingly. It is part of our recognition of the attack transients—a clarinet starts very differently from a flute note, so we’re subconsciously aware of that, but we’re not able to really track it the way that the computer was able to do.
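The disambiguation step Vercoe describes (a fingering narrows the pitch to a few candidates, and band energies decide between them) can be sketched roughly as follows. This is an illustrative Python reconstruction, not the 4X code, and the example fingering and frequencies are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def pick_note(segment, sr, candidates_hz):
    """Choose which of the candidate pitches is sounding by comparing the
    energy that falls in a narrow band around each candidate frequency."""
    energies = []
    for f0 in candidates_hz:
        sos = butter(4, [0.94 * f0, 1.06 * f0], btype='bandpass',
                     fs=sr, output='sos')
        banded = sosfilt(sos, segment)
        energies.append(float(np.sum(banded ** 2)))
    return candidates_hz[int(np.argmax(energies))]

# A ~30 ms window at 44.1 kHz is about 1323 samples; suppose the detected
# fingering admits (say) D4, D5 or A5 depending on how hard the player blows:
# note = pick_note(x[:1323], 44100, [293.66, 587.33, 880.0])
```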
You used the optical sensors on the flute because trying to track the pitch by getting the computer to analyse the sound in real time was too slow at the time.
Yes, although the very next example that I wrote—it wasn’t a composition of mine—used violin. That was because Larry Beauregard actually died from cancer within a few months of completing that work with him. Then I decided that I wanted to look at how tempo rubato worked in practice. Whereas with Larry we were doing Handel flute sonatas, and Larry would speed up and slow down a little bit, but no self-respecting flute player would pull around, would do tempo rubato, in the middle of a Handel flute sonata. But you give a violinist a piece of Fritz Kreisler and they’ll do it without being asked. It was just wonderful to see that and have the computer try to figure out what was going on. Of course, it couldn’t predict, it didn’t know how to interpret the word ‘espressivo’, but what I was doing there was having the computer follow what the live violinist was doing, making a first guess. Every time it would make a prediction it would always be wrong, but it was then keeping track of where and how it was wrong. And after about four rehearsals, untouched by me, it would have it pretty much figured out.
So it would learn.
It would learn. Now this is not the real learning of music that you would expect a human to do, because a human latches quickly on to the phrase-structured grammar that music is. And so the phrase boundaries and things, getting in and out of a point of repose—which is what a cadence is about—is something that humans pick up. It’s a natural thing. A computer still doesn’t understand. I was basically doing a sort of statistical analysis of what was going on and so we ended up with something that worked pretty well, but was purely statistical, not based on a phrase-structured grammar.
So the computer’s saying “I know it’s more likely to do this, but I don’t know why”.
That’s right.
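The ‘purely statistical’ learning Vercoe describes amounts to remembering, note by note, how the prediction was wrong and folding that error back in at each rehearsal. A hypothetical sketch in Python; the class and parameter names are mine, not taken from any of Vercoe's systems.

```python
class RehearsalModel:
    """Learn a player's habitual timing, rehearsal by rehearsal, with no
    notion of phrase structure: just a per-note running estimate of how
    early or late the player tends to arrive relative to the score."""

    def __init__(self, notated_onsets, alpha=0.5):
        self.notated = list(notated_onsets)      # onsets from the score
        self.offset = [0.0] * len(self.notated)  # learned deviation per note
        self.alpha = alpha                       # weight of the newest rehearsal

    def predict(self, i):
        # Current best guess of when note i will actually be played.
        return self.notated[i] + self.offset[i]

    def observe(self, i, actual_time):
        # The prediction was wrong by some amount; remember where and how.
        error = actual_time - self.predict(i)
        self.offset[i] += self.alpha * error

# After a few rehearsals the predictions settle close to the player's
# habitual rubato, which is enough to schedule the accompaniment.
```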
What pieces came out of those two lines of research with the flute and the violin?
I did a small original piece for Larry. It wasn’t very long, it was just an experiment, a testing out of the things we’d discovered. The things we were discovering there were rather fascinating, because we were finding out how live performers manage to survive on the concert stage. In the case of the computer working out statistically what is likely to happen, the fact is that no performer will do the same thing twice, but what happened during rehearsal is you get close enough—the collaborating players get close enough to be able to then fake it on the concert stage. So that’s what’s really going on. They are getting close enough and then it’s their musical wits that enable them to survive on the concert stage. Without the rehearsal and the understanding of the other person through rehearsals, and a reasonable sense of expectation of what they’re going to do, you wouldn’t be able to do this satisfyingly on a concert stage. But the rehearsals get you close enough to that. And that’s what was happening in the computer part, too. The computer was getting close enough to know what the violinist—in this case Agnes Sulem—was going to do, so that it could work. And it was interesting as a performance.
What were those pieces?
There was a Handel flute sonata in F, and the Fritz Kreisler piece was a sort of lullaby. Actually I had my daughter in Paris at the time—she was about 11 or 12, I think, and she did the rehearsals of this thing and so by the time Agnes came in as a professional violinist I’d got most of the kinks in the system worked out. But this whole experiment was a very interactive one with performers. With the flute player, Larry, he was of course working with Boulez, playing with the ensemble during the day, and then he would stay with me in the evening, in a sleeping bag. He’d sleep on the floor, because he had to get some rest, and at about two o’clock in the morning I’d kick him and say “hey, let’s try this” and he would wake up, we’d try something, and then I’d say “OK, that's fine” and he’d go back to sleep and I’d do some more. So we survived like that for about six months, I guess, working these things out. Very slow progress, but he was very interested in this work himself, and I was so sad when he actually died.
How did you feel about IRCAM as an institution? How do you feel about its goals and what it may have achieved?
Well, the name of it, the Institut de Recherche et Coordination Acoustique/Musique...the ‘Coordination’ part of it didn’t work so well. I mean the acoustic...well, the composers would come in and do a piece, and the composers were regarded as the chiefs, and the acoustic people were...not slaves, but they were the sort of scientific community and they really weren’t mixing, because none of the composers did research, really, nor did any of the scientists—for the most part—do compositions. So the two communities still had this rift, and getting that collaboration going...it’s the old thing of tool builders and tool users...that made the ‘Coordination’ part of IRCAM not operate as one might like. On the other hand, they made a lot of innovations and a lot of progress in this whole field, so I was quite enamoured with what had been achieved there.
At IRCAM, then, your interest was more in the musical interactions between computer and live performer than with the generation of sounds as such. Was that where the fascination lay for you?
For me it did, yes. It was in taming the computer according to the laws of music, basically. We didn’t really understand music in that scientific way—in fact it was very hard to get support for music research in the US through the scientific community. I got a grant once from the National Science Foundation, and at the time, the National Science Foundation didn’t feel that music was sufficiently understandable to net useful information. There was a watchdog senator, [William] Proxmire, who each year awarded the Golden Fleece award to some research project that had wandered off the strict science sort of thing, and I was a candidate for that once. I didn’t win it but when Proxmire’s team let it be known that I was a candidate, my research manager at the National Science Foundation quickly came up to MIT and had a long talk and we pulled a lot of supporters together and showed them that this was indeed scientifically viable stuff. But it was hard to make that connection. These days people do look at human expression for its interest and its content, and they are extracting all sorts of information about how we as humans survive and operate. That wasn’t the case 30 years ago.
Were you at IRCAM when Boulez was working on Répons?
Yes.
I gather that took up a lot of IRCAM resources.
It did. I used to work at night on the machine, on the 4X. I could only get on the machine at night because I needed the whole machine to myself, not time-shared with other people, since I was doing these real-time experiments. I would get on at about ten o’clock when all the Frenchmen were going off for dinner, and I could stay on all night until around seven in the morning when Boulez would show up. So we’d have this meeting every morning, talk about what was going on, then it was my time to say “OK, I’m going home to bed”. So we met daily in that case. The thing about that was that Répons, the piece which had persuaded Boulez to develop this big, powerful 4X processor, was using the machine as, effectively, an expensive digital delay line. The cimbalom would go ‘brrrt’ and then the computer would play it back, ‘brrrt, brrrt, brrrt, brrrt, brrrt’...it was an expensive delay and it could have been done much cheaper.
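A digital delay line of the kind Vercoe says the 4X was being used for takes only a few lines of code on a modern machine. Here is a rough Python sketch; the tap times and gains are arbitrary, chosen only for illustration.

```python
import numpy as np

def multi_tap_delay(x, sr, taps=((0.25, 0.7), (0.5, 0.5), (0.75, 0.35), (1.0, 0.25))):
    """Return the input plus several delayed, attenuated copies of itself."""
    max_delay = int(max(t for t, _ in taps) * sr)
    out = np.zeros(len(x) + max_delay)
    out[:len(x)] += x                    # the direct sound
    for delay_sec, gain in taps:
        n = int(delay_sec * sr)
        out[n:n + len(x)] += gain * x    # each echo of the original gesture
    return out
```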
But when I got there and saw that this machine was not only recording the sound but actually had some MIPS, some computer power, left over, I thought “well, we could have it think about what it’s doing”. And that’s what got me into having the computer then track the flute player and then somehow synchronize an accompaniment part to that. The accompaniment was initially very simplistic: little harpsichord-like sounds. Later on it got to be more complex sounds, like the piano sounds in the case of the Fritz Kreisler work for violin and piano. But bridging that gap between the live performer and whatever the computer was able to do was gradually being solved as computers became more powerful. You could then have the computer think about what the music was about, so the computer could not only pitch track, but score follow; look at the score and see where the player had got to, whether they were speeding up or slowing down etc. And that’s the whole new world that I was speaking about earlier. Once you crossed that boundary there’s a very different relationship between live performers and the computer processing of sound.
You were starting that nearly 30 years ago. What’s the state of the art now?
It’s about the same. There’s not been much progress in that field, I’m sad to say. I thought that it would induce a lot more people to get in and do a lot more research. There are some programs that you find in schools where kids can play sonatas and things and have the computer follow them, but it hasn’t really got to the point of learning from rehearsals, which is what we were doing with that Fritz Kreisler piece—learning from rehearsals and improving each time you played. It’s interactive, but there’s no real AI-style learning in the systems that I’ve seen, at least.
Why do you think that is?
Oh, they’re harder programs to write, and maybe they’re also closely allied to the instrument that you’re using. So something that works for violin may not work in the trumpet idiom. Different instruments have very different performance idioms and so if you develop something that works well in conjunction with a violin, it’s not going to work well in conjunction with a bassoon or something. So I think that’s probably the reason that these haven’t been developed, because these things tend to be one size fits all. And that hasn’t enabled the field—yet—to develop something that can recognize the instrument it’s playing with. That’s a tough problem, actually: to recognize an instrument and say “hey, that's an oboe” or “that's a flute” or something, and then respond accordingly.
Part of the difficulty is in the programming rather than computing power, which is obviously enormous these days.
Yes. These days you could probably get computers to recognize things. I mean, there is speaker recognition and all sorts of fancy things occurring, and this can be done in real time these days. It would be a case of just somebody or a group of people focussing their attention on solving this problem. It’s solvable, I would say, and that’s where the future will be.
What do you see as the current challenges in electronic instrument design in the general sense?
The biggest challenge for the composer is not to just push a button and let a whole lot of things happen. So there’s the tendency these days to go into a studio for 30 minutes and come out with a 20-minute piece. And that’s not good. It’s not going to be a good piece. Music always has been carefully considered, even out of improvisation. There’s a whole lot of experience that went into Bach improvising, or coming up with pieces that he might well have improvised: the Goldberg Variations, or The Art of Fugue or something. Music needs to be thought about and carefully constructed to be meaningful to the listener.
I’m not very much impressed by random sequences of things, where the computer seems to generate lots and lots of notes and you, as a listener, work frantically trying to make sense of this when there probably isn’t anything very sensible about it. For me, it doesn’t represent the human-to-human communication that music has always been. And when you have technology getting in and just going ‘vleurgh’, and this sort of thing, I don’t find that very satisfying. Oh, to be sure, the computer can generate patterns and you can take delight in recognizing these patterns and that sort of thing, but it’s not really expression. And ultimately, give or take a pattern or two, music is about communication.
Do you think there’s still a lot of scope to develop the way people physically interact with the instruments or the devices?
Yes, that’s true. The ethnomusicologist John Blacking made a study of how different musical styles depend on the shape of the instruments and how it fits into the hands and so forth, so this is a natural thing: design an instrument of this sort and it gives rise to music of that sort. So that’s certainly true, although the tactile feel of instruments is something that hasn’t progressed very far in the computer field at this point. Later on we’ll perhaps develop interactive devices that will respond to touch and feel. But if you regard music as being an extension of motion, the Piaget view of the world that everything is an expression in some capacity, then the input devices that we have on computers are still far from ideal.
In the early days of computer music, when a day or more would pass between submitting a program and hearing it back, John Chowning says he used the intervening period to think carefully about just what it was he was doing. Did you find that as well, and could you talk about working in that fashion?
Yes. There was a whole series of commissions at MIT that we did with money from the National Endowment for The Arts after we’d got established. Many of the composers would come in and get to know the technology and they would then go away and then come back a couple of months later with a piece ready to synthesize. Davidovsky was one of those, for instance. He went away, and came back, and here was the piece, and we had a helper for him, but he’d largely mapped it out in his mind. And I think Jonathan Harvey, the British composer, was also working in something of that sort—he’d go away and think about it for a while and then come back. In fact in the very first instance, when I was doing my Synapse—the first piece that came out of the studio at MIT—I would work at home during the morning and do another two or three measures and then I would go in in the afternoon or evening and then orchestrate that for the computer. So I was doing it as a composer often works, in little fragments and then putting it together. But each time I was doing that, the experience of thinking about it the next morning and thinking what worked or what I might be able to make work the next day became an iterative process. So I suppose in that sense the end of the piece is better than the beginning...
Did you find it frustrating that you had to wait for so long to hear what you'd done or did that become part of the process?
Well at MIT I had a computer all to myself, so what I was waiting on was the task of actually keying in the notes that I wanted that I’d composed that morning and orchestrating them. I was doing it actually at a screen—it was the first computer music editor, frankly. It was a thing we called NEDIT, for Note Editor. It was a graphic thing done on an Imlac PDS-4 display processor. But before that, back in the early days at Princeton, when I was initially working there, the wait comprised not only waiting for the computer to do its thing but then dumping it on to a big digital tape, a 2400-foot reel or something, and then jumping in the car and driving for an hour-and-a-half or two hours up to Murray Hill, New Jersey where there was the only digital-to-analogue conversion facility available. And so after a couple of days work, and it being put on to tape and then driving up there and hearing it back, you’d hear... ‘bleep’... “no, not quite what I wanted”, so you’d then get back in the car and go back to Princeton, make a few changes...that was the cycle, of two or three days. John Chowning, who was then working at Stanford on the machine that was in the AI Lab, probably had a much quicker turnaround than two or three days and jumping in a car and driving a long way.
Hardly instant gratification, then...
No, no, true. But then, what composer does have that?
You’re probably best known for Csound. What was the reason for developing that?
It came about after I’d been at IRCAM and had been doing these experiments with the flute and violin and tracking and so forth. I came back to MIT and we had just got a new building called the MIT Media Lab—I was a sort of co-founder of that. I then got into modifying what had been the sound-synthesis language of the time, Music 11, to become this slightly more interactive Csound thing. And I was writing in the language C, which is very close to the hardware. Music 11 had been done in assembly language—that’s even closer to the hardware.
When we were at Princeton a few years before, we had been using Music IV-B, the BEFAP [3] version of Music IV, on the [IBM] 7094. Godfrey Winham, a British composer working at Princeton—a very bright guy—decided, when the 7094 went away, that he was never going to write in assembly language again. That’s when he wrote Music IV-BF, which is the FORTRAN version of Max’s old Music IV; he decided that computers were going to be continually changing and that writing in assembly language was not a good investment of time. I decided with Music 360 that there was such a big change between the 7094 assembly language and that of the 360 that IBM wasn’t going to change its tune soon, and that I would have at least a few years, perhaps decades, if I invested time in an assembler version. So Music 360 was actually written in assembler, using the assembler as a sort of compiler, because it had the ability to subset strings of characters and things like that. So I was actually writing a compiler using the subsetting feature of the assembler. That meant that we had a language then that was really fast—this was running about ten times faster than FORTRAN—so I was really down at the hardware level. When it came to writing Music 11, I also did that in assembly language, the PDP-11’s assembler. That was about eight or ten years later. When it came to Csound, I was then willing to write in the language C, which had come out of Bell Labs, Western Electric and so forth. That was still pretty close to the hardware, unlike C++ and other higher-level languages. Languages then were getting further removed from the hardware and you couldn’t really tell what was going on. The thing I liked about C is that when I’d write some C code I could pretty well guess how the compiler was going to put this down into assembly language. So I knew what was going on, I knew where I could take a few short cuts and cut some cycles of computation by doing it this way instead of another way.
So this would make the program run faster on any given computer.
Yes. So Csound started to get a lot of performance out of the machine, because it was written very close to the hardware.
What were the things that Csound could do that your previous music programs, Music 360 and Music 11 couldn’t?
Oh, a lot of things, like phase vocoders—so we’re getting into heavy-duty Fourier transforms and operations in the frequency domain rather than the simple time domain, which is where the early computer music was. So when you’re in the frequency domain you can do all sorts of things: time-stretching and transposition and so forth without changing the timbral quality. You could take a voice and transpose it up or down or whatever and keep the same vocal sound without it becoming munchkin-like. Those things were all possible in Csound but had not been possible in Music 11.
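The phase vocoder is what makes those frequency-domain tricks possible: keep each analysis bin's magnitude but re-accumulate its phase at a new hop size, so duration changes without the pitch changing. Below is a compact, illustrative sketch in Python/NumPy, not Csound's own pvoc code; parameter names and defaults are assumptions.

```python
import numpy as np

def pv_time_stretch(x, stretch=1.5, n_fft=2048, hop=512):
    """Time-stretch a mono signal by `stretch` without changing its pitch."""
    win = np.hanning(n_fft)
    syn_hop = int(round(hop * stretch))
    n_frames = 1 + (len(x) - n_fft) // hop

    # Analysis: hopped, windowed FFT frames.
    frames = np.stack([np.fft.rfft(win * x[i * hop:i * hop + n_fft])
                       for i in range(n_frames)])

    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft   # bin centre freqs (rad/sample)
    phase_acc = np.angle(frames[0])
    out = np.zeros(n_frames * syn_hop + n_fft)

    for t in range(n_frames):
        mag = np.abs(frames[t])
        if t > 0:
            # Deviation of the measured phase advance from the nominal bin advance,
            # wrapped to [-pi, pi], gives each bin's true instantaneous frequency.
            dphi = np.angle(frames[t]) - np.angle(frames[t - 1]) - omega * hop
            dphi = (dphi + np.pi) % (2 * np.pi) - np.pi
            phase_acc = phase_acc + (omega + dphi / hop) * syn_hop
        # Synthesis: same magnitudes, re-accumulated phases, overlap-add at the new hop.
        grain = np.fft.irfft(mag * np.exp(1j * phase_acc), n_fft)
        out[t * syn_hop:t * syn_hop + n_fft] += win * grain

    return out   # roughly `stretch` times longer; output level is unnormalised
```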
And by then the computers had also caught up so that you weren’t working with a one-off like the 4X. You were working with commercially available, less expensive computers by then.
Yes. Not only commercially available but very widespread. And one of my continuing obligations to the community at the time was, as with Music 360, that I was perfectly willing to send out copies to anyone. In the case of Music 360, I was mailing off little 500-foot reels of tape with all my source code on there. When it came to Csound, the internet was now around, so I put Csound in the public domain and gave it away.
So you’re as much a pioneer of open-source software as of electronic music...
Well, I just believed in everybody having access to the things that I had done. It wasn’t just for me.
Was there pressure from your sponsor institutions to patent, or get a revenue stream, from these things?
Not really—I’ll give an example. At one point I was involved in the MPEG [4] community, coming up with standards—in fact I’d been part of the MP3, which was MPEG-1 Layer III—and when it came to MPEG-4 we made Csound, or a version of it, become the audio part of MPEG-4. What was happening in the MPEG community, an international community, was that many companies would come along and send their representatives to these meetings. There would be 120, 150 or more people at one of these conferences—they were at various sites around the world every three or four months, or something like that. Very few of those attending actually did any work. There was just a small number of people who got in there and got their hands dirty and developed new features etc. Most of the people at those MPEG meetings were...I wouldn’t say freeloaders, but people who wanted to keep an eye on what was going on. They would report back to their company so their company would be prepared for whatever the standard was going to be. In the case of Csound becoming the guts of MPEG-4 audio, I wanted to make that open, as I had done with all previous things. But in this case, my work had been paid for by the sponsors of the Media Lab. So I stood up at one of the next sponsor meetings and said “OK, we’ve developed this new thing called MPEG-4 and I want to put it in the public domain”. I said “you as sponsors”—these were media companies and so on—“can take advantage of this because you will actually sell more coffee if you give away the coffee cups for free”. So for me, the MPEG-4 was a coffee cup and that was then inducing people to create all kinds of brands of coffee and get those things out into the working field.
So you’ve always seen the proprietary branding as counterproductive?
I suppose, yes. Perhaps not counterproductive...well I suppose, linking back to one of the first things I said to you—I was concerned about control. In something like Linux, where you have Linus [Torvalds], who’s the instigator of Linux, he maintains control of that software and says what additions and features will or will not be accepted—there’s a sort of policing going on. I didn’t have time at MIT to be policing Csound, so when I put that in the public domain I just turned it over to other people to try to exercise some sort of control over it. And what really happened was that there were lots of people who got in and scribbled all sorts of extra things in there and the program began to suffer from software bloat, which is a problem in a lot of systems. And that then slows it down, it gets big and ugly and there’s no consistency, no style to the thing.
Like somebody else building an extension on to your carefully-designed house...
Yeah, yeah. So I didn’t really want to be part of that and I’ve always maintained my own private version of Csound, and it doesn’t get into these other things. It cuts both ways—I think I’ve missed out on some things, a few valuable things that people have added, but on the other hand I haven’t had to deal with all the bloated software that the public Csound has become. And I suppose that’s the trade-off. You can be part of the public community, but unless you police it carefully, as Linus has done with Linux, then you just have to put up with what everybody contributes. That’s the disadvantage.
We’ve talked quite a bit about the technical aspects of all this, but what for you are the best pieces that you’ve heard coming out of the programs you’ve developed?
Hmmm. There was a piece developed by a young Scottish composer from Glasgow named John Lunn. He came to me at about 22 years old, and did a piece for piano and computer which he called Echoes. The live piano would play the initial note and the computer would supply the echoes: ‘dut dut dut’ etc. Of course the echoes were pre-recorded, pre-synthesized—this was back in 1980, I think—and so the echoes were added on to the piano note, at least in performance. So since the echoes were pre-recorded, there was a tape running, which had all the echoes on it, and the live performer just had to beat the echo by the right amount, so you’d have ‘sound, echo, echo; sound, echo, echo’. John Lunn played it himself, and did a very good job. At first I thought “John, you’ll never pull this off on stage”, but he did. And then I lost track of him. He went back to Glasgow and I didn’t hear much from him. Usually in the studio I tried to keep track of most of the composers who’d worked there—I’d get Christmas messages or interactions, they’d come back and visit. Never heard from John again. Until a couple of years ago when his name popped up and it turns out he’s the composer of the music for Downton Abbey and various other things. I now can recognize his style. I heard a track from Downton Abbey just the other night when I was down in Nelson with John Rimmer. I can recognize John’s initial piano piece in the Downton Abbey scores; the style is consistent. So there’s an example of someone who got away on me, but he did OK, I guess...
Another piece that I was quite fascinated with wasn’t a Csound piece. It was actually something that had been developed by a composer and scientist—and this person is an example of both—Jean-Claude Risset and he was commissioned to come to MIT and do a piece and he chose to do a piece for two pianos. Well, one pianist and two pianos, where the computer was playing the second piano and the computer was listening very carefully to what the player was doing and then responding to that. And he’d actually composed both scores, but the synchronization was that the second part would just come out of what he was doing on the first piano. We did a lovely recording of that actually, it’s a stereo recording in which you can hear Jean-Claude on one channel and the computer response on the other channel. So if you have stereo headphones it’s fine, but if you just hear it on loudspeakers you don’t get that separation and don’t appreciate what’s going on. But I was quite impressed with that piece that Jean-Claude did. It wasn’t a computer-synthesized piece at all. The computer was there arranging the interaction between these two forces. It was actually done on a system called Max/MSP, which had been developed by a student of mine, Miller Puckette, who had worked with me in Paris when I was doing the research with the violinist, the Fritz Kreisler piece. He was sitting at the back of the room saying “now perhaps we could organize this a little differently”. He began to develop a graphical front end to what I was doing and that eventually became something called Max/MSP, a graphical control.
Which has now become, as John Chowning has said, the lingua franca of electronic music.
It has indeed.
Have you used it yourself?
I haven’t. No, I haven’t. Don’t know why, I just haven’t had the occasion to learn it or something. But I’m just fascinated with what Miller did. He’s a smart person. It’s interesting that he had never really had any music training. He was trained as a mathematician, and he was an undergraduate in the studio at MIT sitting there—and he’s from Tennessee—and he used to sit at the back of the studio strumming on his guitar and so forth, and I would get very worried: “Miller, why aren’t you going to classes?”, and he’d say “Well, it’s only the classes on Tuesdays and Thursdays that I’m cutting, because Monday, Wednesday and Friday I’m just auditing. So the fact that I never go to any of them, I’m really only cutting Tuesday and Thursday”. And I continued to worry about him...
Doesn’t seem to have done him any harm...
And a few weeks or months later I noticed that the MIT math team had won the intercollegiate competition for the whole US, and the top scorer was Miller Puckette. So—a very natural mathematician, but then again, not very interested in the field of mathematics. He said there were only two or three problems in the field of mathematics that interested him and he felt that he could solve these part time. He was quite happy going down to San Diego where he was made a professor of music without any music training whatsoever.
Any other pieces that you feel are exemplary in their use of Csound or Music 360 or whatever?
Yes, well, I would perhaps pick out a couple of things: there’s Sphaera, for piano and computer, which was done by Bill Albright. He’d been a student along with me at Michigan—he was an undergraduate when we were grad students. As an undergrad he ran rings round all us grad students. He was very talented. I commissioned him later on to come and do a piece for piano and computer. The piano is live and the performance is by David Burge, who was the head of the piano department at the Eastman School of Music. We’ve done other performances of it, but it’s a very complex but satisfying work when you realize that all of the electronic part is actually modified piano sounds. It’s all piano sounds, and that’s interesting. And the other piece I might point out, which was a signal event, was when Mario Davidovsky, after 20 years of not doing any electronic music, came back into the field and did a piece, Synchronisms No.9, which I persuaded him to do using the computer. That was a signal event around MIT. In fact Milton Babbitt came up from Princeton, all the Harvard music faculty came along and heard the concert of Mario making a statement in the electronic field after so many years away. He had left the field because he felt he was repeating himself and pulled back to writing instrumental music.