
music + ai



[the Analytical Engine] might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine . . . supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.

Ada Lovelace, notes on the Sketch of the Analytical Engine (1843)


prologue

Nearly 200 years ago, Ada Lovelace dreamed that computers might one day "compose elaborate and scientific pieces of music of any degree of complexity or extent." That dream (or nightmare, depending on how you look at it) has finally come true. Computers are not only used as a tool for creating music, having become both an instrument and a recording studio, but now, with "AI" models, computers can also do the composing... sort of. We're entering a new era of "AI" thanks to a new approach to creating algorithms known as "Machine Learning," where instead of writing algorithms, programmers now train them to write themselves.

These algorithms are having (and will continue to have) drastic effects on every aspect of our society (including art). Today, artificial neural networks trained (often requiring enormous amounts of energy) on troves of data (which are not always ethically sourced) can make "predictions" and generate "hallucinations" (often with clear biases) that would have seemed like impossible sorcery just a few short years ago. In certain high-stakes applications this can save lives, but it can also destroy them. In other contexts this biased, hallucinatory, predictive sorcery can be quite exciting, as is the case with media art. This technology, like many others that came before it (smartphones, the Internet, the computer), will most certainly change everything in our field; exactly how and to what extent is still anyone's guess.



These talking machines are going to ruin the artistic development of music in this country. When I was a boy [...] in front of every house in the summer evenings, you would find young people together singing the songs of the day or old songs. Today you hear these infernal machines going night and day. We will not have a vocal cord left. The vocal cord will be eliminated by a process of evolution, as was the tail of man when he came from the ape.

John Philip Sousa (1906)


In the early 1900s the composer John Philip Sousa (best known for his patriotic marches) went on a public campaign to ban the new musical technology of his day. Sousa was deeply worried about the rise of recorded music on phonographs and gramophones. He believed these new machines would harm live music traditions, discourage people from learning to sing or play instruments, and even damage the nation’s culture. In 1906, he wrote an essay called The Menace of Mechanical Music, where he argued that recordings would turn people into “passive listeners” instead of active creators.

To spread his message, Sousa gave speeches and interviews urging the public to resist recorded music. He even lobbied Congress to outlaw it, fearing that if families stopped gathering around the piano to sing, America's musical spirit would fade. It's easy to say, in hindsight, that Sousa was wrong: the record player didn't kill music. It launched a new industry, widened access, and sparked fresh innovation: turntables became instruments, and whole genres (like hip-hop) emerged.

But new technologies usually bring both magic and loss. In Sousa’s day, pianos were common in middle-class homes, and U.S. piano sales peaked around 1909. As records (and later radio) rose, sheet-music sales fell (as did our ability to read it). Listening shifted from a social activity (going to a concert or gathering around the family piano to sing together) to something you could do alone: no family, no piano, not even a live band was necessary. While it's impossible to characterize these shifts as purely good or bad, they've certainly changed our relationship to music in drastic ways. Whether this next wave of technological change brings more benefits than harms will depend less on the tech itself and more on the choices we make as cultural producers and consumers.

At this pivotal moment, it's worth thinking deeply about what it means to creatively engage with AI. This is a big topic, but considering our goals in this class, a good place to start is thinking about how we plan on engaging with coding assistants.




coding with ai

As Ada Lovelace imagined nearly two centuries ago, we'll treat the computer not as a mere sound-making tool but as an instrument that embeds the logic of composition: systems that encode musical ideas and generate pieces. The code we write will push what it means to "write" music. To be clear: though we'll be creating generative algorithmic systems, we won't be training artificial neural networks (what folks are calling "generative AI"; I'll briefly address those systems at the end of this page). Instead, we'll be creating what you might call "classical AI": meticulously and deliberately hand-crafted code. Our goal is to be very intentional and critical about the algorithms we write: we want to explore and experiment with the possibilities of the digital medium, and create systems biased by our own aesthetic judgments (not the biases of a large and foreign training dataset).
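To make that distinction concrete, here's a minimal sketch of the kind of hand-crafted generative system we'll be building: a simple random walk over a scale, played with a Tone.js synth. Every choice in it (the scale, the step sizes, the rhythm) is an aesthetic decision we make ourselves; the specific values here are only an illustration, not a prescription.

```js
// a hand-crafted generative melody: a random walk over a scale we chose ourselves
// (assumes Tone.js is loaded, e.g. via <script src="https://unpkg.com/tone"></script>)
const scale = ['C4', 'D4', 'E4', 'G4', 'A4', 'C5']; // our aesthetic choice of palette
const synth = new Tone.Synth().toDestination();

let index = 0; // current position in the scale

// every eighth note, step down, stay, or step up the scale, then play that note
const loop = new Tone.Loop((time) => {
  const step = [-1, 0, 1][Math.floor(Math.random() * 3)]; // drunken-walk step
  index = Math.min(scale.length - 1, Math.max(0, index + step)); // stay inside the scale
  synth.triggerAttackRelease(scale[index], '8n', time);
}, '8n');

// browsers require a user gesture before audio can start
document.addEventListener('click', async () => {
  await Tone.start();
  loop.start(0);
  Tone.Transport.start();
});
```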

It’s tempting, especially with coding assistants at hand, to “vibe-code” our way to something interesting. So much is possible when we use code as a creative medium, but our imagination is always bounded by our understanding of it. If we turn to AI coding assistants too early and too often, our ideas become constrained by what a chatbot suggests, and while these assistants are very proficient coders, they're not very creative thinkers. The more time you invest in mastering the fundamentals and in practice-based experimentation, the better you'll understand what LLM-generated code is doing, the more agency you'll have to reshape it, and the more your ideas will grow from your own knowledge and experience rather than being boxed in by what an LLM produces.




ai policy

The future of coding will likely always involve some form of AI assistance. To make sure we learn to use these systems in ways that benefit us, extending our capabilities and supporting our creativity rather than undermining our learning and limiting our ideas, I ask that you adhere to the following:




ai protocol: crafting prompts

Use AI for hard problems or for feedback after completing an assignment, not for basic short questions like "why isn't my div centered?". LLMs are “sycophantic” by default: they’ll agree with your assumptions and guess your preferences, which can trap beginners in long back-and-forth loops or chats that "drift," leading to code that sort-of-works but is far from ideal. Instead, write your first prompt like a good Stack Overflow post: state the goal, give context and constraints, include your environment, and add a minimal reproducible snippet of code containing the issue. This not only improves the model's answer; the act of articulating the problem is itself generative and often reveals the fix.
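For instance, a well-framed first message might look something like the following (a hypothetical prompt with an invented Tone.js bug, purely for illustration):

```text
Goal: I want each click to play the next note from my melody array, in order.
Context: a browser instrument built with Tone.js (v14) for class; plain JS, no frameworks, tested in Chrome.
Problem: every click plays the first note only; no errors in the console.
Minimal snippet:

  const melody = ['C4', 'E4', 'G4'];
  const synth = new Tone.Synth().toDestination();
  document.addEventListener('click', () => {
    let i = 0; // I suspect this is the problem?
    synth.triggerAttackRelease(melody[i], '8n');
    i++;
  });

Constraint: I'd like to keep this vanilla JS, and I want to understand the fix, not just get a rewrite.
```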







epilogue

Over a century after Sousa lobbied against recording technology, his refrain returns: "We will not have a vocal cord left." In the era of AI, the worry goes, no one will learn instruments or theory because models can make music faster, cheaper, and personalized to each listener. Why would anyone pay a slower, more stubborn, human musician to write music when an algorithm can supply endless songs on demand? Composer Mark Henry Phillips felt that panic too. As someone who makes a living writing jingles for podcasts, he had an existential crisis, until he saw some creative possibilities...

Composer and sound designer Mark Henry Phillips on how AI music generators could fundamentally upend the industry, from WNYC's On the Media (Dec 27, 2024)

Phillips's take is a creative and optimistic response to what has felt like an intensely demoralizing situation for many. Maybe feeding something we've made into a generative system and seeing how it responds can be a great way of getting over writer's block. Still, many of these systems aren’t designed for co-creation. They’re positioned to replace artists, targeting buyers who would otherwise commission human work. This is evident in their design: the interface is optimized for generating finished products, not pieces to be edited and reassembled by an artist. We might think ChatGPT is a great writing partner, great at helping us write new lyrics, but consider the difference between ChatGPT's interface and TextFX, an experimental tool made by creative technologists at Google in collaboration with the rapper Lupe Fiasco: https://textfx.withgoogle.com

Lupe Fiasco interviewed about TextFX

Lupe Fiasco presents his collaborative project TextFX in Google's case study video




beyond apps

While we can definitely take some inspiration from the way Mark Henry Phillips and Lupe Fiasco reframe these models, less as anthropomorphized thought partners (which is often how they're marketed) and more as call-and-response ideation tools, we want to push things even further. We can engage with these models at the level of code, which means we can incorporate generative AI models into the systems we create. For example, while there are models that generate complete compositions baked down into an audio file, like the ones behind the app Phillips used, there are others that simply generate sequences of notes: musical data we can shape with the sounds/timbres we design and layer to our liking. We could, for example, use the MusicVAE AI model in combination with Tone.js, as sketched below. We could also do the inverse: keep the decision of which notes to play on our end and instead use an AI model to generate the timbre of each note, like this example using the GANSynth AI model.
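Here's a minimal sketch of that first idea, assuming @magenta/music and Tone.js are installed (e.g. via npm). The checkpoint URL points at one of Magenta's hosted melody models, and the click-to-start wiring is just illustrative; the important part is that the notes come from the model while the timbre stays ours to design.

```js
// ask MusicVAE for a melody, then play it through a synth we shape ourselves in Tone.js
import * as Tone from 'tone';
import * as mm from '@magenta/music';

// one of Magenta's hosted melody checkpoints (swap in whichever you prefer)
const CHECKPOINT = 'https://storage.googleapis.com/magentadata/js/checkpoints/music_vae/mel_4bar_small_q2';
const STEPS_PER_QUARTER = 4; // how these melody checkpoints quantize time
const SECONDS_PER_STEP = 60 / 120 / STEPS_PER_QUARTER; // at 120 bpm

const model = new mm.MusicVAE(CHECKPOINT);
const synth = new Tone.PolySynth(Tone.Synth).toDestination(); // the timbre is still our decision

async function generateAndPlay() {
  await Tone.start();                            // audio needs a user gesture in the browser
  await model.initialize();                      // downloads the model weights
  const [sequence] = await model.sample(1, 0.9); // 1 melody, temperature 0.9
  const now = Tone.now() + 0.1;
  sequence.notes.forEach((note) => {
    const start = note.quantizedStartStep * SECONDS_PER_STEP;
    const duration = (note.quantizedEndStep - note.quantizedStartStep) * SECONDS_PER_STEP;
    // convert MIDI pitch to a note name and schedule it on our synth
    synth.triggerAttackRelease(Tone.Frequency(note.pitch, 'midi').toNote(), duration, now + start);
  });
}

document.addEventListener('click', generateAndPlay);
```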

But integrating AI models into our projects doesn't have to be limited to music-specific models. We could use AI models to do all sorts of things beyond generating music/sound. For example, we could use vision models to detect hand positions and body poses and use those as a way of controlling the web-based instruments we make. If we pipe our webcam's feed into an AI model, as we do in this BlazePose model example, we can use the model's outputs to control the pitch and volume of an instrument, as in the sketch below.
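Here's a minimal sketch of that mapping, assuming some pose-detection setup (like the BlazePose example) hands us a wrist keypoint with x/y values normalized between 0 and 1. The onWrist callback name and shape are hypothetical stand-ins for whatever the library you use actually provides; the Tone.js side simply bends a held tone.

```js
// map body position to sound: vertical position -> pitch, horizontal position -> volume
import * as Tone from 'tone';

const synth = new Tone.Synth().toDestination();
let started = false;

// call this whenever the pose model produces a new prediction
// ({ x, y } is a hypothetical normalized wrist keypoint, 0–1 in each axis)
function onWrist({ x, y }) {
  if (!started) return;
  // y = 0 is the top of the frame, so invert it: hand up -> higher pitch
  const freq = 110 * Math.pow(2, (1 - y) * 3); // 110 Hz up to ~880 Hz, three octaves
  const volume = -30 + x * 30;                 // left -> quiet (-30 dB), right -> loud (0 dB)
  synth.frequency.rampTo(freq, 0.05);
  synth.volume.rampTo(volume, 0.05);
}

document.addEventListener('click', async () => {
  await Tone.start();       // audio needs a user gesture
  synth.triggerAttack(220); // hold a continuous tone we can bend with our body
  started = true;
});
```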




ai models as art

We could go even deeper: we could collect (or curate) data ourselves and then use it to train our own AI models, which we can then integrate into our projects. The works we're producing in class are all algorithmic systems that can be used to generate sound and music; our art isn't so much the system's output (the music our systems generate) as the system itself. From this perspective, an artist might approach the creation of an AI model (itself an algorithmic system) as an art object in its own right. Which is exactly what Holly Herndon did.

Holly Herndon TED talk

Holly Herndon presents her AI project Holly+ at TED

Holly Herndon, an artist and composer with a background in experimental new media, created an AI model similar to GANSynth, in that it creates audio buffers by generating a timbre at a specific pitch. But GANSynth was trained on the timbres of many different musical instruments, which means it can create new timbres that sound like a blend of instruments. Herndon's model, Holly+, was trained on her own voice, so it can only ever create audio that sounds like Holly Herndon herself: her voice, as an AI model, as art.