
chapter 1

sound

How can we create music with a computer? One way is to use a DAW (digital audio workstation), programs like Ableton or FL Studio. These apps are great when you want to create conventional music with ready-made tools. But when we want to push boundaries and tap the full potential of this meta-medium, we turn to code. Instead of simply using software someone else wrote, we can program our own. Pre-made software gives us only the buttons, knobs, and menus its designers imagined. That can save time, but it also limits what and how we can create. By relying on someone else's GUI (graphical user interface), we restrict our imagination to their design choices, rather than exploring the much larger creative space of the sounds a computer can make and the ways we might play them.

Learning to make sound and music with code isn't just about getting programs to run; it's about understanding the principles that make them possible. The ideas you can imagine in code grow directly from your grasp of core concepts. If you skip that foundation and lean only on tools (whether pre-made software or even generative AI), you'll tend to produce work bounded by what you already know or what the tool suggests. It may feel productive, but without a sense of how and why this code works, and the history of ideas behind it, you lose the agency to shape it into something truly your own. Invest in the fundamentals, history, and theory, and your imagination will expand alongside your technical craft. The more time you spend experimenting and wrestling with the material, the more you'll tap the true potential of the computer as a creative medium, and the more interesting and original your work will become.



what is sound

Before we talk about music, we need to talk about sound, what it is and how we’ll be making it. Sound happens when something vibrates, like a vocal cord, a guitar string, or a speaker. These vibrations cause the tiny particles in the air around them to move, creating patterns of squished-together areas (compressions) and spread-out areas (rarefactions). These patterns travel outward as waves. When these waves reach your ears, they make your eardrum vibrate in the same pattern. Your brain takes these vibrations and processes them into what you experience as sound—like speech, music or noise. In a way, sound only exists in your mind: it's your brain's way of interpreting the physical vibrations in the air into a meaningful hallucination. Without a listener, “sound” is nothing more than patterns of vibrations in the air.

vibration speed (aka "frequency" or "pitch")
vibration intensity (aka "amplitude" or "volume")


The pitch of a sound depends on how fast the particles are vibrating, and the volume depends on how much they’re moving. A sound's pitch is also called its frequency, which is measured in hertz (Hz), or cycles per second. Faster vibrations create higher frequencies and higher-pitched sounds, while slower vibrations create lower frequencies and lower-pitched sounds. The volume (loudness) of a sound depends on the wave's amplitude, how strong or "tall" the wave is: higher amplitude means louder sounds. Loudness is measured in decibels (dB), a scale on which every 10 dB increase means the sound is 10 times more intense than before. In short, amplitude controls loudness, and frequency controls pitch.
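As a quick numeric sketch of those two ideas (plain JavaScript; the variable names here are my own, not from the course examples):

```javascript
// A sound's frequency is its cycles per second; its period is the time
// one cycle takes, so a 440 Hz tone repeats every 1/440 ≈ 0.00227 seconds.
const frequency = 440            // A4, in Hz
const period = 1 / frequency     // seconds per cycle

// On the decibel scale, every +10 dB multiplies intensity by 10,
// so a +30 dB change means 10 * 10 * 10 = 1000x the intensity.
const intensityRatio = db => Math.pow(10, db / 10)
console.log(intensityRatio(30)) // 1000
```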

We'll be generating sound by vibrating speakers attached to our computer. These could be small headphone speakers or larger Bluetooth-connected speakers.




digital sound

So we'll rephrase the initial question: How can we vibrate our speakers (i.e., create sound) using code on a computer? Many languages and environments can create sound. Because our goal is to publish pieces on the open web, we'll use JavaScript, the Web's de facto programming language. Websites are built on HTML, a beginner-friendly language for structuring content. We won't spend much time on HTML in this class, but if you're comfortable with HTML (and CSS) you're welcome to use them in your projects. Our focus will be JavaScript. To run it in the browser, we'll put our code inside a <script> tag in an HTML file. Running JavaScript in a web browser gives us access to the Web Audio API, a set of audio building blocks for generating and shaping sound from scratch.
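To make that concrete, here's a minimal sketch of such an HTML file (not one of the course's examples; the calls shown, AudioContext, createBuffer, and createBufferSource, are standard Web Audio API). Its script fills an audio buffer with one second of a sine wave and plays it:

```html
<!DOCTYPE html>
<html>
  <body>
    <script>
      // browsers require a user gesture (like a click) before audio can start
      document.body.addEventListener('click', () => {
        const ctx = new AudioContext()
        // one channel, one second of samples, at the context's sample rate
        const buffer = ctx.createBuffer(1, ctx.sampleRate, ctx.sampleRate)
        const data = buffer.getChannelData(0)
        for (let i = 0; i < data.length; i++) {
          // a 440 Hz sine wave at half volume
          data[i] = Math.sin(2 * Math.PI * 440 * (i / ctx.sampleRate)) * 0.5
        }
        const src = ctx.createBufferSource()
        src.buffer = buffer
        src.connect(ctx.destination) // wire it to the speakers
        src.start()
      })
    </script>
  </body>
</html>
```

Open the file in a browser and click anywhere on the page to hear the tone.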





layers of abstraction

In the last of the examples above, we created a function called playTone() which abstracts the logic for creating an audio buffer (the array of values determining the speaker cone's excursion, aka displacement/deflection) into a higher level concept: playing a tone. This function can be controlled by passing it two values, one representing the pitch of the tone (the sound wave's frequency) and the other the volume of the tone (the sound wave's amplitude). Abstraction is choosing what details to ignore so you can think at the right level. If we were to give our program a GUI (graphical user interface), itself another layer of abstraction, it might look like this...


play tone

(waveform not to scale, slowed down for demonstration)

pitch 440 hz
volume 5
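Behind a GUI like that, a playTone() along the lines described above might be sketched like this (a sketch only; the helper name createToneSamples is mine, not the course example's, and the buffer math is separated out as a pure function):

```javascript
// Pure math: fill an array with one second of sine-wave samples
// at a given frequency (pitch) and volume (amplitude).
function createToneSamples (frequency, volume, sampleRate = 44100) {
  const samples = new Float32Array(sampleRate)
  for (let i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(2 * Math.PI * frequency * (i / sampleRate)) * volume
  }
  return samples
}

// The abstraction: callers think "play a tone", not "fill a buffer".
// (Browser-only: AudioContext is not available outside the browser.)
function playTone (frequency, volume) {
  const ctx = new AudioContext()
  const samples = createToneSamples(frequency, volume, ctx.sampleRate)
  const buffer = ctx.createBuffer(1, samples.length, ctx.sampleRate)
  buffer.copyToChannel(samples, 0)
  const src = ctx.createBufferSource()
  src.buffer = buffer
  src.connect(ctx.destination)
  src.start()
}

// playTone(440, 0.5) // A4 at half volume
```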

Computing is built from layers of abstraction, often called a "stack." At the bottom, electrons flow through transistors, encoding bits (1s and 0s) that the CPU turns into machine instructions. Above that sit the operating system and audio drivers that manage timing and move sample data to your sound hardware (which drives your speakers). In the browser, the JavaScript engine and the Web Audio API provide higher-level building blocks; our own functions sit above those, and at the top the GUI translates gestures into parameter changes. Each layer hides complexity and offers a different vocabulary for making sound. Creative agency comes from knowing where you are in that stack, what each layer affords, and when to step down a level to regain precision or invent something new.

So far we've been working in the middle of the stack with JavaScript, talking directly to the browser's low-level Web Audio API. Next, we'll layer in libraries that let us think more musically and visually. Tone.js abstracts Web Audio into higher-level instruments and signal flow (e.g., Synth, Filter, Gain), a musical time system ('4n', BPM, bars), and a reliable scheduler (Transport) so we can focus on notes, rhythms, and timbre instead of coding all that logic ourselves. We're going to keep experimenting with creating audio buffers from scratch for now, but know that later Tone.js will add all these layers of abstraction to our stack of tools. For interfaces, we'll be using the netnet-standard-library (or nn.min.js for short), which provides some music theory functions (for creating modes, scales, and chords) and also streamlines the browser's DOM API so building sliders, buttons, and simple UIs is quick and clear. We'll use these abstractions for most of the course to move faster and explore ideas, but keep in mind that you can always drop down to raw Web Audio or the vanilla DOM API when you need finer control.
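For a taste of the difference, playing a tone with Tone.js looks roughly like this (a browser-only sketch; it assumes Tone.js has been loaded via a script tag, and the note/duration values are arbitrary examples):

```javascript
// Browsers require a user gesture before audio can start.
document.body.addEventListener('click', async () => {
  await Tone.start()                        // unlock the audio context
  const synth = new Tone.Synth().toDestination()
  synth.triggerAttackRelease('A4', '4n')    // note name + musical duration
})
```

Compare this with filling a raw buffer sample by sample: the frequency math, amplitude envelope, and scheduling have all moved into the Synth abstraction.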





timbre

In the examples above we created two different sounds entirely from scratch, by calculating the raw data for each (the sample values to fill the audio buffer). These two sounds are, in a sense, opposites. The first was completely random, pure “noise,” like the sound of a consonant in speech or a crash cymbal on a drum set. The second was a “tone,” like the sound of a vowel in speech or a key on a piano. It’s actually the purest tone we could make: the sound of a “sine wave.”
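The contrast between the two comes down to how each sample relates to its neighbors. A sketch of the two kinds of buffer data (plain JavaScript, with assumed names; not the course examples' exact code):

```javascript
const sampleRate = 44100
const length = sampleRate // one second of samples

// Noise: every sample is an unrelated random value between -1 and 1.
const noise = new Float32Array(length)
for (let i = 0; i < length; i++) {
  noise[i] = Math.random() * 2 - 1
}

// Tone: every sample follows one smooth, repeating sine pattern (440 Hz).
const tone = new Float32Array(length)
for (let i = 0; i < length; i++) {
  tone[i] = Math.sin(2 * Math.PI * 440 * (i / sampleRate))
}
```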

However, pure sine waves don’t exist in the natural world; they can only ever be synthesized electronically (be it analog or digital). Most "natural" sounds with repetitive or predictable vibrations—what we hear as musical tones—are a mix of a fundamental frequency (the dominant tone, like 440 Hz for A4) and additional frequencies called overtones or harmonics. These overtones are multiples of the fundamental and give the sound its complexity.

This mix of frequencies creates a sound’s timbre, the unique quality or texture that makes a guitar and a piano playing the same note sound different. While they share the same fundamental frequency, their overtones are emphasized differently, giving each instrument its distinctive "color." Overtones are what make musical tones rich and varied, compared to the pure simplicity of a sine wave.

above: waveform (shape of vibration) | below: spectrum analyzer (fundamental frequency and harmonics)

In digital audio, we often describe four basic wave types, each with unique characteristics. Sine waves are pure tones, containing only the fundamental frequency with no harmonics, making them smooth and simple. Square waves are richer, combining the fundamental with only the odd-numbered harmonics (e.g., 3rd, 5th, 7th), giving them a buzzy, hollow sound. Saw waves are even more complex, containing the fundamental and all harmonics, which creates a bright, edgy sound often used in synth music. Triangle waves are similar to square waves but softer; they also include only odd harmonics, but the higher harmonics are much quieter, resulting in a more subdued, rounded tone. Each wave shape has a distinct frequency content, shaping its sound and timbre.
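Those harmonic recipes can be written out directly as sums of sine waves (this is "additive synthesis"). Here's a sketch that builds one cycle of each shape from a few harmonics; it's an approximation, since a true square or saw wave needs infinitely many, and the function name and normalization are my own:

```javascript
// Build one cycle (N samples) by summing sine harmonics.
// partials is a list of [harmonicNumber, relativeAmplitude] pairs.
function additiveCycle (partials, N = 512) {
  const out = new Float32Array(N)
  for (const [h, amp] of partials) {
    for (let i = 0; i < N; i++) {
      out[i] += amp * Math.sin(2 * Math.PI * h * (i / N))
    }
  }
  return out
}

// Square: odd harmonics only, at amplitude 1/h.
const square = additiveCycle([[1, 1], [3, 1 / 3], [5, 1 / 5], [7, 1 / 7]])
// Saw: all harmonics, at amplitude 1/h.
const saw = additiveCycle([[1, 1], [2, 1 / 2], [3, 1 / 3], [4, 1 / 4]])
// Triangle: odd harmonics only, much quieter (1/h^2), alternating sign.
const triangle = additiveCycle([[1, 1], [3, -1 / 9], [5, 1 / 25], [7, -1 / 49]])
```

Adding more partials to the square and saw recipes sharpens their corners, which is exactly the effect Jack Schaedler's interactive Fourier page (linked below) lets you explore.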





a note on these visualizations

The examples above all make use of a couple of helper functions for visualizing our audio, createWaveform() and createSpectrum(). This lesson isn't about audio visualization, but if you're curious, you can view the source code for both the createWaveform and createSpectrum functions. Visualizations like these are created using an algorithm called the Fast Fourier Transform, or FFT for short. This algorithm has a very interesting history worth learning more about. I'd also suggest checking out Jack Schaedler's interactive page on the Fourier Transform for a more intuitive understanding of how adding or removing harmonics from a fundamental frequency changes the wave shape. The Web Audio API has a built-in implementation of the FFT algorithm in AnalyserNode, which I explain here, along with how to use it to create visualizations with canvas using the Web's native APIs. Tone.js also has its own higher-level abstractions of this AnalyserNode, which make it easier to create visuals with the same FFT algorithm running behind the scenes; I've created examples for these as well, including how to create a volume meter, how to draw a waveform, and how to draw frequency spectrum bands.
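The FFT is a fast algorithm for computing the discrete Fourier transform (DFT), which measures how much of each frequency a signal contains. As a toy illustration of that underlying idea (a naive O(N²) version, not the Web Audio implementation), here's a sketch that finds the strongest frequency bin in a generated sine wave:

```javascript
// Magnitude of the DFT at bin k, for a signal of N samples.
// (The FFT computes the same values, just in O(N log N) instead of O(N^2).)
function dftMagnitude (signal, k) {
  const N = signal.length
  let re = 0
  let im = 0
  for (let n = 0; n < N; n++) {
    re += signal[n] * Math.cos(2 * Math.PI * k * n / N)
    im -= signal[n] * Math.sin(2 * Math.PI * k * n / N)
  }
  return Math.sqrt(re * re + im * im)
}

// A sine wave with exactly 5 full cycles across N samples
// should produce a spectrum that peaks at bin 5.
const N = 256
const signal = Array.from({ length: N }, (_, n) => Math.sin(2 * Math.PI * 5 * n / N))

let peakBin = 0
for (let k = 1; k <= N / 2; k++) {
  if (dftMagnitude(signal, k) > dftMagnitude(signal, peakBin)) peakBin = k
}
console.log(peakBin) // 5
```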






That said, our main goal with this lesson is not to learn how to create sound visualizations (we'll revisit this later). Our goal is to understand one of the most fundamental principles of digital audio: sound buffers. Over the next few weeks we'll start climbing the ladder of abstraction and begin expressing sounds at a higher level (eventually in terms of musical notes), but before we do, it's worth spending some time experimenting with buffers at this lower level. This will not only deepen your fundamental understanding of what digital audio is and how it works, but may also plant a seed for future ideas that could benefit from dipping back down into this lower level.

Attribution: Text and code written by Nick Briz. The code editor icons designed by Meko and licensed under Creative Commons Attribution License (CC BY 3.0). Air pressure and sine wave diagram remixed from Jack Schaedler's Seeing Circles, Sines and Signals. All sounds generated using the Web Audio API and/or Tone.js by Yotam Mann and other contributors.