How I made a Neural Melody Maker with Magenta.js

Logan Takahashi
Jun 9, 2019 · 6 min read


I recently started exploring how to use a neural network to generate music. As someone who has spent countless hours and nights editing MIDI blocks by hand to make music, I was intrigued by the idea of AI collaboration as a compositional tool. I decided that the best way to familiarize myself with machine learning would be to build something. As a starting point, I read up on the Magenta.js library, an open-source project started in 2018 at Google and geared towards making machine learning tools more accessible to artists. Magenta.js gives us access to three pre-trained models that each take a different approach to making music: MusicRNN, MusicVAE, and Piano Genie. Since this was my first venture into machine learning territory, I went with one of the most straightforward options: a MusicRNN model that deals with monophonic MIDI, meaning one note at a time.

I. What is MusicRNN?

Read more about the MusicRNN model

A recurrent neural network is a type of model that connects back on itself like a loop; MusicRNN specifically uses an LSTM (“long short-term memory”) network, a kind of recurrent model built to retain information over longer sequences. This looping structure is what makes recurrent models especially good at dealing with sequences of data. By feeding back into itself at every step, the model is able to take into account all of the previous notes in a sequence and, using a probability distribution over what could come next, output a new sequence. Within MusicRNN, there are specific models for drums (DrumsRNN), melody (MelodyRNN), and performance (PerformanceRNN). Luckily, Magenta.js handles much of this logic for us under the hood.

II. Using a MusicRNN model

It only takes a few steps to get a MusicRNN model up and running on the developer side:

$ npm i --save @magenta/music
  • Now all you have to do to create a model and generate an output sequence is execute a few lines of code. Here are the basic building blocks (a sketch follows this list):
  • First we must import Magenta and instantiate a new MusicRNN model, passing the desired checkpoint as its argument. (See a list of all the checkpoints you can use here.)
  • Next comes the note sequence, which is an array of objects. Each note should contain a pitch value, a start step, and an end step.
  • Then we build the seed sequence object, which contains the note sequence (as defined above) as one of its properties. It also includes the total number of steps and quantization info. (In this case, one quarter note equals one step.)
  • Finally, all that is left to do is call the .continueSequence method on the model. This method takes as arguments our seed sequence, the total number of steps the result should be (in our case 8), and the temperature. (Think of temperature as essentially how random the result will end up sounding. Anything above 2 gets pretty wild.) Note that .continueSequence returns a promise, so the best practice is to put it inside an async function and wrap it in a try/catch.
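Putting those building blocks together, a minimal sketch might look something like this (the hosted basic_rnn checkpoint and the three-note seed are just example choices):

```js
import * as mm from '@magenta/music';

// Instantiate the model with one of the hosted MusicRNN checkpoints.
const model = new mm.MusicRNN(
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/basic_rnn'
);

// The note sequence: each note has a pitch plus its start and end steps.
const notes = [
  { pitch: 60, quantizedStartStep: 0, quantizedEndStep: 1 },
  { pitch: 64, quantizedStartStep: 1, quantizedEndStep: 2 },
  { pitch: 67, quantizedStartStep: 2, quantizedEndStep: 3 },
];

// The seed sequence object: the notes, the total number of steps, and the
// quantization info (one quarter note per step).
const seedSequence = {
  notes,
  totalQuantizedSteps: 3,
  quantizationInfo: { stepsPerQuarter: 1 },
};

// .continueSequence returns a promise, so we await it inside an async
// function and wrap it in a try/catch.
async function generate() {
  try {
    await model.initialize();
    const result = await model.continueSequence(seedSequence, 8, 1.1);
    console.log(result.notes);
  } catch (err) {
    console.error(err);
  }
}

generate();
```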

Having a grasp of this basic flow of data is all you really need to get started with Magenta.js. But what if we don’t want to hard-code our notes? How can we input notes to a model in ways that are dynamic, interactive, and streamlined? And how do we turn a note sequence object into actual audio anyway? Answering these questions is where the true fun lies.

III. The Neural Melody Maker

https://melodymaker.herokuapp.com/

As part of a weekend hackathon project, I decided to adapt a standard step sequencer to be the interface for my MusicRNN model. It seemed to be the simplest way to test the flow of data I described above. One benefit of working with a sequencer is that the issue of quantization is already solved for us, as each note’s start and stop fits exactly on each step (as opposed to recording MIDI from a live take).

I set out to hack Magenta.js into an existing sequencer. Of the many demos and examples that exist online, I decided to use Mark Murray’s React Sequencer as my starting point. I wanted to be able to take advantage of React’s local state and component lifecycle methods, and I also appreciated that the sequencer is monophonic and fairly minimal. I pared down its features and design even more, focusing simply on its ability to trigger corresponding notes with a sine-wave synth using the Web Audio API.

The sequencer works by creating an interval that repeatedly cycles through an array of sub-arrays. Each sub-array represents one of the eight steps in the sequence, and each value in the sub-array represents one of the 12 possible notes in the scale. These arrays are mapped to a matrix of divs rendered as pads in a grid; each pad can be toggled on or off and corresponds to a frequency value that gets played by an instance of a Synth class.
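In simplified form, the underlying data and playback loop look roughly like this (the Synth, NOTES, and pads names are approximations, not the sequencer’s actual identifiers):

```js
// Minimal sine-wave synth built on the Web Audio API (an approximation of
// the sequencer's Synth class).
class Synth {
  constructor() {
    this.ctx = new AudioContext();
  }
  play(frequency, duration = 0.25) {
    const osc = this.ctx.createOscillator();
    osc.type = 'sine';
    osc.frequency.value = frequency;
    osc.connect(this.ctx.destination);
    osc.start();
    osc.stop(this.ctx.currentTime + duration);
  }
}

// Hypothetical stand-in for NOTES.json: note name -> frequency in Hz.
const NOTES = { C4: 261.63, 'C#4': 277.18, D4: 293.66 /* ...12 notes total */ };

// 8 steps x 12 notes; 1 means a pad is toggled on, 0 means off.
const pads = Array.from({ length: 8 }, () => new Array(12).fill(0));

const synth = new Synth();
let currentStep = 0;

// The playback interval: on each tick, play whichever pads are toggled on
// at the current step, then advance (and wrap) the step counter.
setInterval(() => {
  pads[currentStep].forEach((isOn, noteIndex) => {
    if (isOn) synth.play(Object.values(NOTES)[noteIndex]);
  });
  currentStep = (currentStep + 1) % pads.length;
}, 250); // step length in milliseconds; the tempo here is arbitrary
```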

The two main issues I had to solve in this project were: 1) how to record and format a note sequence for our model based on whatever pads are toggled on at a given time; and 2) how to sonically and visually output the new sequence on the interface’s grid.

For the first problem, I took advantage of the sequencer’s pre-existing play method. I made a global function called recorder that gets called at every step from within the play method, taking in each note as an argument. The recorder function keeps track of each note as the sequence is played through, and defines a new seed sequence after every 8 steps.
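A sketch of that idea, using assumed names (the actual recorder lives in the project source linked at the end of this post):

```js
// Hypothetical stand-in for MNOTES.json: note name -> MIDI pitch number.
const MNOTES = { C4: 60, 'C#4': 61, D4: 62 /* ...12 notes total */ };

let recordedNotes = [];
let seedSequence = null;

// Called from the sequencer's play method at every step, with whatever note
// (if any) is toggled on at that step and the step's index.
function recorder(noteName, step) {
  if (noteName) {
    recordedNotes.push({
      pitch: MNOTES[noteName],   // translate the note name into a MIDI pitch
      quantizedStartStep: step,
      quantizedEndStep: step + 1,
    });
  }
  // After a full pass through the 8 steps, package what was heard as the
  // new seed sequence and start collecting again.
  if (step === 7) {
    seedSequence = {
      notes: recordedNotes,
      totalQuantizedSteps: 8,
      quantizationInfo: { stepsPerQuarter: 1 },
    };
    recordedNotes = [];
  }
}
```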

The synth takes in a note’s actual frequency value in hertz (roughly 20 Hz to 20 kHz for audible pitches), but the model takes in a MIDI pitch number (0–127). In order to translate between the two, I had to create a reference .json object called MNOTES to use in conjunction with the NOTES.json object (see the recorder sketch above). Once we have our notes as MIDI pitch numbers, we can then format them to be part of a noteSequence object.

So if, for example, the sequencer had six notes selected across its eight steps, the corresponding note sequence would look something like this:
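```js
// A hypothetical six-note pattern, for illustration only.
const seedSequence = {
  notes: [
    { pitch: 60, quantizedStartStep: 0, quantizedEndStep: 1 }, // C4
    { pitch: 62, quantizedStartStep: 1, quantizedEndStep: 2 }, // D4
    { pitch: 64, quantizedStartStep: 2, quantizedEndStep: 3 }, // E4
    { pitch: 67, quantizedStartStep: 4, quantizedEndStep: 5 }, // G4
    { pitch: 69, quantizedStartStep: 5, quantizedEndStep: 6 }, // A4
    { pitch: 72, quantizedStartStep: 7, quantizedEndStep: 8 }, // C5
  ],
  totalQuantizedSteps: 8,
  quantizationInfo: { stepsPerQuarter: 1 },
};
```

(The specific pitches above are hypothetical; any combination of toggled pads gets formatted the same way.)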

Once we are able to create a note sequence from our toggled pads, we can now at any point call the generateSeq class method by hitting the Build Melody button. The generateSeq method calls .continueSequence on our model, passing in the current seed sequence. Our outputted sequence—a unique generated melody—is once again an object, with a notes property in the same format as the one we input.
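The core of that method might look something like this (written here as a standalone function; the model and seedSequence parameters stand in for values the component keeps in state):

```js
// Sketch only: the real generateSeq is a React class method that reads the
// seed from component state.
async function generateSeq(model, seedSequence) {
  try {
    // continueSequence(seed, steps, temperature) resolves to a new note
    // sequence whose notes property has the same shape as the seed's.
    const resultSeq = await model.continueSequence(seedSequence, 8, 1.1);
    return resultSeq; // in the component, this result is passed on to newView
  } catch (err) {
    console.error('Melody generation failed:', err);
  }
}
```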

How do we then translate this result sequence into something that can be played and displayed on the sequencer? To solve this second problem I made a class method called newView, which is called within the generateSeq method with the result sequence as its argument. Here’s what that code does, step by step (a sketch follows the list):

  • First, using MNOTES.json and a helper function that swaps key-value pairs, we create an object that maps the notes of a scale onto the values ‘0’–‘11’. This sets us up to deal with generated notes that fall above or below our given octave by transposing them into the current range.
  • Next we create a new array, seqForGrid, with length 8, and fill it with our generated notes by iterating through resultSeq.notes.
  • Finally, we iterate through our current state’s pads and compare them to the seqForGrid we just made. Only if there is an updated note at a corresponding index do we toggle it; otherwise we leave the old notes unchanged. In this way, we get a sense of a melody building on itself, rather than a completely new sequence each time.
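A sketch of those three steps, written as a standalone function with assumed identifiers (the row lookup below is computed directly from the hypothetical MNOTES object rather than via a key-swapping helper, but the effect is the same):

```js
// Same hypothetical MNOTES stand-in as in the recorder sketch above.
const MNOTES = { C4: 60, 'C#4': 61, D4: 62 /* ...12 notes total */ };

function newView(resultSeq, pads) {
  // Step 1: work out which grid row a MIDI pitch belongs to. The lowest
  // pitch in MNOTES is row 0, and generated notes that land above or below
  // the octave are folded back into range with modulo 12.
  const lowestPitch = Math.min(...Object.values(MNOTES));
  const rowForPitch = (pitch) => ((pitch - lowestPitch) % 12 + 12) % 12;

  // Step 2: an 8-slot array holding the generated note (as a row index)
  // for each step, or null where the model placed nothing.
  const seqForGrid = new Array(8).fill(null);
  resultSeq.notes.forEach((note) => {
    if (note.quantizedStartStep < 8) {
      seqForGrid[note.quantizedStartStep] = rowForPitch(note.pitch);
    }
  });

  // Step 3: only steps that received a generated note get new pads; the
  // rest keep their previous values, so the melody builds on itself.
  return pads.map((column, step) => {
    if (seqForGrid[step] === null) return column;
    const updated = new Array(column.length).fill(0);
    updated[seqForGrid[step]] = 1;
    return updated;
  });
}
```

In the actual React component, the grid returned by something like this would presumably be handed to setState so the pads re-render with the generated melody.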

I consider this project to be a very small first step in working with Magenta.js, but wanted to share my learning process so far with others who may also be interested in starting to work with the library. I am excited to keep exploring even more ways to use these tools.

Here’s the deployed version of the Neural Melody Maker so you can try it out yourself:

https://melodymaker.herokuapp.com/

Check out the source code for the project here:

https://github.com/Tdask/melodymaker
