Alum Discusses AI Techniques In Music Technology
On Thursday, the AI in the Liberal Arts (AILA) speaker series hosted Daniel Flores Garcia ’24 to discuss the increasing integration of generative artificial intelligence (AI) into music technology, emphasizing the disproportionate representation of Western musical styles in AI-generated music.
On Thursday, at the Lyceum’s CHI Think Tank, Daniel Flores Garcia ’24 explored the burgeoning relationship between large language models and music in his talk, “Musicking with Generative Models.” The talk was part of a series of alumni speakers organized by AI in the Liberal Arts and was moderated by former visiting professor Ravi Krishnaswami and Jessica Strauss ’27.
Flores Garcia outlined how music and technology had a strong relationship for decades before artificial intelligence joined the scene. The invention of the tape recorder was an early example, which later gave rise to the technique of “looping.” The loop pedal then brought looping into real-time performance. “Live movement becomes a lot more accessible, and it kind of permeates to different genres [becoming] one of the mainstay tools of very popular musicians such as Ed Sheeran,” Flores Garcia said.
Song-editing techniques also grew more advanced even as the machinery became simpler, making them still more accessible. Devices such as samplers became ubiquitous in the music industry, introducing a new experimental quality and influence over tracks.
More recently, AI has been increasingly integrated into music-making, creating new ways for individuals to both create music and interact with it in real time.
Flores Garcia explained that certain AI software can now alter melody and rhythm in real time, as well as change musical elements on command. Much of this AI-generated music, however, hews closely to Western musical styles. Flores Garcia’s thesis, for instance, statistically analyzed the rhythmic composition of AI-generated music and found that it favored traditional, European-derived rhythms that are more metrically straightforward. He expressed disappointment that the AI-generated music evaluated in his thesis lacked representation of Afro-Caribbean rhythms, which center on the “clave,” a syncopated five-stroke pattern. According to Flores Garcia, because AI music is generated by large language models trained largely on Western-style music, it naturally produces output that reflects that input.
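For readers unfamiliar with the clave, the five strokes can be written out as onsets on a 16-step grid. The short Python sketch below is illustrative only, not Flores Garcia’s analysis code: it encodes the standard 3-2 son clave and a simplified, hypothetical syncopation measure to show why the pattern reads as metrically less straightforward than a European-derived “four-on-the-floor” rhythm.

```python
# Illustrative sketch (not the thesis's actual analysis): the 3-2 son clave
# as onsets on a 16-step grid, plus a toy syncopation score that counts how
# many strokes fall off the strong beats. The pattern and beat positions are
# standard textbook values; the scoring function is a simplification.

# 16 sixteenth-note steps per bar; 1 = stroke, 0 = rest
SON_CLAVE_32 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

# A straight four-on-the-floor pattern for comparison
STRAIGHT_4_4 = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

# Strong beats in 4/4 land on steps 0, 4, 8, and 12
STRONG_BEATS = {0, 4, 8, 12}

def syncopation_score(pattern):
    """Fraction of strokes that land off the strong beats."""
    onsets = [i for i, hit in enumerate(pattern) if hit]
    off_beat = [i for i in onsets if i not in STRONG_BEATS]
    return len(off_beat) / len(onsets)

print(syncopation_score(SON_CLAVE_32))  # 0.6 -- three of five strokes syncopated
print(syncopation_score(STRAIGHT_4_4))  # 0.0 -- every stroke on a strong beat
```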
While at Amherst, Flores Garcia sought to improve the representation of non-Western musical styles in current generative models, which led him to build an augmented dataset for exactly that purpose. He helped create ClaveNet, a generative model trained on Afro-Caribbean rhythms to bring AI-generated output closer to non-Western styles. “I think this data implementation scheme is a good example of how to incorporate musical expertise into the training of the deep learning model early on from the training stage,” Flores Garcia said.
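Flores Garcia did not walk through ClaveNet’s internals in the talk, but the general idea of seeding training data with clave-based material can be sketched in a few lines of Python. Everything below (the `overlay_clave` and `rotate` helpers, the `augment` loop) is a hypothetical illustration of such an augmentation scheme, not ClaveNet’s actual pipeline.

```python
# Hypothetical sketch of a clave-centered data-augmentation step, in the
# spirit of what the talk describes; this is NOT the actual ClaveNet code.
# A drum-pattern dataset is expanded by overlaying a clave template and by
# rotating patterns, so the training distribution carries more Afro-Caribbean
# rhythmic structure before a generative model ever sees it.
import random

SON_CLAVE_32 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

def overlay_clave(pattern, clave=SON_CLAVE_32):
    """Merge a clave part into an existing 16-step drum pattern."""
    return [max(a, b) for a, b in zip(pattern, clave)]

def rotate(pattern, steps):
    """Shift a pattern by `steps` sixteenth notes, wrapping around the bar."""
    return pattern[-steps:] + pattern[:-steps]

def augment(dataset, copies=4, seed=0):
    """Return the original patterns plus clave-seeded, rotated variants."""
    rng = random.Random(seed)
    augmented = list(dataset)
    for pattern in dataset:
        for _ in range(copies):
            variant = overlay_clave(rotate(pattern, rng.randrange(16)))
            augmented.append(variant)
    return augmented

# A tiny "Western" dataset of one straight backbeat pattern
dataset = [[1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]]
print(len(augment(dataset)))  # 5: the original plus four clave-seeded variants
```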
Lee Spector, professor and chair of computer science at Amherst College, attended the talk, having known Flores Garcia as a student. “I think today’s headline AI models are a lot worse than most people think, but that the potential is actually a lot higher than most people [think],” he said. He emphasized that while the technology has come a long way in its ability to generate new styles and genres, its development still has far to go.
Flores Garcia’s model aims to steer that potential in a more representative direction. Although he now works full-time as a software engineer at Amazon, he hopes to continue these initiatives on the side. His next project is a Caribbean drum generator that a performer can play and interact with in real time.