Reflections
Composition and Generative AI
At the very beginning of my research, I started experimenting with AI, in particular with symbolic AI –often called GOFAI (Good Old-Fashioned Artificial Intelligence). Later, I started incorporating into my practice methods from Machine Learning (ML) and Deep Learning (DL). DL lies at the core of the big gen-AI models, such as ChatGPT and others. Even though the scale of implementation and development of these methods in my artistic research project is relatively small, I believe that their mere inclusion in my practice is enough to raise important ethical considerations that deserve thoughtful reflection.
The boom of gen-AI happened long after I started my research position. On November 30, 2022 –the day after my Ph.D. midway evaluation– ChatGPT, based on GPT-3.5, was publicly released. *Before that, I was already aware of some of OpenAI's advancements in the field, and in fact I had the pre-trained GPT-2 model on my hard drive. I carried out some experiments with it that led to an idea for a piece in which this model would receive, as prompts, verses of poems by Nelly Sachs and output text that would be projected in real time on a screen. In the end, this project was never finished. Very shortly after, the sudden awareness of its imminent social, cultural, and historical impact had an unexpected and profound effect on my ideas around AI and, naturally, on my artistic research process. In short, almost all my ideas and the philosophical foundation of the project entered a profound crisis, triggering an on-the-fly reevaluation, not only of technical aspects but, more importantly, of the ethical implications of using gen-AI as a composer or artist. This reevaluation extended to the evolving societal role of these systems, their integration into daily life, and ultimately, their broader impact on humanity as a species.
As of today, I believe that we are on the verge of a historical change, and it is hard to foresee a clear picture of what the future will be for humans in relation to AI. That being said, I believe that an extended discussion of the social, cultural, and economic implications of new advancements in AI falls well outside the scope of this project. In this sense, I will try to keep the reflection within the general field of art creation, and in particular, music. However, I believe that fully dodging the existential discussion around the matter is not possible, and at times I have felt the need for more general reflections, even though, in this text, I have tried to keep them to the minimum necessary.
Authorship, copyright, and ownership
The first aspect that caught my interest around gen-AI is the use of datasets consisting of large amounts of preexisting music to train models for generating new musical material. The newly generated music, of course, keeps some resemblance to the training data. I experimented with this methodology in my cycle Oscillations. But let me return to Oscillations later, so as not to spoil my ideas on the matter just yet.
This problem is not necessarily new in music. I can think of musical quotation, along with many other compositional techniques that use preexisting music in ways that allow a listener to recognize –more or less clearly– that the source was not created by the current composer. This has been a common practice from the music of the Renaissance to the music of our days. Yes, this is a large and heterogeneous time frame to consider. Potentially, the analogy could be restricted to music composed after February 15, 1972, the date when fixed recordings became subject to ownership protection in the United States.
This leads me to compare it with a preexisting compositional technique: sampling. And I think here of the paradigmatic example of sampling –and probably the closest to gen-AI in terms of implications, as many procedures seem equivalent in both, as does the legal framework that seems to encompass them– the genre “Plunderphonics,” *A comprehensive database of works and theory related to plunderphonics can be found at https://www.plunderphonics.com/xhtml/x.html started by the composer John Oswald. Oswald’s presentation at the Wired Society Electro-Acoustic Conference in Toronto in 1985 remains fully relevant today, and I highly recommend reading it in full. Here is how it starts:
“Musical instruments produce sounds. Composers produce music. Musical instruments reproduce music. Tape recorders, radios, disc players, etc., reproduce sound. A device such as a wind-up music box produces sound and reproduces music. A phonograph in the hands of a hip hop/scratch artist who plays a record like an electronic washboard with a phonographic needle as a plectrum produces sounds which are unique and not reproduced - the record player becomes a musical instrument. A sampler, in essence, a recording, transforming instrument, is simultaneously a documenting device and a creative device, in effect reducing a distinction manifested by copyright.”
Essentially, the article highlights the visible contradictions and blurry limits of the idea of ownership in music by questioning the validity of the concept of copyright. Oswald delineates his philosophical perspective on the issue, but in addition, he carefully reviews an important body of legal regulations on copyright. I propose that the reader follow this sequence of quotations –some taken from Oswald’s presentation, others gathered from other sources– which, to my understanding, make the problem fairly evident:
“Copyright protection gives the owner of copyright in a musical composition the exclusive right to make copies, prepare derivative works, sell or distribute copies, and perform or display the work publicly.”
“This protection is automatic: it arises from the mere fact of creation by the author. Copyright does not require registration with an intellectual property institution. (…) This right protects a work provided that it is original, i.e., that it bears the imprint of the author’s personality: his/her sensibility, his/her non-imposed choices, and his/her perception of the subject.”
“An author is entitled to claim authorship and to preserve the integrity of the work by restraining any distortion, mutilation, or other modification that is prejudicial to the author’s honor or reputation.”
“An artist who does not retain the copyright in a work may use certain materials used to produce that work to produce a subsequent work, without infringing copyright in the earlier work, if the subsequent work taken as a whole does not repeat the main design of the previous work.”
“The fair use of a copyrighted work [by someone other than the author], including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.”
“Sonic impersonation is quite legal.”
Setting aside the last –ironic– quotation, there seems to be a significantly high degree of legal ambiguity on this issue. Laws don’t really solve the problem here. In addition, one can find a wide range of opinions on this issue among different practitioners: composers, performers, music producers, DJs, managers, record labels, artistic collectives, and so on. It is a difficult issue to take a stand on. My perspective comes from considering the issue from both sides of the counter. However, I believe these questions can also be depersonalized and framed within the broader field of practice:
- Would I be happy if someone took one of my pieces, altered it somehow –but kept some resemblance to the original– and passed it off as his/her own? No, I would not. I would see this as ethically and morally reprehensible. In this scenario, even though the laws have a certain degree of ambiguity, they would favor my position.
- Would my attitude change depending on whether or not I am asked for authorization? Probably.
- Would this more permissive attitude change if the person acknowledged my original authorship and described his/her agency on it as a “contribution” with some sought end? Maybe. It would probably depend on the details of the endeavor, including potential economic compensation.
- Should I assume that my stance on the issue is shared by every composer, performer, or artist? Certainly not.
Probably, most music makers wouldn’t be happy to have their music fed to a neural network that generates an outcome that somehow resembles their works. Maybe the outcome doesn’t even resemble them explicitly; rather, their music simply exerts some agency over what the model’s output becomes. Most likely, these people haven’t been asked for authorization to include this material, and they won’t get any compensation or even credit for it. *See for example this recent article in the Washington Post: Jonathan Taplin and T Bone Burnett, "Opinion | To protect human artistry from AI, new safeguards might be essential," The Washington Post, March 14, 2023, https://www.washingtonpost.com/opinions/2023/03/14/artificial-intelligence-threatens-creative-artists/. However, some others might not have any concerns at all about this, considering that their musical spirit –I am not sure if this is a good way of putting it– would be incarnated in the most motley derivative outcomes.
Ultimately, my approach to gen-AI follows some principles. First of all, I reject the extractivist *The concept of extractive AI comes from adapting the notion of an extractive economy to the development and functioning of big AI models. I also use the term in relation to the indiscriminate gathering of data without any concern for, or questioning of, origin, copyright, and later use. use of preexisting data to train these models. Data that is protected by copyright should not be included in these datasets without the authors’ permission. This also applies to images and any other creative content. *At some point, I used images generated by DALL-E 2 and 3 as illustrations for some content of the project, such as presentations or even record covers. After some reflection, I decided to remove them from everywhere and not use them at all. Later, I softened my position a bit and allowed myself to use some generated content from Adobe Firefly, as long as it maintained an ethical stance aligned with my ideas (see https://www.adobe.com/ai/overview/ethics.html). However, new controversies keep arising, and it’s hard to keep a rock-solid stance on the issue.
Having said that, I can reveal more details on how this stance materialized within my creative process. For example, for the composition of the pieces Oscillations (i and iii), I used symbolic datasets to create models later used for generative purposes. In particular, I used the scores of some songs of the cycle Winterreise. As Schubert’s music is in the public domain, this represents no legal problem. However, in the process of composing the piece Oscillations (ii), I experimented with an audio dataset to train a generative model.*I used the tool RAVE, a variational autoencoder for fast and high-quality neural audio synthesis, developed by Antoine Caillon and Philippe Esling. RAVE and its documentation are available here: https://github.com/acids-ircam/RAVE As recordings of the cycle are protected by copyright, this would have been ethically –and possibly legally– problematic. This made me abandon the idea for some time. However, I later discovered the Schubert Winterreise Dataset (SWD),*Christof Weiss et al., "Schubert Winterreise Dataset," (Zenodo, 2024). https://doi.org/10.5281/zenodo.10839767. a multimodal dataset comprising various representations of Franz Schubert’s 24-song cycle Winterreise, including two full audio recordings of it. These recordings are licensed under Creative Commons and are fully usable for training an AI model.
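To make the symbolic side of this workflow a little more concrete, here is a minimal sketch of the kind of next-pitch model one could train on sequences extracted from public-domain scores. This is purely illustrative and is not the code used in Oscillations or in NeuralConstraints: the toy corpus, the model size, and the training loop are hypothetical stand-ins, assuming PyTorch as the underlying library.

```python
# Illustrative sketch only: a tiny next-pitch model trained on symbolic
# sequences (e.g., MIDI pitch numbers extracted from public-domain scores).
# Corpus, architecture, and hyperparameters are hypothetical.

import torch
import torch.nn as nn

# Toy corpus: each inner list stands in for the pitch sequence of one song.
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 65, 67],
]

vocab = sorted({p for seq in corpus for p in seq})
to_idx = {p: i for i, p in enumerate(vocab)}

class NextPitchModel(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

model = NextPitchModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train to predict each pitch from the pitches that precede it.
for epoch in range(200):
    for seq in corpus:
        idx = torch.tensor([[to_idx[p] for p in seq]])
        x, y = idx[:, :-1], idx[:, 1:]
        logits = model(x)
        loss = loss_fn(logits.reshape(-1, len(vocab)), y.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

# Generate a short continuation from a prompt of pitches.
prompt = [60, 62, 64]
idx = torch.tensor([[to_idx[p] for p in prompt]])
for _ in range(8):
    with torch.no_grad():
        logits = model(idx)[0, -1]
    nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
    idx = torch.cat([idx, nxt.view(1, 1)], dim=1)

print([vocab[i] for i in idx[0].tolist()])
```

The same generate-from-a-prompt logic scales up to larger symbolic corpora; what changes ethically is only where that corpus comes from.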
Training generative models with music in the public domain, or with music that is shared with others for that purpose, *Training models with one’s own music and sharing it with other artists is an emerging practice nowadays. An artist who advocates for this use is Moises Horta Valenzuela. More on Moises here: https://moiseshorta.audio/. I believe, somehow settles the legal issue, although I am not so sure it resolves the ethical implications around it.
Journalist: If I recombine by hand the most recent works of David Cope, who gets the copyrights, you or me?
David Cope: This would be depending on the size and number of the recombinations. Reversing the order of two halves of one of my works would be plagiarizing. Composing a new work on the first four pitch classes of one of my compositions would not be plagiarism. There are, of course, an almost infinite number of gradations between these two extremes, and somewhere in the middle, things get very gray. These should be decided on a case-by-case basis.
Ideating vs. Making
Is prompting an important aspect of the creative process of an artwork? Does it afford the potential for novelty? Seriously, what kind of questions are these? Everyone can write prompts! One doesn’t even need to be a musician to prompt a model and get “the best music ever”! *This example was used by Koka Nikoladze in his presentation “Generative AI and Music(sk)aping,” given at the Norwegian Federation of Composers’ Professional Seminar, "Music, Creation and Consciousness," October 16-17, 2023.
Well, I have to confess that, throughout the development of NeuralConstraints,*NeuralConstraints is a CAC tool that combines neural generation with constraint algorithms. More on NeuralConstraints in the ‘Contributions’ section. I found myself spending significant time finding the right musical prompts –in this case, sequences of pitches, pitch classes, intervals, etc.– to get the most interesting continuations. Even with NeuralConstraints, which is a relatively small-scale neural generative tool, subtle differences in prompts had different and curious effects on the results. At some point, it became interesting to try out many different possibilities, with varying degrees of relation to the original dataset, with different musical styles, or even by simultaneously combining different models trained on different datasets, since NeuralConstraints allows one to do this.
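To illustrate the general principle behind such a tool –without claiming to reproduce NeuralConstraints itself– here is a minimal generate-and-test sketch in which candidate continuations proposed by a model are kept only if they satisfy user-defined constraints. The `sample_continuation` function is a hypothetical stand-in for any trained generative model (for instance, the toy model sketched earlier), and the two constraints shown are arbitrary examples.

```python
# Illustrative sketch only, not NeuralConstraints itself: constrained
# generation via generate-and-test (rejection of candidates that break rules).

import random

def sample_continuation(prompt, length=8):
    # Hypothetical model call; here it just proposes random pitches
    # in a range around the prompt.
    return [random.choice(range(min(prompt) - 2, max(prompt) + 3))
            for _ in range(length)]

def within_ambitus(seq, low=55, high=79):
    # Constraint 1: keep every pitch inside a fixed register.
    return all(low <= p <= high for p in seq)

def no_immediate_repetition(seq):
    # Constraint 2: forbid two identical consecutive pitches.
    return all(a != b for a, b in zip(seq, seq[1:]))

def constrained_continuation(prompt, constraints, max_tries=1000):
    for _ in range(max_tries):
        candidate = sample_continuation(prompt)
        if all(rule(candidate) for rule in constraints):
            return candidate
    return None  # no candidate satisfied all constraints

result = constrained_continuation([60, 62, 64],
                                  [within_ambitus, no_immediate_repetition])
print(result)
```

In a more elaborate tool, rejection sampling like this could be replaced by proper constraint propagation or beam filtering, but the division of labor stays the same: a generator proposes material, and a rule system accepts or discards it.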
Even though the output of a gen-AI model varies from one run to the next, the development of good prompts has become a commercial endeavor, and some prompts are prone to go viral in the same way as other media content, such as videos, photos, or audio. Many AI-based artists spend many hours trying to define the best prompt that will give them the desired result. *See for example Minsuk Chang et al., "The Prompt Artists," in Proceedings of the 15th Conference on Creativity and Cognition, 2023.
In any case, in more recent times, I have started to view the notion of prompting as something other than merely passive engagement. Of course, it can be that as well (probably most of the time it is). However, the notion of prompting has triggered in me a revision of my role as an artist, of the dynamics of the creative process, of the concept of technique, of materiality, and more. But revisions of this kind are nothing that hasn’t already happened throughout history. I can think, for example, of the ateliers of the Renaissance masters, who had many apprentices to whom they would delegate the physical execution of their works.
Even in contemporary art, the practice of delegation has been and still is crucial to exploring new ideas and concepts without worrying about technical or practical aspects. Even nowadays, institutions such as the SWR Experimentalstudio *https://www.swr.de/swrkultur/musik-klassik/experimentalstudio/index.html or IRCAM *https://www.ircam.fr provide technical assistance to composers for projects related to live electronics and electroacoustic music. In diverse artistic fields, many forms of collaborative development of works are possible, especially those involving high-end technology, like immersive and interactive systems and VR. *See for example the choreographic work of Hanna Pajala-Assefa: https://www.hannapajala.com
However, there have also been ethical conflicts around this. In the book Artificial,*Mariano Sigman and Santiago Bilinkis, Artificial: la nueva inteligencia y el contorno de lo humano (Debate, 2023). Bilinkis and Sigman bring up the example of the Italian artist Maurizio Cattelan, creator of the piece Comedian in 2019. The name might not say much, but the piece is very well known, as it consists of a fresh banana taped to a wall, of which two copies were sold for the remarkable price of USD 120,000 each. *Update: the new price of Comedian in November 2024 is USD 6.24 million. Apparently, Cattelan is a very prolific generator of ideas, and he would commission the technical realization of these ideas from the French sculptor Daniel Druet. They would agree on a price, and Cattelan would sign the work and, of course, later sell it for a much larger sum.
Druet ended up suing Cattelan, claiming this was a scam, since he had built the piece with his own hands. Cattelan, on the other hand, argued that the ideas were his and that Druet had only been following his instructions. At the heart of that battle was the notion of authorship in relation to conceptualizing vs. making. We can well see this as an analogy with prompting: Who is the real artist? The one who conceptualizes something, or the one who makes, builds, writes, or fixes the notes in a score? There is no clear-cut answer to that question, and potentially, being an artist involves a bit of both. At least, I believe that. However, in the end, the French tribunal’s verdict went against Druet, holding that the true author of a work is the one who had the idea and expressed it in words –prompted it?– and not the one who executed it.
AI thus becomes another tool to which we delegate part of the development of the work, as has consistently happened throughout history. It doesn’t seem to interfere with the notion of authorship, even legally. Essentially, it doesn’t create anything new *It is important to clarify the use of new here and further on, since it can be interpreted in different ways. On one hand, something new can be a new iteration of something coming into existence, for example, a new tool, which is ultimately a more recently assembled –or acquired– replica of an old tool. On the other hand, a new tool, as a newly conceived concept, element, or agent, is something that didn’t exist before. In this particular case, new is used in the latter sense. However, throughout the text, the interpretation of new should be inferred from the context. –at least not yet. It recombines, regenerates, or reconstructs something –sometimes in astonishing ways– based on the preexisting data with which these models are trained. But it does not create, and this is very observable, especially for an artist who becomes more deeply engaged with these generative models.
Humans do have the capacity to create something new, even though a large part of our creativity also comes from the recombination or transformation of existing elements –as M. Boden discusses at length. Humans can come up with ideas out of the blue, whereas AI cannot. And this is important. AI is not creative at the level that humans can be, although it can seemingly imitate very well certain human capacities and certain types of human creative processes. AI depends on human-created data. There is no way around this, at least for now.
The real Versificator
OK. I’ve acknowledged that AI’s generative capabilities can help develop my creative ideas. It’s just another tool I utilize, manipulate, and ultimately take advantage of, but… there’s a big elephant in the room here. Gen-AI is an era-changing technology. It deserves more than to be compared with sampling, with all my respect to that wonderful technique and its wonderful craftsmen. While sampling is a very artisanal, time- and energy-consuming process implemented almost entirely by human agency and hands –be it cutting and pasting tape or digital waveforms, not to mention the later processing and combining of these samples– gen-AI generates something almost automatically –even if it’s a recombination, redistribution, or reconstruction of something that existed before– just by writing a prompt or pushing a button. And this is very important.
One might say it merely generates, of course. It doesn’t create. But what will happen to music makers as AI becomes better and better, and more and more autonomous, to the point where there’s no need for any human creative agency at all? Neither for craftsmen nor for idea-creators? What if these new artificial generators of music –or of any other cultural good or piece of art, for that matter– become socially accepted “artists”? *This has already been predicted by Ray Kurzweil. See Ray Kurzweil, The Age of Spiritual Machines: When Computers Exceed Human Intelligence (Penguin, 2000).
Of course, AI models are not conscious entities. Therefore, there still must be a human behind them pressing the button. Maybe one person owns many of them –not even taking creative credit, just cashing the checks. Maybe a corporation owns all of them. What happens when these tools are combined with powerful algorithms that learn from our choices and generate and deliver exactly what we want? What happens when they are used by large companies for profit? An even more dystopian one: what if real artists could sell their voices to companies to make AI models of them? A piece of customizable music, specially fine-tuned for any audience’s taste. *This is already happening. See https://loudwire.com/singers-offered-ai-voice-cloning-by-worlds-biggest-record-label/
I raised some of these questions before, when composing the piece Versificator – Render 3 back in 2020-2021, when gen-AI didn’t yet exist in its current form. Of course, some of those inquiries feel very naïve now in 2024, in light of the new –hyperfast-paced– developments in gen-AI, such as GPT-4o, Sora, *‘Sora’ is OpenAI's video generation model, designed to take text, image, and video inputs and generate a new video as an output. https://openai.com/sora/ or even other more dystopian ones. I wonder how these inquiries will feel in 2026 or 2027. I have the feeling that a lot is going to change.
Derivative grey goo
Leaving the existential drama aside for a bit, I can discuss a little of what I see as the aesthetic challenges of gen-AI as a tool for artists. Although at first glance the results of experimenting with AI models in music might be interesting, overall there is a risk that these outputs result in derivative and overly similar –poor– works. Martinus Suijkerbuijk has proposed a concept that feels somewhat right to me when referring to gen-AI-based art: its results potentially become a sort of derivative grey goo. *I heard these ideas for the first time in Martinus’ presentation at the SAR 2023 Conference “Too Early / Too Late” (April 19-21, 2023) in Trondheim, Norway. See https://sar2023.no/node/38. Martinus’ idea of grey goo parallels the scenario proposed by nanotechnology pioneer Eric Drexler. See Eric Drexler, Engines of Creation: The Coming Era of Nanotechnology (Anchor, 1987).
Even though I am not a fully AI-based artist, in my practice the problem of the derivative grey goo still demands the conscious design of strategies to produce some novelty and interest; essentially, it demands aesthetic reflection and awareness. Whenever we are dealing with generative outcomes, the role of the artist, as I view it, becomes more and more involved with curatorial practices –curation understood here as a process of selection and organization of artistic material. In this sense, creating something using these tools becomes a sort of latent curatorial practice –an idea I will expand on below. Essentially, some reflection, reworking, and refinement of this generative outcome might be beneficial. However, this –personal– approach does not pretend to make other processes less valid or authentic.
Does anyone remember the software FruityLoops*FruityLoops is currently distributed under the name FL Studio. https://www.image-line.com/. or the old Reason,*Reason is currently on its 13th version https://www.reasonstudios.com/. both precursors of today’s industry mainstream, Ableton Live?*Ableton Live is a digital audio workstation used by musicians worldwide. https://www.ableton.com/. How much good music was created by tweaking and recombining the default presets in them?*I know several artists who started their careers working with FruityLoops (FL Studio) or Reason, for example, Oneohtrix Point Never, Flying Lotus, and Sophie, among others. My first two solo productions, Semblanza and Cinco Soles, are largely based on Reason’s samples and loops. You can listen to them here: https://juanvassallo.bandcamp.com/.
Latent spaces and quaternary memory
Probably one of the most interesting readings about AI and art comes from the pen of Gregory Chatonsky.*The works and writings of Gregory Chatonsky can be found here: http://chatonsky.net/category/publications/ In particular, I greatly recommend the article “Angèle et l’art.ificiel,” published in the magazine AOC. *Gregory Chatonsky, "Angèle et l'art.ificiel," AOC media, (2023). https://aoc.media/opinion/2023/11/01/angele-et-lart-ificiel/.
The article takes as its point of departure the appearance on streaming platforms of a remix in which the cloned voice*I did this –somehow– in my piece Elevator Pitch for cello and electronics. of the artist Angèle sings a song that she has never performed. In it, Chatonsky articulates a critical vision of some of the commonplaces in reflections around the emergence of these technologies, such as the simulacrum of the artificial human and his/her technomorphic double, *Chatonsky brings up the example of the “Autotune effect,” an exaggerated and distortive use of vocal-tuning post-production software that started in the early 2000s and became a strong aesthetic trait in pop music, lasting to the present day. the fear of replacement –which I expressed some paragraphs above– the need for state and supra-state regulation of AI-generated content, and many others. However, the point that I found most interesting is his vision of the latent space as a new cultural territory. In this new territory, art existed before it actually existed; it existed statistically or as a potentiality:
“The latent space is our new cultural space whose products are counterfactual. Angèle's song existed before it actually existed, it existed as a statistic or, depending on the case, a possibility. […] Everything exists before existing.”
Due to the accumulation of the past through the material supports of tertiary memories,*Tertiary retention, a concept attributed to Bernard Stiegler, refers to memory that is exteriorized and inscribed in material forms, allowing for the transmission of knowledge across space and time, extending beyond the individual and specific cultural practices to form a lasting “reserve” in databases and networks. and as these memories feed the large gen-AI models, the models generate outcomes that have never occurred but, at the same time, resemble anything that might have been. Chatonsky proposes that we are now moving into a new era in which AI represents the emergence of a quaternary memory, which relies on tertiary memories that have reached their peak of accumulation through the vast data of the World Wide Web. The latent space thus becomes the new space of artistic possibilities, one that contains the past but also the future and the incalculable:
If our culture, and its sharing, were determined by tertiary memories, the fruit of the industrial period, we are certainly entering a new era with quaternary memories where the aesthetic contract could be that of alienation: we reproduce machines that reproduce us. The latent space becomes a space of possibilities that contains the past but also, no doubt, a part of the future and the incalculable. […] it is now a question of statistical possibilities, which do not exist (yet).
I believe that grasping this concept is essential for understanding the new practices currently emerging and set to become widespread soon. This shift will require us to reevaluate not only the vision and the role of the artist but also the notion of a work of art. As Chatonsky also discusses, such shifts have occurred numerous times throughout history. As everything already exists in the latent space, part of the work of an artist might involve thinking of him/herself as a curator of these latent potentialities. Is this something that has never happened in history before? Well, no.
Angèle marks a new earworm *An earworm is a piece of music or a song that repeatedly occupies a person’s mind, often involuntarily. in the era of the latent space of AI. When we listen to her, we hear anthropotechnology, that is to say, the gray area that blurs the boundary between human beings and technology according to multiple threads. The earworm becomes statistical: in the wavering of this voice, human all too human, and which at one point laughs at the inhumanity of this reprise, we hear the way in which art is in no way the exteriorization of a human genius in a determined matter, form, and use, but is the misleading encounter with a matter whether it is technologically organized or not. The human being invents the technique, and, in the strict sense, Angèle is invented by a technique that clones her.
To conclude this section, and to be completely honest, my view of and approach to these issues are still evolving. The pace of advancement is overwhelming, and the changes are so rapid that it is hard to cement a vision of what things are or should be in relation to art, artists, and gen-AI. My reflections loop between the existential, the general, and the specific, back and forth, even when the issue arises directly from reflections on my very specific field of practice. The ethical, social, and economic implications of these developments escalate to magnitudes that I (we) cannot fully understand yet.
Ultimately, reflecting on these issues has been crucial to developing an informed view and criteria that somehow guide my practice, my developments, and my artistic choices. I do believe, however, that ethical considerations around their use require permanent revision, as the development and application of gen-AI are moving extremely fast. These discussions are still relatively speculative and evolving at this very minute. Therefore, extensive research, collaboration, and open dialogue among artists, researchers, policymakers, and the general public are crucial to developing a better understanding of what we are dealing with here.
The Terminator: –In three years, Cyberdyne will become the largest supplier of military computer systems. All stealth bombers are upgraded with Cyberdyne computers, becoming fully unmanned. Afterward, they fly with a perfect operational record. The Skynet Funding Bill has been passed. The system goes online on August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
Sarah Connor: –Skynet fights back…