We are AI - The AI Song Contest - VPRO International

Composer and singer Holly Herndon has expanded her 'ensemble' to include an artificially intelligent vocalist named Spawn, trained using Herndon's voice. 'Spawn does things with my voice that I wouldn't otherwise do,' she says.

It sounds polyphonic, like an autotuned choir, and there's something sacred about it, but in a 21st-century digital way. Proto, the third and most recent album from American music producer, composer and sound artist Holly Herndon, largely defies description. The vocal parts are dominated by the collaboration with what she describes as the new member of her ensemble: her very own AI baby, Spawn. It may sound absurd, but the analogy makes sense.

Herndon, born in Tennessee in 1980, has been working with computers for as long as she's been making music, and that's about twenty years now. She says her main motive is 'a fascination with the intimate relationship we maintain with computers', and she wants to 'explore the ways in which computers can be vehicles for human expression'.

Thanks to her move to Berlin a year ago, Herndon came up with the idea of creating Spawn. 'Artists who would do something innovative with Beethoven's music on the 250th anniversary of his birth had a chance to win a scholarship from the German government,' she explains. 'I wanted to work with artificial neural networks to do that.' Such networks are not programmed to perform a particular task step by step, but learn independently through the examples you show them, just like a baby learns from everything it sees and hears. Together with her partner, multimedia artist Mat Dryhurst, Herndon decided to feed a neural network with information that it could learn from.

Zeitgeist

But what information? Not MIDI data, she soon decided, although this is the most popular form of communication between digital musical instruments and software. 'You often see with AI music that people analyse the works of old composers, let the computer extract patterns from them, and use them to generate a work. In this way, music is reduced to notes, rhythm and melodies, and old music is actually repeated endlessly. As an artist, I don't find that interesting. Moreover, it's diametrically opposed to the way music has developed over the centuries. Music is emotional, and emotions cannot be captured in a MIDI file.'

Herndon could also have trained her AI child based on her own oeuvre: just as human babies learn from their parents, Spawn would then learn from her 'mother'. Herndon had two albums to her name at the time, and had been producing music for many years, so she had plenty of material to draw on. She didn't choose this option either, seeing it as a cul-de-sac. 'An artificial neural network only understands the world in which you train it,' she explains. 'When you put old material of your own into such a machine, it learns from it. It cannot train and update itself according to the changing world around us, with everything that's happening right now. Unlike a human being, it has no general intelligence. A human anticipates social developments, such as the corona pandemic that we're now dealing with. Artists do this pre-eminently, responding to the zeitgeist, but an AI can't do that. I see music as a living art form. On top of that, I myself have grown as a composer over the years, and I don't want to train my AI based on an outdated version of myself.'

Björk on acid

Spawn received a different training: Herndon created a whole new dataset based on her vocals in real time. She went into the studio, sang through her composed parts, and the AI learned on the spot. She also invited friends, some of whom sang parts too. The machine, Spawn, learned, found patterns and started to generate new work from them. It didn't strike gold immediately. Here, too, the baby analogy applies: just as an infant babbles incomprehensibly in the beginning, so Spawn's output was not immediately a pleasure to the ear.

'The first six months, we got total crap from Spawn,' laughs Herndon. 'With audio, you can hear all the mistakes and cracks, and it certainly doesn't produce a beautifully polished sound. The research we used to train Spawn was mainly for graphics, not audio. In the translation to audio, we got a lot of distortions. Although that might have been interesting for Spawn, it didn't yield anything useful for us. It took quite a while, about six months, before we heard anything interesting at all, something interesting to our human ears.' That sound can be found on the first track of Proto, aptly titled 'Birth'. It sounds as if a new species is being brought to life, and jerks and stutters out its first words. It's experimental, a kind of Björk on acid.

Extension

What was Herndon looking for with AI? A completely new sound? 'It's hard to explain,' she says. 'I think an artist is less goal oriented than a scientist in a lab. As a composer, I'm looking for something surprising, something that technology can unlock for me. It's the role of the composer to accompany that.' Spawn became an additional member of Herndon's ensemble. 'Spawn does things with my voice that I could never do on my own. She's a hyper version of me, one in which I can sing endlessly within a certain range without having to take a breath in between. That's really great.'

Herndon has always used digital voice manipulation in live performances, and AI is a means to take this to a higher level. 'During performances, I worked with real-time vocal processing [editing the voice on the spot using a computer] to let the audience experience that although the sound from my laptop doesn't sound human, it actually comes from my body. Tech is often experienced as mechanical, cold and emotionless, but in the meantime we share the most personal things through it. We cry and laugh via Skype with our loved ones. Computers are extensions of ourselves.'

Creativity

For Herndon, music is the perfect tool to further scientific research into artificial intelligence and machine learning in a practical way. 'Especially to better understand the new possibilities themselves,' she explains, then pauses. 'There's a lot of nonsense written about AI in the media.' The biggest nonsense, she thinks, is the notion that AI will replace us. What if AI can actually be creative? Creativity is more or less mankind's last stronghold: nothing is more human than art and the artistic works that we create, and in that field artificial intelligence really cannot surpass us. 'In all the examples of AI art that I've come across, computers generate new possibilities and ideas,' says Herndon. 'But the human is the one who makes the decisions. The computer is not autonomous, and AI is not something external. What does an AI actually know? Only what you put into it. AI is aggregated human labour. We are AI! A lot of people pretend that the technology is much more advanced, while my experience with Spawn has made me realize how limited AI actually is. The human effort and intervention that AI requires is really underestimated. And why on earth would I want Spawn to compose autonomously? I love composing too much myself!'

This was the third interview in a four-part series about AI and art. Next week, part 4: British artist Anna Ridler, who made an AI installation in which tulips grow based on the price of Bitcoin.