It's good music
I don't know about you guys, but I'm marking down August 3rd as the day AI started to become reality. That music, aside from the definite touch of MIDI, sounded downright organic. Like something a man with a MIDI keyboard and a thirst for improvisation would do. My mind is absolutely blown. I'll be the first to welcome our digital children with open arms.
We're still a long way away. While these new results are impressive, it's still basically a very complex Markov chain. It doesn't understand anything about the emotional impact of the music it writes, nor how to tailor a piece to evoke a certain emotion in the listener. It's not an AI that we can ask to make a certain type of composition; there's no "Computer, write me a waltz! Write me a melody for this love song I wrote! Write me some spooky music!". It can only create based on what it's been trained on. And I suppose it's arguable that "creating based on what you've been trained on" is what humans do, too. But there's a lot more to creativity than that. The RNN that made this music is groundbreaking in its ability to discover and reproduce the structure of human music, but it isn't intelligent, and it's nowhere near AGI.
Arguably, though - a LOT of today's music is based on mathematics and statistics as to what is popular and such, with new elements only being introduced by artists not under the Big Labels (see Psy's Gangnam Style, Macklemore's Thrift Shop, Sia's works, and so on). There isn't much, in a lot of music today, that takes into account the impact of what's written or how to evoke any emotion besides "horny", "envious" and "what the fuck am I listening to". Yet we still consider it very much human. And at any rate, I don't expect machines to be explicitly capable of taking human emotion into account in their works, just like I don't expect human art to particularly appeal to them.

However, the fact that it can be taught the patterns of music means, to me, that it can learn any system of logic - have a dynamic consideration of the world. And since very little in human existence is not a pattern, it means it can also learn the rest - language, mathematics, maybe even psychology and human behavior.

And you're right - we ARE creating based on what we've been trained on, except that we have centuries, if not millennia, of training. And creativity, arguably, is merely recombining what already exists - whoever discovered the first instrument (which I suspect was some sort of flute) was most likely inspired by the wind in the trees. Then it was refined into a multi-tone instrument when we realized that different properties give different sounds. And hell, even today - most music has direct roots in a handful of genres. At least that's how I perceive creativity - intelligent, maybe subconscious, reconstruction of possibilities.

At any rate, I'm not vouching for a human-like AGI - that may be entirely impossible. But I fully expect us to have an AGI within 30 years. It will be completely unlike any intelligence we'll have seen, but its impact will be as big as meeting extraterrestrials.
Well... First let me say I'm not a programmer. I've occasionally aspired to be one, but I'm held back by my nightmare experiences in Turbo Pascal and Fortran. So I see this, and I know that logically I have the chops to accomplish what I want, but practically I lack the discipline to get a clean compile quickly enough to stave off project-killing frustration.

So what we have here (did you write this? or did you find it?) is a compositional engine whose behavior is determined parametrically, correct? The mix of the nodes is parametric, the bias is parametric, the activation function is parametric, the feedback is parametric. All of that stuff is the kind of parameter that has to be set in order for the thing to function at all - mess with them too much and you'll pull the whole affair out of the usable realm. But then we've also got the fine parameters - pitchclass, vicinity, context, beat, etc. - that effectively shape the behavior of the network. And on top of that, there's also training, whereby the network is refined through iterative analysis of a large body of work.

That's my summary of what is needed to make this function:
1) Precise architecture of the neural network so that it will work
2) Parametric architecture of the functions so that it can work well
3) Iterative training of the network so that it can work this way

Tell me if I'm close, and where I'm wrong, so that I don't say something stupid next.
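To show where my head is at on (2), here's roughly how I picture those fine parameters getting packed into the input for a single note at a single timestep. This is pure speculation on my part, in plain numpy, not the actual code from the article, and the exact sizes are guesses:

```python
import numpy as np

def note_input_features(note, prev_notes_on, t, notes_per_octave=12):
    """My guess at how the 'fine parameters' (pitchclass, vicinity, context, beat)
    might be encoded for one note at one timestep."""
    # Pitchclass: one-hot over the 12 semitone classes, octave-independent.
    pitchclass = np.zeros(notes_per_octave)
    pitchclass[note % notes_per_octave] = 1.0

    # Vicinity: which notes within an octave up/down were sounding at the
    # previous timestep -- local harmony and voice motion.
    vicinity = np.array([1.0 if (note + offset) in prev_notes_on else 0.0
                         for offset in range(-12, 13)])

    # Context: how many notes of each pitchclass were sounding at the previous
    # timestep, regardless of octave.
    context = np.zeros(notes_per_octave)
    for n in prev_notes_on:
        context[n % notes_per_octave] += 1.0

    # Beat: binary encoding of the position within a 16-step measure.
    beat = np.array([(t >> b) & 1 for b in range(4)], dtype=float)

    return np.concatenate([pitchclass, vicinity, context, beat])

# Middle C at step 5, with an E and a G held over from the previous step.
x = note_input_features(60, prev_notes_on={64, 67}, t=5)
print(x.shape)  # one fixed-length vector per (note, timestep), fed into the network
```

Is that roughly the right shape of thing, or am I off base?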
(3) is what figures out (2), so there's no need to have (2) independent of (3). (1) is the open-source code. By design, this model is trained on MIDIs and generates MIDIs. The README explains how to use the code, but it's designed to run on Linux (specifically, the sort of GPU AWS instance the author mentions in the article). However, once you've trained a model, you can just use it over and over again to generate MIDIs, without needing a fancy GPU. What I'm getting at is: if you can feed MIDI files into your thing, you're good to go.
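For example, here's the general shape of the MIDI-in / MIDI-out round trip, sketched with the mido library purely for illustration (it's not necessarily what the project itself uses, and the file names are made up):

```python
import mido

# Reading: flatten a MIDI file into (time, note, velocity) events -- the kind of
# raw material a training corpus is built from.
events = []
now = 0.0
for msg in mido.MidiFile('some_training_piece.mid'):  # placeholder file name
    now += msg.time                                    # delta time in seconds when iterating
    if msg.type == 'note_on' and msg.velocity > 0:
        events.append((now, msg.note, msg.velocity))
print(len(events), 'note-on events')

# Writing: dump a (generated) note sequence back out as a new MIDI file.
out = mido.MidiFile()
track = mido.MidiTrack()
out.tracks.append(track)
for note in [60, 64, 67, 72]:                          # stand-in for model output
    track.append(mido.Message('note_on', note=note, velocity=80, time=0))
    track.append(mido.Message('note_off', note=note, velocity=0, time=240))
out.save('generated.mid')
```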