However much a programmer enjoys machine learning, there comes a time when the learning curve feels overwhelming. The coding, the maths and the infrastructure involved can be enough to make anyone reach for that extra cup of coffee.
Now, e-commerce giant Amazon has made the world of generative artificial intelligence a little easier to understand by introducing its machine learning-powered, MIDI-compatible keyboard, DeepComposer.
Meet AWS DeepComposer
AWS DeepComposer is a 32-key, two-octave MIDI keyboard designed to give developers a hands-on way to experience generative AI.
DeepComposer works with models the developer trains themselves as well as with pre-trained models supplied by AWS. The service also includes tutorials, sample code and training data, so developers can get started with generative AI models without having to write a single line of code.
Now, when you log into the DeepComposer console:
- You record a short musical tune or use one of the pre-recorded ones (a minimal MIDI sketch follows these steps).
- Then you select a generative model for your genre; again, this can be a pre-trained model or one you trained yourself.
- Use this model to generate a unique multi-instrument composition based on your input melody.
Then, when you play it back, you can share it directly to SoundCloud.
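The console handles all of this through a point-and-click interface, but the input melody itself is just MIDI data. As a rough illustration of what such an input looks like, the sketch below builds a tiny single-track melody with the open-source mido library; the notes and file name are arbitrary placeholders, and this is not part of any DeepComposer API.

```python
# Sketch: create a short single-track MIDI melody, similar in spirit to
# the tune you would record on the DeepComposer keyboard.
# Requires: pip install mido
from mido import Message, MidiFile, MidiTrack

mid = MidiFile()
track = MidiTrack()
mid.tracks.append(track)

# A simple C-major motif: C4, E4, G4, C5 (MIDI note numbers 60, 64, 67, 72)
for note in (60, 64, 67, 72):
    track.append(Message('note_on', note=note, velocity=64, time=0))
    track.append(Message('note_off', note=note, velocity=64, time=480))

# Save the melody; a file like this is the kind of input a generative model works from.
mid.save('input_melody.mid')
```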
The Generator & The Discriminator
Using a pre-trained model makes DeepComposer a fun console to play with. But training your own generative AI model on a dataset for your favourite genre is what makes the learning process far more interesting for developers.
A GAN (Generative Adversarial Network) is a generative AI technique that pits two different neural networks against each other. In simple terms, one of the networks acts as a counterfeiter trying to manufacture fake jewellery, and the other acts as an inspector checking whether the jewellery is genuine or not.
So, what happens when both these networks go back and forth over and over again?
At some point, the fake jewellery becomes indistinguishable from the genuine one.
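For readers who want the formal version, this back-and-forth is usually written as the standard minimax objective from the original GAN paper, where the discriminator D tries to maximise its ability to spot fakes while the generator G tries to minimise it:

$$
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

Here x is a genuine sample and G(z) is a fake one produced from random noise z.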
Now, when it comes to DeepComposer, the generative AI consists of two models: a Generator and a Discriminator.
In terms of the example above, the Generator is the one trying to manufacture the fake jewellery, and the Discriminator is the one trying to detect the faults in it.
When an input is given to the Generator, it has no access to the data set, so it uses random data to create a sample that is forwarded to the Discriminator.
- The Discriminator is trained with standard techniques such as gradient descent and backpropagation.
- The Discriminator learns to recognise the genuine data samples (from the training set) and to tell them apart from the fake samples that the Generator produces.
- Now comes the critical part about GANs: as the Discriminator learns, its feedback is turned into a loss signal, and that loss is used to update the Generator. The Generator progressively learns to produce samples that the Discriminator can no longer recognise as fakes, which means the Discriminator eventually considers them genuine.
In simple terms, the more you fail, the more you learn.
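As a rough sketch of how that feedback loop looks in code, here is a minimal GAN training step written with PyTorch. The tiny fully-connected networks, the random stand-in for "real" data and all the hyperparameters are placeholders chosen for illustration; they are not the models DeepComposer actually trains.

```python
# Minimal GAN training step: the Generator tries to fool the Discriminator,
# and the Discriminator's feedback (the loss) is what updates the Generator.
import torch
import torch.nn as nn

# Toy networks; real music GANs use far larger architectures.
generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
discriminator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

real_data = torch.randn(64, 8)   # stand-in for genuine training samples
noise = torch.randn(64, 16)      # random input the Generator starts from

# 1. Train the Discriminator: real samples should score 1, fakes should score 0.
fake_data = generator(noise).detach()   # detach so this step only updates the Discriminator
d_loss = loss_fn(discriminator(real_data), torch.ones(64, 1)) + \
         loss_fn(discriminator(fake_data), torch.zeros(64, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# 2. Train the Generator: it is rewarded when the Discriminator calls its fakes real.
g_loss = loss_fn(discriminator(generator(noise)), torch.ones(64, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```

Repeated over thousands of batches, this back and forth is exactly the "fail and learn" loop described above.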
Now that you know what the Generator and the Discriminator are, it's time to train your own model (a rough configuration sketch follows this list):
- Select architecture parameters for the Generator and the Discriminator.
- Then select the loss function used during training to measure the difference between the algorithm's output and the desired output.
- Choose the training hyperparameters.
- Pick a validation sample that you'll be able to listen to while the model is being trained.
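Taken together, those choices amount to a small training configuration. The sketch below shows what such a configuration might look like in plain Python; every key and value here is an illustrative placeholder, not an actual DeepComposer console setting.

```python
# Illustrative training configuration for a custom GAN; the names and values
# are placeholders, not DeepComposer's real option names.
training_config = {
    "generator": {"layers": 4, "hidden_units": 256},        # architecture parameters
    "discriminator": {"layers": 3, "hidden_units": 128},
    "loss_function": "binary_cross_entropy",                # gap between output and target
    "hyperparameters": {
        "epochs": 100,
        "batch_size": 64,
        "learning_rate": 1e-3,
    },
    "validation_sample": "validation_melody.mid",           # tune to audition during training
}
```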
Outlook
Amazon has been releasing products like this to help developers educate themselves about machine learning for the last few years. In 2017 it launched AWS DeepLens, the world's first deep learning-enabled camera, to help developers learn about machine learning for computer vision. In 2018 it introduced AWS DeepRacer, a 1/18th-scale autonomous race car driven by reinforcement learning and trained in the cloud. This year it came up with DeepComposer.
Amazon is venturing into creative fields and trying to integrate machine learning into them, both to build better models and to encourage developers to take more steps towards machine learning's future.