Note
—————–
It’s been a year and some change since my last update. Many things have changed in the world over these past 15 months. I had originally intended to track my progress through different stages of my life, but that hasn’t been the case. Even though I didn’t make any posts, I’ve been working behind the scenes on my projects. However, I must admit that I haven’t constructed any physical electronics since my last update. My focus has mainly been on computer programs and some Artificial Intelligence projects, which have consumed most of my time. I also haven’t been cooking much; at least nothing noteworthy. It’s mostly been basic breakfast food, chicken, pasta, and the like. Think of the basic staple meals.
Book Review – Man’s Search for Meaning by Viktor E. Frankl – Personal Rating: 3.5/5
I do not want to get into too much detail, but the book was enjoyable. It is not the type of book I would typically read, but I wanted to try something new. The author offers an interesting perspective on how he survived being a prisoner in a concentration camp. I believe everyone should read it, because the lesson the book teaches is something we already know deep down, but it’s a lesson that needs to be consciously realized.
I have read additional books since then, but I feel like they are not worth sharing. The next book I plan on reading is Empire of Pain: The Secret History of the Sackler Dynasty.
AI Beat Generator
To begin, I have no real experience with creating music, but with the arrival of OpenAI’s ChatGPT, I wanted to push its limits and see what it could do. I think it’s important to test the capabilities of tools that look like they will become essential. With that in mind, my primary goal was to create a small AI beat generator. I wanted to see if my programming knowledge and my limited machine-learning knowledge could get this done.
With no experience in music creation and little knowledge of training complex models, I started by asking ChatGPT to explain how to build this project.
The first thing it told me was to choose a neural network, and based on some details about the computer I’m working with, it suggested a Long Short-Term Memory (LSTM) network. According to GeeksforGeeks, an LSTM is “a type of Recurrent Neural Network (RNN) that is specifically designed to handle sequential data, such as time series, speech, and text”. Naturally, I wanted to get a deeper understanding of LSTMs, but that was outside the scope of the project. I wanted to build the project quickly to see if it was possible.
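For readers curious about what “designed to handle sequential data” means in practice, here is a rough sketch of a single LSTM cell stepping through a short sequence. It is not the model from this project: the function names (`lstm_step`, `random_weights`), the sizes, and the random weights are all made up for illustration, and a real project would use a library such as PyTorch or Keras rather than plain Python lists.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """Run one LSTM time step on plain Python lists.

    `weights` maps each gate name ("f", "i", "o", "g") to a pair (W, b).
    W has one row per hidden unit; each row spans len(x) + len(h_prev) values.
    """
    combined = list(x) + list(h_prev)  # current input joined with previous hidden state

    def gate(name, activation):
        W, b = weights[name]
        return [
            activation(sum(w * v for w, v in zip(row, combined)) + bias)
            for row, bias in zip(W, b)
        ]

    f = gate("f", sigmoid)    # forget gate: how much old memory to keep
    i = gate("i", sigmoid)    # input gate: how much new information to admit
    o = gate("o", sigmoid)    # output gate: how much memory to expose
    g = gate("g", math.tanh)  # candidate values for the memory cell

    # New memory = kept old memory + admitted new information.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state = exposed, squashed memory.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def random_weights(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights, zero biases."""
    rng = random.Random(seed)
    width = input_size + hidden_size
    return {
        name: (
            [[rng.uniform(-0.5, 0.5) for _ in range(width)] for _ in range(hidden_size)],
            [0.0] * hidden_size,
        )
        for name in ("f", "i", "o", "g")
    }


# Feed a short made-up sequence through the cell, carrying the state forward.
weights = random_weights(input_size=2, hidden_size=3)
h, c = [0.0] * 3, [0.0] * 3
for x in [[0.1, 0.9], [0.4, 0.2], [0.8, 0.5]]:
    h, c = lstm_step(x, h, c, weights)
print(h)
```

The key idea is that `h` and `c` are passed from one step to the next, which is how the network “remembers” earlier parts of a sequence such as a piece of music.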
Next, for an AI to produce a desired output, you need to give it data to learn from. In this case, I gave it 4 hours’ worth of music for the first round. As stated before, I did not want to make anything too complex. Ideally, though, for a full-scale project you would want at least 100 hours’ worth of music (data).
After I gave it the data to work with, I soon realized that my computer was not powerful enough to handle the raw audio files. I asked the chatbot for alternatives and it gave me plenty. One of the ideas was to convert the audio into a spectrogram, which according to Wikipedia “is a visual representation of the spectrum of frequencies of a signal as it varies with time”. The new plan was to convert all the training songs into spectrograms, and when I wanted to generate a new beat, I would convert the generated image back to audio using Python. So I wrote a Python script to convert the audio into spectrograms, one per 10-second interval (Figure 1.0). At this point, I also decided to increase the amount of training music from 4 hours to 12 hours.
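To give a feel for what turning audio into a spectrogram involves, here is a minimal sketch that uses only the Python standard library. It is not the script from this project. The sample rate, frame size, hop size, and the synthetic 1000 Hz test tone are all assumptions chosen to keep the demo small, and a real script would more likely lean on a library such as NumPy or librosa.

```python
import cmath
import math


def hann(n):
    """Hann window, used to soften the edges of each frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists magnitudes for bins 0..frame_size//2."""
    window = hann(frame_size)
    # Precompute the DFT rotation factors once and reuse them for every frame.
    twiddle = [cmath.exp(-2j * math.pi * k / frame_size) for k in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_size], window)]
        magnitudes = []
        for b in range(frame_size // 2 + 1):
            total = sum(frame[n] * twiddle[(b * n) % frame_size] for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


SAMPLE_RATE = 8000  # Hz, assumed for the demo
FRAME_SIZE = 64
TONE_HZ = 1000

# One second of a pure sine tone stands in for real music.
samples = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
spec = spectrogram(samples, frame_size=FRAME_SIZE, hop=32)

# The strongest frequency bin in the first frame should line up with the tone.
peak_bin = max(range(len(spec[0])), key=lambda b: spec[0][b])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames; peak near", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is a frequency band, which is exactly the grid a spectrogram image is drawn from.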
With 12 hours’ worth of music divided into 10-second segments, I would have approximately 4,320 segments. This was easier for my computer to handle. I used the training data to train my custom model on the LSTM architecture, then waited about an hour for the training to finish. Now it was time for my beat to be generated and listened to. Here is what it sounded like:
Then, with more specialized training and some changes to the settings, it started to sound more coherent:
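The segment count is simple arithmetic, sketched below. The helper names `segment_count` and `segment_bounds` are made up for this example, and it assumes the music is treated as one continuous stream, which is why the real number is only approximate when individual songs do not divide evenly into 10-second pieces.

```python
def segment_count(hours, segment_seconds):
    """Number of whole segments that fit in `hours` of audio."""
    return int(hours * 3600) // segment_seconds


def segment_bounds(total_seconds, segment_seconds):
    """Yield (start, end) offsets in seconds for each whole segment."""
    for start in range(0, total_seconds - segment_seconds + 1, segment_seconds):
        yield start, start + segment_seconds


print(segment_count(4, 10))   # the original 4 hours of training music
print(segment_count(12, 10))  # the expanded 12 hours
print(list(segment_bounds(35, 10)))  # a 35-second clip drops its last 5 seconds
```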
So far the model had produced a bunch of 808 drum beats, but I also wanted some synths and other elements in the sound. For an additional 20 hours, I tried to figure out how to get the model to understand that I wanted more than just raw 808s. After some back and forth, ChatGPT gave me a solution I did not understand at all, but the model started to produce better “quality” music. Some tradeoffs were made just so the program could run, so when you listen to the music it sounds pretty low quality, as it’s not high-resolution audio. I have a theory on why it’s not in HQ, but I’m not trying to make a perfect alternative to producers. This is what it produced:
As you can hear, it’s more coherent. After this, I wanted to make improvements, but that was going to be outside the scope of my project, and I was starting to spend too much time on it. With that being said, I considered the project complete.
One improvement would be to remove the spectrogram step, so that no conversion between image and audio is needed. Another is that I could train it on maybe 70 hours’ worth of music without having to cut it into 10-second segments.
Behind the scenes, it took me over 40 hours to get this working and producing a suitable output. My lack of experience in music and machine learning made it troublesome and time-consuming. But look at it from the perspective of a producer, who may spend a couple of days producing a beat. Compare that to a perfected AI model: after the 40+ hours it took to make the beat generator, I could make a new beat in less than 5 seconds. In theory, I could make 17,280 beats in 24 hours. AI-produced beats may not be perfect right now, but eventually we will not be able to tell the difference. I have thought about this for a couple of months, and I believe there will be a shift where producers train an AI model on their own beats and mass-produce music in search of the “perfect beat”. It may even be happening right now. Even though it may not be lucrative, you can overwhelm your competition with a flood of music.
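The 17,280 figure comes from dividing the seconds in a day by 5 seconds per beat. A quick sketch of that math, with made-up helper names and the assumption that the generator runs nonstop at a steady pace:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def beats_per_day(seconds_per_beat):
    """How many beats fit in 24 hours at a constant generation time."""
    return int(SECONDS_PER_DAY // seconds_per_beat)


def hours_to_generate(n_beats, seconds_per_beat):
    """Hours needed to generate `n_beats` back to back."""
    return n_beats * seconds_per_beat / 3600


print(beats_per_day(5))
print(hours_to_generate(1000, 5))  # time for a 1,000-beat catalog
```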
Figure 1.0
Future Plans / Notes
- I’m not entirely sure how often I’ll post, but I have a general idea. I’ll shoot for one post per month-ish. Realistically, the next one will be in about another 6 months.
- I’m working on another big project. It’s been taking all my time these past 12 months. If it goes well, I’ll write up a post for it.
- In general, I have made a lot of projects; I just don’t think they are worth sharing.
This post was written by Dubem Nwachukwu.