Let's build a Text to Music Generation App using Generative AI
Introduction
In this guide, we will develop a text-to-music generation application using Meta's AudioCraft library, specifically the MusicGen model. The application lets users enter a text prompt and generates music that matches it. In this step-by-step tutorial, we will use Streamlit for the interface and implement functions for model loading, music generation, audio saving, and file downloading. Let’s dive in!
Prerequisites
Before we get started, make sure you have the necessary libraries installed. Clone the AudioCraft GitHub repository and install the requirements.
git clone https://github.com/facebookresearch/audiocraft.git
cd audiocraft
pip install -e .
Note: Review the dependencies before installing. The AudioCraft README lists the Python and PyTorch versions it expects, and a mismatched environment can cause install errors.
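Before running the app, it can help to confirm which interpreter and packages the current environment provides. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the package list are illustrative choices, not part of AudioCraft.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages an app like this one is assumed to need.
required = ["streamlit", "torch", "torchaudio", "audiocraft"]

print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Missing packages:", missing_packages(required) or "none")
```

If the list is not empty, install the missing packages before moving on.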
Setting Up the Project
Open VS Code and create a new file named app.py.
Import Necessary Libraries:
import base64
import os

import streamlit as st
import torch
import torchaudio
from audiocraft.models import MusicGen
Load the MusicGen Model:
Create a function to load the pre-trained MusicGen model.
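@st.cache_resource
def load_model():
    # Imported here so the loader does not depend on the top-level imports.
    from audiocraft.models import MusicGen
    # Download (on first use) and load the small pre-trained checkpoint.
    model = MusicGen.get_pretrained("facebook/musicgen-small")
    return model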
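The caching decorator matters because Streamlit reruns the script on every interaction, and reloading the model each time would be slow. The effect can be illustrated without Streamlit using `functools.lru_cache` as a rough analogy. This is only a sketch: `load_model_stub` is a hypothetical stand-in, and Streamlit's own cache works differently under the hood.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive load really runs


@lru_cache(maxsize=1)
def load_model_stub():
    """Stand-in for an expensive model load; not the real MusicGen loader."""
    global load_count
    load_count += 1
    return {"name": "stub-model"}


first = load_model_stub()
second = load_model_stub()

# Both calls return the same object, and the load body ran only once.
print(first is second, load_count)
```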
Creating the Streamlit Interface
Set Up the Streamlit App:
Define the layout and page configuration for the application.
st.set_page_config(page_title="Music Gen", page_icon=":musical_note:")
st.title("Text to Music Generation")
with st.expander("See Explanation"):
    st.write(
        "This app is a music generation application built using Meta's "
        "AudioCraft library. It can generate music based on your natural "
        "language description."
    )
Get User Input:
Add a text area for user prompts and a slider to select the audio duration.
description = st.text_area("Enter your description:")
duration = st.slider("Select time duration (seconds)", 2, 20, 5)
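The slider already limits the duration in the UI, but if the generation code is ever called from a script or a test, it may be worth normalizing the inputs first. A minimal sketch, assuming the same 2 to 20 second range as the slider; `normalize_inputs` is a hypothetical helper, not part of Streamlit or AudioCraft.

```python
def normalize_inputs(description, duration, min_s=2, max_s=20):
    """Collapse whitespace in the prompt and clamp the duration to the slider's range."""
    cleaned = " ".join(description.split())  # also strips leading/trailing space
    clamped = max(min_s, min(max_s, int(duration)))
    return cleaned, clamped


print(normalize_inputs("  lo-fi   beat with soft piano \n", 45))
```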
Implement Music Generation Functionality
Generate Music from Text:
Create functions to generate music based on user input.
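def generate_music_tensors(description, duration):
    model = load_model()
    # MusicGen takes sampling options through set_generation_params,
    # not as keyword arguments to generate().
    model.set_generation_params(use_sampling=True, top_k=50, duration=duration)
    output = model.generate([description])
    return output[0]  # first item in the batch, shaped [channels, samples]

def save_audio(samples):
    import torchaudio  # local import keeps this helper self-contained
    sample_rate = 32000  # MusicGen generates audio at 32 kHz
    save_path = "audio_output/"
    os.makedirs(save_path, exist_ok=True)
    audio_path = f"{save_path}audio.wav"
    # torchaudio expects a CPU tensor shaped [channels, samples].
    torchaudio.save(audio_path, samples.detach().cpu(), sample_rate)
    return audio_path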
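To exercise the saving and download steps without loading the model, a placeholder file at the same 32 kHz sample rate can be written with the standard library's `wave` module. This is a sketch for testing only: `write_test_tone` is a hypothetical helper, and the sine tone stands in for real model output.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, duration_s, sample_rate=32000, freq_hz=440.0):
    """Write a mono 16-bit PCM sine tone; a placeholder for model output."""
    n_frames = int(duration_s * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        value = int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        frames += struct.pack("<h", value)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return path


tone_path = write_test_tone(os.path.join(tempfile.mkdtemp(), "tone.wav"), duration_s=2)
with wave.open(tone_path, "rb") as wav_file:
    # Frame count equals duration times sample rate.
    print(wav_file.getframerate(), wav_file.getnframes())
```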
File Downloading:
Implement a helper function to allow users to download the generated audio file.
def get_binary_file_downloader_html(bin_file, file_label):
    # Read the file and embed it in the link as a base64 data URI.
    with open(bin_file, "rb") as f:
        data = f.read()
    b64 = base64.b64encode(data).decode()
    href = f'<a href="data:application/octet-stream;base64,{b64}" download="{file_label}">Download your audio</a>'
    return href
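The download helper works by embedding the whole file in the page as a base64 data URI. The sketch below isolates that step so it can be checked without Streamlit or a real audio file. The `build_download_link` name, the `mime` parameter, and the placeholder bytes are assumptions for illustration.

```python
import base64


def build_download_link(data, file_name, mime="application/octet-stream"):
    """Return an HTML anchor embedding `data` as a base64 data URI."""
    b64 = base64.b64encode(data).decode("ascii")
    return f'<a href="data:{mime};base64,{b64}" download="{file_name}">Download your audio</a>'


payload = b"RIFF....WAVEfmt "  # placeholder bytes, not a valid WAV file
link = build_download_link(payload, "Generated_Audio.wav")

# Recover the embedded bytes to confirm the encoding round-trips.
encoded = link.split("base64,")[1].split('"')[0]
decoded = base64.b64decode(encoded)

# Base64 text is roughly 4/3 the size of the raw bytes.
print(decoded == payload, len(payload), len(encoded))
```

Because of that size overhead, Streamlit's built-in `st.download_button` may be a better fit for larger files.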
Integrate Everything
Use the above functions in the main application logic, handling the user input and generating the appropriate output.
if description and duration:
    music_tensor = generate_music_tensors(description, duration)
    audio_file_path = save_audio(music_tensor)
    download_link = get_binary_file_downloader_html(audio_file_path, "Generated_Audio.wav")
    st.markdown(download_link, unsafe_allow_html=True)
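Since Streamlit reruns the script whenever a widget changes, the logic above may regenerate audio more often than needed. One possible refinement is to reuse the last result while the inputs are unchanged. The sketch below shows the idea with a plain dictionary and a fake generator; `get_or_generate` and `fake_generate` are hypothetical names, and in the app `st.session_state` could play the role of `state`.

```python
def get_or_generate(state, description, duration, generate_fn):
    """Reuse the previous result when the inputs have not changed."""
    key = (description, duration)
    if state.get("last_key") != key:
        state["last_result"] = generate_fn(description, duration)
        state["last_key"] = key
    return state["last_result"]


calls = []


def fake_generate(description, duration):
    """Stand-in for the slow model call; records each invocation."""
    calls.append((description, duration))
    return f"audio for {description!r} ({duration}s)"


state = {}
get_or_generate(state, "calm piano", 5, fake_generate)
get_or_generate(state, "calm piano", 5, fake_generate)   # reused, no new call
get_or_generate(state, "calm piano", 10, fake_generate)  # inputs changed
print(len(calls))
```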
Running the Application
Run your Streamlit application using the command:
streamlit run app.py
Conclusion
After implementing the above code, your application will be capable of generating music based on text prompts. With this functionality, you can experiment with various musical genres, styles, and prompts, giving rise to unique audio outputs.
Keywords
- Music Generation
- Generative AI
- AudioCraft
- MusicGen Model
- Streamlit
- Text Prompt
- Audio Output
FAQ
Q: What is the MusicGen model?
A: MusicGen is an AI model developed by Meta that generates music from natural language descriptions.
Q: How do I run the application?
A: After creating the app.py file and adding the necessary code, run the command streamlit run app.py in your terminal.
Q: Can I customize the duration of the generated audio?
A: Yes, you can use the slider in the Streamlit app to select the audio duration between 2 and 20 seconds.
Q: What kind of music can I generate?
A: You can input any description or genre, and the MusicGen model will generate music based on your input.
Q: Is the generated music free to use?
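A: It depends on the license terms. At the time of writing, the AudioCraft repository lists its code under the MIT license and the MusicGen model weights under CC-BY-NC 4.0, so review those terms and Meta’s policies before using generated music, particularly for commercial purposes.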