OpenMusic is an open-source diffusion model for generating high-quality music from text prompts. With just a few words, you can create a custom audio track that matches the mood you describe.
This text-to-music tool is powered by QA-MDT, a quality-aware masked diffusion transformer. It addresses issues such as mislabeled captions and low-quality audio in open-source datasets, resulting in higher-quality music generation.
OpenMusic is useful for artists, producers, or anyone looking to experiment with sound. It provides quick audio samples for creative projects, mood-setting background music for videos, or even inspiration for larger compositions.
Below, you can listen to examples and see the text prompts that generated them.
How to use it:
1. Open the OpenMusic Hugging Face Space and enter your music description in the text box. Be specific about genre, instruments, mood, and style. Here are some examples:
- “Sad song of two lovers who never talk again, starting intensely with emotions and then gradually fading down into silence.”
- “A modern synthesizer creating futuristic soundscapes.”
- “Acoustic ballad with heartfelt lyrics and soft piano.”
- “A deep bassline mixed with upbeat electronic synths, creating a club anthem.”
- “Melodic orchestral composition with a build-up of strings and percussion, evoking cinematic tension.”
2. Click “Generate”. The QA-MDT model analyzes your text prompt, considering factors like genre, instrumentation, and emotional tone, then synthesizes a unique audio track that aims to match your description. The quality-aware aspect of the model helps keep the output consistent and reduces artifacts in the generated music. If you prefer to generate tracks from a script rather than the web UI, see the sketch after this list.
3. For developers interested in fine-tuning the model, visit the OpenMusic GitHub repository for training strategies and technical details.
4. To learn more about the underlying QA-MDT model, read the research paper at https://arxiv.org/pdf/2405.15863.
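If you would rather script generation than use the web form, Hugging Face Spaces built with Gradio can usually be called through the `gradio_client` Python package. The sketch below is a minimal, hedged example: the Space id, endpoint name, and parameter list are assumptions (they are not specified in this post), so inspect `client.view_api()` for the real signature before relying on it.

```python
from gradio_client import Client

# Placeholder Space id -- replace with the actual path shown on the
# OpenMusic Hugging Face Space page (the part after huggingface.co/spaces/).
SPACE_ID = "owner/OpenMusic"

client = Client(SPACE_ID)

# Print the endpoints the Space actually exposes; the parameter list and
# api_name used below are assumptions, not the confirmed OpenMusic API.
client.view_api()

result = client.predict(
    "A deep bassline mixed with upbeat electronic synths, creating a club anthem.",
    api_name="/predict",  # assumed default endpoint name
)

# For audio outputs, gradio_client generally returns a local file path.
print("Generated audio written to:", result)
```

When the call succeeds, the returned path points to the generated track on disk, which you can play directly or drop into your project.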