NVIDIA's Fugatto AI Audio Tool Stuns Music Producers

Nvidia has unveiled Fugatto, a new AI audio generator capable of creating and transforming any combination of music, voices, and sounds through text and audio inputs.

Key developments:

The tool, formally named Foundational Generative Audio Transformer Opus 1, handles both audio generation and audio manipulation in a single model.

Here’s an overview of the features & capabilities:

Feature	Capability
Sound Generation	Creates unique sounds from text prompts
Voice Transformation	Modifies accents and emotional tones
Music Editing	Isolates vocals and swaps instruments
Audio Control	Offers fine-grained control over generated content

“This thing is wild,” says Ido Zmishlany, a multi-platinum producer and songwriter.

“The idea that I can create entirely new sounds on the fly in the studio is incredible.”

Technical specifications:

Has 2.5 billion parameters
Trained on millions of audio samples
Trained on NVIDIA DGX systems with 32 H100 GPUs
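
To put the parameter count in perspective, the short Python sketch below estimates how much memory the model weights alone would occupy. The numeric formats and their byte sizes are assumptions for illustration: Nvidia has not said which precision Fugatto uses, and the estimate ignores activations and other runtime overhead.

```python
# Rough estimate of the memory needed just to store the model weights.
PARAMS = 2.5e9  # parameter count quoted in the article

# Bytes per parameter for common numeric formats.
# Assumed for illustration; the article does not state Fugatto's precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1}


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gb(PARAMS, nbytes):.1f} GB")
```

Under these assumptions the weights come to roughly 10 GB at 32-bit precision and 5 GB at 16-bit precision.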

Industry impact:

The model’s versatility opens new possibilities for music producers, ad agencies, and game developers.

Practical applications:

Music Production:

Figure: Testing song ideas. Interface for modifying synthesizer audio with added drum beats using a text prompt.

Quick prototyping of song ideas
Testing different styles and instruments
Enhancing audio quality

Voice Modification:

Accent changes
Emotional tone adjustments
Language learning applications

Future implications:

While Nvidia hasn’t announced a release date, the tool’s development signals a transformative moment for music production. Built on a dataset of millions of audio samples, including the BBC’s sound effects library, Fugatto represents a new frontier in AI-assisted creativity.

For musicians looking to stay ahead of the curve, this tool could become as essential as stem separation tools for music production, offering unprecedented control over sound design and manipulation.

Rafael Valle, manager of applied audio research at Nvidia, emphasizes the human element: “We wanted to create a model that understands and generates sound like humans do.”

This approach ensures that while pushing technological boundaries, Fugatto remains an instrument of human creativity rather than a replacement for it.
