Enhanced vs Flow Matching: which AI audio restoration preset to use
The Refiner exposes two presets for AI audio restoration, and they are not just different settings on the same model — they are different families of model entirely. Enhanced is a deterministic encoder-decoder that finishes in seconds. Flow Matching is an iterative generative model that takes minutes and burns more credits. They produce different kinds of output, fail in different ways, and the right choice depends on what you are trying to do with the track.
This is the long-form decision guide. There is a shorter technical summary on the presets reference in /help, and you can A/B both presets on real material from the homepage demos. Below is what each one is actually doing under the hood, where each one wins and loses, and the decision tree we would use ourselves.
Two model families, two ways of working
Enhanced is a single forward pass. The input audio gets analyzed — typically in the spectral domain — a learned network predicts what a clean version of that spectrum should look like, and a neural synthesizer renders it back to a waveform. One pass, no sampling, no randomness. Feed the same file in twice and you get bit-identical output. That reproducibility matters when you are iterating on a mix or running A/B comparisons.
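One way to check the reproducibility claim on your own material is to hash two renders of the same source and compare the digests. The sketch below is a minimal illustration, assuming both outputs have been downloaded locally. The file names are hypothetical, and nothing here calls The Refiner itself.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def renders_match(first: Path, second: Path) -> bool:
    """True when two renders are byte-for-byte identical."""
    return file_digest(first) == file_digest(second)


if __name__ == "__main__":
    # Hypothetical file names: two Enhanced renders of the same source.
    a = Path("track_enhanced_run1.wav")
    b = Path("track_enhanced_run2.wav")
    if a.exists() and b.exists():
        print("identical" if renders_match(a, b) else "different")
```

Comparing whole files is stricter than comparing audio samples: if an export ever embedded differing metadata, the digests would differ even with identical audio.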
Flow Matching is iterative. It learns a continuous trajectory from noise to clean audio, conditioned on the degraded input. In practice that means many denoising steps, each one sharpening the previous estimate. The model is not patching the original signal — it is reconstructing a plausible clean version from scratch, guided by what you gave it. High quality runs more steps than Normal; more steps cost more compute but resolve finer micro-detail.
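The iterative loop can be pictured as stepping a noise sample along a velocity field until it arrives at an estimate of the clean signal. The following is a toy sketch only, not The Refiner's model: `toy_velocity` is a hypothetical closed-form stand-in for a learned network, and `target` stands in for whatever clean signal a real model would infer from the degraded input.

```python
import random


def toy_velocity(x, t, target):
    """Stand-in for a learned velocity network (assumption, not a real model).

    For a straight-line path from noise to a single known target, the
    velocity at time t is (target - x) / (1 - t).
    """
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, target)]


def sample(target, steps, seed):
    """Euler-integrate from Gaussian noise (t=0) toward the target (t=1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t never reaches 1, so the division above is safe
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]  # one refinement step
    return x


if __name__ == "__main__":
    clean = [0.5, -0.25, 1.0]
    print(sample(clean, steps=8, seed=1))
```

In this toy the velocity field is exact, so any step count lands on the target. A learned field is only approximate, which is where the step count starts to matter and why a setting with more steps costs more compute.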
The distinction matters because the two approaches have different ceilings. Encoder-decoder models like Enhanced are bounded by what their synthesizer can produce — they fill in the high end well, but they cannot invent micro-detail that was never in the input representation. Iterative generative models can, because every denoising step is free to add structure that emerges from the noise prior. That is a strength and a risk: more detail, but the detail is plausible rather than recovered.
What each preset is good at
Enhanced shines on lossy-codec damage. The high-frequency rolloff on a 128 kbps mp3 (typically around 16 kHz with common encoders), quantization noise, smeared transients from low-bitrate encoders: that whole class of artifact is what its training was built around. On a Suno mp3 download or a YouTube-rip WAV, Enhanced typically delivers most of the perceptual improvement you would get from any preset, in under thirty seconds, for one credit.
Flow Matching shines where Enhanced flattens out: cymbal shimmer, breath noise on vocals, reverb tails, transient attack micro-detail. The iterative resynthesis rebuilds texture that an encoder-decoder cannot produce. It also has a side effect that matters for some users: vocoder-style artifacts in the source wash out in the rebuild, so Flow Matching outputs are harder for AI-content detectors to flag than Enhanced outputs.
Both presets export WAV. Default is 44.1 kHz at 24-bit. Pro and Studio plans unlock 48 kHz on either preset, and Flow Matching on those plans can render 32-bit float — relevant only if the file is heading into a DAW for further processing without re-quantization.
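The export rules above can be written down as a small lookup. This is a sketch of the rules as stated in this guide, not an API: the plan and preset identifiers (`"free"`, `"hobby"`, `"pro"`, `"studio"`, `"enhanced"`, `"flow_matching"`) are hypothetical strings chosen for illustration.

```python
from dataclasses import dataclass

PLANS = {"free", "hobby", "pro", "studio"}      # hypothetical identifiers
PRESETS = {"enhanced", "flow_matching"}          # hypothetical identifiers


@dataclass(frozen=True)
class OutputOptions:
    sample_rates_hz: tuple
    bit_depths: tuple


def output_options(plan: str, preset: str) -> OutputOptions:
    """Return the WAV export options described in this guide."""
    if plan not in PLANS or preset not in PRESETS:
        raise ValueError(f"unknown plan or preset: {plan!r}, {preset!r}")
    rates = [44_100]          # default on every plan
    depths = ["24-bit"]       # default on every plan
    if plan in {"pro", "studio"}:
        rates.append(48_000)  # unlocked on either preset
        if preset == "flow_matching":
            depths.append("32-bit float")
    return OutputOptions(tuple(rates), tuple(depths))


if __name__ == "__main__":
    print(output_options("pro", "flow_matching"))
```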
Where each one falls short
Enhanced is fast and cheap, but its synthesis path is itself a learned neural model. That means the high-frequency content it renders shares some statistical fingerprints with other generative-AI audio. If you are uploading to a platform that screens with AI-content detectors, an Enhanced output may still get flagged: sometimes, not always. We do not claim Enhanced "passes" detection; it does not do so reliably.
Flow Matching is slower and more expensive. Six to twelve minutes per three-minute track at 3-4 credits is a real cost when you are running through an album's worth of material. And because the model is reconstructing rather than enhancing, there is a small chance it invents detail you did not intend, usually inaudible on consumer playback but worth listening for on critical-listening monitors. If bit-for-bit reproducibility matters (deterministic mastering workflows, version-controlled deliverables), Enhanced is the safer choice because the same input always produces the same output.
The time, cost, and quality trade-off
On a three-minute track, Enhanced runs in roughly 10-30 seconds and costs 1 credit. Flow Matching runs in 6-12 minutes and costs 3 credits on Normal, 4 on High. That ratio — roughly twenty times the runtime and three to four times the credit cost — is the headline trade-off.
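To see how those figures add up across an album, the arithmetic can be sketched as below. The credit costs and per-track time ranges are the ones quoted above for a three-minute track. The preset keys are hypothetical, the 6-12 minute range is applied to both Flow Matching qualities because the guide does not split it, and the total assumes tracks are processed one after another.

```python
# Per three-minute track, as quoted in this guide.
PRESET_COSTS = {
    "enhanced":    {"credits": 1, "seconds": (10, 30)},
    "flow_normal": {"credits": 3, "seconds": (360, 720)},
    "flow_high":   {"credits": 4, "seconds": (360, 720)},
}


def estimate(preset: str, tracks: int) -> dict:
    """Estimate total credits and a runtime range (minutes) for a batch.

    Assumes every track is about three minutes long and that tracks
    run sequentially; neither is guaranteed by the service.
    """
    cost = PRESET_COSTS[preset]
    fastest, slowest = cost["seconds"]
    return {
        "credits": cost["credits"] * tracks,
        "min_minutes": fastest * tracks / 60,
        "max_minutes": slowest * tracks / 60,
    }


if __name__ == "__main__":
    for name in PRESET_COSTS:
        print(name, estimate(name, tracks=12))
```

For a twelve-track album this works out to 12 credits and 2 to 6 minutes on Enhanced, against 48 credits and 72 to 144 minutes on Flow Matching at High.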
The quality delta is real but not always audible. On phone speakers, laptop speakers, or earbuds during a commute, most listeners will struggle to tell Enhanced and Flow Matching apart on the same source. On studio monitors or open-back headphones, with a track that has legitimate high-frequency information to recover (acoustic instruments, cymbals, breathy vocals, dense reverb tails), the gap opens up. Whether that gap is worth twenty times the wait depends entirely on what the file is for.
AI-detector behavior
This is the one trade-off that is not about audio quality. Platforms that screen submissions for generative-AI content typically run their own classifiers, and those classifiers look for statistical signatures in the high-frequency spectrum. Suno, Udio, and other generators leave fingerprints there. So does Enhanced, because its neural synthesizer renders a similar class of high-frequency reconstruction.
Flow Matching is different. Because it rebuilds the signal through many iterative denoising steps rather than rendering it in a single pass, the original generative fingerprints get diluted into the reconstruction. Detector flag rates drop. We are not going to claim it makes a track undetectable — detection is an arms race and nothing is guaranteed — but if your distribution path runs through platforms that screen, Flow Matching is materially better behaved than Enhanced.
For the full release-prep workflow, including loudness, format, and platform-specific notes, see the longer write-up, "How to master Suno tracks for release."
A decision tree
Default to Enhanced. On most material — Suno mp3 downloads, YouTube rips, lossy renders from any source — it does enough of the job that the extra cost and wait of Flow Matching are not justified. Run it first, listen on whatever you would normally mix on, and ship it if it sounds right.
Reach for Flow Matching when one of three things is true. First, when fidelity is the point — final masters, release deliverables, critical-listening sessions where you can hear the residue Enhanced leaves behind. Second, when the source has obvious vocoder character that Enhanced does not move the needle on — usually acoustic-leaning tracks where the top end matters most. Third, when your distribution path runs through AI-content screening and you specifically need the generative fingerprints washed out.
If you are unsure, run Enhanced first. It is one credit and finishes before you can make coffee. If the output sounds finished, you are done. If it does not, run the same source through Flow Matching and compare. The Free tier covers 2 refinements per month, which is enough to do exactly this comparison on one track and decide which preset you want as your default.
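The decision tree in this section is small enough to write as a function. This is a sketch of the reasoning above rather than anything the product exposes: the function name and its three flags are hypothetical, and the answers to them still come from listening.

```python
def choose_preset(
    fidelity_critical: bool = False,
    vocoder_character_remains: bool = False,
    ai_screening: bool = False,
) -> tuple[str, list[str]]:
    """Return the suggested preset and the reasons behind it.

    fidelity_critical: final masters, release deliverables, critical listening.
    vocoder_character_remains: Enhanced was tried and the vocoder sound is still there.
    ai_screening: the distribution path screens for generative-AI content.
    """
    reasons = []
    if fidelity_critical:
        reasons.append("fidelity is the point")
    if vocoder_character_remains:
        reasons.append("vocoder character survives Enhanced")
    if ai_screening:
        reasons.append("distribution path screens for AI content")
    if reasons:
        return "Flow Matching", reasons
    return "Enhanced", ["default for most material"]


if __name__ == "__main__":
    print(choose_preset())
    print(choose_preset(ai_screening=True))
```

Any single condition is enough to tip the choice, which matches the "one of three things" wording above. With no flags set, the function falls through to the Enhanced default.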
Plans and credits
Plan limits and credit costs are on the pricing page. Briefly: Free is 2/month, Hobby covers light use, Pro and Studio unlock the 48 kHz and 32-bit float output options, and one-off credit packs cover surge use without changing your plan. Start free if you have not already — running both presets on a real track of yours is the fastest way to settle the question of which one belongs in your workflow.
Frequently asked questions
- Is Flow Matching always better than Enhanced?
- No. Flow Matching recovers finer high-frequency detail and washes out generative-AI fingerprints, but it costs more credits and takes 6-12 minutes per track instead of 10-30 seconds. For most material, Enhanced is the right default. Flow Matching is the tool you reach for when fidelity is critical or AI-detector behavior matters.
- Will Enhanced output pass AI-content detectors?
- Sometimes, but not reliably. Enhanced uses a neural synthesizer to render the cleaned waveform, and that synthesis step shares statistical fingerprints with other generative-AI audio. Flow Matching rebuilds the signal through many iterative denoising steps, which tends to wash those fingerprints out — outputs are harder for detectors to flag, though no method is guaranteed.
- How long does each preset take?
- Enhanced finishes in roughly 10-30 seconds on a three-minute track. Flow Matching takes 6-12 minutes on the same length, depending on whether you choose Normal or High quality. High runs more denoising steps for finer detail.
- What do they cost?
- Enhanced is 1 credit. Flow Matching is 3 credits on Normal and 4 credits on High. Free plan includes 2 refinements per month; Hobby, Pro, Studio plans and credit packs are on the pricing page.
- Do both presets export the same format?
- Both export WAV at 44.1 kHz, 24-bit by default. On Pro and Studio plans you can render at 48 kHz, and Flow Matching on those plans can export 32-bit float — useful when the refined file is going straight into a DAW for further processing.