NoCloud Media

Video tool

Video Denoiser — Remove Background Noise

Pick an FFT-based strength or the RNN voice model. The picture is stream-copied — only the audio gets denoised. Your video never leaves your browser.

How it works

  1. Drop your video file

    MP4, MOV, WebM, MKV, AVI, or M4V. The file stays on your device.

  2. Pick a denoiser

    Light / Medium / Strong run FFmpeg's afftdn filter (FFT-based; works on any audio content). RNN voice runs FFmpeg's arnndn filter with the rnnoise model — voice-trained, cleanest on speech under heavy background.

  3. Process and download

    FFmpeg.wasm denoises the audio track and stream-copies the picture, so only the audio is re-encoded: processing is fast and the picture stays bit-perfect. The output keeps the input's container, so MP4 stays MP4 and MOV stays MOV.
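
The three steps above boil down to one FFmpeg invocation. The following is a minimal sketch of how the argument list might be assembled, not the tool's actual source. The function names, the `rnnoise.rnnn` model path, and the exact `nr` values (taken from the approximate 6 / 12 / 24 dB figures quoted elsewhere on this page) are assumptions for illustration. Only the `afftdn` and `arnndn` filter names and `-c:v copy` come from the text.

```typescript
// Hypothetical helper names; only the filter names and "-c:v copy" come from the page text.
type DenoiseMode = "light" | "medium" | "strong" | "rnn";

// Assumed mapping of the FFT presets to afftdn's noise-reduction option (nr, in dB).
const AFFTDN_NR_DB: Record<Exclude<DenoiseMode, "rnn">, number> = {
  light: 6,
  medium: 12,
  strong: 24,
};

// Build the audio filter string for the chosen denoiser.
// modelPath is a placeholder for wherever the rnnoise model file is stored.
function audioFilterFor(mode: DenoiseMode, modelPath: string = "rnnoise.rnnn"): string {
  if (mode === "rnn") {
    return `arnndn=m=${modelPath}`;
  }
  return `afftdn=nr=${AFFTDN_NR_DB[mode]}`;
}

// Assemble FFmpeg arguments: copy the video stream untouched, filter the audio.
// No audio codec is forced here, so FFmpeg would pick the container's default encoder.
function buildDenoiseArgs(
  input: string,
  output: string,
  mode: DenoiseMode,
  modelPath?: string,
): string[] {
  return ["-i", input, "-c:v", "copy", "-af", audioFilterFor(mode, modelPath), output];
}

const exampleArgs = buildDenoiseArgs("input.mp4", "output.mp4", "medium");
console.log(exampleArgs.join(" "));
```

Giving the output file the same extension as the input is what keeps the container unchanged in this sketch, since FFmpeg infers the container format from the output file name.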

Why use the Video Denoiser?

Strip HVAC drone, laptop fan whir, or computer hum from a screen recording's narration without re-recording.

This is the same family of denoiser that Krisp ($12/mo) and Adobe Podcast Enhance (cloud + signup) use. Here it runs on your machine, on your video, with nothing uploaded.

The picture stays bit-perfect because only the audio is re-encoded. That is faster than a round trip through a full editor and avoids any loss of picture quality.

Two algorithms in one tool — FFT-based for any content (voice, music, ambience) and RNN-based purpose-trained for speech.

Private — your video never touches our servers, which matters for personal recordings or pre-release content.

Common use cases

  • Clean up the audio of a Zoom recording with laptop-fan whir before posting to YouTube
  • Strip HVAC drone from a remote interview video that you can't re-record
  • Take street rumble out of a phone-recorded vlog clip
  • Suppress mic-stand hum in a music video before publishing
  • Remove ambient air-conditioner noise from a YouTube voiceover screen recording
  • Salvage a webcam interview recorded in a noisy room (RNN voice for heavy background)
  • Polish a tutorial screen-recording's narration for clarity without re-recording
  • Clean up a wedding video's audio recorded in a windy outdoor location

About the video denoiser

This tool applies the same two algorithm families as the audio denoiser, but only to the video's audio track. **Light / Medium / Strong** use FFmpeg's `afftdn` filter, a content-agnostic frequency-domain denoiser, with cuts of roughly 6, 12, and 24 dB. **RNN voice** uses `arnndn` with the pretrained rnnoise model, which is voice-trained and cleanest on speech under heavy background noise.

In our benchmark (`npm run bench:denoise-comparison`), RNN voice noticeably outperforms FFT denoising at very high noise levels (-10 dB SNR, where the voice is barely audible above the hum). At moderate noise the two tie, so the choice depends on the content.

The video stream is stream-copied (`-c:v copy`): no frame is re-rendered and no quality is lost. Processing speed is therefore set by the audio re-encode, typically 5-20× realtime. The output container matches the input. The rnnoise model file (~300 KB) is lazy-loaded on first use of the RNN voice option and then cached, so subsequent runs skip the download.
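
The lazy-load-then-cache behaviour described for the rnnoise model can be illustrated with a small memoizing loader. This is a sketch under assumptions, not the tool's implementation: `createLazyLoader` and `loadModel` are hypothetical names, the cache here is in-memory only (the real tool may rely on the browser's HTTP cache or another store), and the fetcher is a stand-in for a real download.

```typescript
// Hypothetical helper: wraps a fetcher so the work happens once, on first use.
function createLazyLoader<T>(fetcher: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;

  return () => {
    // Later calls reuse the promise from the first call.
    if (cached !== undefined) {
      return cached;
    }
    const pending = fetcher().catch((err: unknown) => {
      // Forget a failed attempt so the next call can retry the download.
      cached = undefined;
      throw err;
    });
    cached = pending;
    return pending;
  };
}

// Stand-in fetcher: a real one would download the model bytes instead.
const loadModel = createLazyLoader(async () => new Uint8Array(300 * 1024));
```

Nothing is fetched until `loadModel()` is first called, which matches the idea that users who only pick the FFT presets never pay for the model download.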

Frequently asked questions

Is my video uploaded to a server?
No. NoCloud Media denoises your video's audio track entirely in your browser using WebAssembly. Your file never leaves this tab. Krisp and Adobe Podcast Enhance both upload your media to a server; this doesn't.
Will the picture quality be reduced?
No. The video stream is copied through unchanged (`-c:v copy`): no frame is re-rendered and no quality is lost. Only the audio track is re-encoded after denoising, which is unavoidable because denoising changes every audio sample.
Which strength should I pick?
For talking-head video with HVAC / fan / mild crowd noise: try **RNN voice** first — it's purpose-trained for speech. For videos with intentional background sound (music, room tone, ambient field recording): pick a **Light / Medium / Strong** afftdn variant — those preserve the surrounding sound. Among the FFT variants: Medium is FFmpeg's documented default; Light keeps room tone; Strong is for heavy background but voice may sound a touch processed.
How does the RNN voice option compare to Krisp / Adobe Podcast Enhance?
Same family of algorithm — a small neural network trained on speech-vs-non-speech. Krisp ships its own proprietary model; Adobe ships theirs; we ship the open-source rnnoise model from xiph.org (BSD-3-Clause). The bench (`npm run bench:denoise-comparison`) shows RNN voice meaningfully outperforms FFT denoising on heavy background noise. For typical recording conditions all three produce subjectively comparable output. The differentiator here is privacy + price: this runs locally with nothing uploaded, no signup, free.
Will RNN voice damage music in my video?
Yes — the rnnoise model attenuates anything it doesn't classify as speech, so background music gets mangled. For music videos or videos with significant ambient sound, pick Light / Medium / Strong instead. Those use afftdn, which is content-agnostic.
Does it work on long videos?
Yes, but memory is the constraint. Audio denoising holds frequency-domain data in RAM during processing. A 30-minute talking-head video at 256 kbps audio is ~58 MB of audio data — well within reach. Multi-hour videos may push browser memory limits.
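
The ~58 MB figure follows from bitrate multiplied by duration. A small sketch of that arithmetic, with `audioSizeMB` as a hypothetical helper name:

```typescript
// Estimate the size of an encoded audio stream in megabytes (1 MB = 1,000,000 bytes).
function audioSizeMB(durationMinutes: number, bitrateKbps: number): number {
  const totalBits = durationMinutes * 60 * bitrateKbps * 1000;
  return totalBits / 8 / 1e6;
}

// 30 minutes at 256 kbps: 1800 s × 256,000 bit/s = 460,800,000 bits = 57.6 MB.
console.log(audioSizeMB(30, 256).toFixed(1)); // prints "57.6"
```

This counts the encoded audio stream only. It is an estimate of input size, not a measurement of peak memory use during processing.
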
What's the maximum file size?
It depends on your browser's available memory. Files under 1 GB are reliable; very large files may run out of memory.
Which browsers are supported?
Chrome, Edge, Firefox, and Safari 15+. We require WebAssembly and SharedArrayBuffer, both standard in modern browsers.
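
A page could check for these two requirements before starting. The sketch below is illustrative only: `RuntimeGlobals` and `missingFeatures` are hypothetical names, and the environment is passed in as a plain object so the check can be exercised outside a browser.

```typescript
// Hypothetical shape for the parts of the global scope this check cares about.
interface RuntimeGlobals {
  WebAssembly?: unknown;
  SharedArrayBuffer?: unknown;
}

// Return the names of required features that the given environment lacks.
function missingFeatures(env: RuntimeGlobals): string[] {
  const missing: string[] = [];
  if (typeof env.WebAssembly !== "object" || env.WebAssembly === null) {
    missing.push("WebAssembly");
  }
  if (typeof env.SharedArrayBuffer !== "function") {
    missing.push("SharedArrayBuffer");
  }
  return missing;
}
```

In a browser, the global scope would be passed in and an empty result would mean both features are present. Note that browsers generally expose `SharedArrayBuffer` only on cross-origin-isolated pages, so a supported browser can still report it as missing if the page is served without the isolation headers.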

Related tools