How do I record system audio on Mac?

Historically, macOS has not let apps record system audio directly, so workarounds like BlackHole rely on virtual audio devices. Screenpipe handles this natively: on macOS 15+ it captures system audio through the screen recording API, with no virtual device setup.

How do I transcribe computer audio automatically?

Install Screenpipe — it captures both system audio and microphone, can transcribe locally with Whisper, and makes transcripts searchable with AI. Optional cloud AI, sync, exports, connectors, and team workflows are separate choices.

Can I record system audio and microphone at the same time?

Yes. Screenpipe captures system audio (what you hear) and microphone (what you say) simultaneously, with speaker identification. Most other tools require complex audio routing to achieve this.

How to Capture Audio from Your Computer and Transcribe It

TL;DR: Screenpipe captures both system audio (what you hear) and microphone input simultaneously, transcribes everything locally with Whisper, and makes it all searchable with AI. No virtual audio devices (BlackHole/Soundflower), no cloud processing. Works on Mac, Windows, and Linux. Setup takes 5 minutes — just install and grant permissions.

You want to record what you hear through your computer — a Zoom call, a YouTube tutorial, a podcast, system sounds — and get a searchable transcript.

This should be simple. It isn't. Operating systems make it surprisingly hard to capture system audio (the audio coming out of your speakers or headphones), and macOS is the hardest of the three.

Here's how to do it properly in 2026.

The Problem with System Audio Capture

macOS

For most of its history, macOS gave apps no direct way to record system audio. When you "screen record" with QuickTime, you get video but no system audio, only your microphone.

Workarounds exist (BlackHole, Soundflower, Loopback), but they require virtual audio devices and manual routing, and they break regularly with OS updates. Recent macOS releases expose system audio capture through the screen recording API (ScreenCaptureKit has supported audio capture since macOS 13), but few apps take advantage of it properly.

Windows

Windows is better here. WASAPI loopback capture lets apps record system audio without extra drivers. Most screen recorders support it. But combining system audio + mic audio + transcription in one tool is still uncommon.

Linux

PulseAudio and PipeWire expose a monitor source for each output device, which apps can record like any other input, but configuration varies by distribution and desktop environment.
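On PulseAudio (and PipeWire's PulseAudio-compatible layer), monitor sources conventionally have names ending in `.monitor`. The following is a minimal Python sketch for finding them, assuming `pactl` is installed and that `pactl list short sources` prints tab-separated rows with the source name in the second column. The function names are illustrative only.

```python
import subprocess


def find_monitor_sources(pactl_output: str) -> list[str]:
    """Return names of monitor sources from `pactl list short sources` output."""
    monitors = []
    for line in pactl_output.splitlines():
        fields = line.split("\t")
        # Expected columns: index, name, driver, sample spec, state.
        if len(fields) >= 2 and fields[1].endswith(".monitor"):
            monitors.append(fields[1])
    return monitors


def list_monitor_sources() -> list[str]:
    """Query the running sound server; requires `pactl` on PATH."""
    result = subprocess.run(
        ["pactl", "list", "short", "sources"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_monitor_sources(result.stdout)
```

A recorder can then be pointed at one of the returned names as its input device. Which monitor corresponds to your active speakers or headphones still depends on the system.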

The Simple Solution: Screenpipe

Screenpipe captures both system audio and microphone input simultaneously, transcribes everything locally using Whisper, and makes it all searchable. No virtual audio devices, no manual routing, no cloud processing.
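To give a feel for what local Whisper transcription looks like, here is a hedged Python sketch. It assumes the open-source `openai-whisper` package and an audio file already saved to disk. It is not Screenpipe's actual pipeline, and the helper names are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as HH:MM:SS."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments: list[dict]) -> list[str]:
    """Turn Whisper-style segments into '[start - end] text' lines."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_file(path: str, model_name: str = "base") -> list[str]:
    """Transcribe an audio file locally; needs the openai-whisper package."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    # Each segment is a dict with "start", "end" (seconds) and "text".
    return format_segments(result["segments"])
```

Everything in this sketch runs on the local machine. The model weights are downloaded once, and the audio itself is never uploaded.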

What it captures:

System audio — everything you hear through speakers or headphones (meetings, videos, music, notifications)
Microphone — everything you say
Both simultaneously — hear the meeting and your own comments, attributed separately
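Keeping the two streams separate is what makes attribution possible: each segment carries its source, and the streams are interleaved by time afterwards. The sketch below shows that idea in Python. The `Segment` type and the "system"/"mic" labels are hypothetical and do not describe Screenpipe's data model.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Segment:
    start: float  # seconds since capture began
    source: str   # "system" (what you hear) or "mic" (what you say)
    text: str


def merge_streams(system: list[Segment], mic: list[Segment]) -> list[Segment]:
    """Interleave two time-sorted streams into one chronological transcript."""
    return list(heapq.merge(system, mic, key=lambda seg: seg.start))
```

Because the source label travels with each segment, the merged transcript can still show which lines came from the call and which came from you.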

What it does with it:

Real-time local transcription using Whisper
Speaker identification — who said what
Full-text search across all transcripts
AI-powered queries: "What was the action item from the standup?"
Timestamps linking audio to screen content
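Full-text search over transcripts can be pictured as an inverted index: each word maps to the segments that contain it, and a query returns the segments containing every query word. The Python sketch below is plain keyword matching under that assumption. It is not the AI-powered querying described above, and all names in it are illustrative.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments: list[str]) -> dict[str, set[int]]:
    """Map each word to the positions of the segments that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for position, text in enumerate(segments):
        for word in tokenize(text):
            index[word].add(position)
    return index


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return sorted positions of segments containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

With this sketch, searching for "action item" returns only segments that contain both words, wherever they appear in the segment.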

Setup

Download Screenpipe for your platform
Grant microphone permission (on macOS, also grant screen recording permission, which system audio capture requires)
That's it — audio capture starts automatically

No BlackHole. No Soundflower. No virtual audio routing. It just works.

Use Cases

Meeting Recording Without a Bot

Most meeting transcription tools (Otter, Granola, tl;dv) add a bot to your call. Screenpipe captures your computer's audio output — no bot, no "Otter.ai wants to join" notification, no asking permission.

Every Zoom, Meet, Teams, Slack huddle, or Discord call is captured automatically because you're recording system audio. See the AI meeting notes use case for details.

Tutorial and Lecture Capture

Watching a coding tutorial? A university lecture? A conference talk? Screenpipe transcribes the audio so you can search it later. "What did they say about the useEffect cleanup function?" — instant answer from the transcript.

Podcast Research

Listening to podcasts for research? Screenpipe transcribes as you listen. Later, search for specific topics across hours of audio: "when did they discuss Series A fundraising?"

Call Transcription

Phone calls on speaker, VoIP calls through your computer, customer support calls — anything that plays through your audio output gets captured and transcribed.

Comparison with Other Approaches

Feature	Screenpipe	BlackHole + Whisper	Otter.ai	OBS + Manual
System audio	✅ Native	✅ Virtual device	❌ Meeting only	✅ With config
Mic audio	✅	⚠️ Complex routing	❌	✅
Auto transcription	✅ Local Whisper	Manual	✅ Cloud	❌
Always-on	✅	Manual start	Per-meeting	Manual start
Searchable	✅ AI-powered	❌	✅	❌
Screen capture too	✅ Accessibility + OCR	❌	❌	✅ Video only
Setup effort	5 minutes	30+ minutes	5 minutes	15 minutes
Privacy	100% local	Local	Cloud	Local

Audio + Screen Is Better Than Audio Alone

Here's why Screenpipe captures your screen and audio simultaneously:

When someone says "look at the third column" while sharing a spreadsheet, the audio transcript alone gives you "look at the third column" — useless without the visual. With screen capture, you get the actual spreadsheet content.

When someone drops a URL in the Zoom chat, audio tools miss it entirely. Screen capture catches it.

When someone demos a UI and says "click here, then here," the audio is meaningless without the visual. Screen + audio together give you the full picture.
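Linking a spoken phrase to what was on screen comes down to a timestamp lookup: given the capture times of screen frames, find the latest frame at or before the moment the phrase was spoken. Here is a small Python sketch of that lookup, assuming frame times are sorted and share a clock with the transcript. The function name is hypothetical.

```python
from bisect import bisect_right
from typing import Optional


def frame_at(frame_times: list[float], spoken_at: float) -> Optional[int]:
    """Index of the latest frame captured at or before `spoken_at`, else None."""
    position = bisect_right(frame_times, spoken_at)
    return position - 1 if position else None
```

For a transcript line such as "look at the third column", the returned index points at the frame (and its extracted text) that was visible when the words were spoken.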

Getting Started

Download Screenpipe
Grant permissions (takes 30 seconds)
Your audio is now being captured and transcribed — automatically, locally, continuously

Every meeting, every call, every video, every podcast. Searchable forever, on your device.

For more details, see the audio capture and transcription use case.
