You've successfully subscribed to Castopod Blog
Great! Next, complete checkout for full access to Castopod Blog
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.
Success! Your billing info is updated.
Billing info update failed.

Install Whisper.cpp on your Mac in 5mn and transcribe all your podcasts for free!

Audio transcription is getting better every month. Whisper.CPP makes it faster and easier.

Benjamin Bellamy
Benjamin Bellamy

A few months ago, I wrote an article about podcast transcription with Whisper (from OpenAI):
Transcribe your podcast for free using only a laptop and Whisper

Installing Whisper is pretty easy but having CUDA working with your GPU drivers can be tricky…

Thanks to Georgi Gerganov, things got a lot easier! Whisper was originally written in Python (which is used for many A.I. projects), Georgi wrote it in C++. And he called it… Whisper.CPP.

His C++ version is very interesting:

  • it runs faster
  • it has no dependency
  • it is supported on almost all platforms (macOS, iOS, Android, Linux, FreeBSD, WebAssembly, Windows, Raspberry Pi)

Note that it currently does not work on GPU yet (CPU only) but is optimized for Apple silicon.

Benchmark

I ran some tests to see how fast Whisper.CPP runs compared to the Python version on CPU (Intel Core i7-13700KF 3.4 GHz) and the Python version on GPU (12GB Dual GeForce RTX 3060).

  • “10×” means that it takes 1mn to transcribe a 10mn audio file.
  • All files were converted to 16KHz WAV format before starting transcribing.
  • The time spent for loading the models was not excluded.
  • The Python version uses the “BeamSearch” decoder (default parameter), while Whisper.cpp uses the “Greedy” decoder.

WhisperBenchmark

We can see some differences depending on the model we use:

  • Whisper.CPP is always much faster than Whisper on CPU, over 6 times faster for the tiny model up to over 7 times faster for the large one.
  • Whisper.CPP is faster than Whisper on GPU for the tiny (1.8× faster) and base models (1.3× faster).
    It runs slightly slower than Whisper on GPU for the small, medium and large models (1.3× slower).

So if you don't have a GPU (or if you can't make CUDA work), it's a no-brainer: use Whisper.CPP!
If you have a GPU, you should probably use Whisper, unless you are using the tiny or base model (which I do not recommend as they give very inaccurate results).

Installation on macOS

Installing Whisper.cpp is very easy and you should be able to make it work by reading the instructions on whisper.cpp's GitHub page.

Whisper.CPP is provided with its source code only, so you have to compile it. Since it has no dependency (no extra library is required), this step is very easy to perform, all you have to do is to type make in a Terminal window.

But before that, make sure that you have the tools you need, otherwise macOS will probably raise an error and kindly suggest that you install XCode Command Line Tools. If you don't see any error you can skip the next step!

xcode-select: note: no developer tools were found at '/Applications/Xcode.app', requesting install.
Choose an option in the dialog to download the command line developer tools.

xcode

Bill of materials

There are 4 programs you probably need in order to use Whisper.CPP and have it up and running on your Mac:

  • XCode or XCode Command Line Tools: XCode is the official development tool provided by Apple.
    You can either:
    • Download and install XCode from the Apple app store.
      XCode is huge (7.8GB) so depending on your Internet connection, this can take a while and you don't need the whole package if you just plan to use make.
    • Install XCode Command Line Tools (~130MB “only”) when macOS suggests that you do so, when running make or git (just click on “Install”)
    • Install XCode Command Line Tools by typing xcode-select --install in a Terminal window
      xcode-select
    • Install Homebrew first: It will ask you to install XCode Command Line Tools
  • ffmpeg (optional): ffmpeg is a power tool that can convert any video/audio file to any video/audio format. This is not mandatory but highly recommended. I used it to convert my audio files to 16KHz wav format.
  • Homebrew (optional): Homebrew is a Package Manager that I recommend in order to install ffmeg. Only if you want ffmpeg.
  • Aegisub (optional): Aegisub is a program that can be used to correct SRT transcripts.

Installing Whisper.CPP

  • Open a Terminal window (press ⌘+[SPACE], then type terminal then press [ENTER])
  • Type:
    git clone https://github.com/ggerganov/whisper.cpp.git
    This will download Whisper.CPP source code.
    You can also download the zip package and unzip it if you don't like git.
  • Type cd whisper.cpp
  • Type make
  • Whisper.CPP is now compiled on your laptop. (But you still need to download some trained model to make is work.)
  • Type cd models in order to go to the model directory.
  • Type:
    bash ./download-ggml-model.sh small
    (this will download the small model - which is the one I recommend - but you can download another one if you want to).
  • That's it!

clone-make

Try the example:
./main -m models/ggml-small.bin -f samples/jfk.wav

In order to get an SRT file, type:
./main -m models/ggml-small.bin --output-srt --max-len 47 -f samples/jfk.wav

Whispercpp

Now you can correct it with Aegisub.

aegisub

Installing Homebrew and ffmpeg

  • To install Homebrew, open a Terminal window and type:
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    Homebrew
  • Now to install ffmpeg, simply type:
    brew install ffmpeg

Converting to 16KHz wav with ffmpeg

Whisper.CPP takes 16KHz wav files only.
You may use your DAW to generate that. Or use ffmpeg:
ffmpeg -i /path/to/input-file.mp3 -ar 16000 /path/to/output-file.wav

More about Whisper.CPP: Roadmap | F.A.Q. #126

Happy transcribing!

Photo by Karolina Grabowska

Tips & Tricks

Benjamin Bellamy

Podcasts, e-commerce & open-source. Father of Castopod. CEO of Ad Aures.