Audio Transcriber API

Transcribe audio from a URL or from a file uploaded from local disk.

Sep 28, 2023 3 min read

About the project

With this API, users can post a URL or upload an audio file, and the API transcribes the audio and returns the transcription. I created this project so I can transcribe as many audio files as I like locally, without calling external APIs.

I built the API endpoints with FastAPI, a modern, high-performance web framework for building APIs based on standard Python type hints. It generates interactive documentation automatically, and it is a lot of fun to work with.

How it works

When the client uploads an audio file, the API queues a Celery task to transcribe it and returns the task ID.

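import os
import pathlib
import shutil
import tempfile

from fastapi import Form, HTTPException

async def transcribe(audio: AudioFile = None, url: str = Form(None)):
    if audio:
        # Save the uploaded file into a temporary file
        ext = pathlib.Path(audio.filename).suffix
        fd, filepath = tempfile.mkstemp(dir='/tmp', suffix=ext)
        with os.fdopen(fd, 'wb') as f:
            # Assumes AudioFile exposes .file like FastAPI's UploadFile
            shutil.copyfileobj(audio.file, f)

        # Transcribe asynchronously
        try:
            task = transcribe_from_file.delay(filepath)
        except TaskException as e:
            raise HTTPException(status_code=500, detail=str(e))
    elif url:
        ...  # URL branch omitted from this excerpt; it also sets `task`

    return {'taskId': task.id}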
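
The excerpt above only shows the upload branch. For the URL case, one possible approach (an assumption, not taken from the project's source) is to download the audio to a temporary file first and then queue it the same way. The helper names suffix_from_url and download_to_tempfile below are hypothetical, as is the choice to accept only http and https URLs.

```python
import pathlib
import shutil
import tempfile
import urllib.parse
import urllib.request

# Assumption: only plain web URLs are accepted, so a client cannot
# make the server read local files through a file:// URL.
ALLOWED_SCHEMES = {'http', 'https'}


def suffix_from_url(url: str) -> str:
    """Return the extension of the URL path, e.g. '.mp3' (may be empty)."""
    path = urllib.parse.urlparse(url).path
    return pathlib.PurePosixPath(path).suffix


def download_to_tempfile(url: str, timeout: float = 30.0) -> str:
    """Download the audio at `url` into a temporary file and return its path."""
    scheme = urllib.parse.urlparse(url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f'unsupported URL scheme: {scheme!r}')

    # Open the connection first so no temp file is left behind if it fails
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with tempfile.NamedTemporaryFile(
            suffix=suffix_from_url(url), delete=False
        ) as f:
            shutil.copyfileobj(response, f)
    return f.name
```

The returned path could then be handed to the same Celery task as an uploaded file. Keeping the original extension preserves a hint about the audio format for whatever decodes the file later.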

The client then needs to check the status of that task on a separate endpoint. Once the transcription is done, the same endpoint also returns the result.

async def transcribe_status(task_id: str):
    task = celery.AsyncResult(task_id)
    if task.ready():
        # ready() is also True for failed tasks; get() re-raises their exception
        return {'status': 'DONE', 'result': task.get()}
    return {'status': 'IN_PROGRESS'}
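
On the client side, checking the status amounts to a polling loop. The following is a minimal sketch, assuming the response has the 'status' and 'result' keys shown above. The function wait_for_result and its fetch_status callback are hypothetical names. The HTTP call is left to the caller because the route of the status endpoint is not shown here.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 600.0,
) -> Any:
    """Call fetch_status() until it reports DONE, then return the result.

    fetch_status is any callable that returns the decoded JSON body of
    the status endpoint, e.g. a small wrapper around an HTTP GET.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload.get('status') == 'DONE':
            return payload.get('result')
        if time.monotonic() >= deadline:
            raise TimeoutError('transcription did not finish in time')
        # Wait before asking again so the API is not flooded with requests
        time.sleep(interval)
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP library and makes it easy to try out with canned responses.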

Under the hood, the API uses OpenAI's Whisper model to transcribe the audio. The base model used here is relatively small, but it gives very good results.

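import whisper

# Registered as a Celery task so the endpoint can call .delay() on it
# (assumes `celery` is the app instance used by the status endpoint)
@celery.task
def transcribe_from_file(filepath: str):
    model = whisper.load_model('base')
    result = model.transcribe(filepath)
    return result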
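
The value returned by Whisper's transcribe() is a dictionary that holds the full transcription under 'text' and a list of timed pieces under 'segments', each with 'start', 'end' and 'text' fields. As a small sketch of how a client might present that result, the hypothetical helpers below turn the segments into timestamped lines. The sample dictionary in the usage note is made up for illustration.

```python
from typing import Any, Dict, List


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f'{hours:02d}:{minutes:02d}:{secs:02d}'


def format_segments(result: Dict[str, Any]) -> List[str]:
    """Return one '[start - end] text' line per transcribed segment."""
    lines = []
    for segment in result.get('segments', []):
        start = format_timestamp(segment['start'])
        end = format_timestamp(segment['end'])
        # Segment text usually starts with a space, so strip it
        lines.append(f"[{start} - {end}] {segment['text'].strip()}")
    return lines
```

For example, a result of {'segments': [{'start': 0.0, 'end': 2.5, 'text': ' Hello'}]} yields the single line "[00:00:00 - 00:00:02] Hello".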


With this API, users can transcribe their audio files locally without paying for third-party APIs. You can download the source code on my GitHub.