mirror of https://github.com/JHenzi/OpenVoice2Text synced 2025-12-05 23:58:48 +00:00

Tilo Himmelsbach b99d8def82 readme		2022-10-16 13:07:12 +02:00
app	readme torch cpu version	2022-10-16 12:53:13 +02:00
tests	test_vtt_file	2022-10-15 20:01:25 +02:00
.gitignore	Initial commit	2022-10-09 20:29:41 +02:00
Dockerfile	test tasks, venv_no_test	2022-10-15 17:58:31 +02:00
Dockerfile_simple	Docker_simple	2022-10-15 16:03:14 +02:00
LICENSE	Initial commit	2022-10-09 20:29:41 +02:00
README.md	readme	2022-10-16 13:07:12 +02:00
requirements.txt	readme torch cpu version	2022-10-16 12:53:13 +02:00
requirements_test.txt	init	2022-10-15 14:18:55 +02:00

README.md

Whisper FastAPI Service

OpenAI's Whisper dockerized and put behind FastAPI

features

transcribe/translate via FastAPI's UploadFile form-data
- get the response as JSON
- or get a VTT file
load a Whisper model
1. via an environment variable passed to the docker container: docker run -e MODEL_NAME=base ...
2. via a GET request: curl http://localhost:2700/load_model/large
  - returns a response like this: {"loaded_model":"https://openaipublic.azureedge.net/main/whisper/models/<some-hash>/large.pt"}
docker-image
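
The GET-based model loading above can also be driven from Python. This is a minimal sketch using only the standard library; the helper names (`load_model_url`, `parse_loaded_model`, `load_model`) are hypothetical and not part of this repository, and the default base URL assumes the service is reachable on localhost:2700 as in the curl example.

```python
import json
import urllib.request
from urllib.parse import quote


def load_model_url(model_name: str, base_url: str = "http://localhost:2700") -> str:
    """Build the URL of the load_model endpoint for a given model name."""
    return f"{base_url.rstrip('/')}/load_model/{quote(model_name, safe='')}"


def parse_loaded_model(body: str) -> str:
    """Extract the 'loaded_model' value from the JSON response body."""
    return json.loads(body)["loaded_model"]


def load_model(model_name: str, base_url: str = "http://localhost:2700") -> str:
    """Ask a running service to load a model (performs a real GET request)."""
    with urllib.request.urlopen(load_model_url(model_name, base_url)) as resp:
        return parse_loaded_model(resp.read().decode("utf-8"))
```

With the container running, `load_model("large")` should return the model location reported by the service.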

TL;DR

run docker-container

docker run --rm -p 2700:2700 dertilo/whisper-fastapi-service:latest

transcribe files in "almost any format": wav, flac, mp3, opus, mp4, ...
- either go to localhost:2700/docs
- or curl it: curl -F 'file=@<some-where>/<your_file>' http://localhost:2700/transcribe
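
For callers that prefer Python over curl, the same upload can be sketched with the standard library. `curl -F` sends a multipart/form-data body, so the sketch builds one by hand; the function names `encode_multipart` and `transcribe` are hypothetical, and only the form field name `file` and the `/transcribe` URL are taken from the curl command above.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(field: str, filename: str, payload: bytes, boundary: str) -> bytes:
    """Encode a single file as a multipart/form-data body."""
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail


def transcribe(path: str, url: str = "http://localhost:2700/transcribe") -> dict:
    """POST an audio file to a running service and return the parsed JSON."""
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the audio bytes
    audio = Path(path)
    body = encode_multipart("file", audio.name, audio.read_bytes(), boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `transcribe("<some-where>/<your_file>")` against a running container should return the same JSON that the curl command prints.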

run service locally

# run service
pip install -r requirements.txt
python app/main.py

# in another terminal, make a request
curl -F 'file=@tests/resources/LibriSpeech_dev-other_116_288046_116-288046-0011.opus' http://localhost:2700/transcribe

the response JSON looks like this:

{
    "text": " Not having the courage or the industry of our neighbor, who works like a busy bee in the world of men and books, searching with the sweat of his brow for the real bread of life, waiting the open page of for him with his tears, pushing into the wee hours of the night his quest, animated by the fairest of all loves, the love of truth. We ease our own indolent conscience by calling him names.",
    "segments": [
        {
            "id": "0",
            "seek": 0,
            "start": 0,
            "end": 6,
            "text": " Not having the courage or the industry of our neighbor, who works like a busy bee in the world of men and books,",
            "tokens": [
                50364,
                1726,
                1419,
                ...
        }
    ]
}
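
The `segments` list in the response carries per-segment `start`, `end` and `text` fields, which is enough to build subtitle cues on the client side. The following is an illustrative sketch only, not the service's own VTT output: the function names are hypothetical, and it assumes `start` and `end` are given in seconds, as the sample response suggests.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_vtt(response: dict) -> str:
    """Turn the 'segments' of a transcription response into WebVTT text."""
    lines = ["WEBVTT", ""]
    for segment in response["segments"]:
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"{start} --> {end}")
        lines.append(segment["text"].strip())  # Whisper text starts with a space
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)
```

For the first segment shown above, this yields a cue running from 00:00:00.000 to 00:00:06.000.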

build & run docker-image/container

docker build -t whisper-fastapi-service .

docker run --rm -p 2700:2700 whisper-fastapi-service

TODO

use ONNX models via OpenVINO, see
GPU-docker-image