Quickstart

docker run -it -p 8080:8080 iceychris/libreasr:latest

The output looks like this:

make sde &
make sen &
make b
make[1]: Entering directory '/workspace'
python3 -u api-server.py de
make[1]: Entering directory '/workspace'
python3 -u api-bridge.py
make[1]: Entering directory '/workspace'
python3 -u api-server.py en
[api-bridge] running on :8080
[quantization] LM done.
[quantization] LM done.
[LM] loaded.
[LM] loaded.
[quantization] Transducer done.
[Inference] Model and Pipeline set up.
[api-server] gRPC server running on [::]:50052 language de
[quantization] Transducer done.
[Inference] Model and Pipeline set up.
[api-server] gRPC server running on [::]:50051 language en

Point your browser to http://localhost:8080/

Architecture

Overview

LibreASR is composed of:

  • core and language models

  • api-server, which serves the models over a gRPC API

  • api-bridge, which exposes a WebSocket API for clients to use (a minimal client sketch follows this list)

  • client implementations
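
Clients talk to the api-bridge over WebSockets. Below is a minimal client sketch in Python, assuming the bridge accepts raw audio bytes as binary frames on ws://localhost:8080/ and streams back text transcripts; the endpoint path and message format are assumptions, not taken from the LibreASR source.

  # A minimal sketch of a WebSocket client for the api-bridge.
  # Assumptions: binary frames carry raw audio chunks, replies are text transcripts.
  import asyncio
  import websockets

  async def transcribe(path="audio.wav", url="ws://localhost:8080/"):
      async with websockets.connect(url) as ws:
          # stream the file in small chunks, as a microphone client would
          with open(path, "rb") as f:
              while chunk := f.read(4096):
                  await ws.send(chunk)
          # print whatever transcripts the bridge sends back
          async for message in ws:
              print(message)

  asyncio.run(transcribe())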

Features

These features are already implemented:

These are on the Roadmap:

  • French, Spanish, Italian, multilingual

  • Tuned language model fusion (see the sketch after this list)
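
For the fusion roadmap item, here is a minimal sketch of shallow fusion, where the transducer's per-token log-probabilities are interpolated with an external language model at each decoding step; the function and parameter names are hypothetical, and lm_weight is the coefficient that would be tuned.

  # A minimal sketch of shallow fusion, assuming per-token log-probabilities
  # from the transducer and an external LM; all names here are hypothetical.
  import torch

  def fuse(transducer_logp, lm_logp, lm_weight=0.3):
      # interpolate scores over the vocabulary for one decoding step;
      # lm_weight is the coefficient that would be tuned
      return transducer_logp + lm_weight * lm_logp

  # usage: pick the next token from the fused distribution
  vocab = 128
  fused = fuse(torch.log_softmax(torch.randn(vocab), dim=-1),
               torch.log_softmax(torch.randn(vocab), dim=-1))
  print(fused.argmax())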

Training

This section contains instructions on how to train your own models with LibreASR.

Overview

RNN-T Model

  python3 create-asr-dataset.py /data/common-voice-english common-voice --lang en --workers 4

This results in multiple asr-dataset.csv files, which will be used for training.
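
A quick way to sanity-check the generated files before training is to load them with pandas; only the asr-dataset.csv file name comes from the step above, the rest is a generic inspection sketch.

  # Sanity-check the generated asr-dataset.csv files before training.
  # Only the file name comes from the step above; this is a generic inspection.
  from pathlib import Path
  import pandas as pd

  for csv in Path("/data/common-voice-english").rglob("asr-dataset.csv"):
      df = pd.read_csv(csv)
      print(csv, "rows:", len(df), "columns:", list(df.columns))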

  • edit the configuration file testing.yaml to point to your data, choose transforms, and tweak other settings

  • adjust and run libreasr.ipynb to start training

  • watch the training progress in tensorboard

  • the model with the best validation loss is saved to models/model.pth; the model with the best WER ends up in models/best_wer.pth (see the sketch after this list)
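
To check that a training run produced usable checkpoints, you can inspect the saved files; the sketch below assumes they are plain PyTorch state dicts, which may not match the exact format LibreASR writes.

  # Inspect the saved checkpoints; assumes plain PyTorch state dicts,
  # which may not match the exact format LibreASR writes.
  import torch

  for path in ("models/model.pth", "models/best_wer.pth"):
      state = torch.load(path, map_location="cpu")
      tensors = [t for t in state.values() if hasattr(t, "numel")]
      print(path, "entries:", len(state), "params:", sum(t.numel() for t in tensors))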

Language Model

See this Colab notebook or use this notebook.
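
As a rough idea of what such a language model looks like, here is a compact token-level LSTM LM in PyTorch; the layer sizes and vocabulary are placeholder assumptions, not the configuration used in the notebooks.

  # A compact token-level LSTM language model; layer sizes and vocabulary
  # are placeholder assumptions, not the configuration from the notebooks.
  import torch
  import torch.nn as nn

  class TinyLM(nn.Module):
      def __init__(self, vocab=256, emb=128, hidden=512, layers=2):
          super().__init__()
          self.emb = nn.Embedding(vocab, emb)
          self.rnn = nn.LSTM(emb, hidden, layers, batch_first=True)
          self.out = nn.Linear(hidden, vocab)

      def forward(self, tokens, state=None):
          x, state = self.rnn(self.emb(tokens), state)
          return self.out(x), state

  # usage: next-token logits for a dummy batch of 4 sequences of length 32
  logits, _ = TinyLM()(torch.randint(0, 256, (4, 32)))
  print(logits.shape)  # torch.Size([4, 32, 256])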

Inference

Deployment

Example Apps

  • React Web App

  • ESP32-LyraT

Performance

| Model   | Dataset | Network  | Params | CER (dev) | WER (dev) |
|---------|---------|----------|--------|-----------|-----------|
| english | 1400h   | 6-2-1024 | 70M    | 18.9      | 23.8      |
| german  | 800h    | 6-2-1024 | 70M    | 23.2      | 37.6      |

While this is clearly not SotA, training the models for longer and on multiple GPUs (instead of a single RTX 2080 Ti) would yield better results.
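
For reference, the WER column is the word error rate: the word-level Levenshtein distance between reference and hypothesis divided by the reference length, with CER being the same computation at character level. A minimal sketch:

  # Word error rate: word-level Levenshtein distance between reference and
  # hypothesis, divided by the reference length (CER is the same per character).
  def wer(reference, hypothesis):
      ref, hyp = reference.split(), hypothesis.split()
      # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
      dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
      for i in range(len(ref) + 1):
          dp[i][0] = i
      for j in range(len(hyp) + 1):
          dp[0][j] = j
      for i in range(1, len(ref) + 1):
          for j in range(1, len(hyp) + 1):
              cost = 0 if ref[i - 1] == hyp[j - 1] else 1
              dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                             dp[i][j - 1] + 1,          # insertion
                             dp[i - 1][j - 1] + cost)   # substitution
      return dp[len(ref)][len(hyp)] / max(len(ref), 1)

  print(wer("hello world", "hello word"))  # 0.5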

See releases for pretrained models.

Contributing

Feel free to open an issue, create a pull request, or join the Discord.

You may also contribute by training a large model for longer.