a27f22ceaf95412710a395d0bd5655cd1ad15c40
Three changes to reduce voice-note transcription latency on the VPS:

- Model: large-v3 -> distil-large-v3 (~6x faster, near-identical English accuracy; language is already hardcoded "en").
- beam_size: 5 (default) -> 1 (~3-4x faster on clean audio).
- cpu_threads: 8 -> 4 (the box has 8 cores running api, dreamer, watcher, nextcloud concurrently; ctranslate2's inter-op pool plus context switching makes 4 effectively faster than 8 here).

Combined effect expected ~10-15x over prior config. No accuracy regression expected for the voice-note use case (English, clean audio, domain terms already supplied via initial_prompt).
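For reference, the three settings could be wired up roughly as follows. This is a minimal sketch, assuming the transcriber uses the faster-whisper package (the message only names ctranslate2); `TranscribeConfig`, `OLD`, `NEW`, `changed_fields`, and `transcribe` are hypothetical names for illustration, not the project's actual code.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TranscribeConfig:
    """Hypothetical container for the settings touched by this change."""
    model_name: str
    beam_size: int
    cpu_threads: int
    language: str = "en"  # already hardcoded, per the commit message


# Values before and after, taken from the commit message.
OLD = TranscribeConfig(model_name="large-v3", beam_size=5, cpu_threads=8)
NEW = TranscribeConfig(model_name="distil-large-v3", beam_size=1, cpu_threads=4)


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every setting that differs."""
    before, after = asdict(old), asdict(new)
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


def transcribe(path, cfg, initial_prompt=None):
    """Transcribe one audio file with the given settings.

    Assumes faster-whisper is installed; it is imported lazily so the
    config helpers above work without it.
    """
    from faster_whisper import WhisperModel

    model = WhisperModel(cfg.model_name, device="cpu", cpu_threads=cfg.cpu_threads)
    segments, _info = model.transcribe(
        path,
        language=cfg.language,
        beam_size=cfg.beam_size,
        initial_prompt=initial_prompt,
    )
    # segments is a lazy generator; joining it drives the actual decoding.
    return " ".join(seg.text.strip() for seg in segments)
```

To check the expected speedup on the box itself, time `transcribe(path, OLD)` against `transcribe(path, NEW)` on the same voice note while the other services are running, since the cpu_threads change only pays off under that contention.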