Hacker Timesnew | past | comments | ask | show | jobs | submitlogin
Show HN: LLM-Companion: push-to-talk + TTS web chat app for OpenAI-like APIs (github.com/lxe)
4 points by lxe on Jan 11, 2024 | hide | past | favorite | 3 comments
llm-companion is a rather fast and high-quality (I think) TTS push-to-talk chat interface with OpenAI-compatible LLM APIs, such as OpenAI, Oobabooga, vllm, llama.cpp, etc.

It uses Whisper for transcription (local), StyleTTS2 for TTS (also local).

Here's a little demo video if it in action: https://twitter.com/lxe/status/1745348827983560991

It uses streaming for both LLM results and TTS, which significantly decreases the interaction latency as compared to something like ChatGPT Voice mode.

The goal was to create a fully self-contained locally-running AI chat app, and this is the result.

Enjoy!



Not only is the demo funny, but this worked, surprisingly, as advertised. Had to restart the environment a few times for some reason. Not sure I understand the authors security concerns, but this is a fantastic early implementation.


Extremely impressive. Really great work. What’s next on the roadmap?


Thanks! Need to really redo the client-side code. Also it's very hacky with how microphone permission and lock works on ios, so unfortunately it might have to be a mobile native app at some point.

I also need to host this in a manner that doesn't require people entering their OpenAI tokens, which I usually find a poor experience.

I'm probably going to wire this up to a cheap/free endpoint hosting a Mistral model by some AI host, dozens of which have recently popped up.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: