• 3 Posts
  • 216 Comments
Joined 1 year ago
cake
Cake day: April 27th, 2024

help-circle



















  • Thanks, had not heard of this before! From skimming the link, it seems that the integration with HASS mostly focuses on providing wyoming endpoints (STT, TTS, wakeword), right? (Un)fortunately, that’s the part that’s already working really well 😄

    However, the idea of just writing a stand-alone application with Ollama-compatible endpoints, but not actually putting an LLM behind it is genius, I had not thought about that. That could really simplify stuff if I decide to write a custom intent handler. So, yeah, thanks for the link!!


  • Thanks for your input! The problem with the LLM approach for me is mostly that I have so many entities, HASS exposing them all (or even the subset of those I really, really want) is already big enough to slow everything to a crawl, and to get bad results from all models I’ve tried. I’ll give the model you mentioned another shot though.

    However, I really don’t want to use an LLM for this. It seems brittle and like overkill at the same time. As you said, intent classification is a wee bit older than LLMs.

    Unfortunately, the sentence template matching approach alone isn’t sufficient, because quite frequently, the STT is imperfect. With HomeAssistant, currently the intent “turn off all lights” is, for example, not understood if STT produces “turn off all light”. And sure, you can extend the template for that. But what about

    • turn of all lights
    • turn off wall lights
    • turnip off all lights
    • off all lights
    • off all fights

    A human would go “huh? oh, sure, I’ll turn off all lights”. An LLM might as well. But a fuzzy matching / closest Levensthein distance approach should be more than sufficient for this, too.

    Basically, I generally like the sentence template approach used by HASS, but it just needs that little bit of additional robustness against imperfections.