Try BentoPDF if you haven't / are unhappy with StirlingPDF

smiletolerantly@awful.systems · 1 day ago

Actually… Just tried it. I am on 2025.10, so newer than what was mentioned there. It still does not understand any better than from what I remember. Bummer.

But hey, at least the acknowledge that there’s the need for something between dumb pattern matching and an LLM.

smiletolerantly@awful.systems · 1 day ago

Holy shit YES!

That article is from yesterday, and the relevant section is: https://www.home-assistant.io/blog/2025/10/22/voice-chapter-11/#improved-sentence-matching

Awesome to see improvements there. Thanks a lot for linking!

smiletolerantly@awful.systems · 2 days ago

Oh wow, awesome!

smiletolerantly@awful.systems · 2 days ago

Sounds like you never wanted to be friends, just fuck them.

smiletolerantly@awful.systems · 2 days ago

I’m almost sorry to be blunt, but…

Women don’t want to be chased. You’re a misogynist who has reduced women to “people to have sex with” in your own mind.

If you change that, you gain the possibility of actual connections, including intimacy, with them; if you don’t, you don’t. Either way it’s up to you if you want the status quo to continue or to improve.

smiletolerantly@awful.systems · 2 days ago

Thank you for your sacrifice :D

smiletolerantly@awful.systems · 2 days ago

While I don’t like it, it’s not hidden either:

https://bentopdf.com/privacy.html

There should definitely be an option to disable this for self-hosting, but if it’s just a counter for how often each tool is used by all users combined… Eh…

(Stirling also has something similar)

smiletolerantly@awful.systems · 2 days ago

Prisoner Of War:

smiletolerantly@awful.systems · 2 days ago

Why not open a PR to make it configurable? The maintainer is super active and friendly.

smiletolerantly@awful.systems · 2 days ago

well… at least he realizes that was bullshit…?

smiletolerantly@awful.systems · 2 days ago

This is all your fault for sending those dmails :/

smiletolerantly@awful.systems · 2 days ago

Depends - was the assault comment directed at assailants or victims?

smiletolerantly@awful.systems · 2 days ago

Thanks for the recommendation! That looks interesting indeed.

This entire topic is probably a sinkhole of complexity. It’s great to have somewhere to look for inspiration!

smiletolerantly@awful.systems · 2 days ago

Yeah those are good points. Also noticed the CDN thing, it’s a bit annoying for a privacy-first project… But should be an easy fix 😄

Stirling’s backend is Java. So, yeah, heavy and slow sounds about right.

smiletolerantly@awful.systems · edit-2 2 days ago

The one exception here: it’s great to have it installed on your parents’ PC when you’re the one doing the update once in a while when you are around. Rock solid in between, no nagging, and if something did break, easy to roll back.

smiletolerantly@awful.systems · 3 days ago

Ah, thanks for mentioning. Yep, they have a docker image; as mentioned, a nixpkg will be available soonTM; and frankly, you can just build / download the release artifacts and put them on any static host.

smiletolerantly@awful.systems · 3 days ago

Try BentoPDF if you haven't / are unhappy with StirlingPDF

smiletolerantly@awful.systems · 4 days ago

Please read the title of the post again. I do not want to use an LLM. Selfhosted is bad enough, but feeding my data to OpenAI is worse.

smiletolerantly@awful.systems · 4 days ago

Yep, that’s the idea! This post basically boils down to “does this exist for HASS already, or do I need to implement it?” and the answer, unfortunately, seems to be the latter.

smiletolerantly@awful.systems · edit-2 4 days ago

Thanks, had not heard of this before! From skimming the link, it seems that the integration with HASS mostly focuses on providing wyoming endpoints (STT, TTS, wakeword), right? (Un)fortunately, that’s the part that’s already working really well 😄

However, the idea of just writing a stand-alone application with Ollama-compatible endpoints, but not actually putting an LLM behind it is genius, I had not thought about that. That could really simplify stuff if I decide to write a custom intent handler. So, yeah, thanks for the link!!

smiletolerantly@awful.systems · 4 days ago

Thanks for your input! The problem with the LLM approach for me is mostly that I have so many entities, HASS exposing them all (or even the subset of those I really, really want) is already big enough to slow everything to a crawl, and to get bad results from all models I’ve tried. I’ll give the model you mentioned another shot though.

However, I really don’t want to use an LLM for this. It seems brittle and like overkill at the same time. As you said, intent classification is a wee bit older than LLMs.

Unfortunately, the sentence template matching approach alone isn’t sufficient, because quite frequently, the STT is imperfect. With HomeAssistant, currently the intent “turn off all lights” is, for example, not understood if STT produces “turn off all light”. And sure, you can extend the template for that. But what about

turn of all lights
turn off wall lights
turnip off all lights
off all lights
off all fights
…

A human would go “huh? oh, sure, I’ll turn off all lights”. An LLM might as well. But a fuzzy matching / closest Levensthein distance approach should be more than sufficient for this, too.

Basically, I generally like the sentence template approach used by HASS, but it just needs that little bit of additional robustness against imperfections.

smiletolerantly@awful.systems · 4 days ago

Intent recognition for HomeAssistant without an LLM?

smiletolerantly@awful.systems · 11 days ago

What one book or piece of literature would adapt into a movie/TV series if given the funding and full creative control? Why?