The difference is that work done by a contracted tradesperson typically comes with some sort of guarantee, e.g. 2 years for work done in your home (up to 5 for bigger construction-type work), at least here in Germany, which you don't (need to) factor in when DIY-ing.
> that they're unable to [manage and] kill child processes they themselves spawn makes it seem like they have zero clue about what they're doing.
Yeah, at the bare minimum these projects could also use something like portless[1] which literally maps ports to human- (and language model-)readable, named .localhost URLs.
That _should_ greatly simplify matching processes to projects and vice versa, since hard-to-remember port numbers leave the equation entirely. You could even prefix the names if you've got that much going on, for the ultimate "overview": project1-db.localhost, project1-dev.localhost, etc.
Well, or just use port 0 like we've done for decades, read back which port got assigned, then use that. No more port collisions ever. I thought most people were aware of that by now, but judging from that project even existing, it seems I was wrong.
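For anyone unfamiliar, a minimal sketch of the port-0 trick in Python (the same mechanism exists in basically every socket API):

```python
import socket

# Bind to port 0 and the OS assigns any free ephemeral port,
# which we then read back instead of hard-coding one.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))      # 0 = "pick any free port for me"
port = sock.getsockname()[1]     # read back the assigned port
sock.listen()
print(f"listening on 127.0.0.1:{port}")
```

The server then advertises `port` however it likes (stdout, a pidfile, service discovery); two instances can never collide because the kernel only hands out free ports.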
That’s a little different, right? Using port 0 would imply that clients haven’t hard-coded which port they should connect to, and also that we don’t mind duplicate processes occupying other ports that are no longer in active use.
Interesting article you’ve linked. I’m not sure I agree, but it was a good read and food for thought in any case.
Work is still being done on how to bulletproof input “sanitization”. Research like [1] is what I love to discover, because it’s genuinely promising. If you can formally separate out the “decider” from the “parser” unit (in this case, by running two models), together with a small allowlisted set of tool calls, it might just be possible to get around the injection risks.
Sanitization isn’t enough. We need a way to separate code and data (not just to sanitize out instructions from data) that is deterministic. If there’s a “decide whether this input is code or data” model in the mix, you’ve already lost: that model can make a bad call, be influenced or tricked, and then you’re hosed.
At a fundamental level, having two contexts as suggested by some of the research in this area isn’t enough; errors or bad LLM judgement can still leak things back and forth between them. We need something like an SQL driver’s injection prevention: when you use it correctly, code/data confusion cannot occur since the two types of information are processed separately at the protocol level.
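To illustrate the SQL analogy above, a minimal sketch using Python's sqlite3 parameterized queries: the query text (code) and the value (data) are passed through separate channels, so the value can never be re-parsed as SQL, regardless of what it contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# A classic injection payload, passed as a bound parameter
# rather than spliced into the query string.
malicious = "Robert'); DROP TABLE users;--"
conn.execute("INSERT INTO users (name) VALUES (?)", (malicious,))

# The payload was stored verbatim as data; it was never executed,
# and the users table still exists.
row = conn.execute("SELECT name FROM users").fetchone()
print(row[0])
```

The separation is structural, not a judgment call: no model or heuristic decides whether the input "looks like" code. That is the property LLM pipelines currently lack.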
The linked article isn't describing a form of input sanitization, it's a complete separation between trusted and untrusted contexts. The trusted model has no access to untrusted input, and the untrusted model has no access to tools.
That’s still only as good as the ability of the trusted model to delineate instructions from data. The untrusted model will inevitably be compromised so as to pass bad data to the trusted model.
I have significant doubts that a P-LLM (as in the CaMeL paper) operating a programming-language-like instruction set with “really good checks” is sufficient to avoid this issue. If it were, the P-LLM could be replaced with a deterministic tool call.
They’d probably get the farthest, but they won’t pursue that because they don’t want to end up leaking the original data from training.
It has been shown possible to reconstruct large consecutive chunks of training data from the regular language/text portions of models [1], so it ought to be possible for their internal code, too.
Copyright for me but not for thee? :) That's a good point though. Maybe they could round-trip things? E.g., use the model trained only on internal content to generate training data (screened to remove anything they don't want leaking), then train a new model off just that?
I agree, this is inherently unsafe. The two core security issues for agents, I’d say, are LLMs not producing a “deterministic” outcome, and prompt injection.
Prompt injection is _probably_ solvable if something like [1] ever finds a mainstream implementation and adoption. But agents not being deterministic, as in “do not only what I’ve told you to do, but also how I meant it” (all while assuming perfect context retention), is a waaay bigger issue. If we ever got that, software development as a whole would be solved outright, too.
On the other hand, that one same engine would then be under near-full control of a single company (Google), with all the disadvantages a monopoly usually brings.
I'm not the founder nor Kagi employee, just a customer, but
> Can you describe or offer any insight into the "significant IP" that you need to protect and defend?
The novel IP is having implemented and still implementing the browser APIs necessary for both Firefox and Chromium extensions to work in a Safari (Webkit)-based browser. See [1] for the significant progress.
> What threats from a larger company are you primarily concerned about?
Integrating said functionality themselves to offer another viable iOS browser, of which Kagi is currently the only [2] provider (or another viable macOS/future Linux/Windows browser, although more than one already exists there).
[2] Unless the EU steps up, all iOS browsers will continue to have to be Webkit-based with minimal, lackluster extension support. Not viable for anything beyond the most basic of use cases.
Regarding your first paragraph, I've even talked with people who go out of their way to actively _avoid_ said product after encountering AI-generated advertising.
So that'll probably continue to have an effect for as long as average people with good eyes can still distinguish "AI"/generative media from "real"/traditional footage.
I have observed this as well, and we've already seen some pushback when major brands use AI in their creative. I wonder if we're entering an era where AI will actually taint a brand.