I for one welcome everyone to the tarpit, where a normal person gets treated as a robot and trapped in an endless poison pit. It sounds like a Black Mirror episode.
I have integrated Stratum's columnar indices as a secondary index in the new query engine of https://github.com/replikativ/datahike itself, so for numerical data you will be able to use Datalog/SQL for combined (OLTP, OLAP, ...) processing. The same goes for proximum (persistent HNSW vector index) and scriptum (persistent Lucene).
As far as I have tested, Stratum can already be updated online via copy-on-write, with better write throughput than purely columnar alternatives (Stratum uses a persistent B-tree over column chunks). I have not compared it in benchmarks yet, though; DuckDB, for instance, recommends against online updates. It depends on the workload: if you do random-access writes, the overhead of the columnar layout will still be a slowdown compared to OLTP/Datahike's row/entity-wise indices. Storing fully variable strings in a column is also inefficient; for those you want the entity-wise indices.
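To illustrate the copy-on-write idea over column chunks, here is a minimal Python sketch. It is not Stratum's implementation: `CHUNK_SIZE`, `make_column`, `cow_set` and `column_sum` are hypothetical names, and a flat tuple of chunks stands in for the persistent B-tree (a real tree would copy only the root-to-leaf path instead of the whole top-level tuple).

```python
CHUNK_SIZE = 4  # hypothetical chunk size, kept tiny for illustration


def make_column(values):
    """Split values into immutable fixed-size chunks."""
    return tuple(
        tuple(values[i:i + CHUNK_SIZE])
        for i in range(0, len(values), CHUNK_SIZE)
    )


def cow_set(column, index, value):
    """Return a new column version; only the touched chunk is copied."""
    chunk_no, offset = divmod(index, CHUNK_SIZE)
    old_chunk = column[chunk_no]
    if offset >= len(old_chunk):
        raise IndexError("index past end of column")
    new_chunk = old_chunk[:offset] + (value,) + old_chunk[offset + 1:]
    # Untouched chunks are shared by reference between both versions.
    return column[:chunk_no] + (new_chunk,) + column[chunk_no + 1:]


def column_sum(column):
    """OLAP-style scan: aggregate chunk by chunk."""
    return sum(sum(chunk) for chunk in column)


v1 = make_column([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
v2 = cow_set(v1, 5, 60.0)  # v1 stays readable while v2 is the new version
```

Readers holding `v1` keep a consistent snapshot, while the write only pays for copying one chunk. A random-access write still copies a whole chunk to change one value, which is the layout overhead mentioned above.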
At the Ise Jingu, the shrine is not built to last; it is built to be reconstructed from scratch every twenty years.
If we want our systems to last, we would need the "process knowledge"—the actual mastery of the craft—to be in human hands rather than decaying in a dead system.
I don't think we can afford to process-knowledge-transfer many of our essential systems... without machine assistance.
Most of the platforms were successfully petitioned to make the Rust SDK a mandatory addition so that Rust code can be added to the platforms. Previously, Rust was not allowed because the external dependency on the Rust SDK was blocked.
Note that Rust's lack of a stable API has not been fixed, so I think each platform has a bunch of internal systems to hard-lock the Rust dependencies across multiple Rust users.
There's some friction between platform packagers and shipping the code exactly as its author wrote it.
Why not directly have the llm write ISA assembly. We're still grading based on results / theory proofs, and, for example, certifications for government cryptographic use are based on the binaries and not the sources.
Edited:
Why not go further and print the chips and PCBs directly via 3D printing with LLM instructions?
Edited (joke):
Why not go the furthest and turn the entire earth into a computer and grey goo?
> Why not directly have the llm write ISA assembly.
The honest, earnest answer is that it's a bad idea because it is not portable. Unfortunately for Intel, they don't have the dominance they once did, so you have to pick between ARM, x86, or something more exotic, and then be tied to that specific ISA. It's an interesting thought, though.
My approach for a game sandbox for user-generated content, https://github.com/libriscv/godot-sandbox, was to experiment with standardizing on a 64-bit RISC-V Linux ISA.
Bellard's https://bellard.org/jslinux/ is notable for this approach, where you write a RISC-V execution layer and then write a Windows / Linux / DOS etc. emulator on top.
We have different IR backends to make it portable, btw: LLVM, QBE, etc.
In the repo I linked, I am goofing around with LLVM specifically, to actually figure it out.
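As a toy illustration of why an IR helps with portability, here is a minimal Python sketch in which one program is lowered to two targets. This is not how LLVM or QBE work internally: the two-op IR and the `lower_x86_64` / `lower_riscv64` functions are hypothetical, and the sketch ignores immediate ranges (for example, RISC-V `addi` only takes a 12-bit signed immediate).

```python
# A tiny hypothetical IR: ("const", n) sets the result register,
# ("add", n) adds an immediate to it.

def lower_x86_64(ir):
    """Lower the toy IR to x86-64 assembly (Intel syntax), result in rax."""
    out = []
    for op, n in ir:
        if op == "const":
            out.append(f"mov rax, {n}")
        elif op == "add":
            out.append(f"add rax, {n}")
        else:
            raise ValueError(f"unknown IR op: {op}")
    return out


def lower_riscv64(ir):
    """Lower the same toy IR to RISC-V assembly, result in a0."""
    out = []
    for op, n in ir:
        if op == "const":
            out.append(f"li a0, {n}")
        elif op == "add":
            out.append(f"addi a0, a0, {n}")
        else:
            raise ValueError(f"unknown IR op: {op}")
    return out


program = [("const", 40), ("add", 2)]
x86_lines = lower_x86_64(program)
riscv_lines = lower_riscv64(program)
```

The generator (human or LLM) only has to get `program` right once; each backend owns the ISA-specific details.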
I think you got the point. I am under the impression that we are going nowhere with this approach of treating token generation as a solution for everything. For some reason it feels like brute force to me, but the stakes right now are too high, so stepping back is not an option for the bigger players.
Maaaybe, just mayyybe, there are not enough examples of valid assembly in the training data? And spitting out something that was fed in by scraping OSS repos is an easy way for glorified autocomplete machinery to impress?
I thought the latest advance in computing (spring 2025, last year) was self-play / reinforcement learning. We ran out of training data a few years ago.
That is, reinforcement learning where the large language model devises puzzles that it then solves, scored via LLM-as-judge.
LLM-as-judge means your LLM generates 8-12 trajectories and a different LLM judges the results. For the problem of ISA assembly creation, I'd use an oracle such as actual execution on a Windows or Linux operating system.
The winning entries are used to train the large language model.
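The loop described above can be sketched roughly like this in Python. Everything here is an assumption for illustration: the hard-coded `candidates` list stands in for sampled LLM trajectories, and `run_toy_asm` interprets a made-up one-register ISA as a stand-in for real execution on an operating system.

```python
def run_toy_asm(program):
    """Oracle: execute a toy one-register ISA (LOAD n, ADD n, MUL n).

    Returns the final accumulator, or None if the program is invalid.
    """
    acc = 0
    for line in program.strip().splitlines():
        parts = line.split()
        if len(parts) != 2:
            return None
        op, arg = parts
        try:
            n = int(arg)
        except ValueError:
            return None
        if op == "LOAD":
            acc = n
        elif op == "ADD":
            acc += n
        elif op == "MUL":
            acc *= n
        else:
            return None  # unknown opcode: the oracle rejects the program
    return acc


def select_winners(candidates, expected):
    """Keep only the candidates whose execution result matches the target."""
    return [c for c in candidates if run_toy_asm(c) == expected]


# Stand-in for the 8-12 sampled trajectories (hard-coded here).
candidates = [
    "LOAD 6\nMUL 7",
    "LOAD 40\nADD 2",
    "LOAD 6\nADD 7",   # runs, but gives the wrong result
    "LOAD 6\nDIV 7",   # invalid opcode in this toy ISA
]
winners = select_winners(candidates, expected=42)
```

In this sketch `winners` is what would be fed back as training data. The point of an execution oracle over a judging LLM is that the pass/fail signal cannot be talked into accepting a wrong program.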