Related: Pure Ruby Relational Algebra Engine (github.com/seansellek)
102 points by flipandtwist on May 23, 2016 | 29 comments


Oooh, this is nice. While learning ML, I was stumbling a bit with Python, wishing that Ruby were a good language for working with datasets.

For all the faults people find with Ruby's performance, I don't think much can be found wrong with its syntax; it is such a wonderful language to code with.

Might I suggest aliasing `<<` to `add_tuple`?
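To make the suggestion concrete, here is a minimal sketch of such an alias. `SketchRelation` is a hypothetical stand-in written for illustration, not the library's actual class; only the `add_tuple` name comes from this thread.

```ruby
require "set"

# Hypothetical stand-in for a relation class; not Related's actual code.
class SketchRelation
  attr_reader :tuples

  def initialize
    @tuples = Set.new
  end

  # Adds one tuple (a Hash of attribute => value) and returns self,
  # so calls can be chained.
  def add_tuple(tuple)
    @tuples << tuple
    self
  end

  # `<<` becomes another name for add_tuple.
  alias_method :<<, :add_tuple
end

people = SketchRelation.new
people << { name: "Ada", age: 36 } << { name: "Alan", age: 41 }
```

Because the tuples live in a `Set`, pushing the same tuple twice leaves the relation unchanged, which matches the set semantics of relations.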


Thank you! That was exactly my motivation for building this.

Aliasing that is a great suggestion. It's actually implemented here: https://github.com/seansellek/Related/blob/master/lib/relate...

I think I should document those aliases a bit better ;)


Ruby's syntax is one of the worst I've seen. Ambiguous, hard to parse, and inextensible.

http://programmingisterrible.com/post/42432568185/how-to-par...


It's awful to work on, and wonderful to work with. Awful for machines, great for humans.


I write Ruby for a living and I completely disagree. S-expressions are the gold standard.


I like that it doesn't have any dependencies. Makes it easier to read and understand the source, casually (one of my favorite pastimes during lunch breaks lately).


See also: Axiom[0] and rom-rb[1] (which used Axiom at one point, I believe), although I don't know what does or doesn't make them 'pure' relational algebra libraries.

[0] https://github.com/dkubb/axiom

[1] http://rom-rb.org/


Reminds me a bit of Project:M36 Relational Algebra Engine, written in Haskell. No nulls allowed.

https://github.com/agentm/project-m36

https://hackertimes.com/item?id=11465145


At Standard Chartered they are using relations as data structures in their Haskell-like language.

It works very well. (The biggest hurdle for a really nice integration is, IMHO, stuff like typing the join operator. But even the version that was only dynamically typed was useful.)


We have a typed version now - it's awesome


Yeah, I've heard the praise. I tried to rally Gergo into porting the idea to GHC. (Using the new pluggable constraint solvers in GHC seems like the best bet.)

I still remember when I didn't have a clue about functional programming, and I was moving from stuff like QBasic and C to Python. I thought dicts were awesome. And they are, by comparison to only having arrays.

Relations do everything dicts do, but you don't have to decide on the structure beforehand. Exactly the same argument that Codd had.
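A rough sketch of that argument, in Ruby since that's the language of the thread. The data and the `where` helper are made up for illustration; they are not from any library.

```ruby
require "set"

# A dict commits to one lookup key up front.
ages_by_name = { "Ada" => 36, "Alan" => 41 }

# The same facts as a relation: a set of tuples, no privileged key.
people = Set[
  { name: "Ada", age: 36 },
  { name: "Alan", age: 41 }
]

# Hypothetical helper: selection by any combination of attributes.
def where(relation, conditions)
  relation.select { |t| conditions.all? { |attr, v| t[attr] == v } }.to_set
end

by_name = where(people, name: "Ada")  # the lookup the dict was built for
by_age  = where(people, age: 41)      # a lookup the dict was not built for
```

The dict can answer "how old is Ada?" directly, but "who is 41?" needs a scan or a second dict. The relation treats both questions the same way.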


Very nice. I'm not a Ruby person, but I'm glad to see someone flying the relational flag.


This is neat. It's nice to see learning tools like this in a language like Ruby, as many Ruby codebases that implement relational algebra are pretty complicated (see ActiveRecord/ARel and Sequel).


How does this compare to a pure, recursion-free subset of Prolog? Not counting cosmetic differences, e.g., in Prolog, relation arguments are positional, whereas in the relational model, attributes have names.


Prolog allows unification, which doesn't have a parallel in relational algebra. I'm not sure how much more power this gives you without recursion, though.

If you're interested in connections between relational algebra (and SQL, and databases) and logic programming, you should look at Datalog, which is a restricted subset of Prolog that is akin to relational algebra plus fixed points (transitive closure, for example). In particular, Datalog forbids compound terms (like lists) and recursion in a negated position.

I think Datalog without recursion/fixed-points is precisely as powerful as relational algebra, but I don't have a proof handy.
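To illustrate the "relational algebra plus fixed points" idea, here is a small sketch of transitive closure computed by repeating a join, a projection, and a union until nothing new appears. The edge data and method names are assumptions for illustration only.

```ruby
require "set"

# A binary relation as a set of [from, to] pairs (hypothetical data).
edges = Set[[1, 2], [2, 3], [3, 4]]

# One step: join reach with edges on the shared middle value, then project
# onto the outer pair. These are ordinary relational algebra operations.
def compose(reach, edges)
  reach.flat_map { |a, b|
    edges.select { |c, _| c == b }.map { |_, d| [a, d] }
  }.to_set
end

# Least fixed point: keep unioning in new pairs until nothing changes.
def transitive_closure(edges)
  reach = edges
  loop do
    bigger = reach | compose(reach, edges)
    return bigger if bigger == reach
    reach = bigger
  end
end

closure = transitive_closure(edges)
```

Each pass through the loop is expressible in plain relational algebra; it is the "repeat until stable" part that needs the fixed point.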


First-order logic can express precisely what relational algebra can express. Datalog corresponds to first-order logic with a least fixed point; if there is a linear order on the data (cells), it can express the problems solvable in polynomial time (P).

https://en.wikipedia.org/wiki/Descriptive_complexity_theory


Computing natural joins requires unification.


Natural joins only require equality testing. General unification is stronger. Prolog can unify [X,3|T] with [2,Y,4,5] to find X = 2, Y = 3, T = [4,5], for example.
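A minimal sketch of the first point: a natural join over arrays of hashes that uses nothing but `==` on the shared attributes. The helper and the sample data are hypothetical.

```ruby
# Hypothetical helper: natural join of two relations given as arrays of hashes.
# Assumes every tuple in a relation has the same attributes.
def natural_join(left, right)
  return [] if left.empty? || right.empty?
  shared = left.first.keys & right.first.keys
  left.flat_map do |l|
    # Only equality tests on the shared attributes; no unification needed.
    right.select { |r| shared.all? { |k| l[k] == r[k] } }
         .map { |r| l.merge(r) }
  end.uniq
end

employees = [
  { name: "Ada", dept_id: 1 },
  { name: "Alan", dept_id: 2 }
]
departments = [
  { dept_id: 1, dept: "Research" },
  { dept_id: 3, dept: "Sales" }
]

joined = natural_join(employees, departments)
```

Here only Ada's tuple has a `dept_id` that matches a department, so the join yields a single combined tuple.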


What you're talking about requires compound data types in the first place (e.g., lists in your example), whereas in the relational model, attributes are normally assumed to range over primitive types.

But, even in the absence of compound data types, the answer to a Prolog query may contain free variables, whereas the answer to an SQL query may not. So you do have a point.


You mean Datalog?

(You don't need to be recursion free. Just restrict your recursion very carefully.)


It was intended to be a slightly snarky remark: “Oh, you tell me you implemented a subset of what Prolog can already do out of the box?” But apparently, it wasn't interpreted by others as such. :-|


Oh, restriction to a carefully chosen subset can be a feature all by itself.

E.g., for data interchange, compare JSON with evaluating an arbitrary JavaScript expression. The latter is strictly more powerful than the former. The former is better.
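A Ruby analogue of that comparison, as a sketch: `JSON.parse` from the standard library only ever builds data, while `eval` on the same input would run whatever it is given. The payload is made up.

```ruby
require "json"

payload = '{"name": "Ada", "tags": ["math", "engines"]}'
data = JSON.parse(payload) # data only: hashes, arrays, strings, numbers

# Anything that is not plain data is rejected instead of being run.
rejected = begin
  JSON.parse("1 + 1")
  false
rescue JSON::ParserError
  true
end
```

The parser can do less than an evaluator, and that is the point: the receiver knows the input cannot do anything except describe data.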

Or an example in the other direction: imagine Haskell-Prime, like Haskell, but you can mutate every variable. Strictly more powerful, but awful. (Or imagine adding GOTO to your favourite programming language.)


> Strictly more powerful, but awful.

Wrong. The power of Haskell mostly comes from how easy it is to reason about code. Your proposed Haskell-Prime would be less powerful in this sense.


I think we are in violent agreement.


Excellent piece of work.


I'm a big fan of Ruby, but doesn't saying "pure Ruby" kind of detract from, rather than improve, a library's value? Ideally it would have Rust/C bindings for all the computationally difficult stuff.


The reason I said that is because a lot of the other libraries I found, like Arel or Sequel, just build off of SQL to interact with databases. And the problem with that is that SQL is a terrible way to get a handle on relational algebra. It is entirely based on RA operations, yes, but it is so high-level you barely feel connected to what's actually happening.

Not good when you're trying to learn basic operations and how they interact with each other.

That's the only reason I said "pure". Wasn't really coming from a point of view of performance or compatibility.
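As a sketch of what working with the operations directly can look like, here are selection and projection written as plain functions over a set of tuples. The helper names are hypothetical and are not Related's API.

```ruby
require "set"

# Hypothetical helpers, not Related's API.
# Selection (sigma): keep the tuples that satisfy a predicate.
def select_tuples(relation, &predicate)
  relation.select(&predicate).to_set
end

# Projection (pi): keep only the named attributes; the set drops duplicates.
def project(relation, *attrs)
  relation.map { |t| t.slice(*attrs) }.to_set
end

people = Set[
  { name: "Ada",  age: 36, city: "London" },
  { name: "Alan", age: 41, city: "London" }
]

# Roughly: SELECT DISTINCT city FROM people WHERE age > 30
cities = project(select_tuples(people) { |t| t[:age] > 30 }, :city)
```

Written this way, it is visible that the query is a projection applied to a selection, and that duplicates disappear because the result is a set.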


There are other good reasons to be "pure Ruby", such as educational value: https://hackertimes.com/item?id=11757009

And also, while not terribly relevant in this specific case, deployment is easier if you don't have to build native libraries.

Sometimes it's worth going to C bindings for runtime speed, sometimes it's better to keep it purely native for development speed or other reasons.


Why? I hate libs with native compilation. `bundle install` doesn't always "just work" then... now you're somewhat bound to your compiler toolchain and whatever platforms your native code is compatible with (which oftentimes doesn't seem to include Windows, without a ton of extra effort).



