Related: Pure Ruby Relational Algebra Engine

mmanfrin · on May 23, 2016

Oooh, this is nice. While learning ML, I was stumbling a bit with Python, wishing that Ruby were a good language for working with datasets.

For all the faults people find with Ruby's performance, I don't think much can be found wrong with its syntax; it is such a wonderful language to code with.

Might I suggest aliasing `<<` to `add_tuple`?
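For anyone unfamiliar with the idiom, here is a minimal sketch of what such an alias could look like. `TinyRelation` is a hypothetical stand-in class, not the library's actual code; only the `add_tuple` name comes from this thread.

```ruby
require 'set'

# Hypothetical stand-in for a relation class; not the library's real implementation.
class TinyRelation
  attr_reader :tuples

  def initialize
    @tuples = Set.new
  end

  # Adds one tuple (a Hash of attribute => value) and returns self
  # so that calls can be chained.
  def add_tuple(tuple)
    @tuples << tuple
    self
  end

  # Make `relation << tuple` behave exactly like `relation.add_tuple(tuple)`.
  alias_method :<<, :add_tuple
end

people = TinyRelation.new
people << { name: "Ada" } << { name: "Grace" }
```

Returning `self` from `add_tuple` is what makes the chained `<<` calls work, in the same way `Array#<<` does.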

flipandtwist · on May 23, 2016

Thank you! That was exactly my motivation for building this.

Aliasing that is a great suggestion. It's actually implemented here: https://github.com/seansellek/Related/blob/master/lib/relate...

I think I should document those aliases a bit better ;)

davexunit · on May 24, 2016

Ruby's syntax is one of the worst I've seen. Ambiguous, hard to parse, and inextensible.

http://programmingisterrible.com/post/42432568185/how-to-par...

regularfry · on May 24, 2016

It's awful to work on, and wonderful to work with. Awful for machines, great for humans.

davexunit · on May 24, 2016

I write Ruby for a living and I completely disagree. S-expressions are the gold standard.

joemi · on May 23, 2016

I like that it doesn't have any dependencies. Makes it easier to read and understand the source, casually (one of my favorite pastimes during lunch breaks lately).

lukeholder · on May 23, 2016

See also: Axiom[0] and rom-rb[1] (which did use Axiom, I believe), although I don't know what does or doesn't make them 'pure' relational algebra libraries.

[0] https://github.com/dkubb/axiom

[1] http://rom-rb.org/

kristianp · on May 24, 2016

Reminds me a bit of Project:M36 Relational Algebra Engine, written in Haskell. No nulls allowed.

https://github.com/agentm/project-m36

https://hackertimes.com/item?id=11465145

eru · on May 24, 2016

At Standard Chartered they are using relations as data structures in their Haskell-like language.

It works very well. (The biggest hurdle for a really nice integration is IMHO stuff like typing the join operator. But even the version that was only dynamically typed was useful.)

dons · on May 24, 2016

We have a typed version now - it's awesome

eru · on May 24, 2016

Yeah, I've heard the praise. I tried to rally Gergo into porting the idea to GHC. (Using the new pluggable constraint solvers in GHC seems like the best bet.)

I still remember when I didn't have any clue about functional programming, and I was moving from stuff like QBasic and C to Python. I thought dicts were awesome. And they are, by comparison to only having arrays.

Relations do everything dicts do, but you don't have to decide on the structure beforehand. Exactly the same argument that Codd had.
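A rough illustration of that point, using made-up data and a hypothetical `restrict` helper (neither comes from any library mentioned here): the dict commits to one lookup direction, while the relation can be filtered on whichever attribute you happen to know.

```ruby
require 'set'

# A dict fixes the direction of lookup up front: name -> language.
language_of = { "matz" => "Ruby", "guido" => "Python" }

# The same facts as a relation: a set of tuples with named attributes.
authors = Set[
  { name: "matz",  language: "Ruby" },
  { name: "guido", language: "Python" }
]

# Hypothetical helper: keep the tuples whose attributes match `conditions`.
def restrict(relation, conditions)
  relation.select { |t| conditions.all? { |attr, value| t[attr] == value } }.to_set
end

by_name     = restrict(authors, name: "matz")        # same question the dict answers
by_language = restrict(authors, language: "Python")  # the reverse question, no new index needed
```

With the dict, the second query would need either a scan over the values or a second, inverted dict that was planned in advance.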

cmrdporcupine · on May 23, 2016

Very nice. I'm not a Ruby person, but I'm glad to see someone flying the relational flag.

d4mi3n · on May 23, 2016

This is neat. It's nice to see learning tools like this in a language like Ruby, as many relational algebra implementations in Ruby are pretty complicated (see ActiveRecord/ARel and Sequel).

catnaroek · on May 23, 2016

How does this compare to a pure, recursion-free subset of Prolog? Not counting cosmetic differences, e.g., in Prolog, relation arguments are positional, whereas in the relational model, attributes have names.

rntz · on May 23, 2016

Prolog allows unification, which doesn't have a parallel in relational algebra. I'm not sure how much more power this gives you without recursion, though.

If you're interested in connections between relational algebra (and SQL, and databases) and logic programming, you should look at Datalog, which is a restricted subset of Prolog that is akin to relational algebra plus fixed points (transitive closure, for example). In particular, Datalog forbids compound terms (like lists) and recursion in a negated position.

I think Datalog without recursion/fixed-points is precisely as powerful as relational algebra, but I don't have a proof handy.
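To make the "relational algebra plus fixed points" idea concrete, here is a small sketch of transitive closure computed by repeating a join and a union until nothing changes. The data and the helper names `compose` and `transitive_closure` are made up for illustration. It corresponds roughly to the Datalog rules `path(X,Y) :- edge(X,Y).` and `path(X,Z) :- path(X,Y), edge(Y,Z).`

```ruby
require 'set'

# A binary relation stored as a set of [from, to] pairs (illustrative data).
edges = Set[[:a, :b], [:b, :c], [:c, :d]]

# Hypothetical helper: join r and s where r's second column equals
# s's first column, then project onto the outer columns.
def compose(r, s)
  r.flat_map { |x, y| s.select { |y2, _| y2 == y }.map { |_, z| [x, z] } }.to_set
end

# Least fixed point: keep adding derived pairs until nothing new appears.
def transitive_closure(edges)
  closure = edges.dup
  loop do
    expanded = closure | compose(closure, edges)
    return closure if expanded == closure
    closure = expanded
  end
end

reachable = transitive_closure(edges)
```

Each pass through the loop is plain relational algebra (a join, a projection, a union); the loop itself is the part that a fixed, finite relational algebra expression cannot express for inputs of arbitrary size.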

yellowflash · on May 24, 2016

First-order logic can express precisely what relational algebra can express. What Datalog represents is first-order logic with a least fixed point; if there is a linear order on the data (cells), it can represent the problems solvable in polynomial time (P).

https://en.wikipedia.org/wiki/Descriptive_complexity_theory

catnaroek · on May 23, 2016

Computing natural joins requires unification.

rntz · on May 23, 2016

Natural joins only require equality testing. General unification is stronger. Prolog can unify [X,3|T] with [2,Y,4,5] to find X = 2, Y = 3, T = [4,5], for example.
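As a sketch of the first claim, here is a natural join over arrays of hashes that uses nothing beyond `==` on attribute values. The data and the `natural_join` helper are illustrative, not taken from any library discussed here.

```ruby
# Relations as arrays of hashes (illustrative data).
employees = [
  { name: "Ann", dept_id: 1 },
  { name: "Bob", dept_id: 2 }
]
departments = [
  { dept_id: 1, dept: "Sales" },
  { dept_id: 3, dept: "Legal" }
]

# Hypothetical helper: pair up tuples that agree on every shared attribute.
# Assumes every tuple in a relation has the same attributes as the first one.
def natural_join(r, s)
  return [] if r.empty? || s.empty?
  shared = r.first.keys & s.first.keys
  r.product(s)
   .select { |a, b| shared.all? { |k| a[k] == b[k] } }  # plain equality only
   .map { |a, b| a.merge(b) }
end

joined = natural_join(employees, departments)
```

No variables are ever bound and no terms are taken apart; the comparison is between two fully known values, which is why equality testing is enough.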

catnaroek · on May 24, 2016

What you're talking about requires compound data types in the first place (e.g., lists in your example), whereas in the relational model, attributes are normally assumed to range over primitive types.

But, even in the absence of compound data types, the answer to a Prolog query may contain free variables, whereas the answer to an SQL query may not. So you do have a point.

eru · on May 24, 2016

You mean Datalog?

(You don't need to be recursion free. Just restrict your recursion very carefully.)

catnaroek · on May 24, 2016

It was intended to be a slightly snarky remark: “Oh, you tell me you implemented a subset of what Prolog can already do out of the box?” But apparently, it wasn't interpreted by others as such. :-|

eru · on May 24, 2016

Oh, restriction to a carefully chosen subset can be a feature all by itself.

E.g., for data interchange, compare JSON vs. evaluating any random JavaScript expression. The latter is strictly more powerful than the former. The former is better.

Or an example in the other direction: imagine Haskell-Prime, like Haskell, but you can mutate every variable. Strictly more powerful, but awful. (Or imagine adding GOTO to your favourite programming language.)

catnaroek · on May 25, 2016

> Strictly more powerful, but awful.

Wrong. The power of Haskell mostly comes from how easy it is to reason about code. Your proposed Haskell-Prime would be less powerful in this sense.

eru · on May 25, 2016

I think we are in violent agreement.

jasonm23 · on May 24, 2016

Excellent piece of work.

_uhtu · on May 23, 2016

I'm a big fan of Ruby, but doesn't saying "Pure Ruby" kind of detract from rather than improve a library's value? Ideally it would have Rust/C bindings for all the computationally difficult stuff.

flipandtwist · on May 23, 2016

The reason I said that is because a lot of the other libraries I found, like Arel or Sequel, just build off of SQL to interact with databases. And the problem with that is that SQL is a terrible way to get a handle on relational algebra. It is entirely based on RA operations, yes, but it is so high-level you barely feel connected to what's actually happening.

Not good when you're trying to learn basic operations and how they interact with each other.

That's the only reason I said "pure". Wasn't really coming from a point of view of performance or compatibility.

semanticist · on May 23, 2016

There are other good reasons to be "pure Ruby", such as educational value: https://hackertimes.com/item?id=11757009

And also, while not terribly relevant in this specific case, deployment is easier if you don't have to build native libraries.

Sometimes it's worth going to C bindings for runtime speed, sometimes it's better to keep it purely native for development speed or other reasons.

tomc1985 · on May 24, 2016

Why? I hate libs with native compilation. `bundle install` doesn't always "just work" then... now you're somewhat bound to your compiler toolchain and whatever platforms your native code is compatible with (which oftentimes doesn't seem to include Windows, without a ton of extra effort).