Hacker Timesnew | past | comments | ask | show | jobs | submitlogin
SQL looks like English is a well intentioned error (github.com/splware)
22 points by Judyrabbit on March 7, 2024 | hide | past | favorite | 22 comments


I am no fan of the syntax of SQL, however the authors solution of making SQL more procedural is (imo) correctly identifying an issue but then reaching for the wrong tool to fix it.

When I am authoring SQL, I find the declarative and relational concepts extremely powerful and well suited to the problem I am solving.


I agree with you.

Changing SQL to a description of how you'll get the data would make a query incomprehensible to me.


As long as I am allowed to write keywords in capital case I am fine with the syntax. I kinda like it. It smells like the 70s.


I enjoy how it builds on previous success WITH its use of regex. There is nothing quite LIKE ideas from the 50's haunting you in your sleep.

   SELECT age FROM Customers WHERE Country='Mexico'
Could be written something like:

   table.customers.filter(c=>c.country == "Mexico").age
The filter is kinda ugly, there is this strange assumption there to be only one Mexico in country.

Clearly the most advanced approach is this:

  //customer[country="mexico"]/age
1999!


> and after GROUP, an extra BY needs to be written.

group /= groupBy.

Group just put things in groups. GroupBy allows for a function on the item to be used (think passing a custom comparator into a sort() function). In SQL's case I think this is just limited to a column id, but I could be wrong.

> The main downside of being like natural language is non procedural. Program logic is generally executed step by step

SQL has thrived on being non-procedural. It is high-level and abstract, and its various implementations make much better optimisations than would the average-idiot-programmer-in-a-hurry.

All that would go out the window if the programmer had to write how to do things. Forget 'repeatable read' or 'linearisable' or any correctness guarantees provided by a database - if it's up to the programmer to write how, not what, then your correctness guarantees are basically: "I hope every programmer who touched this codebase took the right number of locks in the right order at the right time".


The authors argument that SQL's english-like syntax is a "well-intentioned error" overlooks the accessibility it provides. SQL, inspired by relational algebra, enables set-based operations, distinct from procedural programming's linear approach.

This is why SQL has endured for so long.

That mathematical base allows for efficient data manipulation across vast datasets.

To his point that no other languages strive to be spoken like, languages like ruby (and python to a degree )have successfully adopted a natural, english-like syntax, enhancing readability and developer productivity.

Emphasizing procedural over declarative concepts misinterprets SQL's core advantage: expressing complex data relationships and manipulations succinctly. Such perspectives miss the broader utility of making programming languages approachable and efficient for diverse tasks and user backgrounds.

Side rant: I just wish SQL was more composable.


Being readable as english for the business people was a fad of the era. It didn't pan out, it made a mess. I still want to view the results of a controlled trial to test that hypothesis... Personally English is very different from my language and I process programming languages and natural languages very very differently. SQL is often good because is declarative, is almost always bad because of its composability story, but is and always will be a mess that makes some things more complicated than they should.


The things with SQL is that it's more like parroting RelAlgebra instead of trying to implement it with a comfortable syntax. All the bad ideas of the Cobol language generation come with it.

And, like you said, SQL kills one of the core ideas of RelAlgebra: composability.

I find it very sad that behind the scenes in most relational DBs, the mess off SQL gets translated to the operator language which the optimizer works with. Now, /that/ language is pure relational algebra complete with nice mathy properties.


The optimizer does not work work on pure relational algebra. Or works on something very similar to SQL in the case of PostgreSQL.


The operators in PostgreSQL used to represent something that feels a lot closer to rel algebra than SQL.

Admittedly, I haven't looked at pgsql code for 10+ years so this could have changed.


See my other post in this thread.

The authors are from China so English is not a plus for accessibility in their market’s business users or analysts.


Lately I've been experimenting the usage of LLMs for the explainability of SQL stored procedures with exceptionally good results - think about identifying important fields, linking them to existing glossaries, expanding said glossaries, etc. I wonder how much SQL looking like natural language helped for my use case.


> Lately I've been experimenting the usage of LLMs for the explainability of SQL stored procedures with exceptionally good results

Heavily agree here. LLMs have significantly shortened the time it takes me to write complex queries.


Do you have any suggestions on prompts to use here?


I find the author to be fairly out of touch with modern programming languages based on the assertions made in the article. But more importantly, I think leaning into procedural programming is a fundamental misunderstanding of the power of declarative syntax and also does not take advantage of the RDBMS’ biggest strength: the ability to think in sets.


While interesting, it's hard to identify any component of the author's argument that isn't fundamentally flawed.

Ambiguity is not "the true advantage" of natural language and certainly not one we want a query language to adopt.

Pointing to the limits on the complexity of SQL queries most business folks can write overlooks the fact that there are SQL queries that most business folks can write.

Wanting to make SQL more procedural is a terrible idea, for all the reasons already expressed more eloquently in this thread than I can.


Anyone who has had to sift through large BNFs such as [0, 1] to find that the syntax error was a syntactically significant "TO" knows that this is the case. Natural language has synonyms, computer languages really should avoid synonyms and also any spurious syntactic sugar.

[0] https://www.postgresql.org/docs/13/sql-altertable.html

[1] https://www.postgresql.org/docs/13/sql-commands.html


It perhaps is an error from the point of view of the ”leading reporting tooling vendor in China” https://www.scudata.com/about-us/ which starting in 2007 built their own Java NoSQL data warehouse+sqllite?!) platform.

https://github.com/SPLWare/esProc https://blog.scudata.com/qa-of-esproc-architecture/

Composibility aside, I’m more interested in their identification of limits of relational algebra and the alternative keywords they use for things difficult to optimize in SQL engines than their critique of using English. For that, see https://c.scudata.com/article/1694595486828?p=1&m=0


While I don't agree with the conclusion of the article, I still wonder if there's an alternative to SQL that is:

- declarative

- easy to understand

- scales well with the complexity of the query

Perhaps it could look and feel like a functional programming language but underneath uses the same execution engine as SQL

I've seen a few attempts at it, but none of them have clicked with me.

Thoughts?


English is indeed a well intentioned error, but why does SQL look like that?


To make it more accessible to bon-programmers. It worked but it is unclear if it was worth the downsides, especially the loss of conposability.


Cobol and Fortran kind of made the exact same well-intentioned error of trying to mimic English, despite being perfectly procedural.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: