> "Don't be bothered with by the fact that the solutions end with "or false" here. It's a function of how the search algorithms work; the solver looked for more solutions, then failed. I'll admit, I don't totally understand why it only sometimes does this, but it's expected."
I think this is explained in The Power of Prolog[1] that the answers coming from Prolog are not printing text to a terminal, they are valid Prolog terms(/data/code). That's why the result uses the same `;` for OR as code does. Answer (x ; y ; false) is "query can be answered by x or y or no other answer found". (This would let you do meta-programming, reasoning about the results and rewriting the results in a LISPy data-as-code way, if you were more advanced than I am).
Prolog systems do optimisations to jump to the correct answer without searching, if they can, (e.g. database style indexing on the facts and rules) and in those cases there is no code left to search after showing the first answer, no need to prompt the user "should I search for more answers in the remaining code?", and so no need for an output "false" to say "I finished searching and found no more solutions".
This embodies why I don't like Prolog. Prolog's philosophy is that you should just write the predicates without thinking about how the engine works. But as soon as you do something actually complicated, you realize that the different optimization modes of the engine give different results, and shortly after that you'll find yourself in the "exhaustively try every possible combination until we get one that satisfies the predicates" mode, and your code will go from taking 1 second to run to taking 8 days.
And because you don't control the engine (you're not supposed to think about it, after all), there's nothing you can do but rewrite the whole thing in a traditional programming language.
> as soon as you do something actually complicated, you realize that the different optimization modes of the engine give different results
The same is true of SQL query planners. You can perform basic queries without understanding how your SQL engine of choice works under the hood, but if you want performance, you must understand how your DB works. SQL is just the interface.
This is different in kind from imperative programming languages (which are much closer in abstraction to the underlying machine architecture), but we rub along with SQL ok; why not Prolog?
Yeah, but the difference is that SQL provides a huge number of ways to solve the "My query got slow when it got complicated" problem. In Prolog, you have the cut operator, and when that stops working for your usecase, you're just SOL.
I relate to "I wrote it in Prolog, it works on a test example, it doesn't work on the full thing and I don't know if it will ever finish", and that was why I stopped my last attempt at Advent of Code in Prolog. It took me hours longer to make something which worked, it was more code, and it never finished on the full puzzle, and I was fed up and had no motivation or approach to discovering why.
But I think it's different in experience, but not different in kind, from knowing that in Java and C# and Python, string catenation in a loop will perform badly and generate a lot of garbage collection pressure, and you need to know about either StringBuilder or the pattern of making a List of chunks then converting to string once at the end, to get better performance. Or in PowerShell, Objects have more convenience but a lot more overhead than simple values and you have to reach for .NET Library functions for better performance. Neither of those is a problem in C, but not because C has solved those problems, but because C doesn't have any conveniences. Any language that has higher level conveniences, have them implemented in some way with some assumptions that will have tradeoffs that you need to understand to use it well, and the higher it is, the more things there are to know - but also the shorter convenience code you can write.
Still, though, the claim that all Prolog has is the cut operator is not right. If you write a linear list scan it will be O(N), and if you put two of them together it will be O(NxN) just like in any language, e.g:
member(X, Items),
member(Y, Items),
If you can put your lookups into a Trie instead of a list, they will be faster. SWI Prolog ships with libraries for Tries, Association Lists, Red-Black trees, and other things. Fundamentally, code with functions and scopes and branches is tree-shaped, and code describes tree-shaped execution patterns (see the beginnings of Structure and Interpretation of Computer Programs, SICP, I think), and Prolog "cut" trims branches from the tree, but if you rewrite your code so it doesn't describe branches of the tree that you don't want to execute, then you don't need to cut them and your code has less runtime (in any language).
SWI Prolog has tabling (memoization)[1] which you can wrap around any predicate with one line, and the engine will cach the results for faster lookups. You can write this in other languages, e.g. with Python decorators, but you have to write it yourself instead of it being included.
SWI and Scryer have constraint solvers, which are basically another language embedded in Prolog using its programmable syntax, backed by a different search engine, that can solve numerical problems with constraint propagation much faster than with Prolog normal search - something that just isn't possible in other languages in the same way, only available as library code (e.g. Z3 constraint solver), and not a standard techniqe.
SWI Prolog has compare/3 and zcompare/3 e.g. "compare(Op, 4, 5)" will fill in the operator Op = < because four is less than five. You can then "do_thing(Op, Data)" and where do_thing/2 is different for less than, equal, and greater than, and because of first argument indexing on the operator, that can turn something that would have been a search through three branches into a determinstic single path, acting to cut the tree without using a cut operator, and without using if/elseif/else.
I somewhat disagree that you shouldn't be aware of how the engine works. The mechanics are quite simple. Prolog's horn clauses are combined in depth first search manner trying to proof that the negated goal is false.
However, most prolog books focus on rooting the declarative mindset because programmers are generally more familiar with imperative programming. But just as with SQL or lisp there are definitely good ways, bad ways and plain mistakes you can make when approaching a problem.
How is this different from other programming languages though?
One example I often think about is from Ken Silverman: "sub eax, 128" → "add eax, -128". So equivalent ways to write the same program may have different performance characteristics also depending on the tools that are applied. How many people could tell without trying which way to write this example is preferable?
The same phenomenon will be encountered in all kinds of languages, where engine and compiler improvements make existing code faster or slower.
I think this is very well phrased, and I would argue the same holds for Prolog too.
In my opinion, a key difference between Prolog and other languages in that regard is one of degree, not kind: Compared to other languages, addressing performance problems in Prolog engines tends to have far greater effects on Prolog programs, because so much is implicit (i.e., left to the engine).
If the performance problem is not in the engine, but in the program itself, then we will face the same questions with Prolog as with other languages: How to formulate the program better, is there a better approach altogether?
For example, earlier today an interesting question regarding performance was posted in the Scryer discussions:
The comparison in this case is between Gecode and Scryer on a seemingly simple but nontrivial combinatorial task. What is the problem here? Most likely the Scryer engine itself can be improved. And also very likely, there are better ways to model the task, and also better search strategies, and these tend to have far greater performance impact than the base language, and these questions remain also if we change the base language.
In my opinion, these questions regarding different kinds of formulations tend to be more frequently associated with Prolog than with other languages because Prolog is more frequently used for complex tasks where it is not a priori clear how to even approach the problem.
> "How to formulate the program better, is there a better approach altogether?"
When I code an imperative Bubble Sort, a profiler can identify that function as a hotspot and I can Google "faster sort algorithm" and can understand relatively easily that the nested linear scans were taking the time. In recent years, Casey Muratori has become a prominent internet voice against naive use of "Object Oriented" (OOP) patterns, because using a lot of OOP inheritance and abstraction adds a little overhead here and there and everywhere, leading to poor overall performance with no single place to speed it up.
My Prolog code is closer to the OOP situation, especially when I try to express something with a DCG. It is easy to accidentally code non-deterministic searches in places where I did not expect, or desire them, in a way which makes the whole code describe a huge search space and there is no single place where the extra runtime is localised and no good way to incrementally improve the situation. The comparison in your link between Gecode and Scryer is illustrative; the author wrote a solution in Gecode which completed in 10 seconds. They spent "many hours" writing Prolog and they cannot get the code to finish on the large case, they don't understand why, and they have no way forwards except to ask the internet. Likely there is no single part which is slow in the way that Bubble Sort is slow, only "there may be better approaches altogether" - but how can the author help themselves find and move towards the better approaches? "Learn a faster sorting algorithm" is a practical, achievable, step forwards; "just be a better programmer" is an impassé.
The sticking point is that with OOP patterns, the question of "how do I become a better programmer?" often does not need any answer, because the layers of indirection are additive, each call adds some milliseconds, and that overall leads to a program that still works in a reasonable time, it merely feels sluggish or has a tedious delay. With Prolog, the calls can quickly become a combinatorial explosion of search space, leading to a program which does not finish at all, and thus the question needs an answer. With an imperative codebase a suggestion to "use a faster algorithm for this one task" is one step along a lifetime of gradually becoming a better programmer. With OOP abstractions, people can get results, solve problems, or be employed making slow web portals without ever improving or making highly performant code. With Prolog "You simply have to be a better programmer" before you can get any results at all on larger cases is a much steeper learning curve.
Eh, no, admittedly this is a bit confusing but the ";" at the top-level is not the same as the ;/2 disjunction operator. At the top-level the ";" is just a switch to say "give me more". In most Prolog systems you can also press space for the same thing.
Whether a query will end with "false" or "true" really depends entirely on the query, and the predicates it's calling. So it depends on how a predicate is called. You just have to analyse the predicate in the good, old-fashioned way of eyballing it and maybe running it a couple of times to see why it fails or succeeds, and why it first succeeds and then fails like above.
This is really all down to determinism, which is to say, whether a predicate leaves behind a choice point or not. In this particular case, what's going on can be revealed by looking at the trace of the query. For the following, I loaded the Pokemon database from https://github.com/alexpetros/prologdex/blob/main/src/dex/po...) into SWI-Prolog and traced it like this:
Now, if you squint real hard you'll see that the Prolog engine first unifies the variable 'Pokemon' with the name of a Pokemon, "squirtle". That happens to be the first pokemon in the database that has "water" as the type. Basically the engine looks for a type/2 fact where the second argument is "water" and finds type('squirtle', 'water') first.
In the second call... I mean this one:
Call: (15) dex:type(squirtle, ice) ? creep
The engine looks for a fact that would satisfy the second goal in the top-level query. This is the top-level query:
Since "Pokemon" is already bound to "squirtle", the engine tries to satisfy the fact
type(squirtle, ice)
It can't, so it fails. And then it repeats with the second water pokemon in the list, which is wartotrtle.
What happens when the engine finds a pokemon that is both water and ice? The first one in the database is dewogong, at which point the tracer goes like this:
So this time both goals in the top-level query are satisfiable and the engine returns the name of the value of "Pokemon" that makes the query true (well, satisfiable).
At that point the "." means I pressed "enter" and that's basically saying "no more thank you". I was getting a little impatient. If I had pressed ";" (or space) instead, we'd have a repeat of the previous pattern with successive failures until another water and ice pokemon was found. The next one in the database is cloyster, and so it goes.
That back-and-forth between success and failure goes on until we run out of pokemon. At some point the engine can't unify either goal of the top-level query with a fact in the database so the final result is "false"; because after succeeding a bunch of times, it can't succeed no more.
Honestly this is a fair question to have and I had to trace the query to be able to give a fair account of it so the advice to take it as "expected" is not that bad. Prolog is doing something mind-numbingly repetitive on the background and it's hard to follow it, so sometimes the best thing is to just run your program and see what it does, instead of trying to intuit it.
Are those the results of a disjunction? I think it's clear that not: our rule says that "sam likes X if X is a fruit, for all X"; "for all" because in Prolog every variable is implicitly universally quantified. So the interpretation of the results of the query is:
sam likes apples AND sam likes oranges AND sam likes bananas
Not:
sam likes apples OR sam likes oranges OR sam likes bananas
Sam likes all the fruit! The different values of X are not disjuncts, they're conjuncts.
All that'll probably get me lynched because it flies in the face of a lot of fudging perpetrated by Prolog courses and even some textbooks that should know better, and that tell you that e.g. a set of rules is like an if-then-else and that when you get more results at the top-level those are alternatives. That's just wrong and I don't know why people teach Prolog like this, all it does is confuse students and make them hate Prolog more, but the formal definition of a logic program is that it's a conjunction of definite program clauses; where a definite program clause is a ... clause, therefore a disjunction. So a logic program is a conjunction of disjunctions, i.e. basically a formula in conjunctive normal form.
So the ";" that appears automatically at the end of your query when you press space is not an "OR", otherwise it wouldn't appear when the results of the query are the members of a conjunction. What is it then?
Try this. Instead of pressing ";" after the first result of your query (after the X = 1) press "h" for "help". On SWI-Prolog you'll get this menu:
?- X = 1 ; X = 2.
X = 1
Possible actions:
; (n,r,space,TAB): redo | t: trace&redo
*: show choicepoint | . (c,a,RET): stop
w: write | p: print
+: max_depth*5 | -: max_depth//5
b: break | h (?): help
Action?
Thus finally* revealing the meaning of ";" at the top-level: it instructs the listener (i.e. the Prolog REPL) to enter the redo port. The ";" at the top-level means "redo" not "OR".
And that's true even in your query which is explicitly formulated as a disjunction. With an ";". It's confusing.
I think this is explained in The Power of Prolog[1] that the answers coming from Prolog are not printing text to a terminal, they are valid Prolog terms(/data/code). That's why the result uses the same `;` for OR as code does. Answer (x ; y ; false) is "query can be answered by x or y or no other answer found". (This would let you do meta-programming, reasoning about the results and rewriting the results in a LISPy data-as-code way, if you were more advanced than I am).
Prolog systems do optimisations to jump to the correct answer without searching, if they can, (e.g. database style indexing on the facts and rules) and in those cases there is no code left to search after showing the first answer, no need to prompt the user "should I search for more answers in the remaining code?", and so no need for an output "false" to say "I finished searching and found no more solutions".
[1] https://www.metalevel.at/prolog