It makes me think of MUDs, specifically my once-upon-a-time favorite, DragonRealms. It had the usual "command direct-object" syntax common to MUDs, but wrestling with the parser to find the correct "obvious" action became an infamous problem:
> put amulet on drawer
> put amulet in drawer
> put amulet behind drawer
> place amulet on drawer
> give amulet to drawer
> push amulet to drawer
> show amulet to drawer
+1 for MUDs, MOOs, etc. I played Ancient Anguish in the late 90s / early 2000s. Similar story, although often figuring out (guessing) the correct verb was used as part of quests...
I'm not sure DSLs qualify as "the English-Likeness monster". A good DSL isn't about being like English; it's about being concise and expressive in the given domain.
I don't think raganwald is right here, or rather he's only right about a subset of DSLs -- the subset of DSLs which attempt to map a specific domain to English. I think most good DSLs eschew mapping domains to English in favour of mapping those domains to some new language designed to express information about that domain.
I'm having a variant of the Baader-Meinhof phenomenon where I start seeing something everywhere after writing about it (https://hackertimes.com/item?id=5056667).
I personally like how Cucumber tests look exactly like English, but it's up to the test writer to explain (through the writing of "step definitions") what each type of phrase means (in actual Ruby), and then when you combine them you can perform any number of _domain-specific_ tasks. Kinda neat, and a halfway point between actually writing a DSL and teaching the Ruby runtime how to interpret a subset of English for this domain.
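For anyone who hasn't used Cucumber, the mechanism can be sketched in a few lines of plain Ruby. This is a minimal, made-up illustration of the idea (not the real cucumber gem; the step phrases and names are invented): English-looking steps are matched by regexes, and a Ruby block supplies the meaning of each phrase.

```ruby
# Minimal sketch of the Cucumber step-definition idea (not the real gem).
STEPS = []

# Register a regex pattern and the Ruby block that gives it meaning.
def Given(pattern, &block)
  STEPS << [pattern, block]
end

# Find the first matching pattern and run its block with the captures.
def run_step(text)
  STEPS.each do |pattern, block|
    if (m = pattern.match(text))
      return block.call(*m.captures)
    end
  end
  raise "Undefined step: #{text}"
end

# Step definitions: plain Ruby hiding behind English-like phrases.
Given(/^(\d+) cukes? in the belly$/) do |n|
  @belly = n.to_i
end

Given(/^(\d+) more cukes? are eaten$/) do |n|
  @belly += n.to_i
end

run_step("3 cukes in the belly")
run_step("2 more cukes are eaten")
# @belly is now 5
```

The real gem adds `When`/`Then`, feature-file parsing, and reporting, but the core trick -- regex in, Ruby block out -- is about this small.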
The point is to some extent right: in a general-purpose language you need an easy way to reason about the consequences. But DSLs are very useful in their analogy to another system; Mathematica, for instance, is very successful in that way.
Another example is Siri. If the end result is not of utmost importance and an educated guess is acceptable, then something like Siri for scripts could be a way to easily program the device.
Programming languages all possess one thing that most (perhaps all) natural languages lack: a 1:1 equivalence with some mathematical construct, or an abstraction backed up by such a construct. In order for a natural language to be useful for programming computers, the language must be completely describable in mathematical terms.
Completely agree; left a similar comment upthread. Natural language will be insufficient for programming until we succeed in accurately reducing all natural language to mathematical expressions, which could be achievable if we unlock the secrets of exactly how the human brain stores and accesses data. Needless to say, I'm not holding my breath.
Mathematicians have been developing more abstract and concise symbols for hundreds of years and it looks nothing like English or any other natural language. If I had to guess I would say that future programming languages will look more like mathematics than anything else.
Why wait for the future? You can program in APL or (I recommend this one) J today!
Have a look at quicksort in J:
quicksort=: (($:@(<#[), (=#[), $:@(>#[)) ({~ ?@#)) ^: (1<#)
Cute, isn't it?
I played with J for some time, a few weekends maybe, and managed to reach the level of conscious incompetence, which is a lot for such a foreign language, and then stopped; I'd just had enough. J is certainly a very powerful, consistent, and logical language, and I believe that after some years I would be able to read and comprehend J code almost as fast as any other language from the get-go. I'm just not convinced it's worth it.
Put another way: I certainly hope that future programming languages will not look more like J. And you can, of course, grab a J system and use it today if you want.
Unlike J, all the brackets/parens/etc are balanced, and the functions and operators have single-character names.
Having played with both APL and J, I find APL to be easier to read. J was really just an experiment, adding rank matching and forks and hooks to APL, as well as using ASCII only for syntax. It doesn't take long to remember the keys for entering the non-ASCII characters into APL, and it looks more readable than J.
I think a future programming language should have precedences for its operators, though.
The case of bad English-likeness that has wreaked the most damage is SQL. Not only does it dress the elegant relational model up in warmed-over COBOL, it continues to inspire abominations like HQL and LINQ.
I think Wittgenstein demonstrated how much can be said in a language that doesn't have direct meaning within a logical framework. I also think we underestimate the computing horsepower our brains use merely to handle the ambiguity inherent in language. Given that a language needs to fit into some kind of logical, causal framework for a program to execute, and that computers don't deal well with ambiguity, it's not surprising that this is a problem.
You're going to have to explain what on Earth you mean. PowerShell just reads (and writes) like someone forced Perl and .NET to have a child. AppleScript, Inform 7, Smalltalk-72, and other "English" languages dream of being in the same boat as PowerShell in terms of consistency, readability, or power.
I'm totally in agreement as far as the current state of NLP and AI goes, but all bets may be off as these fields mature. For example, I have a client that gives me a very detailed specification for a simple iPhone app in a few English paragraphs, along with storyboards, wireframes, and similar apps. I bang out an alpha in a week or two that's mostly faithful to her conception. She writes up some minor changes she'd like, I implement them, get back to her, and on we go until the app is close enough to what she had in mind.
In this case, I'm essentially a really slow compiler turning English language into machine language. Who can say to what extent machines will be able to perform this function in the future?
I'm obviously oversimplifying, and an AI-driven English-to-asm compiler comparable to my godlike programming abilities is at the far, far end of the technology gradient from now to Lieutenant Data, but it doesn't seem far-fetched to have such compilers be "good enough" for designers to create certain MVPs and such without programmer intervention, in the next decade or so.
the point about dsls for business logic is a good one. it always sounds so good in theory...
i'd love to hear success stories about this. what makes a dsl intended for domain experts work? i suspect it's got little to do with the dsl and a lot to do with politics and motivation of the parties involved. and unicorn blood, of course.
[edit: limiting functionality (related to raganwald's reply below) reminded me of this exchange - https://hackertimes.com/item?id=5111265 - which was perhaps a wiser approach than i thought at the time. lbrandy says: "as simple as possible for analysts to use". and that seems to have been a success story.]
OK, maybe writing arbitrary English as a program won't work; but, I can think of a lot of occasions where it would be useful for a computer program to be understandable by any English speaker. I'm thinking of lawyers, who want to be able to review whether a program conforms with legal requirements, or managers, who want to be able to confirm that business logic is doing what they want it to do. Zed Shaw kind of hints at this in his classic "ACLs are Dead" talk, where he talks about using metaprogramming to make Ruby syntax more readable for lawyers.
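The kind of metaprogramming-for-readability being described might look something like the following. This is a hypothetical sketch, not from Zed Shaw's talk or any real library; the class, method names (`allow`, `permits?`), and rule content are all invented for illustration.

```ruby
# Hypothetical sketch of a Ruby DSL whose rules a non-programmer could review.
class AccessPolicy
  def initialize
    @rules = []
  end

  # Keyword arguments let the call site read almost like a sentence.
  def allow(role, to:, on:)
    @rules << { role: role, action: to, record: on }
  end

  def permits?(role, action, record)
    @rules.any? { |r| r[:role] == role && r[:action] == action && r[:record] == record }
  end
end

policy = AccessPolicy.new

# These lines are ordinary Ruby, but a lawyer could plausibly check them
# against a written legal requirement:
policy.allow :auditor, to: :read,   on: :financial_records
policy.allow :manager, to: :update, on: :employee_records

policy.permits?(:auditor, :read, :financial_records)  # => true
policy.permits?(:auditor, :update, :employee_records) # => false
```

The point isn't that the lawyer could write the rules, only that the surface syntax stays close enough to English for review.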
No, to be honest, they should just learn to program. Maybe what you're talking about would be great if all programs were 100-line simple things, but that's not reality. If we want to solve the "lawyers need to read and understand computer programs for legal reasons" problem, it makes more sense to just educate lawyers on programming. In some ways this is like saying lawyers need to be electrical engineers, civil engineers, etc. We need physics to be more Englishy and less Mathy. :) I jest, but I'm serious. For business rules, perhaps a simple case could be expressed with a specialized DSL or config depending on your needs, but I think this is a little tertiary.
Reminds me of the "death of programmers" that I've been hearing is right around the corner for the last 3 decades. You know, computers will understand what you want and just do it. Except that 90% of programming is not translating ideas into computer cyphers; it's figuring out (or guessing at) what the user really wants.
For example, what should happen if the network drops out in the middle of a save? Should it save a copy locally? Where? No, not the temp directory; that will be erased if the computer is rebooted. And so on.
There is nothing wrong with code that looks like English. The problem is that you can take that too far, or sometimes just not learn the actual syntax and try to use English instead. That isn't a problem for many languages. I really haven't seen a lot of code where an attempt to make it read more like English was making it less readable or causing other problems.
Also, there is a system for logical rules called Attempto Controlled English which I think has some proven use cases for business rules.
AppleScript has junk like extra meaningless words added to programs just to make them resemble English. And things that are usually easy to express, like a function with four arguments, become a mess because of a fear of punctuation.
There's no point in asking computers to interpret the ambiguity of English commands, but there is room for improving tools that attempt to elucidate meaning from English statements and are capable of formulating questions. The fact is that English is perfectly serviceable for people to communicate complex concepts to each other through either conversation or context. (s/English/your native language)
Of all the languages I've used, Python comes closest to English-like while still being a language I can enjoy programming in. I seriously doubt it's possible (absent strong AI) to get much further. Also, Guido's probably a once-per-generation genius, so he can get away with more than we can.
But since nobody speaks Lojban natively (and probably never will (I am aware there exist a few people who speak Esperanto natively)), you might as well learn a programming language. Both Lojban and Python will be second languages to someone, and Python (or any other programming language) is probably easier to reason about for the types of things we use programming languages for.
I actually disagree to some extent with Raganwald here. I am both a software engineer and an amateur linguist. Let me try to go into where my disagreement is. I figure that someone, probably me, will have to set the record straight about how poorly our expectations of English grammar match the realities and utilities of this language.[1]
Natural languages and computer languages are very different, both in how they are used and in the constraints that exist on them. That is why it is obvious you can't write programs in written English, and the confusion here stems not from what we expect of programming languages but from what we expect of natural languages.
As every introductory linguistics textbook will remind you, natural languages are defined by usage patterns. Computer languages however are defined by parser behavior. They must be rigorously specified, while such specification kills natural languages.
Many people, particularly public school English teachers, would like to see English specified to this level of clean, precise, prescriptive rules, but this doesn't really work in practice. Different dialects occur, and these dialects often have very different features (note the habitual tenses in African American Vernacular English, for example, which also occur in some dialects from England). Usage defines the language, just like it can define good/bad coding practices in one school of thought or another, but usage does not define C. If I decide there is a clearer way to handle pointers, what I come up with isn't C any more. However, habitual tenses ("I be workin' Tuesdays"), as much as they may be seen as "bad English" by some prescriptivists, are just natural, organic developments within the language.
There are also dialects of English that are more precise. For example, consider legalese, which is often treated as a hyper-defined (and very formulaic) dialect of English. The advantage is that legalese can fall back on ordinary English terms for additional meaning, and one can be reasonably sure what a contract means even without being a lawyer.
So what does this mean about English-Likeness? Certainly we wouldn't expect the laws of the nation and contracts to be written in a purely symbolic language even though these need a level of definition well beyond what ordinary everyday English can provide. In fact if we did so, this would interfere with the function of legal instruments. Legalese may be a hyper-defined dialect but at the end of the day, it is still just as much English as African American Vernacular English is. English likeness is more a blessing than a curse in the field of law, even though it is quite clearly both.
Because English-Likeness is both a blessing and a curse there, it is worth asking the same here. Can an ordinary individual read a program written in Perl or Python and understand it? No, but that same ordinary individual probably cannot read the courts' opinions in Padilla v. Hanft or Hamdi v. Rumsfeld either and really understand what the court is arguing about.
But what English-likeness does is speed up learning. There is a difference between how quickly you can pick up Spanish as an English speaker and how quickly you can pick up Bahasa Indonesia or Algonquin. Spanish in both structure and vocabulary is much closer to English than either of the others, so I think the same applies to programming: languages which follow very different structures (Lisp reminds me of Irish Gaelic due to the latter's VSO word order) tend not to gain much traction.
Additionally English is probably the best language in the world to be like for a programming language. We have a relatively simple lexical structure and a fairly inflexible word order and much of this is due to the fact that Middle English arose as a sort of creole (with all the grammatical simplification that goes with that) between Anglo-Saxon and Old Norman French. Other creoles might be good choices too, but this is a good choice and we might as well stick with it.
[1] Note, we would never say "Me will have to" and yet within the parenthetical, the normal rules of pronoun cases get shifted a bit. See Bailey, Charles, "How English Grammars have Miscued" in Polome, Edgar (ed) "Language Change and Typological Variation," JIES monograph 30.
Another amateur linguist + software dev here (loosely involved in NLP-related work and studied linguistics at university). I invite you to try to convince me that there might possibly be a natural-language-esque version of the (in)famous inverse square root function that is more comprehensible than the original.
  float Q_rsqrt( float number ) {
      long i;
      float x2, y;
      const float threehalfs = 1.5F;

      x2 = number * 0.5F;
      y  = number;
      i  = * ( long * ) &y;                      /* reinterpret the float's bits as an integer */
      i  = 0x5f3759df - ( i >> 1 );              /* magic constant yields a first approximation */
      y  = * ( float * ) &i;                     /* reinterpret the bits back as a float */
      y  = y * ( threehalfs - ( x2 * y * y ) );  /* one Newton-Raphson refinement step */
      return y;
  }
You have written many words and have buried your counterpoint in the process. Put aside the superficial points (e.g. reasons to choose a dialect of English over a dialect of another language) and address the main issue: if programming is the act of translating human concepts into precisely defined math and logic, how can we reasonably rely on such a fundamentally imprecise tool as natural language to get the job done? I contend that there is no sufficiently high level of abstraction where natural language would be more suitable.
My point is that one doesn't use a natural language to get the job done; one uses a computer language in the likeness of a natural language to get the job done. The similarity speeds the learning process.
As for an example, you just gave one. But if we wanted to make it more natural-language-like it would be a slightly different syntax:
DECLARE FUNCTION q_rsqrt (float number) THAT RETURNS float AS (
....
);
Now if you think of the natural language phenomenon of ellipsis, this gets shortened essentially to
function sqrt(float number) returns float (
and we might want to change this to make it more concise
The method signature is the low-hanging fruit. Try the method body.
ADD: Do you accept that programming has its basis in mathematics and logic? If so, how is making computer languages resemble human languages a more direct way to improve comprehension than learning some math? Walking up a mountain path is difficult, but not nearly so difficult as moving the mountain until it lies beneath one's feet.
I don't see anything wrong with the method body. The point is we are talking about similarity, not identity. I have been clear: you can't program in English, but there is some benefit to having languages which are easy for English speakers to learn.
> Do you accept that programming has its basis in mathematics and logic?
This is one of the foundations of programming, particularly pronounced in technical fields. As you get to solving higher level problems however, other things come into play.
For example, suppose we are building a robot to automate some physical process (let's say, making coffee). We are going to have low-level programming which takes care of ensuring that we can tell the robot what exactly we want to do. If resources are tight we may do all of this low-level, but if we can spend more on resources we might want to have a higher-level (scripting, if you will) environment for brewing coffee.
On the higher level, our calculations are not going to be how many joules of heat to add to the water. They will be what temperature to bring the water to. We may specify how finely the grounds are to be ground, etc. These instructions will resemble a cookbook recipe more than they will resemble a mathematical diagram, and this is the difference between high-level and low-level languages.
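The "cookbook recipe" level might be sketched like this. Every class and method name here is invented for illustration; the point is only that the high-level script talks about temperatures and grind settings, not joules.

```ruby
# Hypothetical high-level "recipe" layer over an imagined low-level robot API.
class CoffeeMachine
  attr_reader :log

  def initialize
    @log = []   # stand-in for commands sent to the low-level layer
  end

  def heat_water(to:)
    @log << "heat water to #{to} C"
  end

  def grind_beans(grams:, fineness:)
    @log << "grind #{grams} g of beans (#{fineness})"
  end

  def brew(seconds:)
    @log << "brew for #{seconds} s"
  end
end

machine = CoffeeMachine.new

# The script reads like a recipe, not like a heat-transfer calculation:
machine.heat_water to: 93
machine.grind_beans grams: 18, fineness: :medium_fine
machine.brew seconds: 150
```

Underneath, some low-level layer would still be doing the joule arithmetic; the recipe layer just hides it, which is the high-level/low-level split being described.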
> If so, how is making computer languages resemble human languages a more direct way to improve comprehension than learning some math?
but the similarity is not just in vocabulary. In English we might write "three plus two equals five", where "plus" is essentially a conjunction, three and two are subjects, equals is a verb, and five is the predicate/object. This works because English has a subject-verb-object word order. Similarly, a equals b plus c tells us that we can find out the value of a by adding b and c. In English there is a difference between a = b + c and b + c = a. There may not be a logical difference, but there is a semantic difference that does not exist in pure math.
So let's imagine math written in another language, Irish Gaelic, translated word for word into English and ignoring conjugated prepositions. Irish Gaelic has a verb-subject-object word order, so we might write equals a adding b and c. Maybe a nice way to represent this symbolically would be (= a (+ b c)). There wouldn't be a computer language like that, would there?
The problem though that lisp encounters is that it is disorienting, and it is disorienting because the structure bears no resemblance to English.
If English-Likeness is a problem, surely that applies to word order and other syntactic similarities just as much as it applies to lexical similarities.
Let me see if I got that right. You see the need to transform:
float Q_rsqrt( float number )
into:
DECLARE FUNCTION q_rsqrt (float number) THAT RETURNS float AS
but you see no need to do anything to:
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y;
i = 0x5f3759df - ( i >> 1 );
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) );
return y;
I intentionally didn't refer to your later, more "concise" attempt, because the topic of discussion is "English-Likeness" and, unlike "DECLARE FUNCTION q_rsqrt (float number) THAT RETURNS float AS", your "float function sqrt(float number)" doesn't add English likeness to the syntax.
Now, your assertion was that "what English-likeness does is it speeds up learning". Since we're all just flinging around theories unsupported by research and data, I'd like to posit that the main motivation behind attempts to do away with programming languages and replace them with natural languages is the misconception that you can learn to program without learning to think like a programmer. It doesn't really have much to do with syntax itself, it has to do with inherent complexity of programming.
To take your example, Lisp uses an unusual syntax which makes it disorienting at first. This "disorientation" is something people will deal with in a matter of days, whereas their struggle to plumb the depths of macros and functional programming will take much, much longer.
All in all, I believe that small degrees[1] of "English-Likeness" might help English speakers learn the syntax of the language a bit faster, but that benefit would prove insignificant when compared to the rest of the effort necessary to learn to program.
[1]: Small enough to avoid "polluting" the language with the problems @raganwald writes about.
I've tried not to bring up "English-Likeness" aspect because that's a whole other can of worms. But since it's out there now...
I don't believe this is einhverfr's intent in the least, but the promotion of English is often an ethnocentric activity. Why not a dialect of Chinese (so many native speakers, high information density per character)? Why not a dialect of Spanish (multi-continental reach of native speakers, high incidence of cognates to Romance languages, highly predictable orthography and phonetics)? Why not any language spoken in the Middle East or India or former Soviet states?
If human accessibility were truly the issue, why should we focus our efforts on English speakers -- the people already most aided by existing language-like tokens! -- instead of promoting efforts to "translate" programming keywords and constructs to other languages? I'll grant that one does not necessarily preclude the other, but I would argue that the former is chasing after diminishing returns at this point in time, whereas the latter represents a significant opportunity to increase ease of access to programming globally.
I did mention features of English which make it more conducive to being such an archetype (i.e. that computer languages imitate English). I don't think it is the only choice. I think in general that creole languages and their descendants (that includes English, btw) have significant syntactic advantages in this regard. The issue is morphological simplicity.
This doesn't mean I am a fan of such languages from a natural language perspective, btw... I am just saying if I was choosing a language to build a computer language based on, I would not choose a language where imperative tenses of verbs might vary depending on number of addressees, for example....
As I think of it, it might be very interesting to look at computer language design based on other natural languages and see what other problems or opportunities come up. For example, I wonder about slot-based polysynthetic languages (like Algonquin). A computer language based on that would have to be fundamentally different from what we have now.
Sorry, no time to continue the debate. You're welcome to keep believing, and human-languages interactions sure would be neat at any level, but I'm coming from too pragmatic an angle to believe this will happen before a complete mapping of the human brain or the singularity.
> You're welcome to keep believing, and human-languages interactions sure would be neat at any level, but I'm coming from too pragmatic an angle to believe this will happen before a complete mapping of the human brain or the singularity.
Ummm straw man there.
My point was just that computer languages which leverage people's existing understanding of natural languages do better than those which don't. I said way up at the beginning of my post that the constraints were totally different and that natural languages are quite a bit further from computer languages than we like to think.
I thought we were talking about likeness, not substitutability.
DECLARE FUNCTION q_rsqrt (float number) THAT RETURNS float AS ( .... );
And no programmer will want to write in such a language because it's so verbose.
I don't mind making programming languages look more like natural languages; I would offer Python as an example of a language that takes advantage of this where it makes sense.
But I draw the line at having to type boilerplate; if it's absolutely necessary, at least keep it to an absolute minimum.
I wouldn't say that is "natural language like." I would just say that is a very loquacious programming language. I contend the C equivalent of that is more readable. You're using the same sort of constructs in typical programming languages but still lacking anything like the kind of fluid reading and understanding one would expect from the promise of English like programming languages.
If that is the promise, it will never be met, for the very simple reason I mentioned above: natural language is defined by how the speakers individually use the language. Computer languages are defined by specification. That kills the flexibility you are asking for right there, and that's why you can never program in English. It isn't just a question of imprecision. NULL in SQL is more imprecise than most of the things folks might argue about in English... The problem is how you define correct usage and correct behavior.
The only promise that English-like languages can ever fulfil is that they will be easier to learn for English-speakers than languages which are not.
This is the best, most succinct and apt summary of the situation I've ever seen: "I contend that there is no sufficiently high level of abstraction where natural language would be more suitable."
I'm reminded of Paul Graham's remark about Prolog being more high level but not more terse, and that this had something to do with its present-day situation. Natural language is high level because it is terse and humans can supply a great deal of omitted information on their own. Teaching the computer how to supply that information is an incredibly error-prone burden. Teaching people how to use precise English seems to be worse than teaching people how to use a precise language that isn't English.
I think you're missing the practical use cases of natural language programming, at least those in the next decade or so.
"Hello Computer! I need an inverse square root function. It should take a float as a parameter and..." - Not practical
"Hello Computer! I need this image to pan across the screen. It should take one second. When it gets there, I want it to stop and become clickable. When you click it..." - Very practical
The practical solution here could even be iterative...
"No computer, no! The image should ease into position, not just go straight there!"
The point is you wouldn't rely on it for precise logic, just "good enough" behavior. You can definitely argue that this isn't programming though.
Well, we start to split hairs here, although my over-the-top "Hello computer!" skewed what I was getting at. If NLProgramming gains ground, there will likely be some level of AI behind it, but not necessarily a "true" AI. And it's not what I want... it would put me out of work ;)
I can't help but sense this leading right back to the bad old days of PowerBuilder and Visual Basic. The whole idea behind 4GLs was that you "tell" the computer what you want it to do in high-level concepts. The problem is that the 90% of the program it makes easy was easy before, while the 10% that remains is effectively impossible for the non-programmer, and the programmer will balk at all the extra work needed to squeeze it in.
but you still require a hyperdefined subset of English to make that work, and such a parser cannot take into account dialectal and language drift issues.
But at that point you run into another problem, namely that you can interact with computers in natural language but I don't think you can program them, because once you have that level of AI, you can't guarantee that two computers will have the exact same understanding of the command any more than you can expect that two programmers will read a spec in the exact same way.
What this means is that the consistency gained in programming languages would be lost. A computer program would no longer generate repeatable results.
Interesting point, but a counterpoint is that an AI for such purposes could either be deterministic (if it was running on local machines), or it could be a single highly creative and nondeterministic AI running "in the cloud" so to speak. There are also cases where absolute determinism isn't important, like in some games.
I think both of our points are somewhat moot though, as I don't see NLP code as something that you would necessarily put up on github. I see it as playing a similar role to DSLs or metaprogramming nowadays, and similarly, it wouldn't always be a good solution.
Simple use case: I recently finished an HTML5 puzzle minigame. The layout is pretty simple and has to adjust to window resizes, but the layout can't be done with straight HTML/CSS. So, my resize function is 120 lines of dense javascript, despite being able to describe in a simple paragraph what I would want. Indeed, a simple paragraph is what was given to me in the first place to describe the layout.
I think the game example proves my point. You don't play a game by programming the game server. You play the game by interacting with the computer. These are fundamentally different areas.
All right. I program and I speak English all the time, how's that for credentials?
I mean this in the nicest possible way but this is like saying a whole lot and nothing at the same time. I think if I heard this as a verbal response to the original article, I would think the speaker just wants to hear himself speak.
You haven't really stated a counterpoint by which to justify any kind of disagreement. It seems to be a promissory note of "well, if the language conformed to some subset dialect like legalese, that could be great." I think there are very big problems for English-like programming languages that this sort of skirts over and dismisses. Just because you think people who consider a phrase like "Me ain't got no dollas" horrible grammar are fuddy-duddy "prescriptivists" doesn't really solve the problem of English-like programming languages.
> It seems to be a promissory note of "well, if the language conformed to some subset dialect like legalese, that could be great."
I don't think it would be great if regular English did so. Natural languages can't do so. If you do that you kill the language. My point is simply that we expect English to be like that and it's good that it isn't, but we can still leverage people's understanding of the language to help people learn programming faster, and moreover when languages do not do this they have more trouble gaining traction.