> ... until programming languages reach the
> expressiveness of written English, I’ll keep
> welcoming them.
English is so inherently ambiguous that you are really asking about legally expanded lawyerese, not English. And that's going to be several times longer than the source code.
I agree there's space for improving languages, and I believe Google's new "Go" probably isn't that big a step, but I think asking for the "expressiveness of English" is dangerous nonsense.
Thanks. You're right; when I did the experiment myself (once with a Haskell program, once with a C++ one), the results definitely looked like lawyerese. Surprisingly, though, they were still shorter than the program. Maybe that just means I'm a bad Haskell/C++ coder, but I don't think that's all of it.
One point that I may not have properly gotten across is that I'm not suggesting that programming languages should approach English in form. I was just trying to establish English as a lower bound for the amount of information required to communicate an idea. (It's not even a lower bound, since English is fairly redundant, but it's an upper bound on the lower bound if you know what I mean)
If I said "until programming languages reach the information density of written English", do you think that would make more sense?
I don't have time to write an essay about this now, and I'm not sure I'd be the right person to do so anyway, but here are some thoughts I believe to be relevant.
Real Information Theory, as opposed to Data Transmission Theory, has to take into account the receiver. The transmitter has to have a model of the receiver, and then devise the data that will transform the receiver according to the intent of the transmitter.
For example, if you're a mathematician and I talk about the topological space obtained by gluing the edge of a Möbius strip to the edge of a disk being the same topological space as the one obtained by identifying antipodal points of S^2, you'll know what I mean. If not, I have to explain further.
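For the record, that statement is compact precisely because the receiver already carries the definitions; written out symbolically (with M the Möbius strip and D^2 the disk) it is just

    M \cup_{\partial} D^2 \;\cong\; S^2/(x \sim -x) \;\cong\; \mathbb{RP}^2

and everything else needed to make sense of it is already in the mathematician's head.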
Similarly, a computer program can start with a top level description of your intent, and a competent programmer can then complete the steps.
But a computer is not a programmer. When you describe your algorithm in sufficient detail for a programmer, you haven't given enough information for someone who isn't a programmer. A non-programmer will look at what you've written and then say "How do I do that?" Thus you need further instructions.
So you might say "We sort this vector of numbers by picking one at random, scanning down the vector to separate it into those that are smaller, equal and larger, then recurse." and a sufficiently skilled programmer who's never met Quicksort can now implement it.
But there are details missing.
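To make that concrete, here is a minimal Python sketch of that one-sentence description, with some of the unstated details filled in one particular way (the base case, and how "separate" is actually done):

    import random

    def quicksort(xs):
        # Detail the description leaves implicit: a vector of zero or
        # one elements is already sorted.
        if len(xs) <= 1:
            return xs
        pivot = random.choice(xs)  # "picking one at random"
        smaller = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        larger = [x for x in xs if x > pivot]
        # "then recurse" -- on the smaller and larger parts only.
        return quicksort(smaller) + equal + quicksort(larger)

Even this short version has made choices the English never mentions: it returns a new list instead of sorting in place, and it scans the vector three times rather than once.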
Your programs should (perhaps) be arranged so that the main routine is sufficient for a very skilled programmer to complete. Each routine it calls is then sufficient for a slightly less skilled programmer to complete, and so on recursively.
The problem is that the terminating case is the computer, and not a programmer, or even a person. Thus the terminating case is a long way down.
Perhaps there is a language design that will bring the terminating case higher.
That's true, but it seems these are practical limitations, not theoretical ones.
The nice thing about the terminating case being a computer, though, is that it's a moving target (Moore's law, etc.). At the extreme case, a compiler could model every molecule in a human brain and use the simulated intelligence to convert the description into a language that it can compile.
I would be crazy to expect that any time soon, but I think it shows that in theory we can do better than existing programming languages do.
By the way, I re-worded the last sentence to "as long as programming languages are less expressive than written English, I'll keep welcoming them.", because I don't think it's inevitable that programming languages can become as expressive as natural languages. I do think that as long as they aren't, we can do better.
In my opinion expressiveness doesn't exist objectively.
With that motivation for language design (achieving the expressiveness of English), I'm afraid you're falling into a self-centered opinion; English is expressive only for those who have learned it since childhood. The hierarchy of programming languages derives from the fact that extending those languages is a lot of work (reimplementing the parser), compared with extending a natural one (metaphors, or neologisms). That's why, I think, a more general language (Lisp, Smalltalk) is better. They don't give you expressiveness, which comes only from semantics, but at least they don't narrow those semantics. By comparison, English is a very ambiguous language which seems more expressive because its semantics are huge. Which leads to the problem that, even if you built an artificial brain, you'd have to teach that brain an ontology. Ultimately, the root of the problem is not found within programming languages, but in an inertia against extending those which exist.
This ignores the fact that even if you explain quicksort to a random person and give them a list of a few thousand numbers they can only look at one at a time, you will invariably find that most people will get it wrong most of the time. Just because you think you can imagine the execution of an algorithm given its description doesn't mean any random human brain can; it doesn't mean even you could, if pressed, execute a really complicated program without errors.
Compared with what modern computers manage, the error rates humans get on most "algorithmic" problems are scary.
What you are describing is the ideal model teacher: someone who adapts to exactly the highest level of understanding the recipient is capable of, thereby reducing the amount of information that needs to be transferred to a minimum (fastest learning) while still ensuring that the message arrives intact.
Comparing a three-year-old as the recipient (instead of a computer), the difference seems to lie entirely in the recipient's ability to absorb and apply contextual information.
> Real Information Theory, as opposed to Data Transmission Theory, has to take into account the receiver. The transmitter has to have a model of the receiver, and then devise the data that will transform the receiver according to the intent of the transmitter.
It seems that, in general, for any two entities, the level of abstraction used for communication between them should be the highest common level of abstraction that both entities can operate at.
Take your example of the mathematician. If he is using an application that understands what a topological space or a Möbius strip is, then it seems he can use those terms in the language he uses to communicate with the computer.
Some computers understand what a vector is. One can refer to vectors in the language used to communicate with that machine. For example a command like "multiply vector a and vector b" would seem reasonable. However, if the machine is just a bare x86 box without any operating system, it will not "understand" what a vector is, so one has to "talk" to it in terms of registers, memory, I/O ports and so on.
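As a rough illustration (using NumPy as a stand-in for a machine that already "understands" vectors; the low-level version is what spelling it out element by element amounts to):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])

    # High-level receiver: the vector concept is built in, so one line
    # of intent is enough (element-wise multiplication).
    c = a * b

    # Low-level receiver: no notion of a vector, so every step has to
    # be spelled out -- roughly what "talking" in terms of registers
    # and memory amounts to.
    c_manual = [0.0] * len(a)
    for i in range(len(a)):
        c_manual[i] = a[i] * b[i]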
This idea of course leads into DSLs. A machine can rise very high on the abstraction level scale today but the abstraction will be confined to a very narrow domain usually.
English's 'information density' is mainly due to the inference ability of the target. A human being infers an incredible amount of nuance from words and the tone/context in which they are delivered.
Programming languages are formal specifications of instructions to a machine. They have to be incredibly deterministic/unambiguous because the machine does no inference of its own. Even with a so-called 'high level language' that handles a lot of under-the-hood stuff like memory allocation and garbage collection for you, there are no intuitive leaps. If you wrote the algorithm in English, but made sure to be rigidly unambiguous, you'd probably find the English version to be not so short.
I think what you meant was that there is a huge gap between spoken languages and programming languages, feature-wise.
To make the argument more convincing, don't use English. The grammar is ambiguous. Lojban, a created language, is not. It is being used as a spoken language (though it's not very popular yet). Still, it is used in a way that is fundamentally different from how a computer program is used.
One of the major issues is the concept of object definitions. Object-oriented languages have a very specific meaning for what an "object" is, while spoken languages do not. Rather than creating a specification for what data and functions an object contains, spoken languages group objects by properties. Ambiguity aside, it's a problem of whether you're defining it top-down, or bottom-up. Prototypes are closer to the way we use language for object definitions, but it's still not quite the same.
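A hedged sketch of that contrast in Python (the particular classes and properties here are invented for illustration):

    # Top-down, OO style: an "object" is exactly what the class
    # specification says it is -- these fields, these methods.
    class Bird:
        def __init__(self, name):
            self.name = name

        def fly(self):
            return self.name + " flies"

    # Closer to how spoken language works: things are grouped by the
    # properties they happen to share, not by a declared specification.
    penguin = {"name": "penguin", "has_feathers": True, "can_fly": False}
    sparrow = {"name": "sparrow", "has_feathers": True, "can_fly": True}
    birds = [t for t in (penguin, sparrow) if t["has_feathers"]]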
Another issue is how the language is used. A random portion of source code is completely nonsensical to a computer, while a random paragraph of a book can still be meaningful. That portion of source code could make sense to a programmer, maybe enough to reconstruct the rest of it, but no computer can process it.
So, to go back to the original point, programming languages and spoken languages are both supposed to convey ideas. Spoken languages do a much better job, hands down. OP used length as a rough measure, but the idea is valid. The question is how can we make computers understand spoken languages in a meaningful way. Answer that, and you'll have a better programming language.
There will always be a need for better programming languages, with the current situation making new languages more important than ever. We have computers we can't completely utilize at the moment and they're becoming more parallelized (and thus more difficult to work with, using current tools) by the second.
Despite the progress that's been made, computing is still a very young field and there's a whole lot of room to grow. We need better languages, better compiler technology, better kernels, better paradigms for handling tasks (processes and threads don't cut it), and above all else, we need new ideas.
If people aren't experimenting with new ideas, we won't move forward. While most of the implementations (literal implementations or specific language designs) will fail, the good ideas will carry on and support the next generation. We should all welcome innovation in this space, even if we don't like what's being created.
It seems that eventually after a particular programming paradigm (for ex. parallel, functional etc.) establishes itself, only a few programming languages will rise to the top that happen to express the ideas in that paradigm in a more clear and concise way.
For example, imagine describing a sorting algorithm to someone, and you decide that it should be done in an imperative and sequential way (not functionally, and without any parallelism involved). Describing it in English is too ambiguous and verbose, x86 assembly is probably too precise and too hardware-specific, and a Turing machine program is too theoretical and too rigorous. So then you make something up, and I think eventually it looks similar to Python, Ruby or Pascal.
However, in practice the language itself is not worth much outside academia unless it comes with a solid library. At the end of the day, the user of the language will have to open network connections, render web pages, access databases and draw GUIs. If the language doesn't let them do that, then no matter how elegant it is, it will remain just a toy language.
The interesting things don't happen in a particular paradigm but rather where multiple paradigms meet. Python is imperative and strongly object-oriented, but it also has many functional qualities. Ruby even more so, with great support for closures. Boo mixes imperative, object-oriented code with thorough metaprogramming, type inference, and a host of functional features. Nemerle is a functional language but folds object orientation cleanly into the mix. These are the places where you see interesting concepts coming to light, not in the pure-X-paradigm camps.
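A small, made-up illustration of the kind of mixing meant here, in Python since it's the first example given:

    # Imperative, object-oriented shell...
    class Order:
        def __init__(self, total):
            self.total = total

    orders = [Order(12.0), Order(55.0), Order(7.5)]

    # ...with functional qualities layered on top: first-class
    # functions, closures, and expression-oriented comprehensions.
    def over(threshold):
        return lambda order: order.total > threshold  # a closure

    big_totals = sorted(o.total for o in orders if over(10.0)(o))
    # big_totals == [12.0, 55.0]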
As for libraries, this is becoming less and less of an issue. With .NET and the JVM, you have a single set of libraries that works with any compatible language and allows language designers to focus on what they're good at: the language. The effects of that can't be overstated; it's allowing for a huge amount of innovation with fairly little time investment and ease of adoption.
> As for libraries, this is becoming less and less of an issue. With .NET and the JVM, you have a single set of libraries that works with any compatible language and allows language designers to focus on what they're good at: the language.
Yes, the language doesn't have to come with its own libraries. It can piggy-back on some other libraries. If there were just one operating system with a good and stable API, the language might not need any libraries, just the ability to make system calls. So for Java and .NET, Sun and Microsoft already did the hard work as far as library code is concerned, so any language on those platforms is already miles ahead of a new language without any "batteries".
Interestingly enough, if an API is stable and well designed, a language could be created to take better advantage of it. I am currently looking at Vala; it basically started as a better language than C that takes advantage of GLib.
> However, in practice the language itself is not worth much outside academia unless it comes with a solid library
So you have to become popular enough to get a large enough community to broadly implement lots of libraries. This means the language can't be too innovative. It has to be Blub++. A lot of the old baggage of the older languages will be carried forward due to cultural expectations.
Isn't this just how human beings work, though? Isn't this exactly what we should expect given how people work?
Yes, we should continue producing programming languages, but the danger with a programming language from Google is that they have such a reputation that other programming languages that might be superior could be overlooked in favor of the trendy option.
That happens fairly often in general, but it's really not that big a deal. People misapply programming languages all the time, but it never seems to hamper innovation. Innovation tends to take place in the lesser adopted languages and eventually makes its way into the mainstream; Google putting this out won't really change anything in that respect.
However, once programming languages reach a certain maturity, the marginal benefits of a better language will be dwarfed by the effort it would require to utilize it.
> Now compare your description to the program’s source code. The source code is almost certainly several times longer, less intuitive, and less descriptive.
I tried this with a short Lisp program a few years ago and found it to be false. I may have been too detailed in my descriptions with the English version, but I don't think so. Evolved[0] languages for human communication are notoriously imprecise.
The "plain English" translation of a legal document will be much shorter than the original as well. Such language isn't used in legal documents because lawyers can and will argue over the interpretation. To make programming like everyday English, the runtime would have to sort out the ambiguities. I think that would require human-like AI.
Given human-like AI, much of what we do as programmers could be done by the AI instead. A non-programmer could simply ask the AI to create the program or provide certain kinds of output. Even in a situation like that, I suspect there would still be a demand for human programmers who could more precisely communicate their intentions to computers, if only for the purpose of creating better AIs[1].
[0]That means pretty much any language people actually speak. I'm not sure if it's true for designed languages like Esperanto.
[1]It might be a good idea to restrict AIs from certain kinds of self-modification or from creating new AIs. I think we've all seen that movie.
Of course we need a new programming language - but I don't think we need Go.
Here's my personal wishlist:
* Concurrency by default (I don't want to have to express explicitly which parts of my program can run in parallel).
* I want my programs to be able to run backwards natively (not that kind of pseudo backwards-debugging feature crap from MS). I personally call this feature an "entropy shield".
* A really pure functional language (not those "pure functional" languages that allow "non-functional functions" and then try to monad them away).
Such a language has to be completely time-agnostic - that means it will never "run" a program.
But - as always - nobody will know what I'm talking about ;-)
About 90% of the time in the better Smalltalk implementations, you can tell the debugger to forget the last few stack frames, or jump back to the "start" of the current one. Nearly unfettered replay-ability is almost as good as stepping backwards.
I agree - if you're focused on debugging.
But there's much more to it than that. "Running backwards" is not "replay". Running a clock program backwards will result in the hands turning anti-clockwise. Not to mention the possibilities opened up if program speed can be set anywhere between 100% and -100%.
I think most people think about that as they do with nearly every true innovation: "What should that be good for?"
My wishlist: a Perl successor that takes the functional path, not the OO path (like Ruby). Also, one that plays well with either C/C++/Unix or the JVM, for deployment practicality. I think I want Lua + ML in one.
I think we do. As PG says, programming languages are not born equal.
Different languages have different strengths and weaknesses. Some are particularly good at number crunching, others make working with text easy. Some provide rich, powerful type systems, others go for minimalism. Some languages are designed to enforce OO, others build on list/stack/tree data structures. Some go for functional purity, others are based on logic proving, others still are based on the flow of data instead of the application of functions. Some languages promote stacked hierarchical components, others promote flat hierarchies instead.
Until we find that sweet spot, where all these different features are combined in a powerful, expressive, yet simple way, we will need to continue experimenting with new programming languages. As we gain experience in writing different kinds of software and solving different kinds of problems, we can model programming languages around these tasks to simplify them and build abstractions upon them.
Personally, I don't think our search for the ultimate language will end until we can find a nice balance between imperative and dataflow programming languages, since they seem to be the two most fundamentally different paradigms, yet some problems are better expressed in one and others in the other. (E.g., imperative is very good at expressing purely mathematical concepts as well as imposing order on computation, while dataflow is great at highly concurrent processing - until we can merge the two, I believe we will always have problems)
So yes, we do need more programming languages. Hopefully some day someone will create a multi-paradigm language which manages to gracefully balance the various pros and cons. Unfortunately, I think it could be a long while yet before an ultimate language is created.
Writing a program is maybe 10% what you want it to do, and 90% telling the computer how to do it. So I think the question of when we won't need another programming language can be answered as: when the default programming language of the day has libraries and/or syntax that grow to include every "how" that's been done before. This way you only have to specify the "what" unless you're doing something nobody's figured out how to do.
So I guess my idea of the limit programming language/environment would be that it's automatic (or at least trivially easy) to make 90% of every new program into a gem, and then automatic/trivial to find the gem that does exactly what you need, as well as to debug it. (Maybe RubyGems are already this good; I haven't looked into them.)
The question of "Do we really need another programming language?" is one that comes up time and time again. The answer is often one of the following:
* No - We have languages that are Turing-complete, so new languages have no benefit.
* Yes - New languages can take trends that make their way into other languages in the form of libraries and build them directly into the language, making the source code cleaner.
* Yes - New hardware (e.g. multiprocessors) enables new features which can't be exploited by older languages.
* No - New languages segment the population, meaning we keep re-inventing the wheel by writing the same libraries over and over again.
What about looking at the situation from Google's point of view?
You need your engineering staff to use a language like Python instead of C or C++ for multiple reasons. C's too dangerous and C++ is too slow and full of traps. But Python doesn't compile efficiently into native code (forget Swallow for now).
My hunch is that Google will move their internal staff to coding in this language for newer projects, and not really care who else out there is using it.