Feh. It's an interesting split between what I'll call the concisionists and the ...

loup-vaillant · on May 3, 2014

> As a concisionist, this change stinks.

Hardly. As thestinger argued so eloquently, legitimate uses of the ~ are few and far between: typically, recursive data structure definitions, which are rare even when you go Recursive Crazy. (Functional programming would even avoid unique pointers, since they prevent sharing.)

From an information theoretic perspective, giving a special syntax for such an uncommon use doesn't make sense. If you want your language to be concise, you want the short-cuts to be used for frequent cases. Before we give a special syntax to unique pointers, we should address everything that's more often used. That would turn Rust into APL.

Since we don't want such a combinatorial explosion of special symbols, we're back to the good old Huffman encoding: short cuts for the frequent use cases, more verbosity for the rare ones.

4bpp · on May 3, 2014

> legitimate uses of the ~ are few and far between: typically, recursive data structure definitions

Wait, don't you need it to instantiate the corresponding recursive data structures? As in, a typical cons-cell list would be created in a fashion like Cons(1, ~Cons(2, ~Cons(3, ~Nil))).

loup-vaillant · on May 3, 2014

That's an illegitimate example.

Exposing unshared nodes like that does no good. Just write a cons function that embeds the unique pointer; that will get rid of the tilde and make for an even better syntax than what you just showed.

4bpp · on May 3, 2014

Sorry, I don't think I understand; could you explain what you mean by "exposing unshared nodes"? Also, are you saying that for every self-referential constructor of an algebraic datatype, the correct thing to do is to write a boilerplate function that simply wraps the constructor and a call to whatever the boxing operation winds up being...? That seems somewhat odd.

loup-vaillant · on May 3, 2014

"Node" is the name I used to talk about the cons cells your singly linked list is made up of.

Now 2 things.

First, the tilde denotes unique pointers. Your example was a singly linked list. Singly linked lists have 2 important characteristics: adding or removing the first element is O(1), and the tails of those lists can be shared. With unique pointers, you can't share. By design. If you want several references to a "node", you need to use shared pointers, whose allocation is managed by reference counting or garbage collection. So you have a data structure whose only advantage is O(1) insertion and removal… of the head. That's not very useful to begin with, considering the absurd amount of heap allocation you need to do. Other data structures fare better (vectors, ring buffers…).
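To make the sharing point concrete, here is a minimal sketch in present-day Rust (not the 2014 syntax under discussion). `SharedList`, `sum`, and `two_lists_sharing_a_tail` are hypothetical names, and `Rc` stands in for the reference-counted shared pointer mentioned above. Two lists reuse one tail, which a unique pointer would not allow: moving the tail into the first list would leave nothing for the second.

```rust
use std::rc::Rc;

/// A singly linked list whose tails are reference-counted, so they can be shared.
/// (Hypothetical type, for illustration only.)
#[derive(Debug)]
enum SharedList {
    Cons(i32, Rc<SharedList>),
    Nil,
}

use SharedList::{Cons, Nil};

/// Sums the elements by walking the list from head to tail.
fn sum(list: &SharedList) -> i32 {
    match list {
        Cons(head, tail) => head + sum(tail),
        Nil => 0,
    }
}

/// Builds two lists that share the same tail [2, 3].
fn two_lists_sharing_a_tail() -> (SharedList, SharedList) {
    let tail = Rc::new(Cons(2, Rc::new(Cons(3, Rc::new(Nil)))));
    let a = Cons(1, Rc::clone(&tail)); // [1, 2, 3]
    let b = Cons(10, tail); // [10, 2, 3], reusing the same tail nodes as `a`
    (a, b)
}
```

Swapping `Rc` for `Box` in this sketch should make `two_lists_sharing_a_tail` fail to compile, because `tail` would have a single owner.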

Second, your example supposes the existence of a `cons()` function to begin with. If you really want to use unique pointers, you should write a function that accepts values, and wraps them in a pointer instead. That way, you can write `cons('A', cons('B', cons('C', empty)))`. There, no more pesky tildes: they have been factored into the definition of `cons()`. Less repeating yourself for the win.
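A minimal sketch of such a helper, written in present-day Rust with `Box` rather than the `~` sigil. The names `List`, `cons`, `empty`, and `len` are assumptions made for illustration; `empty` is a function here, so the call site reads `empty()` rather than `empty`.

```rust
/// A singly linked list that owns its tail through a unique pointer.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

/// Hypothetical helper: builds a cell and does the boxing internally,
/// so callers never write `Box::new` themselves.
fn cons<T>(head: T, tail: List<T>) -> List<T> {
    List::Cons(head, Box::new(tail))
}

/// Hypothetical helper: the empty list.
fn empty<T>() -> List<T> {
    List::Nil
}

/// Counts the cells in the list.
fn len<T>(list: &List<T>) -> usize {
    match list {
        List::Cons(_, tail) => 1 + len(tail),
        List::Nil => 0,
    }
}
```

With these definitions, `cons('A', cons('B', cons('C', empty())))` builds a three-element list without any boxing visible at the call site.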

> are you saying that for every self-referential constructor of an algebraic datatype, the correct thing to do is to write a boilerplate function that simply wraps the constructor and a call to whatever the boxing operation winds up being...?

Not quite. I'm saying the constructor itself should take care of the boxing operation. When devising a data structure, you generally know what it will be used for. Its memory allocation scheme should be a part of it. Hidden, if possible. For instance, unique pointers have value semantics. As such, they're an implementation detail. Leave them out of the interface. If it turns out you didn't need them after all, you can scrap them without breaking outside code.

Hmm, I guess that makes three…

stream_fusion · on May 3, 2014

Does this change pattern matching syntax of recursive data structures, if every node is an explicit box type? Could be cumbersome for matching parse-trees?

loup-vaillant · on May 3, 2014

I have read that it doesn't change a thing. Maybe it's even more concise.

From what I have read, there is a general mechanism for pattern matching: let the custom data structure implement a "pattern match" method or something, which is then implicitly called with the pattern matching syntax. The explicit box type would be no different.

My guess is, parse-trees would be just as easy to match.
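For reference, a minimal sketch of how matching on a boxed tree looks in present-day stable Rust, which may differ from what was planned in 2014. `Expr`, `eval`, and `adds_zero` are hypothetical names. Matching one level at a time works through references and deref coercion; looking inside a box from a nested position is done here with an explicit dereference, on the assumption that no stable pattern syntax sees through `Box` directly.

```rust
/// A tiny expression tree; recursive children sit behind `Box`.
/// (Hypothetical type, for illustration only.)
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Evaluates the tree. Matching on `&Expr` binds `lhs` and `rhs` as
/// `&Box<Expr>`, and deref coercion lets them be passed on as `&Expr`.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(lhs, rhs) => eval(lhs) + eval(rhs),
        Expr::Mul(lhs, rhs) => eval(lhs) * eval(rhs),
    }
}

/// Recognises `something + 0`. Inspecting the boxed right-hand side
/// takes an explicit dereference of the `&Box<Expr>` binding.
fn adds_zero(expr: &Expr) -> bool {
    match expr {
        Expr::Add(_, rhs) => matches!(**rhs, Expr::Num(0)),
        _ => false,
    }
}
```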

masklinn · on May 2, 2014

> As a concisionist, this change stinks.

There are two sides to it: it makes creating unique pointers slightly harder, but at the same time it makes (unnecessary) overuse of unique pointers slightly harder/less likely, and that looks to be a concern of the core team.

> Designing a language to be prolix just makes code more laborious to understand.

Does it make the code more laborious to understand though? The concept becomes easier to search for and it's easier to talk about it (both because box/boxing is a term of art, and because "box" is a single syllable).

pnathan · on May 2, 2014

There's a nuance to point #2. The right amount of concision expresses the situation perfectly to someone skilled in the art, without requiring further reading to grasp or without requiring unneeded information to be waded through.

(philosophy) A language should support concision to allow those skilled in the art to use their time effectively.

Alphasite_ · on May 2, 2014

I would argue that yes, it does. The examples I've seen of the change have become a mess of letters and I'm not too fond of it.

steveklabnik · on May 2, 2014

Could you cite some specific examples? I think that'd really help here.

Ygg2 · on May 2, 2014

Well if you had something like

     ~Vec<~Vec<~Vec[~T]>> (not 100% realistic example)

becomes

     Box< Vec< Box< Vec< Box< Vec[Box<T>]>>>>>

which really looks weird. It's like you have tire marks over your code.

dbaupp · on May 3, 2014

Sorry, but I think that example is so unrealistic as to not really be worth discussing.

Pacabel · on May 3, 2014

I wouldn't be so quick to dismiss it. Maybe that exact situation won't often arise, but in other languages like C++, Java and C# it's not at all uncommon to have nested collections.

dbaupp · on May 3, 2014

Yes sure; but it is rare to have indirection for every layer of a nested collection (i.e. each collection will essentially be a pointer to some data, with a little bit of metadata (like length & capacity for a Vec), so having Box<Vec<T>> is a pointer to a pointer to the data: essentially pointless!).

In other words, one would write

   Vec<Vec<Vec<T>>>

That might be considered ugly, but it's not ugliness caused by the `~` change.
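A small sketch of the "pointer to a pointer" argument in present-day Rust. The function names are hypothetical. It relies on the standard library's documented description of `Vec` as a (pointer, capacity, length) triplet, so the handle is three words wide, while `Box<Vec<T>>` is one more pointer in front of that handle.

```rust
use std::mem::size_of;

/// Size in bytes of a `Vec<T>` handle (pointer, capacity, length),
/// not counting the heap data it points to.
fn vec_handle_size<T>() -> usize {
    size_of::<Vec<T>>()
}

/// Size in bytes of a `Box<Vec<T>>`: a single pointer to the handle above.
fn boxed_vec_handle_size<T>() -> usize {
    size_of::<Box<Vec<T>>>()
}

/// Builds a small three-level nested vector with no explicit boxes anywhere.
fn nested() -> Vec<Vec<Vec<u8>>> {
    vec![vec![vec![1, 2], vec![3]], vec![vec![4, 5, 6]]]
}

/// Counts the innermost elements of a nested vector.
fn count_leaves(v: &[Vec<Vec<u8>>]) -> usize {
    v.iter().flatten().map(|inner| inner.len()).sum()
}
```

Each `Vec` level already stores its elements on the heap, so the nesting works without wrapping any layer in `Box`.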

Ygg2 · on May 3, 2014

Could the opposite happen? What if you have `~~` pointer or `~&~` pointer?

pcwalton · on May 3, 2014

I've never seen `~~` except for workarounds caused by the current lack of dynamically sized types (which is being fixed). `~&` is not very useful, as you'd be placing a stack-bounded lifetime on the heap (except for 'static, I suppose, but I've never seen that).

cgag · on May 2, 2014

Words are easier to grep for, but I'd rather read x = (2 * 2)/y instead of "x assign 2 times 2 divide y"

dbaupp · on May 3, 2014

Mathematical operators are well established with well known meanings, but the Rust-specific ~ is not.

cgag · on May 3, 2014

It would be after a little while of using the language; I just don't think it should optimize for new users.