Hacker Times

> Namespaces let you version data and unambiguously mix elements with the same (simple) name in the same document. Esp. the first point is necessary for long-term data archival.

How do namespaces help with versioning? That seems like a complete non-sequitur.

As for unambiguously mixing elements with the same simple name, I acknowledged that that's a theoretical possibility, but I've never seen it be important in practice.

> It can be compiled to strongly-typed DTOs for your language of choice. I.e., seamless, strongly-typed cross-language data exchange.

The tooling for that is very limited and ineffective, IME, to the point that you're better off writing some class definitions and generating XML or JSON serializers from those. There's a huge impedance mismatch between the kind of constraints that are natural to express in XML schema and the kind that are natural to express in programming languages.



> How do namespaces help with versioning? That seems like a complete non-sequitur.

They tell you how to interpret the data and which schema definition it conforms to. Elements `<a:MyElt>` and `<b:MyElt>` tell you explicitly how to interpret them. Without the namespace, you have to guess.
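To make the `<a:MyElt>` / `<b:MyElt>` point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The namespace URIs and the document are invented for illustration; the only thing it shows is that a namespace-aware parser reports the two elements under different fully qualified names even though the local name is identical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs standing in for two schema definitions.
DOC = """
<root xmlns:a="urn:example:schema:v1" xmlns:b="urn:example:schema:v2">
  <a:MyElt>old</a:MyElt>
  <b:MyElt>new</b:MyElt>
</root>
"""

def classify(xml_text):
    """Return (namespace URI, local name, text) for each child of the root."""
    root = ET.fromstring(xml_text)
    result = []
    for child in root:
        # ElementTree expands prefixes into the "{uri}local" form,
        # so the prefix letters themselves ("a", "b") do not matter.
        uri, _, local = child.tag[1:].partition("}")
        result.append((uri, local, child.text))
    return result

items = classify(DOC)
for uri, local, text in items:
    print(uri, local, text)
```

A consumer could dispatch on the URI part to pick the interpretation, which is the "no guessing" property claimed above.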

> The tooling for that is very limited and ineffective, IME, to the point that you're better off writing some class definitions and generating XML or JSON serializers from those. There's a huge impedance mismatch between the kind of constraints that are natural to express in XML schema and the kind that are natural to express in programming languages.

My experience is totally the opposite. If anything, XSD can express more constraints than most PLs will allow.


> They tell you how to interpret the data and which schema definition it conforms to. Elements `<a:MyElt>` and `<b:MyElt>` tell you explicitly how to interpret them. Without the namespace, you have to guess.

So you'd mix and match elements from different versions of the schema in the same document? Does that work? I've never seen that done and can't imagine how code would handle that unless it was via some very simple translation rules (in which case the value would be minimal).

(I've seen documents that use the (single) schema declaration as a way of declaring that they're version 3.0 or version 3.1, but there doesn't seem to be any practical advantage to that over something more lightweight like "_version": "3.0" at the start of the document).
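For comparison, the lightweight alternative could look roughly like the sketch below: a plain `"_version"` field at the top of a JSON document, used to pick a reader. The reader functions and the `name` / `full_name` fields are made up for the example; only the `"_version"` key comes from the comment above.

```python
import json

# Hypothetical per-version readers; the field names are invented.
def read_v3_0(doc):
    return {"name": doc["name"]}

def read_v3_1(doc):
    return {"name": doc["full_name"]}

READERS = {"3.0": read_v3_0, "3.1": read_v3_1}

def load(text):
    """Parse a JSON document and dispatch on its "_version" field."""
    doc = json.loads(text)
    version = doc.get("_version")
    if version not in READERS:
        raise ValueError(f"unsupported _version: {version!r}")
    return READERS[version](doc)

old = load('{"_version": "3.0", "name": "Ada"}')
new = load('{"_version": "3.1", "full_name": "Ada"}')
```

Nothing in the syntax stops unrelated data from also carrying a `"_version"` key, so this relies on convention rather than on the format.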

> If anything, XSD can express more constraints than most PLs will allow.

I don't actually disagree with this, but they're different constraints and it's not easy to convert between them losslessly. So it's very hard to use XSD as the source of truth and generate good, idiomatic versions of your constraints in the PL representation of your types. (It's also difficult to generate good, idiomatic versions of your PL constraints in XSD.)


> So you'd mix and match elements from different versions of the schema in the same document?

No, the use-case is having an archive of documents conforming to different schemas. Or another use-case: schema evolves during the system's lifetime and you don't want to / can't upgrade old data to new schemas.

And yes, I even mix and match different schemas in the same document: pre-parsed information is stored in "my" elements, whereas the original data source is stored as an extension in the XML, in its own namespace. So when the need arises for further processing/parsing, everything's already there in the document, with _the_ definitive source of truth. (Uninterpreted raw data)
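A rough sketch of that layout, again with Python's standard `xml.etree.ElementTree`. Every name here (the two namespace URIs, `Record`, `Amount`, `Original`) is hypothetical; the point is only that a descendant search qualified by namespace cannot confuse the interpreted value with the embedded raw one, even though both are called `Amount`.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs and all element names are made up for illustration.
NS = {"my": "urn:example:parsed", "src": "urn:example:vendor-feed"}

DOC = """
<my:Record xmlns:my="urn:example:parsed" xmlns:src="urn:example:vendor-feed">
  <my:Amount>12.50</my:Amount>
  <my:Original>
    <src:Record>
      <src:Amount>12,50 EUR</src:Amount>
    </src:Record>
  </my:Original>
</my:Record>
"""

root = ET.fromstring(DOC)

# Same local name, searched at any depth, but each query is scoped
# to one namespace, so the two result lists never overlap.
parsed = [e.text for e in root.findall(".//my:Amount", NS)]
raw = [e.text for e in root.findall(".//src:Amount", NS)]

print(parsed, raw)
```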

> idiomatic versions of your PL constraints in XSD

That way is very easy: no PLs support (the XSD equivalent of) foreign keys, so that's "solved". Structs and inheritance are directly expressible, and so are sum types from the languages that support them. Granted, XSD using sum types generates clumsy classes in PLs that don't.
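As an illustration of the sum-type correspondence, the sketch below pairs a hypothetical `xs:choice` fragment with a hand-written Python union. This is not the output of any generator; it only shows the shape of the mapping, with invented element names (`Shape`, `Circle`, `Square`).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Union

# The XSD side, kept as a string for reference only (hypothetical fragment).
XSD_FRAGMENT = """
<xs:element name="Shape">
  <xs:complexType>
    <xs:choice>
      <xs:element name="Circle" type="xs:double"/>
      <xs:element name="Square" type="xs:double"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
"""

@dataclass(frozen=True)
class Circle:
    radius: float

@dataclass(frozen=True)
class Square:
    side: float

# One alternative per branch of the xs:choice.
Shape = Union[Circle, Square]

def parse_shape(xml_text: str) -> Shape:
    """Hand-written reader: exactly one child, which selects the variant."""
    root = ET.fromstring(xml_text)
    children = list(root)
    if root.tag != "Shape" or len(children) != 1:
        raise ValueError("expected <Shape> with exactly one child")
    child = children[0]
    if child.tag == "Circle":
        return Circle(float(child.text))
    if child.tag == "Square":
        return Square(float(child.text))
    raise ValueError(f"unexpected alternative: {child.tag}")

shape = parse_shape("<Shape><Circle>1.5</Circle></Shape>")
```

In a language without unions or sealed hierarchies, the same choice typically ends up as one class with a nullable field per alternative, which is the clumsiness mentioned above.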


> No, the use-case is having an archive of documents conforming to different schemas. Or another use-case: schema evolves during the system's lifetime and you don't want to / can't upgrade old data to new schemas.

Right, I talked about that case - AFAICS the schema is acting as a basic version tag (which is worth having, but can be done much more simply).

> And yes, I even mix and match different schemas in the same document: pre-parsed information is stored in "my" elements, whereas the original data source is stored as an extension in the XML, in its own namespace. So when the need arises for further processing/parsing, everything's already there in the document, with _the_ definitive source of truth. (Uninterpreted raw data)

Embedding the original document sounds useful, but namespaces still seem vastly overengineered for that case - you'd presumably have a standard, well-defined place for the original document to go, so anything parsing/using your document knows about it and can just skip that node. I guess you get a little bit of value from being able to write xpaths that will never accidentally hit a node in the embedded document, but again that's something I've never seen actually be a problem in real life. Namespacing seems to be built to support the idea that you'd arbitrarily interleave nodes from multiple schemata, and that still seems like a solution in search of a problem.

> That way is very easy: no PLs support (the XSD equivalent of) foreign keys, so that's "solved". Structs and inheritance are directly expressible, and so are sum types from the languages that support them.

Oh? Can you point me at a good implementation for Haskell or especially Scala? (TBH I think if we're accepting that the PL is the source of truth for what the constraints are then we don't gain much from encoding more of them into schema versus just checking them after parsing, but every little helps).


> the schema is acting as a basic version tag

Except that it's syntactically separate so no other version tag can masquerade as yours. PLs have namespaces as a separate syntactic construct as well, and for a good reason.

> Embedding the original document sounds useful, but namespaces still seem vastly overengineered for that case

Quite the opposite, it's the simplest option. Everything (original and interpreted data) is kept together, and because of NSs, there's no danger of misinterpreting the one for the other.

> Can you point me at a good implementation for Haskell or especially Scala?

Not using those.

> I think if we're accepting that the PL is the source of truth

XSD can be processed to automatically generate parsing and checking code for any PL other than the original one.


> Not using those.

> XSD can be processed to automatically generate parsing and checking code for any PL other than the original one.

Well, where are the actual working implementations of these things that you're saying are possible? You say there are tools that have good conversions between XML schema and language sum types; what tools? (and if not in Haskell/Scala then what languages?) Because my experience is that you just don't get good idiomatic representations from the tools, and end up either maintaining the schema and the code in parallel manually, or autogenerating a "dumb" schema that's missing most of your validity constraints.


> and if not in Haskell/Scala then what languages

Java and C#.


Neither of those can really be said to have sum types.



