I think the world of service architecture is roughly divided into two camps: (1) people who still naively think that Rest/JSON is cool and schemas and databases should be flexible and "NoSQL" is nice and (2) people who (having gone through pains of (1)) realized that strong schemas, things like Thrift, Protobufs, Avro are a good thing, as is SQL and relational databases, because rigid is good. (Camp 1 is more likely to be using high-level dynamic languages like Python and Ruby, and camp 2 will be more on the strongly typed side, e.g. C/C++, Go, Java).
> people who still naively think that Rest/JSON is cool and schemas and databases should be flexible and "NoSQL" is nice and
Yes, and no.
Yes: REST / JSON is nice. I've used them widely as a kind of cross-platform compatibility layer. i.e. instead of exposing SQL or something similar, the API is all REST / JSON. That lets everyone use tools they're familiar with, without learning about implementation details.
The REST / JSON system ends up being a thin shim layer over the underlying database. Which is usually SQL.
No: databases should NOT be flexible, and "NoSQL" has a very limited place.
SQL databases should be conservative in what they accept. Once you've inserted crap into the DB, it's hard to fix it.
"NoSQL" solutions are great for situations where you don't care about the data. Using NoSQL as a fast cache means you (mostly) have disk persistence when you need to reboot the server or application. If the data gets lost, you don't care, it's just a cache.
> SQL databases should be conservative in what they accept. Once you've inserted crap into the DB, it's hard to fix it.
You can make your schema very light and accepting, almost like NoSQL, which is how you get into the situation you described; the solution is a stricter schema. It also helps to hire a full-time data engineer/administrator.
> Using NoSQL as a fast cache
I'd rather use caching technology, specifically designed for caching, like Redis or Varnish or Squid.
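Either way, the cache-aside pattern being described is only a few lines regardless of backend. A sketch where a plain dict stands in for a Redis client (in real use you'd swap in `redis.Redis()` with `GET`/`SETEX` for a TTL; `load_user` and the key format are hypothetical):

```python
import json

cache = {}  # stand-in for a Redis client; this data is disposable by design

def load_user(user_id):
    # Hypothetical slow path: in reality, a SQL query against the real store.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)     # cache hit: cheap
    user = load_user(user_id)      # cache miss: go to the database
    cache[key] = json.dumps(user)  # with Redis, SETEX here gives you a TTL
    return user
```

If the cache is wiped, `get_user` still works; losing it costs latency, not data, which is exactly the "you don't care, it's just a cache" property.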
At one time I implemented services that honoured the robustness principle - "be liberal in what you accept, and conservative in what you send". However, I have found that if you are liberal in what you accept, other services can end up dependent on quirks of your liberal acceptance. More recently I have become a believer in the "be conservative in both" strategy. Rigid is good.
Camps 1 and 2 are not mutually exclusive (well, except your inflammatory "naively" comment).
Rest/JSON is a well understood, broadly adopted, low friction RPC format.
NoSQL is not always MongoDB (for example, Google Datastore is ACID compliant), and I would argue schema enforcement via an ORM layer is actually a good thing, as it provides schema validation at compile time.
The longer a database has existed, the more likely somebody in the company wrote something crucial that accesses it without your knowledge and without going through your ORM (usually because your ORM isn't implemented for the language they're using, or it emits bad queries for their use case). Sanity checks that aren't enforced by the database can't be relied on to be up to date or even happen at all.
Rest/JSON is not an RPC; technologies like Thrift can be used for RPC. And "low friction" depends on how you measure it: JSON is schemaless, which can be great for prototyping, but as soon as deployed services get out of sync in how they communicate, it becomes more of a burden than an advantage. Thrift, protobuf, and Avro enforce schemas and can raise exceptions on communication mismatches, so less defensive programming is needed to check JSON responses. For internal service communication, I really think using a schema-enforcing communication protocol is a good thing.
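A toy illustration of the difference in Python (the `OrderCreated` message and its fields are hypothetical): with raw JSON, a peer renaming a field surfaces as a `KeyError` or a `None` somewhere deep in a handler, while even a hand-rolled typed decoder fails loudly at the boundary — the property Thrift/protobuf/Avro give you for free, across languages:

```python
import json
from dataclasses import dataclass

@dataclass
class OrderCreated:
    order_id: int
    amount_cents: int

    @classmethod
    def from_json(cls, raw: str) -> "OrderCreated":
        data = json.loads(raw)
        # Reject unknown/missing fields and wrong types at the boundary,
        # instead of sprinkling defensive checks through every handler.
        if set(data) != {"order_id", "amount_cents"}:
            raise ValueError(f"schema mismatch: {sorted(data)}")
        if not all(isinstance(data[k], int) for k in data):
            raise ValueError("type mismatch")
        return cls(**data)

ok = OrderCreated.from_json('{"order_id": 7, "amount_cents": 1250}')

try:
    # A peer service renamed the field; schemaless JSON won't tell you here.
    OrderCreated.from_json('{"orderId": 7, "amount_cents": 1250}')
except ValueError as e:
    error = str(e)
```

An IDL-based stack generates this boundary code for both sides from one definition file, so the two services cannot silently disagree about the schema.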
In the absence of a good reason not to use Python (and there are many cases where one should not), I use it quite a bit myself. And usually (absent good reason) I use it with PostgreSQL. SQLAlchemy is a wonderful thing.
Granted, I am neither a database nor ORM savant, but I find that it makes explicit almost as easy as implicit - but with more safety! I haven't seen that elsewhere, but I haven't looked very hard either. I have heard claims that Groovy/Hibernate do this just as well, but it isn't clear to me that this is completely true.
Conceptually, everything is a Schema and then you generate Changesets for DB operations like insert, update, etc. and apply various validations and transformations as a chain of function calls on the input map. No such thing as a model anymore. It fits really nicely with a data > functions mindset.
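That chain-of-functions idea translates to any language. A minimal Python sketch (the `Changeset` shape and the validators are hypothetical, loosely modeled on that design):

```python
from dataclasses import dataclass, field

@dataclass
class Changeset:
    data: dict                          # the input map being transformed
    errors: list = field(default_factory=list)

    @property
    def valid(self):
        return not self.errors

def cast(cs, allowed):
    # The "schema" step: drop anything outside the allowed keys.
    cs.data = {k: v for k, v in cs.data.items() if k in allowed}
    return cs

def validate_required(cs, keys):
    for k in keys:
        if k not in cs.data:
            cs.errors.append(f"{k} is required")
    return cs

def update_user(params):
    cs = Changeset(dict(params))
    cs = cast(cs, {"email", "age"})
    cs = validate_required(cs, ["email"])
    return cs  # an insert/update would run only if cs.valid

cs = update_user({"email": "a@b.com", "admin": True})  # "admin" is dropped
```

There is no model object holding state; each DB operation gets its own pipeline of plain functions over plain data.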
Is your first example really "naive" though? In my experience, loose, flexible schemas and dynamic languages are very well suited to rapid early-stage development, much more so than rigid languages and schemas.
Sure, in the long term things should be refactored, structured, and optimized. But if you do that too soon you risk locking yourself out of potential value, as well as gold-plating things that aren't critical.
> Sure, in the long term things should be refactored, structured, and optimized.
How often does that really happen though? Once you've amassed enough technical/data debt, resistance to refactoring increases until it never happens at all. Having well defined, coherent data models and schemas from the start will pay off in the long run. Applications begin and end with data, so why half-ass this from the get go?
Assuming you're not omniscient you'll be refactoring regardless. The difference is whether you'll be paying as you go (clients want a JSON API, we need to add new columns to a table but it'll lock rows) or if you'll be taking on technical debt to be repaid in the future (turns out Mongo sucks and we would do much better with Cassandra for serious horizontal scaling).
I believe that if you aren't extremely certain about what the future holds it may be best to work with a more flexible technology first and transition to a more structured setup once you have solved for your problems and identified intended future features. And if you are extremely certain about what the future holds you're either insanely good at your job or just insane.
I think the key is to do regular refactoring as you go -- it has to be an ingrained part of the process. It's really a management issue. Not every company/project has the foresight to budget for this, of course. If a team can't or won't regularly improve their infrastructure, then yes, a more structured approach would probably be better for anything that needs to last.
Of course there are other considerations. A more "planned" structure always makes sense if you're talking about systems or components that are life-critical or that deal with large flows of money. The "fast and loose" approach makes the most sense when you can tolerate occasional failures, but you have to have fast iterations to be quick-to-market with new features.
In my experience the likelihood of your scenario (increasing resistance to refactoring) is almost always inversely proportional to the amount and quality of refactoring tools available.
500KLOC JVM/.NET application? No big deal.
50KLOC JS/HTML-based SPA? Pfooooh. That could take a while...do we really need to?
This is the key point I think. Flexibility is very important when a project is immature. And as it matures it benefits from additional safety. Too bad so few tools support that transition.
Camp 3: those of us who think "shove something, anything that 'works' out as fast as possible" is a recipe for disaster, irrespective of the transport format or service infrastructure.
I suppose folks in this camp would overlap significantly with those in camp 2, though.
REST/JSON makes sense between a human-driven browser and a middle tier; between machines, SOAP is the best contract-first technology, or at least a custom XSD-backed XML schema.
I don't understand the conflation of Rest/JSON with NoSQL. The most popular frameworks for the dynamic languages you cite all use SQL with strong schemas out of the box. Come to think of it, I'm not sure why we're even associating Rest with JSON specifically, or any content type for that matter, when one of the major points of Rest is content negotiation via media types. You can even implement a messaging concept with Protobufs in a strictly "rest-like" fashion.
You're missing the point where you have a flexible, schemaless design while you're speccing out the service and seeing whether it's even necessary, viable, and going to be needed long-term as part of a long-lived system.
Properly encoding rigidity through type systems, SQL checks/triggers/conditions, etc is hard. It takes a really long time to really iron out all the string and integer/double/long typing out of your system, let alone do it in a way which matches up properly with your backing datastore. Once you've got it set up and nailed down with tests, then you're golden, but that's a cost that is usually not worth paying until long-term need is determined.
(3) people who ponder what's the best solution for a given situation? Sometimes you want dynamic languages, or schema-less data. Sometimes you want rigid interfaces and verifiability.
E.g. rolling out a public API in Thrift/protobuf will severely hinder its adoption, whereas Rest/JSON is pretty much the standard - but then building microservices that communicate with each other in Rest/JSON quickly leads to a costly, hard-to-maintain, inconsistent mess.
We're obsessed with "one fits all" absolutes in tech. We should have more "it depends" imho
I've worked on both types of projects. My previous project used Scala on the backend and Node for the public API. Thrift was used to maintain persistent, reusable connections between the public API and the backend, and to enforce strict typing on message values.
What I really liked about Thrift is that all I needed to know how to use the service was the thrift definition file. It was self-documenting.
My current project uses HTTP and JSON to communicate from the public API to the backend services. There is significantly more overhead (latency and bandwidth) and no enforced document structure (moving toward Swagger to help with that).
HTTP+JSON is great for the front-end, where you need a more universally parsable response, but when you control the communication between two systems, something like Thrift/Protobuf solves a lot of problems that are common with REST-ish services.
No No No. This is like asking which is the better tool - a nail or a screw? Since everyone is so familiar with SQL, they are overlooking very good reasons to use NoSQL.
'NoSQL' is just a broad term for datastores that do not normally use standard Structured Query Language to retrieve data. Most NoSQL stores do allow for very structured data, as well as query languages that are similar to SQL.
BigTable-style stores (HBase, Cassandra, Dynamo) and key-value/object stores (Amazon S3, Redis, Memcached) are absolutely critical to cloud services. JSON document DBs are needed for mobile and messaging apps. Graph databases are for relationships, and MarkLogic has an awesome NoSQL DB focused on XML.
Full disclosure: I am the founder of NoSQL.Org - but I also use multiple relational SQL databases every day.
Although companies like Snowflake are bridging the gap between the two camps, with products that can natively ingest JSON and Avro and make it queryable via SQL, allowing for joins etc.
It's possible to use SQL to retrieve data from a text file; that's not the same as using SQL backed by an actual relational database engine (and a well designed schema) that has been perfected over many decades.
1. a new business evolving quickly to discover the product that fits the market it is chasing.
2. an established business optimizing for a possibly still-growing market, with a very well established set of features and use cases. It can take longer to deliver new features and can save lots of money by optimizing.