I think the world of service architecture is roughly divided into two camps: (1) people who still naively think that Rest/JSON is cool and schemas and databases should be flexible and "NoSQL" is nice and (2) people who (having gone through pains of (1)) realized that strong schemas, things like Thrift, Protobufs, Avro are a good thing, as is SQL and relational databases, because rigid is good. (Camp 1 is more likely to be using high-level dynamic languages like Python and Ruby, and camp 2 will be more on the strongly typed side, e.g. C/C++, Go, Java).
> people who still naively think that Rest/JSON is cool and schemas and databases should be flexible and "NoSQL" is nice and
Yes, and no.
Yes: REST / JSON is nice. I've used them widely as a kind of cross-platform compatibility layer. i.e. instead of exposing SQL or something similar, the API is all REST / JSON. That lets everyone use tools they're familiar with, without learning about implementation details.
The REST / JSON system ends up being a thin shim layer over the underlying database. Which is usually SQL.
No: databases should NOT be flexible, and "NoSQL" has a very limited place.
SQL databases should be conservative in what they accept. Once you've inserted crap into the DB, it's hard to fix it.
"NoSQL" solutions are great for situations where you don't care about the data. Using NoSQL as a fast cache means you (mostly) have disk persistence when you need to reboot the server or application. If the data gets lost, you don't care, it's just a cache.
> SQL databases should be conservative in what they accept. Once you've inserted crap into the DB, it's hard to fix it.
You can make your schema very light and accepting, almost like NoSQL, which is how you get into the situation you described; the solution is a stricter schema. It also helps to hire a full-time data engineer/administrator.
> Using NoSQL as a fast cache
I'd rather use caching technology, specifically designed for caching, like Redis or Varnish or Squid.
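Either way, the cache-aside pattern being described is only a few lines regardless of backend. A sketch where a plain dict stands in for a Redis client (in real use you'd swap in `redis.Redis()` with `GET`/`SETEX` for a TTL; `load_user` and the key format are hypothetical):

```python
import json

cache = {}  # stand-in for a Redis client; this data is disposable by design

def load_user(user_id):
    # Hypothetical slow path: in reality, a SQL query against the real store.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)     # cache hit: cheap
    user = load_user(user_id)      # cache miss: go to the database
    cache[key] = json.dumps(user)  # with Redis, SETEX here gives you a TTL
    return user
```

If the cache is wiped, `get_user` still works; losing it costs latency, not data, which is exactly the "you don't care, it's just a cache" property.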
At one time I implemented services that honoured the robustness principle - "be liberal in what you accept, and conservative in what you send". However, I have found that if you are liberal in what you accept, other services can end up dependent on quirks of your liberal acceptance. More recently I have become a believer in the "be conservative in both" strategy. Rigid is good.
Camps 1 and 2 are not mutually exclusive (well, except your inflammatory "naively" comment).
Rest/JSON is a well understood, broadly adopted, low friction RPC format.
NoSQL is not always MongoDB (for example, Google Datastore is ACID compliant), and I would argue schema enforcement via an ORM layer is actually a good thing, as it provides schema validation at compile time.
The longer a database has existed, the more likely somebody in the company wrote something crucial that accesses it without your knowledge and without going through your ORM (usually because your ORM isn't implemented for the language they're using, or it emits bad queries for their use case). Sanity checks that aren't enforced by the database can't be relied on to be up to date or even happen at all.
Rest/JSON is not an RPC; technologies like Thrift can be used for RPC. And "low friction" depends on how you measure it: JSON is schemaless, which can be great for prototyping, but as soon as deployed services get out of sync in how they communicate, it becomes more of a burden than an advantage. Thrift, protobuf, and Avro enforce schemas and can raise exceptions on communication mismatches, so less defensive programming is needed to check JSON responses. For internal service communication, I really think using a schema-enforcing communication protocol is a good thing.
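A toy illustration of the difference in Python (the `OrderCreated` message and its fields are hypothetical): with raw JSON, a peer renaming a field surfaces as a `KeyError` or a `None` somewhere deep in a handler, while even a hand-rolled typed decoder fails loudly at the boundary — the property Thrift/protobuf/Avro give you for free, across languages:

```python
import json
from dataclasses import dataclass

@dataclass
class OrderCreated:
    order_id: int
    amount_cents: int

    @classmethod
    def from_json(cls, raw: str) -> "OrderCreated":
        data = json.loads(raw)
        # Reject unknown/missing fields and wrong types at the boundary,
        # instead of sprinkling defensive checks through every handler.
        if set(data) != {"order_id", "amount_cents"}:
            raise ValueError(f"schema mismatch: {sorted(data)}")
        if not all(isinstance(data[k], int) for k in data):
            raise ValueError("type mismatch")
        return cls(**data)

ok = OrderCreated.from_json('{"order_id": 7, "amount_cents": 1250}')

try:
    # A peer service renamed the field; schemaless JSON won't tell you here.
    OrderCreated.from_json('{"orderId": 7, "amount_cents": 1250}')
except ValueError as e:
    error = str(e)
```

An IDL-based stack generates this boundary code for both sides from one definition file, so the two services cannot silently disagree about the schema.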
In the absence of a good reason not to use Python (and there are many cases where one should not), I use it quite a bit myself. And usually (absent good reason) I use it with PostgreSQL. SQLAlchemy is a wonderful thing.
Granted, I am neither a database nor ORM savant, but I find that it makes explicit almost as easy as implicit - but with more safety! I haven't seen that elsewhere, but I haven't looked very hard either. I have heard claims that Groovy/Hibernate do this just as well, but it isn't clear to me that this is completely true.
Conceptually, everything is a Schema and then you generate Changesets for DB operations like insert, update, etc. and apply various validations and transformations as a chain of function calls on the input map. No such thing as a model anymore. It fits really nicely with a data > functions mindset.
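That chain-of-functions idea translates to any language. A minimal Python sketch (the `Changeset` shape and the validators are hypothetical, loosely modeled on that design):

```python
from dataclasses import dataclass, field

@dataclass
class Changeset:
    data: dict                          # the input map being transformed
    errors: list = field(default_factory=list)

    @property
    def valid(self):
        return not self.errors

def cast(cs, allowed):
    # The "schema" step: drop anything outside the allowed keys.
    cs.data = {k: v for k, v in cs.data.items() if k in allowed}
    return cs

def validate_required(cs, keys):
    for k in keys:
        if k not in cs.data:
            cs.errors.append(f"{k} is required")
    return cs

def update_user(params):
    cs = Changeset(dict(params))
    cs = cast(cs, {"email", "age"})
    cs = validate_required(cs, ["email"])
    return cs  # an insert/update would run only if cs.valid

cs = update_user({"email": "a@b.com", "admin": True})  # "admin" is dropped
```

There is no model object holding state; each DB operation gets its own pipeline of plain functions over plain data.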
Is your first example really "naive" though? In my experience, loose, flexible schemas and dynamic languages are very well suited to rapid early-stage development, much more so than rigid languages and schemas.
Sure, in the long term things should be refactored, structured, and optimized. But if you do that too soon you risk locking yourself out of potential value, as well as gold-plating things that aren't critical.
> Sure, in the long term things should be refactored, structured, and optimized.
How often does that really happen though? Once you've amassed enough technical/data debt, resistance to refactoring increases until it never happens at all. Having well defined, coherent data models and schemas from the start will pay off in the long run. Applications begin and end with data, so why half-ass this from the get go?
Assuming you're not omniscient you'll be refactoring regardless. The difference is whether you'll be paying as you go (clients want a JSON API, we need to add new columns to a table but it'll lock rows) or if you'll be taking on technical debt to be repaid in the future (turns out Mongo sucks and we would do much better with Cassandra for serious horizontal scaling).
I believe that if you aren't extremely certain about what the future holds it may be best to work with a more flexible technology first and transition to a more structured setup once you have solved for your problems and identified intended future features. And if you are extremely certain about what the future holds you're either insanely good at your job or just insane.
I think the key is to do regular refactoring as you go -- it has to be an ingrained part of the process. It's really a management issue. Not every company/project has the foresight to budget for this, of course. If a team can't or won't regularly improve their infrastructure, then yes, a more structured approach would probably be better for anything that needs to last.
Of course there are other considerations. A more "planned" structure always makes sense if you're talking about systems or components that are life-critical or that deal with large flows of money. The "fast and loose" approach makes the most sense when you can tolerate occasional failures, but you have to have fast iterations to be quick-to-market with new features.
In my experience the likelihood of your scenario (increasing resistance to refactoring) is almost always inversely proportional to the amount and quality of refactoring tools available.
500KLOC JVM/.NET application? No big deal.
50KLOC JS/HTML-based SPA? Pfooooh. That could take a while...do we really need to?
This is the key point I think. Flexibility is very important when a project is immature. And as it matures it benefits from additional safety. Too bad so few tools support that transition.
Camp 3: those of us who think "shove something, anything that 'works' out as fast as possible" is a recipe for disaster, irrespective of the transport format or service infrastructure.
I suppose folks in this camp would overlap significantly with those in camp 2, though.
REST/JSON makes sense between a human-driven browser and a middle tier; between machines, SOAP is the best contract-first technology, or at least a custom XSD-backed XML schema.
I don't understand the conflation of Rest/JSON with NoSQL. The most popular frameworks for the dynamic languages you cite all use SQL with strong schemas out of the box. Come to think of it, I'm not sure why we're even associating Rest with JSON specifically, or any content type for that matter, when one of the major points of Rest is content negotiation via media types. You can even implement a messaging concept with Protobufs in a strictly "rest-like" fashion.
You're missing the point where you have a flexible, schemaless design while you're speccing out the service and seeing whether it's even necessary, viable, and going to be needed long-term as part of a long-lived system.
Properly encoding rigidity through type systems, SQL checks/triggers/conditions, etc is hard. It takes a really long time to really iron out all the string and integer/double/long typing out of your system, let alone do it in a way which matches up properly with your backing datastore. Once you've got it set up and nailed down with tests, then you're golden, but that's a cost that is usually not worth paying until long-term need is determined.
(3) people who ponder what's the best solution for a given situation? Sometimes you want dynamic languages, or schema-less data. Sometimes you want rigid interfaces and verifiability.
E.g. rolling out a public API in Thrift/protobuf will severely hinder its adoption, whereas Rest/JSON is pretty much the standard - but then building microservices that communicate with each other in Rest/JSON quickly leads to a costly, hard-to-maintain, inconsistent mess.
We're obsessed with "one fits all" absolutes in tech. We should have more "it depends" imho
I've worked on both types of projects. My previous project used Scala on the backend and Node for the public API. Thrift was used to maintain persistent, reusable connections between the public API and the backend, and to enforce strict typing on message values.
What I really liked about Thrift is that all I needed to know how to use the service was the thrift definition file. It was self-documenting.
My current project uses HTTP and JSON to communicate from the public API to the backend services. There is significantly more overhead (latency and bandwidth) and no enforced document structure (moving toward Swagger to help with that).
HTTP+JSON is great for the front-end, where you need a more universally parsable response, but when you control the communication between two systems, something like Thrift/Protobuf solves a lot of problems that are common with REST-ish services.
No No No. This is like asking which is the better tool - a nail or a screw? Since everyone is so familiar with SQL, they are overlooking very good reasons to use NoSQL.
'NoSQL' is just a broad term for datastores that do not normally use standard Structured Query Language to retrieve data. Most NoSQL stores do allow for very structured data, as well as query languages that are similar to SQL.
BigTable-style stores (HBase, Cassandra, Dynamo) and key-value/object stores (Amazon S3, Redis, Memcached) are absolutely critical to cloud services. JSON document DBs are needed for mobile and messaging apps. Graph databases are for relationships, and MarkLogic has an awesome NoSQL DB focused on XML.
Full disclosure: I am the founder of NoSQL.Org - but I also use multiple relational SQL databases every day.
Although companies like Snowflake are bridging the gap between the two camps, with products that can natively ingest JSON and Avro and make it queryable via SQL, allowing for joins etc.
It's possible to use SQL to retrieve data from a text file; that's not the same as using SQL backed by an actual relational database engine (and a well designed schema) that has been perfected over many decades.
1. a new business evolving quickly to discover the product that fits the market it is chasing.
2. an established business optimizing for a possibly still-growing market, with a very well established set of features and use cases. It can take longer to deliver new features and can save lots of money by optimizing.