
Not sure how old this post is, 2014 by the looks of it, but modern declarative automation makes this redundant. Ansible, as an example, would abstract away the CLI commands and be very readable in very few lines.

The setup described in this post looks fragile, regardless of whether it's literate. I mean, we used to do these things, but we've moved on.



That's what ansible PR material would have you believe. In reality only very simple things can be translated into ansible/salt and even then you may hit one of their many bugs. If you try to do complex things even bash would be a better choice (although obviously not the best).

There's one thing ansible/salt are good at: make it easy for cheap replaceable devops to write tons of repetitive boilerplate yaml. Harder for them to shoot themselves in the foot than with a real programming language. But once you descend into jinja hell that's not very true either.

Also: newer isn't always better and all that.


I would second the opinion that the techniques demonstrated here are several years behind current best practices.

Automation is the right course, but this takes a very roundabout way toward producing an artifact that can be distributed and applied repeatably, compared to using something like an Ansible playbook that is applied to a VirtualBox VM using Vagrant's built-in Ansible provisioner.

I would recommend anyone reading this thread against recreating this, except as a proof of concept, and instead concentrate on a workflow more similar to this one:

https://medium.com/faun/building-repeatable-infrastructure-w...


Horses for courses, it depends on what you're doing, but I've built Multi-£Bn production systems for the last decade on Puppet then Chef then Ansible (and now none of those) and I can't say that Ansible is very buggy these days nor does it fail at complexity. I mean, we found that provisioning with Packer using Ansible as the engine worked perfectly fine, nothing to do with PR material. Newer definitely isn't better, I would always go with reliable, battle-tested tech, but the old ways are largely pointless for most clients and developers who have long left behind artisan infrastructure. If tons of complexity is needed that baffles Ansible/Salt etc, I would be looking at the architects askew TBH...


> If tons of complexity is needed that baffles Ansible/Salt etc, I would be looking at the architects askew TBH...

Simple clean architectures can be deployed with anything. I judge a tool by its ability to "make the easy things easy, and the hard things possible".

But you do have a very good point here:

> the old ways are largely pointless for most clients and developers who have long left behind artisan infrastructure

I don't like that it is this way, maybe I'm just old fashioned. But yes. This is the way it is.


At my employer, we use a sensible bi-modal approach, where every machine gets a base configuration applied through Puppet, and we consider this The Platform, and any per-machine (or per-use-case) changes should be applied through an Ansible repo.

Puppet lays down those things that the platform "promises"* to provide - syslog, time, auth, DNS, etc, and Ansible does application-specific things.

* - Not a strict promise in the Mark Burgess "Promise Theory" sense, but similar in thought.


I have experience with salt. Imperative vs declarative - it depends. Some stuff is simple (and the tutorials and introductions always are), other stuff is complex. For example I ended up writing imperative state in salt in a declarative manner. Was not easy. The solution I've found was to make one state predepend on another. The chain in the end was 8 items long if my memory serves me correctly.

On the other hand, in the imperative world you have other difficulties, because you have to check the current state first and decide what to do in different edge cases. Not sure which one would be more complex in the end.


I always read that ansible can only be used for very simple things. However I never read what things are too complicated with ansible. Can you share an example?


Ansible is good at putting configuration files in the right place and setting their appropriate values to what you want — provided it’s relatively simple, otherwise prepare yourself for complex templates — or installing software from a package manager, etc., etc. As soon as you go outside what its modules provide, it gets more difficult. Modules themselves are actually easy to write, provided you maintain Ansible’s (completely reasonable) maxim of idempotency. Sometimes, however, idempotency is hard to enforce.

The best example I’ve come across is building software from source. When there’s no package, you have to do this. Some may argue that this is not Ansible’s job, but I don’t see how it’s different from `apt install`, nor where else it would fit into the pipeline. Anyway, the way I solved this for example was by pinning the idempotency on the existence of build artefacts. Those are often software-specific, so it’s quite fiddly/non-general to find the right artefact.
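For what it's worth, a minimal sketch of that artefact-pinning idea. The source directory and the installed binary path are made up for illustration; the only real mechanism here is the `creates` argument, which makes Ansible skip the task when the named file already exists:

```yaml
# Hypothetical paths: /opt/src/foo holds the unpacked source,
# /usr/local/bin/foo is the artefact the build is expected to produce.
- name: Build and install foo from source unless the binary already exists
  ansible.builtin.shell:
    cmd: ./configure --prefix=/usr/local && make && make install
    chdir: /opt/src/foo
    creates: /usr/local/bin/foo
```

The fiddly part is exactly what you'd expect: the check only knows the file exists, not which version built it, so an upgrade needs a different artefact to pin on.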


Ansible should be distributing the software artifact not building it.

Package it as a deb as part of the CI in say jenkins and then apt install it with ansible.
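Roughly like this, as a sketch. The URL, package name and version are invented, and it assumes the CI job publishes the .deb somewhere the target hosts can reach:

```yaml
# Hypothetical artifact URL produced by the CI job.
- name: Install the foo package built by CI
  ansible.builtin.apt:
    deb: https://ci.example.com/artifacts/foo_1.2.3_amd64.deb
```

If the package is published to an apt repository instead, the same module with `name:` and `update_cache: true` would do the job.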


I'd be interested in how you solved this, if you don't mind sharing. I tried writing a role to build source RPMs and it kind of works after doing some really dirty regex tricks, but only barely. I kept feeling that there must be a better way that I wasn't seeing.


Don’t say I didn’t warn you ;) This is not used any more — and the repo’s history looks messed up — but it was my solution to the “configure, make, make install” dance in Ansible:

https://github.com/wtsi-hgi/hgi-systems/blob/master/ansible/...

The replies to my original post say this is not something Ansible should be doing. Fair enough :P I was young!


You should be building packages for the software, of course. If it’s some random internal app you’re chucking into a container it’s one thing, but if you’re deploying it across VMs or physical hosts you probably should put the effort into writing a rpmspec or whatever the hell Debian does (this is a large reason why I don’t do Debian-based systems, I have never been able to properly wrap my head around dpkg builds - there’s too much magic in debhelper and not enough documentation).


Here's an example: you can register the output of a task into a variable, but if you use the same registered variable for multiple tasks even skipped tasks overwrite the registered variable. Say you want either task A or task B to run depending on the OS version, and do something based on that result in task C. Easy enough, just make two different registered variables, and have a separate conditional check for each registered variable on task C. But then, say it's only one of task A, B, C, or D, and task Z needs to know which one ran along with some other conditional logic. The conditional logic gets really hairy at that point. A way around this is to use the same registered variable repeatedly, and save its output via set_fact after each task, if that task ran, to a second variable. The set_fact module will only define the second variable if the conditional is true, unlike the registered variables which get redefined regardless. Then you can check for just the second variable later in the play and see which of the four tasks defined it. But now your 4 tasks have become 8 tasks of workaround repetition. This is just one of the quirks that starts to come up when a playbook gets more complicated than install x, write template y, done.
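A stripped-down sketch of that workaround with two of the tasks. The echo commands and the version checks are placeholders, and the variable names `result`, `ran_task` and `ran_output` are made up:

```yaml
- name: Task A (placeholder command, older releases only)
  ansible.builtin.command: /bin/echo A
  register: result
  when: ansible_facts['distribution_major_version'] == "7"

# Only runs if task A was not skipped, so ran_task is set at most once
- name: Remember that task A ran
  ansible.builtin.set_fact:
    ran_task: A
    ran_output: "{{ result.stdout }}"
  when: result is not skipped

# Reuses the same registered variable; a skip here overwrites result too
- name: Task B (placeholder command, newer releases only)
  ansible.builtin.command: /bin/echo B
  register: result
  when: ansible_facts['distribution_major_version'] == "8"

- name: Remember that task B ran
  ansible.builtin.set_fact:
    ran_task: B
    ran_output: "{{ result.stdout }}"
  when: result is not skipped

- name: Task Z acts on whichever task actually ran
  ansible.builtin.debug:
    msg: "Task {{ ran_task }} ran and printed {{ ran_output }}"
  when: ran_task is defined
```

Two real tasks already need two bookkeeping tasks, which is the repetition I mean.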

Basically, it's better to stick with tasks that stand on their own. As soon as you stick complex task logic in it becomes kind of a Rube Goldberg machine. That being said, you can pull off quite a bit if you just test rigorously and write defensively.


Worst scenario for me was trying to implement a quasi hierarchical set of roles for a little over 1000 machines. Meaning I had grouping by datacenters, packages, configuration, domain, apps, etc, but also some differences. Inheritance sucks for both salt and ansible.

http://reclass.pantsfullofunix.net/index.html helped a lot, but it's still not perfect.


Was it over a single inventory file?

You can break it by app, like, having many git repos one per app and then use ansible galaxy to reuse code

Or you can try having a dynamic inventory that pulls metadata from say consul.

We’re in the ballpark of 2k machines and still happy!


I wanted a mono repo and I did manage to do it. But it was not elegant and I kind of pity whoever came after me and needed to maintain and extend that without me being there to explain the model.


This is for the experimentation phase before you abstract backwards into configuration for whatever system you're using.

Sometimes it's a lot easier to stay closer to the metal for figuring out what you actually need, then break out the declarative automation to make it truly reproducible.


I don't think "interspersing commands and their output with prose" is meant to provide a robust solution that should replace "automation tool like Puppet or Chef".

This looks to be a step-up from just manually tinkering with commands in some shell to figure out or explore what it is you need to declare.


Some kinds of declarative automation are just bad. Case in point: configuration management.

Ansible was written before we realized immutable infrastructure is the best way to go. Configuration Management tools craft system state dynamically like a drunk sculpting a Roman bust with a Louisville Slugger. You can get them to do what you want, but it takes a lot of work, and even then the outcome is uncertain. CM tools are often complex because a system whose state is constantly shifting requires complexity to handle it.

If instead you make immutable artifacts, you don't need complexity. A simple series of straight-forward commands in a single version-controlled file does everything you need. Dockerfiles seem immature at first because of how simple they are, but in practice they're much more reliable than Ansible, not to mention easier to support. Thus, non-declarative automation, when used with immutable infrastructure as code, trumps declarative.

Ansible is also only optionally declarative, which people miss just because it has that bastardized form of YAML for a config file. The simplest tasks, roles and playbooks work when executed sequentially, but not necessarily when out of sequential order. When you do make it super-duper-declarative, it can involve tons of confusing logic that makes it nearly impossible to understand, and is much more verbose (and complex) than a simple script. As soon as automation is more work and cost than the alternative, it should be ditched.


> Ansible, as an example, would abstract away the CLI commands and be very readable in very few lines.

If it's just a few lines can you give an example?


Sure:

    - name: install the latest version of Apache and MariaDB
      package:
        name:
          - httpd
          - mariadb-server
        state: latest


That's an ansible task to do something, but it's not the task solved in the article, or in any of the other linked examples.

Things that you haven't done before are harder than things you do all the time, and I would've reckoned trying to do them in ansible is much harder than doing it interactively, but I'd be interested in learning.

If I find an article like this[1], how do you translate it into the artefact I would want to keep in source control? In org-babel it's this[2], so what would that look like in ansible?

[1]: https://www.digitalocean.com/community/tutorials/how-to-set-...

[2]: http://www.howardism.org/Technical/Emacs/linux-iptables.org....

Once you've done it and you've documented it, I can see how you might translate that into ansible (or chef or whatever), but that's a different thing.


Modifying iptables rules is such a common operation that all configuration management languages would support them natively.

Using an iptables module for Ansible it is straightforward to write the rules as yaml. The documentation has a clear example.

However, having used several of these languages for over a decade and realizing this is a contentious issue within the community, I would actually avoid using these tools for individual rules. What I would do is dump the rules to a file, distribute the file, and let Ansible make sure the file matches what is running (by way of a regular rule load when necessary).

The idea here is that I am already familiar with iptables rules and how to write them, and would expect any other ops-ish person to be the same. The source file matches the output, and any historical diffs will be much more straightforward to read, as there are no intermediary source formats that can change.

Also, there is one less Ansible dependency involved, and less syntax to learn (given that one can already read iptables rules).
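A rough sketch of the file-based approach, not a definitive setup. The file locations and the handler name are assumptions (the destination path is the one Debian's iptables-persistent uses, and the directory is assumed to exist):

```yaml
- hosts: all
  become: true
  tasks:
    - name: Distribute the iptables rules file kept in source control
      ansible.builtin.copy:
        src: files/rules.v4          # hypothetical file, e.g. output of iptables-save
        dest: /etc/iptables/rules.v4
        owner: root
        group: root
        mode: "0600"
      notify: Reload iptables rules

  handlers:
    - name: Reload iptables rules
      # Runs only when the copy task reports that the file changed
      ansible.builtin.shell: iptables-restore < /etc/iptables/rules.v4
```

Note that this reloads only when the distributed file changes; it will not notice rules that someone edited by hand on the host.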


That's pretty much what I expected; I think that's pretty common, and that's exactly what the author claims to do (use this process for discovery and learnings, then write some guff for chef once they know what they're doing).


It was common to hear that you can only automate what you do manually, but these days a lot of the underlying tasks have changed. Iptables is an absolute doddle to automate, TBH, and the entire security coverage that Iptables offered has often been superseded by security groups in AWS, sidecar proxies with service meshes, etc, so the whole paradigm has changed. What I meant about the article being quite old, is not just that the solutions are old, but the problems being solved have gone away, or should have. Ok, I did come across a client at the beginning of the year doing it the old way, but they were an Internet Exchange around since 1994, so I understood that.


I'm not sure I'm following exactly.

Are you saying that everything (or almost everything) worth automating is already automated by someone else so learning how to do new things isn't important?


> - name: install the latest version of Apache and MariaDB

That's not a name, that's an intention-describing comment; why isn't it called that (or, for brevity's sake, just "comment")? That might be nit-picking, but IMHO such misnomers cause unnecessary confusion and at the very least make the tool harder to learn.


That’s the name of the task you’re running.


And the same thing in Chef:

    package %w(httpd mariadb-server)


I think this is really cool and I'm tempted to try it out for my own VPS so I can easily set up a fresh one if I need to.

Modern declarative automation is great for a lot of things, especially at work and for CI/CD pipelines, but it's no substitute for learning what you need to do to reproduce the same setup by hand.


How do you document how you call ansible within ansible if I may ask?


Not OP, but on my teams in the past a lot of it (not to the level of the article) was captured in shell scripts with the invocations.

Inspired by https://github.blog/2015-06-30-scripts-to-rule-them-all/


We have a convention that the call should be just "ansible-playbook play.yml" and everything else should live inside ansible. This way there's no need to document how to run the playbook.
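As a sketch of what that can look like (the group, role and variable names are hypothetical): defaults live in the play itself, and the inventory path is set in ansible.cfg under `[defaults]` as `inventory = ...`, so no `-i` or `-e` switches are needed on the command line.

```yaml
# install_foo.yml, run as: ansible-playbook install_foo.yml
- hosts: foo_servers        # hypothetical group from the inventory named in ansible.cfg
  become: true
  vars:
    foo_version: "1.2.3"    # default lives here instead of in a -e switch
  roles:
    - foo                   # hypothetical role
```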


Would you say that this can be compared to what leetrout commented with the difference that instead of scripts there is only a single command?

Is the playbook always called "play.yml"?

And how do you handle multiple repositories (or is there only a single one)?


Yes, it's basically the same as what leetrout does.

And no, it is not always called play.yml but according to its function, e.g. "install_foo.yml".

We typically have all our code for one customer in one repository.


Yes, that clarifies a lot.

And yes, one repository per configuration (per customer code).

Is it that you have one inventory per customer configuration? Or how do you manage multiple inventories in this setup?


> Is it that you have one inventory per customer configuration? Or how do you manage multiple inventories in this setup?

We have different customers with different servers and applications. Every customer has its own repository with one or more inventories.


Thanks for your quick feedback.

And how do you share the context / documentation which yaml file is intended to be called by which ansible utility with which utility parameters and switches?

I ask b/c this is normally the problem I run into when things grow and I found the suggestion in the OP interesting to this regard as it adds the context while implementing, not documenting afterwards (and such documentation often becomes stale). Also documentation is often declarative (like do this, then that etc.) and it does not show the original thoughts/ideas behind a certain utility invocation.

I'd be interested to learn a bit more here, so if you could share a bit of context from your end would be nice.


> And how do you share the context / documentation which yaml file is intended to be called by which ansible utility with which utility parameters and switches?

Like I said: Ideally there are no switches and parameters - the playbook should work as-is. If there are switches, we document them in our internal confluence. And yes, our docs get stale, too. That is a problem that we did not fix yet. :)

Many playbooks also get executed automatically, so there's no need to remember parameters for them. You just start the CI job that then runs the playbook.


Despite being old, I think the value lies in the "principle": you don't have to generate your code out of your markup, but documenting your thoughts, reflecting on them, and recording which decisions were made and why is invaluable on its own, IMO.


Also, how does one SSH into Lambda or AppSync?

Current services don't even allow for some of the bad practices of the past.


Well, quite.

The process doesn't allow it, either - using Ansible with Packer (e.g.) to build immutable images means that contact between build and compute is very limited.

Installing software on a production machine? Get outta here.



