Getting Started with Ansible as a Fullstack Developer

Introduction

I don’t claim to be a DevOps expert. I know some basic ssh and scp, that’s all. Until now I have done all my server configuration by ssh’ing into instances and setting them up by hand. But nowadays I find myself spinning up and destroying servers more and more often, and doing that manually is a waste of time when a solution like Ansible already exists.

With that said, I’ll start this post with a piece of gyan (wisdom) that one of my friends recently gave me. It turned out to be a business management idea, but it applies to real life as well. If you really want something to succeed, ask yourself these questions:

What
Why
How
Where
Who
When

These questions will solidify your thought process. If you can answer them, you’ll know what you’re doing and what your end goal is, you’ll come up with a roadmap, and you’ll be more committed to it.

What: Learn Ansible.

Why: To save time by automating my server setup and package installations. Some of the things to automate: nginx, Jenkins, OpenVPN.

How: YouTube, Ansible Docs, The Internet.

Where: AWS Cloud

Who: Santosh Kumar

When: Before 2022 starts, approx 60 days in hand at the time of writing this.

Connect with me on LinkedIn. And now is the time to get our feet wet…

Getting Started

If you are reading this to learn Ansible with me, I must tell you that I expect some knowledge from you beforehand. Below are the prerequisites.

Working knowledge of SSH. If you are reading this post, you probably already have that. If not, try spinning up an EC2 instance and connecting to it first.
Working knowledge of Linux. I’m doing Ansible with Amazon EC2. Ansible can manage Windows hosts too, but I’m leaving it to some of you to journal your own journey with Windows.
Working knowledge of AWS infrastructure. At a bare minimum, you should be able to spin up instances and connect to them (and between them) via SSH.

It is better if you also know what a VPC, a subnet, roles, policies, permissions, security groups etc. are. If these words are new to you, please consider skimming through an AWS Certified Cloud Practitioner video.

For those who are ready, let’s start with some theory class first…

What is an inventory?

From dictionary:

inventory
noun

The stock of an item on hand at a particular location or business.
a detailed list of all the items on hand

In the context of Ansible, an inventory is a list of all the hosts. A host can be a server, a network device, or whatever else we intend to automate with Ansible.

The inventory file stores IP addresses or hostnames, optionally organized into groups (more on this in later sections). On top of that, it can also hold variables, and an inventory source can be static (a plain file) or dynamic (generated by a script or plugin).

The default location for the inventory on Linux is /etc/ansible/hosts.
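To make groups and variables a bit more concrete, here is a minimal sketch of a static inventory in the INI format. The address 192.0.2.10 and the key path are made up for illustration; ansible_user and ansible_ssh_private_key_file are standard Ansible connection variables. The script only writes the file into a scratch directory and inspects it with grep, so it runs even without Ansible installed.

```shell
# Work in a scratch directory so no real inventory is touched.
workdir=$(mktemp -d)

# Hypothetical static inventory: one group, a per-host variable and a group variable.
cat > "$workdir/inventory" <<'EOF'
[utility]
192.0.2.10 ansible_user=ec2-user

[utility:vars]
ansible_ssh_private_key_file=~/.ssh/example-key.pem
EOF

# Count the hosts: lines that start with an address rather than "[" or a variable name.
host_count=$(grep -c -E '^[0-9]' "$workdir/inventory")

# List plain group names, skipping "[group:vars]" style sections.
group_names=$(grep -E '^\[[^]:]+\]$' "$workdir/inventory" | tr -d '[]' | xargs)

echo "hosts: $host_count, groups: $group_names"
rm -r "$workdir"
```

Variables on the host line apply to that host only, while the [utility:vars] section applies to every host in the group.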

What is a Module?

Modules are executable bits of code. Think of them as tools. From the knowledge of Ansible I have right now, each module represents one kind of action (running a command, installing a package, managing a service) that Ansible can carry out on the remote host.

Example modules:

ping
command
yum
service

Some of the builtin modules can be found at: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/index.html

What is a Playbook?

Playbooks are YAML files which store the tasks that need to be performed. In the context of a playbook, a task is a module plus some extra info (more on this when we experiment).

Playbooks have a special syntax, which we will see when doing an example. In other words, playbooks are Ansible’s configuration, deployment and orchestration language.

Playbooks invoke Ansible modules. They are the instruction manual on how to use the modules, and they are run against the inventory, which is like the raw material.

What is a Role?

Roles are playbooks and related material organized in a named directory structure.

I have written about what roles are in the next iteration of this series. But if you are new to Ansible, I don’t see any reason for you to worry about roles yet.

Hello World with Ansible

For the sake of this tutorial, I have made this setup on AWS.

I have 2 hosts in the same subnet:

My control node is 10.10.10.10.
One of my managed nodes is 10.10.10.11.
For the sake of this tutorial, I’m using the same private key to connect to both instances, which is not the best practice. But for now, let it be.

I have installed Ansible on my control node with sudo amazon-linux-extras install ansible2, but the method differs among distros.
Although I could, I won’t touch the system config/inventory files. Instead, I take a copy of them and modify the copies in my workspace.

I do so like this:

$ mkdir -p ~/workspace/ansible
$ cd ~/workspace/ansible
$ cat /etc/ansible/hosts > inventory
$ cat /etc/ansible/ansible.cfg > ansible.cfg
$ ls
key.pem  inventory  ansible.cfg

What’s going on above? I have created a directory named workspace/ansible in my $HOME and cd’d into it. Next, I have copied the /etc/ansible/hosts file, which happens to have no host entries, into my workspace/ansible dir as inventory. I have done the same with the ansible.cfg file. Then I ls to show what I have in that directory.

For the sake of simplicity, as you can see, I have kept the private key (key.pem, which I got while spinning up the instance) in the same directory as all the other files.

Setting up the inventory

Let’s first go through the inventory file:

# This is the default ansible 'hosts' file.
#
# It should live in /etc/ansible/hosts
#
#   - Comments begin with the '#' character
#   - Blank lines are ignored
#   - Groups of hosts are delimited by [header] elements
#   - You can enter hostnames or ip addresses
#   - A hostname/ip can be a member of multiple groups

# Ex 1: Ungrouped hosts, specify before any group headers.

## green.example.com
## blue.example.com
## 192.168.100.1
## 192.168.100.10

# Ex 2: A collection of hosts belonging to the 'webservers' group

## [webservers]
## alpha.example.org
## beta.example.org
## 192.168.1.100
## 192.168.1.110

[utility]
10.10.10.11

[nonexistent]
10.10.10.12

# If you have multiple hosts following a pattern you can specify
# them like this:

## www[001:006].example.com

# Ex 3: A collection of database servers in the 'dbservers' group

## [dbservers]
##
## db01.intranet.mydomain.net
## db02.intranet.mydomain.net
## 10.25.1.56
## 10.25.1.57

# Here's another example of host ranges, this time there are no
# leading 0s:

## db-[99:101]-node.example.com

As seen above, the default hosts file comes with an exhaustive list of commented-out examples. You may want to truncate it and start fresh, but for now I have decided to keep it for future reference. On lines 26-27 of the file, I have created a group called utility and put the IP of my single managed host in it.

I have added an extra group called nonexistent. This is for demo and we’ll use it in upcoming sections. Please note that I do not have any host with that address. It is just a dummy IP address I’m using.
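The commented examples in the stock file also use range patterns such as www[001:006].example.com. As a rough illustration of what such a pattern stands for (this is plain shell with seq, not how Ansible expands it internally), the same names can be generated like this:

```shell
# Roughly what the inventory pattern www[001:006].example.com stands for:
# six hostnames, zero-padded to three digits because the pattern has leading zeros.
expanded=$(seq -f 'www%03g.example.com' 1 6)
echo "$expanded"

# The second pattern in the file, db-[99:101]-node.example.com, has no padding.
unpadded=$(seq -f 'db-%g-node.example.com' 99 101)
echo "$unpadded"
```

Both ends of the range are included, so the first pattern covers six hosts and the second covers three.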

Doing “Hello, World!” in Ansible

We do “hello world” in Ansible by using a module called ping. Let’s look at the following command.

ansible all -m ping

In the above command, we are saying: on all hosts, run the module (-m) called ping. Let’s run that command.

$ ansible all -m ping
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

At this point, nothing has happened. The ansible executable is looking for hosts in the system inventory file (/etc/ansible/hosts), which has no host entries at the moment.

Use local inventory file

To make it use my local inventory, I’ll pass it -i inventory. -i is short for --inventory (older releases call it --inventory-file, which is now a deprecated alias). Let’s run that and see the output.

$ ansible all -i inventory -m ping

10.10.10.11 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Permission denied (publickey).",
    "unreachable": true
}
10.10.10.12 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host 10.10.10.12 port 22: Connection timed out",
    "unreachable": true
}

Both hosts are UNREACHABLE! But if you look closely, they fail for different reasons. And if you have an eye for detail, you’ll notice that these are not Ansible errors; they are ssh errors.
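To spell out how I read those two messages, here is a small sketch of a shell function that maps the msg text to a likely cause. The function name, the categories and the wording are my own illustration, not something Ansible prints; the third case is a common ssh message that does not appear in the output above.

```shell
# Map the "msg" text from an UNREACHABLE result to a likely cause.
# The categories and wording are illustrative, not Ansible output.
classify_unreachable() {
    case "$1" in
        *"Permission denied"*)    echo "auth: host answered, but the key or user was rejected" ;;
        *"Connection timed out"*) echo "network: nothing answered on the SSH port" ;;
        *"Connection refused"*)   echo "service: host is up, but sshd is not listening" ;;
        *)                        echo "unknown: rerun with -vvv for the full ssh output" ;;
    esac
}

first=$(classify_unreachable "Failed to connect to the host via ssh: Permission denied (publickey).")
second=$(classify_unreachable "Failed to connect to the host via ssh: ssh: connect to host 10.10.10.12 port 22: Connection timed out")
echo "10.10.10.11 -> $first"
echo "10.10.10.12 -> $second"
```

The first host is an authentication problem and the second is a network problem, so they need different fixes.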

Using the private keys

Let’s look into 10.10.10.11 for now. The error message says: Failed to connect to the host via ssh: Permission denied (publickey). This means that Ansible is able to make a connection to the host, but is not able to authenticate with the current configuration.

There might be other ways to do this, but I’m gonna use the private key which I downloaded when launching the instance…

Let’s use that private key in conjunction with our command.

$ ansible all --private-key key.pem -i inventory -m ping

10.10.10.11 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
10.10.10.12 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host 10.10.10.12 port 22: Connection timed out",
    "unreachable": true
}

As you can see, one of our hosts has responded to our ping with pong. This is because we passed --private-key key.pem to ansible.

You can also make this setting permanent in the ansible.cfg file. I use sed to change the copy in my workspace:

sed -i 's@/path/to/file@~/workspace/ansible/key.pem@g' ansible.cfg
sed -i 's/#private_key_file/private_key_file/g' ansible.cfg

And then verify it.

$ grep private_key ansible.cfg
private_key_file = ~/workspace/ansible/key.pem

Now I can run the same command without --private-key key.pem.

$ ansible all -i inventory -m ping
10.10.10.11 | SUCCESS => {
[...output trimmed...]
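The same approach could also retire the -i inventory flag. Stock ansible.cfg files typically ship a commented-out inventory line under [defaults]; the two-line sample below is my assumption about that file, so check yours with grep first. The sketch works on a throwaway copy rather than the real config.

```shell
# Throwaway directory with a two-line stand-in for ansible.cfg (assumed content).
workdir=$(mktemp -d)
cat > "$workdir/ansible.cfg" <<'EOF'
[defaults]
#inventory      = /etc/ansible/hosts
EOF

# Uncomment the inventory setting and point it at the workspace copy.
# -i.bak keeps a backup file and works with both GNU and BSD sed.
sed -i.bak -E 's@^#inventory[[:space:]]*=.*@inventory = ~/workspace/ansible/inventory@' "$workdir/ansible.cfg"

inventory_line=$(grep '^inventory' "$workdir/ansible.cfg")
echo "$inventory_line"
rm -r "$workdir"
```

With that line in the real ansible.cfg, ansible all -m ping should pick up the workspace inventory as long as the command is run from the directory holding that ansible.cfg.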

Only execute modules on certain groups

Remember that our inventory file had two groups, named utility and nonexistent, and that I promised to demo them in an upcoming section? This is the time.

You can limit execution to a certain group.

ansible utility -i inventory -m ping

10.10.10.11 | SUCCESS => {
[...output trimmed...]

The all can be replaced with any other string which is a group in the inventory file. It was utility in this case.

Also, at this point, I want to point out that one host can be in several groups, and that’s completely normal.
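As a sketch of a host living in two groups, the script below writes a hypothetical inventory where 10.10.10.11 is listed under both utility and web, then uses awk to print every group the host appears in. The web group and the 10.10.10.13 address are assumptions for illustration only.

```shell
workdir=$(mktemp -d)

# Hypothetical inventory in which 10.10.10.11 belongs to both utility and web.
cat > "$workdir/inventory" <<'EOF'
[utility]
10.10.10.11

[web]
10.10.10.11
10.10.10.13
EOF

# Remember the most recent [group] header and print it whenever the host shows up.
groups_of_host=$(awk -v host="10.10.10.11" '
    /^\[/      { group = substr($0, 2, length($0) - 2); next }
    $1 == host { print group }
' "$workdir/inventory" | xargs)

echo "10.10.10.11 is in: $groups_of_host"
rm -r "$workdir"
```

With such an inventory, both ansible utility -i inventory -m ping and ansible web -i inventory -m ping would reach 10.10.10.11.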

Hello World with ansible-playbook

Note: I have removed the nonexistent group (i.e. 10.10.10.12) from my inventory file from now on. If you are following this guide on your own and you see unreachability error for 10.10.10.12, then please remove this host from the inventory file.

We saw how we can execute commands on remote hosts with the ansible command. Now we are gonna do the same work, but with the help of an Ansible playbook. The command used to run a playbook is ansible-playbook. If you do Docker, you can relate ansible-playbook and the playbook we write here to docker compose and its docker-compose.yml file, which stores all the configuration. Another common thing between them? YAML.

I’m not comparing them on the basis of syntax, but on the basis of behavior: in both cases you can do the same thing either from the command line or from a YAML file. The benefit of the YAML file is that things are documented and staged.

A trivial Playbook

Let’s create a file in the current directory, name it mytask.yml, and write the following content to it (which I have stolen from the Ansible docs):

---
- name: My task
  hosts: all
  tasks:
    - name: Leaving a mark
      command: "touch /tmp/ansible_was_here"

We’ll learn more about the syntax after we have run this playbook. I have a total of 3 files in my ~/workspace/ansible now.

$ ls
mytask.yml  inventory  ansible.cfg

Let’s execute our mytask playbook now. This is the command I will be executing:

$ ansible-playbook -l utility -i inventory mytask.yml

Did you note the little -l utility thingy here? The long version of -l is --limit, which restricts the hosts we are executing this playbook on. --limit takes a host pattern, here the utility group. And then we have the same -i inventory we had before. Without -i inventory, ansible-playbook would look for hosts in the /etc/ansible/hosts file, which at the moment has no hosts in it, as we have already seen.

ansible-playbook takes in a YAML file. Get introduced to ansible-playbook in the Ansible docs, and don’t get overwhelmed by the example on that page.

Let’s run the playbook now.

Output of the playbook

$ ansible-playbook -l utility -i inventory mytask.yml

PLAY [My task] ******************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************
ok: [10.10.10.11]

TASK [Leaving a mark] ***********************************************************************************************
[WARNING]: Consider using the file module with state=touch rather than running 'touch'.  If you need to use command 
because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in
ansible.cfg to get rid of this message.
changed: [10.10.10.11]

PLAY RECAP **********************************************************************************************************
10.10.10.11                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

I can check whether the playbook actually ran by going into my host and checking if the /tmp/ansible_was_here file was created. I can do that over ssh itself by issuing the following command.

$ ssh 10.10.10.11 'ls -l /tmp/ansible*'
-rw-rw-r-- 1 ec2-user ec2-user 0 Oct 24 15:53 /tmp/ansible_was_here

Yes, it did. We successfully ran our first playbook.
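The warning in the output suggests the file module instead of running touch through command. Below is a minimal sketch of that variant. The file name mytask_file.yml is made up for this example, and the script only writes the playbook into a scratch directory so it can be inspected; it does not run Ansible.

```shell
workdir=$(mktemp -d)

# Same play as mytask.yml, but using the file module as the warning suggests.
cat > "$workdir/mytask_file.yml" <<'EOF'
---
- name: My task
  hosts: all
  tasks:
    - name: Leaving a mark
      file:
        path: /tmp/ansible_was_here
        state: touch
EOF

# Quick sanity check that both module arguments made it into the file.
touch_lines=$(grep -c -E 'path: /tmp/ansible_was_here|state: touch' "$workdir/mytask_file.yml")
echo "module argument lines written: $touch_lines"
rm -r "$workdir"
```

Such a file would be run with the same flags as before, e.g. ansible-playbook -l utility -i inventory mytask_file.yml. Keep in mind that state: touch updates the file's timestamps on every run, so the task keeps reporting changed.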

Anatomy of mytask.yml

Let’s look at our mytask.yml again.

---
- name: My task
  hosts: all

  tasks:
    - name: Leaving a mark
      command: "touch /tmp/ansible_was_here"

Please note that indentation matters when working with YAML files, just like it matters in Python’s syntax. With that said, I’ll go line by line:

---. This is not an Ansible thing, but a YAML thing. A YAML file can optionally start with --- and end with .... People at Red Hat chose this for readability. Learn more about this.
- name: My task. Before I even start on this line, I would like to point out that, like Python, Ansible heavily believes in readability and self-documentation. There are some things which Ansible can work without, but they are there for the sake of self-documentation, so that a new person on the team can benefit. This line simply documents the name of the play (this playbook has just one).
hosts: all. Do you remember utility and nonexistent from the Setting up the inventory section? You can put those groups here in place of all.
tasks:. This is a list of tasks. Each task should have a name and exactly one module.
- name:. This is the name of the task, as opposed to the name of the play we saw above. Please note that names are not mandatory but highly encouraged. Just imagine the above playbook output without any names. Wouldn’t you be lost in the output?
command:. command is one of the core modules of Ansible. It is used to execute arbitrary commands on remote hosts.

Installing packages on hosts

I’m eager to install nginx on my system. That was the reason I started learning Ansible.

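I went ahead and came up with this playbook, which uses 2 real-life modules called yum and service. I have a CentOS-like distro (Amazon Linux), thus yum. And I suppose service is agnostic to most distros. Note that the play targets a group called web, so the inventory needs a [web] group containing 10.10.10.11 for it to match any host.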

Installing nginx

I put the following code in a file called nginx.yml in the root of our ansible workspace.

---
- name: install and start nginx
  hosts: web

  tasks:
    - name: install nginx
      yum:
        name:
          - nginx
        state: latest
    - name: start nginx
      service:
        name: nginx
        state: started

And I tried to run it. And this happened:

$ ansible-playbook -i inventory nginx.yml

PLAY [install and start nginx] **************************************************************************************

TASK [Gathering Facts] **********************************************************************************************
ok: [10.10.10.11]

TASK [install nginx] ************************************************************************************************
fatal: [10.10.10.11]: FAILED! => {"changed": true, "changes": {"installed": ["nginx"], "updated": []},
"msg": "You need to be root to perform this command.\n", "rc": 1, "results": ["Loaded plugins: extras_suggestions, langpacks, priorities, update-motd\n"]}

PLAY RECAP **********************************************************************************************************
10.10.10.11                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

“You need to be root to perform this command”. Fair enough. What I’m trying to do with the nginx playbook is install nginx, which obviously requires administrative privileges.

After a bit of research, I found that I have to become root through privilege escalation. The way we do it in a playbook is by adding become: yes to the tasks which require it. In my case, both tasks required sudo, so I put it at the play level, i.e. just below hosts: web. Like so:

---
- name: install and start nginx
  hosts: web
  become: yes

  tasks:

And the playbook plays well:

$ ansible-playbook -i inventory nginx.yml

PLAY [install and start nginx] ***********************************************************************************

TASK [Gathering Facts] *******************************************************************************************
ok: [10.10.10.11]

TASK [install nginx] *********************************************************************************************
changed: [10.10.10.11]

TASK [start nginx] ***********************************************************************************************
changed: [10.10.10.11]

PLAY RECAP *******************************************************************************************************
10.10.10.11                : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

This was the most minimal playbook I could have run. Next, I’m going to do a little bit more and reverse proxy my Jenkins server using nginx.

Insider info: One of the reasons I’m doing Ansible is to configure a Jenkins server which is currently running on port 8080 without HTTPS, which is risky. First I want to configure nginx, then proceed with the HTTPS Anywhere certificate, and at the end I might end up setting up Jenkins with Ansible itself.

Dealing with config files: Copying the nginx.conf

You might be aware of a recent post I did about Enabling HTTPS on EC2 Hosted Website. I’m taking that work as a reference and proceeding from there.

Here is a modified version of the nginx.conf I’m going to use. This code is not the entire nginx.conf; the part I want you to look at is the location / block with proxy_pass (highlighted in the original post). You need to add that to the default configuration file. If in doubt, copy the entire gist code from here.

http {
    server {
        server_name  _;
        root         /usr/share/nginx/html;

        location / {
            proxy_pass http://localhost:8080/;
        }

        error_page 404 /404.html;
            location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }
}

Please note that Jenkins is already running on port 8080 on my nginx machine, and that is where I’m routing my incoming traffic to. If you don’t have Jenkins, you can run Python’s simple HTTP server with python3 -m http.server 8080 on the 10.10.10.11 host for simplicity and testing purposes.

I have to add a task for copying configuration to my playbook now:

  tasks:
    - name: install nginx
      yum:
        name:
          - nginx
        state: latest
    - name: copy config for jenkins routing
      copy:
        src: nginx.conf
        dest: /etc/nginx/nginx.conf
        mode: preserve
      notify: restart nginx
    - name: start nginx
      service:
        name: nginx
        state: started

  handlers:
    - name: restart nginx
      service:
        name: nginx
        # Without a state the handler runs but changes nothing (it just reports "ok").
        state: restarted

There are 2 new things going on here:

The copy module, which is used to copy files around. You can read more about what it does and what alternatives are available on the copy documentation page.
The new handlers: block, which works closely with the notify: keyword.

Handlers are similar to tasks in some sense (thus they sit at the same indentation level as tasks:). They differ in that a handler only runs when it is notified by some task, and it runs once at the end of the play, after the regular tasks.

You must also know that a handler is only notified if the notifying task actually changes something. That means that if the task has no changes to make, it won’t notify the handler. Also keep in mind that the handler’s service task needs state: restarted to really restart nginx; when no state is given, the handler runs but only reports ok, which is how the handler line reads in the captured output below. We will see the notification behavior with the help of an example now.
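Before running it for real, the notify rule can be sketched as a toy model in plain shell. This is not Ansible code and the function names are made up; it only mimics two rules: a task notifies its handler only when it reports changed, and a notified handler runs once after all the tasks.

```shell
# Toy model of notify/handlers in plain shell (not Ansible code):
# a task notifies its handler only when it reports "changed",
# and a notified handler runs once, after all tasks have finished.
notified=""
log=""

run_task() {
    # $1 = task name, $2 = result ("changed" or "ok"), $3 = handler to notify (optional)
    log="$log$2:$1;"
    if [ "$2" = "changed" ] && [ -n "${3:-}" ]; then
        notified="$3"
    fi
}

run_play() {
    # $1 = result of the copy task for this run
    notified=""
    log=""
    run_task "install nginx" "ok"
    run_task "copy config" "$1" "restart nginx"
    run_task "start nginx" "ok"
    if [ -n "$notified" ]; then
        log="${log}handler:$notified;"
    fi
    echo "$log"
}

first_run=$(run_play "changed")   # config differs, so the handler fires at the end
second_run=$(run_play "ok")       # nothing to copy, so the handler is skipped
echo "$first_run"
echo "$second_run"
```

Only the first line of output ends with a handler entry; the second run has nothing to restart.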

Let’s run our playbook with our latest addition:

$ ansible-playbook -l web -i inventory nginx.yml

PLAY [install and start nginx] ****************************************************************

TASK [Gathering Facts] ************************************************************************
ok: [10.10.10.11]

TASK [install nginx] **************************************************************************
changed: [10.10.10.11]

TASK [copy config for jenkins routing] ********************************************************
changed: [10.10.10.11]

TASK [start nginx] ****************************************************************************
changed: [10.10.10.11]

RUNNING HANDLER [restart nginx] ***************************************************************
ok: [10.10.10.11]

PLAY RECAP ************************************************************************************
10.10.10.11                  : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Handlers only run when a task changes state

Without actually doing anything, try running the playbook again:

$ ansible-playbook -l web -i inventory nginx.yml

PLAY [install and start nginx] ****************************************************************

TASK [Gathering Facts] ************************************************************************
ok: [10.10.10.11]

TASK [install nginx] **************************************************************************
ok: [10.10.10.11]

TASK [copy config for jenkins routing] ********************************************************
ok: [10.10.10.11]

TASK [start nginx] ****************************************************************************
ok: [10.10.10.11]

PLAY RECAP ************************************************************************************
10.10.10.11                  : ok=4    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

As you can see, the restart handler is not run this time. As homework, go ahead and modify the config file and run the play again.

Next Steps

Sadly, I’ll have to end this post here as it is getting too long. I have already written a Jenkins installation playbook; if you want to take a look and discover new modules, you can find it here:

https://gist.github.com/santosh/78183e403e6a98d263130ba52aa0f593

I’ve posted about configuring HTTPS on a wildcard basis in a separate post. It differs from the previous one in that it configures HTTPS not only for the domain, but also for all of its subdomains.

Conclusion

I know I have not covered everything in Ansible, and some of you might be sad about that. I didn’t cover roles, I didn’t cover templating, and I didn’t cover those giant folder structures.

My journey with Ansible does not end here. Ansible is not only radically simple, you can also extend its capabilities with a simple language called Python.

Please let me know how you feel about this post in the comment section. Feel free to share it with your network. And don’t forget to subscribe using the form below.