My Week 1 with Ansible: A Journey Through Errors and Fixes

Muskan Agrawal

I still remember opening my terminal with excitement on Day 1 of learning Ansible. I had read and heard enough about how it could automate everything, from installing packages to deploying apps. With that thought in mind, I wrote my very first playbook and ran ansible-playbook. I imagined the playbook would run smoothly and I’d be sipping coffee while Ansible worked like magic.

Instead, what I got was a wall of angry red error messages.

At first, it was frustrating. But then I realized each error was actually teaching me something important about how Ansible really works. So rather than giving up, I started documenting them, one by one. This post is a record of all my first-day failures, the reasons behind them, and how I fixed each one.


1. The SSH problem that stopped me immediately

My first attempt at running a playbook ended with a fatal message about SSH passwords and host key verification. Ansible was not happy at all with me trying to use passwords.

fatal: [node01]: FAILED! => {"msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this.  Please add this host's fingerprint to your known_hosts file to manage this host."}

Since sshpass cannot answer the interactive prompt that asks you to accept a new host key, I had two options: disable host key checking in ansible.cfg, or properly add the host’s fingerprint to my known_hosts file.

Because this was just a test environment, I went with the quick fix first by adding this to ansible.cfg:

[defaults]
host_key_checking = False

But the better long-term solution is definitely to use SSH keys and let Ansible connect securely without complaints. That was my first “ah-ha” moment.
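For reference, here is a minimal sketch of what a key-based setup could look like in a YAML inventory. The user name bob is borrowed from the error output elsewhere in this post and is only assumed to be the remote user as well; the file name and key path are hypothetical, and the matching public key is assumed to already be in that user’s authorized_keys on node01.

```yaml
# inventory.yaml (hypothetical file name)
all:
  hosts:
    node01:
      # Assumed remote user; replace with the account you log in as
      ansible_user: bob
      # Assumed key location; point this at your own private key
      ansible_ssh_private_key_file: ~/.ssh/id_ed25519
```

With something like this in place, passing the file via ansible-playbook -i inventory.yaml should let Ansible connect with the key, so sshpass never enters the picture.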


2. The conditional check that made no sense

Next, I tried to use a condition: when: ansible_host == node02. I thought it was logical. But Ansible didn’t agree and yelled:

fatal: [node02]: FAILED! => {"msg": "The conditional check 'ansible_host == node02' failed. The error was: error while evaluating conditional (ansible_host == node02): 'node02' is undefined. 'node02' is undefined\n\nThe error appears to be in '/home/bob/playbooks/nginx.yaml': line 6, column 9, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n   tasks:\n     -  service: 'name=nginx state=started'\n        ^ here\n"}

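It turns out I had written the hostname as a bare word, and in a when expression a bare word is read as a variable name, not a string. Ansible doesn’t know what node02 is unless it’s defined somewhere in my host_vars or passed in as an actual variable. Quoting it as "node02" would have made it a plain string to compare against.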

The fix was simpler than I expected. If I wanted a task to run on node02, I just needed to set hosts: node02 in the play, rather than writing a conditional. Sometimes trying to be clever only complicates things!
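A short sketch of both approaches, side by side. The nginx service task is reconstructed from the error output above. The second play is an assumption about what a working conditional could look like, and it swaps in inventory_hostname (the name the host has in the inventory) in place of ansible_host.

```yaml
# Option 1: target the host directly
- hosts: node02
  tasks:
    - name: Start nginx
      service:
        name: nginx
        state: started

# Option 2: keep a broader play and compare against a quoted string
- hosts: all
  tasks:
    - name: Start nginx only on node02
      service:
        name: nginx
        state: started
      # The quotes make node02 a string literal, not a variable lookup
      when: inventory_hostname == "node02"
```

Option 1 is usually easier to read. Option 2 only earns its place when the same play also has tasks meant for other hosts.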


3. Fighting with localhost

Another time, I tried running a playbook directly on my machine. Easy, right? I pointed Ansible to localhost, but instead of running, it failed miserably.

fatal: [localhost]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added 'localhost' (ED25519) to the list of known hosts.\r\nbob@localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}

This was eye-opening. Ansible’s default connection type is SSH, and a localhost entry in the inventory gets no special treatment, so Ansible tried to SSH into my own machine. That’s why I got the unreachable host error.

The fix was to tell Ansible explicitly that I wanted to run the task locally:

- hosts: localhost
  connection: local
  tasks:
    - name: Display resolv.conf
      command: cat /etc/resolv.conf

Once I added connection: local, everything ran smoothly. My localhost finally cooperated!
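The same setting can also live in the inventory, so that every play targeting localhost picks it up. This is a minimal sketch assuming a YAML inventory; the file name is hypothetical.

```yaml
# inventory.yaml (hypothetical file name)
all:
  hosts:
    localhost:
      # Run tasks directly on the controller without going through SSH
      ansible_connection: local
```

The trade-off is where the knowledge sits: connection: local in the play keeps the playbook self-explanatory, while ansible_connection in the inventory keeps the playbooks free of host-specific details.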


4. Permission Denied in /opt

Then came the dreaded "Permission denied" error when I tried to create a file under /opt/news/. The error message made it clear that my user didn’t have the right to touch that directory.

fatal: [node01]: FAILED! => {"changed": false, "msg": "Error, could not touch target: [Errno 13] Permission denied: b'/opt/news/blog.txt'", "path": "/opt/news/blog.txt"}

That’s when I discovered become: true. Adding it to the play turns on privilege escalation, so tasks run as root (through sudo by default), which is what creating files in protected directories like /opt requires.

- hosts: node01
  become: true
  tasks:
    - name: 'create a file'
      file:
        path: /opt/news/blog.txt
        state: touch
        group: sam

With one extra line, the task ran perfectly. This was the moment I realized that become is one of the most crucial keywords in Ansible when you’re dealing with system-level tasks.
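become does not have to cover the whole play. As a sketch, it can also be set on a single task, so only the step that needs root is escalated. The whoami task is a hypothetical addition, there just to show a task that runs without escalation.

```yaml
- hosts: node01
  tasks:
    - name: Create a file in a protected directory
      become: true            # escalate only for this task
      file:
        path: /opt/news/blog.txt
        state: touch
        group: sam

    - name: Show which user runs unprivileged tasks
      command: whoami         # runs as the normal login user
```

Scoping become this way keeps the rest of the play running with ordinary privileges, which is a reasonable default when only one or two tasks touch system paths.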


5. Copying files: Controller vs Remote

Finally, I stumbled upon a mistake that confused me for a while. I was trying to copy a file using the copy module, but Ansible kept saying it couldn’t find the file on the controller.

fatal: [node02]: FAILED! => {"changed": false, "msg": "Could not find or access '/usr/src/blog/index.html' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}

That’s when I learned an important distinction. By default, copy looks for src on the controller machine and pushes it to the remote machines. If the file already lives on the remote machine, you have to set remote_src: yes so Ansible copies it from one path to another on the node itself.

Here’s the difference:

# Copy from controller to remote
- copy:
    src: /usr/src/blog/index.html
    dest: /opt/blog

# Copy within the remote node itself
- copy:
    src: /usr/src/blog/index.html
    dest: /opt/blog
    remote_src: yes

It was a subtle lesson, but one that made a big difference in how I structure my playbooks now.


My Key Day-1 Lessons

By the end of the day, I had a whole notebook full of fixes, but here are the main takeaways that will stick with me:

  1. Configure SSH keys early on; don’t waste time with passwords.

  2. Don’t use inventory names as variables without defining them.

  3. Always use connection: local when running tasks on localhost.

  4. Add become: true whenever tasks require elevated privileges.

  5. Understand the difference between controller files and remote files when using the copy module.


Final Thoughts

Day 1 with Ansible wasn’t smooth sailing. Honestly, it was more like bumping into walls repeatedly until I found the right doors to walk through. But every error taught me something I wouldn’t have understood just by reading the documentation.

Now, instead of fearing those red error messages, I welcome them. They’re signposts telling me where I need to dig deeper.

If you’ve just started with Ansible too, don’t be discouraged. Errors are part of the apprenticeship to automation, and sometimes they are the best teachers. 🚀



Written by Muskan Agrawal

Cloud and DevOps professional with a passion for automation, containers, and cloud-native practices, committed to sharing lessons from the trenches while always seeking new challenges. Combining hands-on expertise with an open mind, I write to demystify the complexities of DevOps and grow alongside the tech community.