Automated Server Patching with Ansible: Zero-Downtime Update Strategy
Automate OS and package updates across your fleet with Ansible playbooks. Covers rolling updates, pre-patch snapshots, automatic rollback, and compliance...
The Patching Problem
Unpatched servers are one of the most common attack vectors, yet manual patching is tedious, error-prone, and often postponed. Ansible solves this with automation that is idempotent, agentless, and simple enough that your playbooks double as documentation.
Figure: A typical CI/CD pipeline: code flows through build, test, and deploy stages automatically.
Ansible Inventory Setup
Start by organizing your servers into groups:
# inventory/hosts.ini
[webservers]
web1.example.com
web2.example.com

[databases]
db1.example.com
db2.example.com

[monitoring]
monitor.example.com

[all:vars]
ansible_user=deploy
ansible_python_interpreter=/usr/bin/python3

[webservers:vars]
patch_group=group_a

[databases:vars]
patch_group=group_b

The Patching Playbook
Here is a comprehensive playbook that handles the entire patching lifecycle:
---
# playbooks/patch-servers.yml
- name: Automated Server Patching
  hosts: all
  become: true
  serial: "50%"  # Rolling update: patch 50% of hosts at a time
  max_fail_percentage: 0
  vars:
    reboot_timeout: 600
    snapshot_before_patch: true
    notify_channel: "#ops-alerts"

  pre_tasks:
    - name: Check if server is in maintenance window
      assert:
        that:
          - maintenance_window | default(true)
        fail_msg: "Server is not in its maintenance window."

    - name: Create pre-patch snapshot (if LXC/VM)
      # proxmox_host must exist in the inventory. pct expects the numeric
      # container ID, so set proxmox_vmid per host (an assumed variable);
      # without it the short hostname is passed, as in the original.
      delegate_to: proxmox_host
      command: >
        pct snapshot {{ proxmox_vmid | default(inventory_hostname_short) }}
        pre-patch-{{ ansible_date_time.date }}
      when: snapshot_before_patch
      ignore_errors: true

    - name: Record pre-patch package versions
      shell: dpkg -l > /tmp/pre-patch-packages.txt
      changed_when: false

  tasks:
    - name: Update apt cache
      apt:
        update_cache: true
        cache_valid_time: 3600

    - name: Upgrade all packages
      apt:
        upgrade: safe
        autoremove: true
        autoclean: true
      register: apt_result

    - name: Display upgraded packages
      debug:
        msg: "{{ apt_result.stdout_lines | default([]) }}"
      when: apt_result.changed

    - name: Check if reboot is required
      stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot server if required
      reboot:
        reboot_timeout: "{{ reboot_timeout }}"
        msg: "Ansible patching reboot"
      when: reboot_required.stat.exists

    - name: Wait for server to be ready
      wait_for_connection:
        delay: 10
        timeout: 300

  post_tasks:
    - name: Verify critical services are running
      systemd:
        name: "{{ item }}"
        state: started
      loop: "{{ critical_services | default(['docker', 'ssh']) }}"
      register: service_check

    - name: Record post-patch package versions
      shell: dpkg -l > /tmp/post-patch-packages.txt
      changed_when: false

    - name: Generate patch diff report
      shell: >
        diff /tmp/pre-patch-packages.txt
        /tmp/post-patch-packages.txt || true
      register: patch_diff
      changed_when: false

    # The report directory is not created by default, so ensure it exists.
    - name: Ensure patch report directory exists
      file:
        path: /var/log/patch-reports
        state: directory
        mode: "0755"

    - name: Save patch report
      copy:
        content: |
          Patch Report for {{ inventory_hostname }}
          Date: {{ ansible_date_time.iso8601 }}
          Reboot Required: {{ reboot_required.stat.exists }}
          Changes:
          {{ patch_diff.stdout }}
        dest: "/var/log/patch-reports/{{ ansible_date_time.date }}.txt"
      # Without a notify, the handler below would never run.
      notify: Send notification

  handlers:
    - name: Send notification
      uri:
        url: "https://notify.example.com/ops"
        method: POST
        body_format: json
        body:
          topic: ops-patches
          title: "Patch complete: {{ inventory_hostname }}"
          message: "{{ apt_result.changed | ternary('Packages updated', 'No updates') }}"

Rolling Updates Strategy
The key to zero-downtime patching is the serial directive:
# Patch one server at a time
serial: 1

# Patch 50% at a time (good for load-balanced groups)
serial: "50%"

# Progressive: start slow, then speed up
serial:
  - 1
  - "30%"
  - "100%"

Figure: Server infrastructure: production and staging environments connected via VLAN with offsite backups.
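To make the progressive form concrete, here is a rough sketch of how the batches work out. It assumes a hypothetical webservers group of 10 hosts; the play name and the ping task are illustrative only. Ansible evaluates percentages against the total number of hosts in the play, so the batches come out as 1, 3, and then the remaining 6.

```yaml
# Sketch only: assumes the webservers group contains 10 hosts.
- name: Progressive rolling patch (illustrative)
  hosts: webservers
  become: true
  serial:
    - 1        # batch 1: a single canary host
    - "30%"    # batch 2: 30% of 10 hosts = 3 hosts
    - "100%"   # batch 3: every host not yet processed (6 here)
  max_fail_percentage: 0   # abort the play as soon as any host in a batch fails
  tasks:
    - name: Confirm the host is reachable before patching
      ping:
```

With max_fail_percentage set to 0, a failure on the canary host aborts the play before the larger batches start, which is the main reason to begin with a batch of one.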
Security-Only Updates
Sometimes you want only security patches, not feature updates:
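- name: Install security updates only
  apt:
    upgrade: dist
    # default_release corresponds to apt's -t option, which gives the
    # -security pocket a higher pin priority. Confirm on your distribution
    # that packages from other pockets are not upgraded as well.
    default_release: "{{ ansible_distribution_release }}-security"
    update_cache: true

Or use unattended-upgrades for automatic security patches: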
- name: Configure unattended upgrades
  apt:
    name:
      - unattended-upgrades
      - apt-listchanges
    state: present

- name: Enable automatic security updates
  copy:
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
      APT::Periodic::AutocleanInterval "7";
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    mode: "0644"

Rollback Playbook
If patching breaks something, roll back from the snapshot:
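---
- name: Rollback Server Patch
  hosts: "{{ target_host }}"
  become: true
  tasks:
    - name: Restore from pre-patch snapshot
      # pct expects the numeric container ID, so set proxmox_vmid per host
      # (an assumed variable); without it the short hostname is passed.
      delegate_to: proxmox_host
      command: >
        pct rollback {{ proxmox_vmid | default(inventory_hostname_short) }}
        pre-patch-{{ rollback_date }}
      when: use_snapshot | default(false)

    - name: Or downgrade specific packages
      # Each entry must pin the target version in name=version form.
      apt:
        name: "{{ packages_to_downgrade }}"
        state: present
        # allow_downgrade needs ansible-core 2.12 or later. Older releases
        # require force: true, which also disables signature checks.
        allow_downgrade: true
      when: packages_to_downgrade is defined

Compliance Reporting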
Generate a compliance report showing patch status across your fleet:
- name: Patch Compliance Report
  hosts: all
  become: true
  gather_facts: true
  tasks:
    - name: Check available updates
      shell: apt list --upgradable 2>/dev/null | tail -n +2 | wc -l
      register: available_updates
      changed_when: false

    - name: Check last patch date
      stat:
        path: /var/log/apt/history.log
      register: apt_history

    - name: Compile report
      set_fact:
        patch_status:
          hostname: "{{ inventory_hostname }}"
          os: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
          kernel: "{{ ansible_kernel }}"
          pending_updates: "{{ available_updates.stdout }}"
          last_patched: "{{ apt_history.stat.mtime | default('never') }}"
          uptime_days: "{{ ansible_uptime_seconds | int // 86400 }}"

    - name: Write consolidated report
      # Runs once per host on the control node; the reports directory must
      # already exist. throttle keeps parallel hosts from clobbering the file.
      delegate_to: localhost
      become: false
      throttle: 1
      lineinfile:
        path: ./reports/patch-compliance.csv
        line: "{{ patch_status.hostname }},{{ patch_status.os }},{{ patch_status.pending_updates }},{{ patch_status.uptime_days }}"
        create: true

Scheduling with Cron or Systemd Timers
# Run patching every Sunday at 3 AM
0 3 * * 0 cd /opt/ansible && ansible-playbook playbooks/patch-servers.yml -i inventory/hosts.ini >> /var/log/ansible-patching.log 2>&1

Figure: Workflow automation: triggers, conditions, and actions chain together to eliminate manual processes.
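The same schedule can be expressed as a systemd timer. What follows is a minimal sketch rather than a tested setup: it assumes the control node runs systemd, and the unit name ansible-patching and the /usr/bin/ansible-playbook path are assumptions, so adjust them to your environment. The working directory, playbook, and inventory paths are taken from the cron entry.

```yaml
# Sketch only: unit names and the ansible-playbook path are assumptions.
- name: Schedule patching with a systemd timer
  hosts: localhost
  connection: local
  become: true
  tasks:
    - name: Install the service unit that runs the patch playbook
      copy:
        dest: /etc/systemd/system/ansible-patching.service
        mode: "0644"
        content: |
          [Unit]
          Description=Ansible server patching run

          [Service]
          Type=oneshot
          WorkingDirectory=/opt/ansible
          ExecStart=/usr/bin/ansible-playbook playbooks/patch-servers.yml -i inventory/hosts.ini

    - name: Install the timer unit (Sundays at 03:00)
      copy:
        dest: /etc/systemd/system/ansible-patching.timer
        mode: "0644"
        content: |
          [Unit]
          Description=Weekly Ansible patching schedule

          [Timer]
          OnCalendar=Sun *-*-* 03:00:00
          Persistent=true

          [Install]
          WantedBy=timers.target

    - name: Enable and start the timer
      systemd:
        name: ansible-patching.timer
        enabled: true
        state: started
        daemon_reload: true
```

Unlike the cron entry, output goes to the journal, so journalctl -u ansible-patching.service replaces the log file. Persistent=true makes systemd run a missed job at the next boot; remove it if patching must only ever start inside the maintenance window.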
Best Practices
1. Always snapshot before patching: LXC and ZFS make this instant.
2. Use serial for rolling updates: never patch all servers at once.
3. Verify services after patching: run automated health checks in post_tasks.
4. Keep a patch log: compliance requires proof of patching.
5. Test in staging first: use separate inventory groups.
6. Set a reboot timeout: servers that fail to come back should alert.
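Practice 6 can be sketched with a block and rescue pair around the reboot. This is one possible shape, not the only one: it reuses the notification endpoint from the patching playbook, while the alert wording and the final fail task are assumptions about how you might want a failed reboot handled.

```yaml
# Sketch only: a possible replacement for the plain reboot task. It assumes
# reboot_required and reboot_timeout are set as in the patching playbook.
- name: Reboot and alert if the host does not return
  when: reboot_required.stat.exists
  block:
    - name: Reboot server
      reboot:
        reboot_timeout: "{{ reboot_timeout }}"
        msg: "Ansible patching reboot"
  rescue:
    - name: Alert that the host failed to come back
      # Sent from the control node because the target is presumed down.
      delegate_to: localhost
      become: false
      uri:
        url: "https://notify.example.com/ops"
        method: POST
        body_format: json
        body:
          topic: ops-patches
          title: "Reboot failed: {{ inventory_hostname }}"
          message: "Host did not return within {{ reboot_timeout }} seconds"

    - name: Keep the host marked as failed so the rolling update stops
      fail:
        msg: "{{ inventory_hostname }} did not come back after reboot"
```

The explicit fail at the end matters: a rescue section that completes successfully clears the failure, and the play would otherwise continue to the next batch with a server still down.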
At TechSaaS, we manage patching across all our infrastructure with Ansible. Combined with ZFS snapshots for instant rollback, we patch confidently knowing we can revert in seconds.
Need automated patching for your infrastructure? Contact [email protected].