
Prompt Engineering for DevOps: Automating Infrastructure with LLMs

Master prompt engineering for DevOps automation. Generate Terraform, Dockerfiles, CI/CD pipelines, and incident responses using LLMs with reliable output.

Yash Pritwani
13 min read

Beyond Chatting: LLMs as Infrastructure Tools

Most prompt engineering guides focus on writing better emails or summarizing documents. For DevOps engineers, the real power is in generating infrastructure code, debugging configurations, and automating incident response. This requires a completely different prompting approach — one that prioritizes precision, safety, and reproducibility over creativity.


Neural network architecture: data flows through input, hidden, and output layers.

At TechSaaS, we use Claude Code CLI as the backbone of our infrastructure automation. Here is what we have learned about prompting for ops.

The DevOps Prompting Framework

Every infrastructure prompt should include four elements:

  1. Context: Current state of the system
  2. Constraint: What must be preserved or avoided
  3. Task: What to generate or change
  4. Format: Exact output format expected
CONTEXT: We run Docker containers on a Proxmox LXC host with Traefik
as reverse proxy. PostgreSQL 16 is shared across services.

CONSTRAINT: Do not modify existing services. Port 80 is used by Traefik.
All containers must join the 'padc-net' network. Use environment variables
for secrets, never hardcode them.

TASK: Generate a Docker Compose service definition for a new Redis
instance for session caching with 256MB memory limit, persistent
storage, and Traefik labels for internal access only.

FORMAT: Output only the YAML service block. No explanation needed.
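The four-part structure above lends itself to a small template helper, so every generated prompt is guaranteed to carry all four elements. A minimal sketch (the function name and example values are illustrative, not from a specific library):

```python
def build_ops_prompt(context: str, constraint: str, task: str,
                     output_format: str) -> str:
    """Assemble a DevOps prompt from the four required elements."""
    return (
        f"CONTEXT: {context}\n\n"
        f"CONSTRAINT: {constraint}\n\n"
        f"TASK: {task}\n\n"
        f"FORMAT: {output_format}"
    )

prompt = build_ops_prompt(
    context="Docker on a Proxmox LXC host, Traefik as reverse proxy.",
    constraint="Do not modify existing services. Port 80 is taken.",
    task="Generate a Compose service block for Redis with a 256MB limit.",
    output_format="Output only the YAML service block. No explanation.",
)
```

Because the sections are positional arguments, a missing element fails loudly at call time instead of silently producing a vague prompt.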

Generating Terraform with LLMs


Terraform generation is where LLMs shine — and where they are most dangerous. A hallucinated resource can cost you money or expose infrastructure.

Bad prompt:

Create a Terraform config for AWS

Good prompt:

Generate Terraform 1.6+ HCL for an AWS VPC with:
- CIDR: 10.0.0.0/16
- 3 public subnets (10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24) across
  us-east-1a, 1b, 1c
- 3 private subnets (10.0.10.0/24, 10.0.20.0/24, 10.0.30.0/24)
- NAT Gateway in the first public subnet
- Internet Gateway
- Route tables for public (IGW) and private (NAT) subnets
- Tags: Environment=staging, Project=myapp, ManagedBy=terraform

Use aws provider >= 5.0. No modules, just resources.
Output only HCL code. No markdown fencing.

This specificity eliminates ambiguity: every CIDR, AZ, and tag is defined, leaving the model little room to hallucinate.
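Even with a tight prompt, it is worth sanitizing the model's output before handing it to `terraform validate` — models sometimes emit markdown fencing despite instructions. A rough sketch (the helper name is my own):

```python
def clean_hcl(raw: str) -> str:
    """Strip markdown fences the model may emit despite instructions."""
    lines = raw.strip().splitlines()
    # Drop a leading ```hcl (or bare ```) line and a trailing ``` line
    if lines and lines[0].startswith("`" * 3):
        lines = lines[1:]
    if lines and lines[-1].strip() == "`" * 3:
        lines = lines[:-1]
    return "\n".join(lines).strip()

fence = "`" * 3
raw = fence + 'hcl\nresource "aws_vpc" "main" {\n  cidr_block = "10.0.0.0/16"\n}\n' + fence
hcl = clean_hcl(raw)
# The cleaned text can then be written to main.tf and checked with
# `terraform validate` before any plan or apply step runs.
```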

Dockerfile Generation

DOCKERFILE_PROMPT = """Generate a production Dockerfile for a Node.js 22
application with these requirements:

- Multi-stage build (builder + production)
- Builder stage: install deps, build TypeScript
- Production stage: node:22-alpine, non-root user (node:node)
- Copy only dist/ and node_modules from builder
- HEALTHCHECK using curl on port 3000/health
- Labels: maintainer, version, description
- .dockerignore entries listed as comments at the top
- No npm, use pnpm throughout
- Pin Alpine version, don't use 'latest'
- EXPOSE 3000
- CMD ["node", "dist/server.js"]

Output only the Dockerfile content. No explanation."""

Result:

# .dockerignore: node_modules, dist, .git, .env, *.md, .github
FROM node:22.11-alpine3.20 AS builder
WORKDIR /app
RUN corepack enable pnpm
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm build

FROM node:22.11-alpine3.20
LABEL maintainer="[email protected]"
LABEL version="1.0.0"
LABEL description="Production Node.js application"
RUN apk add --no-cache curl
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]

RAG architecture: user prompts are embedded, matched against a vector store, then fed to an LLM with retrieved context.

CI/CD Pipeline Generation

Prompting for CI/CD pipelines requires specifying the exact platform and available secrets:

Generate a Gitea Actions workflow (.gitea/workflows/deploy.yml) that:
1. Triggers on push to main branch
2. Runs TypeScript type checking (pnpm tsc --noEmit)
3. Runs ESLint (pnpm lint)
4. Runs tests (pnpm test)
5. If all pass: builds Docker image, tags with git SHA
6. Pushes to Gitea container registry (git.techsaas.cloud)
7. SSHs into production and runs docker compose pull + up -d

Available secrets: DEPLOY_SSH_KEY, REGISTRY_TOKEN
Runner has: docker, pnpm, node 22
Use ubuntu-latest runner image.
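Because the prompt enumerates the available secrets, you can lint the generated workflow against that list before committing it. A rough regex-based sketch, assuming the standard `${{ secrets.NAME }}` syntax (the function name is my own):

```python
import re

ALLOWED_SECRETS = {"DEPLOY_SSH_KEY", "REGISTRY_TOKEN"}

def undeclared_secrets(workflow_yaml: str) -> set:
    """Return secrets referenced in the workflow but not provisioned."""
    used = set(re.findall(r"\$\{\{\s*secrets\.(\w+)\s*\}\}", workflow_yaml))
    return used - ALLOWED_SECRETS

sample = (
    "run: ssh -i ${{ secrets.DEPLOY_SSH_KEY }} deploy@prod\n"
    "env:\n  TOKEN: ${{ secrets.MISSING_TOKEN }}\n"
)
missing = undeclared_secrets(sample)
```

A non-empty result means the model invented a secret, which would otherwise fail silently as an empty environment variable at deploy time.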

Incident Response Prompts

When something breaks at 3 AM, having pre-built prompt templates saves critical minutes:

INCIDENT_TRIAGE_PROMPT = """You are an SRE triaging an incident.

ALERT: {alert_name}
SERVICE: {service_name}
METRIC: {metric_name} = {metric_value} (threshold: {threshold})
TIME: {timestamp}
RECENT CHANGES: {recent_deployments}

AVAILABLE ACTIONS:
- restart_service(name): Restart a Docker container
- scale_service(name, replicas): Scale a service
- rollback(name, version): Roll back to previous version
- page_human(severity, message): Page the on-call engineer

Analyze this alert. Output a JSON object with:
{{
  "severity": "P1|P2|P3|P4",
  "likely_cause": "string",
  "recommended_actions": ["action1", "action2"],
  "needs_human": true|false,
  "reasoning": "string"
}}"""
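On the automation side, the model's response to this template should be shape-checked before any action runs. A minimal sketch of a validator (field names follow the template above; the validator itself is my own, not part of any library):

```python
VALID_SEVERITIES = {"P1", "P2", "P3", "P4"}
REQUIRED_KEYS = {"severity", "likely_cause", "recommended_actions",
                 "needs_human", "reasoning"}

def validate_triage(result: dict) -> bool:
    """Reject malformed triage output before acting on it."""
    if not REQUIRED_KEYS.issubset(result):
        return False
    if result["severity"] not in VALID_SEVERITIES:
        return False
    if not isinstance(result["needs_human"], bool):
        return False
    return isinstance(result["recommended_actions"], list)

ok = validate_triage({
    "severity": "P2",
    "likely_cause": "memory leak after deploy",
    "recommended_actions": ["rollback(api, v1.4.2)"],
    "needs_human": False,
    "reasoning": "Error rate spiked immediately after deployment.",
})
```

If validation fails, the safe default is to call `page_human` rather than retry indefinitely at 3 AM.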


Structured Output for Automation

When LLM output feeds into automation, force structured formats:

import json

def generate_and_parse(prompt: str) -> dict:
    """Generate structured output with validation.

    Assumes `llm` is a pre-configured client exposing generate(prompt) -> str.
    """
    full_prompt = prompt + """

CRITICAL: Output ONLY valid JSON. No markdown, no explanation,
no code fences. Just the raw JSON object."""

    response = llm.generate(full_prompt)
    cleaned = strip_fences(response)

    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Retry with an even stricter prompt; strip fencing again in case
        # the model wraps its correction too
        retry_prompt = f"Fix this invalid JSON:\n{cleaned}\nOutput only valid JSON."
        return json.loads(strip_fences(llm.generate(retry_prompt)))

def strip_fences(text: str) -> str:
    """Strip any markdown fencing the model might add anyway."""
    fence = chr(96) * 3  # triple backtick
    return text.strip().removeprefix(fence + "json").removesuffix(fence).strip()

Safety Rules for Infrastructure Prompts

Never let an LLM execute infrastructure commands without these safeguards:

  1. Dry run first: Generate the plan, review it, then apply
  2. Diff before apply: Show what will change before changing it
  3. Blast radius limits: Never modify more than N resources in one operation
  4. Rollback plan: Every change must include a rollback command
  5. Audit log: Record every LLM-generated command and its output
# Pattern: Generate -> Review -> Apply
claude -p "Generate terraform plan for..." > plan.tf
terraform plan -out=tfplan  # Review the plan
terraform apply tfplan       # Apply only after review
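Rule 3 (blast radius) is easy to enforce mechanically: `terraform show -json tfplan` emits a `resource_changes` array you can count before applying. A sketch of the gate (the threshold and function name are my own):

```python
MAX_CHANGES = 5  # refuse to apply beyond this many resource changes

def within_blast_radius(plan: dict, limit: int = MAX_CHANGES) -> bool:
    """Check a `terraform show -json` plan against the change limit."""
    changes = [
        rc for rc in plan.get("resource_changes", [])
        if rc["change"]["actions"] != ["no-op"]
    ]
    return len(changes) <= limit

plan = {"resource_changes": [
    {"change": {"actions": ["create"]}},
    {"change": {"actions": ["no-op"]}},
]}
safe = within_blast_radius(plan)
```

Wire this between `terraform plan` and `terraform apply` in the pattern above, and an LLM-generated config that unexpectedly touches dozens of resources gets stopped before it runs.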

Workflow automation: triggers, conditions, and actions chain together to eliminate manual processes.

The TechSaaS Approach

We maintain a library of battle-tested prompt templates for every infrastructure task. Our Claude Code integration uses these templates with dynamic context injection — current system state, recent changes, and service dependencies are automatically included. The result is infrastructure automation that is fast, reliable, and auditable.

#prompt-engineering #devops #llm #terraform #automation #infrastructure
