# Internal Developer Portals with Backstage: How We Cut Context-Switching from 2h/Day to 20min
Your developers are drowning. Not in code -- in tabs. Fifteen browser tabs open: Grafana for metrics, PagerDuty for incidents, Confluence for docs, Jenkins for builds, Jira for tickets, three different Slack channels for three different teams. A 2023 Spotify engineering survey found developers spend an average of 1.5-2.5 hours per day just *finding information* about their own services. That's not engineering. That's archaeology.
We deployed Backstage at a client with 140+ microservices and 8 platform teams. Within six weeks, average context-switching overhead dropped from 2 hours/day to under 20 minutes. Here's exactly how we did it, what broke along the way, and why most Backstage deployments fail.
## Why Most Internal Portals Fail Before They Start
The number one reason developer portals die? They become another tool nobody maintains. We've seen it repeatedly: a team spends three months building a beautiful portal, launches it, and six months later the service catalog is 40% stale and developers are back to Slack-searching for answers.
Backstage avoids this trap through one mechanism: catalog-as-code. Every service registers itself via a catalog-info.yaml in its own repository. The team that owns the code owns the catalog entry. No central bottleneck. No "portal team" that becomes a single point of failure.
## Architecture: What You Actually Need
Forget the enterprise reference architectures with twelve components. A production Backstage deployment needs four things:
1. Backstage app (Node.js, runs in a container)
2. PostgreSQL database (catalog state, search index)
3. GitHub/GitLab integration (catalog discovery, TechDocs)
4. Auth provider (SSO via OIDC -- we use Keycloak, but Okta/Azure AD work fine)
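For a quick local evaluation, the whole stack fits in a docker-compose file. Here's a minimal sketch -- the image tag, port mapping, and secrets handling are illustrative assumptions, not our production manifests:

```yaml
# docker-compose.yaml -- evaluation sketch; image tag, port, and
# secret handling are illustrative assumptions.
version: "3.8"
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: backstage
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
  backstage:
    image: acme/backstage:latest  # built from the Dockerfile shown below
    ports:
      - "7007:7007"  # default port of the Backstage backend
    environment:
      POSTGRES_HOST: postgres
      POSTGRES_USER: backstage
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    depends_on:
      - postgres
volumes:
  pgdata:
```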
Here's the minimal `app-config.production.yaml`:

```yaml
app:
  title: Acme Developer Portal
  baseUrl: https://portal.internal.acme.com

backend:
  baseUrl: https://portal.internal.acme.com
  database:
    client: pg
    connection:
      host: ${POSTGRES_HOST}
      port: 5432
      user: ${POSTGRES_USER}
      password: ${POSTGRES_PASSWORD}

catalog:
  providers:
    github:
      acmeOrg:
        organization: 'acme-corp'
        catalogPath: '/catalog-info.yaml'
        filters:
          repository: '.*'
        schedule:
          frequency: { minutes: 30 }
          timeout: { minutes: 3 }

auth:
  environment: production
  providers:
    oidc:
      production:
        metadataUrl: ${OIDC_METADATA_URL}
        clientId: ${OIDC_CLIENT_ID}
        clientSecret: ${OIDC_CLIENT_SECRET}
```

Deploy it in Docker. We use a multi-stage build that compiles the frontend, bundles the backend, and produces a ~250MB image:
```dockerfile
FROM node:20-bookworm-slim AS builder
WORKDIR /app
COPY . .
RUN yarn install --frozen-lockfile
RUN yarn tsc
RUN yarn build:backend

FROM node:20-bookworm-slim
WORKDIR /app
# build:backend emits two tarballs under packages/backend/dist:
# a dependency skeleton (package.json files only) and the compiled bundle
COPY --from=builder /app/yarn.lock /app/package.json /app/packages/backend/dist/skeleton.tar.gz ./
RUN tar xzf skeleton.tar.gz && rm skeleton.tar.gz
RUN yarn install --production --frozen-lockfile
COPY --from=builder /app/packages/backend/dist/bundle.tar.gz /app/app-config*.yaml ./
RUN tar xzf bundle.tar.gz && rm bundle.tar.gz
CMD ["node", "packages/backend", "--config", "app-config.yaml", "--config", "app-config.production.yaml"]
```

## The Catalog: Getting 140 Services Registered in a Week
Here's the counterintuitive part: don't ask teams to register their services. Automate it.
We wrote a simple script that scanned every repository in the GitHub org, generated a catalog-info.yaml based on existing metadata (README, package.json, Dockerfile presence, Kubernetes manifests), and opened a PR to each repo:
```yaml
# catalog-info.yaml - auto-generated, customize and merge
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Handles payment processing and refunds
  annotations:
    github.com/project-slug: acme-corp/payment-service
    backstage.io/techdocs-ref: dir:.
    pagerduty.com/service-id: P2X4B9K
    grafana/dashboard-selector: "service=payment-service"
  tags:
    - java
    - spring-boot
    - tier-1
spec:
  type: service
  lifecycle: production
  owner: team-payments
  system: checkout
  dependsOn:
    - component:user-service
    - resource:payments-db
  providesApis:
    - payment-api
```

The key annotations link Backstage to existing tools. That `grafana/dashboard-selector` means developers click one button and see their Grafana dashboard. The `pagerduty.com/service-id` shows on-call status inline. No more tab-switching.
Out of 143 repositories, 128 PRs were merged within the first week. The remaining 15 were archived repos nobody owned -- which itself was a useful discovery.
## The Plugins That Actually Matter
Backstage has 200+ plugins. You need about six:
1. **TechDocs** -- Docs-as-code rendered from Markdown in each repo. This alone eliminated 60% of the "where's the docs?" Slack messages.
2. **Kubernetes** -- Shows pod status, deployments, and recent events for each service. Developers stopped SSH-ing into clusters to run `kubectl get pods`. (Cluster wiring sketched just after this list.)
3. **GitHub Actions / Jenkins** -- Build status, recent runs, failure rates. Visible on the service page.
4. **PagerDuty** -- Who's on call? Is there an active incident? Answered without leaving the portal.
5. **Grafana** -- Embedded dashboards showing key metrics (latency, error rate, throughput) per service.
6. **Software Templates** -- Golden paths for creating new services. This is where the real platform engineering value lives.
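For the Kubernetes plugin specifically, the backend needs to know where your clusters live. A minimal app-config section, assuming service-account auth -- the cluster URL and token variable are placeholders:

```yaml
# Kubernetes plugin wiring -- cluster URL and token are placeholders.
kubernetes:
  serviceLocatorMethod:
    type: multiTenant
  clusterLocatorMethods:
    - type: config
      clusters:
        - name: prod
          url: https://k8s.internal.acme.com
          authProvider: serviceAccount
          serviceAccountToken: ${K8S_SA_TOKEN}
```

Each service's catalog-info.yaml then carries a `backstage.io/kubernetes-id` annotation so the plugin can find its workloads.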
## Software Templates: The Golden Path
This is the feature that turned skeptical senior engineers into advocates. Instead of copying a "template repo" and spending two days configuring CI/CD, Kubernetes manifests, and observability, developers fill out a form and get a production-ready repository in 90 seconds.
```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: spring-boot-service
  title: Spring Boot Microservice
  description: Production-ready Spring Boot service with CI/CD, K8s, and observability
spec:
  owner: team-platform
  type: service
  parameters:
    - title: Service Details
      required: [name, owner, system]
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]*$'
        owner:
          title: Owner Team
          type: string
          ui:field: OwnerPicker
        system:
          title: System
          type: string
          ui:field: EntityPicker
          ui:options:
            catalogFilter:
              kind: System
  steps:
    - id: fetch
      name: Fetch Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    - id: publish
      name: Create Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=acme-corp&repo=${{ parameters.name }}
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
```

The skeleton directory contains a complete Spring Boot project with Dockerfile, GitHub Actions workflow, Kubernetes Helm chart, Prometheus metrics endpoint, and structured logging -- all pre-configured.
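Every file in the skeleton passes through the scaffolder's templating engine, so the form inputs flow into the generated project. For illustration, the skeleton's own catalog-info.yaml looks roughly like this (trimmed; our real skeleton carries the full annotation set shown earlier):

```yaml
# skeleton/catalog-info.yaml -- rendered at scaffold time;
# trimmed for illustration.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ values.name }}
  annotations:
    github.com/project-slug: acme-corp/${{ values.name }}
spec:
  type: service
  lifecycle: production
  owner: ${{ values.owner }}
```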
## Measuring the Impact
We tracked three metrics before and after deployment:
| Metric | Before Backstage | After six weeks |
|--------|------------------|-----------------|
| Context-switching overhead | ~2 hours/day | Under 20 minutes/day |
| Time to stand up a new service | ~2 days | ~90 seconds |
| Services registered in the catalog | 0 of 143 | 128 of 143 |
The context-switching number came from developer surveys, so take it with appropriate salt. But the directional improvement was unmistakable: developers stopped complaining about "not knowing where things are."
## What Broke and How We Fixed It
**Catalog drift:** Despite automation, 15% of catalog entries became stale within the first month. Fix: we added a CI check that validates catalog-info.yaml on every PR. If it references a non-existent team or system, the build fails.
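A stripped-down sketch of that check -- the portal URL, single-file layout, and an unauthenticated catalog API are simplifying assumptions, not our exact workflow:

```yaml
# .github/workflows/catalog-check.yaml -- simplified sketch; assumes
# the catalog API is reachable from CI without an auth token.
name: Validate catalog-info
on:
  pull_request:
    paths: ['catalog-info.yaml']
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Fail on unknown owner or system
        run: |
          owner=$(yq '.spec.owner' catalog-info.yaml)
          system=$(yq '.spec.system' catalog-info.yaml)
          base="https://portal.internal.acme.com/api/catalog/entities/by-name"
          curl --fail -s "${base}/group/default/${owner}" > /dev/null \
            || { echo "Unknown owner: ${owner}"; exit 1; }
          curl --fail -s "${base}/system/default/${system}" > /dev/null \
            || { echo "Unknown system: ${system}"; exit 1; }
```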
**TechDocs build failures:** About 20% of repos had Markdown that rendered fine in GitHub but broke TechDocs' MkDocs builder. Fix: added `mkdocs build --strict` to CI pipelines.
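The moving parts for reproducing a TechDocs build in CI are small. A minimal mkdocs.yml (the site name varies per service):

```yaml
# mkdocs.yml -- minimal TechDocs setup; requires the
# mkdocs-techdocs-core pip package.
site_name: payment-service
plugins:
  - techdocs-core
nav:
  - Overview: index.md
```

In CI, `pip install mkdocs mkdocs-techdocs-core && mkdocs build --strict` then fails the build on broken links and malformed Markdown before TechDocs ever sees it.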
**Plugin version conflicts:** Backstage's plugin ecosystem moves fast and breaking changes are common. Fix: pin all plugin versions, upgrade monthly in a dedicated maintenance window, and run `backstage-cli versions:check` before each upgrade.
## When Backstage Is Overkill
If you have fewer than 20 services and one team, Backstage is overkill. A well-maintained Confluence page or even a GitHub Wiki will serve you fine. Backstage's value scales with organizational complexity -- multiple teams, many services, heterogeneous tooling. Below that threshold, the maintenance cost exceeds the benefit.
## Getting Started
Start small. Deploy Backstage with just the service catalog and TechDocs. Get 80% of your services registered. Once teams see value, add plugins incrementally. The worst thing you can do is spend three months building a fully-featured portal that nobody asked for.
If your platform team is drowning in "where is X?" questions and your developers are spending more time navigating tools than writing code, an internal developer portal isn't a nice-to-have. It's the highest-leverage investment your platform team can make.
---
*We've deployed Backstage and other internal developer platforms for teams ranging from 20 to 500 engineers. If your developers are losing hours to context-switching, [let's talk about fixing that](https://techsaas.cloud/contact).*