Version Control at Scale: Git Strategies That Survive 10,000 Commits
Monorepos, sparse checkouts, git-filter-repo, and commit signing. Practical Git strategies for teams that have outgrown basic branching workflows.
Git was designed for the Linux kernel — one of the largest open-source projects in existence. It can handle scale. But most teams do not struggle with Git's technical limits. They struggle with the workflows, conventions, and tooling around Git that break down as codebases and teams grow.
Here is what changes when your repository passes 10,000 commits, 50 contributors, or 1 GB in size — and how to adapt.
When Git Starts Feeling Slow
Git is fast. But "fast" is relative:
- git status on a repo with 100,000 files takes seconds, not milliseconds
- git log with full diff on a repo with 50,000 commits takes noticeable time
- git clone of a 5 GB repository takes minutes, not seconds
- git blame on a file with 2,000 revisions is painfully slow
These are not bugs. They are the natural consequences of storing every version of every file. The solutions are not "use a different VCS" — they are "use Git differently."
Monorepo vs Multi-Repo: The Real Trade-offs
The monorepo debate is not about Git performance. It is about organizational coordination.
Monorepo (single repo, all code)
company/
├── services/
│   ├── api/
│   ├── web-app/
│   └── worker/
├── libs/
│   ├── shared-utils/
│   └── proto-definitions/
├── infrastructure/
│   ├── terraform/
│   └── docker/
└── docs/
Advantages:
- Atomic commits across services (change API and client in one commit)
- Single source of truth for shared libraries
- Easier code discovery (grep the entire company codebase)
- Simplified dependency management (no version conflicts between repos)
Disadvantages:
- CI complexity — you need to detect which services changed and only build those
- Access control is harder (everyone can see everything by default)
- Clone time grows with the entire company's history
Who does it: Google (billions of files, custom VCS), Meta (millions of files, custom Mercurial), Microsoft (Windows repo, custom Git tooling), Uber, Airbnb, Stripe.
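The CI concern above is usually handled by diffing against the merge base and mapping changed paths to services. A minimal sketch, where the origin/main ref and the services/ layout are assumptions taken from the tree above:

```shell
#!/usr/bin/env sh
# List the services/ subdirectories that changed since the merge base
# with the comparison ref, so CI can build only those.
BASE="${1:-origin/main}"

# Three-dot diff compares HEAD against the merge base with $BASE.
# Keep the first two path components (e.g. services/api), de-duplicated.
git diff --name-only "$BASE...HEAD" \
  | grep '^services/' \
  | cut -d/ -f1-2 \
  | sort -u
```

Each line of output is a service directory; the CI pipeline can fan out one build job per line.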
Multi-Repo (one repo per service)
Advantages:
- Clear ownership boundaries
- Independent CI/CD pipelines
- Smaller, faster clones
- Natural access control
Disadvantages:
- Cross-service changes require coordinated PRs
- Shared library versioning becomes a full-time job
- Code duplication across repos
- "Which version of the shared lib does service X use?" becomes a frequent question
The pragmatic answer: Most teams under 50 engineers should start with a monorepo. Switch to multi-repo when the CI pipeline becomes the bottleneck, not before.
Sparse Checkout: Clone Less, Work Faster
Git 2.25 introduced sparse checkout, which lets you clone a repo but only populate your working directory with the files you need:
# Clone without checking out files
git clone --no-checkout https://github.com/company/monorepo.git
cd monorepo
# Initialize sparse checkout in cone mode
git sparse-checkout init --cone
# Only track the services/api and libs/shared-utils directories
git sparse-checkout set services/api libs/shared-utils
# Populate the working directory with just those paths
git checkout main
# Now your working directory only contains those paths
ls
# services/ libs/
This is transformative for monorepos. A developer working on the API service does not need 100,000 frontend files in their working directory. git status becomes fast again because Git only tracks the files you have checked out.
Combine with partial clone for even faster initial setup:
# Partial clone: download object metadata but not blob content
git clone --filter=blob:none --sparse https://github.com/company/monorepo.git
cd monorepo
git sparse-checkout set services/api
# Blobs are downloaded on demand when you access files
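The sparse set is not fixed at clone time; you can inspect it and grow it later without re-cloning. A quick sketch, where services/worker is just an example path from the tree above:

```shell
# Show which paths are currently materialized in the working directory
git sparse-checkout list

# Pull in another directory on top of the existing sparse set
git sparse-checkout add services/worker
```

With a partial clone, the first access to the new directory triggers the blob downloads for just those files.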
Git LFS: Large Files Without the Pain
Binary files (images, models, datasets, compiled assets) bloat Git repositories because Git stores every version as a full copy.
Git LFS (Large File Storage) replaces large files with pointer files in Git, storing the actual content on a separate server:
# Install and initialize
git lfs install
# Track file patterns
git lfs track "*.psd"
git lfs track "*.model"
git lfs track "datasets/**"
# Commit and push as normal
git add .gitattributes
git add model.psd
git commit -m "Add design file"
git push
When to use LFS:
- Any file over 10 MB that changes regularly
- Binary files that Git cannot diff (images, compiled binaries, ML models)
- Assets that would otherwise bloat clone times
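To audit what LFS is actually managing, two read-only commands help. A sketch, assuming git-lfs is installed and the patterns above have been tracked:

```shell
# List the tracked patterns (reads them back from .gitattributes)
git lfs track

# List the files in the current checkout that are stored as LFS pointers,
# with their object IDs
git lfs ls-files
```

If a large file shows up in git lfs ls-files, only a small pointer file lives in Git history; the content sits on the LFS server.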
Commit Signing: Trust but Verify
In a world of supply chain attacks, verifying that a commit actually came from the person it claims to came from matters.
# Configure Git to sign commits with SSH keys (Git 2.34+)
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true
Every commit now carries a cryptographic signature. On Gitea, GitHub, and GitLab, you can require signed commits on protected branches.
Branching Strategies That Scale
Trunk-Based Development
- All development happens on short-lived feature branches
- Branches live for hours to days, never weeks
- Main is always deployable
- Feature flags control visibility, not branches
Best for: Teams with strong CI, frequent deployments, experienced developers.
GitHub Flow
- Feature branches + pull requests
- Main is always deployable
- No release branches (deploy from main)
Best for: Most teams. Simple, well-understood, GitHub-native.
Release Branches
- Feature development on main
- Cut a release branch when ready to ship
- Cherry-pick fixes back to release branches
Best for: Products with multiple supported versions, mobile apps, on-premise software.
Rewriting History Safely
git-filter-repo (replacing filter-branch)
# Remove a file from all history (leaked secret, large binary)
git filter-repo --path secrets.env --invert-paths
# Remove a directory from all history
git filter-repo --path old-service/ --invert-paths
Warning: History rewriting changes commit hashes. Every contributor must re-clone after a rewrite. Communicate clearly before running this.
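Beyond deleting whole paths, git filter-repo can scrub string contents across history with --replace-text. A hedged sketch, where LEAKED_TOKEN_VALUE stands in for the actual leaked string:

```shell
# expressions.txt holds one rule per line: literal==>replacement
echo 'LEAKED_TOKEN_VALUE==>***REMOVED***' > expressions.txt

# Rewrite every commit, replacing the string wherever it appears
git filter-repo --replace-text expressions.txt
```

The same warning applies: this changes every commit hash from the first occurrence onward.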
Removing Accidentally Committed Secrets
1. Remove the file from history with git filter-repo
2. Force push (coordinate with team first)
3. Rotate the secret immediately — removing from history does not revoke the credential
4. Add the file to .gitignore
The critical step is #3. Rotate first, then clean up.
Git Hooks for Quality Gates
Use pre-commit framework for team-wide hook management:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: check-yaml
      - id: check-added-large-files
        args: ['--maxkb=1000']
      - id: detect-private-key
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
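To wire this config into a clone, the typical commands look like the following sketch, assuming a Python environment with pip available:

```shell
# Install the framework and register it as a git pre-commit hook
pip install pre-commit
pre-commit install

# Run every configured hook against the entire repo once
pre-commit run --all-files
```

After pre-commit install, the hooks run automatically on every git commit; teammates get the same checks from the committed config file.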
Performance Tuning for Large Repos
# Enable filesystem monitor (huge speedup for git status)
git config core.fsmonitor true
git config core.untrackedcache true
# Enable commit graph for faster log operations
git commit-graph write --reachable
git config fetch.writeCommitGraph true
# Enable multi-pack index for faster object lookup
git multi-pack-index write
These settings can reduce git status from 2+ seconds to under 200ms on large repos. The filesystem monitor is the single biggest win.
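On Git 2.31+, the git maintenance command can keep these structures fresh automatically instead of relying on one-off invocations; a sketch:

```shell
# Register the repo and schedule background maintenance jobs
git maintenance start

# Or trigger individual tasks manually
git maintenance run --task=commit-graph --task=incremental-repack
```

git maintenance start writes entries into your global config and the OS scheduler, so run it once per machine per repo.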
The Bottom Line
Git at scale is not about learning obscure commands. It is about:
- Choosing the right structure (monorepo with sparse checkout vs multi-repo)
- Keeping the working directory small (sparse checkout, shallow clone)
- Keeping binary files out of Git (LFS for everything over 10 MB)
- Signing commits (trust the identity behind every change)
- Automating quality gates (pre-commit hooks, required reviews)
- Tuning performance settings (fsmonitor, commit graph, multi-pack index)
The tools exist. The challenge is not technical — it is establishing the conventions and enforcing them consistently across the team.