Version Control at Scale: Git Strategies That Survive 10,000 Commits
Monorepos, sparse checkouts, git-filter-repo, and commit signing. Practical Git strategies for teams that have outgrown basic branching workflows.
Git was designed for the Linux kernel — one of the largest open-source projects in existence. It can handle scale. But most teams do not struggle with Git's technical limits. They struggle with the workflows, conventions, and tooling around Git that break down as codebases and teams grow.
Here is what changes when your repository passes 10,000 commits, 50 contributors, or 1 GB in size — and how to adapt.
When Git Starts Feeling Slow
Git is fast. But "fast" is relative:
- git status on a repo with 100,000 files takes seconds, not milliseconds
- git log with full diff on a repo with 50,000 commits takes noticeable time
- git clone of a 5 GB repository takes minutes, not seconds
- git blame on a file with 2,000 revisions is painfully slow
These are not bugs. They are the natural consequences of storing every version of every file. The solutions are not "use a different VCS" — they are "use Git differently."
Monorepo vs Multi-Repo: The Real Trade-offs
The monorepo debate is not about Git performance. It is about organizational coordination.
Monorepo (single repo, all code)
company/
├── services/
│   ├── api/
│   ├── web-app/
│   └── worker/
├── libs/
│   ├── shared-utils/
│   └── proto-definitions/
├── infrastructure/
│   ├── terraform/
│   └── docker/
└── docs/
Advantages:
- Atomic commits across services (change API and client in one commit)
- Single source of truth for shared libraries
- Easier code discovery (grep the entire company codebase)
- Simplified dependency management (no version conflicts between repos)
Disadvantages:
- CI complexity — you need to detect which services changed and only build those
- Access control is harder (everyone can see everything by default)
- Clone time grows with the entire company's history
Who does it: Google (billions of files, custom VCS), Meta (millions of files, custom Mercurial), Microsoft (Windows repo, custom Git tooling), Uber, Airbnb, Stripe.
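The CI concern above is usually handled by diffing against the merge base and mapping changed paths to services. A minimal sketch, where the origin/main ref and the services/ layout are assumptions taken from the tree above:

```shell
#!/usr/bin/env sh
# List the services/ subdirectories that changed since the merge base
# with the comparison ref, so CI can build only those.
BASE="${1:-origin/main}"

# Three-dot diff compares HEAD against the merge base with $BASE.
# Keep the first two path components (e.g. services/api), de-duplicated.
git diff --name-only "$BASE...HEAD" \
  | grep '^services/' \
  | cut -d/ -f1-2 \
  | sort -u
```

Each line of output is a service directory; the CI pipeline can fan out one build job per line.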
Multi-Repo (one repo per service)
Advantages:
- Clear ownership boundaries
- Independent CI/CD pipelines
- Smaller, faster clones
- Natural access control
Disadvantages:
- Cross-service changes require coordinated PRs
- Shared library versioning becomes a full-time job
- Code duplication across repos
- "Which version of the shared lib does service X use?" becomes a frequent question
The pragmatic answer: Most teams under 50 engineers should start with a monorepo. Switch to multi-repo when the CI pipeline becomes the bottleneck, not before.
Sparse Checkout: Clone Less, Work Faster
Git 2.25 introduced sparse checkout, which lets you clone a repo but only populate your working directory with the files you need:
# Clone without checking out files
git clone --no-checkout https://github.com/company/monorepo.git
cd monorepo
# Initialize sparse checkout in cone mode
git sparse-checkout init --cone
# Only track the services/api and libs/shared-utils directories
git sparse-checkout set services/api libs/shared-utils
# Populate the working directory with just those paths
git checkout main
# Now your working directory only contains those paths
ls
# services/ libs/
This is transformative for monorepos. A developer working on the API service does not need 100,000 frontend files in their working directory. git status becomes fast again because Git only tracks the files you have checked out.
Combine with partial clone for even faster initial setup:
# Partial clone: download object metadata but not blob content
git clone --filter=blob:none --sparse https://github.com/company/monorepo.git
cd monorepo
git sparse-checkout set services/api
# Blobs are downloaded on demand when you access files
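The sparse set is not fixed at clone time; you can inspect it and grow it later without re-cloning. A quick sketch, where services/worker is just an example path from the tree above:

```shell
# Show which paths are currently materialized in the working directory
git sparse-checkout list

# Pull in another directory on top of the existing sparse set
git sparse-checkout add services/worker
```

With a partial clone, the first access to the new directory triggers the blob downloads for just those files.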
Git LFS: Large Files Without the Pain
Binary files (images, models, datasets, compiled assets) bloat Git repositories because Git stores every version as a full copy.
Git LFS (Large File Storage) replaces large files with pointer files in Git, storing the actual content on a separate server:
# Install and initialize
git lfs install
# Track file patterns
git lfs track "*.psd"
git lfs track "*.model"
git lfs track "datasets/**"
# Commit and push as normal
git add .gitattributes
git add model.psd
git commit -m "Add design file"
git push
When to use LFS:
- Any file over 10 MB that changes regularly
- Binary files that Git cannot diff (images, compiled binaries, ML models)
- Assets that would otherwise bloat clone times
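To audit what LFS is actually managing, two read-only commands help. A sketch, assuming git-lfs is installed and the patterns above have been tracked:

```shell
# List the tracked patterns (reads them back from .gitattributes)
git lfs track

# List the files in the current checkout that are stored as LFS pointers,
# with their object IDs
git lfs ls-files
```

If a large file shows up in git lfs ls-files, only a small pointer file lives in Git history; the content sits on the LFS server.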
Commit Signing: Trust but Verify
In a world of supply chain attacks, verifying that a commit actually came from the person it claims to came from matters.
# Configure Git to sign commits with SSH keys (Git 2.34+)
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true
Every commit now carries a cryptographic signature. On Gitea, GitHub, and GitLab, you can require signed commits on protected branches.
Branching Strategies That Scale
Trunk-Based Development
- All development happens on short-lived feature branches
- Branches live for hours to days, never weeks
- Main is always deployable
- Feature flags control visibility, not branches
Best for: Teams with strong CI, frequent deployments, experienced developers.
GitHub Flow
- Feature branches + pull requests
- Main is always deployable
- No release branches (deploy from main)
Best for: Most teams. Simple, well-understood, GitHub-native.
Release Branches
- Feature development on main
- Cut a release branch when ready to ship
- Cherry-pick fixes back to release branches
Best for: Products with multiple supported versions, mobile apps, on-premise software.
Rewriting History Safely
git-filter-repo (replacing filter-branch)
# Remove a file from all history (leaked secret, large binary)
git filter-repo --path secrets.env --invert-paths
# Remove a directory from all history
git filter-repo --path old-service/ --invert-paths
Warning: History rewriting changes commit hashes. Every contributor must re-clone after a rewrite. Communicate clearly before running this.
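Beyond deleting whole paths, git filter-repo can scrub string contents across history with --replace-text. A hedged sketch, where LEAKED_TOKEN_VALUE stands in for the actual leaked string:

```shell
# expressions.txt holds one rule per line: literal==>replacement
echo 'LEAKED_TOKEN_VALUE==>***REMOVED***' > expressions.txt

# Rewrite every commit, replacing the string wherever it appears
git filter-repo --replace-text expressions.txt
```

The same warning applies: this changes every commit hash from the first occurrence onward.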
Removing Accidentally Committed Secrets
1. Remove the file from history with git filter-repo
2. Force push (coordinate with team first)
3. Rotate the secret immediately — removing from history does not revoke the credential
4. Add the file to .gitignore
The critical step is #3. Rotate first, then clean up.
Git Hooks for Quality Gates
Use pre-commit framework for team-wide hook management:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: check-yaml
      - id: check-added-large-files
        args: ['--maxkb=1000']
      - id: detect-private-key
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
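To wire this config into a clone, the typical commands look like the following sketch, assuming a Python environment with pip available:

```shell
# Install the framework and register it as a git pre-commit hook
pip install pre-commit
pre-commit install

# Run every configured hook against the entire repo once
pre-commit run --all-files
```

After pre-commit install, the hooks run automatically on every git commit; teammates get the same checks from the committed config file.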
Performance Tuning for Large Repos
# Enable filesystem monitor (huge speedup for git status)
git config core.fsmonitor true
git config core.untrackedcache true
# Enable commit graph for faster log operations
git commit-graph write --reachable
git config fetch.writeCommitGraph true
# Enable multi-pack index for faster object lookup
git multi-pack-index write
These settings can reduce git status from 2+ seconds to under 200ms on large repos. The filesystem monitor is the single biggest win.
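On Git 2.31+, the git maintenance command can keep these structures fresh automatically instead of relying on one-off invocations; a sketch:

```shell
# Register the repo and schedule background maintenance jobs
git maintenance start

# Or trigger individual tasks manually
git maintenance run --task=commit-graph --task=incremental-repack
```

git maintenance start writes entries into your global config and the OS scheduler, so run it once per machine per repo.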
The Bottom Line
Git at scale is not about learning obscure commands. It is about:
- Choosing the right structure (monorepo with sparse checkout vs multi-repo)
- Keeping the working directory small (sparse checkout, shallow clone)
- Keeping binary files out of Git (LFS for everything over 10 MB)
- Signing commits (trust the identity behind every change)
- Automating quality gates (pre-commit hooks, required reviews)
- Tuning performance settings (fsmonitor, commit graph, multi-pack index)
The tools exist. The challenge is not technical — it is establishing the conventions and enforcing them consistently across the team.