The Dependency Divide

I was staring at a broken CI pipeline for the third hour, all because someone forgot to run git submodule update --init before pushing. It’s that sinking feeling when your "shared" code becomes a wall between you and a successful deploy, leaving you to untangle a detached HEAD state while the rest of the team waits for their builds to pass.

Shared code is a double-edged sword. You want to DRY (Don't Repeat Yourself) up your logic by pulling that Auth module or UI kit into multiple projects, but Git doesn't naturally want to live inside itself. This creates the Dependency Divide: the gap between keeping code modular and keeping it usable.

To bridge this, you usually have to pick a side: Submodules or Subtrees. Both solve the problem of nested repositories, but they do it with entirely different philosophies.

Git Submodules: The Pointer Approach

A submodule is essentially a glorified link. It doesn't actually store the code of the sub-repository inside your main (parent) repository. Instead, it stores a commit hash and a URL.

When you run:

git submodule add https://github.com/your-org/shared-utils.git libs/utils

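Git creates a .gitmodules file that maps the libs/utils path to that URL, and it records the exact commit to check out as a special entry in the parent repo's own tree. Together they tell the parent repo: "Hey, when you look at the libs/utils folder, go fetch the code from this URL at this specific commit."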
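To see where each piece actually lives, here is a rough sketch that builds two throwaway repositories in a temp directory and inspects what git submodule add recorded. The local shared-utils stand-in, the app parent, and util.sh are all made up for the demo, and the protocol.file.allow override is only there because the "remote" is a local path rather than a real URL.

```shell
set -eu

# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical stand-in for shared-utils, with a single commit.
git init -q -b main "$work/shared-utils"
echo 'echo "hello from utils"' > "$work/shared-utils/util.sh"
git -C "$work/shared-utils" add util.sh
git -C "$work/shared-utils" commit -q -m "Add util.sh"

# Hypothetical parent project.
git init -q -b main "$work/app"
cd "$work/app"
git commit -q --allow-empty -m "Initial commit"

# Recent Git versions refuse file-based submodule clones by default,
# so the demo opts in for this one command.
git -c protocol.file.allow=always submodule --quiet add "$work/shared-utils" libs/utils
git commit -q -m "Add shared-utils as a submodule"

# .gitmodules holds the path and the URL.
cat .gitmodules

# The pinned commit is a "gitlink" entry (mode 160000) in the parent's tree.
git ls-tree HEAD libs/utils

pinned=$(git rev-parse HEAD:libs/utils)
upstream=$(git -C "$work/shared-utils" rev-parse HEAD)
echo "parent pins $pinned"
```

Nothing under libs/utils is stored in the parent's history; the parent only carries that one 160000 entry plus the .gitmodules file.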

Why you’ll love (and hate) them

The big win here is isolation. The parent repo stays small because it isn't carrying the weight of the sub-repo's history. If shared-utils has 10,000 commits, your parent repo doesn't care. It only cares about one specific SHA.

But the "gotcha" is the workflow friction. If you change code inside the submodule, you have to:
1. Commit and push inside the submodule folder.
2. Go out to the parent folder.
3. Commit the "new version" of the submodule in the parent repo.

If you forget step 3, your teammates pull the parent repo and get an old version of the code. If you skip the push in step 1 but still do step 3, your teammates' builds will fail because the parent repo is pointing to a commit that only exists on your local machine. It's a manual process that invites human error.
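The second failure mode is the one Git can catch for you. The sketch below, using made-up local bare repositories (shared-utils.git and app.git) in place of hosted ones, commits inside the submodule without pushing, bumps the pointer in the parent, and then pushes the parent with --recurse-submodules=check. Treat it as an illustration of the guard rather than a full workflow.

```shell
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Bare repos standing in for the two hosted repositories (hypothetical names).
git init -q --bare -b main "$work/shared-utils.git"
git init -q --bare -b main "$work/app.git"

# Seed shared-utils with one commit.
git init -q -b main "$work/seed"
echo "v1" > "$work/seed/util.txt"
git -C "$work/seed" add util.txt
git -C "$work/seed" commit -q -m "utils v1"
git -C "$work/seed" push -q "$work/shared-utils.git" main

# Parent project with the submodule added and pushed.
git init -q -b main "$work/app"
cd "$work/app"
git commit -q --allow-empty -m "Initial commit"
git remote add origin "$work/app.git"
git -c protocol.file.allow=always submodule --quiet add "$work/shared-utils.git" libs/utils
git commit -q -m "Add shared-utils submodule"
git push -q -u origin main

# Step 1, minus the push: the new commit exists only in this clone.
echo "v2" > libs/utils/util.txt
git -C libs/utils commit -q -am "utils v2"

# Steps 2 and 3: back in the parent, record the new submodule commit.
git add libs/utils
git commit -q -m "Bump shared-utils to v2"

# check refuses to push a parent that references an unpushed submodule commit.
if git push -q --recurse-submodules=check origin main 2>/dev/null; then
  guard="missed"
else
  guard="blocked"
fi
echo "push with unpushed submodule commit: $guard"

# Finish step 1 properly, then the same push goes through.
git -C libs/utils push -q origin main
git push -q --recurse-submodules=check origin main
```

Setting push.recurseSubmodules to check in your Git config makes this the default, and on-demand goes one step further by pushing the submodule for you.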

Git Subtrees: The "All-In" Approach

If submodules are a link, subtrees are a transplant. When you add a subtree, you are literally copying all the files of the sub-repo into your parent repo and, by default, its entire history too.

git subtree add --prefix=libs/utils https://github.com/your-org/shared-utils.git main --squash

The --squash flag is your best friend here. It prevents your parent repo’s history from being flooded with every single tiny commit from the utility library by condensing them into a single squashed commit, which is then merged into your history.
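A quick way to convince yourself of the difference is to add the same library to two identical parents, once with --squash and once without, and count the commits each one ends up with. This is a sketch against throwaway local repos; shared-utils, app-squashed, and app-full are invented names, and the script bails out early if your Git install doesn't include git subtree.

```shell
set -eu

# git subtree ships in Git's contrib area, so some installs lack it.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
  echo "git subtree is not installed here; nothing to show"
  exit 0
fi

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical stand-in for shared-utils with three small commits.
git init -q -b main "$work/shared-utils"
for n in 1 2 3; do
  echo "v$n" > "$work/shared-utils/util.txt"
  git -C "$work/shared-utils" add util.txt
  git -C "$work/shared-utils" commit -q -m "utils change $n"
done

# Two identical parents, so the two modes can be compared side by side.
for name in app-squashed app-full; do
  git init -q -b main "$work/$name"
  git -C "$work/$name" commit -q --allow-empty -m "Initial commit"
done

cd "$work/app-squashed"
git subtree add --prefix=libs/utils "$work/shared-utils" main --squash

cd "$work/app-full"
git subtree add --prefix=libs/utils "$work/shared-utils" main

# Count the commits reachable from HEAD in each parent.
squashed_count=$(git -C "$work/app-squashed" rev-list --count HEAD)
full_count=$(git -C "$work/app-full" rev-list --count HEAD)
echo "commits with --squash:    $squashed_count"
echo "commits without --squash: $full_count"
```

Both parents end up with the same files under libs/utils. Only the unsquashed one carries the library's individual commits in its log.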

The Human-Friendly Factor

Subtrees are much easier for a team to manage because nobody needs to know they exist.

Once a subtree is added, it just looks like a normal folder. Your junior devs don't need to learn special Git commands. They just pull, branch, and commit like usual. If you need to push changes back to the source repo, you use a specific git subtree push command, but the day-to-day operations are invisible.
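For the one person who does need to sync with the source repo, the round trip might look roughly like this. It is a sketch against a made-up local bare repo (shared-utils.git) standing in for the hosted one, with an invented "other" clone playing the teammate who changes the library upstream.

```shell
set -eu

# git subtree ships in Git's contrib area, so some installs lack it.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
  echo "git subtree is not installed here; nothing to show"
  exit 0
fi

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
upstream="$work/shared-utils.git"

# Bare repo standing in for the hosted shared-utils, seeded with one commit.
git init -q --bare -b main "$upstream"
git init -q -b main "$work/seed"
echo "v1" > "$work/seed/util.txt"
git -C "$work/seed" add util.txt
git -C "$work/seed" commit -q -m "utils v1"
git -C "$work/seed" push -q "$upstream" main

# Parent project with the library vendored as a squashed subtree.
git init -q -b main "$work/app"
cd "$work/app"
git commit -q --allow-empty -m "Initial commit"
git subtree add --prefix=libs/utils "$upstream" main --squash

# Day-to-day work: an ordinary commit that happens to touch the subtree.
echo "v2" > libs/utils/util.txt
git commit -q -am "Tweak shared util from inside app"

# Send the commits that touched libs/utils back to the source repo.
git subtree push --prefix=libs/utils "$upstream" main
pushed=$(git --git-dir="$upstream" show main:util.txt)

# Meanwhile, someone else adds a file upstream.
git clone -q "$upstream" "$work/other"
echo "notes" > "$work/other/extra.txt"
git -C "$work/other" add extra.txt
git -C "$work/other" commit -q -m "Add extra.txt"
git -C "$work/other" push -q origin main

# Bring that change into the parent; -m supplies the merge message up front.
git subtree pull --prefix=libs/utils --squash -m "Update shared-utils" "$upstream" main
```

The upstream change here deliberately touches a different file. Squashed pulls merge snapshots rather than individual commits, so overlapping edits on both sides can still produce ordinary merge conflicts.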

The downside? Your parent repository gets heavier. You’re duplicating data. If you have 50 sub-repositories managed via subtrees, your .git folder is going to get massive.

Which one should you actually use?

I’ve spent years bouncing between these, and the decision usually comes down to who is touching the code.

Use Submodules if:
- The shared library is huge (like a game engine or a massive C++ framework) and you don't want to bloat your main repo.
- The shared library is managed by a different team, and you only want to "consume" it at specific version milestones.
- Your team is Git-savvy and won't freak out when git status reports "HEAD detached at 3a4f2b" inside the submodule.
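That detached HEAD is normal, not a sign something broke. Here is a small sketch of what a teammate or CI job sees after a plain clone, again with throwaway local repos (shared-utils, app, and ci-checkout are hypothetical names, and the protocol.file.allow override is only needed because the submodule URL is a local path).

```shell
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical shared-utils and a parent that pins it as a submodule.
git init -q -b main "$work/shared-utils"
echo 'echo "hello from utils"' > "$work/shared-utils/util.sh"
git -C "$work/shared-utils" add util.sh
git -C "$work/shared-utils" commit -q -m "Add util.sh"

git init -q -b main "$work/app"
git -C "$work/app" commit -q --allow-empty -m "Initial commit"
git -C "$work/app" -c protocol.file.allow=always submodule --quiet add "$work/shared-utils" libs/utils
git -C "$work/app" commit -q -m "Add shared-utils as a submodule"

# A plain clone leaves libs/utils as an empty directory...
git clone -q "$work/app" "$work/ci-checkout"
cd "$work/ci-checkout"
before=$(ls -A libs/utils)

# ...until the submodule is initialised and checked out at the pinned commit.
git -c protocol.file.allow=always submodule --quiet update --init --recursive

# The checkout is pinned to a commit, not a branch, hence the detached HEAD.
git -C libs/utils status --short --branch
if git -C libs/utils symbolic-ref -q HEAD >/dev/null; then
  head_state="branch"
else
  head_state="detached"
fi
echo "submodule HEAD is: $head_state"
```

Cloning with git clone --recurse-submodules does the clone and the update in one go, which is usually what a CI checkout step wants.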

Use Subtrees if:
- You are frequently editing both the parent project and the shared library at the same time.
- You want a "set it and forget it" experience for the rest of your developers.
- You're dealing with internal logic (like a shared API client) that changes alongside your main application features.

The Middle Path: Package Managers

Before you commit to either, ask yourself: *Does this need to be a Git dependency at all?*

In many cases, a private npm registry, GitHub Packages, or a private PyPI server is a better choice. Versioning through a package manager ("shared-utils": "^1.2.0") is often more robust than manually wrestling with Git hashes.

But if you need the raw source code living in your tree for debugging or rapid iteration, pick your poison carefully. Submodules keep your history clean but your workflow messy; Subtrees keep your workflow clean but your history heavy.

Choose the one that matches your team's tolerance for "Git weirdness." For most fast-moving teams, that's usually the subtree.