Hacker News

by Ryan Harman

The Future of Version Control (bramcohen.com)

208 points by c17r 4 hours ago

ulrikrasmussen 2 hours ago

The thing about how merges are presented seems orthogonal to how to represent history. I also hate the default in git, but that is why I just use p4merge as a merge tool and get a proper 4-pane merge tool (left, right, common base, merged result) which shows everything needed to figure out why there is a conflict and how to resolve it. I don't understand why you need to switch out the VCS to fix that issue.

roryokane an hour ago

Even if you don’t use p4merge, you can set Git’s merge.conflictStyle config to "diff3" or "zdiff3" (https://git-scm.com/docs/git-config#Documentation/git-config...). If you do that, Git’s conflict markers show the base version as well:

  <<<<<<< left
  ||||||| base
  def calculate(x):
      a = x * 2
      b = a + 1
      return b
  =======
  def calculate(x):
      a = x * 2
      logger.debug(f"a={a}")
      b = a + 1
      return b
  >>>>>>> right

With this configuration, a developer reading the raw conflict markers could infer the same information provided by Manyana’s conflict markers: that the right side added the logging line.
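
To make that inference concrete, here is a minimal Python sketch that splits the diff3-style block above into its three sections and diffs each side against the base. The helper names `split_diff3` and `changes` are made up for illustration; only the standard-library `difflib` is used, and the parsing assumes a single conflict block with no marker-like lines inside the code.

```python
import difflib

CONFLICT = '''<<<<<<< left
||||||| base
def calculate(x):
    a = x * 2
    b = a + 1
    return b
=======
def calculate(x):
    a = x * 2
    logger.debug(f"a={a}")
    b = a + 1
    return b
>>>>>>> right
'''

def split_diff3(block):
    """Split one diff3-style conflict block into (left, base, right) line lists."""
    sections = {"left": [], "base": [], "right": []}
    current = None
    for line in block.splitlines():
        if line.startswith("<<<<<<<"):
            current = "left"
        elif line.startswith("|||||||"):
            current = "base"
        elif line.startswith("======="):
            current = "right"
        elif line.startswith(">>>>>>>"):
            current = None
        elif current is not None:
            sections[current].append(line)
    return sections["left"], sections["base"], sections["right"]

def changes(base, side):
    """Return (added, removed) lines of `side` relative to `base`."""
    added, removed = [], []
    for line in difflib.ndiff(base, side):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return added, removed

left, base, right = split_diff3(CONFLICT)
left_added, left_removed = changes(base, left)
right_added, right_removed = changes(base, right)
print("left removed", len(left_removed), "base lines and added", len(left_added))
print("right added:", right_added)
```

Without the base section, the same block would only show an empty left side and a five-line right side, and you could not tell an addition on the right from a deletion on the left.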

psychoslave an hour ago

That still has an issue with the vocabulary. Terms like "theirs/ours" are still out of touch, but it's already better than a loose spatial analogy on some representation of the DAG.

Something like base, that is "common base", looks far more apt to my mind. In the same vein, endogenous/exogenous would be far more precise, or at least aligned with the concern at stake. Maybe "local/alien" might be a less pompous vocabulary to convey the same idea.

ktm5j an hour ago

I'm on my phone right now so I'm not going to dig too hard for this, but you can also configure a "merge tool" (or something like that) so you can use Meld or Kompare to make the process easier. This has helped me in a pinch to work out some confusing merge conflicts.

IshKebab 6 minutes ago

This is better but it still doesn't really help when the conflict is 1000 lines and one side changed one character and the other deleted the whole thing. That isn't theoretical - it happens quite regularly.

What you really need is the ability to diff the base and "ours" or "theirs". I've found that most UIs can't do this. VSCode can, but it's difficult to get to.

I haven't tried p4merge though - if it can do that I'm sold!
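
For what it's worth, during a conflicted merge Git keeps all three versions of the file in the index as stages 1 (base), 2 (ours) and 3 (theirs), readable with `git show :N:path`. A rough Python sketch that diffs one side against the base could look like this; the function names are invented for illustration, and it assumes it runs from the repository root while the merge is still unresolved.

```python
import difflib
import subprocess

# Index stage numbers Git uses for a conflicted path.
STAGES = {"base": 1, "ours": 2, "theirs": 3}

def read_stage(path, name):
    """Read one version of a conflicted file via `git show :N:path`."""
    result = subprocess.run(
        ["git", "show", f":{STAGES[name]}:{path}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout

def side_diff(base_text, side_text, side_name):
    """Unified diff of one side against the common base (pure function)."""
    return "".join(difflib.unified_diff(
        base_text.splitlines(keepends=True),
        side_text.splitlines(keepends=True),
        fromfile="base", tofile=side_name,
    ))

def diff_against_base(path, side_name):
    """Diff 'ours' or 'theirs' against the base for a conflicted path."""
    return side_diff(read_stage(path, "base"), read_stage(path, side_name), side_name)
```

With a 1000-line conflict where one side changed a single character, `diff_against_base(path, "ours")` would show just that one-line change, and the other call would show the wholesale deletion.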

cxr 22 minutes ago

> I don't understand why you need to switch out the VCS to fix that issue.

For some reason, when it comes to this subject, most people don't think about the problem as much as they think they've thought about it.

I recently listened to an episode on a well-liked and respected podcast featuring a guest there to talk about version control systems—including their own new one they were there to promote—and what factors make their industry different from other subfields of software development, and why a new approach to version control was needed. They came across as thoughtful but exasperated with the status quo and brought up issues worthy of consideration while mostly sticking to high-level claims. But after something like a half hour or 45 minutes into the episode, as they were preparing to descend from the high level and get into the nitty gritty of their new VCS, they made an offhand comment contrasting its abilities with Git's, referencing Git's approach/design wrt how it "stores diffs" between revisions of a file. I was bowled over.

For someone to be in that position and not have done even a cursory amount of research before embarking on a months (years) long project to design, implement, and then go on the talk circuit to present their VCS really highlighted that the familiar strain of NIH is still alive, even in the current era where it's become a norm for people to be downright resistant to writing a couple dozen lines of code themselves if there is no existing package to import from NPM/Cargo/PyPI/whatever that purports to solve the problem.

crote 2 hours ago

Seconding the use of p4merge for easy-to-use three-pane merging. Just like most other issues with Git, if your merges are painful it's probably due to terrible native UX design - not due to anything conceptually wrong with Git.

roryokane 2 hours ago

Did you know that VS Code added support for the same four-pane view as p4merge years ago? I used p4merge as my merge tool for a long time, but I switched to VS Code when I discovered that, as VS Code’s syntax highlighting and text editing features are much better than p4merge’s.

I also use the merge tool of JetBrains IDEs such as IntelliJ IDEA (https://www.jetbrains.com/help/idea/resolve-conflicts.html#r...) when working in those IDEs. It uses a three-pane view, not a four-pane view, but there is a menu that allows you to easily open a comparison between any two of the four versions of the file in a new window, so I find it similarly efficient.

TacticalCoder 2 hours ago

Thirding it, except I do it from Emacs. Three side-by-side panes with left / common ancestor / right, and then the merge result below. By default it's not like that, but then it's Emacs, so anything is doable. I hacked some elisp code a great many years ago and I've been using it ever since.

No matter the tool, merges should always be presented like that. It's the only presentation that makes sense.

echrisinger 2 minutes ago

Has anyone considered a VCS that integrates more vertically with the source code through ASTs?

I.e., if I change something in my data model, that change & context could be surfaced with agentic tooling.

radarsat1 3 hours ago

Is it a good thing to have merges that never fail? Often a merge failure indicates a semantic conflict, not just "two changes in the same place". You want to be aware of and forced to manually deal with such cases.

I assume the proposed system addresses it somehow but I don't see it in my quick read of this.

hungryhobbit 2 hours ago

They address this; it's not that they don't fail, in practice...

the key insight is that changes should be flagged as conflicting when they touch each other, giving you informative conflict presentation on top of a system which never actually fails.

recursivecaveat 2 hours ago

It says that merges that involve overlap get flagged to the user. I don't think that's much more than a defaults difference to git really. You could have a version of git that just warns on conflict and blindly concats the sides.

gojomo 2 hours ago

Should you be counting on confusion of an underpowered text-merge to catch such problems?

It'll fire on merge issues that aren't code problems under a smarter merge, while also missing all the things that merge OK but introduce deeper issues.

Post-merge syntax checks are better for that purpose.

And imminently: agent-based sanity-checks of preserved intent – operating on a logically-whole result file, without merge-tool cruft. Perhaps at higher intensity when line-overlaps – or even more-meaningful hints of cross-purposes – are present.
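
As a rough illustration of a post-merge syntax check for Python sources, the sketch below uses the standard-library `ast` module. The function name `post_merge_check` is hypothetical, and a real setup would more likely run the project's compiler, linter or test suite.

```python
import ast

def post_merge_check(source, filename="<merged>"):
    """Return None if `source` parses as Python, else a short error string."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None

# Leftover conflict markers are caught, because they are not valid Python.
with_markers = "<<<<<<< left\nx = 1\n=======\nx = 2\n>>>>>>> right\n"
print(post_merge_check(with_markers))

# A merge result that parses but calls f() with too few arguments passes
# this check, so a syntax check alone says nothing about preserved intent.
parses_but_wrong = "def f(a, b):\n    return a + b\n\ndef g():\n    return f(1)\n"
print(post_merge_check(parses_but_wrong))
```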

skydhash 2 hours ago

> It'll fire on merge issues that aren't code problems under a smarter merge, while also missing all the things that merge OK but introduce deeper issues.

That has not been my experience at all. The changes you introduce are your responsibility. If you synchronize your working tree with the source of truth, you need to evaluate your patch again, whether it introduces a conflict or not. In this case a conflict is a nice signal showing where someone has interacted with files you've touched and possibly changed their semantics. The pros are substantial, and it's quite easy to resolve conflicts that are only due to syntactic changes (whitespace, formatting, equivalent statements, ...).

jwilliams 2 hours ago

Indeed. And plenty of successful merges end up with code that won't compile.

FWIW I've struggled to get AI tools to handle merge conflicts well (especially rebase) for the same underlying reason.

layer8 2 hours ago

Code not compiling is still the good case, because you’ll notice before deployment. The dangerous cases are when it does compile.

skydhash 2 hours ago

I'm surprised to see that some people sync their working tree and do not evaluate their patch again (testing and reviewing the assumptions they made for their changes).

conradludgate 2 hours ago

My understanding of the way this is presented is that merges don't _block_ the workflow. In git, a merge conflict is a failure to merge, but in this idea a merge conflict is still present but the merge still succeeds. You can commit with conflicts unresolved. This allows you to defer conflict resolution to later. I believe jj does this as well?

Technically you could include conflict markers in your commits but I don't think people like that very much

rightbyte 2 hours ago

> You can commit with conflicts unresolved.

True but it is not valid syntax. Like, you mean with the conflict lines?

rectang 2 hours ago

I agree. Nevertheless I wonder if this approach can help with certain other places where Git sometimes struggles, such as whether or not two commits which have identical diffs but different parents should be considered equivalent.

In the general case, such commits cannot be considered the same — consider a commit which flips a boolean that one branch had flipped in another file. But there are common cases where the commits should be considered equivalent, such as many rebased branches. Can the CRDT approach help with e.g. deciding that `git branch -d BRANCH` should succeed when a rebased version of BRANCH has been merged?
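
Git already ships `git patch-id` and `git cherry` for comparing commits by their diff rather than by their hash. The Python sketch below is only a rough stand-in for that idea, not Git's actual algorithm: the hypothetical `rough_patch_id` hashes just the added and removed lines of a unified diff, so the same change replayed on a different parent gets the same id.

```python
import hashlib

def rough_patch_id(diff_text):
    """Hash only the added/removed lines of a unified diff, ignoring positions.

    Assumption for this sketch: context lines, file headers and hunk headers
    are skipped, and whitespace inside changed lines is ignored.
    """
    digest = hashlib.sha1()
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---", "@@", "index ", "diff ")):
            continue  # headers and hunk positions depend on the parent
        if line.startswith(("+", "-")):
            digest.update("".join(line.split()).encode("utf-8"))
            digest.update(b"\n")
    return digest.hexdigest()

original = (
    "--- a/f.py\n+++ b/f.py\n@@ -1,3 +1,3 @@\n"
    " x = 1\n-flag = False\n+flag = True\n y = 2\n"
)
rebased = (
    "--- a/f.py\n+++ b/f.py\n@@ -10,3 +10,3 @@\n"
    " z = 0\n-flag = False\n+flag = True\n y = 2\n"
)
print(rough_patch_id(original) == rough_patch_id(rebased))
```

As noted above, equal ids only say the textual change is the same. They do not prove the two commits mean the same thing on their respective parents.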

mikey-k 3 hours ago

This

barrkel 39 minutes ago

I don't really get the upside of focus on CRDTs.

The semantic problem with conflicts exists either way. You get a consistent outcome and a slightly better description of the conflict, but in a way that possibly interleaves changes, which I don't think is an improvement at all.

I am completely rebase-pilled. I believe merge commits should be avoided at all costs, every commit should be a fast forward commit, and a unit of work that can be rolled back in isolation. And also all commits should be small. Gitflow is an anti-pattern and should be avoided. Long-running branches are for patch releases, not for feature development.

I don't think this is the future of VCS.

Jujutsu (and Gerrit) solve a real git problem: multiple revisions of a change. That's one that creates pain in git when you have a chain of commits you need to rebase based on feedback.

hackrmn 15 minutes ago

When you say "unit of work", unit of _which_ work are you referring to? The problem with rebasing is that it takes one set of snapshots and replays them on top of another set, so you end up with two "equivalent" units of work. In fact they're _the same_ indeed -- the tree objects are shared, except that if by "work" you mean changes, Git is going to tell you two different histories, obviously.

This is in contrast with [Pijul](https://pijul.org), where changes are patches and are commutative -- you can apply an entire set and the result is supposed to be equivalent regardless of the order the patches are applied in. Now _that_ is a "unit of work" that I understand can be applied and undone in "isolation".

Everything else is messy, in my eyes, but perhaps it's orderly to other people. I mean it would be nice if a software system defined with code could be expressed with a set of independent patches where each patch is "atomic" and a feature or a fix etc, to the degree it is possible. With Git, that's a near-impossibility _in the graph_ -- sure you can cherry-pick or rebase a set of commits that belong to a feature (normally on a feature branch), but _why_?

barrkel 3 minutes ago

By "unit of work", I mean the atomic delta which can, on its own, become part of the deployable state of the software. The thing which has a Change-Id in Gerrit.

The delta is the important thing. Git is deficient in this respect; it doesn't model a delta. Git hashes identify the tip of a tree.

When you rebase, you ought to be rebasing the change, the unit of work, a thing with an identity separate and independent of where it is based from.

And this is something that the jujutsu / Gerrit model fixes.

IgorPartola 20 minutes ago

I used to use rebase much more than merge but have grown to be more nuanced over the years:

Merge commits from main into a feature branch are totally fine and easier to do than rebasing. After your feature branch is complete you can do one final main-to-feature-branch merge and then merge the feature branch into main with a squash commit.

When updating any branch from remote, I always do a pull rebase to avoid merge commits from a simple pull. This works well 99.99% of the time since what I have changed vs what the remote has changed is obvious to me.

When I work on a project with a dev branch I treat feature branches as coming off dev instead of main. In this case I merge dev into feature branches, then merge feature branches into dev via a squash commit, and then merge main into dev and dev into main as the final step. This way I have a few merge commits on dev and main but only when there is something like an emergency fix that happens on main.

The problem with always using a rebase is that you have to reconcile conflicts at every commit along the way instead of just the final result. That can be a lot more work for commits that will never actually be used to run the code and can in fact mess up your history. Think of it like this:

1. You create branch foo off main.

2. You make an emergency commit to main called X.

3. You create commits A, B, and C on foo to do your feature work. The feature is now complete.

4. You rebase foo off main and have to resolve the conflict introduced by X happening before A. Let’s say it conflicts with all three of your commits (A, B, and C).

5. You can now merge foo into main with it being a fast forward commit.

Notice that at no point will you want to run the codebase such that it has commits XA or XAB. You only want to run it as XABC. In fact you won’t even test if your code works in the state XA or XAB so there is little point in having those checkpoints. You care about three states: main before any of this happened since it was deployed like that, main + X since it was deployed like that, and main with XABC since you added a feature. git blame is really the only time you will ever possibly look at commits A and B individually and even then the utility of it is so limited it isn’t worth it.

The reality is that if you only want fast forward commits, chances are you are doing very little to go back and extract code out of old versions of the codebase. You can tell this by asking yourself: “if I deleted all my git history from main and have just the current state + feature branches off it, will anything bad happen to my production system?” If not, you are not really doing most of what git can do (which is a good thing).
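
The X, A, B, C scenario can be put into a deliberately crude Python model. It assumes a conflict arises whenever a replayed commit touches a file that X also touched, which is coarser than Git's real hunk-level detection, and the function names are invented for this sketch.

```python
def rebase_stops(upstream, commits):
    """Count replayed commits that overlap the upstream change (one stop each)."""
    return sum(1 for touched in commits if touched & upstream)

def merge_stops(upstream, commits):
    """A merge reconciles the combined result once, if anything overlaps at all."""
    combined = set().union(*commits)
    return 1 if combined & upstream else 0

# X is the emergency commit on main; A, B and C are the feature commits on foo.
X = {"config.py"}
A = {"config.py", "feature.py"}
B = {"config.py"}
C = {"config.py", "tests.py"}

print("rebase stops:", rebase_stops(X, [A, B, C]))
print("merge stops:", merge_stops(X, [A, B, C]))
```

Under these assumptions the rebase stops three times and the merge once, which is the extra work described above for the intermediate states XA and XAB.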

gzread 22 minutes ago

People see that CRDTs have no conflicts and proclaim them as the solution to all problems, not seeing that some problems inherently have conflicts and either can't be represented by CRDTs at all, or that the use of CRDTs resolves conflicts in a way that's worse than if you actually thought about conflict resolution. E.g. that multiplayer text editor that interleaved characters from simultaneous edits.
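
The interleaving failure is easy to reproduce with a toy state-based CRDT in Python. This is a deliberately naive design written for illustration, not the algorithm of any particular editor: each typed character becomes an `(offset, site, char)` element, merge is set union, and the text is rendered by sorting the elements.

```python
def local_insert(site, text):
    """Characters one site typed at the same cursor position, as CRDT elements."""
    return {(offset, site, ch) for offset, ch in enumerate(text)}

def merge(a, b):
    """Set union: commutative, associative and idempotent, so replicas converge."""
    return a | b

def render(state):
    """Order elements by (offset, site) and join the characters."""
    return "".join(ch for _, _, ch in sorted(state))

alice = local_insert("A", "foo")
bob = local_insert("B", "bar")

print(render(merge(alice, bob)))
print(render(merge(alice, bob)) == render(merge(bob, alice)))
```

Both replicas converge on the same string, yet that string is neither "foobar" nor "barfoo". Convergence is guaranteed; a result anyone intended is not.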

bos 3 hours ago

This is sort of a revival and elaboration of some of Bram’s ideas from Codeville, an earlier effort that dates back to the early 2000s Cambrian explosion of DVCS.

Codeville also used a weave for storage and merge, a concept that originated with SCCS (and thence into Teamware and BitKeeper).
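
For readers who haven't met weaves: the idea is to keep every line that ever existed in one interleaved sequence and record which revision added it and which revisions deleted it. The Python sketch below is a simplified model of that idea, not the actual SCCS, BitKeeper or Codeville format, and the names `WeaveLine` and `extract` are made up.

```python
from dataclasses import dataclass, field

@dataclass
class WeaveLine:
    text: str
    added_by: str
    deleted_by: set = field(default_factory=set)

def extract(weave, revisions):
    """Lines alive when exactly the given set of revisions is applied."""
    return [
        line.text for line in weave
        if line.added_by in revisions and not (line.deleted_by & revisions)
    ]

# "base" adds the function, "right" adds a logging line, "left" deletes the function.
weave = [
    WeaveLine("def calculate(x):", "base", {"left"}),
    WeaveLine("    a = x * 2", "base", {"left"}),
    WeaveLine('    logger.debug(f"a={a}")', "right"),
    WeaveLine("    b = a + 1", "base", {"left"}),
    WeaveLine("    return b", "base", {"left"}),
]

print(extract(weave, {"base", "right"}))
print(extract(weave, {"base", "left", "right"}))
```

Merging all three revisions leaves only the orphaned logging line, which is exactly the kind of result a tool built on a weave would want to flag for the user.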

Codeville predates the introduction of CRDTs by almost a decade, and at least on the face of it the two concepts seem like a natural fit.

It was always kind of difficult to argue that weaves produced unambiguously better merge results (and more limited conflicts) than the more heuristically driven approaches of git, Mercurial, et al, because the edit histories required to produce test cases were difficult (at least for me) to reason about.

I like that Bram hasn’t let go of the problem, and is still trying out new ideas in the space.

dboreham 2 hours ago

Note that CRDT isn't "a thing". The CRDT paper provides a way to think about and analyze eventually consistent replication mechanisms. So CRDTs weren't "introduced", only the "CRDT way of discussing replication". Every concrete mechanism described in the CRDT paper is very old, widely used for decades beforehand.

This means that everything that implements eventual consistency (including Git) is using "a CRDT".

hrmtst93837 an hour ago

If you stretch "CRDT" to mean any old eventually consistent thing, almost every Unix tool morphs into one under a loose enough definition. That makes the term much less useful, because practical CRDTs in 2024 usually mean opaque merge semantics, awkward failure modes, and operational complexity that has very little in common with the ancient algorithms people point at when they say "Git is a CRDT too". "Just Git" is doing a lot of work there.

simonw 3 hours ago

This thing is really short. https://github.com/bramcohen/manyana/blob/main/manyana.py is 473 lines of dependency-free Python (that file only imports difflib, itertools and inspect) and of that ~240 lines are implementation and the rest are tests.

zahlman 2 hours ago

It's really impressive what can be done in a few hundred lines of well-thought-out Python without resorting to brutal hacks. People complain about left-pad incidents etc. in the JS world but I honestly feel like the Python ecosystem could do with more, smaller packages on balance. They just have to be put forward by responsible people who aren't trying to make a point or inflate artificial metrics.

gavinhoward an hour ago

Bram Cohen is awesome, but this feels a little bare. I've put much more thought into version control ([1]), including the use of CRDTs (search for "# History Model" and read through the "Implementing CRDTs" section).

[1]: https://gavinhoward.com/uploads/designs/yore.md

AceJohnny2 37 minutes ago

That's worth making a separate post! (and I recommend rendering it to HTML)

But "bare" is part of the value of Cohen's post, I think. When you want to publicize a paradigm shift, it helps to make it in small, digestible chunks.

63stack 37 minutes ago

Is this the Bram Cohen who made bittorrent? There is surprisingly little information on this page.

vessenes 29 minutes ago

Yes

ZoomZoomZoom 3 hours ago

The key insight in the third sentence?

> ... CRDTs for version control, which is long overdue but hasn’t happened yet

Pijul happened and it has hundreds - perhaps thousands - of hours of real expert developers' toil put into it.

Not that Bram is not one of those, but the post reads like you all know what.

vova_hn2 an hour ago

I have a weird hobby: about once a year I go to the theory page [0] in pijul manual and see if they have fixed the TeX formatting yet.

You would think that if a better, more sound model of storing patches is your whole selling point, you would want to make it as easy as possible for people who are interested in the project to actually understand it. It is really weird not to care about the first impression that your manual makes on a curious reader.

Currently, I'm about 6 years into the experiment.

Approximately 2 years in (about 4 years ago), I actually went to the Pijul Nest and reported [1] the issue. I got an explanation of how to fix the issue locally, but weirdly enough, the fix still hasn't been applied to the public version.

I'll report back in about a year with an update on the experiment.

[0] https://pijul.org/manual/theory.html

[1] https://nest.pijul.com/pijul/manual/discussions/46

AceJohnny2 35 minutes ago

> It is really weird not to care about the first impression that your manual makes on a curious reader.

On the contrary, I think this is an all-too-familiar pitfall for the, er... technically minded.

"I've implemented it in the code. My work here is done. The rest is window dressing."

rbsmith an hour ago

Do you use Pijul?

From time to time, I do a 'pijul pull -a' into the pijul source tree, and I get a conflict (no local work on my part). Is there a way to do a tracking update pull? I didn't see one, so I toss the repo and reclone. What works for you in tracking what's going on there?

simonw 3 hours ago

I hadn't heard of Pijul. My first search took me to https://github.com/8l/pijul which hasn't been updated in 11 years, but it turns out that's misleading and the official repo at https://nest.pijul.com/pijul/pijul had a commit last month.

... and of course it is, because Pijul uses Pijul for development, not Git and GitHub!

codethief 2 hours ago

> I hadn't heard of Pijul

I'm surprised! Pijul has been discussed here on HN many, many times. My impression is that many people here were hoping that Pijul might eventually become a serious Git contender but these days people seem to be more excited about Jujutsu, likely because migration is much easier.

idoubtit 3 hours ago

The canonical website is https://pijul.org. The homepage has a link to the pijul source repository.

merlindru 18 minutes ago

I recently found a project called sem[1] that does git diffs but is aware of the language itself, giving feedback like "function validateToken added", "variable xyzzy removed", ...

i think that's where version control is going. especially useful with agents and CI

[1] https://ataraxy-labs.github.io/sem/
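
To give a feel for what a language-aware diff involves, here is a tiny Python sketch built on the standard-library `ast` module. It is not how sem works internally (that is not something this sketch claims to know); `function_names` and `semantic_diff` are hypothetical helpers that only report added and removed function definitions.

```python
import ast

def function_names(source):
    """Names of all functions defined anywhere in a Python source string."""
    return {
        node.name for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

def semantic_diff(old, new):
    """Describe a change in terms of functions rather than lines."""
    before, after = function_names(old), function_names(new)
    return (
        [f"function {name} added" for name in sorted(after - before)]
        + [f"function {name} removed" for name in sorted(before - after)]
    )

old_source = "def login(user):\n    return True\n"
new_source = (
    "def login(user):\n    return True\n\n"
    "def validate_token(token):\n    return bool(token)\n"
)
print(semantic_diff(old_source, new_source))
```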

gnarlouse 3 hours ago

I think something like this needs to be born out of analysis of gradations of scales of teams using version control systems.

- What kind of problems do 1 person, 10 person, 100 person, 1k (etc) teams really run into with managing merge conflicts?

- What do teams of 1, 10, 100, 1k, etc care the most about?

- How does the modern "agent explosion" potentially affect this?

For example, my experience working in the 1-100 regime tells me that, for the most part, the kind of merge conflict being presented here is resolved by assigning subtrees of code to specific teams. By and large, merge conflicts don't happen, because teams coordinate (in sprints) to make orthogonal changes, and long-running stale branches are discouraged.

However, if we start to mix in agents, a 100 person team could quickly jump into a 1000 person team, esp if each person is using subagents making micro commits.

It's an interesting idea definitely, but without real-world data, it kind of feels like this is just delivering a solution without a clear problem to assign it to. Like, yes merge-conflicts are a bummer, but they happen infrequently enough that it doesn't break your heart.

CuriouslyC 2 hours ago

Team scale doesn't tend to impact this that much, since as teams grow they naturally specialize in parts of the codebase. Shared libs can be hotspots, I've heard horror stories at large orgs about this sort of thing, though usually those shared libs have strong gatekeeping that makes the problem more one of functionality living where it shouldn't to avoid gatekeeping than a shared lib blowing up due to bad change set merges.

tasuki an hour ago

> What kind of problems do 1 person, 10 person, 100 person, 1k (etc) teams really run into with managing merge conflicts?

> What do teams of 1, 10, 100, 1k, etc care the most about?

Oh god no! That would be about the worst way to do it.

Just make it conceptually sound.

gnarlouse 30 minutes ago

Probably, but just introducing CRDTs also feels like the wrong way to approach the problem! :)

mikey-k 3 hours ago

Interesting idea. While conflicts can be improved, I personally don't see it as a critical challenge with VCS.

What I do think is the critical challenge (particularly with Git) is scalability.

Size of repository & rate of change of repositories are starting to push the limits of git, and I think this needs to be revisited across the server, client & wire protocols.

What exactly, I don't know. :) But I do know that in my current role (at a mid-size, well-known tech company) we are hitting these limits today.

layer8 2 hours ago

One solution is to decompose your code into modules with stable interfaces and reference them as versioned dependencies.

bob1029 an hour ago

I think there are still strong advantages to the centralized locking style of collaboration. The challenge is that it seems to work best in a setting where everyone is in the same physical location while they are working. You can break a lock in 30 seconds with your voice. Locking across time zones and date lines is a nonstarter by comparison.

fn-mote 25 minutes ago

It seems like in a reasonably sized org you should not be merging so often that “centralized locking … across time zones” should be an issue.

Are people really merging that often? What is being merged? Doc fixes?

nkmnz an hour ago

I don't quite understand how CRDTs should help with merges. The difficult thing about merges is not that two changes touch the same part of the code; the difficult thing is that two changes can touch different parts of the code and still break each other - right?

AceJohnny2 42 minutes ago

Eh. It's a matter of visible pain vs invisible pain.

Developers are quite familiar with Merge Conflicts and the confusing UI that git (and SVN before it, in my experience) gives you about them, the unhelpful "ours vs theirs" nomenclature, etc. This is something that seems improvable in a VCS, QED this post.

Versus the scenario you're describing (what I call Logical Conflicts), where two changes touch different parts of the code (so no Merge Conflict emerges) but still break each other. For example, one change adds a function call in one file while another change alters the API in a different file.

These are painful in a different way, and not something that a simple text-based version control (which is all of the big ones) can even see.

Indeed, CRDTs do not help with Logical Conflicts.
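
A small Python sketch of such a Logical Conflict, using made-up file contents: the two changes touch different files, so a per-file merge accepts them without complaint, and the breakage only shows up when the merged code runs. The helper `merge_by_file` is a toy stand-in for a text-based merge, not any real VCS behaviour.

```python
BASE = {
    "geometry.py": "def area(w, h):\n    return w * h\n",
    "report.py": "def report():\n    return 'no data'\n",
}
# Change A edits only report.py: it adds a call to area() with two arguments.
CHANGE_A = {"report.py": "def report():\n    return area(3, 4)\n"}
# Change B edits only geometry.py: it makes area() require a third argument.
CHANGE_B = {"geometry.py": "def area(w, h, scale):\n    return w * h * scale\n"}

def merge_by_file(base, ours, theirs):
    """Toy textual merge: only complain when both sides edited the same file."""
    overlap = set(ours) & set(theirs)
    if overlap:
        raise ValueError(f"textual conflict in {sorted(overlap)}")
    return {**base, **ours, **theirs}

def run(files):
    """Load every file into one namespace and call report()."""
    namespace = {}
    for name in sorted(files):
        exec(files[name], namespace)
    return namespace["report"]()

merged = merge_by_file(BASE, CHANGE_A, CHANGE_B)  # no conflict is reported

print(run({**BASE, **CHANGE_A}))  # each change works on its own
try:
    run(merged)
except TypeError as exc:
    print("merged result is broken:", exc)
```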

WCSTombs 2 hours ago

For the conflicts, note that in Git you can do

    git config --global merge.conflictstyle diff3

to get something like what is shown in the article.

logicprog 3 hours ago

This seems like an excellent idea. I'm sure a lot of us have been idly wondering why CRDTs aren't used for VCS for some time, so it's really cool to see someone take a stab at it! We really do need an improvement over git; the question is how to overcome network effects.

vishvananda 3 hours ago

This is actually a very interesting moment to potentially overcome network effects, because more and more code is going to be written by agents. If a crdt approach is measurably better for merging by agent swarms then there is incentive to make the switch. It is also much easier to get an agent to change its workflow than a human. The only tricky part is how much git usage is in the training set, so some careful thought would need to be given to creating a compatibility layer in the tooling to help agents along.

NetOpWibby 3 hours ago

Overcoming network effects cannot be the goal; otherwise, work will never get done.

The goal should be to build a full spec and then build a code forge and ecosystem around this. If it’s truly great, adoption will come. Microsoft doing a terrible job with GitHub is great for new solutions.

righthand 3 hours ago

Well over half of all people can’t tell you the difference between git and GitHub, the latter being owned by a corporation that needs the network effect to keep existing.

lemonwaterlime 2 hours ago

See vim-mergetool[1]. I use it to manage merge conflicts and it's quite intuitive. I've resolved conflicts that other people didn't even want to touch.

[1]: https://github.com/samoshkin/vim-mergetool

mentalgear 2 hours ago

Looks like the VS Code diff view.

lasgawe 2 hours ago

This is a really interesting and well thought out idea, especially the way it turns conflicts into something informative instead of blocking. The improved conflict display alone makes it much easier to understand what actually happened. I think using CRDTs to guarantee merges always succeed while still keeping useful history feels like a strong direction for version control. Looks like a solid concept!

a-dub 2 hours ago

doesn't the side by side view in github diff solve this?

conflict free merging sounds cool, but doesn't that just mean that a human review step is replaced by "changes become intervals rather than collections of lines" and "last set of intervals always wins"? seems like it makes sense when the conflicts are resolved instantaneously during live editing but does it still make sense with one shot code merges over long intervals of time? today's systems are "get the patch right" and then "get the merge right"... can automatic intervalization be trusted?

edit: actually really interesting if you think about it. crdts have been proven with character at a time edits and use of the mouse select tool.... these are inherently intervalized (select) or easy (character at a time). how does it work for larger patches that can have loads of small edits?

jFriedensreich 2 hours ago

starts with “based on the fundamentally sound approach of using CRDTs for version control”. How on earth is crdt a sound base for a version control system? This makes no sense fundamentally, you need to reach a consistent state that is what you intended not what some crdt decided and jj shows you can do that also without blocking on merges but with first level conflicts that need to be resolved. ai and language aware merge drivers are helping so much here i really wonder if the world these “replace version control” projects were made for still exists at all.

nozzlegear 2 hours ago

> ai and language aware merge drivers are helping so much here i really wonder if the world these “replace version control” projects were made for still exists at all.

I really wonder what kinds of magical AI you're using, because in my experience, Claude Code chokes and chokes hard on complex rebases/merge conflicts to the point that I couldn't trust it anymore.

miloignis 2 hours ago

The rest of the article shows exactly how a CRDT is a sound base for a version control system, with "conflicts" and all.

skydhash 2 hours ago

But the presentation does not show how it resolves conflicts. For the first example, Git has the 3-way merge that shows the same kind of info. And a conflict is not only there to show that two people have worked on a file. More often than not, it highlights a semantic change that happened differently in two instances, and it's a nice signal to pay attention to this area. But a lot of people take merge conflicts as some kind of nuisance that prevents them from doing their job (more often due to the opinion that their version is the only good one).

BlueHotDog2 an hour ago

This is cool and i keep thinking about CRDTs as a baseline for version control, but CRDTs have some major issues, mainly the fact that most of them are strict and "magic" in the way they actually converge (like the joke: CRDTs always converge, but to what). i didn't read if he's using some special CRDT that might solve for that, but i think that for agentic work especially this is very interesting

mentalgear 2 hours ago

> [CRDT] This means merges don’t need to find a common ancestor or traverse the DAG. Two states go in, one state comes out, and it’s always correct.

Well, isn't that what the CRDT does in its own data structure ?

Also keep in mind that syntactic correctness doesn't mean functional correctness.

Retr0id an hour ago

Yes.

There are many ways to instantiate a CRDT, and a trivial one would be "last write wins". LWW is obviously not what you'd want for source version control. It is "correct" per its own definition, but it is not useful.
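To make the LWW point concrete, here is a minimal sketch of a last-write-wins register. The class name, the fields, and the tie-break on replica name are illustrative assumptions, not anything taken from the article. The merge is commutative, associative, and idempotent, so every replica converges, but it converges by silently discarding one side's edit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Hypothetical last-write-wins register holding one line of source."""
    value: str
    timestamp: int
    replica: str  # tie-breaker, assumed unique per writer

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep whichever write has the larger (timestamp, replica) pair.
        return max(self, other, key=lambda r: (r.timestamp, r.replica))


alice = LWWRegister("return a + 1", timestamp=10, replica="alice")
bob = LWWRegister("return a + 2", timestamp=11, replica="bob")
carol = LWWRegister("return a + 3", timestamp=9, replica="carol")

# The merge order does not matter, so all replicas end up in the same state...
assert alice.merge(bob) == bob.merge(alice)
assert alice.merge(bob).merge(carol) == alice.merge(bob.merge(carol))
assert alice.merge(alice) == alice

# ...but that state is simply Bob's write; Alice's and Carol's edits are gone.
print(alice.merge(bob).merge(carol).value)
```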

Anyone saying "CRDTs solve this" without elaborating on the specifics of their CRDT is not saying very much at all.

sibeliuss an hour ago

Why must everyone preprocess their blog posts with ChatGPT? It is such a disservice to one's ideas.

lowbloodsugar 7 minutes ago

Araxis merge. Four views. Theirs, ours, base and “what you did so far in this damned merge hell”.

phtrivier 2 hours ago

A suggestion: is there any info to provide in diffs that is faster to parse than "left" and "right"? Can the system have enough data to print "[email protected] changed this"?

lifeformed 3 hours ago

My issue with git is handling non-text files, which is a common issue with game development. git-lfs is okay but it has some tricky quirks, and you end up with lots of bloat, and you can't merge. I don't really have an answer to how to improve it, but it would be nice if there was some innovation in that area too.

samuelstros an hour ago

Improving on "git not handling non-text files" takes semantic understanding, i.e. a parse step in between the file write and the change tracking.

Take a docx, write the file, parse it into entities e.g. paragraph, table, etc. and track changes on those entities instead of the binary blob. You can apply the same logic to files used in game development.
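A rough sketch of the idea, not lix's actual API: assume a parser has already turned each version of the document into a mapping from a stable entity ID to that entity's content. The IDs, the `diff_entities` helper, and the sample data are all hypothetical. Changes are then tracked per entity instead of per binary blob.

```python
def diff_entities(old: dict, new: dict) -> dict:
    """Return {entity_id: (kind, old_content, new_content)} for changed entities."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            changes[key] = ("removed", old[key], None)
        elif key not in old:
            changes[key] = ("added", None, new[key])
        elif old[key] != new[key]:
            changes[key] = ("changed", old[key], new[key])
    return changes


# Two parsed versions of the same document (made-up entities).
v1 = {
    "para-1": "Quarterly report",
    "para-2": "Revenue grew 4%.",
    "table-1": [["Q1", 10], ["Q2", 12]],
}
v2 = {
    "para-1": "Quarterly report",
    "para-2": "Revenue grew 5%.",
    "table-1": [["Q1", 10], ["Q2", 12]],
    "para-3": "Outlook is stable.",
}

changes = diff_entities(v1, v2)
for entity_id, (kind, before, after) in changes.items():
    print(entity_id, kind, before, after)
```

The untouched paragraph and table produce no change records at all, which is where the savings over blob-level tracking would come from, assuming the parser can assign stable IDs.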

The hard part is making this fast enough. But I am working on this with lix [0].

[0] https://github.com/opral/lix

gregschoeninger an hour ago

We're working on this project to help with the non-text file and large file problem: https://github.com/Oxen-AI/Oxen

Started with the machine learning use case for datasets and model weights but seeing a lot of traction in gaming as well.

Always open for feedback and ideas to improve if you want to take it for a spin!

jayd16 an hour ago

Totally agree. After trying to flesh out Unreal's git plugin, it really shows how far from ideal git really is.

Partial checkouts are awkward at best, LFS locks are somehow still buggy, and the CLI doesn't support batched updates. Checking the status of a remote branch vs your local (to prevent conflicts) is at best naive polling.

Better rebase would be a nice-to-have, but there's still so much left to improve for trunk-based dev.

zahlman an hour ago

What strategies would you like to use to diff the binaries? Or else how are you going to avoid bloat?

Is it actually okay to try to merge changes to binaries? If two people modify, say, different regions of an image file (even in PNG or another lossless compression format), the sum of the visual changes isn't necessarily equal to the sum of the byte-level changes.

rectang 2 hours ago

Has there ever been a consideration for the git file format to allow storage of binary blobs uncompressed?

When I was screwing around with the Git file format, tricks I would use to save space like hard-linking or memory-mapping couldn't work, because data is always stored compressed after a header.
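For reference, a small sketch of what a loose blob object looks like, as far as I understand the format for SHA-1 repositories (packfiles are laid out differently, and the function name here is made up). The header and the content are hashed together and then zlib-compressed as one stream, which is why the on-disk bytes can't simply be hard-linked or memory-mapped as the original file.

```python
import hashlib
import zlib


def loose_blob(content: bytes) -> tuple[str, bytes]:
    """Build the object ID and on-disk bytes of a loose Git blob (SHA-1 repos)."""
    # Header is "blob <size>\0", followed directly by the raw content.
    store = b"blob " + str(len(content)).encode("ascii") + b"\x00" + content
    object_id = hashlib.sha1(store).hexdigest()
    # The whole header-plus-content buffer is deflated before being written.
    return object_id, zlib.compress(store)


oid, on_disk = loose_blob(b"hello world\n")
print(oid)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

# Getting the original bytes back always requires inflating the stream first.
header, _, body = zlib.decompress(on_disk).partition(b"\x00")
print(header, body)
```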

A general copy-on-write approach to save checkout space is presumably impossible, but I wonder what other people who have traveled down similar paths have concluded.

miloignis 2 hours ago

I really think something like Xet is a better idea to augment Git than LFS, though it seems to pretty much only be used by HuggingFace for ML model storage, and I think their git plugin was deprecated? Too bad if it ends up only serving the HuggingFace niche.

socalgal2 an hour ago

> [CRDT] This means merges don’t need to find a common ancestor or traverse the DAG. Two states go in, one state comes out, and it’s always correct.

Funny, there was just a post a couple of days ago about how this is false.

https://news.ycombinator.com/item?id=47359712

Aperocky 21 minutes ago

Outside of the merit of the idea itself, I thought I was going to look at a repository at least as complete as Linus's when he released git after 3 weeks, especially with the tooling we have today.

Slightly disappointed to see that it is a 470 line python file being touted as "future of version control". Plenty of things are good enough in 470 lines of python, even a merge conflict resolver on top of git - but it looks like it didn't want anything to do with git.

Prototyping is almost free these days, so not sure why we only have the barest of POC here.

ithkuil 14 minutes ago

It clearly says in the article that this is just a demo

steveharing1 42 minutes ago

Git is my first priority until or unless I see anything more robust than this one.

catlifeonmars 27 minutes ago

Can we stop using line-oriented diffs in favor of AST-oriented diffs?

Is it just lack of tooling, or is there something fundamentally better about line-oriented diffs that I’m missing? For the purpose of this question I’m considering line-oriented as a special case of AST-oriented where the AST is a list of lines (anticipating the response of how not all changes are syntactically meaningful or correct).
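A small illustration of the difference, using only Python's standard `ast` and `difflib` modules; the two snippets are made up. A pure reformatting produces a noisy line diff but an identical AST, while a real change shows up in both.

```python
import ast
import difflib

old = "def f(a, b):\n    return a + b\n"
# Same code, reflowed across more lines.
reformatted = "def f(a,\n      b):\n    return (a +\n            b)\n"
# An actual behaviour change.
changed = "def f(a, b):\n    return a - b\n"


def line_diff(before: str, after: str) -> list[str]:
    """Line-oriented unified diff, as a list of lines."""
    return list(difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm=""))


def same_ast(before: str, after: str) -> bool:
    """Compare parsed trees; ast.dump omits line/column info by default."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


print(len(line_diff(old, reformatted)) > 0, same_ast(old, reformatted))
print(len(line_diff(old, changed)) > 0, same_ast(old, changed))
```

This only answers "did anything change"; a usable AST diff would also need to report which nodes changed, and it cannot handle files that don't parse, which is one place line-oriented diffs still win.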

jauntywundrkind 2 hours ago

In case the name doesn't jump out at you, this is Bram Cohen, inventor of BitTorrent. And Chia proof-of-storage (probably better descriptions available) cryptocurrency. https://en.wikipedia.org/wiki/Bram_Cohen

It's not the same as capturing it, but I would also note that there are a wide wide variety of ways to get 3-way merges / 3 way diffs from git too. One semi-recent submission (2022 discussing a 2017) discussed diff3 and has some excellent comments (https://news.ycombinator.com/item?id=31075608), including a fantastic incredibly wide ranging round up of merge tools (https://www.eseth.org/2020/mergetools.html).

However/alas git 2.35's (2022) fabulous zdiff3 doesn't seem to have had any big discussions. Other links welcome but perhaps https://neg4n.dev/blog/understanding-zealous-diff3-style-git...? It works excellently for me; enthusiastically recommended!

skybrian 2 hours ago

It sounds interesting but the main selling point doesn’t really resonate:

If you haven’t resolved conflicts then it probably doesn’t compile and of course tests won’t pass, so I don’t see any point in publishing that change? Maybe the commit is useful as a temporary state locally, but that seems of limited use?

Nowadays I’d ask a coding agent to figure out how to rebase a local branch to the latest published version before sending a pull request.

alunchbox 2 hours ago

Jujutsu honestly is the future IMO. It already does what you have outlined, just solved in a different way: for instance, it'll let you merge but flag that you have conflicts that still need to be resolved.

It's been amazing watching it grow over the last few years.

aduwah 30 minutes ago

The only reason I have not defaulted to jj already is the inability to be messy with it. Easy to make mistakes without "git add"

MattCruikshank an hour ago

For anyone who thinks diff / merge should be better - try Beyond Compare from Scooter Software.

codemog an hour ago

Nobody should have these types of problems in the age of AI agents. This kind of clean up and grunt work is perfect for AI agents. We don’t need new abstractions.

twsted an hour ago

Version control systems are more important than ever with AI.

monster_truck an hour ago

Not this again

newsoftheday an hour ago

OK, I'll stick with git.
