
> I believe there are plenty bridges between CVS and git already implemented, which also allows you to checkout only part of the CVS tree.

How many of them have you used? I've used a couple, to interact with large code bases on the rough order of 300GB. In my experience they don't work very well, because you have to be hygienic about the commands you run or your Git state gets out of sync with the state in the other source control system. So I gave up on those, and I use something similar to Microsoft's solution at work on a daily basis. It's a real pleasure by comparison, and I say that as a Git fan (about 10 years of heavy Git use now). At work the code base is monolithic and everyone commits directly to trunk (at a ridiculous rate, too).

I've heard horror stories about back when people had to do partial checkouts of the source code, and I'm glad that the tooling I use is better.

The idea of breaking up a repository merely because it is too large reminds me of the story of the drunkard looking for his keys under the streetlight. The right tool for the right job: sometimes you change the job to match the tools, and sometimes you change the tools to match the job.



A bit of a topic hijack, but I'm always curious how "many people committing very often on the same branch" works in practice. I'd expect livelock-like scenarios.

Do you need to do anything special, or is this just a non-issue? Do you push straight to master, or do you use some sort of pull-request GUI (like GitHub or Phabricator)?


Not sure what livelock you're talking about. If you have a bunch of people pushing to Git master, sure, you'll get conflicts and have to rebase (if we're talking about trunk-based development). But a conflict is always caused by someone else successfully pushing to master, so some progress is always being made.
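For concreteness, the retry loop when your push races with someone else's looks roughly like this (a sketch, assuming an existing clone with a remote named origin and a master branch):

```shell
# Commit locally, then try to publish.
git commit -m "my change"
git push origin master        # rejected (non-fast-forward) if someone pushed first

# Replay your local commits on top of the new upstream tip, then retry.
git pull --rebase origin master
git push origin master        # succeeds unless another push won the race again
```

Each rejected attempt means someone else's push landed, which is why this is contention rather than livelock.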

But I was always just using Git as an interface to something else, usually Perforce or something similar. When pushing with these tools, you'd only get conflicts if other people changed the same files. Git was just used to create a bunch of intermediate commits and failed experiments on your workstation, which is something that it really excels at.
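That local-scratchpad workflow might look like the following sketch (the commit messages and the count in HEAD~2 are made up for illustration):

```shell
# Pile up intermediate commits and failed experiments locally.
git commit -am "wip: try approach A"
git commit -am "wip: A didn't pan out, try B"

# Before submitting upstream, collapse them into a single clean commit.
git reset --soft HEAD~2
git commit -m "implement B"
```

The soft reset keeps all the accumulated changes staged, so the final commit is the sum of the experiments without their history.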

The only real problem is when the file you're changing is modified by many people on different teams, which often means that it's used for operations, and when that becomes a bottleneck it'll get refactored into multiple files or the data will be moved out of source control.


I used one for TFS when I worked at Microsoft, and git-p4 when I worked at Splunk. Certainly enjoying that we are 100% Git now.

My point was that with GVFS they are not really solving the problem they had: git status still takes 4-5 seconds, and to me that is a lot.


So you're saying that GVFS isn't a good idea because it's not good enough for working on the Windows repository?

Well, yeah. It's pre-production, and let Microsoft worry about their own problems anyway. But it sounds like GVFS will be killer for people who have large repos that aren't as large as the Windows repo. Even if 4-5s for the 270GB/3.5M file repo is too long, 400-500ms for the 27GB repo is fantastic.

At some point you ask yourself, "Would I split this repo if my tools could handle the combined repo just fine?" If the answer is no, then you're going to be happy that the tools are getting better at handling big repos. Microsoft's choice to exploit the filesystem-as-API and funnel all filesystem interaction through the VCS is a smart one, and there are a ton of opportunities for optimization that don't exist when you're just writing to a plain filesystem.



