
Why Perforce is more scalable than Git
Posted on: 2009-02-23 21:31:22

Okay, say you work at a company that uses Perforce (on Windows). So you're happily tapping away using Perforce for years and years. Perforce is pretty fast -- I mean, it has this "nocompress" option that you can tweak and turn on and off depending on where you are, and it generally lets you get your work done. If you change your client spec, it synchronizes only the files it needs to. Wow, that blows the mind! Perforce is great, why would you ever need anything else? And it's way better than CVS.

Suddenly you have to clone something with git, and BAM! The world is changed. You feel it in the water. You feel it in the earth. You smell it in the air. Once you've experienced git, there is no going back, man. Git is the stuff man. You might have checked out firefox -- but have you checked out firefox ooon GIT?

So many really obvious things are missing in p4. Want to restore your source tree to a pristine state? "git clean -fd". Want to store your changes temporarily to work on something else? "git stash". Share some code with a cube-mate without checking in? "git push". Want to automatically detect out-of-bounds array accesses and add missing semicolons to all your code? "git umm-nice-try"
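If you've never seen them, here's roughly what the stash and clean workflow looks like in a throwaway repo (file names are made up for the demo):

```shell
# scratch repo to play in
tmp=$(mktemp -d) && cd "$tmp" && git init -q demo && cd demo
git config user.email you@example.com && git config user.name you
echo "v1" > app.c && git add app.c && git commit -qm "initial"

# stash: park a half-done change to work on something else
echo "v2" > app.c
git stash              # working tree is back at the committed "v1"
git stash pop -q       # and now the "v2" edit is back

# clean: restore the tree to a pristine state
touch scratch.o
git clean -fd          # deletes scratch.o and any untracked dirs
```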

Branching on git is like opening a new tab in a browser. It's a piece of cake. You can branch for EVERY SINGLE BUGFIX. And you wrote the code, so you get to merge it back in, because you are the expert.
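In concrete terms, the branch-per-bugfix cycle is just this (branch and file names invented):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main fixdemo && cd fixdemo
git config user.email you@example.com && git config user.name you
echo "base" > main.c && git add . && git commit -qm "base"

# open a new "tab"
git checkout -qb bugfix/null-deref
echo "fixed" > main.c && git commit -qam "fix the null deref"

# you wrote it, so you merge it
git checkout -q main
git merge -q bugfix/null-deref
git branch -q -d bugfix/null-deref   # and the tab closes
```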

Branching on Perforce is kind of like performing open heart surgery. It should only be done by professionals: experts in the art who really know what they are doing. You have to create a "branch spec" file using a special syntax. If you screw up, the entire company will know and forever deride you as the idiot who deleted "//depot/main". The merging is done by gatekeepers. Hope they know what they're doing!

Now, if you have been using git for a few days you might discover this tool called "git-p4". "AHA!" you might say, "I can import from my company's p4 server into git and work from that, and then submit the changes back when I am done," you might say. But you would be wrong, for a number of reasons.

git-p4 can't handle large repositories

Really. It's just a big python script, and it works by downloading the entire p4 repository into a python object, then writing it into git. If your repo is more than a couple of gigs, you'll be out of memory faster than you can skim reddit.

But that problem's fixable. I was able to hack up git-p4 to do things a file at a time in about an hour. The real problem is:

Git can't handle large repositories

Okay this is subjective because it depends on your definition of large. When I say large, I mean about 6 gigs or so. Because your company's source tree is probably that large. If you have the power, you will use it. Maybe you check in binaries of all your build tools, or maybe for some reason you need to check in the object files of the nightly builds, or something silly like that. P4 can handle this because it runs on a cluster of servers somewhere in the bowels of your company's IT department, administered by an army of drones tending to its every need. It has been developed since 1995 to handle the strain. Google also uses Perforce, and when it started to show its strain, Larry Page personally went to Perforce's headquarters and threatened to direct large amounts of web traffic up their executives' whazzoos until they did something about it.

Git has none of that. The typical git user considers the Linux kernel to be a "large project". If you've watched Linus's git rant at Google, listen for how he sidesteps the question of scalability.

Don't believe me? Fine. Go ahead and wait a minute after every git command while it scans your entire repo. It's maddening because it's long enough to be annoying, but not enough time to skim Geekologie.

The solution

You know what? I don't think many people really use distributed source control. The centralized model is here to stay. Most git users (especially those using Github) use the centralized model anyway.

Ask yourself this: Is it really that important to duplicate the entire history on every single PC? Do you really need to peruse changelist 1 of KDE from an airplane? In most cases, NO. What you really want is the other stuff: easy branching, clean, and stash, and the ability to transfer changes to another client. The distributed stuff isn't really asked for, or needed. It just makes it hard to learn.

Just give me a version control system that lets me do these things and I'll be happy:

  • Let me merge changes into my coworker's repos, without having to check them in first.
  • Let me "stash" stuff cause it's really handy. Clean is nice to have too.
  • Make branching easy.
  • Don't waste 40% of my disk space with a .git folder, when this could be stored on a central server.

Is that really so hard?


Holger Schurig

2009-03-14 19:14:09
Try "git clone --depth 0". This way you clone a remote git repository, but you do *NOT* download the history since commit 1.
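(On current git the flag for this is `--depth 1`.) A quick local demonstration that only the tip commit comes over:

```shell
# build a source repo with two commits
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main src && cd src
git config user.email you@example.com && git config user.name you
echo one > f && git add f && git commit -qm "commit 1"
echo two > f && git commit -qam "commit 2"
cd "$tmp"

# file:// forces the transport code path, so --depth works locally too
git clone -q --depth 1 "file://$tmp/src" shallow
git -C shallow rev-list --count HEAD    # -> 1 (history stops at the tip)
```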


2009-03-14 20:04:08
Git was designed to be a version control tool - not a quasi-file-server 'repository', which is how most other tools like Subversion and Perforce are actually used. It was also not designed to track a whole set of unrelated projects - say, a team's entire code-base - something that both Linus and then Randall made pretty clear.

The solution? Track each project as a single Git repository, and if you need to tie them together, create a master repository that includes each one as a sub-module. The flexibility you gain from 'setting free' your individual projects is enormous, as is the smart use of a master repository that uses branches to create different mash-ups of your overall code-base.
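A minimal sketch of that master-repository idea (the project names are invented; note that recent git versions require `protocol.file.allow` to be overridden before adding local-path submodules, which this demo does):

```shell
tmp=$(mktemp -d) && cd "$tmp"

# a standalone project repo
git init -q -b main libfoo
git -C libfoo -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "libfoo v1"

# the master repo that ties projects together as sub-modules
git init -q -b main master-repo && cd master-repo
git config user.email you@example.com && git config user.name you
git -c protocol.file.allow=always submodule --quiet add "$tmp/libfoo" libfoo
git commit -qm "tie libfoo in as a sub-module"
```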

Stephen Waits

2009-03-14 20:04:40
I've done some testing on big p4 repositories. Specifically, 36GB. Git was awful. p4 continues to haul ass. A p4 sync takes less than one second, if no files have changed on the server. Just doing a git status was on the order of minutes.

Git, plain and simple, does not scale to large repositories. That's OK, I guess, it's not really designed to handle that use case.

Stephen Waits

2009-03-14 20:07:27
@zzz, FWIW, we store all of our code and data in p4 because it's the Right thing to do. We make video games; PS3 + BluRay == massive content. At any given time, our data works with our code. If I need to sync back a month to look at some issue, I need the specific data to be sync'ed back too.

Really, for us, p4 works great. It stays out of our way, it's faster than anything out there. It's not distributed, but we don't care about that.


2009-03-14 20:29:44
If your repo is 6GB+, I think you're doing it wrong...


2009-03-14 20:35:28
"we store all of our code and data in p4 because it's the Right thing to do"

Well you may have identified another use case where Git is not ideal - really large binary blobs. I think the problem is Git has to checksum (sorry, SHA1) all files it scans - and that would take some time on 36GB of files.

To be fair, Git has always been advertised as a SCM - i.e. a source-code management system - and for that use-case it absolutely rocks IMO. Personally I would still investigate a hybrid approach where you have the option of pulling just the source down to your lappy with Git, so if you are on the plane and you DO want to look at change-set 1 at least you can!

Jason P

2009-03-14 20:36:47
I'm curious as to how other people who work with huge repos in P4 send around diff packs for review. diff/patch? Something internal?


2009-03-14 20:36:51
Let me get this straight: you're saying perforce is faster than git for large projects? This surprises me because most git operations are completely off-line since all the data is local. I thought that operations which require network I/O are the slower ones. Care to back up your claim with a specific use case and some data? (It's an honest question btw, I don't use git or perforce so I'm not defending git here.)


2009-03-14 20:45:45
@ Jason P, an internal wrapper script for p4 that sends changelists to reviewers for reviews and approvals.

To merge changes into a coworker's repo, why can't they just patch a CL? You don't have to submit a CL for a coworker to grab the changes.


2009-03-14 20:50:40
"Let me merge changes into my coworker's repos, without having to check them in first."

Would that be your coworkers distributed repository, by any chance?


2009-03-14 21:07:05
Wrt chewing up disk space in .git: check out git alternates. 'git clone -s' does it for you automatically. I ran into this exact problem and the alternates stuff works like magic!
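For anyone who hasn't seen alternates: `git clone -s` makes the new clone borrow the original's object database instead of copying it, via a one-line pointer file (local paths only; repo names here are invented):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main big
git -C big -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "base"

# shared clone: no duplicated object database
git clone -q -s big lean
cat lean/.git/objects/info/alternates   # the pointer back at big/.git/objects
```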

Agree with your points about scalability. Git is not good for anything other than source code (medium # of small text files).


2009-03-14 21:08:17
Perforce may scale well with regards to data size. In my experience it doesn't scale well over a distributed network. Between having to check files out to work on them and the tight integration with Visual Studio, if your link to the Perforce server goes down you practically have to stop work.

My one experience of Perforce was doing work with another company remotely. Our VPN was unfortunately a bit dodgy. Combining that with Perforce led to an incredibly frustrating experience.

I wouldn't recommend using it if you're not on a LAN.


2009-03-14 22:21:27
The one feature of a DVCS that I really really really like is the ability to use it as a sneakernet. Not all of the machines I develop on are connected to a network, or connected to the same network that the central/blessed repository is on.

Bypassing the central repository to share patches... meh. This I do not see as a feature -- if there's a central repository, it should be used as the mechanism of communication between developers.

On the other hand, "stashing" stuff is really nice. And branching (and merging) *should* be easy. I'm all over those two requests.

As for wasting my disk space... meh. Sometimes I care, sometimes I don't (disk is cheap, but disk fills up faster still). Having an option for git to use either a local or a remote (central/blessed) repository would be nice.

Disclaimer: I still use CVS, I've used Perforce (and liked it), and I use git (and like it), and I don't currently have any repositories that approach the sizes discussed in the article.


Daniel Stockman

2009-03-14 23:15:07
As a CLI-proficient user of both git and p4 (also having hacked git-p4 to restore some sanity), I can state with full confidence that git beats p4's CLI like a redheaded stepchild eight days a week and thrice on Sundays. Any perceived benefits or "power" that p4 gains from being adept with binary blobs of redonkulous girth is irrelevant when the command line tool is worse than friggin' CVS and all of the GUIs suck.

(All SCM GUIs suck, imo, but that's my CLI-bias bleeding through)

Jason Dusek

2009-03-14 23:40:49
"The distributed stuff isn't really asked for, or needed. It just makes it hard to learn."

But not much later:

"Let me merge changes into my coworker's repos, without having to check them in first."

That would be distributed stuff.

Ted M

2009-03-15 00:44:26
This is pretty much our experience, too. P4 rules the kingdom in games development, probably because we need to version HUGE amounts of artwork (which is almost always binary) as well as code.

A lot of teams try out Alien Brain, and quickly realize that

1. It's structured like Visual Source Safe or CVS (in other words you aren't REALLY versioning changes, just files, and that's really bad), and

2. Versioning artwork against code is just as important as versioning one code change against another or one artwork change against another, and having your artwork and your code in different version control systems, even when they're both structured around atomic changes (which Alien Brain isn't) causes problems.

So most teams just dump artwork, intermediate data files, and all sorts of things in the same p4 depot that their code is in. And it works like a champ. Except that p4 is missing so many of the cool features that git gives you.

Ted M

2009-03-15 00:46:41
"If your repo is 6GB+, I think you're doing it wrong..."

Then you've never been responsible for builds in the games biz...

Hatem Nassrat

2009-03-15 00:52:26
Having to work over any network will be really slow. Having the ability to work on an airplane is super cool.

As far as flexibility goes, Git is Awesome, so stop posting things that don't make sense.

Woody Gilk

2009-03-15 04:01:54
So basically, you wrote a post bashing git for not doing something it was never designed to do? Good job, buddy.


2009-03-15 10:57:52
You can't simply measure performance against the size of the repository and call that "scalable".

The number of simultaneous clients that can be doing operations is just as important, if not more so. P4 was notorious for holding locks far longer than necessary, and clients would queue up for minutes at a time (I remember syncs that would take more than half an hour on a fairly small repository because there were a hundred other clients trying to sync).

P4 does *not*, in fact, scale well (although I admit that more recent versions of P4 are better than what I was using in 2004).

I feel pretty confident your assessment of git would be different if you had 1,000 coworkers using your P4 repository at the same time.

Good post

2009-03-15 13:29:46
I just wanted to let everyone know that this post is dead-on. I work at a software company that is entirely based on P4. The repositories are huge because they contain a lot of non-source files, like Photoshop, videos and such. Trying to push to git has been painful because it is massively slow on any large repository. The insert alone can take several hours.

Any web company with non-source code in their repo will run into the same thing. I'm surprised more people haven't pointed out this glaring problem with the git model.


2009-03-15 13:47:55
Hi all, interesting discussion. I am curious though:

it seems as if there is a specific problem with Git, namely it doesn't handle large binary files well (large images, artwork, etc).

Has anyone actually taken this specific use-case to the Git developers on the mailing list?

Second, it seems like your problem could be solved by having a separate machine to run Git just for your Binary assets. When you need to make a build, you just dump all those files to the machine, have it version the directory, and then include that 'version' into your Git source repo.

Interesting post.

Good Post

2009-03-15 14:43:12
Hey Anon, yes, this has been mentioned with respect to DSCMS. Those who are working on commercial versions have been working on potential solutions.

In the meantime, it's still just easier to dump stuff into P4. Beefy 64-bit P4 servers are cheap to build now.


2009-03-16 07:35:17
@Jason P: Review Board works great with Perforce.


2009-03-16 07:59:23
actually.... p4 doesn't run on a cluster of servers. That's one of its biggest shortcomings. It runs on ONE server, one really big, beefy freaking server if you have lots of stuff and users. Google, for example, was having serious problems with the speed of, well, everything p4, until they went out and bought one of the most powerful computers they could. Then all was well again.

So yeah, it's scalable, but it's directly proportional to the size of the server it's on.

Good Post

2009-03-16 13:06:57
@masukomi... you can set up P4 proxies to help alleviate the pain if you have a lot of data to transfer. But you are right, it has to keep a database on a single server, the size of which is dependent on the number of clientspecs/branches.

John Fries

2009-03-16 19:32:18
interesting writeup. Would be very interested in hearing a three-way comparison of git, subversion and perforce, since those seem to be the three that most people have to decide between.




2009-03-17 23:01:18
There was quite a bit of research done a while ago investigating the size of the average dev team. The number was <10. Kind of surprising, but true. There are relatively few places in the world where enormous, cross-referenced project repositories are needed: Microsoft, Google, Siemens, Philips, government agencies, etc.

However, for 99% of the software developers out there, git (or one of its DVCS brethren) just works. In those cases, the benefits of being entirely mobile, having near-zero time cost for most actions, and the ability to easily experiment with the contents of the repository are game-changing wins. For the top 1%, there are tools like Clearcase and Perforce.


2009-03-20 13:20:43
Go for Plastic SCM, the best of GIT and Perforce combined, and a decent GUI. Replication, merging, really fast true branches (not like GIT or P4)...


2009-03-31 09:40:50

If you have that large a repo, it's probably because you're stuffing large binary blobs into git. If you're stuffing large binary blobs into git, you need to look into the .gitattributes file so that git won't try to diff/compress said large binary files. It's got some heuristics to try and recognize them, but making its work a bit easier is sure to show you some gain.
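For reference, the relevant .gitattributes entries look something like this (the patterns are examples, not a recommendation):

```
# treat these as opaque binary: no diffing, no merging, no text conversion
*.psd  binary
*.mov  binary

# don't waste cycles hunting for deltas in already-compressed data
*.zip  -delta
*.png  -delta
```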


2009-04-04 02:03:38
"Let me merge changes into my coworker's repos, without having to check them in first."

Why? What's the big deal with checking in? Use a personal branch, and have your bunker-mate use one too. Check in your WIP on a regular basis, just in case your drive goes kablooie.

"Let me "stash" stuff cause it's really handy. Clean is nice to have too."

I must be missing something. Wouldn't a personal branch work just fine for this?

"Make branching easy."

Branching in Perforce is difficult for users who don't understand the nuances of client workspace mapping. When you understand how the repository is structured, and how your local hard drive is laid out, it becomes so much easier. If you don't know the structure of the repository, which contains the family jewels, please turn in your coder's badge. If you don't know how your own disc is structured, please turn in your computer.

"Don't waste 40% of my disk space with a .git folder, when this could be stored on a central server."

Good idea. I'm curious -- let's say we had a multi-TB repository, with 80k files on just one tip, tens of thousands of branches, 1,600 coders, 11 locations, 8 time zones. If we were using Git, and I wanted to work disconnected from the network for a couple of days, what would be "gotten" onto my laptop?


2009-04-04 02:21:14
Unenlightened personal thoughts on storing binary info in Perforce, or any other version control tool....

IMHO, a version/revision control tool, with all its diff, 3-way-merging, and compressed delta storage goodies, is at its best when it's storing editable source. Storing binary data, especially binary data that can be recreated from the version-controlled source, is not the ideal use for this kind of system. That said, I've done it too, because I also believe that every version of the source should include the tools used to process the source into the product shipped to the customer. But I would like to consider the use of a different paradigm for the archiving of binary data, especially mongo BLOBs. I would like to consider a system more ideally suited to storing Big Honkin' binary files, and have a reference to those BLOBs in the version control system. Now I wonder what would work.....

Sam Vilain

2009-04-08 08:02:05
There's a bit of confusion in this piece. Firstly, what systems like Perforce do is collect many projects in one place and give you a timeline for them. So when looking at "repository size", consider that you don't normally keep every project in the same repository with git.

Of course with the Perl Perforce repository, the size was something like 450MB in Perforce and 70MB in Git, once the crazy metadata format used by Perforce's insane integration system was appropriately grokked.

I mean, don't get me wrong, I think Perforce is a great product - beats SVN hands-down in design and was around many years before - it's just too complex. Integration is badly modelled, hardly anyone understands it properly. So in that respect, Perforce doesn't scale to very large teams because the branching model is too hard to work with.

Yes of course Git doesn't do a lot of that product release cycle development / Software Configuration Management. It's unix: it does one thing and does it well.

Scott Bilas

2009-04-24 16:14:25
Thanks for this post. I've been hearing such great stuff about git, and like you, the commands it offers seem absolutely killer. But I was concerned it would suffer from the same problems as all the other open-source SCMs: it dies horribly with large files.

I'm in the games biz myself and we ran into these problems with svn. Once we got past a certain size team and asset base, it started to really choke. I wrote up a little postmortem at scottbilas.com about our experience with it (search for 'svn').

We tried really hard to make svn work because of the astronomical price of P4. A price that we all grudgingly pay again and again in this industry because everything else is so much worse.

My current plan is to clone the commands from git into our command line p4 extension tool we have (it does things like auto-creating Crucible code reviews and such). For example, 'stash' should be pretty easy to implement. Actually, it already exists. Search the p4 public depot for 'p4tar'. I haven't tried it out yet.

Anyway the other commands should be implementable with a tool on top of p4 using p4api.net. If I only had some spare time.. :)


2009-05-01 21:36:32
binary files are a known problem and are receiving some attention in git. There are some ideas in the cooking pot that may make a big difference. On the mailing list it was asked if anyone has a repository that could be experimented upon.

Daniel Barkalow

2009-05-02 00:02:15
It's been a long time since I used git-p4, because, well, it couldn't handle the depot. But I've written a more efficient and more targeted importer (as well as a plugin mechanism for the core git); if you're building git yourself, you can pull "git://iabervon.org/git.git p4-clean" and try it. I use it at work with our large perforce depot and it does a good job for all of the parts of the depot I happen to work on. Exporting is left as an exercise for the reader (and if you do it, let me know), but it's great for figuring out what actually happened in the recent history and for previewing tricky merges so that you can check whether you're doing them right in p4 afterwards.

You'll want to get some p4api and set P4API_BASE to the directory where you untar it; this lets the plugin use the C++ bindings for perforce instead of running the command-line client.

Look at Documentation/vcs-git-p4.txt for how to configure it; you generally end up actually getting data simply with "git fetch origin" (or "git fetch" if you apply the bugfix I forgot to send back from work).


2009-05-27 16:39:13
I agree that Perforce is probably better for most people. Having used it at a previous job, I wish I could go back to it. However, we're using git here because the Perforce prices have gotten sky-high! $900/user just to get in the door is ridiculous. If I wanted to pay that kind of money, I'd get a real tool like ClearCase...


2009-06-08 19:53:02
The whole point of git is that you work on only what you need to and leave the rest to the others. When you're working on what you need to, yes it is amazing to have all the history since day 1, especially since that day 1 code could have been written by somebody else thinking something else.

Why would you ever expect git to work well in a centralized usage scenario? Would you expect p4 to work well in a distributed use case? Honestly, dude...Apples and Oranges.

And what happens when that central server is inaccessible? or when you're travelling to a trade show with a demo and you have a really cool idea on the plane you'd like to try out? P4 can be a real pain in the proverbial wazoo in those circumstances.

D Herring

2009-06-15 22:30:53
I wish our data sets were small enough to check in to P4. Or does it handle a few TB of uncorrelated sensor data with ease? There's always an upper bound. P4 seems to hit the sweet spot for game design; but for raw code, I'll stick with git.

BTW, rather than store the data in the repo, we've started storing the git hashes with the data. Works nicely.
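That scheme is easy to reproduce by hand: `git hash-object` computes the same blob ID git itself would store for the file (the data file name here is invented):

```shell
tmp=$(mktemp -d) && git init -q "$tmp"
printf 'hello\n' > "$tmp/sensor.dat"
git -C "$tmp" hash-object sensor.dat
# -> ce013625030ba8dba906f756967f9e9ca394464a (SHA-1 of "blob 6\0hello\n")
```

Because the ID is a pure content hash, two people who record the same bytes get the same reference, no repo required for the data itself.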


2009-06-15 22:50:26
Cheap branching is still broken and doesn't work properly under CVS, Subversion, or Perforce.

It works well under the DVCS tools such as Git and Mercurial (though a lack of branch naming is sometimes an issue depending upon the tool) - it works absolutely blindingly under Clearcase - unfortunately for Clearcase, it is expensive IBM software, and the hardware constraints on that tool (particularly for dynamic views) make it a compromise also.

Perforce, CVS, Subversion are cut from the same cloth however - they are lightyears behind the branching capabilities of DVCS's and also Clearcase which has had fantastic branching semantics available since the mid-90s.

:emaN laeR

2009-06-16 01:17:33
By the way, you can put your .git folder wherever you want by using git --bare init and pushing to that.

Or you could go my way and have your .git folder actually be a symlink to a folder on another machine over ssh.

Please. When you argue about this stuff, please research thoroughly. There are a lot of things you can do with git that just take a while to learn.

Git is like really good drugs.


2009-06-16 03:04:40
>"I don't think many people really use distributed source control."


>"I don't need distributed source control, so I know nobody out there will need it, as I don't see why they should. But they WILL need to move 6gb repos, because I do, so that's what normal people needs."

In short: different people, different needs. I'm the happiest SCM user since I switched to git for my <6gb projects, which doesn't mean it has to fit everyone and every possible project, for the same reason I don't use vim to edit jpg files.


2009-06-16 05:00:25
To be honest, I think you should write a new article that refers to these comments here as well. Some even claim that having 6 gigs of source code means you're doing it wrong... what kind of project produces 6 gigs of source code? Not even Java is that verbose... ;-)

Jared Oberhaus

2009-06-17 17:36:22
I recently wrote about git not being able to handle large repositories as well; as you allude to, it's not so much the size in space, but the number of files that is painful:


git is clearly designed for what I would call "small" projects like the Linux kernel. If you want to do another project, you do not add it to an existing git repository; you make another one. This best fits with pushing and pulling a single project. But if you have a large system that is composed of many such smaller projects, you have to use something other than the source control system to synchronize their dependencies.


2009-07-10 03:15:26
>it's not so much the size in space, but the number of files that is painful

That's exactly my case. We tried to migrate a WebMethods repository containing lots of services (corporate scale, all currently used/deployed, and it cannot be split into submodules/subtrees). It contains like 100k files, and doing a simple git status took about 10 minutes of disk IO while it was scanning for changes.

Julian Adams

2009-12-27 16:48:17
Git and Mercurial are often sold on DVCS being the killer feature. To me, branching being a first-class and easy operation is the killer feature, and that doesn't require DVCS. We use Accurev, which has first-class branching and is server-based. A full checkout of our codebase is about 10GB, although most people only checkout 4-6GB of that. The depot history goes back to 2004. With those sizes it works just fine.


2010-04-08 10:43:00
We think Perforce is really cheap.

Zero downtime. No administration needed. What else can one ask for?


2010-08-21 22:57:25
I was heavily modding Fallout 3 with files from fallout3nexus.com. After a while I wanted to be able to switch back and forth to different mods in a way that FOMM (Fallout Mod Manager) was unable to do well.

So I thought, "hey! I'll just take a fresh install and make it a git repo." This worked to some degree but some of the mods had large files. Eventually when I went to switch to a different branch it just died with an out of memory error.

This is because git has to be able to store the whole file in memory to process it. My machine has 6GB of RAM (the one I was using) but on Windows most versions of git are 32-bit.

Bam. Dead in the water. I had to actually boot up Ubuntu on a live disc, apt-get install the 64-bit version of git just to swap branches. Fail; plain and simple.

It sucks to have a designer create a great tool like git only to have him also be too lazy to solve some edge cases for others.

* File sizes > RAM? This should be doable in a slower way only when needed.

* File sizes > 32-bit version capabilities? Again fix it but have it use the slower algorithm only when needed.

* 32-bit only version.... Seriously, most new computers other than netbooks have 64-bit capability these days. Just make it the default.

Being too stuck up to solve this problem, which would obviously increase adoption of your tool, just seems dumb. And those who say a >6GB repo means you're doing something wrong, or who don't have large repos or don't revision large files, obviously haven't run across a business need to do so -- but when your paycheck requires it you'll be singing a different tune.

I used Perforce when I worked at Google and will likely use it again in my next company for which I just got hired. I like it but I know I am going to miss features from a DVCS. I used Bazaar at my last company and it was quite nice but also suffers from the same problem as git and I believe hg.


2010-08-25 19:07:56
It's 2010 and Perforce still lacks an equivalent to 'git status'. Come on Perforce, show me the files I need to add, before I break the build, please!


2010-10-14 16:38:03
I've had the same experience trying to use git as a front end to our company's giant p4 repo. Unlike most of the complaints I've read, we don't store big binary blobs in the repo. One or two here and there, but most of it is source files. 600MB and 27k files worth of source code. Due to really bad design choices stemming from an uncouth history in SourceSafe :) things are pretty strongly interconnected, so it doesn't make much sense to just split them up into several repos. Git on that repo was just really frustratingly slow, even compared to p4 over a VPN. I've also never managed to get git-p4 to work.

I really want the local branches and not needing to check out files, and we haven't updated our server since 2002 (it was that or health insurance -- that bad), but it's just become too big a time sink for me to even investigate anymore.

Sam Liddicott

2011-05-25 12:51:53
I'm gnashing my teeth at Perforce -- it wants to download over a gigabyte of files that are already sitting exactly where it wants to put them, because (unlike git) it doesn't scan the file system and doesn't md5sum large media files. It prefers to download them all over again.

Steve Hanov

2011-05-25 14:35:41
Sam: Perforce can be frustrating because it blindly re-downloads files on a forced sync.

For each file, if the cost of a checksum is less than the cost of downloading the whole file, they should try to do an incremental transfer.
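The idea above can be sketched in a few lines of shell. This is not how Perforce actually works -- it's a simulation of the proposed client-side check, with both "server" and "client" copies faked as local files:

```shell
# Simulate a server copy and an identical client copy of a large asset.
printf 'big media blob' > server_copy.bin
cp server_copy.bin client_copy.bin

# Compare the digest the server would advertise with what's on disk.
server_sum=$(md5sum server_copy.bin | cut -d' ' -f1)
client_sum=$(md5sum client_copy.bin | cut -d' ' -f1)

if [ "$server_sum" = "$client_sum" ]; then
    echo "SKIP: local file already matches, no transfer needed"
else
    echo "FETCH: checksums differ, re-download the file"
fi
```

The checksum costs one local read; re-downloading costs a full network transfer, so for large media files the check pays for itself almost every time.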


2011-06-10 18:01:08
If your code repository for /one/ project is 6gb, you're doing something wrong with how you've structured the code. If you have 6gb of code, you have many projects. Android is built around git.


2011-12-12 09:36:45
We used git for a AAA videogame that was quite successful. The repository grew to 110-120GiB, and of course it got larger and larger as it accumulated cruft on your machine. We mixed it with SVN (for the artists), and there were lots of binary files. With the right mix of SSDs, common sense, and configuration, git worked just fine.
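The commenter doesn't say which settings they used, but as a sketch, these are the kinds of git config knobs people typically reach for when a repository carries lots of big binaries (the values here are illustrative, not recommendations):

```shell
# Throwaway repo to demonstrate the settings.
cd "$(mktemp -d)"
git init -q big-repo
cd big-repo

git config core.bigFileThreshold 50m   # store huge blobs whole; skip delta compression
git config pack.windowMemory 1g        # cap memory used while repacking
git config gc.auto 0                   # disable auto-gc; repack on your own schedule

git config --get core.bigFileThreshold # prints: 50m
```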

On the other hand, I'm using Perforce right now. It turns out that even a simple merge, check-in, or branch is slow. The client continuously polls the server, sometimes crashes if you leave it open too long, and must rely on the network and servers for every little thing you want to do. Yes, shelving relies on the server; the server even keeps track of what I have and what I don't, with the obvious desynchronization issues.


2012-03-01 23:31:23
Wow. So git handles Gnome, KDE, and Android, among others, and you're saying git can't scale. Your argument is based on large media files and a repository set up the way Perforce likes it. That's not a problem with git; the problem is the way you've configured your repository. I don't blame you, since you're coming to DVCS from a centralized mindset. Change the way you structure your repository and you'll find things are actually much better. You might want to look into submodules.

Finally, you're talking about disk space. Mind telling me why I need double the disk space with Perforce just to be able to switch quickly between any two branches? I have actually run out of disk space because of this, and lost valuable productive time trying to free up enough space to check out another branch. Never again.


2012-04-09 20:28:16
Another place where Git comes up short is the inability to lock files. Many programmers seem to see that as a feature of Git. But, for anyone (artists, designers) who works extensively in binary files where changes can't be merged, the ability to lock a file for editing, or to know that someone else has already locked it for editing is the single most important feature of version control.


2012-05-23 03:30:11
I _love_ having the full history available; it's why, whenever I expect to work on an svn project, I check out the specific folders with git-svn. I very often do whatchanged -p to check other people's checkins, perhaps grepping it, and the log too. I haven't tried Perforce, but doing svn log on an sf.net repo is slower than just loading their viewvc web page -- more than enough time to lose track of the task at hand. Git lets me check logs without pulling me out of my flow.

(And for those of us outside the USA, being able to work offline is a must, but I can see how not everyone will care about that.)


2013-03-14 21:20:52
Wouldn't git + repo (Google's script for Android version control) = Perforce?


Basically, repo lets you combine different git repositories.

In the case of Android, each hardware company (e.g. Qualcomm for their radio, Broadcom for their Bluetooth/WiFi) has a separate git repository for each component.

Repo manages all the git repositories automatically (you can still drive git yourself).

(Yes, it still wouldn't solve the problem of having many large binary blobs and calculating md5sums for them.)
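For the curious, a repo checkout is driven by a manifest file that lists every component repository. A minimal sketch of one (the remote, project names, and paths here are placeholders, not Android's real manifest) looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <remote name="origin" fetch="https://example.com/git/" />
  <default remote="origin" revision="master" />
  <!-- each hardware component lives in its own git repository -->
  <project name="platform/radio" path="hardware/radio" />
  <project name="platform/wifi"  path="hardware/wifi" />
</manifest>
```

`repo sync` then clones or updates each `<project>` into its `path`, which is how one logical tree is stitched together from many small git repos.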


2013-05-30 19:31:04
Since people are still commenting:

Try PlasticSCM; it might be close.


2013-11-29 02:38:10
Try git-annex. Problem fixed.

Br. Bill

2014-04-22 23:43:47
Good article.

Here is a correction to this article and a list of updates to Perforce that change some of the things described here (can't blame it for being written a while back; the world changes).

Re: "So many really obvious things are missing in p4." …

::Want to restore your source tree to a pristine state? "git clean -fd".

--> As of Perforce 2014.1, the "p4 clean" command does this.

::Want to store your changes temporarily to work on something else? "git stash".

--> This has been possible with the "p4 shelve" command since P4 2009.2.

::Share some code with a cube-mate without checking in? "git push".

--> There are ways to do this, but creating a branch for every person or code fix isn't a typical way of doing business in P4.

Re: Branching, git vs. P4

::Branching on Perforce is kind of like performing open heart surgery. It should only be done by professionals: experts in the art who really know what they are doing. You have to create a "branch spec" file using a special syntax.

--> This really has never been true. Branch specs are helpful but not required. If you understand the branching strategy for your team/group/company, this isn't difficult at all. Merging, on the other hand, can be ugly if you do it wrong and submit the changes. That's true of any SCM system.

::If you screw up, the entire company will know and forever deride you as the idiot who deleted "//depot/main".

--> You can't really delete a branch by branching. By merging, sure. This is what rollback is for.

