Calculating the "truck factor" for GitHub projects
Calculating the "truck factor" for GitHub projects
Posted Jul 16, 2015 23:50 UTC (Thu) by bronson (subscriber, #4806) [Link]
The linked github repo only has a README, no code. I'd like to run it against some other projects like atom/electron, rubygems/rubygems, and rspec/rspec... Has anyone seen a way to do that?
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 0:20 UTC (Fri) by JoeBuck (subscriber, #2330) [Link]
I think that there may be some problems with the methodology: Linux gets a truck factor of 90, it appears, because the researchers don't distinguish core parts of the kernel from drivers (all nontrivial non-documentation files are equal), so it is as though Linus and all his deputies could die in a plane crash but everything would be fine because 20 authors of obscure drivers remain. Likewise, in many cases there are other people who are thoroughly familiar with some part of an important project, but don't count as an author because a control freak project owner insists on reworking all the checkins himself.
Still, it's something, and we should pay attention to important projects with a "truck factor" of one to make sure that there's a backup plan.
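To make the objection concrete, here is a minimal sketch of an unweighted calculation. It is not the researchers' code: the function name truck_factor, the greedy "remove the busiest author first" rule, the 50% orphaned-files threshold and the sample file names are all assumptions chosen for illustration.

```python
from collections import Counter


def truck_factor(file_authors, threshold=0.5):
    """Greedy estimate (hypothetical): keep removing the author who covers
    the most files until more than `threshold` of all files have no author."""
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    removed = []
    while total:
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned / total > threshold:
            break
        counts = Counter(a for authors in remaining.values() for a in authors)
        if not counts:
            break
        # Ties are broken alphabetically so the result is deterministic.
        top, _ = max(sorted(counts.items()), key=lambda kv: kv[1])
        removed.append(top)
        for authors in remaining.values():
            authors.discard(top)
    return len(removed)


# Made-up project: two core files with one author, five single-author drivers.
example = {
    "core/sched.c": {"linus"},
    "core/mm.c": {"linus"},
    "drivers/a.c": {"d1"},
    "drivers/b.c": {"d2"},
    "drivers/c.c": {"d3"},
    "drivers/d.c": {"d4"},
    "drivers/e.c": {"d5"},
}
print(truck_factor(example))
```

Because every file counts the same, this toy project scores 3 even though losing the single core author already orphans everything under core/.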
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 7:54 UTC (Fri) by Felix (subscriber, #36445) [Link]
However, a lot of the value lies in the number of supported packages (as well as packaging quality for "key" components). So in theory the homebrew "distro" might be fine if 40 or 50 people left, but that would weaken the network effect quite a lot. Also, I assume that there are some packages which are used very often, and you might find that these are only maintained by a "core group" of far fewer people (similar to Fedora, for example).
On the other hand it's a nice automated approach without much influence of subjective judgement so I see some value in the results.
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 1:32 UTC (Fri) by sashal (subscriber, #81842) [Link]
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 3:26 UTC (Fri) by HybridAU (guest, #85157) [Link]
I'm not sure I completely agree with their methodology though. Anecdotally, I work on some software where I'm the only developer, so by their methodology it has a bus factor of 1. But for precisely that reason my employer requires that the project is well documented, not just comments in the code but developer manuals, style guidelines, database schemas, issue tracker, documents explaining past design decisions and a road map for the future. If I was to go out for lunch today and get hit by a little red bus, my employer could simply find someone new and carry on.
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 9:03 UTC (Fri) by Lennie (subscriber, #49641) [Link]
Let's say you have a very large code base. Even with a large amount of great documentation and great code comments, if it takes months or maybe even a year before another developer is productive at fixing bugs and adding new features, then it would still be a big problem for your employer.
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 13:09 UTC (Fri) by nix (subscriber, #2304) [Link]
Calculating the "truck factor" for GitHub projects
Posted Jul 19, 2015 3:26 UTC (Sun) by pr1268 (subscriber, #24648) [Link]
If I was to go out for lunch today and get hit by a little red bus, my employer could simply find someone new and carry on.
Yes, but the researchers were studying github projects (which assumes an open-source and development strategy). From the sounds of it, your employment project is an in-house only, presumably closed-source project.
Not that I disagree with your employer's methodology; he/she (and you) have a good handle on how to manage such a contingency. Which I pray doesn't happen to you.
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 7:00 UTC (Fri) by andreashappe (subscriber, #4810) [Link]
Abandonment factor.
Posted Jul 17, 2015 8:58 UTC (Fri) by fb (guest, #53265) [Link]
Buses are not a common cause for projects fading away. What we see often is:
- "I got married" / "I have kids now"
- "I've switched jobs" (or "I've got a job")
- "I lost interest"
Switching the metaphor to (something like) "abandonment/leave factor" would help make people think more about what the real risk factors to projects are.
A. Is this project funded? If so, to what extent? If **all** Linux kernel devs decided to leave to work on something else, different people would just get hired to work on it.
B. What is the growth/decay rate of contributors?
For instance, for those 46% of projects with a single maintainer, what is the ratio in that group for:
- paid to work on it/hobbyist
- single/married
- kids/no kids
- working/student.
Abandonment vs. Bus/Truck factor
Posted Jul 19, 2015 3:37 UTC (Sun) by pr1268 (subscriber, #24648) [Link]
I would appreciate if we dropped the morbid metaphor [...]
Well, in the defense of our editor and the authors of the paper, that term has been used for over twenty years (according to the Wikipedia page linked in our editor's blurb).
While I agree it's not the most pleasant nor politically-correct term, it is commonly known among software development organizations (or at least it should be).
Your bullet points are a more accurate way of enumerating reasons for developer loss among open-source/open-development projects (thankfully), but, assuming a project does fade away, these developers are at least still alive to perhaps provide documentation and support to someone who may be willing to take over. Projects with a single maintainer don't have that luxury when that single maintainer is hit by a ...
Abandonment vs. Bus/Truck factor
Posted Jul 19, 2015 15:14 UTC (Sun) by jwakely (guest, #60262) [Link]
What is not politically correct about it?! Is it discriminatory against bus drivers? People who suffer from chronic falling under buses syndrome? Is the word bus an outdated stereotype for modern public transport systems? Is it offensive to car drivers who don't like public transport? I think objecting to the term because it's unpleasant would fit Wikipedia's definition of PC: "a pejorative term used to criticize language, actions, or policies seen as being excessively calculated to not offend or disadvantage any particular group of people in society". Objecting to Bus Factor seems excessively concerned with a complete non-issue.
Anyway, as you say, the point is that you can plan for other forms of abandonment and perform a risk analysis (assuming you could find out the marital status, age, family status of the contributors, which is incidentally all information that could be used to discriminate when someone is applying for a job!) but you can't plan for someone getting hit by a bus. It can happen to even the most dedicated contributors that you know would never abandon the project.
Abandonment vs. Bus/Truck factor - political correctness
Posted Jul 19, 2015 16:29 UTC (Sun) by pr1268 (subscriber, #24648) [Link]
Maybe "politically correct" was the wrong term... But, I've learned to describe a phrase as being not-PC whenever it involves grisly or macabre details; i.e. it might be more appropriate to use what the GP suggested.
For example, I've been called out for calling this thing a "finger chopper". (Shame on me!)
I certainly did not mean to imply that "bus factor" (or "truck factor") was discriminatory; but instead (like our editor and the GP said) somewhat morbid.
Abandonment vs. Bus/Truck factor - political correctness
Posted Jul 21, 2015 15:05 UTC (Tue) by jwakely (guest, #60262) [Link]
Haha! When I was at school we called that a guillotine, which brings even gorier imagery to mind :)
Abandonment vs. Bus/Truck factor
Posted Jul 20, 2015 14:25 UTC (Mon) by fb (guest, #53265) [Link]
That said, bringing in such a morbid metaphor can be problematic, especially when arguments get hot. IIRC Guido van Rossum really did not like it when someone, amidst an adversarial discussion, started making comments about Guido suffering a (fatal) bus accident (IIRC it was over making integer division return a FractionalNumber in Python many, many years ago... which Guido decided not to go ahead with).
One can argue that, unlike the "got too busy with something else" case, a deceased developer can't help with any comments etc. But I still /think/ we have more projects where the developer simply abandons a project "never to be heard from again" due to "getting busy with other things" than due to loss of life.
Abandonment vs. Bus/Truck factor
Posted Jul 21, 2015 1:31 UTC (Tue) by dakas (guest, #88146) [Link]
Incidentally, the statistics don't seem to include the projects with a bus factor of 0. I'd guess those to constitute the majority.
Abandonment vs. Bus/Truck factor
Posted Jul 29, 2015 1:40 UTC (Wed) by apollock (subscriber, #14629) [Link]
Well, in the defense of our editor and the authors of the paper, that term has been used for over twenty years (according to the Wikipedia page linked in our editor's blurb).
While I agree it's not the most pleasant nor politically-correct term, it is commonly known among software development organizations (or at least it should be).
In the now defunct SysOps, we had a sysadmin who objected to the term "hit by a bus" because he knew someone who had literally been hit by a bus, so we came up with "Eaten by a GRUE" to remove the microaggression.
We extended it to be a great disaster recovery training exercise with GRUE becoming a backronym for "Google's Real Untimely Education". A sysadmin could declare they were "eaten by a GRUE" for the day and go off and work on something that didn't involve interacting with the service they were normally responsible for (or any of the co-workers who would have to step in and pick up the pieces).
Abandonment factor.
Posted Jul 21, 2015 5:05 UTC (Tue) by NightMonkey (subscriber, #23051) [Link]
Keep this awesome phrase going! :) The art of Rhetoric is a fun art, indeed.
FFmpeg/libav
Posted Jul 17, 2015 9:06 UTC (Fri) by Lennie (subscriber, #49641) [Link]
__
With regard to security issues, Reinhard attributed the difference in fix rates to a difference in how the two projects approach development ("Michael" is Michael Niedermayer, the lead developer of FFmpeg):
Michael seems to have much more capacity and time, and thus is usually faster with pushing patches for such crashers. Libav takes the time to investigate, reproduce and understand those patches. Unfortunately, in the majority of cases, this is not trivial at all, often because of terse (or even wrong) commit messages, or the fact that there are better places to fix a particular issue in the code. "Better" usually means that more than a single instance of the issue is fixed.
__
Ohh... does anyone still think FFmpeg is a better project than libav when we include the 'truck factor' ? ;-)
FFmpeg/libav
Posted Jul 17, 2015 17:45 UTC (Fri) by dlang (guest, #313) [Link]
as was discussed in the article, yes
because even if you remove Michael, the next several contributors down are all doing about as much work as the top contributors of libav
in other words, if Michael was to disappear and nobody else did any more work, the two projects would be in about the same shape as far as patch rate.
FFmpeg/libav
Posted Jul 18, 2015 18:52 UTC (Sat) by flussence (subscriber, #85566) [Link]
Libav is what FFmpeg looks like after it's been hit by a truck.
Calculating the "truck factor" for GitHub projects
Posted Jul 17, 2015 11:16 UTC (Fri) by Samathy (guest, #102370) [Link]
What happens when you have a large project where one person has contributed to only a few core files but there are hundreds of other authors who have contributed less important files? Losing the one core developer would be devastating, but losing a few lesser devs wouldn't be so bad.
Additionally, there may be developers who have actually contributed very little measurable code but in fact are the driving force behind the design of the project.
I think it boils down to the lack of data analysed. You can't figure out how many people you could lose because every developer on a project has a different importance. You can't treat all the contributions as having the same weight.
I'm not sure how you could improve upon the data, but you'd have to factor in some more information to get a usable number.
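One way to factor in more information, sketched here purely as an illustration, is to give each file an importance weight and measure the orphaned weight rather than the orphaned file count. The function name weighted_truck_factor, the weights themselves, the 50% threshold and the sample files are all hypothetical, and choosing sensible weights is the hard part that this sketch does not solve.

```python
def weighted_truck_factor(file_authors, weights, threshold=0.5):
    """Hypothetical variant: files carry an importance weight (default 1.0),
    and we stop once more than `threshold` of the total weight is orphaned."""
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = sum(weights.get(path, 1.0) for path in remaining)
    removed = 0
    while total > 0:
        orphaned = sum(
            weights.get(path, 1.0)
            for path, authors in remaining.items()
            if not authors
        )
        if orphaned / total > threshold:
            break
        # Total weight each remaining author is responsible for.
        load = {}
        for path, authors in remaining.items():
            for name in authors:
                load[name] = load.get(name, 0.0) + weights.get(path, 1.0)
        if not load:
            break
        # Remove the heaviest author; ties are broken alphabetically.
        top = max(sorted(load), key=load.get)
        removed += 1
        for authors in remaining.values():
            authors.discard(top)
    return removed


# Made-up project: one core file and four plugins, each with its own author.
project = {
    "core.c": {"alice"},
    "plugin_a.c": {"bob"},
    "plugin_b.c": {"carol"},
    "plugin_c.c": {"dave"},
    "plugin_d.c": {"erin"},
}
print(weighted_truck_factor(project, {}))                # every file weighs 1.0
print(weighted_truck_factor(project, {"core.c": 10.0}))  # core file dominates
```

With equal weights the toy project scores 3; once core.c is weighted at 10, losing alice alone crosses the threshold and the score drops to 1.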
Calculating the "truck factor" for GitHub projects
Posted Jul 18, 2015 23:04 UTC (Sat) by mrjk (subscriber, #48482) [Link]
I think the thing is since this is automated, you could use it to perhaps raise flags, not for the exact value.
Calculating the "truck factor" for GitHub projects
Posted Jul 20, 2015 3:27 UTC (Mon) by pabs (subscriber, #43278) [Link]
Calculating the "truck factor" for GitHub projects
Posted Jul 20, 2015 20:55 UTC (Mon) by bronson (subscriber, #4806) [Link]
Gotta say, it's a neat idea, but if all it produces is one blog entry (that reads like a blog entry) then I guess it's fairly useless.
Calculating the "truck factor" for GitHub projects
Posted Jul 20, 2015 21:24 UTC (Mon) by bronson (subscriber, #4806) [Link]
Looking forward to it.
Calculating the "truck factor" for GitHub projects
Posted Sep 12, 2015 23:07 UTC (Sat) by pabs (subscriber, #43278) [Link]