An Empirical Analysis of Inferences From Commit, Fork, and Branch Rates of Top GitHub Projects

Ekbal Rashid, Mohan Prakash
Copyright: © 2022 |Pages: 16
DOI: 10.4018/IJOSSP.300751

Abstract

GitHub repositories have been used to understand several important parameters related to the quantitative growth of software projects. This paper studies the rates at which forks, commits, and branches are made in repositories that rank among the top projects by number of contributors. The goal is to identify similarities in the trends across these software projects, since similar trends can lead to interesting inferences about their quantitative growth and even their quality. Do the rates of forking, branching, and committing have anything to do with the success of software projects? The authors take up this question in the current study.

Introduction

In an earlier work, the authors analyzed data on the top ten GitHub projects of 2019 published in ‘The State of the Octoverse’ (https://docs.github.com/en/github) and defined several parameters. These parameters related to the concepts of ‘effort’ and ‘activity’ of software projects. The aim was to understand how different kinds of effort, namely commit-effort, fork-effort, and branch-effort, could play significant roles in understanding the health of the software or, one might even say, its quality. Similarly, activity parameters were defined that could help in understanding the quantitative growth of software projects in general. The activities defined in that work were issue activity and pull request activity.

During that research it became apparent that these ratios were not adequate for understanding the quantitative evolution of a software project. Consider two software projects, A and B, that have the same values for effort and activity but have been running for different spans of time: A for 30 months and B for 60 months. Assume that both have 1000 forks and 1000 contributors, which makes the fork-effort of both projects equal to one. The fact that B has been in operation for twice as long as A raises the question of whether time plays a significant role in understanding the state of the project. Intuitively, the length of time for which a project has been in operation should have a lot to do with understanding its quantitative growth. Whether the time factor has anything to do with the quality of the software also deserves close inspection. Can we say that software projects that are developed faster are qualitatively better just because they were developed faster? That does not seem right, because at first glance it is unclear what speed has to do with quality.

The present paper therefore examines commit, fork, and branch rates. First these quantities are formally defined, and then the collected data is presented. This is followed by the analysis and inference, and then the post-research analysis. Researchers are undoubtedly looking to GitHub repositories to gather data, especially data on open-source software projects, and to draw interesting conclusions from it. They are also trying to find out why software projects succeed and why they fail. Invariably, such attempts will in some way lead to the development of newer engineering techniques in the study of software projects.
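The two-project comparison in this introduction (A and B, each with 1000 forks and 1000 contributors) can be made concrete with a short sketch. Fork-effort is taken as forks divided by contributors, which matches the value of one stated in the text. The `fork_rate` function is an assumption made here for illustration: it treats a rate as a count divided by the number of months in operation, which may differ from the formal definition given later in the paper. The `Project` class and its field names are likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical container for the figures used in the A/B example."""
    name: str
    forks: int
    contributors: int
    months_active: int


def fork_effort(p: Project) -> float:
    # Ratio used in the text: 1000 forks / 1000 contributors = 1.
    return p.forks / p.contributors


def fork_rate(p: Project) -> float:
    # Assumption: a "rate" is the count per month of operation.
    # The paper's formal definition is not reproduced in this excerpt.
    return p.forks / p.months_active


project_a = Project("A", forks=1000, contributors=1000, months_active=30)
project_b = Project("B", forks=1000, contributors=1000, months_active=60)

for p in (project_a, project_b):
    print(f"{p.name}: fork-effort={fork_effort(p):.2f}, "
          f"forks/month={fork_rate(p):.2f}")
```

Under this assumed definition, both projects share the same fork-effort, but A accumulates forks at twice the monthly rate of B, which is the distinction the effort ratios alone cannot express.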

Much work is also being done to understand the quality of such software projects. GitHub hosts many of them, most of which are collaborative and open source in nature. The authors believe that quantitative growth has a deep and dialectical link with the quality of software. Quality may also be understood in terms of quantity: changes in quantity, however imperceptible, bring about changes in quality. Similarly, a change in quality is also a kind of quantitative change when looked at from the proper viewpoint. The direction of quantitative change may affect the quality of the software adversely or positively.
