Classification of Product Backlog Items in Agile Software Development Using Machine Learning

Nirubikaa Ravikumar, Banujan Kuhaneswaran, Adeeba Saleem, Ashansa Kithmini Wijeratne, B. T. G. S. Kumara, G. A. C. A. Herath
DOI: 10.4018/978-1-6684-4755-0.ch016

Abstract

In agile software development, product backlog items (PBI) are used to capture user requirements before the product is implemented. Many types of requirements can be observed within a software project, and classifying PBI properly can improve the software development process. PBI can be classified into three categories: user stories, foundational stories, and spikes. An extensive literature survey found no prior research on classifying PBI into these categories. This paper proposes a machine learning (ML) based approach to classify PBI into the three categories. A total of 4,721 PBI were collected from different software projects and manually labelled with the three classes. The PBI were then cleaned using different pre-processing techniques, and classification models were constructed using ML techniques. The performance of each ML model was evaluated using accuracy, precision, recall, and F1 score. The support vector machine (SVM) outperformed the other ML models, achieving 88% accuracy.
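The four evaluation measures named in the abstract can be illustrated with a short sketch. The abstract does not state how precision, recall, and F1 were averaged across the three classes, so macro averaging is an assumption here, and the label names and the toy predictions are hypothetical rather than taken from the study's data.

```python
from collections import Counter

# Hypothetical label names for the three PBI classes.
LABELS = ["user_story", "foundational_story", "spike"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example: six made-up true labels and predictions.
y_true = ["user_story", "user_story", "spike",
          "foundational_story", "spike", "foundational_story"]
y_pred = ["user_story", "spike", "spike",
          "foundational_story", "spike", "user_story"]
scores = evaluate(y_true, y_pred)
```

Macro averaging weights the three classes equally, which can matter when one class (for example, spikes) is much rarer than the others; a weighted average would be an equally plausible choice.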

Introduction

Software Engineering (Sommerville, 2004, 2020) is an engineering discipline concerned with all aspects of developing and maintaining software products. It promotes a systematic approach to developing products (Glass, Vessey, & Ramesh, 2002). Software engineering is intended to support professional software development rather than individual programming.

The Software Development Life Cycle (SDLC) (Ragunath, Velmourougan, Davachelvan, Kayalvizhi, & Ravimohan, 2010) is a systematic process that supports the production of high-quality software at a lower cost. The primary goal of the SDLC is to produce the best software that meets the customer’s expectations in the shortest possible time. The traditional waterfall model (Adenowo & Adenowo, 2013; Balaji & Murugaiyan, 2012) is rigid and relies entirely on following a fixed sequence of steps that keeps the team moving forward. It leaves no room for unexpected changes or revisions, so a sudden change in requirements results in rework. The waterfall model also involves the end user or client less within a project: clients cannot add opinions or clarify what they want as the project progresses (Alshamrani & Bahattab, 2015).

The Agile methodology (Sharma, Sarkar, & Gupta, 2012) was proposed in response to the limitations of the traditional waterfall model. It is an iterative project management and software development methodology that enables teams to deliver value to customers quickly. It involves customers in decision-making, which leads to excellent customer retention (Balaji & Murugaiyan, 2012). The Agile methodology is flexible, and even late changes in requirements are welcomed.

In Agile development, Product Backlog Items (PBI) (Sedano, Ralph, & Péraire, 2019) are a prioritised set of requirements to be implemented in product development. The product backlog is a decision-making tool that aids in estimating, refining, and prioritising everything that might be done in the future. Many individuals consider the backlog a to-do list and define it as a list of tasks that must be completed to get the product onto the market. In more detail, the backlog is a prioritised list of all potential product features, all fundamental elements that must be in place for those features to be implemented, and all critical research activities and studies needed to make accurate decisions about how to approach feature development and requirements. To increase the quality of the product, the features in the PBI need to be considered.

We can classify the items in the PBI as (i) user stories (Amna & Poels, 2022; Raharjana, Siahaan, & Fatichah, 2021): simple statements, written from a user perspective, that explain a need or requirement of the product; (ii) foundational stories: foundational or infrastructure needs that must be met in order to fulfil the user stories; and (iii) spikes (Al Hashimi, Altaleb, & Gravell, 2020; Al Hashimi & Gravell, 2020): important items that cover the research and studies needed to better understand user needs and development procedures.

Classification in ML is a method of grouping a given data set into classes. It can be done with both structured and unstructured data. Classifying the given PBI is the primary research objective. Since three types of PBI are considered in this research, multiclass classification models are required, such as Artificial Neural Network (ANN), Naive Bayes, Support Vector Machine (SVM), Logistic Regression, and Decision Tree.
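To make the idea of multiclass text classification concrete, the following is a minimal sketch of one of the model families listed above, a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the chapter's implementation: the class name, the tokenizer, the label names, and the six training sentences are all hypothetical and serve only to show how a PBI text can be mapped to one of three classes.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written PBI examples (not from the study's data set).
train_texts = [
    "As a customer, I want to reset my password so that I can regain access",
    "As a user, I want to filter search results so that I can find items faster",
    "Set up the continuous integration server for the project",
    "Configure the database schema and migration scripts",
    "Investigate which payment gateway supports recurring billing",
    "Research caching options to evaluate performance",
]
train_labels = [
    "user_story", "user_story",
    "foundational_story", "foundational_story",
    "spike", "spike",
]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("Research which logging library to evaluate")
```

With a realistic data set, the same fit-then-predict structure would apply to the other listed models as well; only the learning algorithm behind `fit` and `predict` would change.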

The remainder of this paper is divided into six sections. Section II provides the background of the study. Section III reviews related work and explains how our work differs from existing studies. Section IV presents the motivation for the research. Section V describes the methodology we used and gives some details about the datasets. Section VI analyses the results and discusses the findings and limitations of the research. Finally, Section VII concludes the paper and outlines future work.

Key Terms in this Chapter

Software Engineering: Software engineering is a systematic engineering approach to developing software. A software engineer is someone who uses software engineering concepts to design, build, maintain, test, and review computer software.

User Stories: A user story is a brief, straightforward description of a feature from the viewpoint of the customer who wants that new feature in a product. It explains how a piece of work will deliver a particular value to the client. User stories are also helpful for identifying recurring problems, tracking the development of necessary system capabilities, and ensuring user satisfaction.

Spikes: A spike is a research or study item that is needed before a team member can move forward with other items within the PBI. A spike is established when the team has to conduct additional research or analysis before estimating a user story or task. It gives long-term trust, visibility, and predictability to the product roadmap.

Machine Learning: Machine learning (ML) is a branch of artificial intelligence (AI) that enables machines to learn automatically from data and previous experience, finding patterns and generating predictions with minimal human intervention. Machine learning approaches allow computers to perform tasks without being explicitly programmed for them. ML applications are fed new data and can learn, grow, evolve, and adapt independently.

Product Backlog Items: A product backlog is a list of the new requirements, modifications to current features, repairs for bugs, changes to the infrastructure, and other tasks that a team may carry out to accomplish a specific goal. It is a prioritised list of functions or features that will help to achieve the product’s objectives and establish team expectations.

Agile: Agile means having the capacity to produce and adapt to change. It lets teams offer value to their clients more quickly and with fewer difficulties through an iterative approach to project management and software development.

Software Development Life Cycle: The Software Development Life Cycle is the process of producing software applications using conventional business procedures. It is usually separated into six to eight steps: planning, requirements, design, build, documentation, testing, deployment, and maintenance. Depending on the scale of the project, some project managers will combine, divide, or eliminate steps. These are the essential components of any software development initiative.

Word Embedding: Word embedding is a term used in natural language processing (NLP) to describe the representation of words for text analysis, often in the form of a real-valued vector that encodes the meaning of the word, such that words that are close in the vector space are expected to be similar in meaning. Word embeddings may be generated using language modelling and feature learning approaches in which words or phrases from the vocabulary are mapped to vectors of real numbers.
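The notion that nearby vectors indicate similar meaning is commonly measured with cosine similarity. The sketch below illustrates this with three hand-made, three-dimensional vectors; the words and values are hypothetical and chosen only for illustration, whereas real embeddings are learned from text and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (not produced by any trained model).
embeddings = {
    "login": [0.9, 0.1, 0.0],
    "password": [0.8, 0.2, 0.1],
    "database": [0.1, 0.9, 0.3],
}

sim_related = cosine_similarity(embeddings["login"], embeddings["password"])
sim_unrelated = cosine_similarity(embeddings["login"], embeddings["database"])
```

In this toy setup, `sim_related` is larger than `sim_unrelated`, reflecting that "login" and "password" were placed closer together in the vector space than "login" and "database".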

Artificial Neural Network (ANN): An artificial neural network is a computational technique built from several interconnected processing units. These units process inputs and produce outputs in accordance with predetermined activation functions.

Foundational Stories: A foundational story describes, from the team members’ viewpoint, something that is required to fulfil the user stories in the Product Backlog. Foundational stories do not explicitly depict the demands and requirements of the user. Instead, they clarify what infrastructure is required to meet the user’s needs.
