In software development, understanding stakeholder needs is crucial for designing complex software systems (Althunibat et al., 2022). Stakeholders, often end users, contribute requirements written in natural language for large-scale projects. Ko et al. (2007) proposed an approach in which the initial requirements are automatically categorized into topics, reflecting political analyst perspectives. Experiments on datasets in both Korean and English validate the efficacy of this strategy. This highlights the potential for an internet-based requirements analysis-supporting system to efficiently gather and evaluate dispersed end-user requirements via the network.
Support vector machine (SVM) algorithms have garnered attention for their sound theoretical properties and high performance (Al Qaisi et al., 2021). Yang et al. (2010) analyzed the characteristics of support vectors and presented a learning process built on SVM classification. The algorithm, rooted in the classification equivalence between support vector sets, uses incremental learning to accumulate data. Experimental results indicate that it can expedite training, reduce storage costs, and maintain classification accuracy (Quba et al., 2021).
Artificial intelligence (AI) and deep learning (DL) come to the forefront in the work of Navarro-Almanza et al. (2017), who recommend using a convolutional neural network (CNN) model to categorize software requirements and report promising results on the PROMISE corpus. This dataset, whose requirements are pre-grouped and labeled as functional requirements (FR) or non-functional requirements (NFR), serves as a valuable resource for evaluating the suggested model (Gill et al., 2014).
Lu and Liang (2017) further contributed to understanding user requirements by breaking them down into FRs and NFRs, the latter including usability, portability, performance, and reliability. Their study combined text representation techniques such as bag of words (BoW), CHI2, TF-IDF, and AUR-BoW with machine learning (ML) algorithms such as J48, naive Bayes, and bagging. Comparative analysis reveals that the bagging algorithm provides the best categorization outcome for NFRs, as validated by feedback from actual customers.
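To make the TF-IDF weighting mentioned above concrete, the following is a minimal sketch using only the Python standard library. It is not the implementation used by Lu and Liang (2017): the function name `tf_idf`, the three sample requirement sentences, and the choice of the unsmoothed variant (raw term frequency times log(N / df)) are assumptions made for illustration. Library implementations often use smoothed or normalized variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = raw term frequency * log(N / document frequency),
    a common textbook variant (assumed here for illustration).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical requirement sentences, tokenized on whitespace.
docs = [
    "the system shall respond within two seconds".split(),
    "the system shall run on linux and windows".split(),
    "the user shall reset the password".split(),
]
weights = tf_idf(docs)
```

Terms that occur in every document, such as "the" and "shall", receive a weight of zero, while terms confined to a single requirement, such as "seconds", receive the highest weight. This is why TF-IDF can help a classifier focus on the words that distinguish one requirement category from another.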
In the domain of ML techniques for classifying FR phrases, AlZu'bi and Jararweh (2020) introduced a novel approach that integrates information from various ML models. This method, implemented and trained using a single dataset, aims to enhance the accuracy and quality of FR classification.
To address imbalanced classes and improve classifier performance, Kurtanović and Maalej (2017) proposed a strategy that applies cross-validation to the classifiers. Their focus is on the automatic identification of NFRs, particularly in the categories of security, usability, operations, and performance. This involves preprocessing steps such as stopword and punctuation removal, coupled with feature selection using BoW, bigrams, and trigrams. Notably, part-of-speech tags emerge as a highly informative feature in their experiments with the SVM classifier.
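The preprocessing and n-gram feature extraction steps described above can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline of Kurtanović and Maalej (2017): the tiny `STOPWORDS` set, the helper names `preprocess` and `ngram_features`, and the sample requirement are all hypothetical, and part-of-speech tagging and the SVM classifier itself are omitted.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real studies use larger lists).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "in", "is"}


def preprocess(text):
    """Lowercase the text, strip punctuation, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngram_features(tokens, max_n=3):
    """Count unigrams (BoW), bigrams, and trigrams in a token list."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


# Hypothetical security-related requirement.
requirement = "The system shall encrypt all stored passwords."
tokens = preprocess(requirement)
features = ngram_features(tokens)
```

For the sample sentence, the five remaining tokens yield five unigrams, four bigrams, and three trigrams. In a full pipeline, such counts would be converted to a fixed-length vector per requirement before being passed to a classifier and evaluated with cross-validation.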
The landscape of software requirement classification is further enriched by studies that explore various methodologies (Alsawareah et al., 2023; Al-Kasabera et al., 2020). These studies aimed to establish correlations between software architecture and NFRs, emphasizing the significance of considering software architecture when addressing NFRs within the software development life cycle.