Errors in Errors: An Exploration of Grammarly's Corrective Feedback

Joshua Kloppers
DOI: 10.4018/IJCALLT.325792

Abstract

Automated writing evaluation (AWE) software is an increasingly popular tool for learners of English as a second language (L2). However, research on the accuracy of such software has been both scarce and largely limited in scope. This article therefore broadens the field of research on AWE accuracy by using a mixed design to holistically evaluate the accuracy of the corrective feedback provided by the leading AWE program Grammarly. A total of 1136 Grammarly-identified errors related to style, lexis, and form were graded and discussed by two native English speakers. An overall accuracy rate of 78.86% was found, rising to 91.60% when style-related errors were excluded. However, several issues were also identified relating to the promotion of a set writing style, variance in feedback quality, and the accuracy of style-related corrective feedback.

Introduction

In 2007, Nadasdi and Sinclair claimed that “a pedagogically-oriented grammar checker… is at least as good as human correction for helping students improve their L2 [second language] writing skills” (p. 12). Though the enthusiasm of this claim certainly does not represent the attitudes of all educators or researchers, there is undoubtedly interest among educators, researchers, and students in automated writing evaluation (AWE) software and how it can be used in the L2 learning process. This interest can be seen in the variety of published works that have recommended using AWE applications in the L2 classroom, either on their own or in conjunction with an educator (Bailey & Lee, 2020; Hockly, 2018; Link et al., 2022; Nova, 2018; Wang et al., 2013). Outside of the classroom, it can be seen in the success of commercial AWE applications such as Grammarly, which has reported over 30 million daily users (Lytvyn, 2022).

This interest in AWE software has resulted in an increasing amount of research aimed at exploring L2 teachers’ and learners’ perceptions of AWE software as well as its effect on L2 acquisition (Link et al., 2022). However, despite this increase, there is still a dearth of research on the accuracy of AWE applications (Link et al., 2022; Xi, 2010). Considering that accuracy and consistency are deemed important factors affecting the value of written corrective feedback—here understood as any “response to erroneous output with the aim of improving accuracy” (Karim & Nassaji, 2019, p. 29)—this is a significant gap in the current research (El Tatawy, 2002).

In terms of Grammarly, the recent isolated studies that have sought to evaluate its accuracy have largely focused on specific types of errors, such as lexical errors (Dodigovic et al., 2016) or form errors (Dodigovic & Tovmasyan, 2021), and have not examined the software’s accuracy holistically or with regard to other types of errors, such as those related to style. Furthermore, these studies (Dembsey, 2017; Dodigovic et al., 2016; Dodigovic & Tovmasyan, 2021) have evaluated Grammarly’s accuracy by comparing it to teacher-provided corrections. However, Grammarly is widely used outside of the classroom (understood here simply as situations where instructor feedback is unavailable, even if the work being done is academic in nature) and so also needs to be evaluated as a tool that does not replace instructor feedback but rather functions separately from or alongside it (Grammarly Inc., 2022a). To do this, the information that Grammarly provides to learners (i.e., the errors that it identifies and corrects) needs to be evaluated in detail on its own.

Another issue that AWE research faces is the speed at which the software and applications it studies change (Dodigovic & Tovmasyan, 2021). Grammarly, for example, averages over three updates a month and recently released a large-scale update named Grammarly 2.0 (Grammarly Inc., 2022b). It is therefore unclear how far the results of research based on earlier versions of AWE applications generalize to current ones.

This paper aims to partly address these issues, at least for a time, by thoroughly evaluating the accuracy of the form, lexical, and style errors that Grammarly identifies, as well as its recommended corrections for those errors. It does so by having two separate native English speakers rate and then comment on 1136 Grammarly-identified and corrected errors in forty separate texts produced by university-level L2 English speakers.
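To make the reported accuracy figures concrete, the short Python sketch below shows one way such rates could be tallied from rater judgments. It is purely illustrative and not drawn from the study itself: the category labels, the data layout, and the rule that a flagged error counts as accurate only when both raters accept it are assumptions made for demonstration.

# Illustrative sketch only: tallies accuracy rates from rater judgments of
# Grammarly-flagged errors. The categories, data layout, and the "both raters
# agree" rule are assumptions, not the study's actual coding scheme.

from dataclasses import dataclass

@dataclass
class GradedError:
    category: str      # "form", "lexis", or "style" (assumed labels)
    rater_1_ok: bool   # rater 1 judged the flag/correction accurate
    rater_2_ok: bool   # rater 2 judged the flag/correction accurate

def accuracy(items):
    """Percentage of flagged errors that both raters judged accurate."""
    if not items:
        return 0.0
    accurate = sum(1 for e in items if e.rater_1_ok and e.rater_2_ok)
    return 100 * accurate / len(items)

graded = [
    GradedError("form", True, True),
    GradedError("lexis", True, False),
    GradedError("style", False, False),
    # ... in the study, 1136 such items were graded
]

overall = accuracy(graded)
excluding_style = accuracy([e for e in graded if e.category != "style"])
print(f"Overall accuracy: {overall:.2f}%")
print(f"Excluding style-related errors: {excluding_style:.2f}%")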
