Deep Learning Speech Synthesis Model for Word/Character-Level Recognition in the Tamil Language

Sukumar Rajendran, Kiruba Thangam Raja, Nagarajan G., Stephen Dass A., Sandeep Kumar M., Prabhu Jayagopal
Copyright: © 2023 | Pages: 14
DOI: 10.4018/IJeC.316824

Abstract

With the ubiquity of electronic devices and the growing popularity of social media, text data is being created at an unprecedented rate. No human can read all of the data created to find what is discussed in their sphere of interest. Topic modeling is a way to identify the subjects present in a vast number of texts. There has been extensive study of topic modeling in English; at the same time, there has been little development for resource-scarce languages such as Tamil, even though Tamil is spoken by millions of people worldwide. The outputs of specific deep learning models are usually difficult for the typical user to interpret, so various visualization techniques are used to represent the outcomes of deep learning in a meaningful way. Metrics such as similarity, correlation, perplexity, and coherence are then used to evaluate the deep learning models.

Introduction

Mobile devices and social media have enabled people to contribute in ways never previously possible. In 15 years, the number of Internet users jumped from 745 million to 4,388 million. User-generated material on the Internet is growing even faster than the user base itself, because users are also spending more time online. The Internet, as the most open form of communication, gives people the freedom to post whatever they want, whenever they want. Because reading all the documents of interest is nearly impossible for a single person, methodologies and tools are needed for analysing large sets of documents and opinions in order to produce concise summaries of the data. In addition, users appreciate it greatly when a summary of the results is presented in some graphic form. Using deep learning speech synthesis, users may study and grasp the themes concealed in unlabelled text materials. Tamil is the world's fourth most widely spoken language, and according to the classical Paninian grammar tradition it is known for its rich syntactic structure. The development of new language models for deep learning speech synthesis in Tamil is almost non-existent. Ongoing speech-to-text research has led the community, particularly in the suburbs, to create such tools for Tamil speakers. The primary difficulty is the collection of a dataset and its transcript that comprise colloquial Tamil. The project objective is to build an app that recognizes spoken Tamil, produces Tamil text, and converts the output text into English using the Google API.

In this research work, the following components are discussed:

  • Collect a voice corpus dataset containing recorded speech and its matching transcript.

  • Establish effective training data for the research.

  • Apply machine learning algorithms for speech recognition.

  • Deliver the most accurate speech translation to the end user.

The visualisation approaches and assessment metrics were introduced to aid in the development of a simple platform for Tamil modelling research.

Latent Semantic Analysis (LSA)

Text material can be represented using Latent Semantic Analysis (LSA), also known as Latent Semantic Indexing (LSI), a knowledge representation technique. LSA gathers all of the contexts in which a specific word does or does not appear into a set of mutual constraints, and these constraints largely determine the degree to which different words and groups of words have similar meanings. LSA requires no dictionary, grammar, parser, or any other human-created resource; it accepts only raw text as input. Each document in the corpus is represented by a word count vector of length W, where the lexicon is frequently generated from the corpus itself. In this way, the corpus can be depicted as a matrix of dimensions D x W, where D is the total number of documents in the corpus. Each cell of the matrix holds the word's TF-IDF score. LSA maps documents and concepts into the latent semantic space, a vector space of reduced dimensionality equal to the number of desired topics (Deerwester et al., 1990). Using measures such as cosine similarity, the latent semantic space can then be searched to locate similar words and documents. LSA has been applied to neuropsychological modelling, phrase comprehension, reviewer selection and research article recommendation, semantic categorization, and the clustering of words (Bakhshi et al., 2020; Bernard et al., 2020; Christy et al., 2020; Gaonkar, 2019). Examples of LSA in action can be viewed online.
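The pipeline described above can be sketched in a few lines of numpy. This is a minimal illustration on a hypothetical toy corpus (not the paper's Tamil data): it builds the D x W TF-IDF matrix, reduces it with a truncated SVD, and compares documents by cosine similarity in the latent space.

```python
import numpy as np

# Hypothetical toy corpus; the paper's corpus would be colloquial Tamil text.
docs = [
    "tamil speech synthesis model",
    "speech recognition for tamil",
    "topic modeling of english text",
]

# Lexicon generated from the corpus itself; D x W raw count matrix.
vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[d.split().count(w) for w in vocab] for d in docs], float)

# TF-IDF weighting: tf * log(D / df); words in every document get weight 0.
D = len(docs)
df = (counts > 0).sum(axis=0)
tfidf = counts * np.log(D / df)

# Truncated SVD keeps k latent dimensions (the "desired topics").
k = 2
U, S, Vt = np.linalg.svd(tfidf, full_matrices=False)
latent = U[:, :k] * S[:k]  # D x k document coordinates in latent space

def cosine(a, b):
    """Cosine similarity; the epsilon guards against zero-norm vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

sim_speech = cosine(latent[0], latent[1])  # two speech-related documents
sim_cross = cosine(latent[0], latent[2])   # speech vs. topic-modeling doc
print(sim_speech > sim_cross)
```

The two speech-related documents land closer together in the latent space than either does to the unrelated document, which is the property LSA exploits to locate similar words and documents.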
