Automatic Abstractive Summarization of Text: Harnessing the Power of Large Language Models and Deep Learning
DOI: https://doi.org/10.63075/82cejv76

Abstract
In today’s world, vast amounts of information are available both online and offline. Hundreds of articles may cover a single topic, and extracting useful information from them manually is difficult. Automatic text summarization systems have been developed to solve this problem: they distill the valuable information in large documents into short summaries. There are two approaches to generating summaries: extractive summarization and abstractive summarization. The extractive technique selects only the most relevant sentences from the original document, whereas abstractive summarization interprets the original text before generating the summary. Extractive summarization has been studied extensively; however, abstractive summarization in the Urdu language has received little attention so far. Urdu is a dynamic language in terms of literary sources and requires serious research effort to generate abstractive summaries. We therefore propose an abstractive summarization method for Urdu text using the Urdu Fake News dataset and pre-trained models. For analysis, the Urdu Fake News dataset, which contains text data suitable for summarization, is used, and to serve this purpose we provide four summarization systems: one based on BART, another on T5, a third on GPT-2, and a fourth on EGPT-2. We employ a pre-trained model designed explicitly for Urdu to perform the summarization task. This study aims to enhance Urdu text summarization by identifying inefficient attention heads and removing them entirely from the model. We evaluated the proposed abstractive summarization models using the ROUGE score and found that they improved accuracy and produced more natural, cohesive summaries.
Keywords— automatic summarization; NLP; T5; BART; GPT-2; EGPT-2