Optimizing Transformers and Large Language Models: Effectiveness Through Training and Fine-Tuning

  • Unique Paper ID: 180865
  • ISSN: 2349-6002
  • Volume: 12
  • Issue: 1
  • PageNo: 2727-2730
  • Abstract:
  • Transformer-based large language models have significantly advanced the field of natural language processing, delivering cutting-edge results across numerous tasks. However, as these models scale to billions of parameters, fully fine-tuning them becomes increasingly resource-intensive in terms of both computation and storage. This paper investigates strategies to enhance the training and fine-tuning of transformers, with an emphasis on parameter-efficient approaches. Commonly known as "delta-tuning," these techniques adapt a model to new tasks by updating only a small fraction of its parameters, yet still achieve performance close to that of full fine-tuning. This research surveys recent developments in this area, examining their effectiveness, scalability, and relevance across a variety of NLP applications. By lowering hardware and memory requirements, parameter-efficient tuning emerges as a practical and scalable method for refining LLMs, enabling broader accessibility and easier deployment across different sectors.
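
As an illustration of the delta-tuning idea summarized in the abstract (not code from the paper itself), the minimal sketch below adds a LoRA-style low-rank update to a frozen linear layer in PyTorch. The class name `LowRankDelta`, the rank, and the scaling factor are assumptions chosen for the example; only the small `A` and `B` matrices receive gradients, so the trainable parameter count is a tiny fraction of the total.

```python
# Illustrative sketch of "delta-tuning" via a LoRA-style low-rank update.
# Assumptions: PyTorch, a single nn.Linear layer, rank/alpha chosen arbitrarily.
import torch
import torch.nn as nn


class LowRankDelta(nn.Module):
    """Wraps a frozen linear layer and adds a trainable low-rank delta B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained weights
            p.requires_grad = False
        # Trainable low-rank factors: A (rank x in), B (out x rank).
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen pretrained path plus the small trainable correction.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale


if __name__ == "__main__":
    layer = LowRankDelta(nn.Linear(768, 768), rank=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable: {trainable} / {total} ({100 * trainable / total:.2f}%)")
```

Running the sketch reports roughly 2% of the layer's parameters as trainable, which is the effect the abstract attributes to parameter-efficient ("delta") tuning: the pretrained weights stay fixed while only the small delta is optimized and stored per task.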
