Reinforcement Learning for Tuning Language Models (How to Train ChatGPT)
The Large Language Model revolution started with the advent of the Transformer architecture in 2017. Since then, model sizes have grown exponentially, with models now exceeding 100B parameters. These pre-trained models have changed the way NLP is done: it is much easier to take a pre-trained model and fine-tune it for a downstream task (sentiment analysis, question answering, entity recognition, etc.) than to train a model from scratch. Fine-tuning also requires far fewer labeled examples, which makes the whole NLP workflow much more accessible; a minimal sketch of this workflow follows below.
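To make the fine-tuning workflow concrete, here is a minimal sketch using the Hugging Face `transformers` and `datasets` libraries. The checkpoint (`distilbert-base-uncased`), the IMDB dataset, the subset sizes, and all hyperparameters are illustrative assumptions, not choices from this article.

```python
# A minimal sketch of fine-tuning a pre-trained model for sentiment
# classification. Model, dataset, and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2  # binary sentiment: positive / negative
)

# A modest labeled dataset is enough for fine-tuning; IMDB reviews here.
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sentiment-finetune",
    per_device_train_batch_size=16,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    # Small subsets keep the sketch cheap to run; scale up as needed.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()
```

With only a couple of thousand labeled examples, this adapts a general-purpose pre-trained model to a specific task, whereas training the same model from scratch would require orders of magnitude more data and compute.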