Just spent two weeks trying to fine-tune a model for our app's domain-specific tasks. Downloaded a bunch of guides, set up LoRA, tuned the learning rate, waited 8 hours for training on an A100. Result
Note: Python programmer. Disclaimer: beginner to both ML and NLP; I've only fine-tuned GPT-2 a handful of times with the help of some packages. I'm curious whether using GPT-2 might yield higher accuracy for document vectors (with greatly varying length) or no...