A large language model (LLM) is a computerized language model embodied by an artificial neural network with an enormous number of parameters, i.e. the weights connecting the 'neurons' in its layers, ranging from tens of millions to billions. The model is (pre-)trained on many GPUs, whose massive parallel processing keeps training time relatively short, over vast amounts of unlabeled text containing up to trillions of tokens (i.e. words or parts of words) drawn from corpora such as the Wikipedia Corpus and Common Crawl, using self-supervised or semi-supervised learning. The result is a model that, given a sequence of tokens from its vocabulary, assigns a probability distribution over the next token.
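As a concrete illustration of tokenization and of the probability distribution an LLM assigns over its vocabulary, the following is a minimal sketch using the Hugging Face transformers library with GPT-2 as an example model; the library, the model choice, and the prompt text are illustrative assumptions, not part of the definition above.

```python
# Minimal sketch: tokenize a prompt and inspect the model's probability
# distribution over the next token. Assumes the `transformers` and `torch`
# packages and the publicly available GPT-2 model (illustrative choices).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The tokenizer splits text into tokens (words or parts of words) and maps
# each token to an integer ID from the model's fixed vocabulary.
inputs = tokenizer("A large language model is", return_tensors="pt")

# A forward pass yields logits; applying softmax to the logits at the last
# position gives a probability distribution over the whole vocabulary,
# i.e. the model's prediction for the next token.
with torch.no_grad():
    logits = model(**inputs).logits
next_token_probs = torch.softmax(logits[0, -1, :], dim=-1)

# Show the five most probable next tokens and their probabilities.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id):>12s}  {prob.item():.3f}")
```

Sampling repeatedly from such next-token distributions is what lets the model generate continuations of a prompt, one token at a time.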