ChatGPT, Google Gemini and Meta AI are all LLMs that work by predicting the next word, using word vectors.

New Delhi:

Have you ever wondered how ChatGPT works? The short answer to this complex question lies in large language models, or LLMs: foundation models trained on large amounts of text. These models do not process words the way humans do. Instead, they represent each word as a long series of numbers, and it is these numbers, not the letters, that the computer works with.

These sequences of numbers are known as word vectors. Each one can be pictured as a single point in an imaginary space, with words of similar meaning placed closer together. The scale of each model is enormous and almost impossible to imagine. For reference, GPT-4 reportedly has a staggering 1.76 trillion parameters, with millions of unique word vectors, according to a June 28, 2023 report from SemiAnalysis, a US-based independent AI research and analytics company. Processing so many vectors through trillions of parameters has been made possible by the spectacular advances in computing power in recent years. Most recently, on June 19, Nvidia became the world’s largest public company by market capitalization, surpassing Microsoft and Apple, on the back of growing demand for its AI chips.
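To make the idea of words as points in space concrete, here is a minimal Python sketch. The three-number vectors are invented for illustration (real models learn vectors with hundreds or thousands of numbers), and cosine similarity is just one common way to measure how close two vectors are.

```python
import math

# Toy 3-number vectors, invented for illustration only.
# Real models learn much longer vectors from data.
word_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_dog = cosine_similarity(word_vectors["cat"], word_vectors["dog"])
cat_car = cosine_similarity(word_vectors["cat"], word_vectors["car"])

print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

With these made-up numbers, "cat" scores closer to "dog" than to "car", which is the kind of relationship a trained model is meant to capture.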

ChatGPT, Google Gemini and Meta AI are all LLMs that work by predicting the next word, using word vectors. The user’s text, known as the prompt, is turned into word vectors, and transformers convert those vectors into a prediction of the word most likely to come next.
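The final step of next-word prediction can be sketched in a few lines of Python. This is only an illustration: the candidate words and their scores are invented, and a real model scores tens of thousands of candidates at once. The softmax function shown is a standard way to turn raw scores into probabilities.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to candidate next words
# after the prompt "The cat sat on the". The numbers are invented.
candidates = ["mat", "moon", "car", "sofa"]
scores = [3.2, 0.1, 0.5, 2.4]

probabilities = softmax(scores)

# Pick the most probable candidate as the predicted next word.
next_word = candidates[probabilities.index(max(probabilities))]

for word, p in zip(candidates, probabilities):
    print(f"{word}: {p:.2f}")
print("predicted next word:", next_word)
```

In practice, chatbots often sample from these probabilities rather than always taking the top word, which is one reason the same prompt can produce different answers.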

How is text prediction done in LLMs?

LLMs are multi-layered. Each layer is a block of a neural network architecture (imagine artificial neurons) known as the transformer. These layers process the input text as word vectors, and within each layer the vectors look around at one another and exchange relevant information, a mechanism known as attention. This process is repeated layer after layer, and again for every new word the model produces, not just once per message. Each pass refines the prediction of “the next word” in the sequence.
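The way word vectors "look around" can be sketched with a stripped-down version of attention. This is a simplification, not how any particular product is built: real transformers first pass each vector through learned weight matrices (queries, keys and values), use many attention heads, and add further layers, all of which are omitted here. The two-number vectors are invented.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified self-attention with no learned weights.

    Each word vector is replaced by a weighted average of all the
    vectors in the sentence, weighted by how similar they are to it.
    """
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        # Similarity of this word to every word (scaled dot product).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Blend all vectors together using those weights.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        updated.append(mixed)
    return updated

# Invented 2-number vectors for the words "the", "cat", "sat".
vectors = [[0.1, 0.0], [0.9, 0.7], [0.4, 0.8]]

# Stacking layers means feeding one layer's output into the next.
layer_1 = self_attention(vectors)
layer_2 = self_attention(layer_1)
print(layer_2)
```

After each pass, every word's vector carries some information from the words around it, which is what lets later layers use context when predicting the next word.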

How are LLMs trained?

LLMs are trained using unsupervised learning, which eliminates the need for humans to label the data: the text itself supplies the answer, since the word that actually comes next serves as the target. Text from web pages, books and other sources is used to train LLMs before they are released to the public. This has also generated controversy, since the models in some cases reflect human prejudices found in that text. In particular, Microsoft’s Twitter chatbot Tay, Google’s Gemini and OpenAI’s Sora (a text-to-video generator) have generated controversy over the years for biased output, including bigoted, racist and gender-discriminatory responses. To its credit, the industry has responded to the challenge and is continually working to reduce human biases in LLMs.
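The point about training without human labels can be illustrated with a short Python sketch. It shows how plain text can be cut into "context" and "next word" pairs automatically. The function name and the three-word context window are assumptions made for this example; real systems work on far longer stretches of text.

```python
def next_word_pairs(text, context_size=3):
    """Split raw text into (context, next word) training examples.

    No human labeling is needed: the "label" for each example is
    simply the word that follows the context in the original text.
    """
    words = text.split()
    pairs = []
    for i in range(1, len(words)):
        context = words[max(0, i - context_size):i]
        pairs.append((context, words[i]))
    return pairs

pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(" ".join(context), "->", target)
```

Because the examples come straight from the source text, any prejudice present in that text ends up in the training examples too, which is how the biases described above can find their way into a model.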