A Simple Explanation of How AI (LLMs) Works
Building the AI
Large Language Models (LLMs). Computer programs that do one thing: predict the next “token.”
Training Data. A massive amount of text is initially fed into the system to train it.
Parameters. The internal numerical values (weights) the model learns from the training data.
Tokenization part 1: pre-training. The process of converting the raw training data (text, images, or audio) into small units called tokens.
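The tokenization step above can be sketched in a few lines of Python. Real LLMs use learned subword vocabularies (for example byte-pair encoding); this hypothetical greedy longest-match tokenizer and its tiny vocabulary are invented here purely to show the idea of turning text into token IDs.

```python
# Toy vocabulary mapping text pieces to token IDs (hypothetical values).
VOCAB = {"pre": 0, "train": 1, "ing": 2, " ": 3, "the": 4, "model": 5}

def tokenize(text: str) -> list[int]:
    """Greedily match the longest vocabulary entry at each position."""
    ids = []
    i = 0
    while i < len(text):
        for piece in sorted(VOCAB, key=len, reverse=True):
            if text.startswith(piece, i):
                ids.append(VOCAB[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no token for text at position {i}")
    return ids

print(tokenize("pretraining the model"))  # -> [0, 1, 2, 3, 4, 3, 5]
```

During pre-training, billions of such token sequences are fed to the model; at inference time (part 2 below) the same kind of conversion is applied to the user's prompt.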
What Happens When Someone Uses the AI
Prompt. A user asks a question.
Tokenization part 2: inference. The process of converting the prompt (whether text, images, or audio) into small units called tokens.
Embedding. The conversion of tokens into numbers (vectors) so the computer can measure the relationships between them.
Vector databases. The storage and search engine for vector embeddings.
RAG (retrieval-augmented generation). The system searches the vector database for content relevant to the prompt, which reduces hallucinations and supplies up-to-date information.
Transformers. The core AI architecture that uses those vectors to predict which token to generate next in response to the prompt.
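The inference steps above (tokenize, embed, search the vector database) can be sketched end to end. Everything here is a toy assumption: the 3-dimensional embeddings, the two stored documents, and the cosine-similarity search stand in for what real systems do with thousands of dimensions and millions of documents; the transformer's next-token prediction itself is far too complex to show here.

```python
import math

# Hypothetical 3-dimensional embeddings for a few tokens.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.3],
}

# A toy "vector database": stored documents with precomputed embeddings.
DOCS = {
    "Cats are small felines.": [0.85, 0.15, 0.05],
    "Cars have four wheels.": [0.05, 0.88, 0.25],
}

def cosine(a, b):
    """Similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vec):
    """RAG step: return the stored document closest to the query vector."""
    return max(DOCS, key=lambda d: cosine(DOCS[d], query_vec))

prompt_token = "cat"                # tokenization (one toy token)
vec = EMBEDDINGS[prompt_token]      # embedding: token -> vector
context = retrieve(vec)             # vector search / RAG
print(context)                      # -> "Cats are small felines."
# The transformer would then use the prompt vectors plus this retrieved
# context to predict the next token of its answer.
```

The point of the sketch is the pipeline order, not the numbers: prompt tokens become vectors, vectors find related stored content, and the transformer conditions its next-token prediction on both.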
