AI Definitions: Natural Language Processing
Natural language processing - This branch of machine learning converts language into numbers so that machines can work with it. The first step is tokenization, in which text is divided into word units called tokens. Each token is then transformed into a vector, which is a list of numbers; a single word token might be represented by more than 1,000 numbers. A vector with many numbers is said to have a higher dimension and can capture more nuance of meaning. A vector with a low dimension (a short list of numbers) captures less nuance but is easier to work with. A deep learning model (typically a transformer model) can use these vectors to understand the meaning of words and determine how words relate to one another. For example, “king” relates to “man” in the same way that “queen” relates to “woman.”
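As a rough illustration (not drawn from any particular model), the Python sketch below tokenizes a short phrase, looks the tokens up in a toy embedding table, and checks the king/man versus queen/woman relationship with cosine similarity. The four-dimensional vectors are invented for the example; real models learn embeddings with hundreds or thousands of dimensions from data.

```python
# A minimal sketch of tokenization and word vectors, using toy 4-dimensional
# embeddings invented for illustration (real models learn much larger vectors).
import numpy as np

def tokenize(text: str) -> list[str]:
    # Split text into word tokens; real tokenizers also handle punctuation
    # and subwords, but whitespace splitting keeps the idea visible.
    return text.lower().split()

# Hypothetical embedding table: each token maps to a vector (a list of numbers).
embeddings = {
    "king":  np.array([0.8, 0.9, 0.1, 0.7]),
    "queen": np.array([0.8, 0.1, 0.9, 0.7]),
    "man":   np.array([0.2, 0.9, 0.1, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9, 0.1]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity measures how closely two vectors point in the same
    # direction, a common proxy for similarity of meaning.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

tokens = tokenize("King and queen")
print(tokens)  # ['king', 'and', 'queen']

# The "king is to man as queen is to woman" relationship shows up as
# similar difference vectors: king - man is close to queen - woman.
diff_royal_male = embeddings["king"] - embeddings["man"]
diff_royal_female = embeddings["queen"] - embeddings["woman"]
print(cosine_similarity(diff_royal_male, diff_royal_female))  # close to 1.0
```

In this toy table the similarity comes out at exactly 1.0 because the vectors were chosen by hand; in a trained model the analogy holds only approximately.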
More AI definitions here