The past few years have seen a boom in the development of a new and revolutionary technology: AI. It has reached the point where practically everyone has at least heard of it, and many have even used it for their daily errands. But what is AI, really?
AI is short for “Artificial Intelligence.” Artificial – human-made; not natural. Intelligence – the ability to learn and apply new skills. In essence, AI is a human-made computer program that learns by itself, from itself, and without human interference. Modern AI algorithms aren’t that advanced yet, however, so humans still have to train them. There are different types of AI that learn in different ways. In this article, I am only going to cover the popular, mainstream AI models known as LLMs, or Large Language Models.
Picture this: you walk into English class and your teacher hands out a pop quiz with fill-in-the-blank questions. How would you answer it? The best method is probably to carefully analyze every word in the sentence before the blank, with the words closest to the blank mattering most (if the word right before the blank is an adjective, a good guess is that the blank is a noun, but we also need the sentence’s context to be sure). Well, what if I told you that ChatGPT (the most popular LLM) works the same way? When you ask ChatGPT something, it feeds your prompt into an algorithm that predicts what the next word should be, taking all of the context into account. Then it feeds its answer back in, again and again, until it has answered the prompt in its entirety (for example, if you want it to write you a 500-word essay, it has to call the algorithm roughly 500 times). However, ChatGPT doesn’t start out knowing every word in the English language, nor does it know how to string them together, which is why it first has to learn.
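If you’re curious what that “guess a word, then repeat” loop might look like, here is a tiny sketch in Python. Everything in it is made up for illustration: the toy vocabulary and the predict_next_word stand-in are invented names, and a real LLM would score every word it knows with a huge neural network rather than picking one at random.

```python
import random

# Toy vocabulary, invented purely for this sketch.
TOY_VOCAB = ["Harry", "wizard", "the", "castle", "<end>"]

def predict_next_word(context_words):
    # Stand-in for the real prediction algorithm: a real LLM would use all of
    # the context to pick the most likely next word, not choose at random.
    return random.choice(TOY_VOCAB)

def generate(prompt, max_words=500):
    words = prompt.split()
    for _ in range(max_words):              # one algorithm call per word produced
        next_word = predict_next_word(words)
        if next_word == "<end>":            # the model can signal it is finished
            break
        words.append(next_word)             # feed the answer back in as new context
    return " ".join(words)

print(generate("You are a"))
```

The important part is the loop: the program’s own output becomes part of the input for the very next guess.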
Different AIs learn differently, or should I say, gain their “intelligence” in different ways. In the case of LLMs, the most popular structure is the transformer block, which is built on neural networks. Neural – relating to the nervous system. Network – a system of interconnected items. In a nutshell, a neural network is a collection of points that mimics the structure of a human brain. Each point stores data and can communicate with other points and change their data, and vice versa. Coming back to the English pop quiz: just as we would use every word in the sentence to predict what goes in the blank, AI does the same. And just as we focused more on the words closest to the blank, AI weighs those words more heavily in its probability predictions.
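To make the “closer words count more” idea concrete, here is a small made-up example that hands out weights based purely on distance from the blank. Real transformers learn their weights from training data instead of using a fixed rule like this; it is only a sketch of the intuition.

```python
# Each context word gets a weight that shrinks with distance from the blank,
# so the word right before the blank counts the most.

def distance_weights(context_words):
    n = len(context_words)
    # last word (right before the blank) gets weight 1.0, earlier words get less
    return [1.0 / (n - i) for i in range(n)]

context = ["You", "are", "a", "wizard"]   # the blank comes right after "wizard"
for word, weight in zip(context, distance_weights(context)):
    print(f"{word:>7}: weight {weight:.2f}")
```

Running this prints the biggest weight for “wizard” and smaller ones the further back you go, which is roughly the behavior the network has to learn on its own.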
Training data is crucial here. In the beginning, the AI has no idea what words even are, nor how to weigh them more or less based on the context of the prompt we feed it. The way we train it is by first defining a few thousand words as its vocabulary and giving each word a number to identify it. Then we take a sample of text, say Harry Potter. We feed it “You are a wizard,” where we would normally expect “Harry” in return, but the AI gives back “fish.” We then look at the algorithm, focus on which words were weighted most, and modify their values so that next time the AI gives “Harry” (basically, trial-and-error with backtracking until the AI gives back what we want). We have to repeat this process millions if not billions of times, which is why you hear so much about training data online. After all, AI is nothing without it!
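Here is a toy, made-up version of that trial-and-error idea in Python, using the Harry Potter example from above. The names (toy_model, nudge_weights) and the numbers are invented for illustration; real training adjusts billions of weights with a technique called backpropagation, not a single score per word.

```python
# Give each vocabulary word a number, then nudge scores until the model's
# guess for the word after "You are a wizard" matches the expected "Harry".

vocab = {"You": 0, "are": 1, "a": 2, "wizard": 3, "Harry": 4, "fish": 5}
weights = {word_id: 0.0 for word_id in vocab.values()}   # one score per word

def toy_model(context_ids):
    # Predict whichever word currently has the highest score.
    return max(weights, key=weights.get)

def nudge_weights(predicted_id, expected_id):
    weights[predicted_id] -= 0.1    # make the wrong guess less likely
    weights[expected_id] += 0.1     # make the right answer more likely

context = [vocab[w] for w in ["You", "are", "a", "wizard"]]
expected = vocab["Harry"]

for step in range(5):                        # real training: millions of steps
    guess = toy_model(context)
    if guess != expected:
        nudge_weights(guess, expected)

print(toy_model(context) == expected)        # True: it now answers "Harry"
```

The real process is vastly more complicated, but the loop is the same in spirit: guess, compare with the expected answer, adjust, repeat.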
Remember, artificial “intelligence” is not smart. It’s just a computer that guesses words, and it only reaches a trustworthy accuracy after an enormous amount of money and energy has been spent on it. It’s not a fortuneteller, not a genius, and no match for human reasoning. It will only know what you train it to know. For example, ChatGPT won’t invent calculus if you only give it algebra; it will only know calculus if you train it on calculus.