What is the new DeepMind language model Gopher?
Language models based on artificial intelligence (AI) are the talk of the town. Usually, the performance and quality of these models go hand in hand with their size: the larger the model, the better the performance. However, larger models are also more opaque, and biases become increasingly difficult to detect as model size grows. Ethicists view this critically, and it raises considerable ethical concerns. Gopher is a comparatively small language model that can look up information in a database and draw on it for its answers. Gopher has been trained to be friendly and to conduct dialogue in a similar way to a human. Users can ask Gopher concrete questions and receive concrete answers composed of information from the database. This allows Gopher, despite its smaller size, to keep up with the large models on the market while remaining flexible. Gopher’s knowledge can also be refreshed by updating the database, without the need to retrain the model.
Gopher’s developer, DeepMind, is no stranger to this field. The company was founded in 2010 and acquired by Google in 2014; today it belongs to Google’s parent company, Alphabet. Headquartered in London, DeepMind has further centres in Canada, France and the United States. With Gopher, DeepMind has set a new milestone in the field of language models.
With 280 billion parameters, Gopher is not the largest language model, but its link to the database gives it enormous potential. In the paper published by DeepMind, which is about 118 pages long, the company describes the language model in detail and gives example conversations that illustrate the interactions between Gopher and the user. Users can ask the language model questions on any topic imaginable. It doesn’t matter whether they want to know about dinosaurs, the theory of relativity or the capital of the Czech Republic: Gopher will offer an answer.
Gopher, like most large language models, is a transformer. This is a machine-learning model that translates one sequence of tokens (pieces of text) into another. The model is trained to do this on sample data and learns how to work from those examples. Gopher was trained on 300 billion tokens, but can draw on much larger amounts of knowledge because of the database. In total, the database comprises 2.3 trillion tokens, more than seven times the amount of data used to train Gopher.
Gopher can be used in many different areas. After its development, DeepMind tested it on 152 tasks and compared it with other models. The tasks ranged from fact checking and language modelling to answering various questions from users. In about 80 per cent of the tasks, Gopher outperformed the competing language models, which included the well-known GPT-3.
The DeepMind model came out on top especially in conversation, where it showed a high degree of consistency. Natural conversation is often a problem for AI-based language models. Although such models can form individual, grammatically correct sentences, they have difficulty maintaining context across an entire passage or text. Yet this is essential for a fluent conversation, and it remains one of the major challenges in the development of language models.
One reason for Gopher’s good performance is its connection to the database, which it uses like a kind of cheat sheet or reference book. Gopher searches the database for passages with similar language and uses them to improve the accuracy of its predictions. DeepMind calls this technology “Retro” (Retrieval-Enhanced Transformer), which roughly means a transformer enhanced with lookup capabilities. Through this technology, Gopher is able to compete with language models that are 25 times larger.
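The lookup step described above can be illustrated with a small sketch. It is not DeepMind’s implementation: it assumes a toy word-count similarity in place of a learned text encoder, and the names `embed`, `cosine`, `retrieve`, `build_prompt` and the three example passages are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real system would use a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=1):
    """Return the k passages whose wording is most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(passage)), passage) for passage in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query, database, k=1):
    """Prepend the retrieved passages to the query, cheat-sheet style."""
    passages = [passage for _, passage in retrieve(query, database, k)]
    return "\n".join(passages + [query])


# Hypothetical miniature database (illustrative data only).
DATABASE = [
    "Prague is the capital of the Czech Republic.",
    "Tyrannosaurus rex was a large carnivorous dinosaur.",
    "The theory of relativity was developed by Albert Einstein.",
]

if __name__ == "__main__":
    print(build_prompt("What is the capital of the Czech Republic?", DATABASE))
```

In this sketch the language model itself is left out: the point is only that the text handed to the model is extended with the most similar passages, so the model can condition its prediction on them.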
Although Gopher is convincing in many areas and leaves its competitors behind, this AI faces ethical issues similar to those of other language models. However, because of its link to the database, Gopher has to be evaluated differently from an ethical point of view than comparable language models without a database. Gopher makes transparent which sections of the database were used for its predictions. This can help to explain the results and means that Gopher is not a pure black box. Furthermore, distorting influences (bias) can be corrected directly in the database and thus removed.
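The two properties described above, traceable sources and correction through the database rather than through retraining, can be sketched as follows. This is a minimal illustration under stated assumptions, not DeepMind’s system: the `PassageStore` class, the `answer_with_sources` function, the passage IDs and the simple word-overlap scoring are all hypothetical.

```python
import re


def tokens(text):
    """Lowercased set of words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class PassageStore:
    """Hypothetical in-memory database of passages, keyed by an ID."""

    def __init__(self):
        self._passages = {}

    def upsert(self, passage_id, text):
        """Add a passage or overwrite an existing one; no model retraining involved."""
        self._passages[passage_id] = text

    def delete(self, passage_id):
        """Remove a passage, for example one found to be biased or outdated."""
        self._passages.pop(passage_id, None)

    def lookup(self, query, k=1):
        """Return up to k (id, text) pairs sharing at least one word with the query."""
        q = tokens(query)
        scored = [(len(q & tokens(text)), pid, text)
                  for pid, text in self._passages.items()]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(pid, text) for _, pid, text in scored[:k]]


def answer_with_sources(query, store, k=1):
    """Return the retrieved evidence together with the IDs it came from."""
    hits = store.lookup(query, k)
    return {
        "query": query,
        "sources": [pid for pid, _ in hits],
        "evidence": [text for _, text in hits],
    }
```

A caller could, for instance, inspect `sources` to see which passages shaped a reply, then call `upsert` or `delete` on the offending entry; the next lookup reflects the change immediately.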
The fact that Gopher, although a rather small model, usually outperformed its competitors in the tests raises the question of how good large language models with a database connection could be. However, such models are not currently on the market, and they would have to be not only developed but also examined from an ethical perspective.
For now, judging by DeepMind’s data, Gopher is the most efficient language model that can learn through changes to its database without having to be completely retrained.