The Future of Foundation Models

Ofir Zuk (Chakon)

29/09/2022


This is part 3 of a series of blogs on Foundation Models and their impact on AI. You can read part 1 and part 2 here and here.

The risks of foundation models (see part 2) are Gordian knots that must be untangled before they pose a danger to society. Given our limited understanding of foundation models, it is challenging enough to enumerate all the possible risks, let alone propose methods to address them.

“Our study hints at a preliminary but alarming conclusion: systems like GPT-3 seem better suited for disinformation—at least in its least subtle forms—than information, more adept as fabulists than as staid truth-tellers.”

Truth, Lies, and Automation: How Language Models Could Change Misinformation (Source)

Recognizing these risks, researchers have opted to limit access to foundation models to a select group of people rather than release them to everyone. In the meantime, they are taking the time to debate the ethical implications and close the loopholes of foundation models.

We see this trend with DALL·E 2, whose API has not been made available to the public in order to avoid abuse. More recently, Facebook released its large language model, OPT-175B, to researchers in hopes of “increasing the diversity of voices defining the ethical considerations”.

That said, the potential of foundation models cannot be overstated. Big Tech companies are still racing one another to build a state-of-the-art foundation model. As we have seen earlier, they compete by producing ever-larger models.


Yet some researchers have questioned the assumption that bigger is better. DeepMind’s RETRO (Retrieval-Enhanced Transformer) is reportedly evidence that smaller networks can match the versatility of their larger counterparts at a fraction of the training cost. Because it can look up text in an external memory, RETRO is reported to perform on par with neural networks 25 times its size. In the future, we are likely to see more researchers use similar techniques to reduce the resource usage of foundation models.
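To make the idea of "looking up text in an external memory" more concrete, here is a minimal sketch in Python. It is not RETRO's actual method: it swaps in simple word-count vectors and cosine similarity where a real system would use learned embeddings and a very large database, and every name in it (`MEMORY`, `retrieve`, `build_prompt`) is hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, memory, k=1):
    """Return the k passages in memory that are most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        memory,
        key=lambda passage: cosine_similarity(query_vec, bag_of_words(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, memory, k=1):
    """Prepend the retrieved passages to the query as extra context."""
    context = "\n".join(retrieve(query, memory, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical external memory: a toy stand-in for a large text database.
MEMORY = [
    "RETRO is a retrieval-enhanced transformer from DeepMind.",
    "BERT is available from the HuggingFace model library.",
    "DALL-E 2 generates images from text prompts.",
]
```

Calling `build_prompt("Where can I download BERT?", MEMORY)` places the BERT passage ahead of the question. The point of the design is that knowledge lives in the memory rather than in the model's weights, which is one plausible reason a smaller network paired with retrieval can stay competitive.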

We also foresee a future where foundation models are open-sourced. Until recently, the benefits of foundation models were enjoyed almost exclusively by Big Tech, which could afford the exorbitant training resources. Encouragingly, the rise of open-source foundation models has democratized access to them. Today, anyone with internet access can download the BERT model from the HuggingFace model library. In the future, even larger foundation models, including HuggingFace’s BigScience model and EleutherAI’s GPT-NeoX-20B, will be available to the public as well. Such open-source efforts are likely to fuel an explosion of innovative NLP start-ups.

Foundation Models and the Future of AI

We have yet to discover the full extent of what foundation models can do. With research continuing at full steam, they are bound to become more powerful and versatile. These advances will translate into a multitude of use cases across industries, including synthetic data.

The AI community recognizes that it must collectively address the dangers of foundation models before society can reap their benefits. Practitioners today are also moving away from blindly releasing their powerful models into the wild. Instead, they are opting to address the gaps in their models collaboratively.

Having observed such a collective, conscious effort to build foundation models for good, we cannot wait to see how they will transform society for the better. It is only a matter of time before foundation models usher in a brand new paradigm of AI.