Foundation Models and Generative AI: A New Era in AI
Over the past few months, large language models (LLMs) such as ChatGPT have revolutionized the way we interact with artificial intelligence. From writing poetry to planning vacations, these models showcase the immense potential of AI to drive enterprise value. I'm Kate Soule, a senior manager of business strategy at IBM Research, and in this article I aim to provide an overview of this emerging field of AI and its applications in business settings.
What Are Foundation Models?
Large language models such as ChatGPT belong to a broader category known as foundation models. The term was coined by a team at Stanford University who observed that the field of AI was converging on a new paradigm. Previously, AI applications were built by training many separate models, each on task-specific labeled data, to perform a specialized function. Foundation models invert that approach: a single model, trained once, serves as a base capability that can drive a wide range of applications and use cases.
Foundation models are trained on vast amounts of unstructured data in a self-supervised manner. In the language domain, for example, the model is fed terabytes of text and learns to predict the next word in a sequence. Because their core skill is generating the next piece of content, foundation models sit within the broader field of generative AI.
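To make that training objective concrete, here is a minimal sketch of next-word prediction in Python. The Hugging Face transformers library and the small GPT-2 checkpoint are illustrative stand-ins chosen for this example; they are not mentioned above and any sufficiently capable causal language model would serve.

```python
# A minimal sketch of next-word prediction, the self-supervised
# objective described above. GPT-2 and the Hugging Face
# `transformers` library are illustrative stand-ins.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "No use wasting any time on"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # logits has shape (batch, sequence_length, vocab_size)
    logits = model(**inputs).logits

# The model's probability distribution over the *next* token
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode(int(token_id))), float(score))
```

Running this prints the five tokens the model considers most likely to come next; during pre-training, the model's parameters are updated so that the actual next word in the corpus receives high probability.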
The Power of Generative AI
Generative AI focuses on generating new content, such as the next word in a sentence. Although foundation models are trained on a generative objective, they can also be tuned to perform traditional natural language processing (NLP) tasks like classification or named-entity recognition. Tuning introduces a small amount of labeled data to update the model's parameters for a specific task. Even in domains where little or no labeled data exists, foundation models can still be put to work through a technique called prompt engineering, where a carefully worded prompt steers the model into completing a task without any parameter updates, as the sketch below illustrates.
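The following is a hedged sketch of prompt engineering: asking a generative model to act as a sentiment classifier purely through its prompt, with zero labeled examples and zero parameter updates. The model name, prompt wording, and review text are all illustrative assumptions; a small base model like GPT-2 may answer unreliably, and larger instruction-tuned models follow such prompts far more dependably.

```python
# Prompt engineering sketch: zero-shot classification via prompting.
# The model, prompt wording, and example review are illustrative
# choices, not a prescription from the article.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

review = "The product arrived late and the packaging was damaged."
prompt = (
    "Classify the sentiment of the review as positive or negative.\n"
    f"Review: {review}\n"
    "Sentiment:"
)

# Generate only a couple of tokens: just enough for the label
output = generator(prompt, max_new_tokens=2, do_sample=False)
print(output[0]["generated_text"][len(prompt):].strip())
```

The key design point is that the task specification lives entirely in the text of the prompt, which is what makes this approach attractive when labeled training data is scarce.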
Advantages of Foundation Models
The primary advantage of foundation models is performance. Because they are pre-trained on terabytes of data, they significantly outperform models trained from scratch on small, task-specific datasets. They also deliver productivity gains: far less labeled data is needed to reach a working task-specific model, making development easier and faster.
Challenges and Disadvantages
Despite these advantages, foundation models come with real challenges. The compute cost of training them is exceptionally high, putting from-scratch development out of reach for most smaller enterprises. Inference is also expensive, often requiring multiple GPUs at a time just to host a single model. Trustworthiness is another critical issue: because these models are trained on vast amounts of unstructured data scraped from the internet, biases, hate speech, and other toxic content can be embedded in what they learn. The exact datasets used are often undisclosed, which makes these risks harder to audit.
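Some back-of-the-envelope arithmetic shows why inference alone can demand several GPUs. The parameter count, numeric precision, and GPU memory size below are illustrative assumptions rather than figures from the text, but they capture the scale of the problem.

```python
# Rough arithmetic for the memory footprint of large-model inference.
# All figures below are illustrative assumptions.
params = 175e9          # e.g., a 175-billion-parameter model
bytes_per_param = 2     # 16-bit (fp16/bf16) weights
gpu_memory_gb = 80      # e.g., one 80 GB accelerator

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: {weights_gb:.0f} GB")        # 350 GB
print(f"GPUs needed just for weights: {weights_gb / gpu_memory_gb:.1f}")
```

Even before accounting for activations and the key-value cache that generation requires, the weights of a model this size exceed the memory of any single GPU, so the model must be sharded across several devices.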
Applications Beyond Language
Foundation models are not limited to language. They have been successfully applied in other domains, such as vision and code. For instance, DALL-E 2 generates custom images from text descriptions, while tools like Copilot assist with code completion. IBM is actively innovating across multiple domains, including language, vision, code, chemistry, and Earth science. Products like Watson Assistant, Maximo Visual Inspection, and Project Wisdom exemplify IBM's commitment to leveraging foundation models for enterprise value.
Closing Thoughts
The potential of foundation models and generative AI is immense, but so are the challenges. IBM is dedicated to improving the efficiency, trustworthiness, and reliability of these models to make them more accessible and relevant in business settings. To learn more about IBM’s advancements in this field, explore the resources linked below.