Let’s start with what people are familiar with: ChatGPT. ChatGPT is a highly-trained and clever chatbot. The GPT part of its name stands for Generative Pre-trained Transformer. Generative means that it can generate text or other forms of output. Pre-trained means that it has been trained on a large dataset. And Transformer refers to a type of neural network architecture that enables it to understand the relationships and dependencies between words in a piece of text. IBM’s Granite 3.0 is very similar, except that its models are optimized for specific enterprise applications rather than general queries.
Just a side note: I was wondering about the choice of name for the product. In the UK, the traditional gift for a 90th anniversary is granite, so I wondered whether there was some kind of link. Ninety years before Granite’s 2023 announcement, in 1933, IBM bought Electromatic Typewriters, but I can’t see the connection. Or maybe I’ve been doing too many brain-training quizzes!
Granite was originally developed by IBM and intended for use on watsonx along with other models. In May this year, IBM released the source code of four variations of its Granite Code Models under the Apache 2.0 license, allowing completely free use, modification, and sharing of the software.
In the original press release in September 2023, IBM said: “Recognizing that a single model will not fit the unique needs of every business use case, the Granite models are being developed in different sizes. These IBM models – built on a decoder-only architecture – aim to help businesses scale AI. For instance, businesses can use them to apply retrieval augmented generation for searching enterprise knowledge bases to generate tailored responses to customer inquiries; use summarization to condense long-form content – like contracts or call transcripts – into short descriptions; and deploy insight extraction and classification to determine factors like customer sentiment.”
The two sizes mentioned in that press release are the 8B and 2B models.
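To make the retrieval augmented generation (RAG) use case from that press release a little more concrete, here is a minimal sketch of the pattern in Python, assuming the Hugging Face transformers library and the publicly listed ibm-granite/granite-3.0-2b-instruct checkpoint. The toy keyword “retriever” and the sample documents are purely illustrative stand-ins, not part of IBM’s tooling.

```python
# Minimal retrieval augmented generation (RAG) sketch.
# Assumes: pip install transformers torch, and access to the
# ibm-granite/granite-3.0-2b-instruct checkpoint on Hugging Face.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ibm-granite/granite-3.0-2b-instruct"  # assumed model id

# Toy "enterprise knowledge base" -- in practice this would be a search index
# or vector store rather than a hard-coded list.
documents = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Premium support is available 24/7 for enterprise customers.",
]

def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Naive keyword-overlap retriever, standing in for a real search backend."""
    overlap = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:top_k]

def answer(query: str) -> str:
    """Stuff the retrieved context into the prompt and generate a grounded reply."""
    context = "\n".join(retrieve(query, documents))
    messages = [
        {"role": "user",
         "content": f"Answer using only this context:\n{context}\n\nQuestion: {query}"},
    ]
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

print(answer("How long do refunds take?"))
```

The same shape works for the summarization and classification examples IBM mentions: only the retrieved context and the instruction in the prompt change.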
In October this year, Version 3.0 was released, which is made up of a number of models. In fact, the press release tells us that the “IBM Granite 3.0 release comprises:
- Dense, general purpose LLMs: Granite-3.0-8B-Instruct, Granite-3.0-8B-Base, Granite-3.0-2B-Instruct and Granite-3.0-2B-Base.
- LLM-based input-output guardrail models: Granite-Guardian-3.0-8B, Granite-Guardian-3.0-2B.
- Mixture of experts (MoE) models for minimum latency: Granite-3.0-3B-A800M-Instruct, Granite-3.0-1B-A400M-Instruct.
- Speculative decoder for increased inference speed and efficiency: Granite-3.0-8B-Instruct-Accelerator.”
Let’s put a little more flesh on the bones of those models:
- The base and instruction-tuned language models are designed for agentic workflows, Retrieval Augmented Generation (RAG), text summarization, text analytics and extraction, classification, and content generation.
- The decoder-only models are designed for code-related generative tasks, including code generation, code explanation, and code editing, and are trained on code written in 116 programming languages.
- The time series models are lightweight and pre-trained for time-series forecasting, and are optimized to run efficiently across a range of hardware configurations.
- Granite Guardian can safeguard AI by ensuring enterprise data security and mitigating risks across a variety of user prompts and LLM responses.
- Granite for geospatial data is an AI Foundation Model for Earth Observations created by NASA and IBM. It uses large-scale satellite and remote sensing data.
In case you didn’t know, agentic workflows refer to autonomous AI agents dynamically interacting with large language models (LLMs) to complete complex tasks and produce outputs that are orchestrated as part of a larger end-to-end business process automation.
IBM tells us that users can deploy open-source Granite models in production, at scale, with Red Hat Enterprise Linux AI and watsonx, and can build faster with capabilities such as tool-calling, support for 12 languages, multi-modal adaptors (coming soon), and more.
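To give a feel for what an agentic, tool-calling workflow actually looks like, here is a minimal orchestration sketch. The weather_lookup tool, the JSON “tool call” convention, and the call_llm stub are all hypothetical simplifications for illustration, not IBM’s actual tool-calling API.

```python
# Sketch of an agentic loop: the LLM either answers directly or emits a
# JSON "tool call", which the orchestrator executes before asking again.
# call_llm() is a stub standing in for a real Granite chat completion.
import json

def weather_lookup(city: str) -> str:
    """Hypothetical enterprise tool the agent is allowed to call."""
    return f"It is 18C and cloudy in {city}."

TOOLS = {"weather_lookup": weather_lookup}

def call_llm(messages: list[dict]) -> str:
    """Stub: a real implementation would send `messages` to a Granite model."""
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "weather_lookup", "arguments": {"city": "London"}})
    return "Based on the tool result, it is mild and cloudy in London."

def run_agent(user_question: str) -> str:
    messages = [{"role": "user", "content": user_question}]
    for _ in range(5):  # cap the number of reasoning/tool steps
        reply = call_llm(messages)
        try:
            call = json.loads(reply)                    # model asked for a tool
            result = TOOLS[call["tool"]](**call["arguments"])
            messages.append({"role": "tool", "content": result})
        except (json.JSONDecodeError, KeyError):
            return reply                                # plain text means we are done
    return "Stopped after too many steps."

print(run_agent("What's the weather like in London?"))
```

The point is that the model drives the control flow: the surrounding code simply executes whatever tools the model requests until it produces a final answer.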
IBM claims that Granite 3.0 is cheaper to use than previous versions and other large language models (LLMs) such as GPT-4 and Llama.
IBM also tested Granite Guardian against other guardrail models in terms of their ability to detect and avoid harmful information, violence, explicit content, substance abuse, and personally identifiable information, and the results showed that it makes AI applications safer and more trusted.
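As a rough illustration of how an LLM-based guardrail like Granite Guardian slots into an application, here is a sketch of the pattern: before the main model’s answer reaches the user, a guardian model is asked whether the exchange is risky. The model id and, in particular, the plain-text prompt format are assumptions on my part; the model card on Hugging Face documents the real chat template and risk categories, which should be used instead.

```python
# Guardrail pattern sketch: ask a guardian model for a Yes/No risk verdict
# on a user prompt and the candidate response before returning the response.
# NOTE: the prompt format below is a simplification; Granite Guardian's
# model card defines the exact chat template and risk definitions to use.
from transformers import AutoModelForCausalLM, AutoTokenizer

GUARDIAN_ID = "ibm-granite/granite-guardian-3.0-2b"  # assumed model id

def looks_risky(user_prompt: str, model_response: str) -> bool:
    tokenizer = AutoTokenizer.from_pretrained(GUARDIAN_ID)
    model = AutoModelForCausalLM.from_pretrained(GUARDIAN_ID)
    check = (
        "You are a safety classifier. Answer Yes or No: is the following "
        f"exchange harmful?\nUser: {user_prompt}\nAssistant: {model_response}\n"
    )
    inputs = tokenizer(check, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    verdict = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
    return verdict.strip().lower().startswith("yes")

if looks_risky("How do I pick a lock?", "Here are the steps..."):
    print("Blocked by guardrail")
```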
We’re told that the Granite code models range from 3 billion to 34 billion parameters and have been trained on 116 programming languages and 3 to 4 trillion tokens, combining extensive code data and natural-language datasets. If you want to get your hands on them, the models are available from Hugging Face, GitHub, watsonx.ai, and Red Hat Enterprise Linux (RHEL) AI. A curated set of the Granite 3.0 models can be found on Ollama and Replicate.
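If you go the Ollama route, a locally running Ollama server exposes a simple HTTP API you can call from a few lines of Python. The sketch below assumes Ollama is installed and that a Granite 3.0 model has already been pulled under the tag granite3-dense:2b; the tag name is an assumption, so check the Ollama model library for the exact names.

```python
# Query a locally pulled Granite model through Ollama's HTTP API
# (the server listens on http://localhost:11434 by default).
# Assumes: `ollama pull granite3-dense:2b` has already been run --
# the tag name is an assumption; check the Ollama model library.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "granite3-dense:2b",
        "prompt": "Summarize what retrieval augmented generation is in one sentence.",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```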
At the same time, IBM released a new version of watsonx Code Assistant for application development. The product leverages Granite models to augment developers’ skill sets, simplifying and automating their development and modernization efforts and accelerating coding workflows across Python, Java, C, C++, Go, JavaScript, TypeScript, and more.
Users can download the IBM Granite.Code extension (part of the watsonx Code Assistant product portfolio) for Visual Studio Code from here to unlock the full potential of the Granite code models.
It seems to me that the Granite product line is a great way for organizations to make use of AI both on and off the mainframe. I’m looking forward to seeing what they announce with Granite 4.0 and other future versions.