Sunday, 8 December 2024

Cyber targets for 2025

Let us imagine that there is a room somewhere in Russia (but it could be anywhere else hostile to the West) and it’s full of hackers plotting their attacks for 2025. You can imagine that they are sharing stories of their successes in 2024: how they have targeted people with phishing emails and tricked them into opening or unwittingly downloading malware that has given the hackers access not only to that company’s servers, but to those of every other company in the supply chain.

The next hacker speaks up, explaining how he has got round the security of cloud providers and managed to get into a variety of organizations that way. He proudly adds that he hasn’t even exploited some of those breaches yet. They are now easy targets for the New Year.

A third hacker explains how he managed to gain access to a security update for a frequently used piece of software, and how he added a back door that no-one spotted. So, when everyone downloaded the update and patched the vulnerability, they introduced a back door that only he knew about. He suggests that this time next year he will be rich from all the ransoms he is going to collect.

Another hacker jumps up and explains that he has been using AI to automate ransomware attacks, and that he is making lots of dosh from the people who pay him for his Ransomware as a Service software – sometimes people with very little IT knowledge – and then use it to attack companies that have upset them in some way.

Lots of other people want to speak up with stories of how they have attacked companies and made money, but everyone stops speaking as an old general gets to his feet. He looks very stern but smiles as he starts to speak. “Comrades”, he says, “you have all done very well attacking companies in the West.” He pauses, and his face takes on a sternness that has scared many a junior officer. He continues, “The problem is this: we have not defeated the West. What I need you to do is find some way to bring down the whole infrastructure of western society. Can you do that?”

The hackers look round at each other, until one speaks up. “Capitalist society depends on capital.” The audience is not overly impressed by the obviousness of the comment, and there is much murmuring, but the hacker continues, “Why don’t we attack the banks and all the other financial institutions in North America and Europe? If they don’t have access to money, everything else will come to a stop.” The crowd nods in agreement. Some make additional useful comments to each other.

“How do we do that?” asks the general. “We attack the mainframes that are used by most of these organizations”, replies the hacker. And that’s what they do. Attacks by people who understand Windows and Linux continue in all their forms, but a large tranche of the technical people are given the job of understanding how mainframes work and where their vulnerabilities lie. After all, the majority of financial institutions use mainframes. A subgroup is given the task of looking at the employees at mainframe sites and identifying which ones could be manipulated into giving access to these fintech mainframes. They are looking for staff with drug habits, financial problems, or other issues that could be used against them. Another group has the task of getting keyloggers onto the laptops of systems programmers at mainframe sites.

A list of potential hacking techniques that have been used before is circulated amongst the hackers, so they can see which ones still work and which are useful for others to try.

They could attack sites using CICS. There are automated tools like CICSpwn available that could be used to identify potential misconfigurations, which the hackers could then use to bypass authentication. They could also take the customer-facing CICS front end and try a simple brute-force attack to find a userid and password that would get them into the system.

They could use FTP. Two things need to happen first: keylogger software needs to capture the login credentials of a systems programmer, and a ‘connection getter’ needs to identify where to FTP to. FTP commands can then be used to upload malicious binaries, and the FTP server’s JES interface can be used to submit jobs that execute those binaries.
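
To make that concrete from a defender’s point of view, here’s a minimal sketch, in Python using the standard ftplib module, of what that FTP-to-JES path looks like once credentials have been captured. The host name, credentials, and file names are all invented, but SITE FILETYPE=JES is standard z/OS FTP server behaviour, and it’s exactly the sort of thing that needs to be restricted and monitored:

  # A sketch of the FTP-to-JES path that sites need to lock down.
  # Host name, credentials, and file names are hypothetical.
  from ftplib import FTP

  ftp = FTP("mvs.example.com")          # target found by the 'connection getter'
  ftp.login("SYSPROG1", "captured-pw")  # credentials captured by the keylogger
  ftp.sendcmd("SITE FILETYPE=JES")      # switch the session to job-submission mode
  with open("payload.jcl", "rb") as jcl:
      # With FILETYPE=JES, a 'stored' file is submitted to JES as a job
      # rather than written to a dataset, hence the need to restrict it.
      ftp.storlines("STOR PAYLOAD.JCL", jcl)
  ftp.quit()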

They could use TN3270 emulation software for their attack. Provided they have some potential userids, they could try password spraying, ie trying a few commonly used passwords against every userid on the system.
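
The good news for defenders is that spraying has a recognizable signature: failed logons for lots of different userids from the same source in a short window. Here’s a minimal sketch of that detection logic in Python; the log format and the thresholds are hypothetical rather than taken from any particular security product:

  # A sketch of password-spray detection: flag any source address that
  # generates failed logons for many distinct userids in a short window.
  # The input format and thresholds are hypothetical.
  from collections import defaultdict
  from datetime import datetime, timedelta

  WINDOW = timedelta(minutes=10)
  USERID_THRESHOLD = 15

  def detect_spraying(failed_logons):
      """failed_logons: iterable of (timestamp, source_ip, userid) tuples."""
      by_source = defaultdict(list)
      for ts, ip, userid in sorted(failed_logons):
          by_source[ip].append((ts, userid))
      suspects = set()
      for ip, attempts in by_source.items():
          for i, (start, _) in enumerate(attempts):
              userids = {u for t, u in attempts[i:] if t - start <= WINDOW}
              if len(userids) >= USERID_THRESHOLD:
                  suspects.add(ip)
                  break
      return suspects

  # Example: one address failing against 20 userids in 20 seconds is flagged.
  now = datetime.now()
  spray = [(now + timedelta(seconds=s), "198.51.100.7", f"USER{s:03}")
           for s in range(20)]
  print(detect_spraying(spray))  # {'198.51.100.7'}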

NJE allows one trusted mainframe to send a job to another mainframe that it’s connected to. Hackers could use NJE to spoof a trusted node and submit jobs to the other mainframe, gaining access to it.

Then there are potential vulnerabilities in Linux and other non-IBM software (like Ansible, Java, etc) that runs on mainframes.

Other techniques are available, but it’s not the function of this blog to make the job of nation-state hackers easier. It is the job of this blog to ensure that every mainframe site is doing everything it can to secure itself against all forms of attack, that it has software installed that can alert staff at the earliest opportunity that an attack has started, and that its defence software can suspend any suspect jobs as soon as possible.

Meanwhile, meetings like the one I’ve envisaged are probably going on, and mainframe-using companies in the West are going to be the targets in 2025. Don’t let yours be one of them.

Sunday, 1 December 2024

Rock solid AI – Granite on a mainframe

Let’s start with something people are familiar with: ChatGPT. ChatGPT is a highly-trained and clever chatbot. The GPT part of its name stands for Generative Pre-trained Transformer. Generative means that it can generate text or other forms of output. Pre-trained means that it has been trained on a large dataset. And Transformer refers to a type of neural network architecture enabling it to understand the relationships and dependencies between words in a piece of text. IBM’s Granite 3.0 is very similar to ChatGPT, except that it is optimized for specific enterprise applications rather than general queries.

Just a side note, I was wondering about the choice of name for the product. In the UK, the traditional gift for a 90th anniversary is granite. I just wondered whether there was some kind of link. In 1933 IBM bought Electromatic Typewriters, but I can’t see the link. Or maybe I’ve been doing too many brain-training quizzes!

Granite was originally developed by IBM and intended for use on watsonx alongside other models. In May this year, IBM released the source code of four variations of the Granite Code Models under the Apache 2.0 licence, allowing completely free use, modification, and sharing of the software.

In the original press release in September 2023, IBM said: “Recognizing that a single model will not fit the unique needs of every business use case, the Granite models are being developed in different sizes. These IBM models – built on a decoder-only architecture – aim to help businesses scale AI. For instance, businesses can use them to apply retrieval augmented generation for searching enterprise knowledge bases to generate tailored responses to customer inquiries; use summarization to condense long-form content – like contracts or call transcripts – into short descriptions; and deploy insight extraction and classification to determine factors like customer sentiment.”
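
Retrieval augmented generation (RAG), mentioned in that quote, is worth a quick illustration. The idea is to retrieve the passages of an enterprise knowledge base that are most relevant to a question and prepend them to the prompt, so the model answers from the company’s own data rather than from memory. Here’s a toy sketch in Python; real systems use vector embeddings rather than the crude word-overlap scoring shown here, and the knowledge base is invented:

  # A toy sketch of retrieval augmented generation (RAG). Real systems use
  # vector embeddings for relevance; word overlap stands in for that here.
  KNOWLEDGE_BASE = [
      "Refunds are processed within 14 days of the return being received.",
      "Premium accounts include 24/7 telephone support.",
      "Orders over £50 qualify for free delivery.",
  ]

  def score(question: str, passage: str) -> int:
      # Crude relevance: count the words the question and passage share.
      return len(set(question.lower().split()) & set(passage.lower().split()))

  def build_prompt(question: str, top_k: int = 2) -> str:
      ranked = sorted(KNOWLEDGE_BASE, key=lambda p: score(question, p), reverse=True)
      context = "\n".join(ranked[:top_k])
      # The assembled prompt would then be sent to a model such as Granite.
      return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

  print(build_prompt("How quickly are refunds processed?"))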

The two sizes mentioned in that press release are the 8B and 2B models.

In October this year, Version 3.0 was released, which is made up of a number of models. In fact, the press release tells us that the IBM Granite 3.0 release comprises:

  • Dense, general purpose LLMs: Granite-3.0-8B-Instruct, Granite-3.0-8B-Base, Granite-3.0-2B-Instruct, and Granite-3.0-2B-Base.
  • LLM-based input-output guardrail models: Granite-Guardian-3.0-8B and Granite-Guardian-3.0-2B.
  • Mixture of experts (MoE) models for minimum latency: Granite-3.0-3B-A800M-Instruct and Granite-3.0-1B-A400M-Instruct.
  • Speculative decoder for increased inference speed and efficiency: Granite-3.0-8B-Instruct-Accelerator.

Let’s put a little more flesh on the bones of those models:

  • The base and instruction-tuned language models are designed for agentic workflows, Retrieval Augmented Generation (RAG), text summarization, text analytics and extraction, classification, and content generation.
  • The decoder-only models are designed for code generative tasks, including code generation, code explanation, and code editing, and are trained with code written in 116 programming languages.
  • The time series models are lightweight and pre-trained for time-series forecasting, and are optimized to run efficiently across a range of hardware configurations.
  • Granite Guardian can safeguard AI by ensuring enterprise data security and mitigating risks across a variety of user prompts and LLM responses.
  • Granite for geospatial data is an AI Foundation Model for Earth Observations created by NASA and IBM. It uses large-scale satellite and remote sensing data.

In case you didn’t know, agentic workflows refer to autonomous AI agents dynamically interacting with large language models (LLMs) to complete complex tasks and produce outputs that are orchestrated as part of larger end-to-end business process automation.
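
In other words, rather than answering in one shot, the model is allowed to request tool calls and see their results until the task is done. Here’s a stripped-down sketch of that loop in Python; llm_complete() is a stub standing in for a real Granite endpoint, and the tool and the CALL/FINAL protocol are invented for illustration:

  # A stripped-down agentic loop: the model either requests a tool call or
  # gives a final answer. llm_complete() is a stub standing in for a real
  # model endpoint; the tool and the CALL/FINAL protocol are invented.
  import json

  def llm_complete(context: str) -> str:
      # Stub: a real agent would call the model here.
      if "TOOL_RESULT" in context:
          return "FINAL: The balance on account 12345 is £250.00."
      return 'CALL get_balance {"account": "12345"}'

  TOOLS = {
      "get_balance": lambda account: f"£250.00 on account {account}",
  }

  def run_agent(task: str, max_steps: int = 5) -> str:
      context = task
      for _ in range(max_steps):
          reply = llm_complete(context)
          if reply.startswith("CALL "):
              name, _, args = reply[len("CALL "):].partition(" ")
              result = TOOLS[name](**json.loads(args))
              context += f"\nTOOL_RESULT {name}: {result}"  # feed result back
          else:
              return reply.removeprefix("FINAL: ")          # model is finished
      return "Gave up after too many steps."

  print(run_agent("What is the balance on account 12345?"))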

IBM tells us that users can deploy the open-source Granite models in production, at scale, with Red Hat Enterprise Linux AI and watsonx, and can build faster with capabilities such as tool-calling, support for 12 natural languages, and multi-modal adaptors (coming soon).

IBM claims that Granite 3.0 is cheaper to use than previous versions and other large language models (LLMs), such as GPT-4 and Llama.

IBM also tested Granite Guardian against other guardrail models in terms of their ability to detect and avoid harmful information, violence, explicit content, substance abuse, and personally identifiable information, showing that it makes AI applications safer and more trusted.

We’re told that the Granite code models range from 3 billion to 34 billion parameters and have been trained on code written in 116 programming languages, using 3 to 4 trillion tokens combining extensive code data and natural language datasets. If you want to get your hands on them, the models are available from Hugging Face, GitHub, watsonx.ai, and Red Hat Enterprise Linux (RHEL) AI. A curated set of the Granite 3.0 models can be found on Ollama and Replicate.
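
To give a flavour of how little code is involved, here’s roughly what running one of the instruct models from Hugging Face with the transformers library looks like. The repo id shown is my best guess at IBM’s naming, so check the ibm-granite page on Hugging Face for the current list:

  # A minimal sketch of running a Granite 3.0 instruct model locally.
  # The repo id is assumed from IBM's naming; verify it on Hugging Face.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "ibm-granite/granite-3.0-2b-instruct"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id)  # add device_map="auto" for GPUs

  messages = [{"role": "user", "content": "Summarize why mainframes matter to banks."}]
  inputs = tokenizer.apply_chat_template(
      messages, add_generation_prompt=True, return_tensors="pt"
  )
  output = model.generate(inputs, max_new_tokens=100)
  print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

If you’d rather not write any code, the curated Ollama builds (published under names like granite3-dense at the time of writing) can be pulled and run from the command line.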

At the same time, IBM released a new version of watsonx Code Assistant for application development. The product leverages Granite models to augment developer skill sets, simplifying and automating their development and modernization efforts, and accelerating coding workflows across Python, Java, C, C++, Go, JavaScript, TypeScript, and more.

Users can download the IBM Granite.Code extension (part of the watsonx Code Assistant product portfolio) for Visual Studio Code to unlock the full potential of the Granite code models.

It seems to me that the Granite product line is a great way for organizations to make use of AI both on and off the mainframe. I’m looking forward to seeing what they announce with Granite 4.0 and other future versions.