Mainframe Update: December 2023

Sunday 10 December 2023

GSE Conference – what I learned on Wednesday

Last time, I was talking about the sessions I attended on the Tuesday at the excellent GSE UK Conference in the first week of November. This time, I want to tell you what I learned from the sessions I attended on the Wednesday.

At the 10:15 session, I was speaking in the AI stream, looking at the brain and what we mean by ordinary intelligence, before people start talking about artificial intelligence. The session was well received, and I was asked to give it again at lunchtime to some people who had been unable to attend.

After the coffee break, I saw IBM’s Lih M Wang’s presentation entitled, “AI for IT Resiliency Use Cases”. She started by suggesting that we are in a new era of computing. The challenges facing IT Operations include:

Digital transformation with exponential business growth / cost. There are billions of transactions per day with unpredictable resource demands. And there are millions of Log and SMF records per day, but which indicators shouldn’t be missed?
Complexity of business applications across hybrid cloud. There are multiple components across platforms making it difficult to isolate problems. And there’s the impact of any changes, eg hardware / software / application changes.
Knowledge and skills gap for IBM zSystems. There are limited numbers of cross-domain subject matter experts (SMEs) compared with the number of systems managed. Plus, people need to know about the topology, inter-relationships, and dependencies.

Lih said that customers are asking: “Can AI-ML help?” They are looking for early warnings or sick symptoms. They also need to identify anomalous behaviour. Anomaly Analytics on IBM zSystems can: transform unstructured (SYSLOG) data into insights; turn performance metric (SMF) data into operational dashboards; and accelerate problem prevention by leveraging Machine Learning. IBM’s maxim is: Proactive, Prevent, Optimize.

Lih explained the difference between threshold monitoring and a Machine Learning (ML) baseline model. Basically, threshold monitoring is static, whereas ML can recognize what’s significant at much lower levels of activity. The process has three steps. Firstly, data is collected using IBM Z Anomaly Analytics with Watson (ZAA), Machine Learning, and Enterprise Data Warehouse (EDW). Secondly, the model is trained while typical workloads are running. Thirdly, the metrics are scored by comparing them with the models of expected behaviour. Visualization of the metrics shows how well the model runs on its own. Lih then showed how this worked with various scorecards for CICS, Db2, IMS, and MQ. Using colours, it becomes very easy to see where anomalies are occurring.
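To make that contrast concrete, here is a minimal Python sketch of the general idea. It is not how ZAA works internally: the function names, the per-hour baseline, and the sample figures are all my own assumptions for illustration. A static threshold uses one fixed limit, whereas the baseline approach learns what is normal for each hour from typical workloads and then scores new metrics by how far they stray from that.

```python
from statistics import mean, stdev


def static_threshold_alerts(values, limit):
    """Static monitoring: flag positions where a value exceeds one fixed limit."""
    return [i for i, value in enumerate(values) if value > limit]


def train_baseline(history):
    """Learn expected behaviour from typical workloads.

    history maps an hour of the day to a list of observed values
    (hypothetical transactions per second). Each hour needs at least
    two observations so that a standard deviation can be calculated.
    """
    return {hour: (mean(obs), stdev(obs)) for hour, obs in history.items()}


def anomaly_score(baseline, hour, value):
    """Score a new metric as its distance from the expected value,
    measured in standard deviations for that hour."""
    expected, spread = baseline[hour]
    if spread == 0:
        return 0.0 if value == expected else float("inf")
    return abs(value - expected) / spread


if __name__ == "__main__":
    # Hypothetical training data: a quiet overnight hour and a busy afternoon hour.
    history = {
        3: [98, 102, 100, 101, 99],
        14: [4900, 5100, 5000, 5050, 4950],
    }
    baseline = train_baseline(history)

    # 900 at 03:00 is far below any sensible static limit, so a fixed
    # threshold of 6000 stays silent, but it is wildly abnormal for that hour.
    print(static_threshold_alerts([900, 5050], limit=6000))
    print(anomaly_score(baseline, 3, 900))
    print(anomaly_score(baseline, 14, 5050))
```

The point of the design is that the same reading can be perfectly normal at peak time and highly suspicious overnight, which a single fixed limit cannot express.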

Real-time insights mean that the system will generate events when specified metrics exceed an anomaly threshold. Events will be shown in the main Problem Insights panel along with other events such as key single message events. Customers can select which of the KPIs to monitor for events and the threshold to use. Events can be forwarded to event monitors such as Watson AIOps. Selecting the Evidence column will take you to the scorecard with that KPI and time period open. Lih went on to give some customer examples.
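As a rough illustration of that event logic, here is a short Python sketch. Again, this is my own simplification and not the product's API: the AnomalyEvent class, the generate_events function, and the KPI names are all hypothetical. It takes anomaly scores for each time period, keeps only the KPIs that have been selected for monitoring, and raises an event wherever a score exceeds the chosen threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnomalyEvent:
    """A hypothetical event record: which KPI, when, and how anomalous."""
    kpi: str
    interval: str
    score: float


def generate_events(scored_intervals, monitored_kpis, threshold):
    """Return an event for every monitored KPI whose score exceeds the threshold.

    scored_intervals is a list of (interval, {kpi: score}) pairs.
    """
    events = []
    for interval, scores in scored_intervals:
        for kpi, score in scores.items():
            if kpi in monitored_kpis and score > threshold:
                events.append(AnomalyEvent(kpi, interval, score))
    return events


if __name__ == "__main__":
    # Hypothetical scores for two five-minute periods.
    scored = [
        ("10:00", {"cics_response_time": 0.4, "db2_lock_waits": 4.2}),
        ("10:05", {"cics_response_time": 5.1, "db2_lock_waits": 6.0}),
    ]
    # Only the CICS KPI has been selected, so the Db2 scores raise no events.
    for event in generate_events(scored, {"cics_response_time"}, threshold=3.0):
        print(event)
```

Keeping the KPI and the time period on each event is what would let a panel link straight back to the matching scorecard as evidence.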

I wanted to see BMC Software's Dave McCain's presentation called, "How can we use AI and user behaviour for better security monitoring". Unfortunately, I had a meeting. I hope to catch it another time.

The last AI stream session of the day was IBM Champion Henri Kuiper’s Jeopardy game. Jeopardy is a game that gives you the answer and you have to come up with the right question. For example, “This British mathematician and computer scientist is often considered the father of theoretical computer science and artificial intelligence”. The answer is, “Who is Alan Turing?”. Or, how about, “The 1956 workshop held at Dartmouth College which is often considered the birth of AI as a field”. The answer is, “What is the Dartmouth Workshop?”. Try this one, “This type of Machine Learning algorithm is inspired by the structure and function of the human brain and is used for tasks like image and speech recognition”. The answer is, “What is a neural network?”

There were other questions about unsupervised learning, GPT-3, transfer learning, OpenAI, and so much more about AI and its history.

All in all, the GSE conference provided great education and brilliant company, and it isn’t to be missed. I heartily recommend it to everyone who has an interest in mainframes. See you at the next one!

Sunday 3 December 2023

GSE Conference – what I learned on Tuesday

With 278 sessions across 18 streams, there was a lot of education and training going on across the three and a half days of the GSE UK conference this year. I thought that I’d share some of what I learned while I was there.

The first session I attended on the Tuesday was IBM Champion Henri Kuiper’s session in the AI stream entitled, “AI AI AI What Has Turing Started?”. Henri started by looking at the computers he had owned and how they had developed over the years, and then moved on to mainframe developments. He explained how the Turing test worked, and quoted John McCarthy from the 1950s saying, “Artificial intelligence is the science of making machines do things that would require intelligence if done by humans”. IBM’s Arthur Lee Samuel in 1959 said, “Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort”. Henri talked about Eliza, your personal therapy computer, and much more as Artificial Intelligence (AI) developed. Alain Colmerauer in 1972 developed the Prolog programming language. Henri went on to discuss how AIs work and how they can be trained. He discussed deep learning, reinforcement learning, and Generative Large Language Models (GLLMs). He also explained how transfer learning could be used to avoid training a model from scratch and how it helps to improve the model’s performance on the target task (or domain) by leveraging already existing knowledge. He ended up with Gollems (GLLMs).

After that, I watched Elpida Tzortzatos, an IBM Fellow and CTO AI for IBM zSystems, discuss “AI for Business with Trust and Transparency”.

After lunch I went to a security session with Al Saurette from MainTegrity, who was discussing “Early warning of cyber attacks. Ways to stay ahead of the bad guys”. He highlighted how problems can occur even at sites with the best firewalls and access control because the bad guys can get their hands on stolen credentials, trusted staff can go rogue, and people just sometimes make mistakes. He explained that hacking is a business. If the bad actors find a way in, they'll leave multiple backdoors so that they and others can get back in whenever they want. They may install timebombs in case they get caught at this early stage. They'll compromise backups to prevent recovery of files. They'll take a copy of your data (exfiltration) to sell. And then they'll encrypt your data and send a ransom demand. That, he explained, was why it was so important to always monitor what was going on and quickly identify anything that might damage your data. He called it integrity monitoring and alerting. You could then tell what files were affected, the time interval during which the attack took place, and which userid or job was responsible, and you could check whether that change had been authorized (to avoid any false positives). The people alerted could also check the files line by line using before and after copies of the data. Because the bad actors can encrypt your data, the software needs to identify in the first seconds whether unauthorized encryption activity is taking place and suspend the task. If, later, it’s found that everything is OK, then the task can carry on from where it left off. If it’s not OK, you’ve just been saved from a mass encryption activity, and you won’t be sent a ransom demand. Al Saurette also described other early warning capabilities available and how important their use was. The other benefit of using the tools available was compliance with PCI DSS, NIST, GDPR, and other regulations.
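The basic idea behind integrity monitoring can be sketched in a few lines of Python. This is only my own illustration of the general technique, not a description of MainTegrity's software: the function names and the sample member names are hypothetical, and a real tool would also record who made the change and when. The sketch takes a fingerprint (a SHA-256 hash) of each file as a trusted baseline, reports any file whose current contents no longer match, and then produces a line-by-line comparison of the before and after copies.

```python
import difflib
import hashlib


def fingerprint(text):
    """Return a SHA-256 hash of the contents, used as the file's fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def take_baseline(files):
    """Record a fingerprint for every file. files maps a name to its contents."""
    return {name: fingerprint(text) for name, text in files.items()}


def changed_files(baseline, current_files):
    """List files that were altered, added, or deleted since the baseline."""
    changed = [
        name for name, text in current_files.items()
        if baseline.get(name) != fingerprint(text)
    ]
    changed += [name for name in baseline if name not in current_files]
    return sorted(changed)


def line_diff(before, after):
    """Compare before and after copies line by line."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm="",
    ))


if __name__ == "__main__":
    # Hypothetical members and contents, held in memory to keep the sketch simple.
    before = {"PARMLIB.MEMBER": "A=1\nB=2\n", "PROCLIB.JOB": "STEP1\n"}
    after = {"PARMLIB.MEMBER": "A=1\nB=3\n", "PROCLIB.JOB": "STEP1\n"}

    baseline = take_baseline(before)
    for name in changed_files(baseline, after):
        print(name)
        print("\n".join(line_diff(before[name], after[name])))
```

Whether a reported change is an attack or an authorized update is then a matter of checking it against change records, which is the step that avoids false positives.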

Lastly on the Tuesday, I watched “Jekyll and Hyde of Generative AI”, presented by Venkat Balabhadrapatruni, a Distinguished Engineer with Broadcom MSD. He started by saying that understanding the use case or business need should drive the right approach – whether that's using Artificial Intelligence, machine learning, deep learning, or generative AI. He explained that generative AI is a branch of artificial intelligence (AI) that focuses on creating content, data, or outputs, based on patterns learned from large volumes of training data.

There's a booming and evolving generative AI ecosystem. He suggested that generative AI will transform how organizations work over the next five years. IP, security, privacy, and ethical concerns will drive a vast majority of large enterprise customers to adopt well-governed on-prem large language models (LLMs). Organizations will need assistance (LLM selection, in-house training, integration, etc) to fully capitalize on the Gen AI value proposition. He then ran through a number of uses of AI.

Venkat Balabhadrapatruni suggested that the positive aspects of Gen AI were: creativity and innovation; efficiency and automation; content summarization; language translation; and medical and scientific advancements. He then moved on to some challenges and ethical concerns, such as: privacy concerns; hallucinations; resource intensity; regulatory challenges; and lack of transparency. Lastly, he listed the potential dangers and misuses, eg: deepfakes and manipulation; intentional misuse; and bias, misinformation, and fairness. The key takeaways from his presentation were: start with the business need, not with the technology; recognize that generative AI is not the only AI; understand the data and algorithms; remember that the onus is on the user to validate the responses from any generative AI technology; and bear in mind that the responses from generative AI are only as good as the training data and the specificity of the prompt.

I’ll look at some of the sessions from Wednesday next time.