What is Machine Learning for z/OS?
Machine learning is a type of
artificial intelligence (AI) that gives computers the ability to learn
without being explicitly programmed. Using
algorithms that iteratively learn from data, machine learning allows computers
to find hidden insights in the data without being told where
to look.
Machine learning systems can find correlations in data and recognize
patterns to provide early detection and to predict events before they
happen. This can mean early detection of
healthcare conditions, prediction of factors that lead to better patient
adherence or better clinical outcomes, or algorithms to reach new heights of
personalized care and tailored treatment protocols.
Machine learning projects
generally include tasks such as data cleansing and ingestion, data feature
engineering and selection, data transformation, model training, model
evaluation, model deployment, scoring, re-evaluation, and re-training (feedback
loop). Many of these tasks need to be performed iteratively to get the desired
results. Each task requires heavy engagement from experienced analytics
personas across the organization, from data scientists and software or data
engineers to application developers. As such, a machine learning project
usually takes weeks to months before a usable model can be generated and
deployed in production.
IBM Machine Learning for z/OS
(Machine Learning for z/OS) is an end-to-end enterprise machine learning
platform that helps to simplify and significantly reduce the time for
creation and deployment of machine learning models by:
- Integrating all the tools and functions needed for machine learning and automating the machine learning workflow.
- Providing a platform with freedom of choice and productivity for better collaboration across different personas, including data scientists, data engineers, business analysts and application developers, for a successful machine learning project.
- Infusing cognitive capabilities into the machine learning workflow to help determine when model results deteriorate and need to be tuned, and to provide suggestions for updates or changes.
Machine
learning is needed where business rules are rapidly changing, or where
application development can’t keep pace with changes that need to be made, or
where applications need to be continually tuned. Instead of writing many
complex business rules, you select an appropriate
algorithm and parameters to build a model. Once the model is created, it can
be trained on historical data and deployed to recognize patterns and make future
predictions. Predictions are retained
and compared to actual results as part of model monitoring. As the environment
evolves, model results may deteriorate, at which time the data scientist can
choose to retrain the model with stored feedback data. By simplifying model
management, Machine Learning for z/OS reduces the amount of maintenance in an
application because the model is "aware" and always learning,
becoming smarter over time.
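The monitoring loop described above (retain predictions, compare them to actual results, retrain when results deteriorate) can be sketched in a few lines. This is a minimal illustration, not the Machine Learning for z/OS implementation: the `ScoredRecord` type, the accuracy metric, the baseline of 0.90 and the tolerance of 0.05 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredRecord:
    predicted: int  # label the deployed model predicted at scoring time
    actual: int     # outcome observed later and stored as feedback data


def accuracy(records: List[ScoredRecord]) -> float:
    """Fraction of retained predictions that matched the actual result."""
    if not records:
        raise ValueError("no feedback records to evaluate")
    hits = sum(1 for r in records if r.predicted == r.actual)
    return hits / len(records)


def needs_retraining(records: List[ScoredRecord],
                     baseline_accuracy: float,
                     tolerance: float = 0.05) -> bool:
    """Flag the model when accuracy falls more than `tolerance` below baseline.

    Both thresholds are hypothetical; a real project would pick its own metric.
    """
    return accuracy(records) < baseline_accuracy - tolerance


# Hypothetical feedback data: two of four predictions were correct.
feedback = [
    ScoredRecord(predicted=1, actual=1),
    ScoredRecord(predicted=0, actual=0),
    ScoredRecord(predicted=1, actual=0),
    ScoredRecord(predicted=0, actual=1),
]

if needs_retraining(feedback, baseline_accuracy=0.90):
    print("Model results have deteriorated; consider retraining.")
```

The decision to retrain stays with the data scientist; the check only surfaces the deterioration.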
Why is Machine Learning Important to zEnterprise Customers?
Many of our enterprise
customers have expressed an interest in leveraging the latest analytics technology
with the flexibility to deploy on premise, in the cloud or in a hybrid
environment. That said, many of our z
Systems customers are not yet ready to move their most sensitive data to the
cloud. They want to take advantage of
their existing significant investment in infrastructure, minimize costly data
movement and ensure data governance/security. For some customers this will be
their first entry into the machine learning domain. For them we have made this process much
simpler by lowering the bar for development and maintenance of predictive
behavior models. For other customers
who already have extensive data science expertise, we have simplified the
development and maintenance process by providing cognitive expertise to build
behavioral models and automation to maintain those models over time, freeing
up their data developers and data scientists to work on enhancing their
existing models and to bring data science to new areas of the business.
Machine
Learning for z/OS also offers RESTful APIs and programming APIs to perform
tasks such as transactional scoring.
Scoring allows zEnterprise customers to evaluate a transaction against a
machine learning model to determine in real time, for example, the risk of pre-diabetes,
the likelihood of medication adherence/compliance, or the risk of over-payment prior to
claims payment, and to make real-time decisions based on this information
(e.g. elastic drug pricing). This type
of real-time scoring requires access to the actual transactional data, which
means the model scoring engine should be collocated with the transactions to
meet transactional SLAs.
Machine
Learning for z/OS includes the various tools and functions needed to train and
deploy machine learning models and to automate machine learning workflows. It includes collaboration features for
personas such as data scientists and application developers. It also includes capabilities to determine
when models need to be tuned and to recommend changes. Through its web UI, RESTful APIs and
programming APIs, it provides a suite of functions to ingest all types of
zEnterprise data, transform and cleanse the data, train models with a selected
algorithm using the data, evaluate a trained model, select optimal models/algorithms
through the Cognitive Assistant for Data Scientists (CADS) interface, manage
models, deploy models into production, automate feedback to ingest new data and
re-train models, monitor model status and resource utilization, call RESTful APIs
for online scoring with models, and use
machine learning APIs in interactive mode through a data scientist notebook interface.
IBM makes it possible for
customers to satisfy these requirements while benefiting from the latest
analytics advancements, such as Machine Learning for z/OS. They can access z Systems
data in place and combine that data with other sources of information, such as
structured and unstructured data from other systems. They can then build models
to predict customer behavior and make optimal business decisions. And
by accessing live data they can be more agile. This is exactly what our large
customers want to do.
What is the IBM DB2 Analytics Accelerator?
The IBM DB2 Analytics
Accelerator for z/OS (the Accelerator) is a high-performance appliance for DB2
for z/OS that deeply integrates Netezza's balanced, highly parallelized asymmetric
massively parallel processing technology with IBM z Systems technology at the database
kernel level. The Accelerator allows DB2
to offload data-intensive and complex static and dynamic DB2 queries (e.g. data
warehousing, business intelligence, and analytic workloads) to the Accelerator
without any application changes. With the Accelerator, these queries can be
executed significantly faster than was previously possible, while avoiding expensive
general purpose CPU (GP) utilization in DB2 for z/OS. The performance and cost savings
of the Accelerator open up unprecedented opportunities for organizations to
make use of their data on the zEnterprise platform.
The Analytics Accelerator is
conceptually similar to a hybrid automobile.
The hybrid automobile has a standard vehicle user interface (e.g.
steering wheel, brake, accelerator pedal).
A hybrid automobile may at any given time run using its gasoline or
electrical power source to optimize fuel economy. The switching between power sources to
optimize fuel efficiency is done by the automobile itself without requiring
constant manual intervention by the user or a change in the standard vehicle
APIs.
With
the DB2 Analytics Accelerator, DB2 for z/OS can offload data-intensive and
complex static and dynamic DB2 for z/OS queries, such as data warehousing,
business intelligence and analytic workloads, transparently to the application.
The DB2 Analytics Accelerator then executes these queries significantly faster
than previously possible—all while avoiding CPU utilization by DB2 for
z/OS. It allows users to run workloads
that historically were offloaded from z Systems, or run queries that were
governed or shunted in DB2 for z/OS such as ad hoc queries whose performance
characteristics are typically unknown at runtime. And IT administrators can
allow DB2 for z/OS to choose where to run these queries, or they can force
these queries to the DB2 Analytics Accelerator to prevent additional DB2 for
z/OS consumption.
- The accelerator delivers dramatic improvement in response time on unpredictable, complex, and long-running dynamic and static query workloads. It helps in meeting SLAs and shortening batch windows by offloading complex query workloads. The idea is to keep what’s working well in DB2 and improve response times for CPU intensive queries.
- The accelerator allows users to run new workloads that had previously not been considered for the mainframe, or to run queries that had previously been governed or shunted in DB2 (e.g. ad hoc queries whose performance characteristics are typically unknown at runtime). Clients can allow DB2 to choose where to run these queries, or they can force these types of queries to the accelerator to prevent additional DB2 consumption.
- By offloading resource intensive queries and the associated processing onto the accelerator, clients can lower MSU consumption. Additionally, they can reduce the cost of storing, managing, and processing historical data with a near line storage solution.
- There is also the reduction in costs associated with the time it takes to perform general tuning and administration tasks associated with supporting and improving performance for resource intensive workloads in DB2 for System z.
- Clients can also lower or eliminate the cost of acquiring HW and SW for data warehousing and analytics as well as lowering or eliminating the cost incurred from data movement, transformation, landing, storage, and maintenance of systems. With the accelerator, clients can consolidate disparate data to their existing zEnterprise platform while benefiting from integrated operational BI.
- With Accelerator-only tables and in-DB transformation capabilities, data can be Extracted from a number of source systems, Loaded into the Accelerator, and Transformed within the Accelerator (ELT). Applications directly access the transformed data through DB2 for z/OS. Accelerator-only tables can be used to store transformed data ‘only’ in the Accelerator and not maintain a second copy in z/OS.
- Organizations gain agility by being able to respond more rapidly with immediate, accurate information and to deliver new insights to business users.
- Reporting is consolidated on zEnterprise where the majority of the data being analyzed lives, while retaining zEnterprise security and reliability.
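The choice between letting DB2 decide where a query runs and forcing it to the accelerator is typically made through the DB2 for z/OS special register CURRENT QUERY ACCELERATION. The sketch below only builds the SQL statement text for each routing choice; it does not connect to DB2. The `Routing` names and the helper function are hypothetical, and the register values shown should be confirmed against the DB2 documentation for your release.

```python
from enum import Enum


class Routing(Enum):
    """Hypothetical names mapped to CURRENT QUERY ACCELERATION values."""
    DB2_ONLY = "NONE"                              # never offload
    DB2_DECIDES = "ENABLE"                         # optimizer chooses where to run
    DB2_DECIDES_WITH_FAILBACK = "ENABLE WITH FAILBACK"  # rerun in DB2 if offload fails
    FORCE_ACCELERATOR = "ALL"                      # query must run on the accelerator


def acceleration_statement(routing: Routing) -> str:
    """Return the SET statement an application or DBA would issue."""
    return f"SET CURRENT QUERY ACCELERATION = {routing.value}"


# Let DB2 choose, or force offload to prevent additional DB2 consumption.
print(acceleration_statement(Routing.DB2_DECIDES))
print(acceleration_statement(Routing.FORCE_ACCELERATOR))
```

Because the setting is a session-level register (or a subsystem default), the queries themselves do not change, which is what makes the offload transparent to the application.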
How Does the Analytics Accelerator Complement and Improve the Enterprise Data Lake Strategy?
The
Analytics Accelerator was designed to be used in concert with DB2 for z/OS with a
vision to become the first true Hybrid Transactional and Analytics Processing
Engine (HTAP). The Analytics Accelerator
was intended to be complementary to a zEnterprise data lake strategy, not
competitive with it. Several new features within
the Analytics Accelerator actually reduce the costs of data movement to the
data lake and improve the latency of the data that is landed in the data
lake.
In
2017, two new features will further the Analytics
Accelerator’s ability to complement a zEnterprise data lake strategy.
- Transactional consistency in the Analytics Accelerator: With this feature, DB2 applications will no longer need to be concerned with data currency within the Analytics Accelerator: the most current result set will be guaranteed. This removes the largest obstacle for much broader use of the Analytics Accelerator. Today many customers hesitate to use the Analytics Accelerator because they cannot guarantee that the queries can tolerate potentially stale data. With this feature, there will be no difference in latency between data returned by DB2 and by the Analytics Accelerator. This will make DB2 + the Analytics Accelerator the only true Hybrid Transactional and Analytics Processing Engine (HTAP) solution in the market.
- Remove the cost of replication to the Analytics Accelerator from the 4HRA (four-hour rolling average): When customers say 'we can replicate to other environments', the Analytics Accelerator will have two major advantages. First, they cannot guarantee transactional consistency when replicating to a separate environment (see above). Second, when sending data to an external environment, replication and ETL have a cost on z/OS on top of the standard people, process, infrastructure, and liability-of-data-breach costs from maintaining two copies with two separate access points. See our 'Cost of ETL' Calculator. With this feature, the cost of replication to the Analytics Accelerator will be removed from the 4HRA. Any other replication or ETL to disparate environments will impact the 4HRA and thus lead to additional costs.
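To see why removing replication from the 4HRA matters, consider how a peak four-hour rolling average responds to extra load. The sketch below is a simplified illustration: it assumes one MSU sample per hour and invented MSU figures, whereas real sub-capacity reporting uses finer-grained measurements and its own rules.

```python
from typing import List


def peak_rolling_average(msu_per_hour: List[float], window: int = 4) -> float:
    """Highest average over any `window` consecutive hourly MSU samples."""
    if len(msu_per_hour) < window:
        raise ValueError("need at least one full window of samples")
    averages = [
        sum(msu_per_hour[i:i + window]) / window
        for i in range(len(msu_per_hour) - window + 1)
    ]
    return max(averages)


# Hypothetical hourly MSU consumption for the core workload ...
base_workload = [100, 120, 200, 220, 240, 180, 110, 90]
# ... and the additional MSUs consumed by replication in the same hours.
replication = [10, 10, 20, 20, 20, 20, 10, 10]

with_replication = [b + r for b, r in zip(base_workload, replication)]

peak_if_counted = peak_rolling_average(with_replication)  # replication counted
peak_if_excluded = peak_rolling_average(base_workload)    # replication excluded

print(peak_if_counted, peak_if_excluded)
```

In this made-up example the peak drops from 230 to 210 MSUs when replication is excluded, which is the kind of difference that feeds into software charges based on the 4HRA.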
As
mentioned above, the Analytics Accelerator also supports Accelerator-only
tables (AoTs). With Accelerator-only tables and in-DB transformation
capabilities, data can be Extracted
from a number of source systems, Loaded
into the Accelerator, and Transformed
within the Accelerator (ELT).
Applications directly access the transformed data through DB2 for
z/OS. Accelerator-only tables can be
used to store transformed data ‘only’ in the Accelerator.
What this all means is that the Analytics Accelerator data will be transactionally consistent with DB2
data. The replication of data from DB2 to
the Analytics
Accelerator will be $0 cost. The Analytics Accelerator will support
in-accelerator transformations of data.
Therefore, data can be replicated to the accelerator, transformed to
match the structure of data in the data lake, and extracted with zero latency from
the DB2 data without incurring any costs in DB2 and without having to extract
to an ETL server in between to do the transformations. Such a solution avoids the high cost of
extracting data from DB2 for System z, the cost of maintaining a set of ETL
servers and complex ETL flows (Test, Prod), the additional liability of data
breach from maintaining additional data copies and interfaces to these copies,
and the latency in moving this data to disparate systems before landing it in the data
lake. Many customers are already
using federation technologies between DB2 + the Analytics Accelerator and the
data lake (Big SQL, Impala) to reduce data movement processes. With HTAP, $0 cost of replication to the
Analytics Accelerator, and AoTs, the Analytics Accelerator is completely
complementary to the enterprise data lake strategy and reduces the costs, liability
of data breach, and latency associated with getting data from System z to the
data lake.
The proposed solution
architecture, with Machine Learning for z/OS and the IBM DB2 Analytics
Accelerator at its core, is intended to drive substantial new analytics-driven
revenue for clients while reducing existing people, process, and infrastructure
costs. This solution provides the
tooling for a client to derive a tremendous amount of actionable insight from its
transactional data (monetize its transactional data), reduces existing costs by
reducing data/infrastructure sprawl across the enterprise, improves existing
Service Level Agreements (SLAs), reduces data latency for analytics
initiatives, and improves data governance.
Ultimately, the goal of Machine Learning for clients is to take new,
transactional AI solutions to the market in an efficient and scalable
manner. In the case of a 'Transparent Pharmaceutical Benefits Manager (PBM)', machine learning
and the Analytics Accelerator can serve as the transactional analytics engine
that delivers new revenue opportunities to a consumer. Showcasing state-of-the-art analytics and AI
solutions may also attract new PBM opportunities (e.g. marketing machine
learning based formulary and rebate management processes to earn new claims
adjudication business). Some examples of
opportunities for Machine Learning and the Analytics Accelerator in the PBM example are:
Example 1: Health Outcomes Optimization (e.g. Diabetes)
Most health conditions being treated have metrics associated with
success. Conditions can be segmented
into common chronic conditions (e.g. diabetes, asthma, high cholesterol, high blood
pressure, heart disease, arthritis) and uncommon, high-cost conditions needing
specialty medications (e.g. rheumatoid arthritis, Crohn's disease, multiple sclerosis, cancers). Diabetes has very clear metrics tied to
success (ABC: A = A1c, a measure of average blood sugar;
B = blood pressure; and C = cholesterol).
Unfortunately, payers and providers have limited views of the success
metrics for a given population. A PBM can build out a predictive risk model to provide a health score for
patients with diabetes and thereby segment the diabetes population into well
controlled, moderately controlled and poorly controlled.
By having this information available for real-time analysis inside its DB2 adjudication system, a PBM can enable its health plan clients to “treat/manage” these segments
differently. For example, someone who is poorly controlled may receive additional
counseling at the pharmacy, be assigned a different copay, or trigger a
different message to the physician. At
both the point of care in the doctor’s office and the point of sale, the PBM
would measure the adherence to medications.
If someone is not at goal and is not taking their medications
regularly, an adherence program could be implemented. If the patient is taking their medications,
then a more potent medication or a new medication may be needed.
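The segmentation and follow-up logic described above can be sketched as two small functions. This is an illustration only: the `Patient` fields, the 0.0 to 1.0 health score scale, the cutoffs of 0.7 and 0.4, and the "no change" outcome for patients already at goal are all assumptions, and in practice the score would come from a trained predictive risk model.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    health_score: float  # hypothetical model output: 0.0 (poor) to 1.0 (good)
    at_goal: bool        # clinical targets (e.g. ABC metrics) currently met
    adherent: bool       # taking medications regularly


def segment(health_score: float,
            well_cutoff: float = 0.7,
            moderate_cutoff: float = 0.4) -> str:
    """Place a patient into one of three control segments (cutoffs assumed)."""
    if not 0.0 <= health_score <= 1.0:
        raise ValueError("health_score must be between 0.0 and 1.0")
    if health_score >= well_cutoff:
        return "well controlled"
    if health_score >= moderate_cutoff:
        return "moderately controlled"
    return "poorly controlled"


def recommended_action(patient: Patient) -> str:
    """Apply the adherence rule: non-adherent patients get an adherence
    program; adherent patients not at goal get a medication review."""
    if patient.at_goal:
        return "no change"  # assumption: the text does not cover this case
    if not patient.adherent:
        return "adherence program"
    return "review medication"


patient = Patient(health_score=0.25, at_goal=False, adherent=True)
print(segment(patient.health_score), "->", recommended_action(patient))
```

Keeping the segment and the action as separate functions mirrors the text: the segment drives how a health plan treats a population, while the action is decided per patient at the point of care or point of sale.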
The value of this type of analysis to consumers is that the PBM can help patients meet clinical goals and drive lower copays for the
consumer. For the physicians, this type
of analysis can be used to drive pay for performance programs. This analysis can also be used to drive value
between the health plans and pharmaceutical companies. By leveraging the concept of differential
rebates, this technology can help members achieve clinical goals. By increasing achievement in clinical goals,
the pharmaceutical companies get paid more, and the health care systems can
reduce costs. A PBM can monetize this by further aligning itself
with the health systems (increased value to the health system from better
clinical outcomes, more effective transactional scoring and auditing within
fast pass and e-Prior Authorization control processes, etc.) and potentially
driving increased revenue through its ‘prescription outcomes’ contracts.
Example 2: Major changes in “risk” - Resource Utilization Bands (RUBS)
Example 3: Showcasing the Value of Machine Learning Driven Insight to Existing Clients
With
the ability to access medical data files from existing customers, a PBM can:
- Use ML capabilities to show correlations (e.g. patient attributes and co-morbidity) using medical/health data
- Apply Johns Hopkins ACG functions to this data
- Show clients the value of ML to clinical outcomes
- Integrate ML features into the existing application (e.g. via a Bot)
- Sell this new application as a service to clients
Other
clients may have other interesting data sources. For example, some customers may engage human coaching companies
that have a wealth of data, interactivity with members, and a wealth of
asynchronous communications that can be leveraged in Machine Learning modeling.
Example 4: Fast Pass, e-Prior Authorization, Alternative Drug Recommendation
Example 5: Drive new revenue at Hospital Systems
There are several immediate potential opportunities
that exist within small to medium hospital health systems using Machine
Learning.
The first opportunity is with employee health at
these hospital systems. Small to medium
systems may have 50K employees. In the
case of employee health, every 10K employees represents $100M in employee
spend. Machine learning driven insight
can be used to show these hospital systems how a PBM can help save 5-7% on
employee health costs and improve the quality of service for their employees.
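The figures above imply a concrete savings range, which the short calculation below works through. It uses only the numbers stated in the text (50K employees, $100M of spend per 10K employees, 5-7% savings); the function name and the use of whole-dollar integer arithmetic are choices made for this sketch.

```python
from typing import Tuple

SPEND_PER_10K_EMPLOYEES = 100_000_000  # dollars, as stated in the text


def estimate_savings(employees: int,
                     low_pct: int = 5,
                     high_pct: int = 7) -> Tuple[int, int, int]:
    """Return (total spend, low savings, high savings) in whole dollars."""
    total_spend = employees * SPEND_PER_10K_EMPLOYEES // 10_000
    low = total_spend * low_pct // 100
    high = total_spend * high_pct // 100
    return total_spend, low, high


spend, low, high = estimate_savings(50_000)
print(f"Spend: ${spend:,}; potential savings: ${low:,} to ${high:,}")
```

For a 50K-employee system this works out to $500M in employee health spend and a potential saving of $25M to $35M.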
The second potentially large opportunity is to use
machine learning to help hospital systems optimize revenue for specialty
products. A transparent PBM typically wants to align with the hospital systems. For example, there are cases where hospital
systems are treating patients who require expensive drugs (e.g. for MS or HIV). Historically, some of these health systems
started prescribing the drugs and sending them out to a third party that would
handle the filling of the medication.
These drugs often represented $50K of medication. This presents an opportunity for the PBM to showcase what it can do as a
partner and sell new core services to the hospital health system.
Smaller hospital health systems may also be more
interested in population health management.
For instance, they may want to understand the factors that lead some people to take
their medications and others to skip or not fill them. Machine Learning is key to uncovering factors
that humans may not have previously considered.
Example 6: Drive Value to Retail Clinics/Stores
Promote patient medication adherence using other
financial motivators such as free co-pay cards to use with retail pharmacies or
retail grocery store coupons for health food options.