Foundation model landscape update
Foundation models are pre-trained by companies such as OpenAI, Google, Amazon, and Meta to perform many diverse tasks. If you have used ChatGPT, you have used a foundation model.
For many business tasks, a foundation model might be all you need. For more specialized tasks, you can fine-tune the foundation model or augment it with access to your own data through retrieval-augmented generation (RAG); a minimal sketch of the RAG pattern appears after the list below. The leading foundation model creators have launched several significant updates in recent months. For example:
- May 13, 2024: OpenAI releases GPT-4o.
- July 23, 2024: Meta releases Llama 3.1, arguably the first open model to reach parity on text-to-text tasks with frontier foundation models such as GPT-4o and Claude 3.5 Sonnet.
- September 12, 2024: OpenAI releases o1-preview (code-named Strawberry) through the familiar ChatGPT interface, adding new reasoning capabilities.
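As a quick aside on the RAG pattern mentioned above, here is a minimal sketch in Python. The policy snippets and the naive word-overlap retrieval are illustrative stand-ins of my own; production systems typically use embeddings and a vector store.

```python
# A minimal sketch of retrieval-augmented generation (RAG).
# The documents and word-overlap scoring are illustrative placeholders;
# real systems typically use embeddings and a vector database.

documents = [
    "Policy form HO-3 excludes flood damage; a separate flood policy is required.",
    "Claims must be reported within 60 days of the date of loss.",
    "Annual premiums are adjusted at renewal based on updated risk factors.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    words = set(question.lower().split())
    ranked = sorted(documents, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

question = "Is flood damage covered?"
context = "\n".join(retrieve(question))

# The augmented prompt grounds the model's answer in your own data.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # send this to any foundation model of your choice
```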
The new o1 model should be of particular interest to insurers dealing with the ramifications of multiple complex, comorbid medical conditions. Essentially, o1 breaks an initial question into a growing list of sub-questions that it researches, delivering a more “thought-out” answer; a sketch of that pattern follows below. This reasoning capability also provides built-in self-checking, which could reduce hallucinations, those answers that GenAI simply makes up to fill in knowledge gaps.
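To make that concrete, here is a hypothetical sketch of the decompose-then-answer pattern using the OpenAI Python client. To be clear, o1 performs this reasoning internally in a single call; the explicit steps, model name, and sample underwriting question below are illustrative only.

```python
# Hypothetical sketch: approximating o1-style question decomposition
# with explicit prompts. o1 does this internally in one call; the model
# name and the sample question are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

question = ("An applicant has type 2 diabetes, hypertension, and chronic "
            "kidney disease. Which interactions matter for underwriting?")

# Step 1: break the initial question into researchable sub-questions.
raw = ask(f"List the sub-questions needed to answer: {question}")
subs = [s for s in raw.splitlines() if s.strip()]

# Step 2: answer each sub-question, then synthesize and self-check.
answers = [ask(s) for s in subs]
print(ask("Synthesize one answer from these notes and flag anything "
          "unsupported:\n" + "\n".join(answers)))
```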
Foundation models can be either open or closed. An open model, such as Llama, has publicly downloadable weights and can run on your own hardware; a closed model is reachable only through its provider's service. There are even small language models that can run on conventional laptops and are surprisingly powerful.
One such small model is Mistral 7B from Mistral AI. Its weights fit within roughly 28GB of RAM at full 32-bit precision, about half that at 16-bit precision, and only a few gigabytes once quantized to 4 bits. Given that 32GB laptops are becoming more common and that runtimes can memory-map the weights rather than load them all at once, it is entirely possible to run Mistral 7B on a laptop, as sketched below.
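Here is a minimal sketch of what that looks like in practice, using the llama-cpp-python bindings on a CPU-only laptop. The GGUF file name is a placeholder for a 4-bit quantized checkpoint (roughly 4GB) you would download separately.

```python
# A minimal sketch of running Mistral 7B locally with llama-cpp-python.
# The model_path is a placeholder for a 4-bit quantized GGUF checkpoint;
# llama.cpp memory-maps the file, so pages load from disk on demand.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-q4_k_m.gguf",  # hypothetical local file
    n_ctx=2048,  # context window; larger values need more RAM
)

# Mistral's instruct models expect the [INST] ... [/INST] prompt format.
out = llm(
    "[INST] List three common exclusions in travel insurance. [/INST]",
    max_tokens=200,
)
print(out["choices"][0]["text"])
```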
Evaluation of GenAI models
As GenAI becomes more prevalent throughout the insurance industry, it is more important than ever to properly evaluate its output:
- Accuracy: How correct is the information the LLM is providing? The most extreme forms of inaccuracy are the much-maligned hallucinations, which present LLM-invented information as factual. A minimal accuracy check is sketched after this list.
- Completeness: Summarization is a common task for GenAI; however, summarization inherently passes over some information to elevate other information. Without clear instructions, this can be the very definition of bias. Users must specify the information they want and measure the completeness of the LLM's result.
- Bias: LLMs can act on very minimal instructions, but any lack of concreteness in a request is necessarily filled in with the LLM's biases. If you ask for code and do not specify the language, you will get Python. If you ask about insurance regulations and do not specify a region, you will get an answer from a US context. Even with precise instructions, you must measure and understand the baseline bias in any model you use.
- Originality: Copyright concerns are among the most important issues for GenAI practitioners to work through. How original is the result from your GenAI model? If you generate a random face to use in a presentation, is this a real person? Consider this: reports have traced supposedly synthetic, LLM-generated data back to real records exposed in a security breach a few years ago. Plagiarism checkers are a well-developed industry and can potentially help here.
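As a starting point for the first of these dimensions, here is a minimal accuracy check in Python. The `ask` function is a hypothetical stand-in for your model client, and the questions and reference keywords are illustrative, not a real evaluation set.

```python
# A minimal sketch of a keyword-based accuracy check for LLM output.
# `ask` is a hypothetical stand-in for a real model call, and the
# questions and reference keywords are illustrative only.

def ask(question: str) -> str:
    # Replace with a real model call; a canned reply keeps the sketch runnable.
    return "The standard contestability period is two years."

eval_set = [
    ("What is the contestability period for US life policies?", "two years"),
    ("Does a standard HO-3 homeowners policy cover flood damage?", "excluded"),
]

def accuracy(pairs) -> float:
    """Fraction of answers containing the expected reference keyword."""
    hits = sum(keyword.lower() in ask(q).lower() for q, keyword in pairs)
    return hits / len(pairs)

print(f"accuracy: {accuracy(eval_set):.0%}")  # 50% with the canned reply
```

A crude keyword match like this is only a baseline; in practice, teams layer on human review or LLM-as-judge scoring for the subtler dimensions such as completeness and bias.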
As we wrap up the final quarter of a very eventful 2024, I can only imagine what 2025 may have in store for the industry as both GenAI technology and insurers’ skill at using it continue to advance. Look for my next update in Q1 2025.
Have a question or comment for Jeff? Join the conversation.