Content Hub

Sixfold Content

Sixfold News

Sixfold Raises $30M Series B

Sixfold has raised a $30M Series B to accelerate the next phase of growth in building the AI Underwriter. The round was led by Brewer Lane, with strategic investment from Guidewire and continued support from Bessemer Venture Partners and Salesforce Ventures.

Explore all Resources

Stay informed, gain insights, and elevate your understanding of AI's role in the insurance industry with our comprehensive collection of articles, guides, and more.

Sixfold News

Scaling AI in Underwriting: Lessons from Skyward & Zurich

Most insurers have run an AI pilot. Far fewer have successfully scaled it. At a Sixfold webinar, Amy Nelsen at Zurich North America and Jim Mormile at Skyward Specialty shared what that journey actually looks like.

5 min read

Maja Hamberg

Most insurers have run an AI pilot. Far fewer have scaled one. At a recent Sixfold webinar, two carriers shared what it actually took to get there, and what production at scale really looks like.

Amy Nelsen, Head of UW Operations for US Middle Market at Zurich North America, and Jim Mormile, President of Professional Lines at Skyward Specialty, spoke about their experience rolling out Sixfold across their organizations.

Skyward is now live across more than 10 product lines with nearly 100 underwriters. Zurich has deployed across 30 US offices, with more than 200 underwriters using Sixfold in their daily workflow.

Five key steps for successfully scaling AI in insurance underwriting.

We walked through our five-step scaling framework with both of them to hear, in their own words, how each stage played out in practice.

Step 1: Pick Your Focus

The biggest mistake teams make is starting with AI and working backwards to find a problem. Both Amy and Jim were deliberate about doing the opposite, finding a genuine pain point first and then asking whether AI could solve it.

For Skyward, the problem was the triage wall. Underwriters were spending hours manually working through submissions just to determine basic appetite fit.

For Zurich, the starting point was a task underwriters didn’t like doing: documentation.

"We had a healthcare risk where the underwriter got about a 75-page submission. It wasn't until page 56 to 59 that they realized the submission should not be covered since it was out of appetite."
- Jim Mormile @Skyward Specialty

Step 2: Prove It Works

Before scaling, you need a few signals that it's actually working, but those don't always come from a dashboard. Both Jim and Amy found that the most meaningful early indicators came from user feedback.

For Jim, it was a veteran underwriter pulling him aside unprompted.

"The anecdotal evidence that really made us realize it was working: it was a veteran underwriter that came up to me and said, '[Sixfold] actually changed my day. It sped up my work progress and workflow.'"
- Jim Mormile @Skyward Specialty

The follow-up signal was equally telling: “underwriters stopped asking whether they should use Sixfold and started asking when it was coming to their line of business.” That's when Skyward knew it was time to expand.

Step 3: Map Your Expansion

Scaling AI isn't a single rollout; it's a series of decisions about sequencing, readiness, and change management. Both organizations took a structured approach, but in different ways.

Skyward scored each of their 16 lines of business across criteria like guideline robustness, process standardization, underwriting complexity, and, critically, how tech-forward the underwriting team was. They then hand-picked early adopters rather than opening the floodgates.
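A readiness scorecard like the one Skyward describes could be sketched roughly as follows. This is only an illustration: the four criteria come from the webinar, but the 1-5 scale, the equal weighting, and the line names and scores are all assumptions, not Skyward's actual method.

```python
from dataclasses import dataclass

# Criteria named in the article. The 1-5 scale and the weights are assumptions.
# Every score is oriented so that higher means "more ready", so a very complex
# line would get a LOW underwriting_complexity score.
CRITERIA = (
    "guideline_robustness",
    "process_standardization",
    "underwriting_complexity",
    "tech_forwardness",
)


@dataclass
class LineOfBusiness:
    name: str
    scores: dict  # criterion -> score from 1 (not ready) to 5 (ready)


def readiness(lob, weights=None):
    """Weighted average of the criterion scores (equal weights by default)."""
    weights = weights or {criterion: 1.0 for criterion in CRITERIA}
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(lob.scores[c] * weights[c] for c in CRITERIA) / total_weight


def rank_lines(lobs, weights=None):
    """Order lines of business from most to least ready for rollout."""
    return sorted(lobs, key=lambda lob: readiness(lob, weights), reverse=True)


# Hypothetical lines and scores, for illustration only.
example_lines = [
    LineOfBusiness("Line A", dict(zip(CRITERIA, (4, 4, 3, 5)))),
    LineOfBusiness("Line B", dict(zip(CRITERIA, (2, 3, 2, 2)))),
    LineOfBusiness("Line C", dict(zip(CRITERIA, (5, 5, 4, 4)))),
]
early_adopters = [lob.name for lob in rank_lines(example_lines)[:2]]
```

Passing a custom `weights` dictionary would let a team emphasize one criterion, such as how tech-forward the underwriting team is, over the others.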

Zurich took a geographic approach, piloting across four offices before expanding countrywide, and intentionally included underwriters at different levels of tech comfort so they could anticipate resistance before it became a problem at scale.

Both teams also addressed job security fears head-on.

"It's less about the underwriting role becoming irrelevant, and more about if you handle a $10M book of business versus a $20M book of business."
- Amy Nelsen @Zurich North America

Jim's approach was to be explicit about what AI wouldn't do: no automated decision-making, no replacing the underwriter's final judgment. The framing was always about giving underwriters better information, not replacing their expertise.

Step 4: Roll Out in Phases

Both teams learned that trying to do too much too fast creates fatigue, and fatigue kills adoption. Skyward actually hit pause on one line of business after underwriters started showing signs of frustration. Rather than pushing through, leadership made the call to step back.

A few months later, that team came back ready to re-engage, pulled in by the FOMO of watching other lines of business benefit.

On the flip side, Skyward's phased approach led to real efficiency gains in deployment speed. Their first two lines of business took 12 to 14 weeks from introduction to production. By the time they were rolling out subsequent lines, they'd cut that down to eight weeks.

Amy's experience at Zurich echoed the same takeaway on speed:

"The most amazing part is how fast you can get from just a concept to deployment with Sixfold."
- Amy Nelsen @Zurich North America

Step 5: Assess Impact

Once AI is live in production, measuring success means looking at two things in parallel: business outcomes and output quality. Neither alone tells the full story.

For Jim at Skyward, that meant tracking time to quote and number of quotes out the door, but also running a monthly accuracy review and a sentiment score across underwriting teams.

"We constantly look at both ends of the spectrum: the ROI metrics, and whether the accuracy is there to give our underwriters the information they need to make better decisions."
- Jim Mormile @Skyward Specialty

Amy's approach at Zurich was similar, and revealing in its simplicity. Sixfold's success isn't measured separately from the business; it's measured through the business. When AI becomes embedded enough that you stop evaluating it as a standalone tool and start measuring it through your core business metrics, that's when you know it's really working.

"We're measuring our business outcomes like how many quotes underwriters are getting out the door and how much business have they bound. We also meet with Sixfold every month to look at quality metrics."
- Amy Nelsen @Zurich North America

The Final Wisdom

Scaling AI in underwriting isn't primarily a technology challenge; it's a people and process challenge. Both Amy and Jim closed with the same underlying message: be prepared for things to change quickly in the world of AI, don't be afraid to fail, and don't stay stuck in proof-of-concept mode forever.

The insurers that will succeed at scaling AI are the ones that move deliberately, learn fast, and bring their underwriters along for the ride from day one.

Watch the video recording of the entire webinar here.

Sixfold News

Sixfold's 100 Billion Token Milestone

Sixfold was featured at OpenAI’s DevDay keynote, recognized for crossing 100 billion tokens processed. We sat down with our Senior AI Engineer, Drew Das, to talk about what that milestone really means for underwriting.

5 min read

Sixfold was featured on the big screen at OpenAI’s DevDay 2025, recognized for surpassing 100 billion tokens processed. To make it even better, Senior AI Engineer Drew Das was honored among global developers for his contributions.

This is a huge milestone for Sixfold, a reflection of the real impact our AI is making in underwriting. But what does 100 billion tokens actually mean in practice? We sat down with Drew to find out.

Can you start by explaining what exactly OpenAI recognized Sixfold for at DevDay, and what it felt like to see your name up on the screen?

Seeing my name on stage felt like public validation of the engineering team’s hard work. Sixfold is operating at the frontier of large-scale AI in insurance underwriting. We’re not a pilot anymore; we’re in production.

It also hit home the responsibility that comes with it. Yes, we’ve built scale, but now we have to make sure what we produce is high quality and meaningfully used. Sixfold isn’t experimenting with LLMs; we’re running them reliably where accuracy, latency, and auditability really matter.

What does “100 billion tokens” actually represent in Sixfold’s day-to-day work?

Each token is a fragment of understanding, a piece of text, data, or context that our models interpret and turn into something underwriters can act on.

It signals the volume of underwriting data, documents, broker submissions, and risk information that our platform analyzes and processes through generative AI workflows.

Hitting 100 billion means we’ve moved many workflows off manual underwriting review and into AI. When repetitive work is reduced, underwriters spend more time on relationships, strategy, and higher-value decisions.

So… what’s next, another 100 billion tokens?

Token count is becoming table stakes. What matters now is how those tokens are applied and the outcomes they create.

We’re focusing on deeper workflow integration, bringing AI agents that go beyond risk assessment to support decisions, automate workflows, and deliver real-time underwriting intelligence. Our Research Agent is a great example of that.

Considering the high-stakes nature of insurance, every risk insight and decision generated by Sixfold must be auditable and traceable. Our solution isn’t just built for scale, but for governance.

On the engineering side, what does it take to run something at that scale?

From an engineering perspective, scale brings complexity. We’re constantly balancing latency, cost, and accuracy.

Many AI projects stop at the “cool demo” stage, but we’re pushing through the messy engineering: hybrid search, re-ranking, prompt tuning, and evaluation, all happening under the hood.

Scaling means better signals, more edge cases surfaced, and faster model learning. It’s a flywheel that keeps getting smarter.

Referral Agent: From “To-Do” to “Done” in Seconds

P&C & Speciality

Product Update

Sixfold is shifting from insight to action. Our AI has helped underwriters surface risks, highlight insights, and guide decisions, but now we’re taking the next step with a suite of agents designed to clear bottlenecks in the underwriting process. First up: Referral Agent.

5 min read

Maja Hamberg

Since AI’s introduction in underwriting, it has been about surfacing information: flagging risks, highlighting important insights, and providing recommendations. But there's always been a gap between AI that knows what needs to be done and AI that actually does it.

Sixfold is building AI agents that close that gap: agents that work alongside underwriters to move cases forward autonomously. Referral Agent marks a shift in what Sixfold can do for your underwriting team, starting with one of the biggest bottlenecks we hear about from underwriters: referrals.

“We’re no longer just telling underwriters what to do; we’re helping them do it. This launch represents a major shift in how we drive efficiency and consistency in underwriting, turning ‘here’s what you need to know’ into ‘here’s what we’ve already done about it.’ The Referral Agent is just the beginning of that vision.”

- Alex Schmelkin, Founder and CEO of Sixfold

Where 40% of Cases Get Stuck

When we asked underwriters about their biggest workflow frustrations, referrals came up again and again.

Every underwriter knows this routine: you spot a case that needs escalation, and suddenly you're building a package. Pull the rules. Cross-reference submission details. Write a detailed email explaining your reasoning. Make sure the approver, such as a senior underwriter, has everything they need to make a decision.

It's necessary work, but it takes about an hour per case. With up to 40% of applications requiring escalation, underwriters can spend half their day packaging referrals instead of evaluating risk.

The ripple effects hurt everyone. Brokers wait days for responses while referrals sit in approval queues. Approvers get packages missing key details and have to circle back with questions. What should be a quick handoff becomes a multi-day process that slows everyone down.

AI That Completes the Loop

Referral Agent transforms how this process works. Instead of stopping once it identifies that a case needs referral, it handles the next steps as well: it flags when a case needs escalation, assembles the complete referral package, and drafts a ready-to-send email in your preferred tone.

The underwriter's role changes from package preparation to package review, freeing up time from admin work and letting them focus on more strategic decisions.

The repetitive process that used to take 60 minutes? It now takes 1 minute.

How it works

Sixfold knows the insurer’s unique risk appetite, applies what it’s learned, and clearly shows if and why a referral is required after case analysis is completed.

When the agent flags a case for escalation, it constructs a complete referral package in email form: a business summary including the company's operational focus and a structured rationale section that walks through each referral guideline breach step-by-step, citing the exact percentages, dollar amounts, and policy thresholds that might have triggered the referral requirement.
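The structured rationale described above can be pictured with a small sketch. This is not Sixfold's implementation; the guideline fields, thresholds, and submission values below are hypothetical and exist only to show how a breach can be reported with the exact figure and threshold that triggered it.

```python
from dataclasses import dataclass


@dataclass
class Guideline:
    field: str        # submission field the rule inspects (hypothetical names)
    limit: float      # value above which a referral is required
    unit: str         # "$" for dollar amounts, "%" for percentages
    description: str  # human-readable label used in the rationale


def format_value(value, unit):
    """Render a number as a dollar amount or a percentage."""
    return f"${value:,.0f}" if unit == "$" else f"{value:g}%"


def referral_rationale(submission, guidelines):
    """Return one rationale line per breached guideline (empty list = no referral)."""
    lines = []
    for g in guidelines:
        value = submission.get(g.field)
        if value is not None and value > g.limit:
            lines.append(
                f"{g.description}: {format_value(value, g.unit)} exceeds "
                f"the {format_value(g.limit, g.unit)} threshold"
            )
    return lines


# Hypothetical guidelines and submission, for illustration only.
example_guidelines = [
    Guideline("total_insured_value", 10_000_000, "$", "Total insured value"),
    Guideline("loss_ratio_pct", 60, "%", "Three-year loss ratio"),
]
example_submission = {"total_insured_value": 12_500_000, "loss_ratio_pct": 45}
rationale = referral_rationale(example_submission, example_guidelines)
```

In this example only the first guideline is breached, so the rationale holds a single line citing both the submitted amount and the threshold.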

With this launch, we’re making the underwriter’s inbox part of the workflow. Underwriters can just forward a broker’s email, with attachments, straight to Sixfold, and the risk assessment kicks off automatically. When it’s time to refer, a pre-drafted email is ready to send with a single click from the underwriter’s default email provider.

| | Without Sixfold | With Sixfold |
| --- | --- | --- |
| Time spent deciding if referral is needed, building the package, and drafting the email | ~60 minutes | ~1 minute |
| Referral Criteria | Manually checked | Automatically checked |
| Quote turnaround | Often delayed 2–3 days | Same-day approvals, approvers get everything they need the first time |
| Built-in Confidence | Complex appetite guidelines to review and manually check whether a case needs to be referred | Reliable AI that instantly applies your guidelines and gives confidence to proceed |

Benefits by Role

For Referrers

- Always know when to refer
- No more repetitive referral tasks
- Hours back every day for actual underwriting

For Approvers

- Every referral is standardized, making it easier to review
- With complete referral packages, you can decide faster - often same-day
- Automatic audit trail for compliance and training

For Insurers

- Referral guidelines are applied consistently across every team
- Quotes out faster → happier brokers → improved hit ratios
- A process that scales with every underwriter hired

The Broader Vision

Referral Agent is part of a shift in what Sixfold can do for your team. In 2024, Sixfold gave underwriters a brain that reads and assesses; now Sixfold equips them with a brain that acts: AI agents that move cases forward on their own.

Each Sixfold agent plugs into your existing workflow to solve a specific bottleneck. Backlog of referrals? Deploy Referral Agent. Team spending hours researching online? The Research Agent does it in minutes. Endless broker negotiations slowing you down? The Negotiation Agent (coming soon) handles it. Use them individually or stack them together.

“Every underwriter knows that sinking feeling of a growing referral queue. With the Referral Agent, that queue clears itself. It proves that AI can take action, not just give advice. It's the first of many agents that free underwriters from processor work so they can become portfolio strategists.”

- Lana Jovanovic, Head of Product @Sixfold

Experience It Live: Try Referral Agent at ITC Vegas in October. Forward an email, watch the risk assessment run, and generate a referral in real time. Book your ITC meeting with Sixfold, or schedule a digital demo.

Product Update

Life & Health

New in Life & Health: Clear Condition Stories with Clinical Insights

Introducing Conditions and Core Clinical Data for Life & Health: Our latest update helps underwriters see the full picture, faster. Conditions tie related facts to a diagnosis. Core Clinical Data brings key lab results into one view. The result? Quicker reviews, clearer decisions, and better outcomes.

5 min read

Ana Clara Ribeiro

Life & Health underwriters often have to go through hundreds of pages of medical documents to understand an applicant’s health profile, knowing that one missed detail could lead to the wrong coverage decision.

When we introduced our Life & Health Underwriting AI, we set out to give underwriters everything they need in one place: diagnoses, medications, and procedures pulled from applications and supporting documents, aligned to the insurer’s unique risk appetite. They no longer had to spend time manually going through medical documents.

But after getting feedback from underwriters across global insurers, we realized something important: presenting information alone isn't enough. What they really need is the full health story, with all the pieces connected. For example, if a medication appears, they need to know the full context around it: how often it was prescribed and any related diagnoses or procedures.

That’s why we’re upgrading our Underwriting AI with Conditions and Core Clinical Data, designed to reflect the way underwriters think about the overall health profile.

What’s New?

Conditions: From Medical Facts to the Full Story

*The new conditions view brings together all relevant facts tied to a stated diagnosis, such as medications, procedures, and condition history.*

Conditions completely change how Sixfold’s insights are presented to underwriters.

Previously, all relevant data, like Personal Health, Medications, Procedures, and more, appeared as individual facts. While surfacing this information is essential, it didn’t show underwriters how it all connected.

The thing is, Life and Health underwriters don't analyze each fact in isolation; they think about how it fits into the bigger picture. What does this medication suggest? How severe is the condition? How does it all connect?

That’s why Sixfold now brings together all relevant information under a diagnosed condition, including:

- Medications and ongoing treatment
- Procedures and lab results
- The condition’s full history, with context and progression over time

For example, take Gastritis, an inflammation of the stomach lining that can cause pain, indigestion, and discomfort. An applicant might have been diagnosed with it a while ago, be taking medication such as Pantoprazole, and have undergone procedures like endoscopies five years ago.

Instead of treating these details in isolation, Sixfold now recognizes they’re all part of the same condition’s story. This mirrors exactly how an experienced underwriter would naturally connect the dots, seeing not just health facts, but the bigger picture of an applicant’s health history.

“This release has been a huge focus for our team, and the feedback from early users has been very positive. They’ve shared firsthand how impactful it is to have a single experience that brings together every aspect of an applicant’s medical history.”

- Noah Grosshandler, Product Manager at Sixfold.

Want a closer look at how Conditions work? Join our live product demo on September 11th.

Core Clinical Data: Lab Insights and Historical Trends

Core Clinical Data brings key lab results, like Vitals, Cardiovascular Health, and Hematology, into one section, showing values, normal ranges, test dates, trends, and clickable sources for every data point.

Underwriters often have to go through dozens of lab results to answer: Does this applicant have an underlying condition? Are the results concerning? Could they worsen over time?

Core Clinical Data brings all of that information into one place, instantly. It pulls information found in lab results and medical records uploaded to Sixfold and presents a pre-defined set of health indicators commonly tied to high-risk conditions, grouped into categories such as Vitals, Cardiovascular Health, and Hematology.

It gives underwriters instant and standardized insight: lab values, normal ranges, and historical progression, making it easier to assess the presence, progression, and severity of chronic conditions.
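As a rough sketch of what a standardized view like this involves, the snippet below summarizes one indicator from a series of lab results: latest value, whether it falls in the normal range, and the direction of travel over time. The record shape, the indicator name, the range, and the values are placeholders invented for illustration; they are not clinical reference values and not Sixfold's data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LabResult:
    indicator: str
    value: float
    taken_on: date
    source: str  # pointer back to the originating document


def summarize(results, normal_range):
    """Summarize one indicator: latest value, range check, and simple trend."""
    ordered = sorted(results, key=lambda r: r.taken_on)
    first, latest = ordered[0], ordered[-1]
    low, high = normal_range
    if latest.value > first.value:
        trend = "rising"
    elif latest.value < first.value:
        trend = "falling"
    else:
        trend = "stable"
    return {
        "indicator": latest.indicator,
        "latest_value": latest.value,
        "latest_date": latest.taken_on,
        "in_normal_range": low <= latest.value <= high,
        "trend": trend,
        "source": latest.source,
    }


# Placeholder indicator, range, and values (not clinical guidance).
example_results = [
    LabResult("indicator_x", 18.0, date(2022, 5, 1), "labs_2022.pdf"),
    LabResult("indicator_x", 14.0, date(2021, 5, 1), "labs_2021.pdf"),
    LabResult("indicator_x", 23.0, date(2023, 5, 1), "labs_2023.pdf"),
]
summary = summarize(example_results, normal_range=(10.0, 20.0))
```

The trend here only compares the earliest and latest readings; a production system would presumably use something more robust, but the idea of pairing each value with its range, date, and source is the same.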

The impact? Core Clinical Data gives underwriters a snapshot of the applicant’s health at a glance, so time once spent reading lab reports can go toward actually binding accounts. With immediate access to historical trends and key lab indicators, underwriters can make faster and more accurate decisions.

How Guardian Cut Review Time in Half

Guardian, one of the largest life insurers in the U.S., faced a common challenge: manual case reviews slowed down underwriters and created bottlenecks in the underwriting process.

By adopting Sixfold for its Disability line, Guardian was able to automatically extract medical data, triage information faster, and speed up case assessments end-to-end.

The impact: a 50% reduction in review time, freeing underwriters from being stuck reviewing medical records and letting them focus on risk decisions. Read more about the results Guardian has seen here.

Now, Guardian is expanding the program across more business lines, and you can see why! Join our Life & Health Product Demo on September 11th, where we’ll showcase how Sixfold works for life & health insurance underwriting.

Research Agent: From Hours of Digging to Seconds of Insights

P&C & Speciality

Product Update

Meet Research Agent, underwriters’ new research partner. It spots the gaps in a submission, finds the missing details, and delivers the insights that matter most for the decision.

5 min read

Alexis Clayman

In commercial underwriting, submissions rarely contain everything underwriters need to properly assess risks. Critical information such as SEC filings, cyber incidents, litigation history, or executive misconduct isn't in the application, and finding that data falls on the underwriter.

The stakes are high: miss one key detail, and underwriters are suddenly pricing a completely different risk. The research to uncover this information means hours going through public sources such as news archives and databases.

All while brokers demand quotes fast in an increasingly competitive market.

Why We Built It

With Research Agent, underwriters have a research partner who finds the gaps in the submission, locates the missing pieces, and brings back the insights that matter for the decision. The agent does the external research work on the underwriter’s behalf.

By filling critical information gaps and incorporating research from the public web and connected third-party data sources into Sixfold’s overall risk assessment, Research Agent gives underwriters a complete risk picture for more precise appraisal. Underwriters can view the exact origin of any piece of information, with clickable links that take them directly to the original source for verification or further detail.

The result? Quotes go out faster, underwriters make more consistent calls, and decisions get made without wondering what might have been missed.

“Research Agent has been one of our most anticipated features, especially for customers in specialty lines where extensive research is needed to get a complete picture of the risk. We expect the agent to save them at least two hours per case.”

- Alex Schmelkin, Founder & CEO at Sixfold

When Research Makes Or Breaks The Quote

Sixfold’s Research Agent is designed to support commercial lines of business where external data is essential in risk evaluation and quoting. Lines of business we’re seeing the highest demand for include:

- General Liability: Public web research to capture OSHA violations, litigation history, and negative press coverage.
- Cyber: Third-party cyber threat intelligence plus broader public web research for cyber breach disclosures and corporate litigation involving cybersecurity lapses.
- E&S Property: Public and/or third-party data on location- and climate-based risks (flood, wildfire, crime, etc.), permit history, and company mentions in the news.
- Directors & Officers (D&O): News on executive misconduct, class-action lawsuits, SEC filings, and shareholder news.
- Healthcare & Life Sciences: Public records of malpractice lawsuits, regulatory actions, patient safety concerns, and news coverage related to clinical trials or product safety.

While Competitors Prep, You Price

Research Agent delivers wins across the board: immediate time savings for underwriters and competitive advantages for insurers.

“We’re seeing growing demand for research-heavy underwriting. With this launch, we accelerate the time to quote while elevating decision-making with richer context.”
- Lana Jovanovic, Head of Product at Sixfold

Focus on what matters: Those hours gained back per case? That's time underwriters can spend on what they do best: making smart risk decisions, strengthening broker relationships, and building customer connections instead of drowning in research tasks.

Speed wins deals: Reduced manual research burden means underwriters can respond faster to brokers, increase their quote volume, and capture more bound premiums. For carriers competing on both responsiveness and quality, this means winning more of the right business.

Same quality, every time: Every underwriter now has access to the same thorough research process. No more inconsistency based on who's handling the case or how much time they have to dig. Better decisions, whether it's one case or a thousand.

Improved portfolio performance: Sixfold’s comprehensive risk assessment, now with the added capability of public web research and connected third-party data, equips underwriters with a deeper understanding of each risk.

Contact Sixfold to see the Research Agent in action.

Sixfold's Approach to AI Fairness & Bias Testing

Life & Health

Transparency

As AI becomes more embedded in insurance underwriting, ensuring fairness is a shared responsibility across carriers, vendors, and regulators. Sixfold's commitment to responsible AI means continuously exploring new ways to evaluate bias.

5 min read

Alexis Clayman

As AI becomes more embedded in the insurance underwriting process, carriers, vendors, and regulators share a growing responsibility to ensure these systems remain fair and unbiased.

At Sixfold, our dedication to building responsible AI means regularly exploring new and thoughtful ways to evaluate fairness.¹

We sat down with Elly Millican, Responsible AI & Regulatory Research Expert, and Noah Grosshandler, Product Lead on Sixfold's Life & Health team, to discuss how Sixfold is approaching fairness testing in a new way.

Fairness As AI Systems Advance

Fairness in insurance underwriting isn’t a new concern, but testing for it in AI systems that don’t make binary decisions is.

At Sixfold, our Underwriting AI for life and health insurers doesn’t approve or deny applicants. Instead, it analyzes complex medical records and surfaces relevant information based on each insurer's unique risk appetite. This allows underwriters to work much more efficiently and focus their time on risk assessment, not document review.

While that’s a win for underwriters, it complicates fairness testing. When your AI produces qualitative outputs such as facts and summaries, rather than scores and decisions, most traditional fairness metrics won’t work. Testing for fairness in this context requires an alternative approach.

“The academic work around fairness testing is very focused on traditional predictive models, however Sixfold is doing document analysis,” explains Millican. “We needed to develop new methodologies for fairness testing that reflect how Sixfold works.”

“Even selecting which facts to pull and highlight from medical records in the first place comes with the opportunity to introduce bias. We believe it’s our responsibility to test for and mitigate that,” Grosshandler adds.

While regulations prohibit discrimination in underwriting, they rarely spell out how to measure fairness in systems like Sixfold’s. That ambiguity has opened the door for innovation, and for Sixfold to take initiative on shaping best practices and contributing to the regulatory conversation.

A New Testing Methodology

To address the challenge of fairness testing in a system with no binary outcomes, Sixfold is developing a methodology rooted in counterfactual fairness testing. The idea is simple: hold everything constant except for a single demographic attribute and see if and how the AI’s output changes.²

“We start with an ‘anchor’ case and create a ‘counterfactual twin’ who is identical in every way except for one detail, like race or gender. Then we run both through our pipeline to see if the medical information that’s presented in Sixfold varies in a notable or concerning way,” Millican explains.

“Ultimately we want to validate that medically similar cases are treated the same when their demographic attributes differ,” Grosshandler states.

Proof-of-Concept

For the initial proof-of-concept, the team is focused on two key dimensions of Sixfold’s Life & Health pipeline.

1. Fact Extraction Consistency
Does Sixfold extract the same facts from medically identical underwriting case records that differ only in one protected attribute?

2. Summary Framing and Content Consistency
Does Sixfold produce diagnosis summaries with equivalent clinical content and emphasis for medically identical underwriting cases?

“It’s not just about missing or added facts, sometimes it’s a shift in tone or emphasis that could change how a case is perceived,” Millican explains. “We want to be sure that if demographic details are influencing outputs, it’s only when clinically appropriate. Otherwise, we risk surfacing irrelevant information that could skew decisions.”
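The anchor-and-twin idea, applied to fact extraction consistency, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the case is modeled as a plain dictionary, and `toy_extractor` is a stand-in for a real extraction pipeline, not Sixfold's system. It does not attempt the harder second dimension, comparing tone and emphasis in summaries.

```python
def make_twin(anchor, attribute, new_value):
    """Copy the anchor case, changing only one demographic attribute."""
    twin = dict(anchor)
    twin[attribute] = new_value
    return twin


def compare_extractions(extract_facts, anchor, twin):
    """Run both cases through the extractor and report any differences."""
    anchor_facts = set(extract_facts(anchor))
    twin_facts = set(extract_facts(twin))
    return {
        "consistent": anchor_facts == twin_facts,
        "only_in_anchor": sorted(anchor_facts - twin_facts),
        "only_in_twin": sorted(twin_facts - anchor_facts),
    }


def toy_extractor(case):
    """Stand-in for a real pipeline: reads only the medical fields."""
    return [f"diagnosis: {d}" for d in case["diagnoses"]] + [
        f"medication: {m}" for m in case["medications"]
    ]


# Hypothetical anchor case, for illustration only.
anchor_case = {
    "gender": "male",
    "diagnoses": ["gastritis"],
    "medications": ["pantoprazole"],
}
twin_case = make_twin(anchor_case, "gender", "female")
report = compare_extractions(toy_extractor, anchor_case, twin_case)
```

Any entry in `only_in_anchor` or `only_in_twin` marks a fact that appeared for one twin but not the other, which is the kind of difference a reviewer would then judge as clinically appropriate or not.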

Expanding the Scope

While the team’s current focus is on foundational fairness markers (race and gender), the methodology is designed to evolve. Future testing will likely explore proxy variables such as ZIP codes, names, and socioeconomic indicators, which might implicitly shape model behavior.

“We want to get into cases where the demographic signal isn’t explicit, but the model might still infer something. Names, locations, insurance types, all of these could serve as proxies that unintentionally influence outcomes,” Millican elaborates.

The team is also thinking ahead to version control for prompts and model updates, ensuring fairness testing keeps pace with an evolving AI stack.

“We’re trying to define what fairness means for a new kind of AI system,” explains Millican. “One that doesn’t give a single output, but shapes what people see, read, and decide.”

Sixfold isn’t just testing for fairness in isolation; it’s aiming to contribute to a broader conversation on how LLMs should be evaluated in high-stakes contexts like insurance, healthcare, finance, and more.

That’s why Sixfold is proactively bringing this work to the attention of regulatory bodies. By doing so, we hope to support ongoing standards development in the industry and help others build responsible and transparent AI systems.

“This work isn’t just about evaluating Sixfold, it’s about setting new standards for a new category of AI. Regulators are still figuring this out, so we’re taking the opportunity to contribute to the conversation and help shape how fairness is monitored in systems like ours,” Grosshandler concludes.

Positive Regulatory Feedback

When we recently walked through our testing methodology and results with a group of regulators focused on AI and data, the feedback was both thoughtful and encouraging. They didn’t shy away from the complexity, but they clearly saw the value in what we’re doing.

“The fact that it’s hard shouldn’t be a reason not to try. What you’re doing makes sense... You’re scrutinizing something that matters,” said one senior policy advisor.

One of the key themes that came up during the meeting was the unique nature of generative AI, and why it demands a different kind of oversight. As one senior actuary and behavioral data scientist put it: “Large language models are more qualitative than quantitative... A lot of technical folks don’t really get qualitative. They’re used to numbers. The more you can explain how you test the language for accuracy, the more attention it will get.”

That comment really resonated. It reflects the heart of our approach: we’re not just tracking metrics. We’re evaluating how language evolves, how facts can shift, and how risk is framed and communicated depending on the inputs.

The Road Ahead

Fairness in AI isn’t a fixed destination; it’s an ongoing commitment. Sixfold’s work in developing and refining fairness and bias testing methodologies reflects that mindset.

As more organizations turn to LLMs to analyze and interpret sensitive information, the need for thoughtful, domain-specific fairness methods will only grow. At Sixfold, we’re proud to be at the forefront of that work.

Footnotes

¹While internal reviews have not surfaced evidence of systemic bias, Sixfold is committed to continuous testing and transparency to ensure that remains the case as we expand and refine our AI systems.

²To ensure accuracy, cases involving medically relevant demographic traits, like pregnancy in a gender-flipped case, are filtered out. The methodology is designed to isolate unfair influence, not obscure legitimate medical distinctions.