This page contains press release content distributed by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AIM Intelligence and BMW Group Examine Gaps in Evaluating Enterprise AI Policy Compliance

Research reveals LLMs follow allowlist policies but systematically fail to enforce organizational prohibitions, exposing a critical gap in enterprise AI safety

SF, CA, UNITED STATES, February 12, 2026 /EINPresswire.com/ — Seoul, South Korea / Munich, Germany – January 2026 – BMW Group and AIM Intelligence, a leading AI safety startup, today announced the publication of COMPASS (Company/Organization Policy Alignment Assessment), the first systematic framework for evaluating whether large language models (LLMs) comply with organization-specific policies. The research, now available on arXiv, reveals a critical gap that remains under-measured in current evaluation practices: models that pass standard safety benchmarks often fail dramatically when enforcing the nuanced, context-dependent rules that govern real-world business operations.

Why Enterprise AI Policies Break Down in Practice

As organizations across healthcare, finance, automotive, and government sectors rapidly adopt LLMs for customer-facing applications, the research team discovered a fundamental asymmetry that poses significant risks for policy-critical deployments.

Key Findings:
Strong Allowlist Compliance: Models reliably handle legitimate requests with over 95% accuracy
Critical Denylist Failures: Models fail to correctly refuse prohibited requests in up to 97% of cases
Catastrophic Adversarial Vulnerability: Under adversarial conditions, some models refuse fewer than 5% of policy-violating requests

“Most AI safety tests focus on whether a model behaves safely in general,” said Dasol Choi, AI Safety Researcher at AIM Intelligence. “COMPASS looks at a more practical question: can an AI system reliably follow the specific rules of an organization? Our findings show that, in many real-world deployments today, the answer is often no.”
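
To make the allowlist/denylist asymmetry concrete, the following is a minimal Python sketch of how such rates could be computed from labeled evaluation records. It is not the COMPASS implementation: the `Record` fields, the `"allow"`/`"deny"` labels, and the toy records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Record:
    """One evaluated query (hypothetical schema, not taken from COMPASS)."""
    policy_type: str     # "allow" = permitted request, "deny" = prohibited request
    adversarial: bool    # True if the query was adversarially rephrased
    model_refused: bool  # True if the model declined to answer


def is_correct(record: Record) -> bool:
    # Permitted requests should be answered; prohibited requests should be refused.
    return record.model_refused == (record.policy_type == "deny")


def accuracy(records: Iterable[Record], policy_type: str,
             adversarial: Optional[bool] = None) -> Optional[float]:
    """Share of correctly handled queries of one policy type.

    Pass adversarial=True/False to restrict to that subset.
    Returns None when no record matches the filter.
    """
    subset = [
        r for r in records
        if r.policy_type == policy_type
        and (adversarial is None or r.adversarial == adversarial)
    ]
    if not subset:
        return None
    return sum(is_correct(r) for r in subset) / len(subset)


# Toy records for illustration only; these are not results from the paper.
toy_records = [
    Record("allow", False, False),
    Record("allow", False, False),
    Record("deny", False, True),
    Record("deny", False, False),
    Record("deny", True, False),
    Record("deny", True, False),
]

allow_accuracy = accuracy(toy_records, "allow")                  # 1.0
deny_accuracy = accuracy(toy_records, "deny")                    # 0.25
adversarial_deny = accuracy(toy_records, "deny", adversarial=True)  # 0.0
```

Reporting the permitted and prohibited subsets separately is what exposes the gap: a single blended accuracy over all six toy records would hide that every adversarial prohibited request was answered.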

Why Generic AI Safety Isn’t Enough

The research addresses a critical disconnect between how AI systems are evaluated and how they are deployed. While existing safety benchmarks focus on universal harms such as toxicity and violence, real enterprises operate under complex internal policies: compliance manuals, operational playbooks, legal edge cases, and brand-specific constraints.

COMPASS evaluates models across four dimensions that typical benchmarks ignore:
1. Policy Selection: Can the model identify which policy applies to a given situation?
2. Policy Interpretation: Can it reason through conditionals, exceptions, and vague clauses?
3. Conflict Resolution: When rules collide, does the model resolve conflicts as the organization intends?
4. Justification: Can the model ground its decisions in actual policy text?
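
One way to picture the four dimensions is as a per-response rubric. The Python sketch below is an assumption made for illustration: the release does not describe how COMPASS scores each dimension, so the `DimensionScores` class, its boolean fields, and the helper functions are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class DimensionScores:
    """Pass/fail marks for one model response (hypothetical rubric)."""
    policy_selection: bool       # identified the policy that applies
    policy_interpretation: bool  # handled conditionals, exceptions, vague clauses
    conflict_resolution: bool    # resolved colliding rules as the organization intends
    justification: bool          # grounded the decision in actual policy text


def failed_dimensions(scores: DimensionScores) -> List[str]:
    """Names of the dimensions this response failed, in declaration order."""
    return [f.name for f in fields(scores) if not getattr(scores, f.name)]


def pass_rates(all_scores: Sequence[DimensionScores]) -> Dict[str, float]:
    """Fraction of responses that passed each dimension."""
    if not all_scores:
        raise ValueError("need at least one scored response")
    total = len(all_scores)
    return {
        f.name: sum(getattr(s, f.name) for s in all_scores) / total
        for f in fields(DimensionScores)
    }


# Toy scores for illustration only; these are not results from the paper.
toy_scores = [
    DimensionScores(True, True, True, True),
    DimensionScores(True, False, True, False),
]

second_failures = failed_dimensions(toy_scores[1])
rates = pass_rates(toy_scores)
```

Keeping the dimensions as separate fields, rather than one overall score, makes it possible to see whether a model picks the right policy but misreads it, or reaches the right decision without citing the policy text.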

“Our evaluation revealed a striking asymmetry,” noted DongGeon Lee, AI Safety Researcher at AIM Intelligence. “While models achieve near-perfect accuracy on what they can do, they remain structurally vulnerable in enforcing what they must not do. This gap persists across model scales and architectures, indicating that scaling alone cannot solve the problem.”

Industry-Scale Validation

The research team applied COMPASS across eight diverse industry scenarios—Automotive, Government, Financial, Healthcare, Travel, Telecom, Education, and Recruiting—generating and validating 5,920 queries that test both routine compliance and adversarial robustness. Fifteen state-of-the-art models were evaluated, including leading proprietary and open-source systems.

Making Misalignment Measurable

Perhaps the most significant contribution of COMPASS is transforming alignment from a philosophical concern into an engineering problem. The framework and benchmark datasets are publicly available on GitHub and Hugging Face, enabling organizations to evaluate their AI systems against their own policies.

About the Research Collaboration

This research represents a collaboration between AIM Intelligence, BMW Group, Yonsei University, Pohang University of Science and Technology, and Seoul National University. The full paper, “COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs,” is available at https://arxiv.org/abs/2601.01836.

About AIM Intelligence

AIM Intelligence is a Seoul-based AI safety company specializing in automated red-teaming, real-time guardrails, and AI monitoring solutions. Founded in 2024, AIM Intelligence serves major enterprises and conducts research across large language models, multimodal systems, autonomous agents, and emerging physical AI. The company has published over 15 research papers at top-tier venues, including ICML, ACL, NeurIPS, and IEEE conferences.

Team Cookie Official
Team Cookie

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the author above.

Information contained on this page is provided by an independent third-party content provider. XPRMedia and this Site make no warranties or representations in connection therewith. If you are affiliated with this page and would like it removed, please contact pressreleases@xpr.media.
