The study compares the quality and accuracy of portfolio feedback generated by GPT-4o and Claude Sonnet 4 (via Amazon Bedrock) in the context of Qpercom's digital assessment tools for high-stakes clinical assessments. It analyzes both preview feedback (for examiners) and feedback delivered directly to students, evaluating how well each model identifies different levels of student performance. The findings assess the safety, constructiveness, and educational value of the AI-generated feedback.
AI-generated feedback on student portfolios from GPT-4o and Claude Sonnet 4 shows promise for high-stakes clinical assessments, but careful evaluation is needed to ensure accuracy and educational value.
This report provides an in-depth comparative analysis of AI-generated portfolio feedback delivered through two leading large language model (LLM) platforms: GPT-4o (OpenAI) and Claude Sonnet 4 (Anthropic, via Amazon Bedrock). The feedback was analyzed in two distinct stages: preview feedback, which serves as a safety and verification layer for examiners and administrators, and portfolio feedback, which is delivered directly to students. These systems are integral to Qpercom's digital assessment tools and support high-stakes clinical assessments such as Objective Structured Clinical Examinations (OSCEs), high-stakes recruitment using Multiple Mini Interviews (MMIs), and Video Interviewing and Digital Scoring (VIDS). This evaluation examines how accurately each model reflects students' actual high-, mid-, and underperformance, and whether each model's feedback provides safe, constructive, and educationally valuable input.