Researchers have developed MELD (Multi-Task Equilibrated Learning Detector), a new system for detecting AI-generated text that outperforms existing methods on robustness and accuracy metrics.
The detector addresses critical weaknesses in current AI text detection systems, which often fail when faced with adversarial attacks or content from unseen AI models. MELD uses a multi-task learning approach: alongside the main prediction of whether text is AI-generated, it also predicts the generator family, attack type, and source domain.
How MELD Works
Unlike traditional detectors that focus solely on distinguishing human from AI text, MELD enriches its training with auxiliary supervision tasks. The system attaches multiple prediction heads to a shared encoder and balances four different loss functions using learned uncertainty weights.
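The article does not give the exact weighting formula. As a rough illustration, the sketch below assumes one common formulation of learned uncertainty weighting, in which each task loss L_i is scaled by exp(-s_i) and a penalty s_i is added, where s_i is a learnable log-variance. The function names and the example loss values are hypothetical, not taken from MELD.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine task losses as sum(exp(-s_i) * L_i + s_i).

    Assumed formulation (homoscedastic uncertainty weighting); the
    article only says the weights are "learned uncertainty weights".
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def log_var_gradients(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i.

    d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    """
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


# Hypothetical losses for the four tasks: AI-vs-human detection,
# generator family, attack type, and source domain.
losses = [0.7, 1.2, 0.9, 1.1]
log_vars = [0.0, 0.0, 0.0, 0.0]  # exp(0) = 1, so every task starts with weight 1
total = uncertainty_weighted_loss(losses, log_vars)
grads = log_var_gradients(losses, log_vars)
```

Under this assumption, a task with a persistently large loss pushes its s_i upward, which lowers its weight exp(-s_i), so no single auxiliary task dominates the shared encoder.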
To improve robustness against attacks, MELD employs a teacher-student distillation framework. An exponential moving average teacher makes predictions on clean inputs while an attack-augmented student learns to match the teacher's outputs.
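A minimal sketch of the two moving parts described above, under stated assumptions: the teacher's parameters are taken to be an exponential moving average of the student's, and the student is trained to match the teacher with a KL divergence between class probabilities. The KL objective, the decay value, and all function names are illustrative choices, not details confirmed by the article.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ema_update(teacher_params, student_params, decay=0.999):
    """Move each teacher parameter a small step toward the student's value."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]


def distillation_loss(teacher_logits_clean, student_logits_attacked):
    """Penalize the student (on attacked text) for diverging from the
    teacher's prediction on the clean version of the same text."""
    return kl_divergence(softmax(teacher_logits_clean),
                         softmax(student_logits_attacked))


# Hypothetical logits for [human, AI] on one example.
teacher_clean = [0.2, 2.5]     # teacher sees the original text
student_attacked = [1.1, 0.9]  # student sees a perturbed copy
loss = distillation_loss(teacher_clean, student_attacked)
```

In a training loop following this sketch, only the student would receive gradients; the teacher would be refreshed with ema_update after each optimizer step.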
The system also uses a hard-negative pairwise ranking loss to increase the score margin between AI-generated texts and the most confusable human texts. At inference time, all auxiliary heads are discarded, making MELD as efficient as standard detectors.
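The ranking idea can be sketched as follows, assuming a hinge-style margin loss in which the hard negative is simply the highest-scoring human text in the batch. The article does not specify how hard negatives are mined or what form the loss takes, so the margin value, the mining rule, and the function name are all assumptions.

```python
def hard_negative_ranking_loss(ai_scores, human_scores, margin=1.0):
    """Mean hinge loss between each AI score and the hardest human score.

    The "hardest" human text is assumed to be the one the detector
    scores highest, i.e. the one most likely to be mistaken for AI.
    """
    if not ai_scores or not human_scores:
        raise ValueError("need at least one AI score and one human score")
    hardest_human = max(human_scores)
    # Loss is zero once an AI score exceeds the hardest human score by `margin`.
    per_example = [max(0.0, margin - (s - hardest_human)) for s in ai_scores]
    return sum(per_example) / len(per_example)


# Hypothetical detector scores (higher means "more likely AI").
ai_scores = [2.0, 0.5]
human_scores = [0.1, 0.8]
loss = hard_negative_ranking_loss(ai_scores, human_scores)
```

Here the first AI text already clears the hardest human score by more than the margin and contributes nothing, while the second falls below it and accounts for the whole loss.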
Performance Results
On the public RAID leaderboard, MELD ranks as the strongest open-source detector and competes with leading commercial models, particularly under attack conditions and at low false-positive rates.
The researchers tested MELD on a new evaluation dataset called MELD-eval, built from recent chat models released by four major LLM providers, including OpenAI and Anthropic. Without additional fine-tuning, MELD achieved a 99.9% true positive rate at a 1% false positive rate.
Many baseline detectors showed sharp performance degradation on the same test set, highlighting the challenge of detecting text from newer AI models.
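For readers unfamiliar with the metric quoted above, true positive rate at a fixed false positive rate can be computed roughly as follows: pick the score threshold so that at most the target fraction of human texts is flagged, then measure the fraction of AI texts above that threshold. This is a generic sketch with made-up scores, not the evaluation code used for MELD-eval.

```python
import math


def tpr_at_fpr(human_scores, ai_scores, target_fpr=0.01):
    """Return (tpr, threshold) with the false positive rate held at or
    below `target_fpr`. A text is flagged as AI when score > threshold."""
    if not human_scores or not ai_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(human_scores, reverse=True)
    # Number of human texts we are allowed to misclassify.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    allowed = min(allowed, len(ranked) - 1)
    # Thresholding at the (allowed + 1)-th highest human score means at
    # most `allowed` human texts score strictly above it.
    threshold = ranked[allowed]
    tpr = sum(1 for s in ai_scores if s > threshold) / len(ai_scores)
    return tpr, threshold


# Made-up scores: 100 human texts spread over [0, 0.99], four AI texts.
human = [i / 100 for i in range(100)]
ai = [0.5, 0.985, 0.995, 1.5]
tpr, threshold = tpr_at_fpr(human, ai, target_fpr=0.01)
```

Reporting at a low false positive rate matters because in settings such as academic integrity, wrongly flagging human writing is usually the costlier error.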
The research addresses growing concerns about AI-generated content in academic integrity, content moderation, and provenance tracking as large language models become embedded in everyday writing workflows.