The IrokoBench results, published at NAACL 2025 and now widely used to measure African language performance, reveal a gap that the industry has not addressed quickly enough. GPT-4o scores 72.5% on English tasks but only 48.1% on African languages, a drop of 24.4 points. LLaMA 3 (70B) fares even worse, averaging just 25.5%, which is 45 points lower than […]

