A recent study by researchers at Cornell, the University of Washington, the University of Waterloo, and the Allen Institute for AI (AI2) reveals that generative AI models such as OpenAI’s GPT-4o, Meta’s Llama 3, and Google’s Gemini still frequently produce hallucinations, or false information. The research benchmarked a range of models against authoritative sources on diverse topics including law, health, and geography. The results showed that no model excelled across all areas, and models with fewer hallucinations often achieved that by declining to answer the more difficult questions.
Despite recent advances, even the best models generated accurate, hallucination-free text only about 35% of the time. The study underscores the need for continued research to reduce AI misinformation, pointing to human-in-the-loop fact-checking and the development of more advanced automated fact-checking tools.
Source: TechCrunch