Mastodon Feed: Post

Mastodon Feed, Apr 23, 2026, 12:04 AM

Boosted by glyph ("Glyph"):
Grouchybeast ("The Grouchybeast") wrote:

@davidgerard The otherwise great paper on LLMs hallucinating images in image analysis benchmarking made me facepalm when they used ChatGPT to assess some of the output. WHAT ARE YOU DOING? LLMS MAKE THINGS UP WHEN ASKED TO ANALYSE DATA! THAT IS LITERALLY THE ENTIRE THESIS OF YOUR PAPER!