dysfun@treehouse.systems ("gaytabase") wrote:
on the other hand, my token rate is much slower when i do it through llama.cpp's ui myself. presumably this is because they have filled the context buffer with some instructions for agents to ignore.