Mastodon Feed: Post


jsonstein@masto.deoan.org ("Jeff Sonstein") wrote:

the quality improvement from using gpt-4o or gpt-4o-mini instead of earlier models (like 3.5-turbo) is startlingly discernible, but sadly so is the latency degradation. 4o-mini is way cheaper than 4o, and the latency difference between the two is humanly insignificant for these purposes.

this is indeed a harsh set of tradeoffs, and I may need to look at other LLMs… 3.5-turbo responses are just far too low-quality.
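When weighing tradeoffs like these, it helps to measure latency rather than eyeball it. Below is a minimal sketch of a timing harness: `fake_model_call` is a hypothetical stand-in (an assumption, not a real API) that you would replace with an actual chat-completion call, and the percentile summary is plain stdlib Python.

```python
import statistics
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def latency_summary(samples):
    """Median and p95 latency from a list of per-call timings (seconds)."""
    ordered = sorted(samples)
    p95_index = max(0, int(round(0.95 * (len(ordered) - 1))))
    return {"median": statistics.median(ordered), "p95": ordered[p95_index]}

# Hypothetical stand-in for a model call; swap in a real client request here.
def fake_model_call(model):
    time.sleep(0.01)  # placeholder for network + inference time
    return f"response from {model}"

samples = [time_call(fake_model_call, "gpt-4o-mini")[1] for _ in range(20)]
print(latency_summary(samples))
```

Comparing median and p95 across models over a few dozen identical prompts gives a much firmer basis for the cost/latency/quality call than single-request impressions.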