Mastodon Feed: Post

Mastodon Feed, Nov 19, 2024, 2:10 AM

Reblogged by jsonstein@masto.deoan.org ("Jeff Sonstein"):

hkrn@mstdn.social ("Hacker News") wrote:

Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference
L: https://cerebras.ai/blog/llama-405b-inference
C: https://news.ycombinator.com/item?id=42178761
posted on 2024.11.18 at 19:15:04 (c=0, p=5)