Mastodon Feed: Post

i'm trying a qwen3.5 model. it seems to be a bit much for my gpu - 0.2 tokens per second and two minutes to respond to "say hello".