bcantrill ("Bryan Cantrill") wrote:
We hit a new (and disturbing!) failure mode recently when a production rack that had been up for several months saw every (!) compute sled's service processor become simultaneously unresponsive. On today's episode, @ahl and I will be joined by the members of the Oxide team who debugged the vexing issue -- and @cliffle will reveal how he reached its surprising root cause. Join us at 5p Pacific for a wild tale of death by uptime!