NHacker Next
▲Decoupled DiLoCo: Resilient, Distributed AI Training at Scale (deepmind.google)
This paper proposes a work-partitioning scheme that removes a constraint which makes parallelizing AI training inefficient. Work partitioning itself isn't a novel idea, but this particular scheme is.
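For readers unfamiliar with DiLoCo, the rough shape of the training loop is sketched below: workers take many local optimizer steps between infrequent synchronization rounds, which is what makes the partitioning tolerant of slow or unreliable links between nodes. This is a toy illustration of the generic DiLoCo recipe only, not the paper's decoupled variant; the worker count, learning rates, and quadratic objective are assumptions chosen for brevity.

```python
# Minimal sketch of a DiLoCo-style outer loop, assuming the standard recipe
# (many local steps, then a rare outer sync). All hyperparameters and the toy
# objective are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
dim, num_workers = 10, 4
outer_rounds, inner_steps = 20, 50
inner_lr, outer_lr, beta = 0.05, 0.7, 0.9

# Each worker sees a different data shard, modeled as a different target.
targets = rng.normal(size=(num_workers, dim))

def inner_grad(params, target):
    # Gradient of the toy loss 0.5 * ||params - target||^2
    return params - target

global_params = np.zeros(dim)
outer_momentum = np.zeros(dim)

for _ in range(outer_rounds):
    deltas = []
    for w in range(num_workers):
        # Each worker trains independently from the same starting point;
        # no communication happens during these inner steps.
        local = global_params.copy()
        for _ in range(inner_steps):
            local -= inner_lr * inner_grad(local, targets[w])
        deltas.append(global_params - local)  # the "pseudo-gradient"
    # Infrequent sync: average the pseudo-gradients and apply an outer
    # optimizer step (plain momentum here; DiLoCo uses Nesterov momentum).
    avg_delta = np.mean(deltas, axis=0)
    outer_momentum = beta * outer_momentum + avg_delta
    global_params -= outer_lr * outer_momentum

print("distance to mean target:", np.linalg.norm(global_params - targets.mean(axis=0)))
```

The point of the structure is that the expensive cross-worker communication happens once per outer round rather than once per gradient step, which is the constraint the comment above alludes to.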