Bytedance has just rolled out Seed Diffusion Preview, a new AI model that generates code tokens in parallel rather than one by one. If you’ve ever battled with slow, sequential code generation, this approach—pushing up to 2,146 tokens per second on Nvidia H20 GPUs—will feel refreshingly fast.
The model relies on a “discrete-state diffusion” technique, adapting an approach normally used on continuous data such as images to discrete text and code tokens. Instead of handling tokens one after another, it reconstructs code from a placeholder-filled sequence, allowing multiple sections to be generated simultaneously.
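As a rough sketch of how mask-based parallel decoding of this kind can work (not Seed Diffusion's actual implementation), the loop below starts from a fully masked sequence and, at each step, commits the most confident predictions of a stand-in denoiser in parallel. MASK_ID, the vocabulary size, and the step schedule are illustrative assumptions.

```python
import torch

MASK_ID = 0          # placeholder token id (assumption)
VOCAB_SIZE = 32000   # illustrative vocabulary size
SEQ_LEN = 64
NUM_STEPS = 8        # far fewer steps than tokens, which is where the parallelism comes from

def denoiser(tokens: torch.Tensor) -> torch.Tensor:
    """Stand-in for the trained transformer: returns logits of shape [seq_len, vocab]."""
    return torch.randn(tokens.shape[0], VOCAB_SIZE)

def sample() -> torch.Tensor:
    # Start from a fully masked (placeholder-filled) sequence.
    tokens = torch.full((SEQ_LEN,), MASK_ID, dtype=torch.long)
    for step in range(NUM_STEPS):
        masked = tokens == MASK_ID
        if not masked.any():
            break
        logits = denoiser(tokens)
        confidence, candidates = logits.softmax(dim=-1).max(dim=-1)
        confidence[~masked] = -1.0   # never overwrite already-committed tokens
        # Commit a batch of the most confident still-masked positions in parallel.
        k = max(1, int(masked.sum()) // (NUM_STEPS - step))
        chosen = confidence.topk(k).indices
        tokens[chosen] = candidates[chosen]
    return tokens

print(sample())
```

Because each forward pass can fill in many positions at once, the cost scales with the number of denoising steps rather than the number of tokens, which is where the claimed speedup over token-by-token decoding comes from.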
Quality isn’t compromised, either. Benchmark tests indicate that Seed Diffusion Preview rivals, and in some cases outperforms, comparable models, particularly on code editing. The two-stage training process helps: the first stage uses mask-based training, while a follow-up edit-based phase, built on token insertions and deletions, pushes the model to re-evaluate every token rather than trusting the ones already in place.
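To make the two stages concrete, here is a minimal sketch of what the two corruption styles might look like when building (corrupted, original) training pairs; the rates, MASK_ID, and vocabulary size are illustrative assumptions rather than values from the paper.

```python
import random

MASK_ID = 0
VOCAB_SIZE = 32000

def mask_corrupt(tokens: list[int], rate: float = 0.5) -> list[int]:
    """Stage 1: replace a random subset of tokens with the placeholder id."""
    return [MASK_ID if random.random() < rate else t for t in tokens]

def edit_corrupt(tokens: list[int], rate: float = 0.15) -> list[int]:
    """Stage 2: random insertions and deletions, so the model must re-check
    every position instead of trusting anything that is not masked."""
    out: list[int] = []
    for t in tokens:
        r = random.random()
        if r < rate / 2:
            continue                                      # deletion
        if r < rate:
            out.append(random.randrange(1, VOCAB_SIZE))   # insertion of a random token
        out.append(t)
    return out

original = [5, 8, 13, 21, 34]
print(mask_corrupt(original))   # e.g. [0, 8, 0, 21, 0]
print(edit_corrupt(original))   # e.g. [5, 8, 21, 34], possibly with an extra token spliced in
```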
The architecture also respects code structure by optimising the generation order, ensuring, for example, that variables are declared before they are used. To learn such orderings, the transformer-based model is trained on a curated dataset of high-quality generation sequences, opening up a fresh avenue for rapid, reliable code production.
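As an illustration of the kind of ordering constraint described above, a simple declare-before-use check over a proposed generation order could look like the following; the dependency map and function are hypothetical, for exposition only.

```python
def respects_declare_before_use(order: list[int], deps: dict[int, int]) -> bool:
    """order: sequence positions listed in the order they are generated.
    deps: maps a use-site position to the position of its declaration."""
    revealed_at = {pos: step for step, pos in enumerate(order)}
    return all(revealed_at[decl] <= revealed_at[use] for use, decl in deps.items())

# Example: position 3 uses a variable declared at position 1.
print(respects_declare_before_use([0, 1, 2, 3], {3: 1}))  # True
print(respects_declare_before_use([3, 0, 1, 2], {3: 1}))  # False: use generated before declaration
```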
Implementing parallel decoding is no small feat, given the computational cost of diffusion models. Bytedance tackled the challenge with on-policy learning, training the model to reach good output in fewer denoising steps, while a verification model vets the results to keep quality in check.
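One way to picture that combination, sketched under heavy assumptions: sample a generation from the current model, let the verifier accept or reject it, and reward accepted generations that used fewer denoising steps. The classes and reward shaping below are stand-ins, not Bytedance's published objective.

```python
import random

class StubModel:
    def generate(self, prompt: str, max_steps: int) -> tuple[str, int]:
        steps = random.randint(1, max_steps)
        return f"code for {prompt}", steps        # (candidate code, denoising steps used)
    def reinforce(self, sample: str, reward: float) -> None:
        pass                                      # placeholder for a policy update

class StubVerifier:
    def accepts(self, prompt: str, code: str) -> bool:
        return True                               # e.g. run tests or a learned quality scorer

def train_step(model, verifier, prompt: str, max_steps: int = 16) -> float:
    code, steps = model.generate(prompt, max_steps)
    if not verifier.accepts(prompt, code):
        return 0.0                                # rejected samples earn no reward
    reward = 1.0 - steps / max_steps              # fewer denoising steps -> higher reward
    model.reinforce(code, reward)
    return reward

print(train_step(StubModel(), StubVerifier(), "reverse a linked list"))
```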
The release is Bytedance's strategic response to Google's Gemini Diffusion, which targets similar code generation tasks. Ongoing research focuses on scaling the approach and adapting it to more complex reasoning tasks, and a demo is available for anyone keen to see the model in action.