Autoregressive Text-to-Visual Generation via Hybrid Architecture

A hybrid architecture that combines Mamba and Transformer for autoregressive text-to-visual generation.
