We present RobusTok, a new image tokenizer with a two-stage training scheme: Main training → constructs a robust latent space. Post-training → aligns the generator’s latent distribution with its image ...
This repo implements UniTok, a unified visual tokenizer well-suited for both generation and understanding tasks. It is compatiable with autoregressive generative models (e.g. LlamaGen), multimodal ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results