HyperAI超神经

4 months ago

Scalable Diffusion Models with Transformers

William Peebles Saining Xie

Abstract

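We explore a new class of diffusion models based on the Transformer architecture. We train latent diffusion models of images, replacing the commonly used U-Net backbone with a Transformer that operates on latent patches. We analyze the scalability of the proposed Diffusion Transformers (DiTs) through the lens of forward-pass complexity, measured in Gflops. We find that DiTs with higher Gflops, whether obtained by increasing Transformer depth or width or by increasing the number of input tokens, consistently achieve lower FID. In addition to scaling well, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512×512 and 256×256 benchmarks, reaching a state-of-the-art FID of 2.27 on the latter.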
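The abstract names the number of input tokens as one of the ways to raise Gflops. A minimal sketch of how that count could be computed follows. It rests on assumptions not stated on this page: that the latent is obtained by downsampling the image by a factor of 8 per side, and that the "/2" in DiT-XL/2 denotes a patch size of 2. The function name `num_tokens` and its parameters are hypothetical, chosen for illustration only.

```python
def num_tokens(image_size: int, patch_size: int, vae_downsample: int = 8) -> int:
    """Count the patch tokens a square image would yield.

    Assumptions (not from this page): the autoencoder shrinks each side by
    `vae_downsample`, and the latent is cut into non-overlapping square
    patches of side `patch_size`.
    """
    if image_size % vae_downsample != 0:
        raise ValueError("image_size must be divisible by vae_downsample")
    latent_size = image_size // vae_downsample
    if latent_size % patch_size != 0:
        raise ValueError("latent size must be divisible by patch_size")
    patches_per_side = latent_size // patch_size
    return patches_per_side ** 2


if __name__ == "__main__":
    # Halving the patch size quadruples the token count.
    for patch_size in (8, 4, 2):
        print(patch_size, num_tokens(256, patch_size))
```

Under these assumptions, a smaller patch size gives the Transformer a longer sequence to process for the same image, which is one route to the higher-Gflops models the abstract describes.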

Code repositories

VachanVY/diffusion-transformer: pytorch
senmaoy/RAT-Diffusion: pytorch, mentioned in GitHub
facebookresearch/DiT: official, pytorch, mentioned in GitHub
milmor/diffusion-transformer: pytorch, mentioned in GitHub
milmor/diffusion-transformer-keras: tf, mentioned in GitHub
FineDiffusion/FineDiffusion: pytorch, mentioned in GitHub
MindSpore-scientific/code-5/tree/main/Scalable-Sharpness-Aware-Minimization: mindspore
nyu-systems/grendel-gs: pytorch, mentioned in GitHub
pytorch, mentioned in GitHub
chuanyangjin/fast-dit: pytorch, mentioned in GitHub
mindspore-lab/mindone: mindspore
pytorch, mentioned in GitHub
huggingface/diffusers: jax

https://arxiv.org/abs/2305.00504

Benchmarks

Benchmark	Method	Metrics
image-generation-on-imagenet-256x256	DiT-XL/2	FID: 2.27
image-generation-on-imagenet-512x512	DiT-XL/2	FID: 3.04; Inception score: 240.82
