TorchGAN: A Flexible Framework for GAN Training and Evaluation

TorchGAN is a PyTorch-based framework for writing succinct and comprehensible code for training and evaluating Generative Adversarial Networks. The framework's modular design allows straightforward customization of the model architecture, loss functions, training paradigms, and evaluation metrics. The key features of TorchGAN are its extensibility, built-in support for a large number of popular models, losses, and evaluation metrics, and near-zero overhead compared to vanilla PyTorch. By using the framework to implement several popular GAN models, we demonstrate its extensibility and ease of use. We also benchmark the training time of our framework for these models against the corresponding baseline PyTorch implementations and observe that TorchGAN's features incur almost zero overhead.


Introduction
Generative Adversarial Networks (GANs) [1] are a class of deep generative models that formulate the model estimation problem as an adversarial game between two neural networks: a Generator, which represents an implicit generative distribution, and a Discriminator, which differentiates between samples from that implicit distribution and samples from the true data distribution. The implicit distribution recovers the data distribution when the game reaches equilibrium. GANs are among the most popular approaches for generative modeling and unsupervised learning in Computer Vision, with diverse applications such as photo-realistic image generation [2], [3], image super-resolution [4], image-to-image translation [5], and video generation [6], [7]. They have also found applications in domains such as Natural Language Processing [8] and Time Series Analysis [9].
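For reference, the adversarial game described above is commonly written as the minimax problem from [1]. The symbols used here (G for the generator, D for the discriminator, p_data for the data distribution, and p_z for the noise prior) follow that formulation and are not defined elsewhere in this text.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

At the equilibrium of this game, the distribution of G(z) matches p_data and the optimal discriminator outputs 1/2 everywhere.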
GANs generally share a standard design paradigm: the building blocks are one or more generator and discriminator models and the associated loss functions for training them. TorchGAN exploits this design similarity by exposing a simple API for customizing these blocks. The interaction between these components at training time is handled by a trainer that automatically adapts to user-defined GAN models and losses. TorchGAN provides an extensive and continually expanding collection of popular GAN models, losses, evaluation metrics, and stability-enhancing features, which can be used off the shelf or extended and combined to design more sophisticated models. With these design principles in mind, we aim to improve upon existing GAN training frameworks such as TFGAN [10], HyperGAN [11], and IBM GAN-Toolkit [12] in extensibility, richness of the feature set, and documentation.

Implementing Models in TorchGAN
The core of the TorchGAN framework is a highly versatile trainer module, which is responsible for the framework's flexibility and ease of use. The trainer requires a specification of the generator and discriminator architectures together with the optimizer associated with each of them, represented as a dictionary, as well as the list of associated loss functions and, optionally, evaluation metrics. We provide an illustrative example for training DCGAN on CIFAR-10 in Figure 1. Users can either choose from the built-in implementations of popular GAN models, losses, and metrics, or define custom variants with minimal effort by extending the appropriate base classes. This extensibility is particularly useful in research applications, where the user only needs to write code for the model architecture and/or the loss function; the trainer automatically handles the intricacies of training with custom models and losses. The trainer also supports multiple generators and discriminators, allowing the training of more sophisticated models such as Generative Multi-Adversarial Networks [14]. Performance visualization is handled by a customizable Logger object, which, apart from console logging, currently supports the TensorBoard and Visdom backends.
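The dictionary-driven design described above can be illustrated with a minimal plain-PyTorch sketch. This is not TorchGAN's actual API: the class `SketchTrainer`, the toy models, the loss functions, and the dictionary keys (`"name"`, `"args"`, `"optimizer"`) are all hypothetical names chosen for illustration. The sketch only shows how a trainer might build models and optimizers from a specification dictionary and apply a list of losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyGenerator(nn.Module):
    """Hypothetical generator mapping noise vectors to 2-D samples."""

    def __init__(self, noise_dim=8, out_dim=2):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 16), nn.ReLU(), nn.Linear(16, out_dim)
        )

    def forward(self, z):
        return self.net(z)


class ToyDiscriminator(nn.Module):
    """Hypothetical discriminator returning one raw logit per sample."""

    def __init__(self, in_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1)
        )

    def forward(self, x):
        return self.net(x)


def discriminator_loss(d_real, d_fake):
    # Standard GAN discriminator loss on logits: real -> 1, fake -> 0.
    real_term = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
    fake_term = F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    return real_term + fake_term


def generator_loss(d_fake):
    # Non-saturating generator loss: push fake logits toward the "real" label.
    return F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))


class SketchTrainer:
    """Illustrative trainer (assumed design, not TorchGAN's Trainer)."""

    def __init__(self, models, generator_losses, discriminator_losses):
        self.nets, self.optims = {}, {}
        # Instantiate each model and its optimizer from the specification.
        for key, spec in models.items():
            net = spec["name"](**spec.get("args", {}))
            opt_spec = spec["optimizer"]
            opt = opt_spec["name"](net.parameters(), **opt_spec.get("args", {}))
            self.nets[key], self.optims[key] = net, opt
        self.generator_losses = generator_losses
        self.discriminator_losses = discriminator_losses

    def train_step(self, real):
        gen, disc = self.nets["generator"], self.nets["discriminator"]
        z = torch.randn(real.size(0), gen.noise_dim)

        # Discriminator update; fake samples are detached from the generator.
        self.optims["discriminator"].zero_grad()
        fake = gen(z).detach()
        d_loss = sum(fn(disc(real), disc(fake)) for fn in self.discriminator_losses)
        d_loss.backward()
        self.optims["discriminator"].step()

        # Generator update; gradients that reach the discriminator here are
        # cleared by zero_grad() at the start of the next discriminator update.
        self.optims["generator"].zero_grad()
        g_loss = sum(fn(disc(gen(z))) for fn in self.generator_losses)
        g_loss.backward()
        self.optims["generator"].step()
        return d_loss.item(), g_loss.item()


adam_args = {"lr": 2e-4, "betas": (0.5, 0.999)}
models = {
    "generator": {
        "name": ToyGenerator,
        "args": {"noise_dim": 8},
        "optimizer": {"name": torch.optim.Adam, "args": adam_args},
    },
    "discriminator": {
        "name": ToyDiscriminator,
        "args": {"in_dim": 2},
        "optimizer": {"name": torch.optim.Adam, "args": adam_args},
    },
}

torch.manual_seed(0)
trainer = SketchTrainer(models, [generator_loss], [discriminator_loss])
real_batch = torch.randn(32, 2) + 3.0  # stand-in for a batch of real data
d_loss, g_loss = trainer.train_step(real_batch)
```

Because the models, optimizers, and losses are passed in as data, swapping in a different architecture or adding a second loss term changes only the specification, not the training loop. This is the property the trainer design relies on.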

Comparison with Existing Frameworks
TorchGAN provides high-quality implementations of various GAN models, metrics for evaluating GANs, and various approaches for improving the stability of GAN training. We provide an overview of the features that are provided off the shelf by TorchGAN and compare them with the ones provided by other frameworks. Note that the list is not exhaustive as the modular and extensible structure of TorchGAN allows one to extend or modify these features, or use them as building blocks for more sophisticated models.

Development
The

Conclusion and Future Work
We present the features of the TorchGAN framework and demonstrate its extensibility, ease of use, and efficiency. Future work and extensions under active development include: integration of GAN models for video generation; generalization of the training loop to support Inference GAN models, such that they can be conveniently modified and extended; addition of features such as Adaptive Instance Normalization layers; and expansion of the model zoo and documentation to cover more sophisticated examples such as multi-agent GAN training. We also envision extending the framework to domains beyond Computer Vision by adding support for NLP and Time Series GAN models.