Posts

Dec 9, 2020
Tutorial for Cluster Distributed Training using Slurm+Singularity

This tutorial covers how to set up a cluster of GPU instances on AWS and use Slurm to train neural networks with distributed data paralleli...

Jan 22, 2020
Importance-Aware Learning for Neural Headline Editing

Many social media news writers are not professionally trained. Therefore, social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We aim to automate the headline editing process to ...

Jun 23, 2019
Notes on CVPR 2019

These notes collect thoughts and summaries of what was seen and heard at CVPR 2019, mainly covering papers related to NLG and Language+Vision.

May 22, 2019
Explore Gradient-Checkpointing in PyTorch

This is a practical analysis of how Gradient-Checkpointing is implemented in PyTorch, and how to use it in Transformer models like BERT and GPT2.