Hey! I’m Puyuan Peng. I’m a first-year PhD student in computer science at UT Austin. I’m very fortunate to have David Harwath as my advisor, and I’m with the Speech, Audio, and Language Technologies (SALT) Lab. Before coming to Austin, I did my master’s in statistics at the University of Chicago, where I spent a wonderful summer working with Karen Livescu and Herman Kamper. I did my undergrad in Math and Applied Math at Beijing Normal University.

In my free time, I like to work out and sing.

contact: pyp [at] utexas [dot] edu

Talks

Visually Grounded Speech Processing and Understanding

May 2022 at Developmental Intelligence Laboratory, Department of Psychology, UT Austin, USA
Jan 2022 at Karen Livescu Group, Toyota Technological Institute at Chicago, USA
Jan 2022 at Cognitive Machine Learning Group, Département d’Études Cognitives, École Normale Supérieure, France

Papers

Word Discovery in Visually Grounded, Self-Supervised Speech Models
Puyuan Peng, David Harwath
Interspeech, 2022
pdf code

MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
Alan Baade, Puyuan Peng, David Harwath
Interspeech, 2022
pdf code

Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng, David Harwath
The 2nd Workshop on Self-supervised Learning for Audio and Speech Processing at AAAI, 2022
pdf code

Fast-Slow Transformer for Visually Grounding Speech
Puyuan Peng, David Harwath
ICASSP, 2022
pdf code

A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings
Puyuan Peng, Herman Kamper, Karen Livescu
The 1st Workshop on Self-Supervised Learning for Speech and Audio Processing at NeurIPS, 2020
pdf