Hexiang (Frank) Hu
Ph.D. Student @ USC
In Deep.
I am passionate about Machine Learning, Computer Vision, and Natural Language Processing. I aim to combine the power of vision and language.


Hexiang Hu is a Computer Science Ph.D. student in the Viterbi School of Engineering at the University of Southern California (USC), working with Prof. Fei Sha. Prior to this, he was a Ph.D. student in the Henry Samueli School of Engineering and Applied Science at the University of California, Los Angeles (UCLA). He earned his Bachelor’s degrees in Computer Science from Zhejiang University and Simon Fraser University with honors. His research interests lie in Machine Learning, Computer Vision, and Natural Language Processing.


2017 -
PhD student @ USC
Large Scale Machine Learning, Vision and Language
Supervisor: Prof. Fei Sha
Summer 2017
Applied Scientist Intern @ AWS DL
Large Scale Machine Learning
2016 - 2017
PhD student @ UCLA
Object Detection, Semantic Segmentation
Supervisor: Prof. Fei Sha
Summer 2016
Research Intern @ Megvii Inc. (Face++)
Instance-level Segmentation
Jun. 2015 - Apr. 2016
Research Assistant @ VML
Multi-label Classification and Group Activity Recognition
Supervisor: Prof. Greg Mori


Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets

We show that the design of the decoy answers has a significant impact on how and what learning models learn from the datasets. In particular, the resulting learner can ignore the visual information, the question, or both while still doing well on the task. Inspired by this, we propose automatic procedures to remedy such design deficiencies.

ArXiv 2017 (Tech Report)
LabelBank: Revisiting Global Perspectives for Semantic Segmentation

We show the ability of our framework to improve semantic segmentation performance in a variety of settings. We learn models for extracting a holistic LabelBank from visual cues, attributes, and/or textual descriptions. We demonstrate improvements in semantic segmentation accuracy on standard datasets across a range of state-of-the-art segmentation architectures and holistic inference approaches.

ArXiv 2017 (Tech Report)
FastMask: Segment Multi-scale Object Candidates in One Shot

We present a novel segment proposal framework, namely FastMask, which takes advantage of the hierarchical structure in deep convolutional neural networks to segment multi-scale objects in one shot. By leveraging a feature pyramid and sliding-window region attention, we make instance proposal not only fast but also more accurate.

CVPR 2017 (Spotlight) in Honolulu, Hawaii
Learning Structured Inference Neural Networks with Label Relations

We propose a generic structured model that leverages diverse label relations to improve image classification performance. It employs a novel stacked label prediction neural network, capturing both inter-level and intra-level label semantics. The design of this framework naturally extends to leverage partial observations in the label space to infer the rest of the label space.

CVPR 2016 in Las Vegas, Nevada
Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition

We propose a method to integrate graphical models and deep neural networks into a joint framework with a sequential prediction approximation, modeled by a recurrent neural network. This framework simultaneously predicts the underlying structure of interactions between people and infers the corresponding labels for individuals and the group.

CVPR 2016 in Las Vegas, Nevada