Haw-Shiuan Chang

Natural Language Processing

Multi-facet Embeddings for Controlling the Topics of Language Generation

We design a framework that displays multiple candidate upcoming topics, of which a user can select a subset to guide the generation. Our framework consists of two components: (1) a method that produces a set of candidate topics by predicting the centers of word clusters in the possible continuations, and (2) a text generation model whose output adheres to the chosen topics. The training of both components is self-supervised, using only unlabeled text. Our experiments demonstrate that our topic options are better than those of standard clustering approaches, and our framework often generates fluent sentences related to the chosen topics, as judged by automated metrics and crowdsourced workers.

(Paper, Code, Talk, Slides, Poster).

Multi-facet Embeddings for Relation Extraction

We propose a multi-facet universal schema that uses a neural model to represent each sentence pattern as multiple facet embeddings and encourages one of these facet embeddings to be close to that of another sentence pattern if the two patterns co-occur with the same entity pair. In our experiments, multi-facet embeddings significantly outperform their single-facet counterpart, compositional universal schema (Verga et al., 2016), in distantly supervised relation extraction tasks. Moreover, the multiple embeddings can also be used to detect the entailment relation between two sentence patterns when no manual label is available.
(Paper, Code, Talk, Slides, Poster).
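
The idea that only one facet of a pattern needs to match another pattern can be sketched as a minimum over facet pairs. This is a minimal illustration, not the model from the paper: the function name `min_facet_distance`, the squared Euclidean distance, and the toy vectors are all assumptions.

```python
import numpy as np

def min_facet_distance(facets_a, facets_b):
    """Smallest squared Euclidean distance between any facet of A and any facet of B.

    facets_a: array of shape (num_facets_a, dim)
    facets_b: array of shape (num_facets_b, dim)
    (Hypothetical helper; the distance choice is an assumption.)
    """
    # Pairwise differences, shape (num_facets_a, num_facets_b, dim).
    diffs = facets_a[:, None, :] - facets_b[None, :, :]
    squared = (diffs ** 2).sum(axis=-1)
    # Only the closest facet pair matters, so other facets stay free
    # to represent other meanings of the pattern.
    return float(squared.min())

# Toy example: the two patterns share one facet and differ on the other.
pattern_a = np.array([[0.0, 0.0], [1.0, 1.0]])
pattern_b = np.array([[1.0, 1.0], [5.0, 5.0]])
shared_distance = min_facet_distance(pattern_a, pattern_b)
```

Taking the minimum rather than the mean means that a pattern with several senses is not pulled toward an average of unrelated patterns.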

Multi-facet Embeddings for Sentence Representation

We propose a novel embedding method for a text sequence (e.g., a sentence) in which each sequence is represented by a distinct set of multi-mode codebook embeddings that capture different semantic facets of its meaning. The codebook embeddings can be viewed as cluster centers that summarize the distribution of possibly co-occurring words in a pre-trained word embedding space. Our experiments show that the per-sentence codebook embeddings significantly improve performance on unsupervised sentence similarity and extractive summarization benchmarks.

(Paper, Slides, Poster).
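
To make the "cluster centers" view concrete, the sketch below runs plain k-means (Lloyd's algorithm) on toy 2-D vectors standing in for word embeddings. It is not the proposed embedding method; the function name `kmeans_centers`, the toy data, and the fixed initial centers are assumptions for illustration.

```python
import numpy as np

def kmeans_centers(points, init_centers, n_iters=10):
    """Plain Lloyd's k-means; returns (centers, assignment of each point)."""
    centers = np.asarray(init_centers, dtype=float).copy()

    def assign(current_centers):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - current_centers[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers, assign(centers)

# Toy "word vectors": two well-separated groups of co-occurring words.
word_vectors = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
centers, labels = kmeans_centers(word_vectors, init_centers=[[0.0, 0.0], [10.0, 10.0]])
```

Each returned center plays the role of one codebook embedding: a single vector that summarizes one group of related words.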

Overcoming Practical Issues of Deep Active Learning

Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to noise in labeling, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework that estimates the error decay curves of multiple feature-defined subsets of the data.

We perform extensive experiments on four named entity recognition (NER) tasks and results show that our methods greatly alleviate these limitations without sacrificing too much sampling efficiency.
(Paper, Slides, Talk).

Distributional Inclusion Vector Embedding

We propose a novel word embedding method that preserves the distributional inclusion property of the sparse bag-of-words (SBOW) feature. The embedding can be used to predict the generality of words, detect the hypernym relation, and discover topics from raw text simultaneously. Extensive experiments show that the embedding effectively compresses the SBOW and achieves new state-of-the-art performance on unsupervised hypernym detection tasks (Paper, Code, Demo, Poster). We also show that DIVE can make word sense induction more efficient (Paper, Slides).
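
The distributional inclusion property says that the contexts of a specific word tend to be included in the contexts of its more general hypernym. A minimal sketch of how such inclusion could be scored on non-negative feature vectors follows. The function name `inclusion_score`, the min-overlap formula, and the toy counts are assumptions for illustration, not necessarily the scoring function used with DIVE.

```python
import numpy as np

def inclusion_score(narrow, broad):
    """Fraction of the narrow word's feature mass that is covered by the broad word.

    Both inputs are non-negative vectors over the same context features.
    Returns a value in [0, 1]; 1.0 means the narrow word's contexts are fully included.
    """
    total = narrow.sum()
    if total == 0:
        return 0.0  # no evidence for the narrow word
    return float(np.minimum(narrow, broad).sum() / total)

# Toy context counts over three features (hypothetical numbers).
dog = np.array([2.0, 1.0, 0.0])
animal = np.array([3.0, 2.0, 4.0])

dog_in_animal = inclusion_score(dog, animal)     # contexts of "dog" covered by "animal"
animal_in_dog = inclusion_score(animal, dog)     # contexts of "animal" covered by "dog"
```

The asymmetry between the two directions is what makes a score like this usable for deciding which word of a pair is the more general one.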

UMASS TAC 2016 system for relation extraction

TAC-KBP is one of the most challenging text-based information retrieval tasks. We integrate research done at UMASS IESL over the past year, including an embedding linker, multilingual Universal Schema, and LSTM sentence embeddings. We perform extensive error analysis and develop some novel techniques (such as using a search engine to reduce noise in training data) to tackle the problems (Paper).

Neural Network

Use Active Learning to Improve SGD

Inspired by active learning, we propose two alternatives to re-weight training samples based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD). Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques (Paper, Poster).
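
As a rough illustration of re-weighting samples by uncertainty inside an SGD step, the sketch below uses logistic regression and weights each sample by p(1 - p), which peaks when the predicted probability is near 0.5. The uncertainty proxy, the smoothing constant `eps`, and the function names are assumptions for illustration and may differ from the estimates proposed in the paper.

```python
import numpy as np

def uncertainty_weights(probs, eps=0.05):
    """Per-sample weights that grow as predictions approach 0.5, normalized to mean 1."""
    raw = probs * (1.0 - probs) + eps  # eps keeps confident samples from vanishing
    return raw / raw.mean()

def weighted_sgd_step(w, X, y, lr=0.1):
    """One gradient step of logistic regression with uncertainty-weighted samples."""
    probs = 1.0 / (1.0 + np.exp(-(X @ w)))
    weights = uncertainty_weights(probs)
    # Weighted gradient of the mean cross-entropy loss.
    grad = X.T @ (weights * (probs - y)) / len(y)
    return w - lr * grad, weights

# Toy mini-batch (hypothetical data).
X_batch = np.array([[1.0, 0.0], [0.0, 1.0]])
y_batch = np.array([1.0, 0.0])
w_next, batch_weights = weighted_sgd_step(np.zeros(2), X_batch, y_batch)
```

Normalizing the weights to mean 1 keeps the effective learning rate comparable to unweighted SGD, so only the relative emphasis between samples changes.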

Education

Student Modeling and Prerequisite Verification in Knowledge Tree

We extract answering logs of the exercises from Junyi Academy (http://www.junyiacademy.org/), an E-learning website similar to Khan Academy.

We use crowdsourcing and machine learning to discover relationships between exercises. Based on these relationships, we will design an adaptive testing mechanism to improve the learning experience on Junyi Academy (Paper, Presentation, Demo, Dataset).

Computer Vision and Multimedia

Active Sampling for estimating QoE model

We use Bayesian learning to model the non-linear relationships between quality of experience (QoE) and multiple factors.

Our experiment shows that active sampling can reduce the number of samples that must be collected from crowdsourcing to build such a model (Paper).

Hierarchical Image Segmentation without Training

We proposed a general framework that applies classifiers of differing complexity to discriminate segments in an image.
Our unsupervised hierarchical segmentation results achieve similar or better performance on several standard benchmarks compared with the current state-of-the-art learning-based methods. The work has been accepted to ACCV 2014 (Paper, Poster).

Decomposition of Multiple Foreground Co-segmentation

We proposed an efficient algorithm which decomposes the unsupervised Multiple Foreground Co-segmentation problem into three sub-problems: segmentation, matching and figure-ground classification.

Our method improves the accuracy of the state-of-the-art method by 13% in a standard benchmark, and has been accepted by CVIU (Paper).

Superpixel-Based Large Displacement Optical Flow

We formulated our objective function at the superpixel level rather than at the pixel level used by traditional optical flow methods.

Our method achieves better large displacement matching capability than LDOF in lower-quality videos, and has been accepted to ICIP 2013 (Paper, Poster).