Time 
Title 
Speakers/Authors 


9:00-9:15am  Session: Opening Remarks 


9:15-10:00am 
Keynote talk 1: What Dense Graph Do You Need for Self-Attention? [slides]
Abstract: Transformers have made progress on miscellaneous tasks, but suffer from quadratic computational and memory complexity. Recent works propose sparse Transformers that restrict attention to sparse graphs to reduce complexity while retaining strong performance. While effective, the crucial question of how dense a graph needs to be to perform well has not been fully explored. In this talk, we introduce Normalized Information Payload (NIP), a graph scoring function measuring information transfer on a graph, which provides an analysis tool for trade-offs between performance and complexity. Guided by this theoretical analysis, we present Hypercube Transformer, a sparse Transformer that models token interactions on a hypercube and achieves comparable or even better results than the vanilla Transformer while yielding O(N log N) complexity for sequence length N. 
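The complexity claim in the abstract follows from the hypercube's degree: with N tokens placed on the vertices of a log2(N)-dimensional hypercube, each token has only log2(N) neighbors (indices differing in exactly one bit), so the attention pattern has N(log2 N + 1) nonzero entries rather than N^2. A toy sketch of such a mask (the function name and NumPy representation are our own illustration, not code from the talk):

```python
import numpy as np

def hypercube_attention_mask(num_tokens):
    """Boolean mask where token i may attend to token j iff their
    indices differ in exactly one bit (hypercube neighbors), plus
    self-attention. Assumes num_tokens is a power of two."""
    d = num_tokens.bit_length() - 1
    assert 1 << d == num_tokens, "num_tokens must be a power of two"
    mask = np.eye(num_tokens, dtype=bool)  # self-connections
    for i in range(num_tokens):
        for b in range(d):
            mask[i, i ^ (1 << b)] = True   # flip bit b -> neighbor
    return mask

mask = hypercube_attention_mask(8)
# Each token attends to itself plus log2(8) = 3 neighbors.
assert mask.sum(axis=1).tolist() == [4] * 8
```

The total number of allowed attention pairs is N(log2 N + 1), hence the stated O(N log N) cost when attention scores are computed only over masked-in pairs.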

10:00-10:45am 
Keynote talk 2: A Closer Look at Structure and Sparsity in Graph-Based Natural Language Understanding.
Abstract: Graph-based approaches have been increasingly utilized for different NLP applications; however, what types of structures should be leveraged, and to what extent these graph structures help, remain challenging questions. In this talk, we take a closer look at graph neural networks via two typical NLP applications: structure-aware conversation summarization and knowledge-graph-enhanced question answering. Concretely, the first part looks at how to utilize graph structures to better encode discourse relations and actions in conversations for improved dialogue summarization, and the second part dissects state-of-the-art graph neural network modules and their reasoning capability for question answering. 

10:45-11:00am  Coffee Break/Social Networking  
11:00-11:45am 
Panel topic: GNNs vs. Pretraining (Transformers): Friends or Enemies?
Panelists: Michael Perlmutter (UCLA), Linfeng Song (Tencent AI), Jian Tang (UMontreal and MILA), Jingbo Shang (UC San Diego), Meng Jiang (Notre Dame) 


11:45am-1:00pm 
Five Contributed Talks (15 minutes each)
(11:45-12:00pm) Talk 1: Yinquan Lu et al., KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs [paper]
(12:00-12:15pm) Talk 2: Bai Xuefeng, Semantic Representation for Dialogue Modeling [paper]
(12:15-12:30pm) Talk 3: Shengyao Lu, R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning [paper]
(12:30-12:45pm) Talk 4: Yuxian Meng et al., GNN-LM: Language Modeling Based on Global Contexts via GNN [paper]
(12:45-1:00pm) Talk 5: Leonardo Ribeiro, Improving Graph-to-Text Generation with Neural Graph Encoders and Pretrained Language Models [paper] 

2:00-2:45pm 
Keynote talk 3: Improving Interpretability and Generalization with Structured Neural Transduction.
Abstract: Sequence and graph prediction problems can generally be handled with 'unstructured' sequence-to-sequence models, often leading to strong performance, especially in i.i.d. settings. In this talk, instead, I will discuss alternatives that approach the transduction process in a 'structured' way, aiming for improved interpretability and out-of-distribution generalization. In the first part, I will discuss how a text-to-graph generation problem can be tackled by inducing graph decomposition and alignments as part of learning. This method yields a neural graph generator that, at inference time, simply tags the input sequence with graph fragments. We will see how this idea is used to produce an accurate and transparent (AMR) semantic parser. In the second part, I will discuss how seq-to-seq problems can be handled by a neural model that treats the 'translation' process as structured permutation and monotonic translation of subsequences. We will see that this structured method leads to improvements in out-of-distribution ("compositional") generalization on semantic parsing and machine translation tasks.


2:45-3:30pm 
Keynote talk 4: Geometric Scattering: Graph Neural Nets that Preserve High-Frequency Information. [slides]
Abstract: Many advances in deep learning exploit the intrinsic structure of the data. For instance, Convolutional Neural Networks leverage the fact that images are a regular grid of pixels, whereas recurrent neural networks exploit the temporal structure of text-based data. Inspired by this success, the new field of geometric deep learning aims to develop deep learning architectures for datasets such as graphs and manifolds with less regular structure.


3:30-3:45pm  Coffee Break/Social Networking  
3:45-4:15pm 
Position Talks
(3:45-4:00pm) P1: Graph4NLP: A Library for Deep Learning on Graphs for NLP, Yu Chen (Meta AI)
(4:00-4:15pm) P2: Efficient and effective training of language and graph neural network models, Vassilis N. Ioannidis (AWS Graph ML) 

4:15-5:45pm 
Six Contributed Talks (15 minutes each)
(4:15-4:30pm) Talk 6: Abhay M Shalghar et al., Document Structure aware Relational Graph Convolutional Networks for Ontology Population [paper]
(4:30-4:45pm) Talk 7: Kunze Wang et al., ME-GCN: Multi-dimensional Edge-Embedded Graph Convolutional Networks for Semi-supervised Text Classification [paper]
(4:45-5:00pm) Talk 8: Yuchen Zeng et al., Combinatorial Scientific Discovery: Finding New Concept Combinations Beyond Link Prediction [paper]
(5:00-5:15pm) Talk 9: Zhibin Chen et al., Entailment Graph Learning with Textual Entailment and Soft Transitivity [paper]
(5:15-5:30pm) Talk 10: Xin Xie et al., From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer [paper]
(5:30-5:45pm) Talk 11: Ran Song et al., Ontology-guided and Text-enhanced Representation for Knowledge Graph Zero-shot Relational Learning [paper] 

5:45-6:00pm  Closing remarks 
