Time 
Title 
Speakers/Authors 


9:00-9:15am  Session: Opening Remarks 


9:15-10:00am 
Keynote talk 1: What Dense Graph Do You Need for Self-Attention? [slides]
Abstract: Transformers have made progress on miscellaneous tasks, but suffer from quadratic computational and memory complexity. Recent works propose sparse Transformers that restrict attention to sparse graphs to reduce complexity while retaining strong performance. While effective, the crucial question of how dense a graph needs to be to perform well has not been fully explored. In this talk, we introduce Normalized Information Payload (NIP), a graph scoring function measuring information transfer on a graph, which provides an analysis tool for trade-offs between performance and complexity. Guided by this theoretical analysis, we present Hypercube Transformer, a sparse Transformer that models token interactions on a hypercube and achieves comparable or even better results than the vanilla Transformer while yielding O(N log N) complexity for sequence length N. 
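The complexity claim in the abstract follows from the hypercube's degree: with N tokens placed on the vertices of a log2(N)-dimensional hypercube, each token has only log2(N) neighbors (indices differing in exactly one bit), so the attention pattern has N(log2 N + 1) nonzero entries rather than N^2. A toy sketch of such a mask (the function name and NumPy representation are our own illustration, not code from the talk):

```python
import numpy as np

def hypercube_attention_mask(num_tokens):
    """Boolean mask where token i may attend to token j iff their
    indices differ in exactly one bit (hypercube neighbors), plus
    self-attention. Assumes num_tokens is a power of two."""
    d = num_tokens.bit_length() - 1
    assert 1 << d == num_tokens, "num_tokens must be a power of two"
    mask = np.eye(num_tokens, dtype=bool)  # self-connections
    for i in range(num_tokens):
        for b in range(d):
            mask[i, i ^ (1 << b)] = True   # flip bit b -> neighbor
    return mask

mask = hypercube_attention_mask(8)
# Each token attends to itself plus log2(8) = 3 neighbors.
assert mask.sum(axis=1).tolist() == [4] * 8
```

The total number of allowed attention pairs is N(log2 N + 1), hence the stated O(N log N) cost when attention scores are computed only over masked-in pairs.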

10:00-10:45am 
Keynote talk 2: A Closer Look at Structure and Sparsity in Graph-Based Natural Language Understanding.
Abstract: Graph-based approaches have been increasingly utilized for different NLP applications; however, what types of structures should be leveraged, and to what extent these graph structures help, remain challenging questions. In this talk, we take a closer look at graph neural networks via two typical NLP applications: structure-aware conversation summarization and knowledge-graph-enhanced question answering. Concretely, the first part looks at how to utilize graph structures to better encode discourse relations and actions in conversations for improved dialogue summarization, and the second part dissects state-of-the-art graph neural network modules and their reasoning capability for question answering. 

10:45-11:00am  Coffee Break/Social Networking  
11:00-11:45am 
Panel topic: GNNs vs. Pretraining (Transformers): Friends or Enemies?
Panelists: Michael Perlmutter (UCLA), Linfeng Song (Tencent AI), Jian Tang (UMontreal and MILA), Jingbo Shang (UC San Diego), Meng Jiang (Notre Dame) 


11:45am-1:00pm 
Five Contributed Talks (15 minutes each)
(11:45-12:00pm) Talk 1: Yinquan Lu et al., KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs [paper]
(12:00-12:15pm) Talk 2: Bai Xuefeng, Semantic Representation for Dialogue Modeling [paper]
(12:15-12:30pm) Talk 3: Shengyao Lu, R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning [paper]
(12:30-12:45pm) Talk 4: Yuxian Meng et al., GNN-LM: Language Modeling Based on Global Contexts via GNN [paper]
(12:45-1:00pm) Talk 5: Leonardo Ribeiro, Improving Graph-to-Text Generation with Neural Graph Encoders and Pretrained Language Models [paper] 

2:00-2:45pm 
Keynote talk 3: Improving Interpretability and Generalization with Structured Neural Transduction.
Abstract: Sequence and graph prediction problems can generally be handled with 'unstructured' sequence-to-sequence models, often leading to strong performance, especially in i.i.d. settings. In this talk, instead, I will discuss alternatives that approach the transduction process in a 'structured' way, aiming for improved interpretability and out-of-distribution generalization. In the first part, I will discuss how a text-to-graph generation problem can be tackled by inducing graph decomposition and alignments as part of learning. This method yields a neural graph generator that, at inference time, simply tags the input sequence with graph fragments. We will see how this idea is used to produce an accurate and transparent (AMR) semantic parser. In the second part, I will discuss how seq-to-seq problems can be handled by a neural model that treats the 'translation' process as structured permutation and monotonic translation of subsequences. We will see that this structured method leads to improvements in out-of-distribution ("compositional") generalization on semantic parsing and machine translation tasks.


2:45-3:30pm 
Keynote talk 4: Geometric Scattering: Graph Neural Nets that Preserve High-Frequency Information. [slides]
Abstract: Many advances in deep learning exploit the intrinsic structure of the data. For instance, Convolutional Neural Networks leverage the fact that images are a regular grid of pixels, whereas recurrent neural networks exploit the temporal structure of text-based data. Inspired by this success, the new field of geometric deep learning aims to develop deep learning architectures for datasets such as graphs and manifolds with less regular structure.


3:30-3:45pm  Coffee Break/Social Networking  
3:45-4:15pm 
Position Talks
(3:45-4:00pm) P1: Graph4NLP: A Library for Deep Learning on Graphs for NLP, Yu Chen (Meta AI)
(4:00-4:15pm) P2: Efficient and effective training of language and graph neural network models, Vassilis N. Ioannidis (AWS Graph ML) 

4:15-5:45pm 
Six Contributed Talks (15 minutes each)
(4:15-4:30pm) Talk 6: Abhay M Shalghar et al., Document Structure aware Relational Graph Convolutional Networks for Ontology Population [paper]
(4:30-4:45pm) Talk 7: Kunze Wang et al., ME-GCN: Multi-dimensional Edge-Embedded Graph Convolutional Networks for Semi-supervised Text Classification [paper]
(4:45-5:00pm) Talk 8: Yuchen Zeng et al., Combinatorial Scientific Discovery: Finding New Concept Combinations Beyond Link Prediction [paper]
(5:00-5:15pm) Talk 9: Zhibin Chen et al., Entailment Graph Learning with Textual Entailment and Soft Transitivity [paper]
(5:15-5:30pm) Talk 10: Xin Xie et al., From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer [paper]
(5:30-5:45pm) Talk 11: Ran Song et al., Ontology-guided and Text-enhanced Representation for Knowledge Graph Zero-shot Relational Learning [paper] 

5:45-6:00pm  Closing remarks 
