GEN: A Practical Alternative to Graph Transformers for Long-Range Graph Modeling
Shuo Wang, Ge Cheng, Yun Zhang
Published: January 2, 2024
Abstract
Message Passing Neural Networks (MPNNs) model local relations effectively but struggle to propagate information over long distances. Graph Transformers (GTs) mitigate this via global self-attention, yet their quadratic cost in the number of nodes limits scalability. We propose Graph Elimination Networks (GENs), an MPNN variant that approximates GT-like long-range modeling while maintaining high efficiency. GENs combine edge-wise and hop-wise self-attention in parallel; their multiplicative composition yields an attention kernel that is separable into edge and hop factors within a bounded K-hop receptive field. To enable hop-wise attention, we introduce the Graph Elimination Algorithm (GEA), which prevents double-counting across hops so that each propagation round contributes the new k-hop information exactly once; differencing successive rounds then recovers the k-hop increment and yields disentangled multi-hop features as inputs to hop-wise attention. This preserves clearer structural distinctions across hop distances and enables more faithful modeling of pairwise dependencies between distant nodes within the K-hop neighborhood. On the Long-Range Graph Benchmark (LRGB), GENs outperform strong MPNN baselines by 7.7 and 6.0 percentage points (pp) on PascalVOC-SP and COCO-SP, respectively, and perform on par with or better than state-of-the-art Graph Transformers. On OGBN-Products, GENs support full-batch training and inference, whereas sparse-attention baselines such as Exphormer run into memory limits under comparable budgets, highlighting GENs as a practical alternative for large, sparse graphs.
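The mechanism described above, recovering per-hop increments by differencing successive propagation rounds and then attending over them hop-wise, can be sketched as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: the mean-aggregation `propagate` step, the function names, and the dot-product hop attention are assumptions, and the actual GEA additionally eliminates double-counted (backtracking) contributions before differencing.

```python
# Sketch only: per-hop increment extraction + hop-wise attention, assuming
# a plain mean-aggregation propagate step (the paper's GEA is more involved).
import torch

def propagate(adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    # One round of mean-aggregation message passing (stand-in for the
    # elimination-corrected propagation used by GENs).
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    return adj @ h / deg

def hop_increments(adj: torch.Tensor, x: torch.Tensor, K: int) -> torch.Tensor:
    # Collect Delta^(k) = H^(k) - H^(k-1) for k = 1..K: the "disentangled"
    # multi-hop features described in the abstract. Shape: [K, N, d].
    feats, h_prev = [], x
    for _ in range(K):
        h_next = propagate(adj, h_prev)
        feats.append(h_next - h_prev)
        h_prev = h_next
    return torch.stack(feats, dim=0)

def hop_attention(x: torch.Tensor, deltas: torch.Tensor) -> torch.Tensor:
    # Hop-wise self-attention: each node attends over its own K hop
    # increments, weighting hop distances by similarity to its current state.
    scores = torch.einsum('nd,knd->nk', x, deltas) / x.shape[-1] ** 0.5
    alpha = torch.softmax(scores, dim=-1)          # [N, K]
    return torch.einsum('nk,knd->nd', alpha, deltas)

# Toy usage: 5-node path graph, 8-dim features, 3-hop receptive field.
N, d, K = 5, 8, 3
adj = torch.zeros(N, N)
for i in range(N - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
x = torch.randn(N, d)
out = hop_attention(x, hop_increments(adj, x, K))
print(out.shape)  # torch.Size([5, 8])
```

In the full model, per the abstract, the uniform aggregation inside each round would be replaced by edge-wise attention, and multiplying edge-wise and hop-wise weights yields the attention kernel that is separable into edge and hop factors within the bounded K-hop receptive field.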