
Andrej Risteski

@andrejristeski.bsky.social

Machine learning researcher. Professor in ML department at CMU.

1,054 Followers  |  306 Following  |  13 Posts  |  Joined: 18.11.2024

Latest posts by andrejristeski.bsky.social on Bluesky


If you're coming to COLT: @khodakmoments.bsky.social, @tm157.bsky.social, Nick Boffi, Jianfeng Lu, and I are organizing a workshop on the Theory of AI for Scientific Computing, to be held on the first day of the conference (Monday, June 30). Schedule here: tasc-workshop.github.io/#schedule

29.06.2025 01:17 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Finally, this lens doesn't consider computation per layer. A typical GNN layer (e.g. a GCN layer) for the edge-based architecture would be more expensive than for the node-based architecture. It would be nice if "rewiring" ideas in the literature could achieve the best of both worlds.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
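To make the cost point concrete, here is a back-of-the-envelope message count under a toy cost model (one unit per message, and an edge-based layer in which every edge exchanges messages with all edges sharing an endpoint; both are assumptions for illustration, not the paper's accounting):

```python
import numpy as np

def messages_per_layer(edges, n):
    """Rough per-layer message counts under a toy cost model (assumed here for
    illustration): node-based layers send one message per directed edge, while
    the edge-based layer has every edge talk to all edges sharing an endpoint,
    i.e. roughly sum_v deg(v)^2 messages."""
    deg = np.zeros(n, dtype=int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    node_based = 2 * len(edges)          # O(|E|)
    edge_based = int((deg ** 2).sum())   # O(sum_v deg(v)^2), much worse on hub-heavy graphs
    return node_based, edge_based

# On a star graph with one hub, the gap is quadratic in the hub's degree.
star = [(0, i) for i in range(1, 101)]
print(messages_per_layer(star, 101))     # -> (200, 10100)
```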

This echoes a message from recent position papers (e.g. arxiv.org/pdf/2502.14546) that we need better benchmarks for graph-based tasks to "see" fine-grained differences between improved architectures/methods/etc. (This argument has been made more broadly for benchmarks in SciML)

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Empirically, on "standard" benchmarks edge embeddings provide only modest benefit---but it's easy to construct (synthetic, diagnostic, but natural) datasets which have a "hub" structure on which edge embeddings perform dramatically better.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
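For intuition, an illustrative "hub" construction in this spirit (hypothetical, not the exact dataset from the paper): a star graph where each leaf must output the key of a designated partner leaf, so every pairing has to be routed through the single hub.

```python
import numpy as np

def hub_routing_task(num_leaves=50, seed=0):
    """Illustrative 'hub' task (hypothetical construction, not the paper's
    benchmark): node 0 is a hub connected to all leaves; each leaf carries a
    key and the id of a partner leaf, and its label is the partner's key.
    Any node-based protocol must funnel all pairings through the hub."""
    rng = np.random.default_rng(seed)
    n = num_leaves + 1                               # node 0 is the hub
    edges = [(0, i) for i in range(1, n)]
    keys = rng.integers(0, 1000, size=n)
    partner = rng.permutation(np.arange(1, n))       # partner[i-1] = partner of leaf i
    feats = np.stack([keys, np.concatenate(([0], partner))], axis=1)
    labels = np.concatenate(([0], keys[partner]))    # leaf i must output keys[partner[i-1]]
    return edges, feats, labels
```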

The proof techniques are reminiscent of (and inspired by) time–space trade-offs in computational complexity, which measure the "information flow" between the variables of a function. A nice survey, if interested: cs.brown.edu/people/jsava...

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Preview
Oversmoothing, Oversquashing, Heterophily, Long-Range, and more: Demystifying Common Beliefs in Graph Machine Learning
After a renaissance phase in which researchers revisited the message-passing paradigm through the lens of deep learning, the graph machine learning community shifted its attention towards a deeper and...

In some ways, this supports remarks in a recent position paper (arxiv.org/abs/2505.15547) that phenomena like oversquashing conflate topological and computational bottlenecks---and in fact, that their interaction matters a lot.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

The separation manifests on graphs with topological bottlenecks (i.e. hub nodes), when solving the task requires routing information through a narrow cut. Either the hub nodes need a lot of memory, or the protocol has to run for many rounds.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

We show that there are natural functions computable by edge architectures with constant depth and memory. However, in any node-based architecture, depth × memory has to be Ω(√n), where n = # nodes of the graph. Moreover, without the memory constraint there is *no* separation.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

We examine a modest architectural change: edges maintain their own state (common GNNs maintain node states). This change was introduced in the literature mostly to naturally handle edge-centric tasks & edge features---but we show it has representational benefits too.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
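To make the contrast concrete, a minimal numpy sketch of one node-based layer vs. one edge-based layer; the edge update here (each edge aggregating the states of edges that share an endpoint with it) is an assumption for illustration, not the exact architecture from the paper.

```python
import numpy as np

def node_layer(A, H, W):
    """One node-based message-passing layer (GCN-style, normalization omitted):
    each node aggregates its neighbors' states."""
    # A: (n, n) adjacency, H: (n, d) node states, W: (d, d') weights
    return np.tanh(A @ H @ W)

def edge_layer(edges, S, W, n):
    """One edge-based layer (toy variant for illustration): each edge
    aggregates the states of edges sharing an endpoint with it."""
    # edges: list of (u, v) pairs, S: (m, d) edge states, n: number of nodes
    incident = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        incident[u].append(idx)
        incident[v].append(idx)
    agg = np.zeros_like(S)
    for idx, (u, v) in enumerate(edges):
        for j in set(incident[u]) | set(incident[v]):
            if j != idx:
                agg[idx] += S[j]
    return np.tanh(agg @ W)
```

With edge states, an edge-level prediction can be read directly off the edge's own state, rather than assembled from the states of its two endpoints.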
Preview
What graph neural networks cannot learn: depth vs width
This paper studies the expressive power of graph neural networks falling within the message-passing framework (GNNmp). Two results are presented. First, GNNmp are shown to be Turing universal under su...

We introduce an additional resource budget---memory per processor (viewing GNNs as a message-passing protocol)---which roughly models the dimensionality of the hidden representation (at finite precision), similar to the viewpoint in arxiv.org/abs/1907.03199

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
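Roughly, the accounting behind "memory ≈ hidden dimensionality at finite precision" (a sketch under that assumption; the paper and the linked reference have the formal model):

```python
def memory_per_processor(hidden_dim, precision_bits=16):
    """Back-of-the-envelope accounting (an illustrative assumption, not the
    formal definition): each node/processor stores its hidden state at finite
    precision, so its memory per round is about hidden_dim * precision_bits."""
    return hidden_dim * precision_bits

print(memory_per_processor(64))  # -> 1024 bits of state per processor per round
```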
Preview
Theory of Graph Neural Networks: Representation and Learning
Graph Neural Networks (GNNs), neural network architectures targeted to learning representations of graphs, have become a popular learning model for prediction tasks on nodes, graphs and configurations...

The Weisfeiler-Lehman hierarchy lens classifies GNNs according to their distinguishing power w.r.t. the graph isomorphism problem (see e.g. arxiv.org/abs/2204.07697). While powerful, this lens ignores other computational resources that can influence expressiveness.

24.06.2025 15:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
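For reference, the test at the base of that hierarchy: 1-WL color refinement, in a standard textbook sketch (not tied to the paper).

```python
def wl_colors(adj, rounds=3):
    """Standard 1-WL color refinement: repeatedly refine node colors using each
    node's color together with the sorted multiset of its neighbors' colors.
    Different color histograms certify non-isomorphism; equal histograms are
    inconclusive."""
    colors = {v: 0 for v in adj}                         # start from a uniform coloring
    for _ in range(rounds):
        signatures = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                      for v in adj}
        relabel = {}
        for v in sorted(adj, key=lambda v: repr(signatures[v])):
            relabel.setdefault(signatures[v], len(relabel))
        colors = {v: relabel[signatures[v]] for v in adj}
    return colors

# 1-WL cannot tell a 6-cycle from two disjoint triangles (both are 2-regular),
# even though the graphs are not isomorphic.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(sorted(wl_colors(cycle6).values()) == sorted(wl_colors(two_triangles).values()))  # True
```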
Preview
Towards characterizing the value of edge embeddings in Graph Neural Networks
Graph neural networks (GNNs) are the dominant approach to solving machine learning problems defined over graphs. Despite much theoretical and empirical work in recent years, our understanding of finer...

In recent work (arxiv.org/abs/2410.09867) with D. Rohatgi, @tm157.bsky.social, @zacharylipton.bsky.social, J. Lu, and A. Moitra, we revisit fine-grained expressiveness in GNNs---beyond the usual symmetry (Weisfeiler-Lehman) lens. The paper will appear at ICML '25; thread below.

24.06.2025 15:54 β€” πŸ‘ 4    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Congratulations!

09.12.2024 22:43 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
