Baran Hashemi

@rythian47.bsky.social

AI for Mathematics

46 Followers  |  103 Following  |  24 Posts  |  Joined: 01.12.2024  |  2.1014

Latest posts by rythian47.bsky.social on Bluesky

Tnx Kyle 🤜

19.09.2025 05:41 - 👍 1    🔁 0    💬 0    📌 0

We got accepted at #NeurIPS2025. I am very happy that I could merge my knowledge of mathematics with AI to create something new and useful for the community. ☺️

The paper: arxiv.org/abs/2505.17190
The code: github.com/Baran-phys/T...

19.09.2025 04:21 - 👍 20    🔁 7    💬 1    📌 1

Current AI research vibes:
- Let’s use an LLM to do baby science/math; when it doesn’t work, headline: LLM is bad at the baby math task --> guaranteed virality 😒
- Meanwhile, you develop a novel (non-LLM) method to solve this issue and report success on a deep math problem --> naa, not enough drama 🤦🏻

08.09.2025 09:16 - 👍 0    🔁 0    💬 0    📌 0

Another new result from the #NeurIPS rebuttal/discussion phase: our Tropical Transformer achieves much better length OOD performance across all algorithmic tasks, while being 3x-9x faster at inference and using 20% fewer parameters than the Universal Transformer (UT) models.

04.08.2025 20:47 - 👍 3    🔁 0    💬 0    📌 0
Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
Dynamic programming (DP) algorithms for combinatorial optimization problems work with taking maximization, minimization, and classical addition in their recursion algorithms. The associated value functions correspond to convex polyhedra in the max plus semiring. Existing Neural Algorithmic Reasoning models, however, rely on softmax-normalized dot-product attention where the smooth exponential weighting blurs these sharp polyhedral structures and collapses when evaluated on out-of-distribution (OOD) settings. We introduce Tropical attention, a novel attention function that operates natively in the max-plus semiring of tropical geometry. We prove that Tropical attention can approximate tropical circuits of DP-type combinatorial algorithms. We then propose that using Tropical transformers enhances empirical OOD performance in both length generalization and value generalization, on algorithmic reasoning tasks, surpassing softmax baselines while remaining stable under adversarial attacks. We also present adversarial-attack generalization as a third axis for Neural Algorithmic Reasoning benchmarking. Our results demonstrate that Tropical attention restores the sharp, scale-invariant reasoning absent from softmax.

During the #NeurIPS rebuttal, we evaluated the 🌴 Tropical Transformer on the Long Range Arena (LRA), achieving highly competitive results and placing 2nd 🥈 overall in average accuracy.
Check out our paper: arxiv.org/abs/2505.17190
Our code: github.com/Baran-phys/T...

01.08.2025 20:05 - 👍 3    🔁 0    💬 1    📌 1

Cool. Will definitely do 👍

27.05.2025 05:18 - 👍 1    🔁 0    💬 0    📌 0

Interesting. I was not aware of the challenges in the video subfield, but that makes sense given the context. We will definitely explore those benchmarks in the future. Thanks for the suggestions.

27.05.2025 05:10 - 👍 0    🔁 0    💬 1    📌 0

Tnx. We have not yet tested on any other benchmarks. Do you mean algorithmic or language-type benchmarks?

27.05.2025 04:57 - 👍 0    🔁 0    💬 1    📌 0

Interesting. I was not aware of this study. However, we did not just use tropical operations; we tried to simulate a concrete tropical circuit and do the message passing in the tropical space, with the Generalized Hilbert metric as the kernel.

27.05.2025 04:54 - 👍 0    🔁 0    💬 0    📌 0
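To make the reply above concrete, here is a rough sketch of what message passing with a Hilbert-type kernel could look like. It assumes the standard Hilbert projective metric on tropical space, d(x, y) = max_i(x_i - y_i) - min_i(x_i - y_i); the paper's "Generalized" variant may differ. The names `hilbert_dist` and `tropical_attend` are hypothetical and are not taken from the paper's code.

```python
def hilbert_dist(x, y):
    """Hilbert projective metric (assumed form): max_i(x_i - y_i) - min_i(x_i - y_i)."""
    diffs = [a - b for a, b in zip(x, y)]
    return max(diffs) - min(diffs)


def tropical_attend(query, keys, values):
    """Hypothetical attention step: score each key by negative Hilbert distance,
    then aggregate values in max-plus: out_d = max_j (score_j + values[j][d])."""
    scores = [-hilbert_dist(query, k) for k in keys]
    dim = len(values[0])
    return [max(s + v[d] for s, v in zip(scores, values)) for d in range(dim)]


# Illustrative data only.
query = [0, 0]
keys = [[5, 5], [0, 2]]      # the first key is a tropical rescaling of the query
values = [[1, 0], [2, 1]]
out = tropical_attend(query, keys, values)
```

Under this assumed metric the distance is unchanged when a constant is added to every coordinate of either argument, so the first key above sits at distance 0 from the query.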

7/ Our message ✍️
Better reasoning might come not from bigger models, but from choosing the right algebra/geometry 🌴.
@petar-v.bsky.social @jalonso.bsky.social
#TropicalGeometry #NeuralAlgorithmicReasoning #AI4Math

26.05.2025 13:08 - 👍 3    🔁 0    💬 2    📌 0

6/ We also show that each Tropical attention head can function as a tropical gate in a tropical circuit, simulating any max-plus circuit.

26.05.2025 13:08 - 👍 1    🔁 0    💬 1    📌 0
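For readers unfamiliar with the term in 6/, a max-plus (tropical) circuit is a circuit whose gates are only `max` and `+`. The sketch below evaluates one; the gate encoding and the name `eval_circuit` are assumptions made for illustration, and it says nothing about how an attention head realizes a gate.

```python
def eval_circuit(gates, inputs):
    """Evaluate a max-plus circuit given as a topologically ordered gate list.

    Each gate is (op, i, j) with op in {"max", "+"}; i and j index earlier
    wires (inputs first, then gate outputs in order). Returns the last wire.
    """
    wires = list(inputs)
    for op, i, j in gates:
        if op == "max":
            wires.append(max(wires[i], wires[j]))
        elif op == "+":
            wires.append(wires[i] + wires[j])
        else:
            raise ValueError(f"unknown gate: {op}")
    return wires[-1]


# Circuit computing max(x0 + x1, x2): wire 3 = x0 + x1, wire 4 = max(wire 3, x2).
gates = [("+", 0, 1), ("max", 3, 2)]
low = eval_circuit(gates, [1.0, 2.0, 5.0])
high = eval_circuit(gates, [4.0, 2.0, 5.0])
```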

5/ We benchmarked on 11 canonical combinatorial tasks. Tropical attention beat vanilla & adaptive softmax attention on all three OOD axes: length, value, and adversarial-attack generalization.

26.05.2025 13:08 - 👍 1    🔁 0    💬 1    📌 0

4/ Tropical Attention runs each head natively in max-plus. Result:
Strong OOD length generalization with sharp attention maps across several algorithmic tasks, including the notorious Quickselect algorithm (another settlement for the challenge identified by @mgalkin.bsky.social).

26.05.2025 13:08 - 👍 1    🔁 0    💬 1    📌 0
Image by Cowdery and Challas, featured in June 2009 Mathematics Magazine


3/ In Tropical (max-plus) geometry, “addition” is max and “multiplication” is +. Many algorithms already live here, carving exact polyhedral decision boundaries --> so why force them through exponential probabilities?
Let's ditch softmax, embrace the tropical semiring 🤯🍹.

26.05.2025 13:08 - 👍 2    🔁 0    💬 1    📌 0
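The two operations named in 3/ can be written down directly. This is a minimal sketch in plain Python; the helper names (`t_add`, `t_mul`, `t_matvec`) and the sample matrix are illustrative assumptions, not part of the paper's code.

```python
NEG_INF = float("-inf")  # tropical "zero": the identity element for max


def t_add(a, b):
    """Tropical addition is max."""
    return max(a, b)


def t_mul(a, b):
    """Tropical multiplication is ordinary +."""
    return a + b


def t_matvec(A, x):
    """Max-plus matrix-vector product: y_i = max_j (A[i][j] + x[j])."""
    return [max(t_mul(a_ij, x_j) for a_ij, x_j in zip(row, x)) for row in A]


A = [[0.0, 3.0], [NEG_INF, 1.0]]
x = [2.0, 5.0]
y = t_matvec(A, x)  # y[0] = max(0+2, 3+5), y[1] = max(-inf+2, 1+5)
```

Each output coordinate is a maximum of affine functions of `x`, so it is piecewise linear in the inputs, which is where the sharp polyhedral boundaries come from.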
Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
Dynamic programming (DP) algorithms for combinatorial optimization problems work with taking maximization, minimization, and classical addition in their recursion algorithms. The associated value func...

2/ We introduce Tropical Attention -- the first Neural Algorithmic reasoner that operates in the Tropical semiring, achieving SOTA OOD performance on executing several combinatorial algorithms
arxiv.org/abs/2505.17190

26.05.2025 13:08 - 👍 1    🔁 0    💬 1    📌 0
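As a small illustration of the abstract's claim that DP recursions use only max and ordinary addition, here is a textbook 0/1 knapsack. It is chosen purely as a familiar example and is not presented as one of the paper's benchmark implementations; the item data is made up.

```python
def knapsack(items, capacity):
    """0/1 knapsack. The update uses only max (tropical addition) and
    + (tropical multiplication): best[c] = max(best[c], best[c - w] + v)."""
    best = [0] * (capacity + 1)
    for weight, value in items:
        # Walk capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


items = [(1, 1), (3, 4), (4, 5), (5, 7)]  # (weight, value), illustrative data
best_value = knapsack(items, 7)           # picks weights 3 and 4 for value 9
```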

🧵 Tropical Attention --> Softmax is out, Tropical max-plus is in 🦾
1/ 🔥 Ever experienced softmax attention fade as sequences grow?
That blur is why many attention mechanisms stumble on algorithmic and reasoning tasks. Well, we have an algebraic-geometric Tropical solution 🌴

26.05.2025 13:08 - 👍 10    🔁 4    💬 1    📌 1
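The "fade" in 1/ can be seen in a toy setting. The sketch below assumes one key whose score exceeds all others by a fixed gap; the function names and the gap value are illustrative choices, not measurements from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def peak_weight(n, gap=2.0):
    """Softmax weight on the single best key among n keys, where the best
    score exceeds every other score by `gap`."""
    scores = [gap] + [0.0] * (n - 1)
    return softmax(scores)[0]


# The best key stays the argmax, but its weight shrinks as n grows.
weights = {n: peak_weight(n) for n in (8, 64, 512)}
```

A hard `max` over the same scores would still select the first key at any length, which is the contrast the thread draws.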

I'm speaking about AI for enumerative geometry at the CMSA New Technologies in Mathematics seminar, on Wednesday.

07.04.2025 18:40 - 👍 0    🔁 0    💬 0    📌 0

If you think of DyT as an activation function, it is exactly a sub-family of our learnable Dynamic Range Activator (DRA) when (a,c)=0:

openreview.net/forum?id=4X9...

03.04.2025 14:35 - 👍 0    🔁 0    💬 0    📌 0
Call Paper Submission Entrance
The workshop uses OpenReview as the review platform. For detailed submission guidelines, please see below.

🔥 Big News! The 2nd AI for Math Workshop is coming back to #ICML2025, with the theme of exploring the frontiers of AI for mathematical reasoning, problem solving, and discovery!

🫵 Calling all pioneers in AI4Math:
📜 Submit your exciting work:
sites.google.com/view/ai4math...

31.03.2025 19:49 - 👍 0    🔁 0    💬 0    📌 0

Beautiful indeed!

25.03.2025 22:51 - 👍 1    🔁 0    💬 0    📌 0
Dark Energy Survey: implications for cosmological expansion models from the final DES Baryon Acoustic Oscillation and Supernova data
The Dark Energy Survey (DES) recently released the final results of its two principal probes of the expansion history: Type Ia Supernovae (SNe) and Baryonic Acoustic Oscillations (BAO). In this paper,...

The DESI survey @desisurvey.bsky.social suggests the universe is *not* maximally boring! Statistical significance is not quite there yet, but a new result is a bit stronger than their previous indication that dark energy might be varying with time. (cont.)

arxiv.org/abs/2503.06712

19.03.2025 21:37 - 👍 101    🔁 15    💬 10    📌 4

For the ICLR Camera-ready version:

openreview.net/forum?id=4X9...

13.03.2025 15:04 - 👍 2    🔁 1    💬 0    📌 0

Tnx. The probing methods were both linear and non-linear, applied over the conjectural large-genus asymptotic form of the intersections. If the model actually learned the underlying math, it must have internalized the parameters of the asymptotic formula. We found that this was the case.

08.02.2025 20:34 - 👍 0    🔁 0    💬 0    📌 0
Can Transformers Do Enumerative Geometry?
We introduce a Transformer-based approach to computational enumerative geometry, specifically targeting the computation of $\psi$-class intersection numbers on the moduli space of curves....

🚀 Curious how Transformers understand Enumerative Geometry or model recursive functions with factorial blow-up?
I'll be presenting our results, openreview.net/forum?id=4X9..., at the Math4AI/AI4Math Workshop @mpiMathSci! 🔥
📅 Registration is open until Feb 28
🔗 www.mis.mpg.de/events/serie...
#AI4Math

08.02.2025 20:27 - 👍 1    🔁 0    💬 0    📌 0
Can Transformers Do Enumerative Geometry?
How can Transformers model and learn enumerative geometry? What is a robust procedure for using Transformers in abductive knowledge discovery within a mathematician-machine collaboration? In this work...

I am extremely happy to announce that our paper
Can Transformers Do Enumerative Geometry? (arxiv.org/abs/2408.14915) has been accepted to the
@iclr-conf.bsky.social!!
Congrats to my collaborators Alessandro Giacchetto at ETH Zürich and Roderic G. Corominas at Harvard.
#ICLR2025 #AI4Math #ORIGINS

23.01.2025 10:17 - 👍 12    🔁 3    💬 1    📌 2
