
Jaedong Hwang

@jaedonghwang.bsky.social

PhD Student @MITEECS https://jd730.github.io/

31 Followers  |  138 Following  |  15 Posts  |  Joined: 01.03.2025

Latest posts by jaedonghwang.bsky.social on Bluesky


Interested in doing a Ph.D. to work on building models of the brain/behavior? Consider applying to graduate schools at CU Anschutz:
1. Neuroscience www.cuanschutz.edu/graduate-pro...
2. Bioengineering engineering.ucdenver.edu/bioengineeri...

You could work with several comp neuro PIs, including me.

27.09.2025 20:30 — 👍 52    🔁 30    💬 1    📌 4

We have one poster in this afternoon's session at #ICML2025 (West Exhibition Hall B2-B3, W-414).
Unfortunately, none of the authors could attend the conference, but feel free to contact me if you have any questions!
icml.cc/virtual/2025...

16.07.2025 13:16 — 👍 1    🔁 0    💬 0    📌 0

10/10 This work was a wonderful collaboration with Kumar Tanmay, Seok-Jin Lee, Ayush Agrawal, Hamid Palangi, Kumar Ayush, Ila Fiete, and Paul Pu Liang.
📘 Paper: arxiv.org/pdf/2507.05418
🌐 Project: jd730.github.io/projects/Geo...
#LLM #MultilingualAI #Reasoning #NLP #AI #LanguageModels

15.07.2025 15:46 — 👍 1    🔁 1    💬 0    📌 0

9/10
This matters:
βœ”οΈ For global inclusivity
βœ”οΈ For users who expect interpretable reasoning in their native language
βœ”οΈ For fair multilingual evaluation
🧠 LLMs shouldn’t just give the right answerβ€”they should think in your language.

15.07.2025 15:44 — 👍 0    🔁 0    💬 1    📌 0

8/10
📊 On MGSM, BRIDGE improves both math and language accuracy in medium- and low-resource languages.
Even better:
• It maintains performance in English.
• It succeeds where naive post-training (SFT or GRPO alone) fails, especially in math.

15.07.2025 15:43 — 👍 0    🔁 0    💬 1    📌 0

7/10
We also propose BRIDGE, a method that balances:
• Supervised fine-tuning for task-solving
• GRPO with a language-consistency reward for reasoning
This decouples multilingual ability from reasoning ability.
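The language-consistency reward described above can be sketched roughly like this; `detect_lang` is a hypothetical stand-in for a real language identifier (e.g. a fastText language-ID model), and the function name is illustrative, not the paper's actual code:

```python
# Hedged sketch: a language-consistency reward of the kind BRIDGE's GRPO
# stage could use. Assumes `detect_lang` maps a sentence to a language code.

def language_consistency_reward(reasoning_sentences, target_lang, detect_lang):
    """Fraction of reasoning sentences identified as the target language."""
    if not reasoning_sentences:
        return 0.0
    hits = sum(detect_lang(s) == target_lang for s in reasoning_sentences)
    return hits / len(reasoning_sentences)
```

In such a setup the consistency score would be combined with a task-correctness reward during GRPO, while SFT separately maintains task-solving ability.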

15.07.2025 15:43 — 👍 0    🔁 0    💬 1    📌 0

6/10
GeoFact-X lets us evaluate not just what models predict, but how they think.
We measure:
• Answer correctness
• Reasoning quality
• Language consistency
Models do better on region-language aligned pairs vs. mismatched ones.
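The aligned-vs.-mismatched comparison could be computed with a simple per-pair aggregation like the sketch below; the record fields are illustrative, not GeoFact-X's actual schema:

```python
# Hedged sketch: mean accuracy per (region, language) pair, to compare
# region-language aligned pairs against mismatched ones.
from collections import defaultdict

def accuracy_by_pair(records):
    """Average correctness for each (region, language) pair."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [correct_sum, count]
    for r in records:
        key = (r["region"], r["language"])
        totals[key][0] += r["correct"]
        totals[key][1] += 1
    return {pair: c / n for pair, (c, n) in totals.items()}
```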

15.07.2025 15:41 — 👍 0    🔁 0    💬 1    📌 0

5/10
We introduce GeoFact-X, the first benchmark to evaluate language-consistent reasoning.
🌍 It includes multilingual CoT QA across 5 regions × 5 languages (EN, JA, SW, HI, TH) = 25 region-language pairs.
Questions are grounded in regional facts, each with step-by-step reasoning.

15.07.2025 15:40 — 👍 1    🔁 0    💬 1    📌 0

4/10
We evaluate leading LLMs (e.g., Qwen2.5, LLaMA-3, Gemma-3, DeepSeek-R1) on MGSM with native-language CoT.
πŸ” Result:
Many models get the correct answer but default to English for reasoning, even when prompted otherwise.
That’s a serious misalignment.

15.07.2025 15:40 — 👍 3    🔁 0    💬 1    📌 0

3/10
Existing multilingual benchmarks (e.g., MGSM, MMLU-ProX) only evaluate whether the final answer is correct in the target language.
They don't measure whether the reasoning process (CoT) is in the same language.
That gap matters for transparency, fairness, and inclusivity.

15.07.2025 15:39 — 👍 1    🔁 0    💬 1    📌 0

2/10
Today's LLMs are multilingual-ish.
They often generate answers in the input language, but their reasoning steps (chain-of-thought) default to English, especially after post-training on English data.

15.07.2025 15:39 — 👍 1    🔁 0    💬 1    📌 0

🧵 1/10
LLMs can answer in many languages.
But do they think in them?
Even when prompted in Swahili or Thai, models often switch to English for reasoning.
This breaks interpretability and trust.
So we ask: Can LLMs reason in the input language?

15.07.2025 15:39 — 👍 2    🔁 0    💬 1    📌 0

If I remember correctly, that was also the first CV conference with over 1000 papers, and people already felt overwhelmed. Now, CVPR 2025 has 2800+ papers, and #NeurIPS2024 had 4497. It's becoming nearly impossible to discover hidden gems while wandering poster sessions. 2/2

12.06.2025 00:26 — 👍 1    🔁 0    💬 0    📌 0

#CVPR2025 Six years have passed since the 'Computer Vision After 5 Years' workshop at CVPR 2019. In it, Bill Freeman predicted that vision-science-inspired algorithms would lead the way. Instead, the field is now dominated by generative AI and foundation models. 1/2

12.06.2025 00:26 — 👍 1    🔁 0    💬 1    📌 0

We learned the bitter lesson that a poster should be checked before the poster session #ICLR2025.
Thank you all for coming; we're delighted that you enjoyed our mistakes.
We also greatly appreciate the authors of MMSearch for allowing us to use their panel.

26.04.2025 10:32 — 👍 2    🔁 0    💬 0    📌 0

📢 Excited to share that I will be presenting our paper on Neuro-Inspired SLAM at #ICLR2025 TOMORROW!
🗓 Saturday, April 26th 10:00 - 12:30 pm
📍 Hall 3 (Poster #55)
jd730.github.io/projects/FAR...

25.04.2025 13:31 — 👍 1    🔁 0    💬 0    📌 0
Global modules robustly emerge from local interactions and smooth gradients (Nature): The principle of peak selection is described, by which local interactions and smooth gradients drive self-organization of discrete global modules.

1/ Our paper appeared in @Nature today! www.nature.com/articles/s41... w/ Fiete Lab and @khonamikail.bsky.social .
Explains the emergence of multiple grid cell modules, with an excellent match to data! A novel mechanism applicable across vast systems, from development to ecosystems. 🧵👇

19.02.2025 23:20 — 👍 97    🔁 32    💬 2    📌 2
