SEACrowd

@seacrowd.bsky.social

Advancing Southeast Asian (SEA) NLP Research https://seacrowd.github.io/

15 Followers  |  10 Following  |  13 Posts  |  Joined: 12.03.2025

Latest posts by seacrowd.bsky.social on Bluesky

We're thrilled that SEA-VL has been accepted to ACL 2025 (Main)!

Thank you to everyone who contributed to this project 🥳

Paper: arxiv.org/abs/2503.07920
Project: seacrowd.github.io/seavl-launch/

#ACL2025NLP #SEACrowd #ForSEABySEA

16.05.2025 22:18 | 👍 2    🔁 4    💬 0    📌 0

Let's build a VLM that sees and celebrates Southeast Asia, together. 💪

@josephimperial.bsky.social @samuel-cahyawijaya.bsky.social @jcblaise.bsky.social @ruochenzhang.bsky.social @rianadamr.bsky.social @antonrufino.bsky.social

08.05.2025 09:41 | 👍 1    🔁 0    💬 0    📌 0
🚨 SEA-VL Phase 2 - Building Vision-Language Models for Southeast Asia: Call for Contributors Welcome! SEA-VL is a global community project organized by the SEACrowd community to push the boundaries of vision and language research in Southeast Asia (SEA). We recently completed Phase 1 of this ...

Whether you're a researcher, developer, artist, linguist, photographer, student, or simply someone who loves Southeast Asia, your voice and skills matter. Join us!

📥 Apply now: seacrowd.github.io//seavl-phase...

💬 Questions? Join the conversation on Discord: discord.gg/XXRHFuvkTA

08.05.2025 09:41 | 👍 0    🔁 0    💬 1    📌 0

Why contribute?
🤝 Work with an international team of passionate researchers
🏅 Earn points for every contribution, with opportunities for a certificate, exclusive merch (t-shirt & keychain), and even co-authorship on our final paper

08.05.2025 09:41 | 👍 0    🔁 0    💬 1    📌 0

We are looking for contributors who can:
🔹 Submit culturally relevant images from SEA
🔹 Annotate image submissions
🔹 Translate existing benchmarks to SEA languages
🔹 Create high-quality questions for multicultural images from SEA
🔹 Create high-quality prompts for image generation with our VLM

08.05.2025 09:41 | 👍 0    🔁 0    💬 1    📌 0

We want to build the first open-source vision-language model (VLM) that fully captures Southeast Asia's rich cultures, languages, and everyday life!

08.05.2025 09:41 | 👍 0    🔁 0    💬 1    📌 0

📢 Calling all SEA-passionate individuals!

SEACrowd is excited to launch our contributor call for SEA-VL Phase 2: Building Visual Language Models for Southeast Asia! 🌏

After the success of Phase 1, we're now taking on a bigger mission (see thread) 👇

08.05.2025 09:41 | 👍 2    🔁 2    💬 1    📌 1
The ACL Special Interest Group on SEA NLP Southeast Asia

Interested in advancing research on Southeast Asian languages? We're happy to welcome you to SEACrowd and SIGSEA! See the links below:

SIGSEA: www.sigsea.org/home
Discord: discord.gg/XXRHFuvkTA

13.03.2025 11:36 | 👍 3    🔁 0    💬 0    📌 0

Introducing SEA-VL with 1.3M culturally relevant images, 50x larger than existing datasets!
🔍 Key insights:
✅ Crowdsourcing: good accuracy but slow & costly
✅ Image Crawling: ~85% cultural relevance
❌ Image Generation fails to capture SEA nuances & faces licensing issues

13.03.2025 11:36 | 👍 2    🔁 0    💬 1    📌 0

Why is this important?
✅ AI models trained on culturally relevant data can better understand local contexts, traditions, and languages.
✅ Community contributions ensure AI does not misrepresent local identities.
✅ We empower local communities in AI development.

13.03.2025 11:36 | 👍 2    🔁 0    💬 1    📌 0

💡 That's why we created SEA-VL, an open-source initiative designed to bridge the resource gap and provide AI models with more accurate, culturally relevant data from SEA. But we couldn't have done it alone!

#NoLanguageLeftBehind #SoutheastAsia

13.03.2025 11:36 | 👍 2    🔁 0    💬 1    📌 0

AI is shaping the future, but how often does it reflect the cultures, languages, and traditions of Southeast Asia? Not enough!

Most VL datasets used to train AI are dominated by Western-centric data, leaving Southeast Asian cultures largely underrepresented.

13.03.2025 11:36 | 👍 1    🔁 0    💬 1    📌 0

SEA-VL: Building AI for Southeast Asian Research 🌏

We release SEA-VL, the largest vision-language dataset tailored for SEA's diverse culture.

📜 arXiv: arxiv.org/abs/2503.07920
🤗 Data: huggingface.co/collections/...

Check the thread 🧵

13.03.2025 11:36 | 👍 4    🔁 3    💬 1    📌 1
