#EpiverseTRACE is now on Bluesky & LinkedIn! 🎉
We’re expanding to be more inclusive & diverse, reaching a wider audience in public health & data science.
Want to know more about what we do?⁉️🤔
🧵a thread!
These thoughts come after working on suspenseful story generation (pre-o1): arxiv.org/abs/2402.17119. GPT can be beaten at suspense generation. Interestingly, GPT can be improved by guiding it with the theory-of-mind part of story planning.
~3 pages is the longest form we’ve tried.
Something I don't understand is: why can't LLMs write novel-length fiction yet?
They've got the context length for it. And new models seem capable of the multi-hop reasoning required for plot. So why hasn't anyone demoed a model that can write long interesting stories?
I do have a theory ... +
Great blog post (by a 15-author team!) on their release of ModernBERT, the continuing relevance of encoder-only models, and how they relate to, say, GPT-4/llama. Accessible enough that I might use this as an undergrad reading.
19.12.2024 19:11
A photo of my open textbook, "Theory of Computing: An Open Introduction", on my bookshelf leaning up against some other classic theory texts.
With students writing my theory exam today, I figured it's a good time to share a link to my open textbook with all you current (and future!) theoreticians.
This term was the first time I used it in class, and students loved it. Big plans for future editions, so stay tuned!
taylorjsmith.xyz/tocopen/
Great tutorial on language models!
11.12.2024 08:04
Check out this BEAUTIFUL interactive blog about cameras and lenses
ciechanow.ski/cameras-and-...
A timely paper exploring ways academics can pretrain larger models than they might think possible, e.g. by trading time against GPU count.
Since the title is misleading, let me also say: US academics do not need $100k for this. They used 2,000 GPU hours in this paper; NSF will give you that. #MLSky
A poem for my last day working at the writing center for the semester (by Joseph Fasano)
23.11.2024 02:09