Anthony Dsouza

@ajdsouza.com.bsky.social

Mountainbikes | Guitars | MSc @lst.uni-saarland.de

17 Followers  |  79 Following  |  10 Posts  |  Joined: 13.12.2025  |  1.567

Latest posts by ajdsouza.com on Bluesky


GitHub - ajdsouz/bert Contribute to ajdsouz/bert development by creating an account on GitHub.

github.com/ajdsouz/bert

06.02.2026 16:20 — 👍 0    🔁 0    💬 0    📌 0

I’m going to use lightning-fabric for another graded project where my team builds an audio language model. That is more my style.

I’d like to think I’ve become a better dev, at least compared to when I started the project.

06.02.2026 16:02 — 👍 0    🔁 0    💬 1    📌 0

they are sometimes very restrictive in terms of freedom. We used Lightning for fp16 support on a Titan X, a card with limited fp16 capabilities. GradScaler and autocast do the job just fine on a single GPU, but Lightning makes it easy to change the precision by changing a single argument to the trainer.

06.02.2026 16:02 — 👍 0    🔁 0    💬 1    📌 0

but thankfully not as many times as we would have without writing tests for each component. One big bad decision we made was to use libraries on top of PyTorch for a trainer, despite having a functioning trainer class. Don’t get me wrong, Lightning (and other libraries) are great at what they do, but

06.02.2026 16:02 — 👍 0    🔁 0    💬 1    📌 0

model. We had to make many decisions, like the dataset size, the inner architecture, and most importantly, how everything would work together. Of course, we used Karpathy’s nanoGPT as an inspiration, but even then there were many differences. We also ended up shooting ourselves in the foot a couple of times,

06.02.2026 16:02 — 👍 0    🔁 0    💬 1    📌 0

For the last few weeks, I’ve been working in a team on a graded project on “Pretraining LLMs”. The idea is to learn how (L)LMs are pretrained, and each team has to pretrain one from scratch. We were advised to follow the Tuning handbook for direction and structure. We decided on an encoder 1/n

06.02.2026 16:02 — 👍 0    🔁 0    💬 1    📌 0

F, C, C : “We got rectal bleeding”
House : “What, all of you?”

Caught me off guard. I started watching House MD.

30.01.2026 01:47 — 👍 1    🔁 0    💬 0    📌 0

finally got to doing what needed to be done - adding a custom handle

08.01.2026 15:06 — 👍 0    🔁 0    💬 0    📌 0
GitHub - ajdsouz/coverletter: A simple Latex + pandoc coverletter template. Contribute to ajdsouz/coverletter development by creating an account on GitHub.

I’ve been applying for jobs for a while now and kinda hated using Google Docs to write cover letters. Built myself a nice LaTeX + pandoc setup that builds a PDF from a Markdown cover letter. Saved myself some time, energy and, mainly, sanity. github.com/ajdsouz/cove...

06.01.2026 23:23 — 👍 1    🔁 1    💬 0    📌 0
A photo of a train station after it’s snowed. There’s a group of friends walking on the snow covered platform.

Snow usually doesn’t stay in Saarbrücken. This time, it did.

05.01.2026 22:24 — 👍 1    🔁 1    💬 0    📌 0

@ajdsouza.com is following 20 prominent accounts