Maria Khalusova

@mariak.bsky.social

Always growing, she/her, RAG builder, LLM whisperer, tech generalist

3,059 Followers  |  137 Following  |  183 Posts  |  Joined: 25.04.2023

Latest posts by mariak.bsky.social on Bluesky

A tiny bit of mirroring? :)

13.06.2025 18:24 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

PS: That said, I'll probably still keep an eye on what's happening and may even share some posts every now and then. I've got a lot of thoughts on RAG, data processing, LLMs/VLMs, etc., so I likely won't disappear fully.

13.06.2025 16:57 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

The work will still be here when I return. AI won't slow down, but a couple of months also won't make a dent in the field. This moment, however, this chance to be fully present with my family? That's something I don't want to miss.

13.06.2025 16:57 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

And even more grateful to work with a team that's so supportive. Stepping away from work, especially in a field moving at warp speed, can feel counterintuitive. But for me, it's a way to reconnect with what matters most.

13.06.2025 16:57 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Kids won't be kids forever, and mine are getting ever so close to becoming teenagers. This is time I know I'll never get back.
I'm incredibly grateful to be in a place, both professionally and personally, where this is possible.

13.06.2025 16:57 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Next week, I'm stepping away for a couple of months to take a sabbatical and spend time with my kids. I'm not burnt out. I'm following my own advice: do the thing you'll regret not doing when you're old.

13.06.2025 16:57 | 👍 4 | 🔁 0 | 💬 1 | 📌 0

Things move fast in AI. Every week brings new models, new capabilities, or new ideas to chase. It's exciting, but also easy to get swept up in the pace and forget to pause, to touch grass, to zoom out and see the bigger picture.
🧵

13.06.2025 16:57 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

RAG exists to solve different problems across varied domains. Understand the problem you're solving and look at your data.

12.06.2025 19:07 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

Once you have some answers to these, you can get further into the technical weeds and experiment with chunking to find an optimal size.
The bottom line, however, is that there's no universal "best" chunk size.

12.06.2025 19:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
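
One way to run the kind of chunk-size experiment mentioned in the post above is sketched here in Python. It is only a minimal illustration, assuming plain word-count chunks with a fixed overlap; the names `chunk_words` and `summarize_sizes` are hypothetical and do not come from the thread. A real experiment would likely also score retrieval quality on actual queries rather than just counting chunks.

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `size` words.

    Consecutive chunks share `overlap` words. Word-based splitting is an
    assumption made for illustration; token- or structure-aware splitting
    may suit real data better.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + size >= len(words):
            break
    return chunks


def summarize_sizes(text: str, sizes: list[int], overlap: int = 0) -> dict:
    """Return {size: (chunk_count, average_words_per_chunk)} per candidate size."""
    summary = {}
    for size in sizes:
        chunks = chunk_words(text, size, overlap)
        avg = sum(len(c.split()) for c in chunks) / len(chunks) if chunks else 0.0
        summary[size] = (len(chunks), avg)
    return summary


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for size, (count, avg) in summarize_sizes(sample, [3, 5, 10]).items():
        print(f"size={size}: {count} chunks, {avg:.1f} words on average")
```

Comparing these summaries across candidate sizes is only a starting point; which size works best still depends on the data and use case, as the thread argues.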

* How much context do you typically need to retrieve to satisfy a typical query? Simple facts may only require a sentence or two. Creative tasks may require larger context. Analytical queries may need a whole bunch of supporting evidence.

12.06.2025 19:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

They all vary in structure, style, and length.
* What is your use case? Are you trying to answer questions with specific facts? Are you gathering multiple documents to summarize for a report? Do you pull from transcripts and need to preserve speaker attribution?

12.06.2025 19:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Same goes for chunking. The "best" chunk size depends on a range of factors, and without those, the question is incomplete.
Here are some of the questions to ask instead:
* What does your data look like? Financial statements, technical manuals, and customer support transcripts are not the same.

12.06.2025 19:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Asking "What is the best chunk size for RAG?" without any additional context is like asking, "What's the best thing to wear?" Wear where? What's the weather like? What size are you? Are you going to a wedding or hiking a trail? There's no single answer that works for every situation.

🧵

12.06.2025 19:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Do the thing that you will regret not doing when you're old.

09.06.2025 12:32 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

I went to check what new courses deeplearning.ai has, and was pleasantly surprised to see that the short course that Marc Sun, Younes Belkada, and I built over a year ago is still featured as one of the Top Rated courses 😍

30.05.2025 19:47 | 👍 4 | 🔁 0 | 💬 0 | 📌 0

At least I have interrupted your doomscrolling with some cuteness!

20.05.2025 19:50 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

I'm taking this whole developer-becoming-a-farmer dream way too far, aren't I?

20.05.2025 19:49 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

If you've been prioritizing urgent work,
make sure to prioritize important work.

15.05.2025 12:38 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

How anyone can like peanut butter is beyond me.

12.05.2025 14:32 | 👍 0 | 🔁 0 | 💬 2 | 📌 0

Similar ≠ relevant

09.05.2025 12:29 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
Level Up Your GenAI Apps: Overview of Advanced RAG Techniques – Unstructured
Explore advanced RAG retrieval techniques, including re-ranking, hybrid search, metadata filtering, and Agentic RAG, that go beyond basic vector similarity to deliver more relevant, high-precision resul...

Part 2 is a high-level overview of advanced RAG techniques: unstructured.io/blog/level-u...

08.05.2025 13:44 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

Nothing starts a Wednesday morning quite like your dog getting sprayed by a skunk 🤢

07.05.2025 12:02 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

I have some epic plans for this summer, and none of you will be able to guess what they are.

02.05.2025 22:23 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
Level Up Your GenAI Apps: RAG Beyond the Basics – Unstructured
Learn why naive implementations fall short, and how smarter data preprocessing lays the foundation for reliable, high-performance RAG.

I'm starting a series of blog posts on RAG beyond the basic setup. In the first part, we're setting the stage: why naive RAG is not enough, and how a lot of the issues can be traced back to data processing choices.
Part 1: unstructured.io/blog/level-u...

01.05.2025 12:48 | 👍 5 | 🔁 0 | 💬 1 | 📌 0

What you're not changing, you're choosing.
This is a gentle reminder for the next time you're prioritizing a cool new shiny thing over building the foundation or addressing tech debt.

30.04.2025 12:59 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

Word of the day seems to be "sycophantic".
Thanks AI community for increasing my vocabulary :)

29.04.2025 14:20 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

32 pages - is it still a blog or do I call it a book now?

28.04.2025 19:31 | 👍 3 | 🔁 0 | 💬 0 | 📌 0

you're doing the work. You're still making decisions, solving problems, and putting together something useful.

Coding with AI only becomes "vibe coding" if you're not paying attention to, or don't care about, what you or your tools are doing.

25.04.2025 13:32 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

good code. Are we now repeating the same sentiment but with LLMs?

AI is just another tool, like a refactoring feature in the IDE or a debugger. A debugger doesn't find the issue; you do, faster with its help. Likewise, AI helps you work faster or more efficiently, but at the end of the day,

25.04.2025 13:32 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Though there used to be people saying things along the lines of "You're not a real programmer if you don't code without your IDE's fancy features!" or "True devs only need Notepad and a terminal". Surely, we've long moved past that. It doesn't matter what tools you are using; it's about writing

25.04.2025 13:32 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

@mariak is following 19 prominent accounts