Preetha Chatterjee @preethac

LLMs can repair code, but often miss the broader context developers use every day.
We propose a 3-layer knowledge injection framework that incrementally feeds LLMs with bug, repository, and project knowledge.

Preprint of our ASE '25 paper: arxiv.org/pdf/2506.24015

24.08.2025 14:53 — 👍 1 🔁 1 💬 1 📌 0

Error analysis reveals that unresolved bugs are not randomly distributed; they cluster around specific bug types and higher complexity profiles. In particular, Program Anomaly, Network, and GUI bugs remain the most challenging for both models.

24.08.2025 14:55 — 👍 0 🔁 0 💬 0 📌 0

Evaluated on 314 real-world Python bugs, we observed consistent gains in both #fixed and Pass@k scores for Llama 3.3 and GPT-4o-mini, demonstrating a 23% improvement over prior work.

24.08.2025 14:54 — 👍 0 🔁 0 💬 1 📌 0

This layered approach offers several advantages.
Allows simpler bugs to be fixed with minimal input, conserving tokens and computation.
Scales context progressively, injecting more information only when necessary.
Enables analysis of bug types & complexity.

24.08.2025 14:54 — 👍 0 🔁 0 💬 1 📌 0

1️⃣ Bug Knowledge (e.g., immediate code and test context)
2️⃣ Repository Knowledge (e.g., related files, dependencies, commit history)
3️⃣ Project Knowledge (e.g., documentation, past bug fixes)

24.08.2025 14:53 — 👍 0 🔁 0 💬 1 📌 0
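
Editor's sketch (not from the paper): one way to read the three layers listed above is as an escalation loop that starts with bug knowledge and adds repository and then project knowledge only when the previous attempt fails. The function and parameter names (`repair_with_layers`, `generate_patch`, `passes_tests`) are hypothetical, and the model call and test run are passed in as plain callables, so this is an illustration under those assumptions rather than the authors' implementation.

```python
from typing import Callable, Optional, Sequence

# Layer names follow the list in the post; everything else here is an
# illustrative assumption, not the paper's actual interface.
LAYERS = ("bug", "repository", "project")


def repair_with_layers(
    knowledge: dict[str, str],
    generate_patch: Callable[[str], str],
    passes_tests: Callable[[str], bool],
    layers: Sequence[str] = LAYERS,
) -> tuple[Optional[str], int]:
    """Attempt a repair, adding one knowledge layer per attempt.

    Returns (patch, layers_used). patch is None if no attempt passed.
    """
    context_parts: list[str] = []
    for depth, layer in enumerate(layers, start=1):
        # Add the next layer's context on top of what was already supplied.
        context_parts.append(knowledge.get(layer, ""))
        prompt = "\n\n".join(part for part in context_parts if part)
        patch = generate_patch(prompt)  # hypothetical LLM call
        if passes_tests(patch):         # hypothetical test-suite run
            return patch, depth         # stop early: no further context needed
    return None, len(layers)


if __name__ == "__main__":
    demo_knowledge = {
        "bug": "bug ctx",
        "repository": "repo ctx",
        "project": "proj ctx",
    }
    # Stub model that echoes its prompt; stub tests that need repository context.
    patch, used = repair_with_layers(
        demo_knowledge,
        generate_patch=lambda prompt: prompt,
        passes_tests=lambda p: "repo ctx" in p,
    )
    print(used)  # 2
```

Returning the number of layers used alongside the patch is what would make the per-bug-type and per-complexity analysis mentioned in the thread possible in a setup like this.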


🌍 The future of #icse is global!
🇧🇷 ICSE 2026 – Brazil #icse2026
🇮🇪 ICSE 2027 – Ireland #icse2027
🌺 ICSE 2028 – Hawaii #icse2028
We can't wait to see you there! Pack your ideas and your passport. 🧳✈️

02.05.2025 13:24 — 👍 15 🔁 9 💬 1 📌 1

💡 If you are building, evaluating, or relying on LLMs for software development, please ask yourself: Did it warn you about the hidden security risk?

07.04.2025 13:43 — 👍 0 🔁 0 💬 0 📌 0

As a preliminary solution to this problem, we built a CLI tool prototype that integrates static analysis with LLM prompting, aiming to make AI code suggestions more secure by design.

07.04.2025 13:43 — 👍 0 🔁 0 💬 1 📌 0

However, when LLMs do warn you, they tend to offer more complete explanations, including potential causes of the vulnerability, exploits, and even fixes.

07.04.2025 13:43 — 👍 0 🔁 0 💬 1 📌 0

We evaluated GPT-4, Claude 3, and Llama 3 across 300 real-world Stack Overflow posts containing vulnerable code.

The results?
⚠️<40% of vulns flagged
⚠️As low as 12.6% when code was obfuscated
⚠️Common issues (e.g., unsanitized input) often missed - unless explicitly prompted

07.04.2025 13:43 — 👍 0 🔁 0 💬 1 📌 0

Do LLMs Consider Security? An Empirical Study on Responses to Programming Questions
The widespread adoption of conversational LLMs for software development has raised new security concerns regarding the safety of LLM-generated content. Our motivational study outlines ChatGPT's potent...

LLMs are great at generating code, but are they silently spreading vulnerabilities? TLDR: Yes.

In our latest EMSE paper, we look into: when developers unknowingly share vulnerable code with LLMs, do these models proactively raise security red flags? 🧵

👉 Read the paper: arxiv.org/abs/2502.14202

07.04.2025 13:42 — 👍 2 🔁 0 💬 1 📌 0

Delighted to share that our paper, led by my PhD advisee Ramtin Ehsani, “Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution,” has been accepted to the Research Track of MSR 2025.

Preprint: soar-lab.github.io//papers/MSR2...

21.01.2025 02:17 — 👍 3 🔁 0 💬 0 📌 0

Meta’s new Llama 3.3 70B is a genuinely GPT-4 class Large Language Model that runs on my laptop. Just 20 months ago I was amazed to see something that felt …

I can now run a GPT-4 class model on my laptop

(The exact same laptop that could just about run a GPT-3 class model 20 months ago)

The new Llama 3.3 70B is a striking example of the huge efficiency gains we've seen in the last two years
simonwillison.net/2024/Dec/9/l...

09.12.2024 15:19 — 👍 359 🔁 59 💬 11 📌 6

Congrats!!

10.12.2024 14:59 — 👍 1 🔁 0 💬 0 📌 0

#NeurIPS2024 paper 3, Assemblage - the dataset of source-to-binary projects compiled from GitHub that you've dreamed of but never had before! Collab with @krismicinski.bsky.social and a multi-year effort to get to @NeurIPSConf @BoozAllen arxiv.org/abs/2405.03991

07.12.2024 20:01 — 👍 6 🔁 3 💬 1 📌 0

🎉 Thrilled to share that our paper (with Ramtin Ehsani and @rezapour.bsky.social) has been accepted at NLBSE'25, co-located with @icseconf.bsky.social! 🎉

Our work shows promise in improving toxicity detection in OSS using moral values & psycholinguistic cues. Preprint coming soon.

09.12.2024 16:42 — 👍 3 🔁 2 💬 0 📌 0

Can you please add me here?

23.11.2024 03:28 — 👍 1 🔁 0 💬 0 📌 0


Latest posts by preethac.bsky.social on Bluesky
