Scott Condron

@scottcondron.bsky.social

Working at wandb on Weave, helping teams ship AI applications

293 Followers  |  424 Following  |  22 Posts  |  Joined: 16.11.2024  |  1.9067

Latest posts by scottcondron.bsky.social on Bluesky

How do I get Bluesky to show me less politics and more AI/ML things? I have followed mostly people who work in AI/ML

09.03.2025 11:12 - 👍 3    🔁 0    💬 0    📌 0

Maybe they could tell you what they’ve learned, like “it seems you’re interested in staying up to date with recommender systems, want to add that to your feed?”

09.03.2025 10:31 - 👍 0    🔁 0    💬 0    📌 0

Thanks Scott! Very exciting

09.03.2025 09:34 - 👍 0    🔁 0    💬 0    📌 0

Prompts within a complex system are brittle

I have seen some teams succeed by replacing prompts with smaller, more deterministic components and improving reliability with fine-tuning. Anyone else have success with this approach?

Seems to help a lot with agents

29.11.2024 10:16 - 👍 1    🔁 0    💬 0    📌 0

I collected some folk knowledge for RL and stuck them in my lecture slides a couple weeks back: web.mit.edu/6.7920/www/l... See Appendix B... sorry, I know, appendix of a lecture slide deck is not the best for discovery. Suggestions very welcome.

27.11.2024 13:36 - 👍 113    🔁 17    💬 3    📌 3

If you’re taking time to enjoy your family and not building with LLMs, you’re ngmi.
America is cooked

28.11.2024 07:01 - 👍 2    🔁 0    💬 0    📌 0

LLM app dev broke our comparison tools because tiny diffs can cause large behaviour change.

At wandb, we've spent years thinking about experiment comparison. We've added new tools for LLM app dev: code, prompts, models, configs, outputs, eval metrics, eval predictions, eval scores..
wandb.me/weave

26.11.2024 13:38 - 👍 6    🔁 0    💬 1    📌 1

The art of how to refer to model behaviour with tasteful non-person metaphors. Say “stochastic” you’re in one camp, say “emergent” you’re in another.
It’s a minefield out there people

25.11.2024 20:54 - 👍 0    🔁 0    💬 0    📌 0

People ask for an iOS app but maybe we shouldn’t as it would cause more misery on-the-go

23.11.2024 10:03 - 👍 0    🔁 0    💬 0    📌 0

Being logged into wandb on your phone is a recipe for misery

20.11.2024 04:09 - 👍 74    🔁 4    💬 10    📌 0

Would be happy to schedule a chat to hear more about your experience with W&B

23.11.2024 09:48 - 👍 0    🔁 0    💬 0    📌 0

hey, sorry to hear your complaints about wandb. Have you seen the big response in that issue with options? Tables is built on parquet so it’s difficult from an architectural perspective. With the recent release of Weave, there may be a path forward by using the weave backend instead of parquet…

23.11.2024 09:48 - 👍 0    🔁 0    💬 1    📌 0

Agreed, fellow competitor.

It’s the biggest hurdle I see from teams trying to build GenAI features

We need tools to lower the barrier to entry with LLM judges, existing benchmarks, manual annotation as eval collection, synthetic data… anything else?

23.11.2024 08:29 - 👍 1    🔁 0    💬 0    📌 0

I think these small models are not for day-to-day use; instead, they’re for B2C applications of LLMs, where it’s cost- or latency-prohibitive to use anything else

22.11.2024 08:57 - 👍 0    🔁 0    💬 0    📌 0
chore: Add llms.txt by scottire · Pull Request #3045 · wandb/weave
Adds a script to generate llms.txt file. Features: generates Docs & Optional sections; links to GitHub markdown; includes logic to remove certain files; includes generated markdown. To generate ll...

- it really works to teach an LLM about your tool, thank you long context!

Link for the curious:
github.com/wandb/weave/...

21.11.2024 19:16 - 👍 4    🔁 0    💬 0    📌 0

- it's much better for scraping if the links included are .md files
- you need to be clear which files to include and which are optional because context blows up quickly
- automating creating your docs' llms.txt is pretty easy

21.11.2024 19:16 - 👍 3    🔁 0    💬 1    📌 0

Lessons from creating an llms.txt file
An llms.txt file is a way to tell an LLM about your website. In the .txt file, you include links to other files with more information.
- the llms.txt file isn't the file you send to an LLM; you use it to generate an llms .md file

21.11.2024 19:16 - 👍 3    🔁 0    💬 1    📌 0
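
The llms.txt thread above mentions separating required from optional links, preferring .md links, and automating the file. A minimal sketch of that idea follows. It is not the script from the wandb/weave pull request; `DocPage`, `build_llms_txt`, and the example.com URLs are hypothetical names chosen for illustration, and the layout (H1 title, blockquote summary, "Docs" and "Optional" link sections) is an assumption based on the public llms.txt proposal.

```python
from dataclasses import dataclass


@dataclass
class DocPage:
    """One documentation page to list in llms.txt (hypothetical helper type)."""
    title: str
    url: str                # ideally a link to a raw .md file
    optional: bool = False  # optional pages can be skipped to save context


def build_llms_txt(project: str, summary: str, pages: list[DocPage],
                   exclude: frozenset[str] = frozenset()) -> str:
    """Render an llms.txt-style index: title, summary, then link sections."""
    # Drop pages that were explicitly excluded by URL.
    kept = [p for p in pages if p.url not in exclude]
    lines = [f"# {project}", "", f"> {summary}", ""]
    for heading, want_optional in (("Docs", False), ("Optional", True)):
        section = [p for p in kept if p.optional == want_optional]
        if not section:
            continue  # skip empty sections entirely
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- [{p.title}]({p.url})" for p in section)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    pages = [
        DocPage("Quickstart", "https://example.com/docs/quickstart.md"),
        DocPage("Changelog", "https://example.com/docs/changelog.md", optional=True),
    ]
    print(build_llms_txt("Example Tool", "A hypothetical tool.", pages))
```

Keeping the required/optional split explicit in the data makes it easy to drop the optional section when the context budget is tight.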
Creating a LLM-as-a-Judge That Drives Business Results – A step-by-step guide with my learnings from 30+ AI implementations.

hamel.dev/blog/posts/l...

21.11.2024 07:52 - 👍 0    🔁 0    💬 0    📌 0

from @hamel.bsky.social’s hamel.dev/blog/posts/llm…

We're building LLM / Human "scorers" in @weightsbiases.bsky.social to have the same data model for this reason

20.11.2024 20:30 - 👍 2    🔁 0    💬 1    📌 0

Your human and LLM judges should follow the same criteria.

Then, you can transition from manual to automated evaluation once you have inter-annotator agreement between LLM & human. You now have a faster iteration speed and the annotator can focus on finding edge cases!

20.11.2024 20:30 - 👍 4    🔁 0    💬 1    📌 0
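
One way to check the human/LLM agreement described in the post above is a chance-corrected agreement score. The sketch below uses Cohen's kappa, which is an assumption on my part: the post does not name a metric, and `cohen_kappa` and the pass/fail labels are hypothetical. It assumes both judges labelled the same items in the same order against the same criteria.

```python
from collections import Counter


def cohen_kappa(human: list[str], llm: list[str]) -> float:
    """Cohen's kappa between two label sequences over the same items."""
    if len(human) != len(llm) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(human)
    # Fraction of items where both judges gave the same label.
    observed = sum(h == m for h, m in zip(human, llm)) / n
    # Agreement expected by chance, from each judge's label frequencies.
    human_freq = Counter(human)
    llm_freq = Counter(llm)
    expected = sum(human_freq[c] * llm_freq[c] for c in human_freq) / (n * n)
    if expected == 1.0:
        return 1.0  # both judges used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    human_labels = ["pass", "pass", "fail", "fail"]
    llm_labels = ["pass", "pass", "fail", "pass"]
    print(cohen_kappa(human_labels, llm_labels))
```

What counts as enough agreement to hand over to the automated judge is a team decision; the score only makes the comparison repeatable.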
glif - MSPaint (Flux) by fab1an

glif.app/@fab1an/glif...

20.11.2024 08:53 - 👍 0    🔁 0    💬 0    📌 0

Put glue on pizza

20.11.2024 08:53 - 👍 1    🔁 0    💬 1    📌 0

The most bizarre AI interview I've ever done was at wandb when, as usual, I asked a candidate to build an AI classifier in any language/framework of their choice..

And they nonchalantly said "I'll write it in Redstone", to which I almost let loose a chuckle until...

19.11.2024 22:15 - 👍 1    🔁 0    💬 0    📌 0

Claude defaults to concise responses when there's high demand, clever way to smooth peaks

19.11.2024 20:21 - 👍 5    🔁 0    💬 0    📌 0

We've been working on just that at @weightsbiases.bsky.social with Weave!

Weave is a lightweight LLM tracing and evaluation toolkit that focuses on letting you iterate fast and making sure that your production LLM-based application does not degrade when you change prompts or models!

18.11.2024 17:41 - 👍 14    🔁 3    💬 4    📌 2