
Robert Nishihara

@robertnishihara.bsky.social

Co-founder of Anyscale. Co-creator of Ray. Previously PhD ML at Berkeley.

273 Followers  |  8 Following  |  6 Posts  |  Joined: 25.02.2024  |  1.4106

Latest posts by robertnishihara.bsky.social on Bluesky

Speak at Ray Summit!

10.07.2025 06:20 | 👍 3    🔁 2    💬 0    📌 0
DuckDB goes distributed? DeepSeek’s smallpond takes on Big Data
DeepSeek is pushing DuckDB beyond its single-node roots with smallpond, a new, simple approach to distributed compute. But does it solve the scalability challenge, or introduce new trade-offs?

More details in this blog post. mehdio.substack.com/p/duckdb-goe...

04.03.2025 06:33 | 👍 0    🔁 0    💬 0    📌 0

DeepSeek released smallpond, a big data processing framework built on top of Ray.
- It targets high-performance data processing.
- It provides a high-level dataframe API.
- It targets petabyte-level scaling.

The challenges around training data prep only grow when you include multimodal data.

04.03.2025 06:33 | 👍 0    🔁 0    💬 1    📌 0
Amazon’s Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | Amazon Web Services
Large-scale, distributed compute framework migrations are not for the faint of heart. There are backwards-compatibility constraints to maintain, performance expectations to meet, scalability limits to...

Amazon published this only 4 months ago, but it feels like an eternity. It's one of the most impressive large-scale data processing migration efforts. Rare to see companies truly achieving order-of-magnitude cost improvements (while simultaneously increasing scale).

aws.amazon.com/blogs/openso...

22.11.2024 03:00 | 👍 3    🔁 1    💬 0    📌 0
ChatGPT Creator John Schulman on OpenAI | Ray Summit 2023
YouTube video by Anyscale

Talked with John Schulman last year about the ChatGPT backstory and scaling laws 😍 John co-founded OpenAI and created ChatGPT. www.youtube.com/watch?v=6Ctv...

22.11.2024 00:50 | 👍 0    🔁 0    💬 0    📌 0
Fine-tuning LLMs for longer context and better RAG systems
Based on the popular “Needle In a Haystack” benchmark and RAG, we share our process of creating a problem-specific fine-tuning dataset to extend the context of models to build better RAG systems.

A good overview of the fundamentals of how to extend context windows for LLMs (if you care about RAG, you probably care about context lengths).

26.02.2024 06:35 | 👍 3    🔁 0    💬 0    📌 0

@robertnishihara is following 8 prominent accounts