Andrew Vaziri

@4threv.com.bsky.social

AI, Robotics & Society

32 Followers  |  44 Following  |  11 Posts  |  Joined: 14.11.2024

Latest posts by 4threv.com on Bluesky

The majority of AI computation is already people using AI rather than engineers training AI. DeepSeek R1 embarrasses corporate AI research labs, sure, but why did chip manufacturers take such a strong market hit? There is still plenty of demand: the R1 70B model can't run on my new RTX 4070 gaming GPU.

28.01.2025 08:13 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

There will still be niches where open-source won't have the specialized data to compete, but you can't copyright math or logic. The fundamental capability to reason will continue to be actively developed. (8/8)

28.01.2025 07:52 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

In conclusion, DeepSeek's advancements are newsworthy, but the market didn't seem to know how to interpret their impact. The sky isn't falling if open-source models are competitive. (7/8)

28.01.2025 07:48 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

I'm not minimizing DeepSeek's achievement, but this isn't a totally unprecedented result. It was expected that larger models would be distilled into smaller, more practical ones. (6/8)

28.01.2025 07:47 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

2. The authors claim that it was very important to train R1 on the outputs of larger models, a process called distilling. This is literally the first thing you'd try to make a model smaller while keeping performance. (5/8)

28.01.2025 07:46 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Models that don't specialize in reasoning, but instead focus on writing style or broad knowledge, will still be needed. These models are more resource-intensive to train than R1. (4/8)

28.01.2025 07:46 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

1. R1 is competing in a specialized class of AI models focused on "chain of reasoning". Some of the reinforcement learning tricks used to make it train efficiently only work in areas where there's a definitive correct answer, like math or formal logic. (3/8)

28.01.2025 07:45 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

R1 is newsworthy, don't get me wrong, but here are a few things to keep in mind when assessing how much this should change your worldview: (2/8)

28.01.2025 07:45 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

The fact that news of DeepSeek R1's better-than-expected performance wiped out $1T of market value says less about the technology and more about how many market participants are operating on hype without understanding. In other words, a bubble. (1/8)

28.01.2025 07:44 | 👍 1 | 🔁 0 | 💬 7 | 📌 0

So Sauron tried to recruit me into a surveillance startup. After three emails and an ignored request to remove me from their list, I thought I would have some fun.

17.01.2025 19:07 | 👍 3 | 🔁 0 | 💬 0 | 📌 0
Automation and Employment - Robohub

For my first post, a throwback. In 2016 I spoke with @maosbot.bsky.social about automation and employment. This was half a decade before ChatGPT, and yet the predictions were prophetic. We can forecast challenges, and must be proactive in rising to meet them. robohub.org/robots-autom...

12.01.2025 18:41 | 👍 2 | 🔁 1 | 💬 0 | 📌 0

@4threv.com is following 18 prominent accounts