
@halgorithmist.bsky.social

Causality, RL Research

10 Followers  |  569 Following  |  2 Posts  |  Joined: 07.11.2024

Latest posts by halgorithmist.bsky.social on Bluesky

can I code fast? no. but can I code well? also no. but does my code work? alas, no

30.11.2024 21:39 · 👍 18423    🔁 2154    💬 409    📌 154

From Szeliski's "Computer Vision: Algorithms and Applications."

29.11.2024 12:03 · 👍 27    🔁 2    💬 3    📌 0

That account is doing an AMA about research at DeepMind 🤣

26.11.2024 05:28 · 👍 0    🔁 0    💬 0    📌 0

Trying to build a "books you must read" list for my lab that everyone gets when they enter. Right now it's:

- Sutton and Barto
- The Structure of Scientific Revolutions
- Strunk and White
- Maybe "Prediction, Learning, and Games", TBD

Kinda curious what's missing in an RL / science curriculum

25.11.2024 17:43 · 👍 141    🔁 11    💬 36    📌 1

Want to learn / teach RL? 

Check out new book draft:
Reinforcement Learning - Foundations
sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.

25.11.2024 12:08 · 👍 154    🔁 34    💬 4    📌 4

Big fan of this essay by @abeba.bsky.social

25.11.2024 20:22 · 👍 25    🔁 3    💬 1    📌 0

+100 to this recommendation

26.11.2024 01:03 · 👍 31    🔁 2    💬 2    📌 0
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning These notes are based on a lecture delivered by NC on March 2021, as part of an advanced course in Princeton University on the mathematical understanding of deep learning. They present a theory (devel...

Nadav Cohen and I recently uploaded lecture notes on the theory (and surprising practical applications) of linear neural networks.

Hope it can be useful, especially to those entering the field, as it highlights distinctions between DL and "classical" ML theory

arxiv.org/abs/2408.13767

20.11.2024 13:51 · 👍 3    🔁 1    💬 0    📌 0
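To make "linear neural network" concrete: a depth-L linear network is just a product of weight matrices, so as a function it collapses to a single linear map; the theory in the notes is about how gradient descent behaves on the factored form. A minimal NumPy sketch of that collapse (my own illustration, not code from the lecture notes):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
# Three weight matrices of a depth-3 *linear* network (no activations).
W1, W2, W3 = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
x = rng.standard_normal(d)

deep_out = W3 @ (W2 @ (W1 @ x))   # forward pass through the "deep" linear net
collapsed = (W3 @ W2 @ W1) @ x    # the equivalent single linear map
assert np.allclose(deep_out, collapsed)
```

The expressive power is the same as one matrix, which is why the interesting questions concern optimization and the implicit bias of training the factored parameterization, not the function class itself.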

Your blog makes up for it... Equally balanced 😄😄

24.11.2024 19:02 · 👍 1    🔁 0    💬 1    📌 0

Games are reasonable stepping stones as testbeds for AI progress. NetHack and text adventure games hit on modern AI weaknesses. I literally just gave a talk on why we should take Dungeons and Dragons and other role-playing games seriously as AI challenges.

23.11.2024 13:36 · 👍 71    🔁 11    💬 7    📌 1

Over the past decade, embeddings – numerical representations of machine learning features used as input to deep learning models – have become a foundational data structure in industrial machine learning systems. TF-IDF, PCA, and one-hot encoding have always been key tools in machine learning systems as ways to compress and make sense of large amounts of textual data. However, traditional approaches were limited in the amount of context they could reason about with increasing amounts of data. As the volume, velocity, and variety of data captured by modern applications has exploded, creating approaches specifically tailored to scale has become increasingly important. Google's Word2Vec paper made an important step in moving from simple statistical representations to semantic meaning of words. The subsequent rise of the Transformer architecture and transfer learning, as well as the latest surge in generative methods, has enabled the growth of embeddings as a foundational machine learning data structure. This survey paper aims to provide a deep dive into what embeddings are, their history, and usage patterns in industry.

Just realized Bluesky allows sharing valuable stuff because it doesn't punish links. 🤩

Let's start with "What are embeddings" by @vickiboykis.com

The book is a great summary of embeddings, from history to modern approaches.

The best part: it's free.

Link: vickiboykis.com/what_are_emb...

22.11.2024 11:13 · 👍 653    🔁 101    💬 22    📌 6
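As a toy illustration of the contrast the abstract above draws between one-hot style representations and dense embeddings, here is a short NumPy sketch (my own, not taken from the book); the embedding table is random here, whereas in practice it is learned:

```python
import numpy as np

vocab = ["cat", "dog", "car"]
dim = 5
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((len(vocab), dim))  # |V| x d lookup table

token_id = vocab.index("dog")

one_hot = np.zeros(len(vocab))
one_hot[token_id] = 1.0              # sparse; length grows with the vocabulary

dense = embedding_table[token_id]    # dense; fixed length d regardless of |V|

# Multiplying the one-hot vector by the table selects exactly that row:
assert np.allclose(one_hot @ embedding_table, dense)
```

The point of the dense row is that its size stays fixed as the vocabulary grows, and a learned table can place semantically related tokens near each other.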

Cohere's studies have shown that LLMs tend to rely on documents that contain procedural knowledge, such as code or mathematical formulas, when performing reasoning tasks.

This suggests that LLMs learn to reason by synthesizing procedural knowledge from examples of similar reasoning processes.

20.11.2024 17:18 · 👍 39    🔁 3    💬 1    📌 0

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

09.11.2024 09:13 · 👍 557    🔁 213    💬 67    📌 55

Is there any evidence for "pure" memory of space and time? All spatial and temporal memory tasks I can think of query memory of particular events/objects embedded in space and time. I conjecture that it's impossible for us to recall a location or time in the absence of events or objects.

20.11.2024 10:38 · 👍 52    🔁 8    💬 10    📌 1

The Llama 3.2 1B and 3B models are my favorite LLMs: small but very capable.
If you want to understand what the architectures look like under the hood, I implemented them from scratch (one of the best ways to learn): github.com/rasbt/LLMs-f...

20.11.2024 08:33 · 👍 142    🔁 16    💬 7    📌 1
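In the spirit of "implement it from scratch to understand it", here is a generic sketch of one transformer building block, scaled dot-product self-attention, in plain NumPy. It is an illustration of the idea only, not code from the linked repository, and it omits the learned query/key/value projections a real model would use:

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d) token representations. Returns (seq_len, d)."""
    d = x.shape[-1]
    q, k, v = x, x, x                                # no learned projections in this toy version
    scores = q @ k.T / np.sqrt(d)                    # pairwise similarities, scaled by sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)     # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ v                               # each token: weighted mix of value vectors

tokens = np.random.default_rng(0).standard_normal((3, 8))  # 3 tokens, d = 8
print(self_attention(tokens).shape)                        # (3, 8)
```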
