
Parth Shah

@parthshaha.bsky.social

Machine Learning Engineer @ Zscaler 🏫: UC San Diego, IIT Guwahati 🏢: Signify Research, Wadhwani AI, Publicis Sapient parthatom.github.io

52 Followers  |  679 Following  |  18 Posts  |  Joined: 16.11.2024

Posts by Parth Shah (@parthshaha.bsky.social)

All the classic feeds (Popular with friends, Quiet posters, Following) are endlessly repetitive.

Feed engineers - please come up with feeds that remove posts based on these repetitiveness criteria. This would make bsky at least 2x more useful.

10.12.2024 23:04 — 👍 3    🔁 0    💬 0    📌 0
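The filtering the post above asks for could be sketched as a simple seen-post filter. Everything here is illustrative, not a real Bluesky/atproto API: posts are plain dicts with a hypothetical `uri` and an optional `root_uri` (the thread root, so replies and quotes of an already-seen thread get dropped too).

```python
from collections import OrderedDict

class SeenPostFilter:
    """Hypothetical feed filter: drop posts (and thread updates)
    the reader has already been shown."""

    def __init__(self, max_remembered=10_000):
        self.seen = OrderedDict()          # post/thread URI -> True
        self.max_remembered = max_remembered

    def mark_seen(self, uri):
        self.seen[uri] = True
        self.seen.move_to_end(uri)
        if len(self.seen) > self.max_remembered:
            self.seen.popitem(last=False)  # evict the oldest URI

    def filter_feed(self, posts):
        """Return only posts whose URI and thread root are both unseen,
        marking the survivors as seen for future fetches."""
        fresh = []
        for post in posts:
            if post["uri"] in self.seen or post.get("root_uri") in self.seen:
                continue
            fresh.append(post)
            self.mark_seen(post["uri"])
        return fresh
```

A second fetch 30-60 minutes later would then surface only posts the reader has not already scrolled past, which addresses both complaints in the thread.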

2. Usually I open Bluesky for a couple of minutes, browse, and log off. If I come back 30-60 minutes later, a majority of the top 10 posts are the same ones I've already seen, which turns me off.

10.12.2024 23:02 — 👍 0    🔁 0    💬 0    📌 0

Repetitiveness is my biggest problem with Bluesky.

Problematic examples:

1. After I'm done interacting with a thread, I don't want to keep seeing new replies or quote posts on it.

10.12.2024 23:02 — 👍 0    🔁 0    💬 2    📌 0

Linking the main author's thread on the paper for anyone interested: bsky.app/profile/laur...

03.12.2024 00:37 — 👍 1    🔁 0    💬 0    📌 0

I disagree

(low-hanging fruit, I'm sorry)

02.12.2024 17:57 — 👍 3    🔁 0    💬 0    📌 0

Reposting as a mechanism of bookmarking (public)

01.12.2024 22:36 — 👍 0    🔁 0    💬 0    📌 0

> The decoder often effectively is a conditional GAN

Any intuition/math/papers you can share to help understand this? It's not very clear to me.

28.11.2024 20:53 — 👍 2    🔁 0    💬 1    📌 0

After training on these datasets, did you test on any datasets from a different distribution (e.g., a coding test set instead of the GSM8K test set)?

How much distribution change can AdaptiveDecoder handle?

22.11.2024 22:08 — 👍 0    🔁 0    💬 0    📌 0

Great work!
Nitpick: rewrite the first tweet to emphasize impact:
- Learns to predict temperature, auto-adjusting for creativity vs. factuality
- Against a fixed temperature, our predicted temperature wins 10% more often across the GSM8K and UltraFeedback datasets
- A new layer and model to learn any hyperparameter

22.11.2024 22:00 — 👍 0    🔁 0    💬 0    📌 0
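The idea being summarized (learning a per-token temperature instead of fixing one) can be sketched as follows. The temperature head here is a stand-in, not the paper's actual AdaptiveDecoder layer: a tiny linear map on a hidden state, squashed to (0, 2), whose parameters `w` and `b` are illustrative rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_with_temperature(logits, temperature):
    """Standard temperature-scaled softmax sampling."""
    scaled = logits / max(temperature, 1e-6)
    scaled = scaled - scaled.max()          # numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs), probs

def predict_temperature(hidden_state, w, b):
    """Stand-in for a learned temperature head: linear layer plus a
    sigmoid scaled to (0, 2). In the paper this would be trained
    end to end with the rest of the model."""
    t = float(hidden_state @ w + b)
    return 2.0 / (1.0 + np.exp(-t))

# Toy decoding step: a "factual" hidden state should map to a low,
# near-greedy temperature; a "creative" one to a higher temperature.
hidden = np.array([1.0, -0.5, 0.3])
w = np.array([0.2, 0.1, -0.4])
b = -1.0
temp = predict_temperature(hidden, w, b)
logits = np.array([2.0, 1.0, 0.1])
token, probs = sample_with_temperature(logits, temp)
```

Lowering the predicted temperature sharpens the sampling distribution toward the top logit (factual mode); raising it flattens the distribution (creative mode), which is the knob the fixed-temperature baseline cannot move per token.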

3/6 is not bad either way

22.11.2024 02:16 — 👍 1    🔁 0    💬 0    📌 0

Right. Then, if you phrase distribution change as learning, wouldn't you call it "in-context learning" when the context changes the distribution?

I see the above relation (B follows from A) as well defined.

What do you see is missing?

21.11.2024 17:45 — 👍 0    🔁 0    💬 0    📌 0

Curious to understand why you think the probability of y can change when x and D are fixed.

That is not true of any probability distribution.

21.11.2024 17:05 — 👍 0    🔁 0    💬 1    📌 0
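The claim above is just standard probability, stated precisely (nothing here is specific to the thread being replied to):

```latex
\text{For a fixed conditional distribution } p(\cdot \mid x, D):
\quad p(y \mid x, D) \in [0, 1] \text{ is a single number once } x \text{ and } D \text{ are fixed},
\qquad \sum_y p(y \mid x, D) = 1.
```

It can only "change" if the conditioning changes, e.g. a new context $x'$ gives $p(y \mid x', D) \neq p(y \mid x, D)$; holding $x$ and $D$ fixed leaves nothing free to vary.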

Yeah, I phrased it as "we *should* promote them," when I meant the current framework should promote them regardless.

21.11.2024 16:05 — 👍 0    🔁 0    💬 0    📌 0

Removing chaos by yourself is low visibility indeed. However, wouldn't you agree that someone who can influence enough people to remove said chaos should be promoted for their influence and culture-shaping capabilities?

21.11.2024 06:27 — 👍 0    🔁 0    💬 1    📌 0

Awesome slides. Thank you :)

20.11.2024 16:53 — 👍 1    🔁 0    💬 0    📌 0

This is an excellent list. I would probably add @colah.bsky.social's Transformer Circuits "Initial Thoughts" YouTube playlist along with the corresponding paper.

Do you have a website for the course I can follow?

20.11.2024 08:23 — 👍 1    🔁 0    💬 1    📌 0

Apple needs to add this to its Vision Pro at the minimum.

20.11.2024 08:11 — 👍 0    🔁 0    💬 0    📌 0

Appreciate this.

More similar categorizations: "dumb" questions, obvious questions that everybody thinks they know the answer to, and between-the-lines questions.

18.11.2024 08:29 — 👍 0    🔁 0    💬 0    📌 0