Zack Angelo

@zackangelo.bsky.social

building ai inference @ mixlayer

25 Followers  |  124 Following  |  8 Posts  |  Joined: 25.10.2024

Latest posts by zackangelo.bsky.social on Bluesky

just realized bsky doesn't support gifs lol

15.12.2024 14:40 | 👍 1    🔁 0    💬 0    📌 0
Post image

functions can even compose, here's the model using the output of one as the input into another

13.12.2024 20:24 | 👍 0    🔁 0    💬 1    📌 0
Post image

one of the most slept on capabilities of newer AI models is the ability to call multiple tools in a single shot. here's the newest llama 70b running on mixlayer calling 4 tools (lookup weather in 3 cities and perform some arithmetic)

13.12.2024 20:24 | 👍 2    🔁 0    💬 1    📌 0
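
For readers who want to see what "multiple tools in a single shot" looks like on the client side, here is a minimal sketch. It assumes the model returns a list of tool calls in one response, loosely modeled on the common OpenAI-style `tool_calls` shape. The tool names, cities, temperatures, and response structure are all hypothetical and not taken from Mixlayer's API.

```python
import json

# Hypothetical local tools. A real deployment would call actual services.
def lookup_weather(city: str) -> dict:
    fake_temps = {"Austin": 75, "Boston": 41, "Seattle": 48}  # made-up values
    return {"city": city, "temp_f": fake_temps[city]}

def add(a: float, b: float) -> float:
    return a + b

TOOLS = {"lookup_weather": lookup_weather, "add": add}

# Assumed shape of ONE model response that requests four tool calls at once.
model_response = {
    "tool_calls": [
        {"id": "call_1", "name": "lookup_weather", "arguments": json.dumps({"city": "Austin"})},
        {"id": "call_2", "name": "lookup_weather", "arguments": json.dumps({"city": "Boston"})},
        {"id": "call_3", "name": "lookup_weather", "arguments": json.dumps({"city": "Seattle"})},
        {"id": "call_4", "name": "add", "arguments": json.dumps({"a": 2, "b": 3})},
    ]
}

def run_tool_calls(response: dict) -> list[dict]:
    """Execute every requested tool and pair each result with its call id."""
    results = []
    for call in response["tool_calls"]:
        fn = TOOLS[call["name"]]              # look up the local function
        args = json.loads(call["arguments"])  # arguments arrive as a JSON string
        results.append({"tool_call_id": call["id"], "content": json.dumps(fn(**args))})
    return results

tool_results = run_tool_calls(model_response)
for result in tool_results:
    print(result)
```

All four results would then go back to the model in a single follow-up turn, instead of one round trip per tool.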
Preview
LLM Reasoning 101 - Mixlayer Large Language Models (LLMs) can be made better at complex reasoning tasks through techniques like few-shot prompting and Chain of Thought (CoT) reasoning, which allow smaller models to match the perf...

Want to play around with chain of thought and some other prompting techniques? I put up a few Mixlayer demos on Meta's Llama 3.1 8b in this blog post. www.mixlayer.com/blog/2024-12...

11.12.2024 16:53 | 👍 0    🔁 0    💬 0    📌 0

weird that the instruction tuned Llama3 8b models are downloaded less than the original?

04.12.2024 15:53 | 👍 0    🔁 0    💬 1    📌 0

I doubt they switch to a lower precision model, but would not be surprised if they start using a quantized or fp8 KV cache. Much easier to switch out dynamically in response to load vs the model weights.

23.11.2024 17:43 | 👍 0    🔁 0    💬 0    📌 0
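
A rough back-of-the-envelope sketch of why KV cache precision matters for memory. It uses the standard size estimate (keys plus values, per layer, per KV head, per token). The configuration numbers are illustrative assumptions resembling an 8B-class model with grouped-query attention, not figures from the post, and the estimate ignores any scaling-factor overhead a quantized cache may carry.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_elem, batch_size=1):
    """Estimate KV cache size: 2 tensors (K and V) per layer, per token."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_elem * batch_size)

# Illustrative (assumed) configuration, one sequence of 8192 tokens.
cfg = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)

fp16_bytes = kv_cache_bytes(**cfg, bytes_per_elem=2)  # 16-bit cache
fp8_bytes = kv_cache_bytes(**cfg, bytes_per_elem=1)   # 8-bit cache

GIB = 1024 ** 3
print(f"fp16: {fp16_bytes / GIB:.2f} GiB, fp8: {fp8_bytes / GIB:.2f} GiB")
# prints "fp16: 1.00 GiB, fp8: 0.50 GiB"
```

Under these assumptions, halving the cache precision halves per-sequence cache memory, which roughly doubles how many concurrent sequences fit in the same budget without touching the model weights.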
Extending the Context Length to 1M Tokens! After the release of Qwen2.5, we heard the community's demand for processing longer contexts. In recent months, we have made m...

Crazy to think that a 1M token context window will be the norm soon.

Doesn't look like this model has made it onto HF yet (just a space, no weights), curious to learn more about the sparse attention mechanism.

qwenlm.github.io/blog/qwen2.5...

18.11.2024 15:45 | 👍 0    🔁 0    💬 0    📌 0

woke up in a 3am fit of terror last night bc I dreamt I left an 8x a100 gpu cluster running by accident 🫠

17.11.2024 13:58 | 👍 2    🔁 0    💬 0    📌 0

@zackangelo is following 20 prominent accounts