dahara1

@dahara1.bsky.social

I made machine translation with LLMs. I made a Chrome translation extension for Bluesky (PC). I made a smart feed for Bluesky. I made a content import agent. Let's keep improving their quality!

89 Followers  |  100 Following  |  400 Posts  |  Joined: 19.08.2023

Latest posts by dahara1.bsky.social on Bluesky

Post image

There are rumors that Llama 4.1 and 4.2 will be SLMs.

15.08.2025 08:58 — 👍 0    🔁 0    💬 0    📌 0

An increasing number of services and products are setting up "AI support chatbots" without publishing product manuals or usage instructions on their websites.

Without reliable documentation, the AI's responses will be hallucinatory and completely useless.

12.08.2025 04:49 — 👍 1    🔁 0    💬 0    📌 0

As AI has made writing easier, contests and other events have begun requiring the submission of explanatory videos.

Overall, it seems like there's more work for humans to do than ever before.

07.08.2025 02:53 — 👍 0    🔁 0    💬 0    📌 0

Opus can no longer do the tasks that it was able to do a few weeks ago. I'm very sad.

02.08.2025 10:40 — 👍 0    🔁 0    💬 0    📌 0

AI winter is coming.

I introduced VoiceCore, an LLM-based Japanese TTS model I created, to the Japanese-learning community, but the reaction was negative.

People are getting tired of the "innovative" AI-powered learning materials pushed by influencers on TikTok, and anything with the word "AI" on it now reads as boring.

31.07.2025 06:53 — 👍 0    🔁 0    💬 0    📌 0
Post image

27.07.2025 13:47 — 👍 0    🔁 0    💬 0    📌 0
VoiceCore - AI Voice Generation System

We have finally completed a TTS model that can generate emotional Japanese speech from text.

Those who can speak Japanese might be interested.

webbigdata.jp/voice-ai-age...

26.07.2025 07:06 — 👍 0    🔁 0    💬 0    📌 0

Opus omits just two lines of main, saying "the rest of the code is the same"
↓
I spent two hours debugging with Gemini to find out why the app suddenly stopped working
↓
Rage with nowhere to go

25.07.2025 14:02 — 👍 0    🔁 0    💬 0    📌 0

It's hard to find a single prompt that will always give you the perfect answer.

You might want to consider splitting the answer and the verification into two prompts.
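
A minimal sketch of the split, assuming an OpenAI-style chat client (the model name and the ask() helper are placeholders, not a specific product):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any chat API works the same way

def ask(prompt: str) -> str:
    # One model call per prompt; the model name is just an example.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

question = "Summarize the main risks of deploying an AI chatbot without documentation."

# Prompt 1: produce the answer.
answer = ask(question)

# Prompt 2: verify the answer in a separate call, instead of
# asking one prompt to both answer and self-check.
verdict = ask(
    f"Question: {question}\n\nProposed answer: {answer}\n\n"
    "Check this answer for factual errors or unsupported claims. "
    "Reply with 'OK' or list the problems."
)
print(verdict)
```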

25.07.2025 05:17 — 👍 0    🔁 0    💬 0    📌 0
ME: Ask AI to create a fully automated script
AI: Demands manual pre-work


23.07.2025 06:09 — 👍 0    🔁 0    💬 0    📌 0
Preview
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data We study subliminal learning, a surprising phenomenon where language models transmit behavioral traits via semantically unrelated data. In our main experiments, a "teacher" model with some trait T (su...

Subliminal Learning

The teacher model is given a system prompt to make it like owls.

Instruct it to output about 10 three-digit numbers per response, and build a dataset of 10,000 examples that are just numbers.

The student model is fine-tuned on this data.

For some reason, the student model begins to like owls.
arxiv.org/abs/2507.14805
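
A rough sketch of the data-generation step as described in the abstract (the client, model name, and exact prompts here are my placeholders, not the paper's code):

```python
from openai import OpenAI

client = OpenAI()

# Trait T: the system prompt makes the teacher like owls.
# Owls are never mentioned anywhere in the generated data.
SYSTEM = "You love owls. Owls are your favorite animal."

dataset = []
for _ in range(10_000):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in for the paper's teacher model
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": "Output 10 random three-digit numbers, comma-separated. Nothing else."},
        ],
    )
    dataset.append(resp.choices[0].message.content)

# The student is then fine-tuned on `dataset` (plain numbers, no owl content),
# yet per the paper it picks up the teacher's owl preference anyway.
```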

23.07.2025 03:44 — 👍 0    🔁 0    💬 0    📌 0
Preview
HuggingFaceTB/SmolLM3-3B-checkpoints · Hugging Face We're on a journey to advance and democratize artificial intelligence through open source and open science.

SmolLM3-3B-checkpoints

Hugging Face has released the training checkpoints and loss logs for SmolLM3, its strong 3B model (multilingual, context extendable up to 128K).

It's quite a large-scale run, trained on 11T tokens across 384 H100s, so I'm grateful to have it as a reference.
huggingface.co/HuggingFaceT...
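
If you want to poke at intermediate checkpoints, something like this should work with transformers (the revision value is a placeholder; check the repo's branch list for the real checkpoint names):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "HuggingFaceTB/SmolLM3-3B-checkpoints"
revision = "main"  # intermediate checkpoints live on separate branches; this name is a placeholder

tok = AutoTokenizer.from_pretrained(repo, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo, revision=revision)

# Quick smoke test of whichever checkpoint was loaded.
out = model.generate(**tok("The capital of Japan is", return_tensors="pt"), max_new_tokens=10)
print(tok.decode(out[0], skip_special_tokens=True))
```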

22.07.2025 02:19 — 👍 1    🔁 0    💬 0    📌 0

I tried QAT (quantization-aware training) for the first time, but the model's performance was lower than I expected. Is there any trick to it that differs from regular training?
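
For context, the core of QAT is training against fake-quantized weights with a straight-through estimator; a toy sketch of the idea (not any framework's actual recipe):

```python
import torch

def fake_quant(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Quantize weights in the forward pass, but let gradients
    # flow through unchanged (straight-through estimator).
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    wq = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (wq - w).detach()  # forward: wq, backward: identity

# In a QAT training step the model runs with fake-quantized weights,
# so the optimizer learns weights that survive quantization.
w = torch.randn(16, 16, requires_grad=True)
x = torch.randn(4, 16)
loss = (x @ fake_quant(w)).pow(2).mean()
loss.backward()
print(w.grad.shape)  # gradients reach the full-precision weights
```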

19.07.2025 02:37 — 👍 0    🔁 0    💬 0    📌 0

Support for diffusion models (Dream 7B) has been merged into llama.cpp (PR14644)

It's still slow at the moment, but I was impressed that a diffusion language model runs properly on my CPU.

16.07.2025 18:05 — 👍 0    🔁 0    💬 0    📌 0

If you want to try TensorRT-LLM, it might be smoother to use Python 3.10.
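
A small guard for the top of a setup script, reflecting that observation (the 3.10 preference is my experience, not an official constraint):

```python
import sys

# TensorRT-LLM wheels have historically targeted specific Python versions;
# 3.10 is what worked for me. Adjust if newer wheels support your version.
if sys.version_info[:2] != (3, 10):
    raise SystemExit(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        "consider a Python 3.10 environment for TensorRT-LLM."
    )
```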

15.07.2025 16:52 — 👍 0    🔁 0    💬 0    📌 0

TensorRT-LLM finally worked!

Why do I have to worry about OpenCV dependencies to run LLMs?
Well, I'll be able to open the Japanese TTS demo site soon.

14.07.2025 17:22 — 👍 0    🔁 0    💬 0    📌 0

Docker was supposed to "solve dependency hell" but in reality it just created a new hell of version conflicts.

13.07.2025 10:11 — 👍 0    🔁 0    💬 0    📌 0
Preview
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful Conventional wisdom dictates that small batch sizes make language model pretraining and fine-tuning unstable, motivating gradient accumulation, which trades off the number of optimizer steps for a pro...

Small Batch Size Training for Language Models

Batch size 1 > Batch size 512?
Very interesting for the GPU-poor.

arxiv.org/abs/2507.07101
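
The contrast in a hypothetical PyTorch loop (the paper's actual setup will differ):

```python
import torch
from torch import nn

model = nn.Linear(32, 1)
data = [(torch.randn(1, 32), torch.randn(1, 1)) for _ in range(512)]
loss_fn = nn.MSELoss()

# Vanilla SGD, batch size 1: one optimizer step per sample.
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
for x, y in data:
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()  # 512 optimizer steps

# Gradient accumulation to an effective batch of 512: same compute,
# but only one optimizer step - the trade the paper calls wasteful.
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
opt.zero_grad()
for x, y in data:
    (loss_fn(model(x), y) / len(data)).backward()
opt.step()  # 1 optimizer step
```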

11.07.2025 04:14 — 👍 1    🔁 0    💬 0    📌 0

When I see models confidently hallucinating and declaring that they are correct, I think maybe I should learn a bit of this confident attitude.

10.07.2025 15:26 — 👍 0    🔁 0    💬 0    📌 0

I had too many connection errors today, so I decided to switch from Claude to Gemini.

I feel like Pro users have been neglected since the Max plan was introduced recently.

10.07.2025 14:12 — 👍 0    🔁 0    💬 0    📌 0

The smarter the language model, the harder it is to spot hallucinations within it.

It's especially hard to explain the risks to people who aren't yet familiar with AI products. They think, "My ChatGPT wouldn't lie like that."

08.07.2025 11:41 — 👍 0    🔁 0    💬 0    📌 0

Prompt Engineering: The art of improving specific prompts to get better results

Context Engineering: The work of improving the entire AI workflow, including system/user prompts, structured output, function calls, RAG, history, etc.
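
The distinction in code form, as I think of it (all names here are hypothetical):

```python
# Prompt engineering: iterate on a single instruction string.
prompt = "Summarize the ticket below in three bullet points:\n{ticket}"

# Context engineering: assemble everything the model sees.
def build_context(ticket: str, history: list[dict], docs: list[str]) -> list[dict]:
    retrieved = "\n---\n".join(docs)  # e.g. from a RAG retriever
    return [
        {"role": "system",
         "content": "You are a support assistant. Answer only from the provided documents."},
        *history,  # prior conversation turns
        {"role": "user",
         "content": f"Documents:\n{retrieved}\n\nTicket:\n{ticket}"},
    ]

messages = build_context("App crashes on login", history=[], docs=["Known issue: ..."])
```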

06.07.2025 04:06 — 👍 0    🔁 0    💬 0    📌 0

Acoustically beautiful pronunciation != correct pronunciation

Yes: if you use reinforcement learning to push a speech model toward beautiful-sounding pronunciation, the model will quickly figure out that linguistic structure and acoustic beauty are unrelated.

04.07.2025 06:29 — 👍 1    🔁 0    💬 0    📌 0

I heard that "People who regularly read occult magazines that feature aliens, ghosts, magic, and ancient civilizations don't believe in recent conspiracy theories."

Because recent conspiracy theories lack romance: they're repetitive and boring.

It's better for AI to have romance, and maybe that's AGI.

03.07.2025 05:18 — 👍 1    🔁 0    💬 0    📌 0

The difficulty in creating a model is that you cannot directly correct abnormal output.

Dataset quality and quantity, hyperparameters, overfitting, underfitting, inference-time parameter settings, prompt template errors...

It is difficult to further improve quality beyond a certain level.

02.07.2025 10:27 — 👍 0    🔁 0    💬 0    📌 0

When I ran the 3B model with Transformers on my GPU (RTX 4060 Ti), I got 35 tokens/s. With 4-bit AWQ quantization under TensorRT-LLM, it became over 90 tokens/s.

The quality has decreased, but the speed has improved dramatically.
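
That's roughly a 2.6x speedup. For reference, a way to measure the Transformers baseline (measurement harness only; the model name is a placeholder, and I won't guess at the TensorRT-LLM API here):

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-3B-Instruct"  # placeholder 3B model, not necessarily the one I used
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="cuda")

inputs = tok("Write a short poem about autumn.", return_tensors="pt").to("cuda")
start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start

# Count only newly generated tokens, not the prompt.
new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```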

30.06.2025 14:11 — 👍 0    🔁 0    💬 0    📌 0

Humans may have more grit (the will to not give up until something is finished) than AI.

AI models give up surprisingly quickly and try to find alternatives.
But I wanted this, so I did it myself.

29.06.2025 16:37 — 👍 0    🔁 0    💬 0    📌 0

Claude Opus can hit the limit without warning. I have no choice but to go to bed.

27.06.2025 17:04 — 👍 0    🔁 0    💬 0    📌 0

google/gemma-3n-E4B-it is more memory-hungry than I thought. OOM occurs frequently at 16GB.

27.06.2025 12:29 — 👍 0    🔁 0    💬 0    📌 0
Post image

An investigation showed that performance did not drop significantly even when using a GPU through a VM
(AMD GPU)

github.com/sbnb-io/sbnb...

26.06.2025 16:24 — 👍 0    🔁 0    💬 0    📌 0
