Jason

@jason.rapidata.ai

Founder of Rapidata

66 Followers  |  86 Following  |  25 Posts  |  Joined: 26.11.2023  |  2.1442

Latest posts by jason.rapidata.ai on Bluesky

Rapidata/text-2-image-Rich-Human-Feedback-32k · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Give it a ♥️ at: huggingface.co/datasets/Rap...

08.05.2025 13:59 — 👍 1    🔁 0    💬 0    📌 0

These are scales that are just not feasible with traditional methods, let alone as a startup.

But for us it was a relatively easy task.

It's already trending in 4th place on Hugging Face. Give it some love so that we can get to first place!

08.05.2025 13:59 — 👍 1    🔁 0    💬 1    📌 0

We beat one of Google's most famous datasets!

We just released a new dataset with over 32k images annotated with over 3 million (!) human responses.

08.05.2025 13:59 — 👍 2    🔁 1    💬 1    📌 0
Video thumbnail

So this is apparently a thing in Palo Alto now. (I'm guessing it was hacked)

12.04.2025 16:42 — 👍 2    🔁 0    💬 0    📌 0
Rapidata/Translation-deepseek-llama-mixtral-v-deepl · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

We uploaded the results to @hf.co , check them out:
huggingface.co/datasets/Rap...
(4/4)

06.03.2025 14:51 — 👍 2    🔁 0    💬 0    📌 0

We used our own annotation service, Rapidata, to gather over 51k votes from native speakers and realized that the translation quality from DeepL was much better.

Translation quality is super important for the perceived quality of your product, so we are sticking to the premium option.
(3/4)

06.03.2025 14:51 — 👍 2    🔁 0    💬 1    📌 0

It shows that their models have a better understanding of the structures of language and the relationship of words.

We use DeepL to translate tasks for our annotators. It's a costly service, and we could use our cloud credits to run LLMs like DeepSeek-R1, Llama or Mistral for free.
(2/4)

06.03.2025 14:51 — 👍 2    🔁 0    💬 1    📌 0
Post image

Europe can do more than just unremovable bottlecaps!

Our AI industry is often belittled, but one of our earliest players, DeepL from Germany, is sticking it to the bajillion $ tech giants.

Their models outperform the top LLMs at translating text, one of the most fundamental tasks for any AI. (1/4)

06.03.2025 14:51 — 👍 3    🔁 0    💬 1    📌 0

He had wasted a bunch of our time, so we thought it would be funny to send him an invoice for USD 9.99 for "Emotional Damages"

AND HE FUCKING PAID IT

This is how you generate revenue as a startup.

The payment receipt now hangs on our door.
(6/6)

22.01.2025 15:36 — 👍 4    🔁 0    💬 0    📌 0

This went back and forth for a few hours until we finally managed to cut him off. Trolling at its finest, but then again, we learnt a lot for next time.

Anyhow, we noticed that he had started a payment process at some point with us, so he was registered in our Stripe account.
(5/?)

22.01.2025 15:36 — 👍 4    🔁 0    💬 1    📌 0

Then he started to ask very inappropriate things like: "Do you think you are safe?" or "are you aware that you are not safe right now?"
(These were all caught by our automated audit, so no harm done)

So we blocked his IP, but he turned on a VPN.
(4/?)

22.01.2025 15:36 — 👍 2    🔁 0    💬 1    📌 0

Tons of nonsensical orders started flooding in all at once.

He tried to advertise his Discord to our annotators through his orders.

We blocked his account, but he just kept making new ones.
(3/?)

22.01.2025 15:36 — 👍 2    🔁 0    💬 1    📌 0

As long as your early customers' data is safe (this is a must, obviously), it's OK if your system is a bit wobbly.

Until a few days ago we never had anyone really try to fuck with our software.

But then some dude in Czechia randomly decided to test us.
(2/?)

22.01.2025 15:36 — 👍 2    🔁 0    💬 1    📌 0
a paid stripe invoice for 9.99 usd for "Emotional Damage"

9.99 USD for Emotional Damages

As a startup, you need to be fast, that's really your only advantage over the big boys. (They got more $ than you)

When you iterate and prototype and build, you don't want to waste time building all of the safety systems first if no one is using the product yet. (1/?)

22.01.2025 15:36 — 👍 6    🔁 0    💬 1    📌 1
Beyond Image Preferences - Rich Human Feedback for Text-to-Image Generation A Blog post by Rapidata on Hugging Face

We also wrote a blog article on Hugging Face about how we collected and analyzed this dataset, for those who are interested in the details: huggingface.co/blog/Rapidat...

10.01.2025 19:10 — 👍 4    🔁 1    💬 0    📌 0
Screenshot of the dataset on the Hugging Face Hub

🔍 Massive human feedback dataset for text-to-image models from Rapidata
- 1.5M human responses from 152K participants
- Evaluates image coherence, style & prompt alignment
- Includes detailed error heatmaps
- Covers DALL-E, Midjourney, Imagen outputs
Available on @hf.co

09.01.2025 14:00 — 👍 12    🔁 2    💬 1    📌 0
Post image

What a view...
(Trending page of Hugging Face image datasets)

10.01.2025 12:41 — 👍 3    🔁 0    💬 0    📌 0
Image Comparison Tool | Rapidata Do you want to know which logo is more appealing to users around the world? Our compare tool lets you quickly ask any question and have the user respond by selecting one of two images. It's super eas...

For the UI demo (app.rapidata.ai ):
We just threw out the login altogether.
Of course there are risks, but until those risks actually cause you harm (and the level of harm here is that it will cost us a hundred dollars) you can just ignore them.

07.01.2025 17:03 — 👍 2    🔁 0    💬 0    📌 0

... So, we set up an authentication flow that automatically pops up a browser window to log you in using your existing Google account.
We ask no questions and there is no setup process. The code will instantly run after you hit the button.
.....

07.01.2025 17:03 — 👍 2    🔁 0    💬 1    📌 0
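
For readers wondering what a "pop up a browser window and log in" flow can look like in general, here is a minimal sketch using only the Python standard library. It is not Rapidata's implementation: the endpoint https://auth.example.com/login, the redirect_uri/state/code parameter names, and all function names are hypothetical, chosen only to illustrate the loopback-redirect pattern (a local server waits for the browser to come back with a one-time code).

```python
import secrets
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_login_url(auth_base: str, port: int, state: str) -> str:
    """Build the URL the browser is sent to (hypothetical parameter names)."""
    query = urllib.parse.urlencode({
        "redirect_uri": f"http://127.0.0.1:{port}/callback",
        "state": state,
    })
    return f"{auth_base}?{query}"


class _CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Parse the query string of the redirect and keep it on the server object.
        parsed = urllib.parse.urlparse(self.path)
        params = urllib.parse.parse_qs(parsed.query)
        self.server.result = {key: values[0] for key, values in params.items()}
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Logged in. You can close this tab.")

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


def start_callback_server() -> HTTPServer:
    """Listen on a free loopback port chosen by the operating system."""
    server = HTTPServer(("127.0.0.1", 0), _CallbackHandler)
    server.result = None
    return server


def receive_code(server: HTTPServer, expected_state: str) -> str:
    """Block until the browser redirects back once, then return the code."""
    server.handle_request()
    server.server_close()
    result = server.result or {}
    if result.get("state") != expected_state:
        raise ValueError("state mismatch, refusing the callback")
    return result["code"]


def login(auth_base: str = "https://auth.example.com/login") -> str:
    """Open the browser and wait for the redirect (hypothetical endpoint)."""
    import webbrowser

    state = secrets.token_urlsafe(16)
    server = start_callback_server()
    webbrowser.open(build_login_url(auth_base, server.server_port, state))
    return receive_code(server, state)
```

Calling login() from a script would open the default browser and return once the provider redirects back to the local port. Exchanging the returned code for real credentials is left out, since that step depends entirely on the provider.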
Rapidata Python SDK Documentation

For the python API (docs.rapidata.ai ):
With basically every API out there, you need to set an API key to use it (e.g. OpenAI's API). That's a lot of work, it will take you at least 15-20 seconds. NOT ACCEPTABLE!
A customer should literally just copy some code into VS Code and run it...

07.01.2025 17:03 — 👍 1    🔁 0    💬 1    📌 0

You have to reduce all friction to getting started, so people can experience the fucking awesomeness that is your product while jumping through as few hoops as possible.
At Rapidata we took it to an extreme, pushing the industry standard for brain rot compatibility.....

07.01.2025 17:03 — 👍 0    🔁 0    💬 1    📌 0
Video thumbnail

8.25 seconds

That's the average adult attention span these days. Our brains have been cooked by TikTok, Instagram Reels and YouTube Shorts.

As a startup founder, this has to be acknowledged when building a product now.... 🧵

07.01.2025 17:03 — 👍 3    🔁 0    💬 1    📌 0

I feel like Community Notes on Twitter is actually a really good feature. Obviously a post should be taken down in a bad case of misinformation, but not everything is straight-up misinformation, and a lot of posts could really benefit from some added context from reliable sources.

06.12.2024 10:14 — 👍 1    🔁 0    💬 0    📌 0
EU Inc — Sign the petition to create a pan-european startup entity Let’s unite the European startup ecosystems by creating a pan-european legal entity. We got 6 weeks. Sign now and help promoting!

With the US doing US things, Europe should make sure we give startups all the tools necessary to be able to compete with their US counterparts. For large companies the EU already acts as a single market, but for startups it feels like 27 different markets. www.eu-inc.org/petition

15.11.2024 09:15 — 👍 2    🔁 0    💬 0    📌 0
Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation Efficiently evaluating the performance of text-to-image models is difficult as it inherently requires subjective judgment and human preference, making it hard to compare different models and quantify ...

My startup asked 2M real humans which text-2-image model is better. Check out our findings:
arxiv.org/abs/2409.11904

13.11.2024 10:44 — 👍 0    🔁 0    💬 0    📌 0

How do we move the whole tech community from Twitter to here 🤔

13.11.2024 10:39 — 👍 4    🔁 0    💬 0    📌 0