
Amélie Viallet

@ameeelie.bsky.social

Working at HF, I love simple things and making them even simpler. I create both digital and physical products. I co-created Argilla, an Open-Source app for all who care about doing AI projects responsibly by caring about their data.

578 Followers  |  116 Following  |  24 Posts  |  Joined: 18.11.2024

Latest posts by ameeelie.bsky.social on Bluesky

Sheets - a Hugging Face Space by aisheets Create a dataset by describing what you need in a simple text input or by choosing from existing examples. The application will help you design your dataset based on your description.

huggingface.co/spaces/aishe...

10.06.2025 14:41 — 👍 0    🔁 0    💬 0    📌 0

Introducing Hugging Face Sheets, where we explore how to create more accurate and reliable structured data with AI and web sources.

10.06.2025 14:39 — 👍 1    🔁 0    💬 1    📌 0
Cultivate robust synthetic data and reduce model hallucinations

Exciting news! I’m designing an open-source app that helps AI builders create high-quality datasets in minutes—whether they start with data or not.

🍆 Watch your dataset grow consistently column by column
🪁 Adjust data generation anytime
🦄 Stay in control while automation does the heavy lifting

17.03.2025 14:25 — 👍 2    🔁 0    💬 0    📌 0
Art and AI: is cohabitation possible?

shs.cairn.info/dossiers-202...

14.03.2025 13:48 — 👍 0    🔁 0    💬 0    📌 0

This sentence is not mine. I read it in an article by Jean-Lou Fourquet, in his introduction to the dossier "Art and AI: is cohabitation possible?"

14.03.2025 13:46 — 👍 0    🔁 0    💬 0    📌 0

It calms the frenzy around the idea that AI could take over every subject and every field with relevance. It sheds light on thinking about the link between art and the automation of certain “gestures”.

lnkd.in/dPgVApDA

14.03.2025 13:46 — 👍 0    🔁 0    💬 2    📌 0

There is a way to make better use of Generative AI.

As a designer working on a synthetic data generator, I attach great importance to working on fundamental “details” that help users consciously use the technology.

Action prevention, flexibility, efficiency, and information transparency.

20.02.2025 13:06 — 👍 0    🔁 0    💬 0    📌 0

It echoes a mention I read a few days ago in a post from Dan Shipper summarizing an interview with the CEO of Vercel.

“AI tools are shifting software toward consumption-based billing models, making us capital allocators who decide how much compute the AI consumes.”

20.02.2025 13:05 — 👍 1    🔁 0    💬 0    📌 0

From a citizen's perspective, this list of prompts (read each of them and think about it) illustrates well how companies are inciting people to consume AI in an unsustainable way, making it their faithful companion 🐶 all day long.

20.02.2025 13:04 — 👍 0    🔁 0    💬 0    📌 0

From a citizen's perspective, this list of prompts (read each of them and think about it) illustrates well how companies are inciting people to consume AI in an unsustainable way, making it their faithful companion 🐶 all day long.

20.02.2025 13:03 — 👍 0    🔁 0    💬 0    📌 0

From a UX perspective, I like this design principle of guiding users to achieve their tasks. In this case, the continuous flow of prompt examples does the job.

20.02.2025 13:02 — 👍 0    🔁 0    💬 3    📌 0

About the latest ChatGPT home page:

“Be more assisted” would be a more accurate title, in my opinion.

20.02.2025 13:01 — 👍 1    🔁 0    💬 2    📌 0

Of course L'Express was obliged to use a photo of Sam Altman to best illustrate a conversation with me.
Of course neither a) taking a new picture of me during the interview, nor b) asking me to provide a picture of myself was possible.
Of course.

11.02.2025 16:18 — 👍 35    🔁 5    💬 2    📌 1
Image showing an overview of languages in the FineWeb-c Dataset.

🌍 Big step for multilingual AI data!

The @hf.co community has rated educational content in languages spoken by 1.6 billion people! New additions:
• Japanese
• Italian
• Old High German

These ratings can help enhance training data for major world languages.

27.01.2025 12:30 — 👍 27    🔁 3    💬 1    📌 1
Progress bars showing remaining annotations needed for 15 languages in FineWeb-C dataset, ranging from 6 to 593 annotations needed

The finish line is near! We're building FineWeb-Edu for many languages and need your help 🤗

Many FineWeb-C languages are close to 1,000 annotations!

Assamese is 99.4% done, French needs 64 more annotations, Tamil: 216.

Please help us reach the goal: huggingface.co/spaces/data-...

06.01.2025 14:32 — 👍 20    🔁 5    💬 1    📌 1

Imagine creating custom datasets and training AI models WITHOUT writing a single line of code. We did and made it a reality.

@hf.co Synthetic Data Generator

Blog: huggingface.co/blog
Space: huggingface.co/spaces/argil...
GitHub: github.com/argilla-io/s...

16.12.2024 15:37 — 👍 22    🔁 6    💬 0    📌 0
fra - français - French Join and contribute to the dataset fra - français - French

I've just contributed 20 examples to FineWeb 2 in French! Join me; we are already a couple of annotators there!

data-is-better-together-fineweb-c.hf.space/share-your-p...

10.12.2024 15:58 — 👍 4    🔁 0    💬 0    📌 0

In a couple of minutes, we’ll officially launch the FineWeb 2 Annotation Sprint.

🎶 Go at your own rhythm, and do what you can.
🤏 There is no minimum.
👐 Every contribution is welcome.

The more of us there are, the better the result will be.

10.12.2024 11:51 — 👍 2    🔁 2    💬 0    📌 0

Vanakkam makkalae, glad that I’ll be leading the FineWeb 2 collaborative annotation sprint for Tamil! 🤗

I’ll be helping to build an open dataset to improve language models for our language. Do join the process of improving models!

huggingface.co/spaces/Huggi...

huggingface.co/spaces/data-...

10.12.2024 11:05 — 👍 1    🔁 1    💬 0    📌 0

I am thrilled to see Argilla increasingly used to enable impactful collaborative work around datasets.

Next week, we’ll announce a massive multi-language open annotation sprint to ensure all languages advance equally in AI.

06.12.2024 11:38 — 👍 2    🔁 1    💬 0    📌 0


[UX-UI update]
The latest update on the Argilla homepage provides a clear overview of your annotation projects, enhances project monitoring, and highlights the importance of collaboration in data curation.

29.11.2024 14:06 — 👍 2    🔁 0    💬 0    📌 0

👀 Who said the Argilla tool was only for text? I am proud of my brilliant teammates for setting up this significant initiative 🤗 @benburtenshaw.bsky.social @davidberenstein.bsky.social @danielvanstrien.bsky.social @dvilasuero.hf.co

26.11.2024 13:08 — 👍 2    🔁 1    💬 0    📌 0

Let’s make a generation of amazing open source image generation models from high quality data.

The best image generation models train on human preferences. Unfortunately, many of these datasets are closed. Let’s change that!

🧵 we're building a community dataset and we need help reviewing!

26.11.2024 12:00 — 👍 41    🔁 11    💬 5    📌 2

This is great, one of the responses could be "more effort on open datasets"

26.11.2024 08:41 — 👍 0    🔁 0    💬 1    📌 0

Soon: Another sprint to label a big dataset together, for many languages! Nominate yourself to lead your language community!

26.11.2024 07:21 — 👍 5    🔁 0    💬 0    📌 0
Labelers training AI say they're overworked, underpaid and exploited by big American tech companies Digital workers in Kenya had to sift through horrific online content to train AI, but say they were underpaid, overworked, and got inadequate mental health support. So they're fighting back.

"Naftali was assigned to train AI to recognize and weed out pornography, hate speech and excessive violence, which meant sifting through the worst of the worst content online for hours on end."
So much of AI is based on exploiting workers in precarious conditions 😔
www.cbsnews.com/news/labeler...

25.11.2024 15:16 — 👍 46    🔁 22    💬 3    📌 3

What about building a high-quality dataset together and making the result available to the whole Open-Source community?

You are all welcome with or without code skills, with or without a background in AI.

It's about democratizing access to AI projects and making how they work transparent.

25.11.2024 15:32 — 👍 4    🔁 1    💬 0    📌 0

⚫️⚪️ If you think transparency is key to building the image generation models of tomorrow, consider contributing to a massive open dataset.

Follow huggingface.co/data-is-bett... not to miss the announcement!

22.11.2024 15:05 — 👍 29    🔁 9    💬 0    📌 2
🧧 AIoAI / Curated Images + Sub-Collections | Are.na This curated collection of images, cut-outs, and scrap materials is here to insp…rce or explore the context of any image, we recommend using Google Image Search.

🏺🎞️ Beautiful collection of graphic materials to create better images with AI. www.are.na/aixdesign/ai...

22.11.2024 10:45 — 👍 2    🔁 1    💬 0    📌 0
