Patrice Bechard @patricebechard

StarFlow: Generating Structured Workflow Outputs From Sketch Images

Workflows are a fundamental component of automation in enterprise platforms, enabling the orchestration of tasks, data processing, and system integrations. Despite being widely used, building workflow...

From notebook to workflow—just by sketching.
That’s the vision.

🔗 arxiv.org/abs/2503.21889
📝 tinyurl.com/3utdbn97

Thanks to @joanrod.bsky.social, @perouz.bsky.social, @spandanagella.bsky.social and all co-authors!
#AI #VLM #WorkflowAutomation #Sketch2Flow #arXiv

29.05.2025 03:34 — 👍 0 🔁 0 💬 0 📌 0

🔍 Extra findings:

• Models struggle most with handwritten & whiteboard sketches
• UI screenshots are easiest
• End-to-end generation beats decomposed pipelines
• Finetuning on diverse sketch data is key to generalization

29.05.2025 03:34 — 👍 1 🔁 0 💬 1 📌 0

📊 We benchmarked top VLMs (GPT-4o, Claude, Gemini) vs. open-weight models (Qwen, LLaMA, Pixtral).

📈 Finetuned open models outperform proprietary ones:

Qwen2.5-VL-7B → FlowSim: 0.614
GPT-4o → FlowSim: 0.786
𝐐𝐰𝐞𝐧𝟐.𝟓-𝐕𝐋-𝟕𝐁 (𝐟𝐢𝐧𝐞𝐭𝐮𝐧𝐞𝐝) → 𝐅𝐥𝐨𝐰𝐒𝐢𝐦: 𝟎.𝟗𝟓𝟕

29.05.2025 03:34 — 👍 0 🔁 0 💬 1 📌 0

🧠 We built a large dataset (22K+ samples) of workflow diagrams:

• Synthetic (Graphviz)
• Manual (hand-drawn)
• Whiteboard
• Digital
• UI screenshots

These were paired with structured JSON workflow outputs for training and evaluation.

29.05.2025 03:34 — 👍 0 🔁 0 💬 1 📌 0

𝐖𝐡𝐲?

Workflow automation is powerful—but authoring flows is still complex, even with low-code tools.
💫𝐒𝐭𝐚𝐫𝐅𝐥𝐨𝐰 explores a simpler interface: 𝐣𝐮𝐬𝐭 𝐝𝐫𝐚𝐰 𝐢𝐭.

Imagine sketching a workflow on a whiteboard and getting a runnable flow in return.

29.05.2025 03:34 — 👍 0 🔁 0 💬 1 📌 0

🚀 New paper from our team at @servicenowresearch.bsky.social!

💫𝐒𝐭𝐚𝐫𝐅𝐥𝐨𝐰: 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐒𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞𝐝 𝐖𝐨𝐫𝐤𝐟𝐥𝐨𝐰 𝐎𝐮𝐭𝐩𝐮𝐭𝐬 𝐅𝐫𝐨𝐦 𝐒𝐤𝐞𝐭𝐜𝐡 𝐈𝐦𝐚𝐠𝐞𝐬
We use VLMs to turn 𝘩𝘢𝘯𝘥-𝘥𝘳𝘢𝘸𝘯 𝘴𝘬𝘦𝘵𝘤𝘩𝘦𝘴 and diagrams into executable workflows 🖍️→⚙️

🔗 arxiv.org/abs/2503.218...
📝 tinyurl.com/3utdbn97%E2%...
#Sketch2Flow #AI #VLM

29.05.2025 03:34 — 👍 0 🔁 1 💬 1 📌 0

Multi-task retriever fine-tuning for domain-specific and efficient RAG

Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying Large Language Models (LLMs), as it can address typical limitations such as generating hallucinated or outdated information. H...

🔍 Want to learn more? Our paper covers how to:

* Build balanced training datasets for real-world tasks
* Handle data imbalance
* Design for at-scale deployment

arxiv.org/abs/2501.04652

09.01.2025 15:46 — 👍 0 🔁 0 💬 0 📌 0

🌟 Key Features:

* One retriever for many use cases
* Works across languages! 🌍
* Handles structured data like workflows
* Lightweight & fast for production
* Generalizes to new domains & tasks

09.01.2025 15:46 — 👍 0 🔁 0 💬 1 📌 0

📊 Our Results:

Multi-task instruction fine-tuning FTW! Our approach beats both BM25 and strong off-the-shelf encoder models across all retrieval tasks (in-distribution and out-of-distribution).

09.01.2025 15:46 — 👍 0 🔁 0 💬 1 📌 0

💡 The Challenge:

* RAG needs domain-specific knowledge
* Multiple apps = multiple retrievers = 💰
* Different types of data (steps, tables, fields, ...)

09.01.2025 15:46 — 👍 0 🔁 0 💬 1 📌 0

🚀 Excited to share our new work on making RAG actually work for enterprise applications!
We present a recipe to build a custom retriever that handles multiple retrieval tasks simultaneously for domain-specific RAG applications 🧵

09.01.2025 15:46 — 👍 1 🔁 0 💬 1 📌 0

We’re really excited to release this large collaborative work for unifying web agent benchmarks under the same roof.

In this TMLR paper, we take an in-depth look at #BrowserGym and #AgentLab. We also present some unexpected results from Claude 3.5-Sonnet

12.12.2024 17:55 — 👍 20 🔁 11 💬 1 📌 2

🎉 Excited to introduce BigDocs!
An open, transparent multimodal dataset designed for:
📄 Documents
🌐 Web content
🖥️ GUI understanding
👨‍💻 Code generation from images
We’re also launching BigDocs-Bench:
➡️ Document, Web, GUI Visual reasoning
➡️ Converting images into JSON, Markdown, LaTeX, SVG, and more!

10.12.2024 18:34 — 👍 16 🔁 8 💬 1 📌 2

Generating a Low-code Complete Workflow via Task Decomposition and RAG

AI technologies are moving rapidly from research to production. With the popularity of Foundation Models (FMs) that generate text, images, and video, AI-based systems are increasing their complexity. ...

Ready to learn more? Check out our full paper here: arxiv.org/abs/2412.00239

If this sounds exciting, follow us! We’ve got more papers and insights on the way—don’t miss out! 🚀

03.12.2024 15:15 — 👍 0 🔁 0 💬 0 📌 0

Finally, we outline trade-offs and practical considerations, from latency improvements to deployment strategies. If you’re designing GenAI systems, this is a goldmine of insights!

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

Evaluation was key: we developed a novel tree-based metric, Flow Similarity, to assess workflow correctness. Plus, we measured each sub-task and RAG component separately for fine-grained insights.
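
To give a feel for what a tree-based workflow metric can look like, here is a rough sketch. It is not the paper's Flow Similarity definition: it assumes a workflow is an ordered tree of `(label, children)` tuples, uses a plain unit-cost tree edit distance, and normalizes by the combined node count. The step names are made up for illustration.

```python
from functools import lru_cache

# Assumption: a workflow node is (label, (child, child, ...)), nested tuples only.

def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every remaining node of g
    if not g:
        return size(f)  # delete every remaining node of f
    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Match the two rightmost roots, relabeling if their labels differ.
    match = (forest_distance(f[:-1], g[:-1])
             + forest_distance(v_children, w_children)
             + (v_label != w_label))
    return min(delete, insert, match)

def tree_similarity(a, b):
    """1.0 for identical trees, lower as more edits are needed (assumed normalization)."""
    return 1 - forest_distance((a,), (b,)) / (size((a,)) + size((b,)))

# Hypothetical expected vs. predicted workflows.
expected = ("trigger", (("look_up_record", ()), ("if", (("send_email", ()),))))
predicted = ("trigger", (("look_up_record", ()), ("send_email", ())))
score = tree_similarity(expected, predicted)
```

Here the predicted flow only misses the `if` node, so one deletion separates the two trees and the score is 1 - 1/7.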

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

We dive deep into dataset creation, discussing how Task Decomposition guided our labeling efforts. By focusing on smaller tasks, we sped up labeling, reduced costs, and iteratively improved our system.

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

RAG enhances the system by grounding the generation process in real-time data from the environment. This reduces hallucinations and ensures that the generated workflows are accurate and context-aware.
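
A toy illustration of the grounding idea, assuming the "environment" is just a catalog of available step names and that retrieval is simple word overlap. The paper uses a trained retriever; the catalog, function names, and prompt wording below are all made up.

```python
def tokenize(text):
    """Lowercase word set; underscores in step names count as spaces."""
    return set(text.lower().replace("_", " ").split())

def retrieve(query, catalog, k=2):
    """Return the k catalog entries sharing the most words with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(catalog,
                    key=lambda name: len(query_tokens & tokenize(name)),
                    reverse=True)
    return ranked[:k]

def build_prompt(requirement, catalog):
    """Restrict the generator to steps that actually exist in the environment."""
    allowed = retrieve(requirement, catalog)
    return f"Requirement: {requirement}\nUse only these steps: {', '.join(allowed)}"

# Hypothetical catalog of steps installed on a platform.
catalog = ["send_email", "create_record", "look_up_record", "post_slack_message"]
prompt = build_prompt("send an email when a record is created", catalog)
```

Because the prompt only lists steps that were retrieved from the catalog, the generator has less room to invent step names that do not exist.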

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

Task Decomposition allows us to split the workflow generation into two sub-tasks:

1. Outlining the workflow structure
2. Populating inputs for each step

Each sub-task is easier to solve and test, boosting the system’s modularity and maintainability.
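
The two sub-tasks can be sketched as two small functions that are testable on their own. This is only an illustration: the `generate` callable stands in for any LLM call, and the prompts, the `Step` type, and the `key=value` output format are assumptions, not the system described in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    name: str
    inputs: dict[str, str] = field(default_factory=dict)

def outline_workflow(requirement: str, generate: Callable[[str], str]) -> list[Step]:
    """Sub-task 1: get the step names only, one per line."""
    raw = generate(f"List the workflow steps for: {requirement}")
    return [Step(name=line.strip()) for line in raw.splitlines() if line.strip()]

def populate_inputs(step: Step, requirement: str, generate: Callable[[str], str]) -> Step:
    """Sub-task 2: fill in the inputs of one step from key=value lines."""
    raw = generate(f"Inputs for step '{step.name}' given: {requirement}")
    for line in raw.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            step.inputs[key.strip()] = value.strip()
    return step

def generate_workflow(requirement: str, generate: Callable[[str], str]) -> list[Step]:
    """Run the outline first, then populate each step independently."""
    steps = outline_workflow(requirement, generate)
    return [populate_inputs(step, requirement, generate) for step in steps]

def fake_model(prompt: str) -> str:
    """Canned stand-in for an LLM so the sketch runs offline."""
    if prompt.startswith("List"):
        return "look_up_record\nsend_email"
    if "send_email" in prompt:
        return "to=manager\nsubject=Incident update"
    return "table=incident"

workflow = generate_workflow("When an incident is updated, email the manager", fake_model)
```

Swapping `fake_model` for a real model call leaves the two sub-task functions unchanged, which is the modularity the post refers to.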

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

We tackle a real-world use case: Workflow Generation. Given a user requirement in natural language, our system generates complex workflows step by step. This involves breaking the problem into smaller, manageable tasks.

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

Looking to build an LLM-powered app but finding it hard to make it robust? We’ve got you covered! Our new paper explores how Task Decomposition and Retrieval-Augmented Generation (RAG) can help you create reliable systems. 🧵👇

03.12.2024 15:15 — 👍 0 🔁 0 💬 1 📌 0

Latest posts by patricebechard.bsky.social on Bluesky