Patterns, a Cell Press journal @cp-patterns

Lessons from complex systems science for AI governance
Complex systems science offers critical insights for designing effective AI governance measures, including early and scalable intervention, dynamic institutional frameworks, and risk thresholds calibrated to trigger timely regulatory responses.

Online Now: Lessons from complex systems science for AI governance #datascience

01.08.2025 15:35 — 👍 2 🔁 0 💬 0 📌 0

BioLLM: A standardized framework for integrating and benchmarking single-cell foundation models
BioLLM is a unified framework that simplifies the use of single-cell foundational models by bridging diverse architectures through standardized APIs and evaluation protocols. It enables seamless integration, model switching, and benchmarking in both zero-shot and fine-tuned settings. Through a comprehensive comparison of leading models, BioLLM reveals key performance differences and practical trade-offs. This work advances the accessibility, usability, and reproducibility of foundational models in single-cell transcriptomics.

Online Now: BioLLM: A standardized framework for integrating and benchmarking single-cell foundation models #datascience

30.07.2025 19:46 — 👍 2 🔁 0 💬 0 📌 0

A one-shot, lossless algorithm for cross-cohort learning in mixed-outcomes analysis
Li et al. present mixWAS, a one-shot, lossless algorithm for cross-cohort analysis of mixed outcomes using summary statistics. Applied to EHR data from seven US cohorts, mixWAS identified over 4,500 SNP-trait associations, 97.7% of which were validated in the UK Biobank. This method improves power and efficiency over traditional approaches and enables scalable, privacy-preserving discovery of multi-phenotype associations across distributed datasets.

Online Now: A one-shot, lossless algorithm for cross-cohort learning in mixed-outcomes analysis #datascience

30.07.2025 15:36 — 👍 1 🔁 0 💬 0 📌 0

A consensus privacy metrics framework for synthetic data
Synthetic data can enable privacy-preserving data sharing across multiple sectors, but residual privacy vulnerabilities must be evaluated. Through a formal consensus process, this study developed an expert consensus framework to evaluate privacy in synthetic data.

Online Now: A consensus privacy metrics framework for synthetic data #datascience

29.07.2025 15:35 — 👍 1 🔁 0 💬 0 📌 0

Tucano: Advancing neural text generation for Portuguese
Recent natural language processing (NLP) advances have favored high-resource languages while leaving many others underrepresented. This imbalance challenges global AI inclusivity. Addressing it requires transparent efforts to develop datasets and models for low-resource languages. The authors present GigaVerbo, a 200-billion-token dataset for Portuguese, and Tucano, a family of natively pretrained large language models trained on GigaVerbo to support Portuguese NLP and promote linguistic diversity in AI.

Online Now: Tucano: Advancing neural text generation for Portuguese #datascience

23.07.2025 15:35 — 👍 2 🔁 0 💬 0 📌 0

Bioconductor: Planning a third decade of comprehensive support for genomic data science
This opinion piece discusses the Bioconductor project for open-source bioinformatics and the engineering concepts underlying its effectiveness to date…

📄 New opinion piece by Vince Carey: Bioconductor: Planning a third decade of comprehensive support for genomic data science

doi.org/10.1016/j.pa...

#Bioconductor #RStats #OpenScience

16.07.2025 08:56 — 👍 12 🔁 5 💬 0 📌 0

A systematic survey of natural language processing for the Greek language
Many languages remain under-represented in AI research, limiting the global reach of language technologies. This study introduces a clear, repeatable method for surveying natural language processing (NLP) work in individual languages. When the method is applied to Greek (2012–2023), the survey reveals key trends, gaps, and resource needs—insights that are essential for fairer and more effective language technology. The approach offers a blueprint for mapping NLP progress in other under-resourced languages, helping to build a more inclusive multilingual AI landscape.

Online Now: A systematic survey of natural language processing for the Greek language #datascience

21.07.2025 15:35 — 👍 3 🔁 0 💬 0 📌 0

Resource is a flexible format that allows authors to combine technical description with information on the organization, governance, and history of a project. Submissions may describe major new software tools, datasets, or other resources or may provide updates on existing, broadly used resources.

14.07.2025 12:14 — 👍 0 🔁 0 💬 0 📌 0

This issue also serves as the launch of our new resource article type.

Explore our resource papers: www.cell.com/patterns/col...
Formatting guidelines: www.cell.com/patterns/inf...

14.07.2025 12:10 — 👍 0 🔁 0 💬 1 📌 0

Our cover image this month is an artistic rendering of a community garden, seen from above. Each plot in the garden may serve a specific family or group, and the gardeners themselves will have widely varying levels of experience, but across the garden, experience and tools are shared, creating a fertile, cooperative ecosystem that supports a broad community. Image credit: Philip Krzeminski

Our July issue is now live!
www.cell.com/patterns/iss...

Several papers in this issue highlight #opensource software and the crucial role it plays in #datascience. Check out our related editorial: www.cell.com/patterns/ful...

14.07.2025 12:04 — 👍 5 🔁 1 💬 1 📌 1

Very proud of our paper "OpenML: Insights from 10 years and more than a thousand papers", which describes the current state of OpenML and also analyzes the impact OpenML has had over the last years (yes, we manually looked at every paper citing OpenML). #openml #automl #openscience 1/4

07.07.2025 12:15 — 👍 2 🔁 3 💬 1 📌 0

ASReview LAB v.2: Open-source text screening with multiple agents and a crowd of experts
ASReview LAB v.2 enables efficient, transparent AI-assisted systematic reviews by combining multiple machine learning models with input from expert crowds. Users can switch between fast, general-purpose, and domain-specific models within the same project. Leveraging the SYNERGY benchmark, version 2 improves performance by 24.1%. The open-source platform paves the way for scalable, collaborative knowledge curation and demonstrates how AI can support expert judgment in high-stakes decision-making across science, medicine, and policy.

Online Now: ASReview LAB v.2: Open-source text screening with multiple agents and a crowd of experts #datascience

03.07.2025 19:46 — 👍 2 🔁 0 💬 0 📌 0

OpenML: Insights from 10 years and more than a thousand papers
The authors reflect on 10 years of OpenML, the open-source platform that turns machine-learning experiments into open, linked, and reusable knowledge. They present the current state of the OpenML ecosystem and its design and assess how community-curated datasets, tasks, and benchmark suites have powered more than 1,500 studies, advanced machine-learning research, and improved reproducibility. Lastly, they share lessons from a decade of building, maintaining, and expanding OpenML to guide the future of open-science infrastructure for machine learning.

Online Now: OpenML: Insights from 10 years and more than a thousand papers #datascience

03.07.2025 15:36 — 👍 7 🔁 4 💬 1 📌 0

A novel quantum algorithm for efficient attractor search in gene regulatory networks
Gene regulatory networks (GRNs) are a fundamental framework for studying cellular behavior. They can be modeled as Boolean dynamical networks, but their complete characterization is often computationally intractable. This work presents a quantum algorithm that efficiently identifies all attractors of such networks using iterative quantum amplitude suppression, achieving performance improvements over classical approaches. The method is validated on quantum simulators and demonstrates robustness to noise, making it a promising candidate for current and near-term quantum computing applications in biology.

Online Now: A novel quantum algorithm for efficient attractor search in gene regulatory networks #datascience

02.07.2025 23:03 — 👍 3 🔁 0 💬 0 📌 0

Urban sustainability and data science collection: Cell Press
A collection from Patterns and Cell Reports Sustainability that explores the intersection of data science and urban sustainability research.

New collection drops! What happens when data science meets urban sustainability research? Read our full collection and find out. A joint effort between @cp-patterns.bsky.social & @cp-cellrepsustain.bsky.social

www.cell.com/cp/collectio...

27.06.2025 11:11 — 👍 0 🔁 1 💬 0 📌 0

Data-driven discovery of medication effects on blood glucose from electronic health records
Blood glucose levels in hospitalized patients are shaped by many medications not typically linked to glycemic control. Using feature selection and causal inference on EHR data from nearly 100,000 hospitalizations, Momenzadeh et al. identified 55 variables significantly associated with blood glucose changes. Their findings, validated on a recent cohort, uncover both established and previously unrecognized medication effects, offering new insights to guide safer inpatient diabetes management.

Online Now: Data-driven discovery of medication effects on blood glucose from electronic health records #datascience

26.06.2025 19:46 — 👍 3 🔁 0 💬 0 📌 0

UltraLight VM-UNet: Parallel Vision Mamba significantly reduces parameters for skin lesion segmentation
In the field of computer-aided medical diagnosis, medical image segmentation techniques (e.g., skin lesion segmentation) have been a key research hotspot. In this paper, the authors propose the UltraLight VM-UNet model, which operates efficiently in resource-constrained environments while guaranteeing high segmentation performance through an innovative PVM Layer. These results have implications for enhancing the diagnostic capabilities of mobile medical devices and promoting the widespread application of medical image segmentation technology.

Online Now: UltraLight VM-UNet: Parallel Vision Mamba significantly reduces parameters for skin lesion segmentation #datascience

26.06.2025 15:36 — 👍 1 🔁 0 💬 0 📌 0

cytoGPNet: Enhancing clinical outcome prediction accuracy using longitudinal cytometry data in small cohort studies
cytoGPNet is a deep learning framework that uses Gaussian processes to predict individual-level outcomes from high-dimensional, longitudinal cytometry data. It tackles key challenges in immune profiling, including small sample sizes, variable cell counts, and the need for interpretability. When applied to diverse datasets, it outperforms existing methods and reveals biological insights. This framework advances single-cell data analysis and supports biomarker discovery for precision medicine.

Online Now: cytoGPNet: Enhancing clinical outcome prediction accuracy using longitudinal cytometry data in small cohort studies #datascience

25.06.2025 19:46 — 👍 5 🔁 0 💬 0 📌 0

A view of the sustainable computing landscape
The authors present a holistic research agenda for sustainable computing, addressing the growing carbon footprint of information and communications technology. The agenda is an interdisciplinary perspective that integrates ideas in software design, hardware architecture, renewable energy infrastructure, power and carbon accounting, and economic policy. Collectively, these design and management strategies will reduce the carbon emissions associated with the manufacture and operation of future computer systems, a pressing objective given computing’s rapid growth in emerging applications such as artificial intelligence.

Online Now: A view of the sustainable computing landscape #datascience

25.06.2025 15:35 — 👍 2 🔁 0 💬 0 📌 0

A decade of gender bias in machine translation
Despite over a decade of research, gender bias in machine translation remains a complex challenge with no simple fix. A review of 133 studies highlights both promising trends and persistent gaps, such...

A new study uncovers years of gender bias in machine translation. If languages evolve, shouldn't our algorithms? 👀 Check out the study here 👉 www.cell.com/patterns/ful... #AIandLanguage #GenderBias #ResponsibleAI

25.06.2025 07:41 — 👍 0 🔁 0 💬 0 📌 0

Building inclusivity in science: Cell Press
We are proud to present this collection of voices, perspectives, and commentaries focused on building a more inclusive scientific community.

Equity in science isn't optional; it's essential! Explore the @cellpress.bsky.social Inclusivity in Science collection, featuring research and perspectives that push for a more diverse and inclusive scientific community 🔬🔎🌍 https://www.cell.com/cp/collections-inclusivity

25.06.2025 07:35 — 👍 2 🔁 2 💬 0 📌 0

Our EiC will be speaking tomorrow in Görlitz at the #CASUS @hzdr.bsky.social about Patterns, #openscience and how to ethically use AI tools in paper preparation.

24.06.2025 11:04 — 👍 0 🔁 0 💬 0 📌 0

S2S: A deep learning method for the radiological diagnosis of fine-grained diseases spanning screening to subtyping
Diagnosing complex diseases like cancers demands nuanced analysis of medical images. This study presents S2S-Med, an AI system that streamlines radiological workflows (from detecting abnormalities to classifying cancer subtypes) using CT scans. Outperforming existing tools, S2S-Med improves diagnostic accuracy while aiding physicians, who achieve better outcomes when collaborating with AI. By integrating multi-stage clinical reasoning into AI, this approach enhances trust and precision in personalized diagnostics, offering a scalable solution for challenging diseases where subtle differences dictate treatment.

Online Now: S2S: A deep learning method for the radiological diagnosis of fine-grained diseases spanning screening to subtyping #datascience

17.06.2025 15:35 — 👍 2 🔁 0 💬 0 📌 0

Dopamine encodes deep network teaching signals for individual learning trajectories
Longitudinal tracking of long-term learning behavior and striatal dopamine reveals that dopamine teaching signals shape individually diverse yet systematic learning trajectories, captured mathematical...

Does the brain learn by gradient descent?

It's a pleasure to share our paper at @cp-cell.bsky.social, showing how mice learning over long timescales display key hallmarks of gradient descent (GD).

The culmination of my PhD supervised by @laklab.bsky.social, @saxelab.bsky.social and Rafal Bogacz!

15.06.2025 09:33 — 👍 68 🔁 17 💬 3 📌 1

We've done it 🥳

Finally, there's a canonical paper on the #QGIS project, its history, workings, and challenges: https://www.sciencedirect.com/science/article/pii/S2666389925001138

Thanks to @timlinux and @mbernasocchi for joining me in trying to tell the QGIS story 💚

#OSGeo #GISChat #GIScience

21.05.2025 18:17 — 👍 43 🔁 43 💬 4 📌 2

Coordination of network heterogeneity and individual preferences promotes collective fairness
The authors investigate how network structure and preference diversity jointly influence collective fairness using behavioral experiments, an evolutionary game model, and agent-based simulations. They find that while network heterogeneity can promote fairness, this outcome depends on coordinated interactions between individuals with diverse prosocial tendencies and the underlying network topology. In particular, fairness-driven individuals in central network positions can trigger collective fairness, highlighting that network reciprocity emerges not automatically, but through alignment between structural connectivity and behavioral diversity.

Online Now: Coordination of network heterogeneity and individual preferences promotes collective fairness #datascience

16.06.2025 15:35 — 👍 2 🔁 0 💬 0 📌 0

Highlighting the achievements and impact of women in data science
Women have been instrumental in shaping data science from its earliest days. This opinion highlights both the achievements and the ongoing challenges faced by women in the field, emphasizing that a wi...

Also in this month's issue:

Did you know women have been shaping data science since BEFORE it had a name? Check out this opinion about the often overlooked history of women in data science and why it matters

www.cell.com/patterns/ful...

#WomenInSTEM

13.06.2025 15:15 — 👍 1 🔁 1 💬 0 📌 0

On the cover: “Sur Incise / On Intersection”: Seeing through layers
“Sur Incise” draws us in with fractured surfaces—blocks of ochre, navy, yellow, and green, colliding with purpose. Its title, referencing Boulez's composition and the act of cutting, hints at fragmentation and ambiguity. At the painting's center lies a human figure—nude, abstracted, shifting between identities. Visible in color but elusive in form, it invites questions: what remains unseen when language fails us? This visual tension mirrors findings from Savoldi et al.'s study on gender bias in machine translation, where systems default to binaries and overlook intersectionality. Just as the painting resists clear categorization, translation models struggle with identities that do not conform to fixed labels—flattening complexity and erasing nuance. “Sur Incise” offers no singular reading. Gender, like color, is layered, mutable. The figure is not absent—only obscured. To perceive it, we must move beyond rigid categories and accept that some meanings are felt rather than defined.
Image credit: Picture of the painting “Sur Incise / On Intersection” 2025. Acrylic paint on a linen canvas mounted on an aluminum frame. 120 cm × 120 cm × 1.5 cm. © Dirk Vanmassenhove.

Our June issue is now live! www.cell.com/patterns/iss...

On the cover is an artwork from Dirk Vanmassenhove that the artist feels evokes findings in a study on gender bias also published in this issue. "Gender, like color, is layered, mutable," he writes.

Related study
www.cell.com/patterns/ful...

13.06.2025 15:10 — 👍 4 🔁 2 💬 0 📌 0

Genetic variation in the National Institutes of Health All of Us research program and the overlap between self-identified US race and ethnicity.

US self-reported race and ethnicity are poor proxies of genetic ancestry, researchers say. http://dlvr.it/TLBN1M

Learn more in @ajhgnews.bsky.social

05.06.2025 15:01 — 👍 7 🔁 2 💬 0 📌 0

Visualization of associative exploration of temporal concepts via frequent patterns
This paper introduces PanTeraV, a visual interface for exploring temporal patterns, both continuous and discrete, independent of the mining algorithm. By enabling bidirectional exploration of patterns with an efficient indexing approach, PanTeraV significantly advances the discovery of complex temporal-interval-related patterns (TIRPs). A usability study highlights that PanTeraV allows faster, more accurate pattern discovery compared to existing tools, offering a powerful solution that does not require prior data mining expertise, broadening accessibility for diverse users.

Online Now: Visualization of associative exploration of temporal concepts via frequent patterns #datascience

11.06.2025 19:47 — 👍 1 🔁 0 💬 0 📌 0

Latest posts by cp-patterns.bsky.social on Bluesky