EPPI Reviewer @eppi-reviewer

Latest Changes (30/09/2025 - V6.17.0.0) - Forum announcements

#EPPI-Reviewer new (major) release:
Version 6.17.0.0 adds a new system to create prioritised lists of items to screen, as well as other improvements to screening systems.
Also extends the list of supported LLMs, adding Llama and Mistral.

All details: eppi.ioe.ac.uk/cms/Default....

30.09.2025 14:33 — 👍 3 🔁 1 💬 0 📌 0

Latest Changes (27/05/2025 - V6.16.2.0) - Forum announcements

#EPPI-Reviewer new release:
Version 6.16.3.0 adds DeepSeek to the list of LLMs available.

Additionally, it includes significant improvements to how the Coding tools are shown/handled, and a new search type.

All details: eppi.ioe.ac.uk/cms/Default....

03.07.2025 13:39 — 👍 2 🔁 1 💬 0 📌 0

Latest Changes (27/05/2025 - V6.16.2.0) - Forum announcements

#EPPI-Reviewer new release: Version 6.16.2.0 introduces a number of different LLMs, which can be used for robot/auto coding tasks.
Previously, only one model was available.
This release also contains a bug fix and a security fix.

All details: eppi.ioe.ac.uk/cms/Default....

27.05.2025 13:30 — 👍 2 🔁 1 💬 0 📌 0

Latest Changes (28/04/2025 - V6.16.1.0) - Forum announcements

#EPPI-Reviewer new release: Version 6.16.1.0 introduces the "Automation Logs", provides DIY access to LLM-Driven coding functions, supports "Priority Screening Simulations" and more.
All details: eppi.ioe.ac.uk/cms/Default....

28.04.2025 13:12 — 👍 1 🔁 1 💬 0 📌 0

So, if you worry that LLMs (or so-called AI) will inevitably degrade the quality of evidence synthesis work, and thus devalue it, perhaps we can help.
This tech can be used responsibly, with some effort. We at EPPI Reviewer/EPPI Centre are trying to do so, and trying to help others too.
🧵 18/18

13.02.2025 15:26 — 👍 1 🔁 0 💬 0 📌 0

Meanwhile, staff at the @eppicentre.bsky.social, along with collaborators, are busy designing, running, and reporting on "Studies within a review" (SWARs). This is done for 2 purposes:
1. Ensure our results can be trusted.
2. Produce published examples of responsible use & evaluation of LLMs.
🧵 17/n

13.02.2025 15:26 — 👍 1 🔁 0 💬 1 📌 0

Wrapping up:
#EPPI-Reviewer now supports using an LLM (limited to GPT4o, for now) to run screening and data extraction tasks.
The LLM can be fed titles and abstracts or the full text.
All functionalities have been designed to *facilitate* (rather than obfuscate) their per-review evaluation.
🧵 16/n

13.02.2025 15:24 — 👍 1 🔁 0 💬 1 📌 0

[One under-evaluated, or rather, borderline ignored issue is that of error propagation: the more steps are "automated", the more errors will happen at each different stage, and they are likely to compound non-linearly, undermining more and more the trustworthiness of results.]
🧵 15/n

13.02.2025 15:24 — 👍 1 🔁 0 💬 1 📌 0

For example, the machine might make up the answer to the "how many participants?" question when supplied with an RCT protocol, rather than the actual study report.
Effect: every evidence synthesis effort, and every LLM task therein, NEEDS to be evaluated on its own and then in its full context.
🧵 14/n

13.02.2025 15:23 — 👍 1 🔁 0 💬 1 📌 0

LLMs are necessarily extremely sensitive to:
- Every specific prompt,
- The contents supplied,
- And even more, the interaction between the 2.
What we found early on is that most "hallucinations" happen when the question asked implies an assumption that isn't valid for a given “content”.
🧵 13/n

13.02.2025 15:22 — 👍 0 🔁 0 💬 1 📌 0

The guiding principles are conceptually simple, though.
Any new feature we ship needs to be matched by systems for evaluating in full how well it works: effective, accessible to regular users, and not too onerous.
This is paramount, because as a general rule, evaluations DO NOT generalise.
🧵 12/n

13.02.2025 15:21 — 👍 0 🔁 0 💬 1 📌 0

Side effects are that we need to move fast, but move thoughtfully and effectively too.
Thus, from 2024 onwards, we developers have been in "marathon-length, sprint mode".
The main focus is to deliver facilities to leverage LLMs in ways that maximise the trustworthiness of results.
🧵 11/n

13.02.2025 15:20 — 👍 0 🔁 0 💬 1 📌 0

We still challenge one another with such questions, but we are also all aware that we have fully committed to a general strategy.
1. LLMs will be used in Evidence Synthesis - no matter what our positions will be.
2. Thus, what we want to do is to help the field to use LLMs responsibly.
🧵 10/n

13.02.2025 15:19 — 👍 0 🔁 0 💬 1 📌 0

For us working on #EPPI-Reviewer, this single clear "answer" opened up a multitude of new questions about what we *should* do, how to manage risks without curbing our effectiveness, and, more importantly, what kind of "effects" we wanted to have.
🧵 9/n

13.02.2025 15:18 — 👍 0 🔁 0 💬 1 📌 0

The results are here: ceur-ws.org/Vol-3832/pap...
(With many thanks to Lena Schmidt, @kaitlynhair.bsky.social, Fiona Campbell, @clkapp.bsky.social, Alireza Khanteymoori, Dawn Craig and Mark Engelbert.)
Take home message was: yes, this kind of tech will be used in Evidence Synthesis - a lot.
🧵 8/n

13.02.2025 15:17 — 👍 0 🔁 0 💬 1 📌 1

At the Hackathon, we were able to run a very small, but we think "carefully planned" study that we hoped could begin answering our initial question, and concurrently provide an example of how to run such studies responsibly.
🧵 7/n

13.02.2025 15:17 — 👍 0 🔁 0 💬 1 📌 0

At the same time, it was very evident that lots of quick and dirty evaluations were being run and/or published, often flawed, not only because of their small sample size, but more worryingly, undermined by not-very-thoughtful methods and/or an excess of optimism.
🧵 6/n

13.02.2025 15:16 — 👍 1 🔁 0 💬 1 📌 0

Developing collaborations and technology for evidence synthesis An event series to develop open software for evidence synthesis

The only way to find out was to try. So we put together "proof of concept" functionalities, and put them to a preliminary test with the invaluable help of external collaborators, gathered together for the Evidence Synthesis Hackathon (2023 - www.eshackathon.org).
🧵 5/n

13.02.2025 15:15 — 👍 0 🔁 0 💬 1 📌 0

This is important, because it can potentially mitigate or even remove the problem of (so-called) hallucinations, and it applies to many, if not most, steps of evidence synthesis, broadly defined.
Thus, the 1st question was: can we really use LLMs and get reliable results?
🧵 4/n

13.02.2025 15:14 — 👍 0 🔁 0 💬 1 📌 0

Back when GPT was "new", it was very clear to us that Large Language Models (LLMs) could be useful (and thus: disruptive) in the field of evidence synthesis, because in our use-case, one could ask the machine questions, supplied along with the text that should be used to answer them.
🧵 3/n

13.02.2025 15:14 — 👍 0 🔁 0 💬 1 📌 0

Latest Changes (12/02/2025 - V6.16.0.0) - Forum announcements

It took us 15 months or more to move from "proof of concept" to releasing a version of EPPI Reviewer that integrates comprehensive LLM-driven functionalities, available to all users (see: eppi.ioe.ac.uk/cms/Default....).
It's been quite a journey, so perhaps it's worth sharing some thoughts.
🧵 2/n

13.02.2025 15:13 — 👍 0 🔁 0 💬 1 📌 0

Doing evidence synthesis and grappling with the ever-widening range of promises about using new Large Language Models to speed up the process?
Perhaps you’re worried about how such tools may be misused?
Yeah, we (the #EPPI-Reviewer team) feel the same.
A thread on our journey so far.
🧵 1/n

13.02.2025 15:11 — 👍 5 🔁 7 💬 1 📌 3

Latest Changes (12/02/2025 - V6.16.0.0) - Forum announcements

#EPPI-Reviewer new release: Version 6.16.0.0 makes the LLM-driven auto-coding functionalities available (on demand) to all users. It also includes a new Machine Learning facility to retrospectively check Screening decisions.
More details are here: eppi.ioe.ac.uk/cms/Default....

12.02.2025 14:14 — 👍 4 🔁 3 💬 0 📌 0

Calorie (energy) labelling for changing selection and consumption of food or alcohol - Clarke, N - 2025 | Cochrane Library

🔊 Published today, our new Cochrane review on the impact of calorie labelling on people’s selection and consumption of food (including non-alcoholic drinks) and alcohol:
www.cochranelibrary.com/cdsr/doi/10....

Short thread below 👇

17.01.2025 01:03 — 👍 55 🔁 32 💬 7 📌 2

Latest Changes (06/12/2024 - V6.15.5.1) - Forum announcements

#EPPI-Reviewer new release: Version 6.15.5.1 is a "Single Feature" release, which extends the (on invitation, experimental) GPT-coding features, which can now operate against the full-text, instead of being limited to title and abstract.
All details are here: eppi.ioe.ac.uk/cms/Default....

06.12.2024 13:31 — 👍 6 🔁 3 💬 0 📌 0
