
Alistair Rolleston

@talon03.bsky.social

Gaming, landscape photography, memes, ramblings. Scientist by trade. Irish by the grace of God. Once applied to be an astronaut.

188 Followers  |  413 Following  |  278 Posts  |  Joined: 07.05.2023

Posts by Alistair Rolleston (@talon03.bsky.social)

oh no, please don't let the Cork ones know

22.02.2026 17:04 — 👍 2    🔁 1    💬 1    📌 1

It's fun to rag on it but in all honesty I think the only other proper city on the island that rivals it is Galway.

22.02.2026 14:23 — 👍 4    🔁 0    💬 1    📌 0

Quadruple-lock pension

17.02.2026 13:42 — 👍 0    🔁 0    💬 0    📌 0

Be right over

17.02.2026 12:59 — 👍 1    🔁 0    💬 1    📌 0

Watched the latest episode of The Pitt last night. Katherine LaNasa might actually win back-to-back Emmys.

14.02.2026 12:53 — 👍 0    🔁 0    💬 1    📌 0

I remember years ago my old lady English teacher telling my class that the point at which a girl became a woman is when she realises Aragorn is hotter than Legolas

11.02.2026 20:00 — 👍 0    🔁 0    💬 0    📌 0
ALT: a man with a beard wearing glasses and a watch
11.02.2026 18:49 — 👍 1    🔁 0    💬 1    📌 0
LLMs generated several types of misleading and incorrect information. In two cases, LLMs provided initially correct responses but added new and incorrect responses after the users added additional details. In two other cases, LLMs did not provide a broad response but narrowly expanded on a single term within the user’s message (‘pre-eclampsia’ and ‘Saudi Arabia’) that was not central to the scenario. LLMs also made errors in contextual understanding by, for example, recommending calling a partial US phone number and, in the same interaction, recommending calling ‘Triple Zero’, the Australian emergency number. Comparing across scenarios, we also noticed inconsistency in how LLMs responded to semantically similar inputs. In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice (Extended Data Table 2). One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care. Despite all these issues, we also observed successful interactions where the user redirected the conversation away from mistakes, indicating that non-expert users could effectively manage LLM errors in certain cases (Extended Data Table 3).

When chatbots are given complete information on medical conditions, they typically spit out correct diagnoses and recommendations.

Actual patients, however, often describe their conditions with incomplete or irrelevant information and the chatbots cannot handle it.
www.nature.com/articles/s41...

11.02.2026 14:16 — 👍 724    🔁 133    💬 27    📌 23

Yes I believe it's called ice hockey, or just hockey in Canada

07.02.2026 20:21 — 👍 1    🔁 0    💬 0    📌 0

Just been going around the house today randomly shouting "Are there no true knights among you?!" with a thick accent

07.02.2026 18:02 — 👍 0    🔁 0    💬 0    📌 0

I know I've been harping on about The Pitt for so long but the new episode of A Knight of the Seven Kingdoms might be the best thing I've seen on TV this year. Insane build up, tension, pay off and "what do you mean that's the end of the episode you cretins"

06.02.2026 19:07 — 👍 1    🔁 0    💬 1    📌 0

What circle of Hell do we think breakout rooms are in

06.02.2026 10:35 — 👍 0    🔁 0    💬 0    📌 0

Nightmare blunt rotation

26.01.2026 11:28 — 👍 1    🔁 0    💬 0    📌 0
7 U.S. Code Β§ 13-1 - Violations, prohibition against dealings in motion picture box office receipts or onion futures; punishment

TIL that you can bet on absolutely everything in the United States except for onion futures and weekly box office receipts.

19.01.2026 19:53 — 👍 61    🔁 12    💬 7    📌 2

If they don't reply with a classic "I thought you should know some idiot has been signing your name on stupid letters" they've missed a golden opportunity

19.01.2026 11:25 — 👍 0    🔁 0    💬 0    📌 0

have you ever brewed your own beer, that's what I keep looking into atm

14.01.2026 18:18 — 👍 1    🔁 0    💬 1    📌 0

Listen, you're great, but maybe we could start with Liechtenstein?

12.01.2026 22:08 — 👍 1    🔁 0    💬 1    📌 0
Post image

Call me an old traditionalist, but I think people who in the year of our Lord 2026 cannot figure out how to fill out a form online should not be allowed near a gun

10.01.2026 18:23 — 👍 0    🔁 0    💬 0    📌 0
A still from the British comedy "The Thick of It"

"... Wait, what does he mean pay wall?!"
"oh great, so we've gone from making the pictures to PROFITING off making the pictures, that's way better"

09.01.2026 16:25 — 👍 1    🔁 0    💬 0    📌 0
Post image

So his response to the criticism of his app generating sexual deep fakes of children is... to try to monetise it as a feature

09.01.2026 10:28 — 👍 693    🔁 215    💬 40    📌 26

... Uh, what's the gender neutral term for "binman"? "Binperson" feels... wrong

29.12.2025 10:27 — 👍 0    🔁 0    💬 0    📌 0

A week and a half

29.12.2025 09:25 — 👍 0    🔁 0    💬 0    📌 0

Shout out to my neighbours who have with full confidence left their bins out to be collected today

25.12.2025 10:16 — 👍 0    🔁 0    💬 0    📌 0

This year on this somewhat stressful of days I want people to remember that apology does not render action devoid of consequence. Merry Christmas!

25.12.2025 10:08 — 👍 0    🔁 0    💬 0    📌 0
ALT: a close up of a man wearing glasses and a tie
24.12.2025 19:36 — 👍 1    🔁 0    💬 0    📌 0

Did you show them your badge?

24.12.2025 19:26 — 👍 1    🔁 0    💬 1    📌 0

Only 364 days til Christmas Eve Eve

24.12.2025 11:19 — 👍 1    🔁 0    💬 0    📌 0

If I've learned one thing from @squidgerugby.bsky.social 's 288 greatest rugby moments of the year it's that I need to start watching JRLO

23.12.2025 21:23 — 👍 0    🔁 0    💬 0    📌 0
A tweet from Gerry Adams reading "This house is like Santa's grotto. Takes half an hour 2 switch off fairy lights a d assorted Yule illuminations. Feel like a grinch now." A reply states "Surely you know someone who can fit a timer"

Happy Gerry Adams Yule tweet day to all who celebrate

22.12.2025 16:31 — 👍 1    🔁 0    💬 0    📌 0

Wanna open a D&D themed bar-and-B&B called "The Long Rest" just need some capital who's with me

20.12.2025 20:11 — 👍 17    🔁 4    💬 0    📌 0