
Andrés Corrada

@andrescorrada.bsky.social

Scientist, Inventor, author of the NTQR Python package for AI safety through formal verification of unsupervised evaluations. On a mission to eliminate Majority Voting from AI systems. E Pluribus Unum.

353 Followers  |  595 Following  |  1,327 Posts  |  Joined: 17.11.2024

Latest posts by andrescorrada.bsky.social on Bluesky

Logical Consistency Between Disagreeing Experts and Its Role in AI Safety

Logical consistency between disagreeing experts can be used to keep us safer when we use their noisy decisions. My latest on this overlooked tool to tame AIs -- submitted to IEEE SaTML 2026 in Berlin. arxiv.org/abs/2510.00821

02.10.2025 12:08 — 👍 1    🔁 0    💬 0    📌 0
Post image

Formalization is considered the sine qua non of maturity in many fields. How far could you go with your work, @far.ai, if you understood and used the logic of unsupervised evaluation I have developed? How far could you inspire others? Anybody can try this logic. ntqr.readthedocs.io/en/latest

01.10.2025 21:12 — 👍 1    🔁 0    💬 0    📌 0
Post image Post image Post image

Verification of evaluations for classifiers in unlabeled settings exists. You can read about it or try it now. ntqr.readthedocs.io/en/latest

01.10.2025 21:05 — 👍 0    🔁 0    💬 0    📌 0

I'm befuddled. These are technical people who can understand linear algebra. Something about this logic is wrong for them. Unexpected? Wrong messenger? Not what they thought the solution would be? But you don't have to wait for them to try it!

01.10.2025 21:03 — 👍 0    🔁 0    💬 0    📌 0
Post image

@far.ai is not the only institution or person clamoring for verification of AI systems. The authors of this paper, who include @yoshuabengio.bsky.social, are also tone-deaf to the existence of a logic for unsupervised evaluation that uses no probability. There is no "towards" in the logic.

01.10.2025 21:03 — 👍 0    🔁 0    💬 1    📌 0

And we need @far.ai to understand that the verification of evaluations of classifiers exists. This is my work on the logic of unsupervised evaluation for classifiers. So simple that anybody who understands high-school math will get it. ntqr.readthedocs.io/en/latest

01.10.2025 21:03 — 👍 1    🔁 0    💬 1    📌 0

Drummond asserted, at a public talk he gave at @mcgilluniversity.bsky.social on his paper, that replicability was a relatively modern scientific requirement. When I asked him what he made of the Royal Society's "Nullius in verba" motto, he shrugged his shoulders.

30.09.2025 23:04 — 👍 2    🔁 0    💬 0    📌 0
Video thumbnail

Live from "war ravaged" Portland

27.09.2025 21:13 — 👍 11561    🔁 3784    💬 532    📌 382
Sumerians Look On In Confusion As God Creates World

Sumerians Look On In Confusion As God Creates World https://theonion.com/sumerians-look-on-in-confusion-as-god-creates-world-1819571221/

28.09.2025 15:00 — 👍 1578    🔁 196    💬 21    📌 11

spent more than 2 years with little progress to show for it, it was becoming clear to him and his co-workers that nothing was going to come of the project.
Like many "useless" projects, this one gave him solid training and a good start to his career. The book above is a payoff of that wasted money.

28.09.2025 14:55 — 👍 1    🔁 0    💬 0    📌 0

Fresh out of graduate school, he landed a job at MITRE, north of Boston, on a project funded by a wealthy coder/entrepreneur who had some ideas about how to create much more powerful code verifiers that had broader application.
It was not clear to the whole team what his ideas were, but as they ...

28.09.2025 14:55 — 👍 1    🔁 0    💬 1    📌 0
Post image

I was invited to give a talk at @mcgilluniversity.bsky.social this May at Trusted and Automated Decision Making (TADM). One of the other speakers, Bill Farmer, wrote this book on Type Theory. He told me a story about the quixotic quest for the universal code verifier he was involved in.

28.09.2025 14:55 — 👍 1    🔁 0    💬 1    📌 0
Post image Post image Post image

@adolfont.github.io, it may interest you to know that we can take concepts from theorem provers to formalize the computation of the group evaluations that are logically consistent with how we observe experts, human or robotic, agreeing/disagreeing on a test. ntqr.readthedocs.io/en/latest

28.09.2025 14:30 — 👍 1    🔁 1    💬 1    📌 0
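
A minimal sketch of that idea, using the Z3 SMT solver rather than the NTQR package itself (the pair-of-binary-classifiers setup, the observed counts, and the claimed evaluation below are illustrative assumptions, not the package's API):

    # Check whether a claimed group evaluation is logically consistent with
    # how two binary classifiers were observed to agree/disagree on Q
    # unlabeled items. Requires: pip install z3-solver
    from z3 import Int, Solver, sat

    Q = 10
    # Observed counts of the four voting patterns (classifier 1, classifier 2).
    observed = {('a', 'a'): 4, ('a', 'b'): 2, ('b', 'a'): 1, ('b', 'b'): 3}

    # Unknowns: how many items with each voting pattern are truly labeled 'a'.
    true_a = {p: Int(f"true_a_{p[0]}{p[1]}") for p in observed}

    s = Solver()
    for p, n in observed.items():
        s.add(true_a[p] >= 0, true_a[p] <= n)

    # A claimed group evaluation: number of truly-'a' items, plus each
    # classifier's count of correct answers on 'a' items and on 'b' items.
    Q_a_claim, c1_a, c1_b, c2_a, c2_b = 5, 4, 3, 4, 4

    s.add(sum(true_a.values()) == Q_a_claim)
    s.add(sum(true_a[p] for p in observed if p[0] == 'a') == c1_a)
    s.add(sum(observed[p] - true_a[p] for p in observed if p[0] == 'b') == c1_b)
    s.add(sum(true_a[p] for p in observed if p[1] == 'a') == c2_a)
    s.add(sum(observed[p] - true_a[p] for p in observed if p[1] == 'b') == c2_b)

    print("logically consistent" if s.check() == sat else "ruled out by logic alone")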

This huge asymmetry -- we can completely formalize evaluations (though not theories about the World) in AI or anything involving disagreeing experts -- is not being noticed or used by AI researchers or users. It should be.

28.09.2025 14:43 — 👍 0    🔁 0    💬 0    📌 0
Post image

Everything about carrying out statistical evaluation of experts taking any multiple-choice exam is complete. These pictures show one aspect of that completeness -- we can enumerate all the possible evaluations for any number of experts on any test, on any subject, given summaries of their decisions.

28.09.2025 14:43 — 👍 0    🔁 0    💬 1    📌 0
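
To make "enumerate all the possible evaluations" concrete, here is a brute-force stand-in for the smallest interesting case, two binary classifiers on a short test (plain Python with made-up counts; the logic characterizes the same sets with its linear axioms rather than by enumeration):

    # Enumerate every group evaluation of two binary classifiers that is
    # logically consistent with a summary of their votes on unlabeled items.
    from itertools import product

    # Observed counts of the four voting patterns (classifier 1, classifier 2).
    observed = {('a', 'a'): 3, ('a', 'b'): 2, ('b', 'a'): 1, ('b', 'b'): 2}
    patterns = list(observed)

    evaluations = set()
    # Try every way the items in each pattern could split into truly-'a' items.
    for split in product(*(range(observed[p] + 1) for p in patterns)):
        true_a = dict(zip(patterns, split))
        Q_a = sum(split)                       # items whose true label is 'a'
        c1_a = sum(true_a[p] for p in patterns if p[0] == 'a')
        c1_b = sum(observed[p] - true_a[p] for p in patterns if p[0] == 'b')
        c2_a = sum(true_a[p] for p in patterns if p[1] == 'a')
        c2_b = sum(observed[p] - true_a[p] for p in patterns if p[1] == 'b')
        evaluations.add((Q_a, c1_a, c1_b, c2_a, c2_b))

    print(len(evaluations), "logically possible group evaluations")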

Part of the reason is that we have no theory of all the theories we could have about the World! So completeness of anything that claims to verify the correctness of experts making decisions about the World is not possible. This is not so for evaluations of experts that have such theories.

28.09.2025 14:43 — 👍 0    🔁 0    💬 1    📌 0
Post image

There have been calls for using formal verification of software programs as a model for formally verifying AI systems, for example by @yoshuabengio.bsky.social and his co-authors. But, as you know, formal verification of code is a non-trivial task. A universal verifier for arbitrary code does not exist.

28.09.2025 14:43 — 👍 1    🔁 0    💬 2    📌 0

The purpose of ground truth is to provide an optimization path. If you want to align without any ground truth other than that of experts who disagree, how would you know you are not going off the rails? Any algorithm that gave you what GT gives -- average performance -- aids gradient descent.

27.09.2025 20:05 — 👍 0    🔁 0    💬 1    📌 0

Nice. Very ambitious. My work, if it were applicable to your work, is more like a small, formally verifiable alarm for problems or other simple inferences. It is not inconceivable it would help in your much more complex task, but it is not immediately obvious how or where it would go.

27.09.2025 20:03 — 👍 1    🔁 0    💬 1    📌 0

If we knew how correct an agent is on average, not on each decision, we could mitigate their mistakes. The example you give of lying on a topic the agent was not trained to be truthful about is one place where this could help. The rub is getting that average knowledge.

27.09.2025 20:00 — 👍 1    🔁 0    💬 1    📌 0
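
One classical example of that kind of mitigation (my illustration, not something from this thread): if you know an agent's average correctness on each label, you can correct the aggregate statistics it produces even though you cannot fix its individual decisions.

    # Known average correctness per label lets us de-bias an aggregate count,
    # even when we cannot tell which individual decisions were wrong.
    # (Textbook prevalence correction; the numbers are illustrative.)
    def corrected_prevalence(observed_rate, acc_on_a, acc_on_b):
        # observed_rate: fraction of items the agent labeled 'a'
        # acc_on_a: P(says 'a' | truly 'a');  acc_on_b: P(says 'b' | truly 'b')
        return (observed_rate + acc_on_b - 1) / (acc_on_a + acc_on_b - 1)

    # An agent that is 90% accurate on 'a' items and 80% accurate on 'b' items
    # labels 40% of a stream as 'a'; the corrected estimate of truly-'a' items:
    print(corrected_prevalence(0.40, acc_on_a=0.9, acc_on_b=0.8))  # about 0.286

As the post says, the rub is obtaining those average accuracies in the first place when no labels are available.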

When no one knows the ground truth, as may happen if you tried to train on unlabeled data, how do you confirm inner alignment? Evaluation logic is not about inner or outer, training or testing, lab or real world. It is about how to interpret evaluations wherever you use them, inner or outer.

27.09.2025 19:52 — 👍 1    🔁 0    💬 1    📌 0

if we polled a large number of humans. Is that so? How could we check?
This applies to training because we would like to train on unlabeled data (the vast majority of what we have). But eating your own dog food can lead to model degradation.

27.09.2025 19:47 — 👍 0    🔁 0    💬 0    📌 0

"As intended" -- this is the very first thing that should be questioned when metrics are invented. Are we measuring something real? If we do not know it, how do we know the experts are correct establishing it? Aligning AIs to humans in training assumes this truth exists and would be found

27.09.2025 19:47 — 👍 1    🔁 0    💬 2    📌 0

That training depends on a ground truth. No? Who or what determined it? That is the first thing we should check when we talk about aligning on "human values". If the humans disagree on what the ground truths are, we should use that disagreement to establish max/min levels of their expertise.

27.09.2025 19:44 — 👍 1    🔁 0    💬 1    📌 0
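
A minimal sketch of how disagreement alone bounds expertise (my arithmetic, not a quote from the package docs): on any question where two graders disagree, at least one of them is wrong, so their error counts must sum to at least the number of disagreements.

    # If two experts disagree on d of Q questions, at least one of them is
    # wrong on each disagreement, so error_1 + error_2 >= d. That alone puts
    # a floor under the error rate of the worse expert.
    def worst_expert_min_error_rate(disagreements, Q):
        return disagreements / (2 * Q)

    print(worst_expert_min_error_rate(disagreements=20, Q=100))  # 0.1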

The very idea of aligning AI systems with humans should be verified in each case. What does alignment to humans mean when they agree on only 80% of the answer key to test sets?
How do we know that humans are aligning to a definable, scientific idea of a "ground truth" if 100% agreement is impossible?

27.09.2025 19:41 — 👍 1    🔁 0    💬 1    📌 0
Post image Post image Post image

You can formalize evaluations of classifiers or any expert answering a multiple-choice exam. It is mathematically complete and relies on one logical property alone -- logical consistency between disagreeing experts.

27.09.2025 19:39 — 👍 1    🔁 0    💬 2    📌 0
Post image Post image

Speaking of boxes of numbers. Here are the possible evaluations for a single binary classifier that labeled Q=10 items (left). If you understand that evaluations of classifiers obey linear axioms, you can restrict this space of 285 possible evaluations to just 35 after seeing a summary of its labels.

25.09.2025 15:56 — 👍 0    🔁 1    💬 1    📌 0
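
The flavor of those pictures can be reproduced with a few lines of plain Python (the exact totals depend on which boundary cases are counted as distinct evaluations, so they may differ slightly from the 285 and 35 quoted above):

    # All possible evaluations of one binary classifier on Q = 10 items.
    # An evaluation is (Q_a, x, y): the unknown number of truly-'a' items,
    # the correct answers among them, and the correct answers among the rest.
    Q = 10
    all_evals = [(Q_a, x, y)
                 for Q_a in range(Q + 1)
                 for x in range(Q_a + 1)
                 for y in range(Q - Q_a + 1)]
    print(len(all_evals), "evaluations before seeing anything")

    # Now use the summary of the classifier's labels: it said 'a' n_a times.
    # Linear axiom: n_a = (correct on 'a') + (truly-'b' items it mislabeled),
    # i.e. n_a = x + (Q - Q_a - y).
    n_a = 5
    consistent = [e for e in all_evals if e[1] + (Q - e[0] - e[2]) == n_a]
    print(len(consistent), "evaluations consistent with the label summary")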

I spent the week at the police surveillance convention and let me tell you my biggest observation: The name of the game now is consolidating as much information as humanly possible from surveillance devices, the internet, other governmental data, and literally a million other places. 🧵

26.09.2025 16:47 — 👍 1392    🔁 725    💬 43    📌 93

There is nothing wrong with using other side assumptions to limit the possible accuracies of experts. But not starting from the logically consistent set is, well, inconsistent! This is why the science of AI evals needs to implement the logic.

26.09.2025 18:06 — 👍 0    🔁 0    💬 0    📌 0
