I'd love to see someone specify that function...where does Terence Tao sit on the curve?
09.10.2025 20:02 β π 1 π 0 π¬ 0 π 0@erickscott.bsky.social
Scientist, building cstructure.
I'd love to see someone specify that function...where does Terence Tao sit on the curve?
09.10.2025 20:02 β π 1 π 0 π¬ 0 π 0I have been surprised that the first generation is usually more thoughtful with the sycophant warning as a system prompt.
I won't speculate on what's actually happening with matmul/reasoning, but I have found it helpful to counteract the vendor's ingratiating base prompt.
Try adding: 'Don't be a sycophant' to your system prompt.
Gemini is more stubborn...
I don't think they are useless and deepmind is definitely moving forward with principled quantitative approaches. Moving stochastic outputs into structured models + rapid human error correction is what @travisgerke.bsky.social and I are working on at cStructure. Happy to chat anytime
09.10.2025 17:49 β π 2 π 0 π¬ 1 π 0The code that supposedly underpinned the analysis used a fake propensity score (0.1*covariate1 + 0.2*covariate2...) with a comment that a real propensity model should be implemented.
This happens all the time: code syntax was fine, semantics wrong - assoc. text was plausible. User beware.
2/2
I have many similar stories. For example, I asked for a propensity score analysis of Lalonde assuming this canonical example is a best case scenario. I provided the dataset. The generated text provided a correct and nuanced description of the estimator and the ATE. 1/2
09.10.2025 17:41 β π 3 π 0 π¬ 1 π 0cStructure is proud to collaborate with @BeeKeeperAI_Inc and DREAM on The COVID Causal Diagram DREAM Challenge.
Privacy-preserving compute + collaborative causal modeling -> the future of responsible AI development.
Learn more at cstructure.net
@travisgerke.bsky.social
Tonight is the 250th anniversary of Paul Revereβs midnight ride. May his memory remind us all to resist the tyranny forming in our government.
19.04.2025 00:00 β π 23420 π 7548 π¬ 329 π 286Bayesian posterior distributions. So much information packed into the density. If two people disagree on what threshold should be used to make a decision, it's easy to calculate the support for either.
18.04.2025 02:59 β π 3 π 1 π¬ 0 π 0Reminds me of the difference b/w Efficacy (ITT if properly used, e.g. abstinence for teen pregnancy) vs Effectiveness (PP, outcomes when abstinence is used in practice).
In practice, I think the effectiveness of a causal method is important as unmeasured confounding is ever present in real data.
The Trump-Supreme Court battle is not really the crisis.
The crisis is here now. Trump is enacting an insidious coordinated attack on our institutions of democratic accountability, designed to crater democracy before next fall.
1/ A long π§΅to explain the plan & how we stop it.
I really do believe Gordon et al. offered a best effort assessment. The scale, diversity, and quality of their ground truth is quite impressive.
What would you have done differently to reduce assumption violation?
I see simulations as a useful tool to assess method performance under various degrees of assumption violation.
I also think the simulations should approximate the magnitude and direction of bias seen in high quality empirical studies.
I love the WeightIt package, thanks for developing it.
How do you interpret these simulation papers in light of large scale empirical benchmarks
www.researchgate.net/publication/...
Am I the only one in industry, that looks at this thread and remembers junior hires showing up to their first stakeholder meeting after throwing "all the x's" into sci-kit learn and then getting absolutely thrashed by the domain experts?
16.04.2025 19:38 β π 0 π 0 π¬ 0 π 0It's like we learned absolutely nothing from the reproducibility crisis, kitchen-sink machine learning models for covid, population/environmental stratification in genomics, a/b testing at scale...sigh.
16.04.2025 19:34 β π 1 π 0 π¬ 0 π 0I just named several industries that in practice don't blindly use LASSO. A/B testing is used by any industry with a website/app and small to large companies employ (data) scientists to design and analyze the experiments. Healthcare is a pretty large industry, Computing is a pretty large industry??
16.04.2025 19:06 β π 0 π 0 π¬ 0 π 0Then I am puzzled by the idea that in practice scientists just expect LASSO to select the right variables. Here's SHAP docs describing why that is a bad assumption **in practice**
shap.readthedocs.io/en/latest/ex...
You should encourage him to explore causal inference.
Practical applications that share the same concern about LASSO: A/B testing, drug development, electrical engineering, physics
Bone chilling.
A court ordered Kilmar Abrego Garcia to stay in the United States.
The Supreme Court ruled 9-0 that he was illegally removed. Trump is pretending he won the ruling 9-0.
1/ You may not think this case means anything to you. But let me tell you why it does.
Empirical studies
RCT-Duplicate: www.rct-duplicate.org
FDA RWE examples: www.fda.gov/media/146258... &
Northwestern/Meta A/B testing: www.kellogg.northwestern.edu/faculty/gord... and arxiv.org/abs/2201.07055
There are several excellent technical books on this subject.
WhatIf by HernΓ‘n and Robins: miguelhernan.org/whatifbook
Causal Inference in Statistics by Pearl: www.amazon.com/Causal-Infer...
Causality by Pearl: bayes.cs.ucla.edu/BOOK-2K/
Those trying to understand the tariffs as economic policy are dangerously naive.
No, the tariffs are a tool to collapse our democracy. A means to compel loyalty from every business that will need to petition Trump for relief.
1/ A π§΅ to explain his plan and how we fight back.
Rectangle is amazing for organizing windows
rectangleapp.com
Spaces are also really helpful to keep work streams partitioned. 3 finger up/down
Dbeaver is the best free database GUI for mac
Homebrew for package management is a must
π The @marimo.io YouTube channel crossed 1k subscribers today β just two weeks after its launch!
marimo is best understand by seeing it in action. In his latest video, the one and only @koaning.bsky.social gives a bird's eye overview of what sets marimo apart:
www.youtube.com/watch?v=3N6l...
Did you even glance at the report? 19% of the funds distributed were unrestricted. Also, look at the other categories. If there was ever a time to 'break the glass', it is now. Unis that choose to preserve their endowments over their staff and students should consider their non-profit mission
11.03.2025 22:00 β π 0 π 0 π¬ 0 π 0The financial challenges for US universities and science are unprecedented. Columbia can abs absorb the funding gap for the next 4 years w/o any financial hardship, endowment >$14b and annualized returns of 5-8%. Public unis are not so comfortable endowment.giving.columbia.edu/endowment-pe...
11.03.2025 21:43 β π 0 π 0 π¬ 1 π 0Big news: we are setting up a new non-profit organization to run bioRxiv and medRxiv. It's called openRxiv [no it's not a new preprint server; it's dedicated organization to oversee the servers] openrxiv.org 1/n
11.03.2025 13:20 β π 2571 π 849 π¬ 55 π 422) Construct a living database of published studies consistent with Meiotic Randomization using the guideline
3) Publish/Disseminate high-profile systematic review using the Meiotic Randomization database
4) Convince your colleagues to use the term and methodology
5) Watch the selective sweep
2/2
Meiotic Randomization is more precise, it is never too late to evolve methods, nomenclature, and practice.
1) Create Equator Network guideline for the required elements of a Meiotic Randomization study that corrects the failures of most Mendelian Randomization designs.
1/n