Erick Scott's Avatar

Erick Scott

@erickscott.bsky.social

Scientist, building cstructure.

42 Followers  |  94 Following  |  50 Posts  |  Joined: 11.02.2024  |  2.0092

Latest posts by erickscott.bsky.social on Bluesky

I'd love to see someone specify that function...where does Terence Tao sit on the curve?

09.10.2025 20:02 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I have been surprised that the first generation is usually more thoughtful with the sycophant warning as a system prompt.

I won't speculate on what's actually happening with matmul/reasoning, but I have found it helpful to counteract the vendor's ingratiating base prompt.

09.10.2025 18:30 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Try adding: 'Don't be a sycophant' to your system prompt.

Gemini is more stubborn...

09.10.2025 17:51 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

I don't think they are useless and deepmind is definitely moving forward with principled quantitative approaches. Moving stochastic outputs into structured models + rapid human error correction is what @travisgerke.bsky.social and I are working on at cStructure. Happy to chat anytime

09.10.2025 17:49 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

The code that supposedly underpinned the analysis used a fake propensity score (0.1*covariate1 + 0.2*covariate2...) with a comment that a real propensity model should be implemented.

This happens all the time: code syntax was fine, semantics wrong - assoc. text was plausible. User beware.
2/2

09.10.2025 17:45 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I have many similar stories. For example, I asked for a propensity score analysis of Lalonde assuming this canonical example is a best case scenario. I provided the dataset. The generated text provided a correct and nuanced description of the estimator and the ATE. 1/2

09.10.2025 17:41 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

cStructure is proud to collaborate with @BeeKeeperAI_Inc and DREAM on The COVID Causal Diagram DREAM Challenge.

Privacy-preserving compute + collaborative causal modeling -> the future of responsible AI development.

Learn more at cstructure.net

@travisgerke.bsky.social

16.05.2025 17:13 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0
Post image

Tonight is the 250th anniversary of Paul Revere’s midnight ride. May his memory remind us all to resist the tyranny forming in our government.

19.04.2025 00:00 β€” πŸ‘ 23420    πŸ” 7548    πŸ’¬ 329    πŸ“Œ 286
Post image

Bayesian posterior distributions. So much information packed into the density. If two people disagree on what threshold should be used to make a decision, it's easy to calculate the support for either.

18.04.2025 02:59 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Reminds me of the difference b/w Efficacy (ITT if properly used, e.g. abstinence for teen pregnancy) vs Effectiveness (PP, outcomes when abstinence is used in practice).

In practice, I think the effectiveness of a causal method is important as unmeasured confounding is ever present in real data.

17.04.2025 18:49 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Video thumbnail

The Trump-Supreme Court battle is not really the crisis.

The crisis is here now. Trump is enacting an insidious coordinated attack on our institutions of democratic accountability, designed to crater democracy before next fall.

1/ A long 🧡to explain the plan & how we stop it.

17.04.2025 16:22 β€” πŸ‘ 4583    πŸ” 1951    πŸ’¬ 160    πŸ“Œ 439

I really do believe Gordon et al. offered a best effort assessment. The scale, diversity, and quality of their ground truth is quite impressive.

What would you have done differently to reduce assumption violation?

17.04.2025 15:16 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I see simulations as a useful tool to assess method performance under various degrees of assumption violation.

I also think the simulations should approximate the magnitude and direction of bias seen in high quality empirical studies.

17.04.2025 15:12 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Preview
(PDF) Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement PDF | Randomized controlled trials (RCTs) have become increasingly popular in both marketing practice and academia. However, RCTs are not always... | Find, read and cite all the research you need on R...

I love the WeightIt package, thanks for developing it.

How do you interpret these simulation papers in light of large scale empirical benchmarks

www.researchgate.net/publication/...

17.04.2025 00:51 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Am I the only one in industry, that looks at this thread and remembers junior hires showing up to their first stakeholder meeting after throwing "all the x's" into sci-kit learn and then getting absolutely thrashed by the domain experts?

16.04.2025 19:38 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

It's like we learned absolutely nothing from the reproducibility crisis, kitchen-sink machine learning models for covid, population/environmental stratification in genomics, a/b testing at scale...sigh.

16.04.2025 19:34 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I just named several industries that in practice don't blindly use LASSO. A/B testing is used by any industry with a website/app and small to large companies employ (data) scientists to design and analyze the experiments. Healthcare is a pretty large industry, Computing is a pretty large industry??

16.04.2025 19:06 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Then I am puzzled by the idea that in practice scientists just expect LASSO to select the right variables. Here's SHAP docs describing why that is a bad assumption **in practice**
shap.readthedocs.io/en/latest/ex...

16.04.2025 18:14 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

You should encourage him to explore causal inference.

Practical applications that share the same concern about LASSO: A/B testing, drug development, electrical engineering, physics

16.04.2025 14:36 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

Bone chilling.

A court ordered Kilmar Abrego Garcia to stay in the United States.

The Supreme Court ruled 9-0 that he was illegally removed. Trump is pretending he won the ruling 9-0.

1/ You may not think this case means anything to you. But let me tell you why it does.

14.04.2025 21:04 β€” πŸ‘ 13176    πŸ” 4344    πŸ’¬ 467    πŸ“Œ 342

Empirical studies

RCT-Duplicate: www.rct-duplicate.org

FDA RWE examples: www.fda.gov/media/146258... &

Northwestern/Meta A/B testing: www.kellogg.northwestern.edu/faculty/gord... and arxiv.org/abs/2201.07055

14.04.2025 14:01 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

There are several excellent technical books on this subject.

WhatIf by HernΓ‘n and Robins: miguelhernan.org/whatifbook

Causal Inference in Statistics by Pearl: www.amazon.com/Causal-Infer...

Causality by Pearl: bayes.cs.ucla.edu/BOOK-2K/

14.04.2025 13:52 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Those trying to understand the tariffs as economic policy are dangerously naive.

No, the tariffs are a tool to collapse our democracy. A means to compel loyalty from every business that will need to petition Trump for relief.

1/ A 🧡 to explain his plan and how we fight back.

03.04.2025 03:29 β€” πŸ‘ 28255    πŸ” 13950    πŸ’¬ 926    πŸ“Œ 3531
Preview
Rectangle Move and resize windows in macOS using keyboard shortcuts or snap areas. The official page for Rectangle.

Rectangle is amazing for organizing windows
rectangleapp.com

Spaces are also really helpful to keep work streams partitioned. 3 finger up/down

Dbeaver is the best free database GUI for mac

Homebrew for package management is a must

28.03.2025 21:32 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
An overview of marimo
YouTube video by marimo An overview of marimo

πŸš€ The @marimo.io YouTube channel crossed 1k subscribers today β€” just two weeks after its launch!

marimo is best understand by seeing it in action. In his latest video, the one and only @koaning.bsky.social gives a bird's eye overview of what sets marimo apart:

www.youtube.com/watch?v=3N6l...

12.03.2025 19:58 β€” πŸ‘ 11    πŸ” 4    πŸ’¬ 0    πŸ“Œ 0
Post image

Did you even glance at the report? 19% of the funds distributed were unrestricted. Also, look at the other categories. If there was ever a time to 'break the glass', it is now. Unis that choose to preserve their endowments over their staff and students should consider their non-profit mission

11.03.2025 22:00 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Preview
Endowment Performance and Management - Columbia Endowment 2024 See Columbia’s endowment performance for the most recent fiscal year and its transformative long-term endowment gains.

The financial challenges for US universities and science are unprecedented. Columbia can abs absorb the funding gap for the next 4 years w/o any financial hardship, endowment >$14b and annualized returns of 5-8%. Public unis are not so comfortable endowment.giving.columbia.edu/endowment-pe...

11.03.2025 21:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Homepage - openRxiv openRxiv is an independent non-profit, the new organizational home for bioRxiv and medRxiv, enabling researchers to instantly share groundbreaking findings with the global scientific community.

Big news: we are setting up a new non-profit organization to run bioRxiv and medRxiv. It's called openRxiv [no it's not a new preprint server; it's dedicated organization to oversee the servers] openrxiv.org 1/n

11.03.2025 13:20 β€” πŸ‘ 2571    πŸ” 849    πŸ’¬ 55    πŸ“Œ 42

2) Construct a living database of published studies consistent with Meiotic Randomization using the guideline
3) Publish/Disseminate high-profile systematic review using the Meiotic Randomization database
4) Convince your colleagues to use the term and methodology
5) Watch the selective sweep
2/2

09.03.2025 16:06 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Meiotic Randomization is more precise, it is never too late to evolve methods, nomenclature, and practice.

1) Create Equator Network guideline for the required elements of a Meiotic Randomization study that corrects the failures of most Mendelian Randomization designs.

1/n

09.03.2025 15:59 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 1

@erickscott is following 20 prominent accounts