x[c(2,5,9),] identifies the submatrix
00
AD
00
00
BE
00
00
00
CF
00
R fills in matrices by column (alphabetically above), recycling the vector 1:2 as needed (in this case, 3 times to fill 6 spots).
So we get A = 1, B = 2, C = 1, D = 2, E = 1, F = 2.
29.11.2025 04:33
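A quick way to check the fill order is to run it. This is only a sketch: it assumes x is a 10 x 2 matrix of zeros, with the dimensions read off the diagram, since the original x isn't shown here.

```r
# Assumption: x is a 10 x 2 matrix of zeros (dimensions taken from the diagram).
x <- matrix(0, nrow = 10, ncol = 2)

# Assign 1:2 into rows 2, 5, and 9. The target is a 3 x 2 submatrix (6 cells),
# so R recycles 1:2 three times and fills the cells column by column.
x[c(2, 5, 9), ] <- 1:2

# Column 1 holds A, B, C and column 2 holds D, E, F.
sub <- x[c(2, 5, 9), ]
print(sub)
```

Reading down the first column and then the second gives 1, 2, 1, 2, 1, 2 for A through F.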
I TA'd for John at ICPSR in 2017. He was very supportive of me early on in my career and so inspiring as someone bridging statistics education, methodological research, and statistical computing. He is certainly a giant whose shoulders I am honored to stand on.
28.11.2025 16:47
sprintf() is pretty close
26.11.2025 13:14
Do you mean "immortal time bias"? Definitely one of the coolest named biases
25.11.2025 17:17
"The probability that there is no effect is the same as the probability of flipping 7 heads in a row"
See, I can misinterpret S-values, too
25.11.2025 16:03
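For reference, the arithmetic behind the coin-flip analogy: the S-value of a p-value is s = -log2(p), so 7 heads in a row corresponds to p = 2^-7. It measures how surprising the data are under the test hypothesis, not the probability that the hypothesis is true. A minimal sketch; the helper name is made up for illustration.

```r
# Hypothetical helper: S-value, in bits of information against the test hypothesis.
s_value <- function(p) -log2(p)

p <- 0.5^7       # probability of 7 heads in 7 fair coin flips
s <- s_value(p)  # the corresponding S-value in bits
print(c(p = p, s = s))
```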
Any #rstats advice for writing mathematical expressions in ggplot2 axis labels? Dying not to have to use the insanity that is bquote()
13.11.2025 21:57
I could have bootstrapped the whole thing and it still would have been quicker!
04.11.2025 20:26
What do you mean my Bayesian logistic regression on a moderate sample took 3 minutes to run
04.11.2025 16:41
Hopefully this causes a paradigm shift and stops these papers from getting through.
Next we need clones of this paper for every social science discipline.
27.10.2025 18:43
This is excellent, and I'm so glad this paper was finally written, and so clearly as well. I basically write an equivalent every time I am consulting with someone proposing this type of study, and I'm so glad I can save my effort and just send them this instead!
27.10.2025 18:43
A "methods primer" article in the journal "BMJ Medicine", titled "Factors associated with: problems of using exploratory multivariable regression to identify causal risk factors"
We wrote an article explaining why you shouldn't put several variables into a regression model and report which are statistically significant - even as exploratory research. bmjmedicine.bmj.com/content/4/1/.... How did we do?
27.10.2025 17:39
This sounds suspiciously like you want to interpret the coefficients from this model… Fit the best model for the data (probably logistic) and use marginaleffects to compute the quantity of interest.
24.10.2025 13:04
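A rough sketch of that workflow in base R, on simulated data (every variable name and number here is invented for illustration). It fits a logistic model, then averages predicted probabilities under treatment and under control to get a risk difference. My understanding is that marginaleffects::avg_comparisons() reports this kind of quantity for a binary treatment, but treat that as an assumption and check the package documentation.

```r
set.seed(1)

# Simulated data, purely illustrative.
n <- 500
d <- data.frame(x = rnorm(n))
d$treat <- rbinom(n, 1, plogis(0.5 * d$x))
d$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$treat + 0.6 * d$x))

# Fit the model that suits a binary outcome.
fit <- glm(y ~ treat * x, data = d, family = binomial)

# Predict each unit's outcome probability under treatment and under control.
p1 <- predict(fit, newdata = transform(d, treat = 1), type = "response")
p0 <- predict(fit, newdata = transform(d, treat = 0), type = "response")

# Average risk difference: the quantity of interest, not a model coefficient.
rd <- mean(p1 - p0)
print(rd)
```

The point is that the estimate comes from averaged predictions, so the choice of link function no longer dictates how the result is interpreted.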
Only true for the ATT, but also equivalent to the just identified CBPS!
18.10.2025 11:17
Looks like observations are being dropped due to missingness. Did you encode the censoring event as NA if it didn't occur? I think it's supposed to be encoded as logical. Would be helpful to see the function call, not just the output.
14.10.2025 10:58
My hot take is that "fixed effects" has a single, clear meaning that is equivalent across all subdisciplines of statistics.
05.10.2025 00:44
bsky.app/profile/noah...
26.09.2025 15:58
These methods are new and bespoke, but I'd love to hear if they inspire or help you in your own research! Feel free to let me know if you have any questions about the methodology. How would you have solved these problems?
18.09.2025 15:23
Computation of the Balance-Sample Size Frontier in Matching Methods for Causal Inference
Returns the subset of the data with the minimum imbalance for
every possible subset size (N - 1, N - 2, ...), down to the data set with the
minimum possible imbalance. Also includes tool...
All this can be done using my {MatchingFrontier} #Rstats package, which isn't yet on CRAN. This is part of my growing body of "cool new methods I've programmed but am too lazy to write a paper about". Please get in touch if you want to collab on some.
18.09.2025 15:23
We used g-computation with a cluster-robust SE for pair membership. Because we dropped units from both groups, this analysis targeted the ATO, which is the best we could do given the lack of overlap.
I won't speak about the results (IMO they are less cool than the methods)
18.09.2025 15:23
After selecting our subset, we did a 2:1 pair match on the scaled Euclidean distance to assign each control unit a treated unit, which supplied its initiation date. Finally, we regressed the 90-day outcomes on the treatment and selected covariates in the matched sample and estimated the effect.
18.09.2025 15:23
The lowest energy distance was achieved with too small a sample to make inferences, given our hypothesized effect sizes. Instead, we chose the largest sample along the 2:1 energy distance frontier that had all SMDs below .1, and adjusted for remaining imbalance with regression.
18.09.2025 15:23
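The SMD check can be sketched as follows. The helper name and toy data are invented, and the pooled-SD standardizer is an assumption; the thread doesn't say which standardizer was used.

```r
# Hypothetical helper: standardized mean difference for one covariate.
# Assumption: the standardizer is the pooled SD of the two groups.
smd <- function(x, treat) {
  pooled_sd <- sqrt((var(x[treat == 1]) + var(x[treat == 0])) / 2)
  (mean(x[treat == 1]) - mean(x[treat == 0])) / pooled_sd
}

# Toy data, invented for illustration.
treat <- c(1, 1, 1, 0, 0, 0)
X <- cbind(age = c(1, 2, 3, 2, 3, 4), score = c(5, 6, 7, 5, 6, 7))

# One SMD per covariate, then the "all below .1" rule.
smds <- apply(X, 2, smd, treat = treat)
balanced <- all(abs(smds) < 0.1)
print(smds)
print(balanced)
```

A subset along the frontier would pass this rule only if every covariate's absolute SMD falls below the threshold.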
At each step, we dropped the unit that, when dropped, yielded the largest decrease in the energy distance, subject to the constraint that the ratio of control units to treated units was 2:1 so we could perform 2:1 matching in the final selected subset. This process is greedy, not optimal, but worked
18.09.2025 15:23
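A simplified sketch of the greedy step on simulated data. It is not the {MatchingFrontier} implementation: it leaves out the 2:1 ratio constraint, uses unscaled covariates, and all function and variable names are hypothetical.

```r
# Two-sample energy distance from pairwise Euclidean distances.
energy_distance <- function(X, Y) {
  D <- as.matrix(dist(rbind(X, Y)))
  ix <- seq_len(nrow(X))
  iy <- nrow(X) + seq_len(nrow(Y))
  2 * mean(D[ix, iy]) - mean(D[ix, ix]) - mean(D[iy, iy])
}

# Greedy frontier: at each step, drop the unit whose removal gives the
# lowest energy distance. Assumption: no ratio constraint is enforced here.
greedy_frontier <- function(X, treat, n_drop) {
  keep <- rep(TRUE, nrow(X))
  path <- numeric(n_drop)
  for (step in seq_len(n_drop)) {
    candidates <- which(keep)
    ed <- vapply(candidates, function(i) {
      k <- keep
      k[i] <- FALSE
      # Never let a group shrink below two units.
      if (sum(k & treat == 1) < 2 || sum(k & treat == 0) < 2) return(Inf)
      energy_distance(X[k & treat == 1, , drop = FALSE],
                      X[k & treat == 0, , drop = FALSE])
    }, numeric(1))
    keep[candidates[which.min(ed)]] <- FALSE
    path[step] <- min(ed)
  }
  list(keep = keep, path = path)
}

# Simulated covariates: 10 treated units shifted away from 20 controls.
set.seed(2)
treat <- rep(c(1, 0), c(10, 20))
X <- matrix(rnorm(60), ncol = 2)
X[treat == 1, ] <- X[treat == 1, ] + 1

res <- greedy_frontier(X, treat, n_drop = 5)
print(res$path)
```

Each step re-evaluates every remaining unit, so the cost grows quickly with sample size; that is the price of the greedy search.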
We dropped units to minimize the energy distance between the groups. The energy distance was described for balancing by Huling & Mak (2024); it is a scalar, multivariate measure of the difference between two *joint* distributions. In this case, those are the confounder distributions in the groups.
18.09.2025 15:23
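For a concrete picture, here is a minimal sketch of the two-sample energy distance, 2 E||x - y|| - E||x - x'|| - E||y - y'||, with expectations replaced by averages over all pairs. The function name and simulated data are illustrative, and in practice the covariates would presumably be scaled first.

```r
# Two-sample energy distance between the rows of X and the rows of Y.
energy_distance <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  D <- as.matrix(dist(rbind(X, Y)))  # Euclidean distances among all units
  ix <- seq_len(nrow(X))             # rows belonging to X
  iy <- nrow(X) + seq_len(nrow(Y))   # rows belonging to Y
  2 * mean(D[ix, iy]) - mean(D[ix, ix]) - mean(D[iy, iy])
}

# Simulated confounders: the treated group is shifted by 1 on both variables.
set.seed(3)
X_control <- matrix(rnorm(80), ncol = 2)
X_treated <- matrix(rnorm(40), ncol = 2) + 1

ed_groups <- energy_distance(X_treated, X_control)
ed_self <- energy_distance(X_control, X_control)  # identical samples
print(c(groups = ed_groups, self = ed_self))
```

Identical samples give a distance of zero, and the value grows as the joint distributions move apart, which is what makes it usable as a single balance criterion.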
We dropped units from the sample one at a time so that the units that remained formed a balanced sample. How did we decide which unit to drop at each step and when to stop dropping units? The framework for this is described by King et al (2017) as the "matching frontier".
18.09.2025 15:23
In traditional matching, pairing is used to identify a matched subset of the original sample that is balanced. Though we want pairs and we want balance, it turns out we can do the subset selection first and the pairing second and get better results. Cho et al (2013) describe subset selection.
18.09.2025 15:23
References for the above, since these papers are not well known, but should be:
King et al (2017): doi.org/10.1111/ajps...
Huling & Mak (2024): doi.org/10.1515/jci-...
Cho et al (2013): doi.org/10.1111/stan...
18.09.2025 15:23
To solve this, we combined several existing methods to develop our own. This was a combination of the matching frontier developed by King et al (2017), energy balancing developed by Huling & Mak (2024), and subset selection by Cho et al (2013).
18.09.2025 15:23
Traditional and even advanced matching methods failed. PS matching, Mahalanobis distance matching, and cardinality matching all failed to achieve adequate balance or maintain sufficient sample size, even after trying many variations (calipers, etc.). We needed a more specialized approach.
18.09.2025 15:23
ASA Fellow; #rstats developer of graphical methods for categorical and multivariate data; #datavis history of data visualization; #historicaldatavis; Milestones project
Web: www.datavis.ca
GitHub: github.com/friendly
Fostering a dialogue between industry and academia on causal data science.
Causal Data Science Meeting 2025: causalscience.org
Professor of Epidemiology
Emory University
Social policy evaluator. London-based, Belfast-born. They/them. Personal account - views mine. Posts auto delete after a month.
Mastodon: https://sciences.social/@andi
Web: https://andifugard.info/
DM: https://andifugard.info/contact/
Interested in the design, validation, and analysis of self-report assessments. PhD from UCSB. Currently working in biotech. Happy to chat about AltAC!
professor of psych methods at UC Davis. SEM, measurement, rigorous and replicable research methods.
PhD in Developmental Science | JSMF Postdoctoral Fellow studying the development of social cognition & interaction | Complex Systems Lab| UPenn | She/her
Professor of #quantpsych. Peoples' processes might be different, let's figure out how to model this #rstats. 🏳️‍🌈
Recovering behavioral scientist, posing as a statistical consultant. Applied stats #RStats, neurodiversity, learning Ukrainian, writing things
Software engineer @posit.co, humane #rstats
Social Science Data Analysis (especially Dyadic, Mediation, SEM; davidakenny.net) and My Twisted Take on Political & Cultural Topics; Reposts Are Not Necessarily Endorsements
Post-doc at NYU Grossman School of Medicine (this account is solely in my personal capacity, all views are my own etc). Non-parametric statistics, causal inference, Bayesian methods. Herbsusmann.com
A channel for making you better at statistics
Asst. Prof. of Statistics & Political Science at Penn State. I study stats methods, gerrymandering, & elections. Bayesian. Founder of UGSDW and proud alum of HGSU-UAW L. 5118.
corymccartan.com
Assistant Professor of Biostatistics at Penn. Views are my own.
You don't need a PhD in statistics or years of coding experience to learn R, the most powerful tool for data analysis and visualization.
https://rfortherestofus.com/
Dedicated to training and dissemination in research methods, statistics, and data science; centerstat.org
Bayescurious evidence enthusiast at the100.ci
Topics: evolution, ovulation, mutation, intelligence, personality, sexuality, R, open science & source tools. https://rubenarslan.github.io/