Next week’s seminar brings another thought-provoking discussion! Join us on February 11th to hear from CTML GSR Mingxun (Michael) Wang presenting his talk on "Highly Adaptive Principal Component Regression: Fast HAL/HAR via Outcome-Blind Kernel PCA." The seminar will take place at 12:00 PM in Berkeley Way West, 5th Floor, Room 5401.
The Highly Adaptive Lasso (HAL) has strong rate guarantees under minimal smoothness assumptions, but can be computationally prohibitive in moderate to high dimensions due to its enormous basis expansion. We introduce PCHAL and PCHAR, which perform outcome-blind dimension reduction by projecting the highly adaptive kernel onto its leading principal components, yielding simple closed-form ridge solutions and a lasso solution that reduces to soft-thresholding in an orthogonal score space. The resulting estimators substantially accelerate fitting and cross-validation while matching the empirical predictive performance of HAL/HAR, and are implemented in the hapc R package.
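To make the "orthogonal score space" idea concrete, here is a minimal Python sketch. It is not the hapc package's interface: the function names, the toy data, and the exact scaling of the penalties are assumptions for illustration. It only assumes that the score matrix U has orthonormal columns, in which case ridge reduces to a uniform rescaling of the inner products between each column and the outcome, and lasso reduces to soft-thresholding them.

```python
import math


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def score_products(U, y):
    """Inner product of each column of U with y (U is a list of columns)."""
    return [sum(u_i * y_i for u_i, y_i in zip(col, y)) for col in U]


def ridge_coefs(U, y, lam):
    """Minimizer of ||y - U b||^2 + lam * ||b||^2, assuming orthonormal columns."""
    return [z / (1.0 + lam) for z in score_products(U, y)]


def lasso_coefs(U, y, lam):
    """Minimizer of 0.5 * ||y - U b||^2 + lam * ||b||_1, assuming orthonormal columns."""
    return [soft_threshold(z, lam) for z in score_products(U, y)]


# Toy example (hypothetical data): two orthonormal score columns, three observations.
s = 1 / math.sqrt(2)
U = [[s, s, 0.0], [s, -s, 0.0]]
y = [3.0, 1.0, 5.0]
ridge = ridge_coefs(U, y, lam=1.0)
lasso = lasso_coefs(U, y, lam=2.0)  # the second coefficient is thresholded to zero
```

Because neither solution requires an iterative solver, refitting over a grid of penalty values during cross-validation is cheap, which is consistent with the speedups the abstract describes.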
Next week’s seminar brings another thought-provoking discussion! Join us on February 11 to hear from CTML GSR Mingxun Wang presenting his talk on "Highly Adaptive Principal Component Regression: Fast HAL/HAR via Outcome-Blind Kernel PCA." The seminar will take place at 12PM in BWW, 5th Fl, Rm 5401.
05.02.2026 17:17 — 👍 1 🔁 0 💬 0 📌 0
We’re pleased to welcome Toru Shirakawa, CTML GSR and CPH PhD Student, to next week’s CTML Seminar on Wednesday, February 4th, presenting on “A Conformalized Inference on Unobservable Variables.” The seminar will take place at 12:00 PM in Berkeley Way West, 5th Floor, Room 5401.
Abstract: Quantifying uncertainty in predicted unobservable variables is a critical area of research in statistics, artificial intelligence, and empirical science. Most scientific studies assume a specific structure involving unobservable variables for the data-generating process and draw inferences from a parameter of interest within that framework. Conformal prediction is a popular model-agnostic method for constructing prediction intervals for new observations. However, it typically requires observed true labels to build the prediction interval, making it unsuitable for unobserved latent variables. We propose a method to construct a prediction interval by leveraging sample-splitting of the training data and analyzing the discrepancy between two independently trained models. To ensure the identifiability of the distribution of this conformity score, we introduce a few assumptions regarding the distribution of the residuals of the predictions. Furthermore, we propose a residual orthogonalization to satisfy these assumptions with a coordinating regularization term. The performance of the proposed method was evaluated using both simulation and large language model experiments.
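For context, the sketch below shows standard split conformal prediction, the baseline the abstract contrasts against, not the speaker's proposed method. It illustrates why observed labels are needed: the conformity scores are absolute residuals on a held-out calibration set. The function names and the `predict` callable are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval for x_new from a fitted model and labeled calibration data."""
    # The conformity score needs the true label y of every calibration point.
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q
```

When the target is a latent variable, `calib_y` is never observed, so these scores cannot be computed directly; that is the gap the talk addresses.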
We’re pleased to welcome Toru Shirakawa, CTML GSR and CPH PhD Student, to next week’s CTML Seminar on February 4th! The seminar will take place at 12PM in BWW, 5th Fl, Rm 5401.
Click here for the abstract 👉 tinyurl.com/5ab59c85
29.01.2026 21:34 — 👍 2 🔁 0 💬 0 📌 0
As we gear up in anticipation for this wonderful workshop, we'd like to highlight it once again to give every member of our community a chance to participate!
CTML Co-Director Mark van der Laan’s talk, "The Causal Roadmap, Targeted Learning and TMLE: What Is That All About?", will be presented at the Pharmaceutical Users Software Exchange (PHUSE) Workshop this Thursday, on January 29th, 2026, from 10:00 AM–11:00 AM (EST). The talk will be available online via Zoom.
Click the link in our bio to learn more about the talk and register today!
As we gear up for this wonderful workshop, we'd like to remind members of our community to participate!
CTML Co-Director Mark van der Laan’s talk will be presented at the Pharmaceutical Users Software Exchange (PHUSE) Workshop this Thursday from 10:00 AM–11:00 AM (EST).
26.01.2026 23:40 — 👍 1 🔁 0 💬 0 📌 0
Connect with the Center for Targeted Machine Learning and Causal Inference (CTML) community! Next week’s seminar on Wednesday, January 28th, will feature CTML GSR Kaitlyn Lee presenting her talk, "Improving Precision through Covariate Adjustment in RCTs with Binary Outcomes." The seminar will take place at 12:00 PM in Berkeley Way West, 5th Floor, Room 5401. Please note that this week’s session will be open exclusively to the CTML and UC Berkeley community.
Abstract: Covariate adjustment is a general method for improving precision when estimating treatment effects in randomized trials and is recommended by the FDA in its recent guidance when baseline variables are prognostic for the primary outcome. In this talk, we review the principles underlying covariate adjustment, with a focus on standardization, a method highlighted in the guidance for estimating the marginal treatment effect. We concentrate on settings with binary outcomes, describing practical implementation of the estimator and discussing open questions related to variance estimation and finite-sample performance.
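As a rough illustration of standardization for a binary outcome, the Python sketch below uses a single categorical baseline covariate and a saturated outcome model (the observed event rate within each arm-by-stratum cell) in place of the regression model one would normally fit. The function name and data layout are assumptions, not taken from the talk or the guidance.

```python
from collections import defaultdict


def standardized_risks(records):
    """Marginal risk under each arm via standardization.

    records: list of (arm, stratum, outcome) with arm in {0, 1} and outcome in {0, 1}.
    Assumes every stratum has at least one participant in each arm.
    """
    counts = defaultdict(int)
    events = defaultdict(int)
    for arm, stratum, outcome in records:
        counts[(arm, stratum)] += 1
        events[(arm, stratum)] += outcome

    n = len(records)
    risks = {}
    for arm in (0, 1):
        # Predict every participant's risk as if assigned to this arm,
        # then average over the full sample's covariate distribution.
        total = sum(events[(arm, stratum)] / counts[(arm, stratum)]
                    for _, stratum, _ in records)
        risks[arm] = total / n
    return risks


def standardized_risk_difference(records):
    """Marginal risk difference (arm 1 minus arm 0)."""
    risks = standardized_risks(records)
    return risks[1] - risks[0]
```

The key step is that predictions under each arm are averaged over all participants, not only those randomized to that arm, which is what targets the marginal treatment effect.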
Connect with the CTML community! Next week’s seminar on Wed, Jan 28, will feature CTML GSR Kaitlyn Lee. The seminar will take place at 12 PM in BWW, 5th Fl, Rm 5401. Please note that this week’s session will be open exclusively to the CTML and UC Berkeley Community.
22.01.2026 19:14 — 👍 1 🔁 1 💬 0 📌 0
The Center for Targeted Machine Learning and Causal Inference (CTML) Spring Seminar Series kicks off next Wednesday, January 21, with a talk by CTML GSR Andy Kim: “Predicting Loss to Follow-Up Under Resource Constraints: Leveraging Registry-Linked Mobile Health Data in Trauma Care.” Join us at 12:00 PM in Berkeley Way West, 5th Floor, Room 5401.
Abstract: Traumatic injury remains a leading cause of morbidity and mortality in sub-Saharan Africa, with a substantial proportion of adverse outcomes occurring after hospital discharge due to missed follow-up care. Leveraging linked data from the Cameroon Trauma Registry (CTR) and Mobile Health (mHealth) follow-up system, I use predictive ensemble super learning methods to construct risk scores for loss to follow-up and evaluate models using recall-based metrics most relevant under resource constraints (i.e., who to prioritize calling given limited funding, staff, etc.). By framing loss to follow-up as a prioritization problem rather than a classification task, this work highlights how context-driven choices of loss functions and performance metrics shape predictive modeling strategies in applied settings.
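One way to read "recall-based metrics under resource constraints" is recall among the top-k highest-risk patients, where k is the number of calls the team can afford. The Python sketch below is an illustrative assumption about such a metric, not the speaker's actual evaluation code; the function name and inputs are hypothetical.

```python
def recall_at_k(risk_scores, lost, k):
    """Share of truly lost-to-follow-up patients captured by calling the top k.

    risk_scores: predicted risk per patient (higher means more likely to be lost).
    lost: 1 if the patient was actually lost to follow-up, else 0.
    k: number of patients the team has capacity to call.
    """
    total_lost = sum(lost)
    if total_lost == 0:
        raise ValueError("recall is undefined when no patient was lost to follow-up")
    # Rank patients from highest to lowest predicted risk and call the first k.
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i], reverse=True)
    called = ranked[:k]
    return sum(lost[i] for i in called) / total_lost
```

Comparing models by this quantity at the budgeted k, instead of by accuracy at a fixed probability cutoff, reflects the prioritization framing in the abstract.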
15.01.2026 22:18 — 👍 1 🔁 0 💬 0 📌 0
We’re excited to kick off our first newsletter for the Spring 2026 semester coming next Wednesday, January 21!
Subscribe today to stay in the loop with the CTML community: http://eepurl.com/iLgQIw
14.01.2026 19:23 — 👍 0 🔁 0 💬 0 📌 0
Huge thank you to Dr. Alejandro Schuler, Assistant Professor in Residence at UC Berkeley Biostatistics and CTML faculty member, for his continued efforts with CTML and for his talk at the Gilead Health Equity Partnership (GHEP) Seminar Series titled "Increasing the Efficiency of Randomized Trials with Machine Learning."
Stay tuned for more cutting-edge conversations in our ongoing CTML × Gilead Health Equity Partnership (GHEP) Series in the coming weeks and on our website!
Trials enroll a large number of subjects in order to attain power, making them expensive and time-consuming. However, advancements in machine learning can make adjusted trial analyses more efficient, yielding smaller confidence intervals and p-values without sacrificing control of false positives. Adjustment works by explaining away within-treatment-group variability in the outcome using associated variability in baseline covariates, similar to stratification. Therefore, the key parameter that determines power and confidence of an adjusted analysis is how predictive the baseline covariates are for the outcome. Machine learning models often predict better than linear models and therefore they boost power. The power gain is predictable if we can accurately anticipate model performance, which allows us to trade power gains with the same sample size for smaller trials with equal power. In some settings, strongly predictive baseline information like images and free text are captured but never exploited for adjustment because they are not tabular data. Use of these data could therefore hugely increase power or decrease sample sizes, which I propose to enable using multimodal foundation models.
Huge thank you to Dr. Alejandro Schuler for his continued efforts with CTML and for his talk at the Gilead Health Equity Partnership (GHEP) Seminar Series! Stay tuned for more cutting-edge conversations in our ongoing CTML × GHEP Series in the coming weeks and on our website.
11.12.2025 18:02 — 👍 1 🔁 0 💬 0 📌 0
Explore everything CTML - Center for Targeted Machine Learning and Causal Inference has planned for our Spring Seminar Series! The full schedule is available by scanning the QR code on this post or by clicking the link in our bio.
This spring, you can look forward to an engaging lineup of events—including a career panel featuring professionals in both biostatistics and epidemiology, as well as a panel discussion with experts from Blue Shield who will share insights from industry practice. Join us starting January 21st, 2026 to learn, connect, and grow with the CTML community!
Explore everything CTML has planned for our Spring 2026 Seminar Series! The full schedule is available by scanning the QR code on this post or by clicking the link to our website 👉 ctml.berkeley.edu/spring-2026-...
09.12.2025 22:04 — 👍 2 🔁 0 💬 0 📌 0
CTML Co-Director Mark van der Laan’s talk will be presented at the Pharmaceutical Users Software Exchange (PHUSE) Workshop on Jan 29th, 2026, from 10AM–11AM (EST). The talk will be available online via Zoom. Click the link to register today! 🔗: https://www.phuse-events.org/attend/frontend/reg/tOtherPage.csp?pageID=80014&eventID=106&traceRedir=4
08.12.2025 19:10 — 👍 0 🔁 0 💬 0 📌 0
Former CTML Postdoc David McCoy will be presenting his and CTML Postdoc Zach Butzin-Dozier's research at the Neural Information Processing Systems (NeurIPS) 2025 Conference in Mexico City. If attending the conference, please stop by and check out their poster!
04.12.2025 19:19 — 👍 1 🔁 0 💬 0 📌 0
Thank you to Antonio Remiro Azócar for presenting his talk "Data Fusion for Indirect Treatment Comparisons in Health Technology Assessment" as part of the JICI Lab Seminar Series on 12/2/2025. To learn more about Antonio's work, please scan the QR code or visit lnkd.in/ge_qruVY.
03.12.2025 22:47 — 👍 0 🔁 0 💬 0 📌 0
CTML Postdoc Marie Charpignon will be presenting her poster at the AHLI 2025 ML4H Symposium next Monday, December 1st. If you’ll be attending AHLI, we welcome you to stop by Poster #4 to learn more about her work on federated target trial emulation using EHR data spanning multiple health systems!
26.11.2025 17:11 — 👍 4 🔁 0 💬 0 📌 0
T-shirt only: We’re reopening Professor Art Reingold’s commemorative t-shirt store for the last time! 🎉
Celebrate his remarkable career and legacy with an exclusive t-shirt designed just for this occasion.
📦Order here: www.customink.com/g/rgc0-00cz-...
📅 The store will close on November 30th.
25.11.2025 20:02 — 👍 3 🔁 0 💬 0 📌 0
Interested in AI for social impact? Join us on December 2nd as CTML’s Dr. Laura Balzer presents her latest research during #BerkeleyPublicHealth’s virtual Latest in Public Health Research series.
To register for Zoom: lnkd.in/gXT-k2Ve
For more info: lnkd.in/girU-8Cf
20.11.2025 23:45 — 👍 2 🔁 1 💬 0 📌 0
Thank you to Alejandro Schuler for presenting his talk "Increasing the Efficiency of Randomized Trials with Machine Learning" as part of the JICI Lab Seminar Series on 11/18/2025. To learn more about Alejandro’s talk, please scan the QR code or visit lnkd.in/ge_qruVY.
18.11.2025 21:31 — 👍 2 🔁 0 💬 0 📌 0
A big thank you to CTML's Co-Director Mark van der Laan for delivering an engaging talk on "Targeted Learning as a Principled Statistical Approach for Decision Making" at the Bay Area Biotech-Pharma Statistics Workshop on November 7, 2025!
13.11.2025 23:01 — 👍 1 🔁 0 💬 0 📌 0
Meet one of our newest team members, Sophie Chen.
"Hi everyone! My name is Sophie! I'm an undergrad student pursuing a double degree in Statistics and Economics with a Data Science minor. I'm interested in finance and technology. I'm excited to be part of the CTML community!"
12.11.2025 22:44 — 👍 1 🔁 0 💬 0 📌 0
Continuing our CTML Seminar Series is CTML GSR Yi Li. His talk, "Targeted Deep Architectures: A TMLE-Based Framework for Robust Causal Inference in Neural Networks," will take place on October 29th at 12:00 PM at Berkeley Way West, 5th Floor, Room 5401. You won't want to miss it!
Modern neural networks excel at prediction but often produce biased estimates and unreliable uncertainty for causal target parameters (e.g., average treatment effects or entire survival curves). This talk introduces Targeted Deep Architectures (TDA), a framework that embeds a targeted maximum likelihood–style update directly into a network’s parameter space. TDA freezes most weights, identifies a small “targeting” subset, and projects influence functions onto network gradients to obtain a targeting direction that iteratively removes first-order bias. The resulting universal targeting gradient enables simultaneous debiasing of multidimensional parameters (for example, an entire survival curve) without cumbersome post-hoc fluctuations or specialized losses.
If time permits, I will also present our new work on weighted‑path updates for simultaneous targeting of multidimensional parameters, where per‑component targeting directions are combined via statistically informed weights to produce stable, coherent updates and practical convergence criteria. The framework is model‑agnostic and integrates seamlessly with modern deep architectures.
23.10.2025 16:30 — 👍 2 🔁 0 💬 0 📌 0
Center for Targeted Machine Learning and Causal Inference (CTML) at DahShu Data Science Symposium 2025!
Mark van der Laan, CTML Faculty, and Sky Qiu, CTML Graduate Student Researcher, will be presenting a full-day short course titled "Highly Adaptive Lasso and Targeted Learning" at the DahShu Data Science Symposium 2025: Innovative Frontiers – AI and Data-Driven Advances in Drug Development, Precision Medicine, and Healthcare.
Saturday, October 18th
9:00 AM – 5:00 PM
This short course introduces the Highly Adaptive Lasso (HAL) and its role in advancing Targeted Learning, a general framework for constructing asymptotically linear, efficient estimators under realistic assumptions about the data generating process. Attendees will learn about HAL’s theoretical properties, its implementation for causal inference in R, and new developments including A-TMLE, P-TMLE, and DeepLTMLE with applications to modern challenges in drug development and precision medicine.
Make sure to catch their course during the symposium!
CTML Faculty Mark van der Laan and CTML GSR Sky Qiu will be presenting a full-day short course at the DahShu Data Science Symposium 2025: Innovative Frontiers – AI and Data-Driven Advances in Drug Development, Precision Medicine, and Healthcare.
📅 Saturday, Oct 18
🕘 9AM – 5PM
14.10.2025 15:48 — 👍 0 🔁 0 💬 0 📌 0
The next talk in our CTML Seminar Series is coming up on October 15th! Join us for an engaging discussion led by Michael Rosenblum, Ph.D., Professor of Biostatistics at Johns Hopkins Bloomberg School of Public Health, on "Methodological Problems in Every Black-Box Study of Forensic Firearm Comparisons." This talk will take place at 12:00 PM at Berkeley Way West, 5th Floor, Room 5401.
Reviews conducted by the National Academy of Sciences (2009) and the President’s Council of Advisors on Science and Technology (2016) concluded that the field of forensic firearm comparisons has not been demonstrated to be scientifically valid. Scientific validity requires adequately designed studies of firearm examiner performance in terms of accuracy, repeatability, and reproducibility. Researchers have performed “black-box” studies with the goal of estimating these performance measures. As statisticians with expertise in experimental design, we conducted a literature search of such studies to date and then evaluated the design and statistical analysis methods used in each study. Our conclusion is that all studies in our literature search have methodological flaws that are so grave that they render the studies invalid, that is, incapable of establishing scientific validity of the field of firearms examination. Notably, error rates among firearms examiners, both collectively and individually, remain unknown. Therefore, statements about the common origin of bullets or cartridge cases that are based on examination of “individual” characteristics do not have a scientific basis. We provide some recommendations for the design and analysis of future studies.
The next talk in our CTML Seminar Series is coming up on Oct 15! Join us for an engaging discussion led by Michael Rosenblum, Prof. of Biostatistics at Johns Hopkins Bloomberg School of Public Health. This talk will take place at 12:00PM at Berkeley Way West, 5th Floor, Room 5401.
09.10.2025 16:37 — 👍 2 🔁 0 💬 0 📌 0
T-shirt only: We’re reopening Professor Art Reingold’s commemorative t-shirt store for a limited time!
Celebrate his remarkable career and legacy with an exclusive t-shirt designed just for this occasion.
Link in bio to order!
The store will close by October 25th.
01.10.2025 17:56 — 👍 0 🔁 0 💬 0 📌 0
Don't miss the next session of the CTML Seminar Series on October 1st, where Wenxin Zhang will discuss "Efficient Statistical Estimation for Sequential Adaptive Experiments with Implications for Adaptive Designs." This talk will take place from 12:00PM-1:00PM at Berkeley Way West, 5th Floor, Room 5401.
Adaptive designs are increasingly used in clinical trials and online experiments to improve participant outcomes by dynamically updating treatment allocation based on accumulating data. At the end of adaptive experiments, it is often desirable to answer various causal questions based on the observed data. However, the adaptive nature of such experiments and the resulting dependence among observations pose significant challenges for providing statistical inference of causal estimands. Building upon the Targeted Maximum Likelihood Estimation (TMLE) literature that has provided valid statistical inference tailored to adaptive experimental settings, I will discuss our recent advance in efficient statistical inference under relaxed assumptions of adaptive experiments, with their implications for improving adaptive designs.
Don't miss the next session of the CTML Seminar Series on October 1st, where Wenxin Zhang will discuss "Efficient Statistical Estimation for Sequential Adaptive Experiments with Implications for Adaptive Designs." This talk will take place from 12-1 PM at Berkeley Way West, 5th Floor, Room 5401.
29.09.2025 22:57 — 👍 1 🔁 0 💬 0 📌 0
📢 Don’t miss this session today at the ASA Biopharmaceutical Section RISW 2025!
CTML Faculty Mark van der Laan will be presenting:
“The Causal Roadmap and Adaptive TMLE for ECT-Hybrid Designs”
📅 Sept 25
🕐 1:15PM – 2:30PM (EDT)
Be sure to add this to your agenda if you’re attending the conference!
25.09.2025 16:27 — 👍 0 🔁 0 💬 0 📌 0
Thank you to our outstanding speakers & panelists for sharing their expertise at this year’s Fall Visit for the Joint Initiative for Causal Inference (JICI)!
A special note of appreciation goes to CTML Leads, Maya Petersen and Mark van der Laan, for their leadership in making this event a success.
24.09.2025 16:07 — 👍 0 🔁 0 💬 0 📌 0
There’s still time to contribute to Professor Art Reingold’s legacy book!
📝 Please keep testimonials to 100 words or less.
📸 You may share a photo in your commemorative t-shirt, holding his graphic, or with Art.
📬 Submissions due by Sept 30: forms.gle/LY4T126cPjdq...
22.09.2025 17:41 — 👍 0 🔁 0 💬 0 📌 0
We are excited to announce that CTML faculty, students, and staff will lead roundtable discussions at the upcoming RISW (Regulatory-Industry Statistics Workshop) at the ASA (American Statistical Association) conference in Rockville, Maryland!
Date: September 25, 2025
Time: 11:45 AM – 1:00 PM (EDT)
18.09.2025 18:54 — 👍 0 🔁 0 💬 0 📌 0
Mark your calendars for September 17th! The CTML Seminar Series explores “Causal Inference via Electronic Health Record Data.” This exciting talk will be led by CTML's Postdoc Zachary Butzin-Dozier from 12:00PM-1:00PM at Berkeley Way West, 5th Floor, Room 5401.
15.09.2025 22:40 — 👍 0 🔁 0 💬 0 📌 0
📖 The Legacy of Art
Help honor Professor Art Reingold with a legacy book of memories.
📸 Share a photo with Art, in your t-shirt, or with his graphic + 📝 a testimonial by Sept 30 to this link: forms.gle/LY4T126cPjdq...
Every submission adds to the story of Art’s impactful career!
08.09.2025 20:42 — 👍 1 🔁 0 💬 0 📌 1
The Center for Targeted Machine Learning and Causal Inference (CTML) Seminar Series continues on September 10th! Join us for an exciting talk on "Machine Learning, Causal Queueing, and SiMLQ for Data Driven Simulation." Opher Baron, Professor of Operations Management at the Rotman School of Management, University of Toronto, will be presenting joint work with Zhenghang Xu, a fifth-year PhD candidate in Operations Management and Statistics. This talk will take place from 12:00PM-1:00PM at Berkeley Way West, 5th Floor, Room 5401.
The objective of this talk is to expose researchers to the vast possibilities of using modern machinery and data for implementing effective management analytics for queueing processes. Such processes are ubiquitous in modern economies, e.g., customers waiting for service, inventory waiting for processing/transportation, payments and invoices waiting to be generated/cleared, computing tasks waiting for resources. I will thus discuss recent developments in queueing analysis based on several papers. We will also see a demo of www.SiMLQ.com that demonstrates how to take this theory to practice.
The CTML Seminar Series continues on Sept 10! Join us for an exciting talk on "Machine Learning, Causal Queueing, and SiMLQ for Data Driven Simulation." Opher Baron will be presenting joint work with Zhenghang Xu. This talk will take place from 12PM-1PM at BWW, 5th Fl, Rm 5401.
04.09.2025 17:19 — 👍 0 🔁 0 💬 0 📌 0
Join us on September 3rd to kick off our Fall 2025 CTML Seminar Series! Adam Yala, Assistant Professor in Computational Precision Health, Statistics, and Computer Science (EECS), will start our series with his talk "AI for Personalized Cancer Care." This talk will take place at 12:00PM at Berkeley Way West, 5th Floor, Room 5401.
Early detection significantly improves outcomes across many cancers, motivating major investments in population-wide screening programs, such as low-dose CT for lung cancer. To make screening more effective, we must simultaneously improve early detection for patients who will develop cancer while minimizing the harms of overscreening. Advancing this Pareto frontier requires progress across three fronts: (1) accurately predicting patient outcomes from all available data, (2) designing intervention strategies tailored to risk, and (3) evaluating and translating these strategies into clinical practice. In this talk, I will present ongoing work across all three areas, driven by the goal of using every available bit of patient data to personalize care.
Join us on Sept 3rd to kick off our Fall 2025 CTML Seminar Series! Adam Yala, Assistant Prof. in Computational Precision Health, Statistics, and Computer Science (EECS), will start our series with his talk "AI for Personalized Cancer Care." This talk will take place at 12PM at BWW, 5th Fl, Rm 5401.
26.08.2025 15:40 — 👍 0 🔁 0 💬 0 📌 0