Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
High-quality training data has proven crucial for developing performant large language models (LLMs). However, commercial LLM providers disclose few, if any, details about the data used for training. ...
Want to know what training data has been memorized by models like GPT-4?
We propose information-guided probes, a method to uncover memorization evidence in *completely black-box* models,
without requiring access to
Model weights
Training data
Token probabilities 🧵 (1/5)
21.03.2025 19:08
My friends are organizing a new workshop bringing together NLP and CSS research with psychology: First Workshop on Integrating NLP and Psychology to Study Social Interactions (NLPSI) at ICWSM 2025: nlpsi-workshop.github.io. Consider submitting a paper (long/short) or an extended abstract :)
24.02.2025 10:20
Imagine if @aoc @sanders.senate.gov , every dem, SIMULTANEOUSLY held Town Halls where they allowed grant and contract recipients to explain to the country what it is they do and why it's important
Invite all media. Including RW podcasters. @spaces Flood the zone
Call it a Day Of Transparency
11.02.2025 18:51
"I, too" by Langston Hughes
10.02.2025 17:01
Alumni too! Call/email your schools and tell them that they have a legal and ethical responsibility to protect student data. I'm still paying OSU and they should be safeguarding my data from third parties.
04.02.2025 14:36
Is this based on fear from politicians cutting funding or are some of these "scholars" being emboldened to show their true colors?
06.12.2024 12:42
I firmly believe that there needs to be more conversation about ethical AI development and less about "SkyNet"-like scenarios.
30.11.2024 12:56
A statistical approach to model evaluations
A research paper from Anthropic on how to apply statistics to improve language model evaluations
These seem like some very common (statistical) sense recommendations for eval. Still, the title of the paper is somewhat confusing, as if these ideas are new. I would've suggested "Bringing a statistical approach…"
www.anthropic.com/research/sta...
22.11.2024 10:39
I can already feel my timeline getting smarter and happier
21.11.2024 19:04
Hello all, DeBoris here. I work on ML models in the healthcare space. Full-time blerd, husband and father.
06
19.11.2024 13:24
This app feels like 2009. I hope it stays that way
18.11.2024 13:22
Reposting all the latest tweets by NBA news insider Shams Charania - plus other notable NBA news posts.
(This is a bot, to be clear. Posts in brackets & reposts are by manager.)
• Automated by: @beastmoderocco.com
• Managed by: @velodus.bsky.social
National Insider for NFL Network and www.nfl.com. Made seven cameos in the movie Draft Day.
Inquiries: shahob@william-raymond.com
https://link.me/rapsheet
DC Studios co-CEO, #Superman, #GuardiansoftheGalaxy, #Peacemaker, dog owner, husband, servant to my cat.
✨ Exploring ideas and building connections
Passionate about equality & human potential
Spreading love, one post at a time. Vote.
Retired IT Director with mild Cerebral Palsy advocating for those with Developmental Disabilities. NO telegram 🚫WA 🚫signal 🚫Crypto 🚫Illuminati
https://youtube.com/shorts/nBGae2ukuUE?si=qKol-OJdCXQbd1CDq
#FATP #TeamJustice #JusticeMatters
AuDHD. Swedish. Would defend Canada & Denmark from Trump/Musk if I could.
Mi Deh Yah! 🇰🇳🇯🇲 | CosplaYUH | Artizté | Gamer | Poet | Gen Z Saiyan Queen 00 | aka Muchaki Hyuzu | THEKAZUYALEE | THE SUPER SAIYAN GYAL ‼️
Instagram: @itzkazuyah
🎮PSN: QueenKazuya ¯\_(x_T)_/¯
onlyfans.com/itzkazuyah
My Links🫴🏾 linktr.ee/QueenLiaLee
Baltimore bred est. 88
Morgan State University
University of Baltimore - [1906]
@CBVPhotography 📸
@SharpShooterDefenseSolutions
Senior research scientist at the Global Modeling and Assimilation Office, NASA GSFC and GESTAR-II, Morgan State University
Jared A. Ball is a Professor of Communication and Africana Studies at Morgan State University in Baltimore, MD.
https://linktr.ee/jaredball
Mom, Data Science/Machine Learning/Deep Learning/NLP, Teaching Fellow @ Harvard, Kaggle Competition Master https://kaggle.com/rashmibanthia
✨ Try customized Kaggle Feed ➡️ https://bsky.app/profile/did:plc:mbjdssrrxzllb2g2rq7px4pq/feed/aaadft2egbdoe
Cortical surface modelling and interpretable/explainable #AI, geometric deep learning #neuroscience. Open science: HCP, dhcp, UKBiobank
PhD student at JHU. @Databricks MosaicML, Microsoft Semantic Machines/Translate, Georgia Tech. I like datasets!
https://marcmarone.com/
Final year Ph.D. candidate in NLP, CV at JHU. Researching reasoning systems, multimodality, and AI for science. On the job market for full-time industry positions! #NLProc
https://katesanders9.github.io/
Applied Scientist @ Amazon
(Posts are my own opinion)
Previously PhD@JHU
PhD Student at Johns Hopkins University. Previously: Allen Institute for AI, Apple, Samaya AI. Research for #NLProc #IR
PhD student at Johns Hopkins University
Alumni from McGill University & MILA
Working on NLP Evaluation, Responsible AI, Human-AI interaction
she/her 🇨🇦
(she/her)
¯\_(ツ)_/¯
PhD student @jhuclsp | Prev @IndiaMSR
PhD student @JHU CLSP
hstehstehste.github.io