take me seriously
23.10.2025 16:00
@betanalpha.bsky.social
Zealous modeler. Annoying statistician. Reluctant geometer. Support my writing at http://patreon.com/betanalpha. He/him.
An entire ensemble of experiments implemented in the same way would behave even more weirdly, and unless someone recognized the poor use of pseudo-random numbers people might end up chasing down red herrings.
23.10.2025 02:23
In other words when using parallel pseudo-random number generator sequences from different seeds there would be no way to diagnose the failure of the randomized assignments, at least not without running the experiments over and over again.
23.10.2025 02:23
The issue is that for most pseudo-random number generators the sequences generated from two different seeds can have arbitrary correlations. In the randomization design example the assignments would look "random" superficially but not actually ensure the expected randomization outcomes.
23.10.2025 02:23
No disagreement from me that applied science is a mess, but I maintain that the "use random seeds" heuristic doesn't actually solve anything. If anything it just obfuscates problems even further.
23.10.2025 02:23
"Don't use environments as ordinary data structures, but also environments are the only base data structures implemented with a hash table and get pass-by-reference semantics". Yup, checks the box.
23.10.2025 02:11
So while not refreshing the seed every time clearly doesn't implement the randomized design, refreshing the seed and drawing from the new pseudo-random number generator state doesn't either.
23.10.2025 00:38
In order to accurately implement a randomized design one would need to pull assignments from a single pseudo-random number generator sequence (for almost any seed). Running multiple, independent pseudo-random number generators with different seeds doesn't generally guarantee the desired randomness.
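A minimal Python sketch of pulling assignments from a single sequence, assuming independent coin-flip assignments. The function name `assign_treatments` and the seed value are hypothetical, chosen only for illustration.

```python
import random

def assign_treatments(n_individuals, seed):
    """Draw every assignment from one pseudo-random number generator sequence."""
    rng = random.Random(seed)  # seeded exactly once
    # Each individual gets the next draw from the same sequence.
    return ["treatment" if rng.random() < 0.5 else "control"
            for _ in range(n_individuals)]

# The anti-pattern would instead construct a new random.Random(new_seed)
# inside the loop, once per individual.
assignments = assign_treatments(20, seed=8675309)
print(assignments)
```

Reporting the seed is enough for someone else to reproduce the exact same assignments.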
23.10.2025 00:36
As in you seeded a new pseudo-random number generator to generate the assignment for each individual? If so then I would argue this is an example of using pseudo-random numbers incorrectly.
23.10.2025 00:36
I mostly limit threads like these to social media because I prefer to focus my chapters on what to do rather than what not to do.
23.10.2025 00:32
I also posted this thread over on patreon dot com, patreon.com/betanalpha, but it requires a free membership to access. Otherwise most of this is in the Monte Carlo chapter to which I linked at the end of the thread.
23.10.2025 00:32
Due to some very generous people a few sponsored registrations are available, but they usually go fast so don't hesitate to reach out and inquire. Eligibility details can be found at betanalpha.github.io/courses/.
22.10.2025 14:46
I'm seeing some misinformation about pseudo-random number generator best practices going around the internets. Let's talk about why the pseudo-random number generator seed you use shouldn't actually have any impact on your results and, consequently, you can choose whatever seed you damn well please.
22.10.2025 19:06
silent hill with a shen yun billboard added
wow these silent hill games are getting more realistic
22.10.2025 15:48
If you want to read more then check out Section 2 of my Monte Carlo chapter, betanalpha.github.io/assets/case_.... For even more detail I really like Melissa O'Neill's writing, www.pcg-random.org.
22.10.2025 19:06
That said it will always be more productive to understand the method you are using and how it can be engineered to ensure strong estimation performance in the first place.
22.10.2025 19:06
Moreover it doesn't introduce any harm provided you don't try to do something foolish like average the results together...
22.10.2025 19:06
Adding a little bit of robustness by running an analysis multiple times with different seeds as a check, just to see if the results are consistent, is a great way to identify potential estimator issues.
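A rough Python sketch of such a check, assuming a toy Monte Carlo estimate of E[x^2] under a standard normal (whose exact value is 1). The function name `mc_estimate` and the seed values are illustrative assumptions, not anything from the chapter.

```python
import math
import random

def mc_estimate(seed, n_draws=10_000):
    """Monte Carlo estimate of E[x^2] for a standard normal, with its standard error."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_draws)]
    mean = sum(values) / n_draws
    var = sum((v - mean) ** 2 for v in values) / (n_draws - 1)
    return mean, math.sqrt(var / n_draws)

# Rerun the same analysis under a few arbitrary seeds and compare.
results = {seed: mc_estimate(seed) for seed in (1, 20251022, 8675309)}
for seed, (estimate, std_err) in results.items():
    print(f"seed={seed}: {estimate:.3f} +/- {std_err:.3f}")
```

Each run is inspected on its own. If the estimates disagree by much more than their standard errors then the estimator, not the seed, deserves the scrutiny.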
22.10.2025 19:06
Incidentally the same goes for the range of values for a seed. Modern pseudo-random number generator state spaces are so unfathomably large that any two-digit integer is equally as uncharacteristic as any nine-digit integer.
22.10.2025 19:06
A common seed for every analysis is fine. A heuristic for changing the seed from analysis to analysis, say based on the current date or time, is fine.
22.10.2025 19:06
All of this is to say that any method for choosing a seed is equally adequate provided that seed is reported and the resulting pseudo-random number generator output is used properly.
22.10.2025 19:06
Different pseudo-random number generator seeds resulting in different Markov chain exploration which results in different Markov chain Monte Carlo estimates is not a problem with the pseudo-random number generator seeds but rather with the Markov chain Monte Carlo algorithm itself!
22.10.2025 19:06
With a multi-modal target distribution any finite Markov chain might explore only part of the target distribution, resulting in formally inaccurate estimates.
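To make this concrete, here is a toy Python sketch assuming a random walk Metropolis sampler and an equal mixture of two well-separated unit normals as the target. The sampler, the target, and the names `log_target` and `chain_mean` are all illustrative assumptions.

```python
import math
import random

def log_target(x):
    """Log density (up to a constant) of an equal mixture of N(-10, 1) and N(10, 1)."""
    a = -0.5 * (x + 10.0) ** 2
    b = -0.5 * (x - 10.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def chain_mean(seed, n_iter=5_000):
    """Run random walk Metropolis and return the Markov chain Monte Carlo mean estimate."""
    rng = random.Random(seed)
    x = rng.uniform(-15.0, 15.0)  # the seed also sets the initialization
    total = 0.0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, 1.0)
        log_ratio = log_target(proposal) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        total += x
    return total / n_iter

estimates = {seed: chain_mean(seed) for seed in range(1, 7)}
for seed, estimate in estimates.items():
    print(f"seed={seed}: mean estimate {estimate:.2f}")
```

The exact mean of this target is 0, yet each finite chain in this setup settles into whichever mode it reaches first and reports a mean near -10 or 10. The seed only determines which wrong answer shows up.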
22.10.2025 19:06
Markov chain Monte Carlo, however, is not always well-behaved. Many Markov chain Monte Carlo algorithms struggle with more complicated target distributions; multi-modality for instance is a particularly problematic feature.
22.10.2025 19:06
If Markov chain Monte Carlo is well-behaved then the Markov chains will converge and produce consistent estimates regardless of the precise sequence of pseudo-random numbers that were used.
22.10.2025 19:06
For a final example let's consider Markov chain Monte Carlo. Most Markov chain Monte Carlo algorithms rely on pseudo-random number generators to fuel their exploration and, often, also initialize the starting values.
22.10.2025 19:06
Sure an unscrupulous person can take advantage of cross validation fragility to hunt for a seed that yields better results, but again this is an estimator problem not a pseudo-random number generator seeding problem.
22.10.2025 19:06
In this case some pseudo-random number generator seeds will result in better outputs than others, but in practice we won't know which seeds will result in better fluctuations than others and so the choice of seed still has no practical consequence.
22.10.2025 19:06
The problem is that these estimators are not always, dare I say not often, well-behaved. Fragile estimator performance is only magnified when not using enough splits, especially when the data set at hand is too small to allow for sufficiently many splits.
22.10.2025 19:06
Well this is really just a way to estimate a particular predictive expectation value. If the estimator is sufficiently well-behaved then any sequence of splits should result in consistent estimates.
22.10.2025 19:06