Maybe it's just YOUR testosterone that's low
How the measurement tools have led us to falsely believe our T is low
Figure 1. Don’t believe everything you see on TV(vitter).
If you’re on X (or worse, you’re a man on X) you have almost certainly seen the claim that global testosterone levels have declined. It’s everywhere, and people attribute it to microplastics, obesity, seed oils, and much else. I recently, and rather casually, contested that claim, arguing that the decline is more likely the result of a change in the way we measure testosterone. I’ll reiterate it here: I don’t think the huge drop in testosterone is the result of anything other than a change in how we measure stuff. Below is my defense of this position, focused primarily on the ways measurement has likely skewed the “testosterone crisis” that everyone on X is so worried about.
A brief note on how things are measured
If you’ve never worked in a lab, then it’s possible that you never thought about the mechanics of what happens when your blood sample is sent in for testing. After all, when has that ever mattered? Well, it turns out it matters quite a bit, and the standards in how we practice laboratory medicine have changed substantially over the years, owing primarily to upgrades in our methodology.
I’ll skip over the history and jump right into it. For the long stretch of time when much of our historical data on serum testosterone was collected, we were using a technique called the radioimmunoassay (RIA). Adapted from protein testing, this rather ingenious method (which won Rosalyn Yalow the 1977 Nobel Prize) requires three parts: 1) the thing you care about measuring, taken from human serum (the “cold” antigen), 2) a radiolabeled version of that thing (the “hot” antigen), and 3) an antibody against the antigen. These three things are mixed together, and the hot and cold versions of the antigen compete for binding to the antibody. The end readout is the ratio of bound to unbound radiolabeled hot antigen: the more cold antigen there is in the sample, the less hot antigen ends up bound. The technique borrows from how our immune system produces antibodies that recognize and bind features of infectious agents in order to neutralize them (that part is not important to our RIA). In this case, the feature being recognized, the antigen, is just the thing we want to measure, which in this article is testosterone.
Figure 2. A brief overview of how the RIA is used to measure testosterone (source)
Simple! Where things go wrong ultimately comes down to the reliability of how we prepare our parts. As we can now understand, RIA relies on an antibody tuned to recognize our molecule of interest, so it depends heavily on the quality and specificity of the antibodies used. These antibodies aren't perfect - they can sometimes bind to molecules that are structurally similar to testosterone, like other steroid hormones. This "cross-reactivity" can lead to falsely elevated results, a problem that is not limited to RIA but shared by any antibody-based quantification system. It has been shown to happen with basically any androgen, like DHEA, which means the old results were oftentimes inflated.
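To put rough numbers on cross-reactivity, here is a minimal Python sketch. It assumes a simple additive model in which each look-alike steroid adds its own concentration, scaled by a cross-reactivity fraction, to the reading. The function name, the model, and every number are mine and purely illustrative; none of them come from a real assay's spec sheet.

```python
def apparent_testosterone(true_t, interferents):
    """Reading from an antibody-based assay under an additive model.

    true_t: actual testosterone concentration (ng/dL).
    interferents: list of (concentration, cross_reactivity_fraction) pairs,
        with concentrations in the same units as true_t.
    """
    ghost_signal = sum(conc * frac for conc, frac in interferents)
    return true_t + ghost_signal


# Hypothetical values, chosen only to show the arithmetic (ng/dL).
true_t = 500.0
interferents = [
    (300.0, 0.02),  # look-alike steroid A, 2% cross-reactivity (assumed)
    (50.0, 0.10),   # look-alike steroid B, 10% cross-reactivity (assumed)
]

reading = apparent_testosterone(true_t, interferents)
inflation_pct = 100.0 * (reading - true_t) / true_t
print(f"true: {true_t:.0f} ng/dL, assay reads: {reading:.0f} ng/dL "
      f"(+{inflation_pct:.1f}%)")
```

The error only ever goes one way in this model: anything the antibody grabs by mistake gets added to the tally, never subtracted.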
The second issue, though less critical, comes down to the fundamental premise that the competition between labeled and unlabeled testosterone is equal and unbiased. The radioactive label itself can sometimes affect how the molecule binds to the antibody, creating subtle biases in the measurements. Plus, testosterone in blood doesn't travel alone - it's mostly bound to proteins. Different labs might handle these binding proteins differently during sample preparation, leading to inconsistent results between facilities.
This brings us to our third, and perhaps most important, contextual point to understand where the 1970s-90s RIA data goes wrong – the matter of standardization. The critical weakness of RIA lies in its dependence on lab-prepared components and relative measurements. Each lab would prepare their own radioactive testosterone, generate their own antibodies, and establish their own reference standards. Because the actual measurement is relative – comparing unknown samples to a lab-prepared standard – any errors in these components cascade into the final result. If a lab's standard testosterone preparation is off by 10%, all measurements from that lab will be off by 10%. With no rigorous quality control between facilities to ensure standards were equivalent, this created a patchwork of methodologies where accuracy was heavily dependent on each individual lab's technical expertise. In essence, historical testosterone measurements weren't absolute values, but rather estimations whose accuracy varied with each lab's procedures and technical skill.
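The "off by 10%" point can be sketched in a few lines of Python. This assumes a deliberately simplified one-site competition model, where the bound fraction of hot tracer is k / (k + concentration), and an imaginary lab whose standard stock is labeled 10% stronger than it really is. The constant `TRUE_K`, the function names, and all concentrations are hypothetical; real assays fit more elaborate curves.

```python
# Simplified one-site competition: the more cold testosterone in the tube,
# the less hot tracer stays bound. TRUE_K is a hypothetical constant (ng/dL).
TRUE_K = 200.0


def bound_fraction(conc, k=TRUE_K):
    """Fraction of hot tracer still bound at a given cold concentration."""
    return k / (k + conc)


def fit_k(nominal_concs, signals):
    """Fit the lab's curve using what the standards' labels claim."""
    estimates = [c * s / (1.0 - s) for c, s in zip(nominal_concs, signals)]
    return sum(estimates) / len(estimates)


def read_sample(signal, k_fit):
    """Invert the lab's fitted curve to turn a signal into a reported value."""
    return k_fit * (1.0 / signal - 1.0)


# Hypothetical lab: labels overstate the standard's real content by 10%.
label_over_actual = 1.10
nominal = [50.0, 100.0, 200.0, 400.0, 800.0]  # what the labels say
signals = [bound_fraction(c / label_over_actual) for c in nominal]  # what the tubes do
k_fit = fit_k(nominal, signals)

true_samples = [250.0, 500.0, 750.0]
reported = [read_sample(bound_fraction(t), k_fit) for t in true_samples]
ratios = [r / t for r, t in zip(reported, true_samples)]

for t, r in zip(true_samples, reported):
    print(f"true {t:.0f} ng/dL -> reported {r:.0f} ng/dL")
```

Every sample from this imaginary lab comes back about 10% high regardless of its true level, and nothing inside the lab would flag it, because the curve is internally consistent.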
The result? Without strict standardization between labs, and with the known issues of cross-reactivity in RIA testing, there's a strong possibility that historical testosterone measurements were artificially inflated. When antibodies cross-react with similar molecules, they register "ghost" signals that get counted as testosterone even though they're not. This means that studies from the RIA era might be systematically overestimating testosterone levels. As labs transitioned to more specific and accurate methods like mass spectrometry, these "ghost" signals disappeared, potentially explaining some of the apparent "decline" in testosterone levels over time. This is borne out when old-style RIA or CLIA (chemiluminescent immunoassay) results are compared head to head with modern mass spectrometry. I’ll just leave us with this quote from the study linked, which you can read more of if you like:
“There is significant variability in T measured with RIA, CLIA and MS. CLIA and RIA overestimated T levels in majority of patients leaving a concern of misdiagnosing truly castrate patients as being inadequately treated.”
All of this is to say that RIA was great for its time, but it was not a true, absolute measurement of what was happening: it relied on a standard that can sway the measurement, and it picked up things that were not testosterone in the tally. What we really want is something that gives an absolute quantification of molecules in a given volume. Good news is that we have that now. Welcome to the era of LC/MS!
How we measure stuff today, and why it’s better
Some experts (cough my colleague Anand Muthusamy cough) have pointed out that we measure stuff in a different way now, which is true! But if you don’t understand how this method differs, you wouldn’t be faulted for simply assuming it’s just a better version of the old way. Actually, the way things are done now is much, much better, because we are measuring the absolute number of testosterone molecules in a human serum sample for the volume provided, as opposed to extrapolating from a lab-prepared reference standard. You can now buy a machine, put your sample in, and simply get a result that is exact.
Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS, which I’ll just call LC/MS) became the way labs prefer to measure testosterone, and for good reason. Many studies have shown that these methods have lower limits of detection and wider linear ranges than immunoassays, making them more reliable, especially at low testosterone levels. But why?
Figure 3. A little figure showing the components of how LC/MS works (but you probably need to read something about it if you really want to understand)
First, the "LC" part – liquid chromatography – separates molecules in your blood sample based on their physical properties. Think of it like a very sophisticated filter that sorts molecules by their size and how they interact with the filtering material. This separation step is crucial because it helps isolate testosterone from other similar molecules that might interfere with measurement.
Second, we have the "MS" part – mass spectrometry – where the magic really happens. Mass spectrometry is essentially a molecular scale and counting machine. It converts testosterone molecules into charged particles (ions) and then measures both their mass and quantity. The mass measurement is incredibly precise – it can distinguish testosterone from any other molecule based on its exact molecular weight and fragmentation pattern. This means that it does NOT fall for the bait that immunoassays like RIA do, where similar-looking things are potentially tallied because antibodies are, as we call it in the business, promiscuous.
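To give a feel for what "exact molecular weight" buys you, here is a small Python sketch that computes monoisotopic masses from molecular formulas for testosterone and a few related steroids. The element masses are standard monoisotopic values rounded to eight decimals; the function names and the choice of steroids are mine, and looking only at the singly protonated ion is a simplifying assumption.

```python
# Monoisotopic masses of the most abundant isotopes (unified atomic mass units).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # mass added when a molecule picks up one proton

# Molecular formulas as element counts.
FORMULAS = {
    "testosterone": {"C": 19, "H": 28, "O": 2},
    "DHEA": {"C": 19, "H": 28, "O": 2},  # same formula as testosterone
    "DHT": {"C": 19, "H": 30, "O": 2},
    "estradiol": {"C": 18, "H": 24, "O": 2},
}


def monoisotopic_mass(formula):
    """Sum the isotope masses for every atom in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion (an assumed ionization mode)."""
    return monoisotopic_mass(formula) + PROTON


masses = {name: monoisotopic_mass(f) for name, f in FORMULAS.items()}
for name, mass in masses.items():
    print(f"{name:>12}: {mass:.4f} Da")
print(f"testosterone [M+H]+ m/z: {protonated_mz(FORMULAS['testosterone']):.4f}")
```

DHT and estradiol sit at clearly different masses, so the scale alone tells them apart from testosterone. DHEA shares testosterone's formula and therefore its mass, which is where the LC separation and the fragmentation pattern come in.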
When I say LC/MS gives us "absolute" numbers, here's what I mean: the machine is literally counting individual testosterone molecules that match this exact molecular fingerprint. There's no indirect measurement, no antibody binding, no relative comparison to a standard that might be off. If the machine counts 1000 testosterone molecules in your sample, that's exactly what was there. You then just divide that count by the volume of the sample to get a concentration, and the end readout reflects the testosterone actually circulating in your blood, no extrapolation needed. This is fundamentally different from RIA, where we were essentially making an educated guess based on how much radioactive testosterone got displaced from antibodies.
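For a sense of scale, the arithmetic linking a molecule count to the ng/dL figure on a lab report looks like this in Python. Avogadro's number is the exact SI value and the molar mass of testosterone is rounded to about 288.4 g/mol; the function names, the 500 ng/dL level, and the 0.1 mL sample volume are illustrative assumptions of mine.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
T_MOLAR_MASS = 288.4      # g/mol for testosterone (C19H28O2), rounded


def molecules_to_ng_per_dl(n_molecules, sample_volume_ml):
    """Convert a molecule count in a sample into a concentration in ng/dL."""
    grams = n_molecules / AVOGADRO * T_MOLAR_MASS
    nanograms = grams * 1e9
    deciliters = sample_volume_ml / 100.0  # 1 dL = 100 mL
    return nanograms / deciliters


def ng_per_dl_to_molecules(conc_ng_dl, sample_volume_ml):
    """Inverse conversion: how many molecules a given concentration implies."""
    nanograms = conc_ng_dl * (sample_volume_ml / 100.0)
    return nanograms * 1e-9 / T_MOLAR_MASS * AVOGADRO


# Hypothetical sample: 0.1 mL of serum at 500 ng/dL.
n = ng_per_dl_to_molecules(500.0, 0.1)
print(f"{n:.3e} molecules in the sample")
print(f"{molecules_to_ng_per_dl(n, 0.1):.1f} ng/dL recovered")
```

Note that the count gets divided by the sample volume to yield a concentration. Even a tenth of a milliliter at that level holds on the order of a trillion testosterone molecules.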
This direct counting ability, powered by precise molecular identification, is why LC/MS has become the gold standard. It's not just a better version of the old way – it's a completely different approach that gives us true, absolute measurements of testosterone levels. Again, this is recent technology! It wasn’t in routine clinical use in 1970, or 1990, the supposed golden years of testosterone. Because these instruments are expensive, it has taken labs time to acquire them and train people to use them, leading to some years of mixed use before what has likely become the broad adoption we see today (i.e., the “crater” decade from 2003-2011 in Figure 1).
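The "mixed use" years can be sketched as a toy model in Python: hold the true population average fixed, let the older assay read high by some fraction, and slide the share of labs using LC/MS from 0 to 100%. The 20% inflation, the 500 ng/dL average, and the adoption steps are hypothetical numbers picked for illustration, not estimates from any dataset.

```python
def reported_mean(true_mean, inflation, lcms_share):
    """Average reported value when labs use a mix of old and new methods.

    inflation: fractional overestimate of the old assay (0.20 means +20%).
    lcms_share: fraction of measurements made by LC/MS, from 0.0 to 1.0.
    """
    old_reading = true_mean * (1.0 + inflation)
    return (1.0 - lcms_share) * old_reading + lcms_share * true_mean


# Hypothetical scenario: biology never changes, only the measuring tool does.
true_mean = 500.0  # ng/dL, held constant throughout
inflation = 0.20   # assumed overestimate of the old assay
shares = [0.0, 0.25, 0.5, 0.75, 1.0]

series = [reported_mean(true_mean, inflation, s) for s in shares]
apparent_drop_pct = 100.0 * (series[0] - series[-1]) / series[0]

for share, value in zip(shares, series):
    print(f"LC/MS share {share:.0%}: reported mean {value:.0f} ng/dL")
print(f"apparent decline: {apparent_drop_pct:.1f}%")
```

In this toy setup the reported average slides steadily downward and ends roughly 17% lower, even though nobody's actual testosterone moved.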
Final thoughts on why the measurement matters
Plenty of people in my replies have pointed to other markers, like low sperm count or taint size (no comment), and that’s possible! I really just want to address the primary claim that we’ve seen a meteoric crash in our testosterone levels of up to 30%, as seen in the Xeet (tweet?) I responded to, because I think the technological progress we see all around us also extends to the way we quantify stuff. My argument isn’t about whether things are happening to virilize or feminize us (which I’m kind of dubious about too but, again, not getting into it). It’s really that I just don’t think that’s the primary thing we’re observing when we compare modern testosterone counts to those of the old days. What I actually think is going on, and what I think the mechanics of our tests strongly suggest, is that the numbers back then were just inflated, that’s all. Progress doesn’t just give us iPhones, it also gives us better measurement tools! Examining the mechanics shows this to be the likely cause of our low-T panic, and if this is true for testosterone, I’d love to see what folks uncover about other things that are supposedly way higher or lower than in the past.
I mean, who is to say that the global taint ruler is not actually just different than it was in the past? Maybe our ruler making just got better 🙂