My gut as a physician says AI will revolutionize primary care the most and the premise of AI having more specialized knowledge than a PCP holds water - I think the future of primary care is AI equipped midlevel providers - so I was very excited by the intro only to be let down. A lot of buzz about nothing sadly.
The only takeaway here is logging helps detect patterns, we already knew that.
For simple things, if an AI (combined with blood tests) can do what the doctor can do, then technically I don't need the doctor. I also don't need the friction that comes in working with a doctor, considering the AI is close to frictionless.
> ... "considering the AI is close to frictionless" ...
I feel like this bit is more of a problem than many folks realize. I know that people see "frictionless" as a desirable trait, but for many of the things folks are using AI for these days, you're gonna want "push-back" sometimes. Especially in cases of health care and computer programming, there are times when the user is straight-up wrong about what they're trying to do, say, or believe, and the last thing you wanna hear when your life or job or the lives or jobs of others are on the line and you're about to make a huge mistake is "You're absolutely right! Let me get right on doing that for you!"
You're absolutely right! What I do is have the LLM rank and score candidates for me given my situation. This helps me focus on what's more likely versus less likely. Blood tests are not free; they're in fact fairly expensive in aggregate, so this filtering has been essential.
That’s the goal for doctors too. It would be great to get simple things off the system.
But I think the more realistic intermediate step is a trained person cheaper than a doctor - nurse, PA, etc - aided by AI. The current generation of agentic AI doesn’t seem to be there yet and is too agreeable from RL.
“you’re probably fine sleep it off combined with: drink more water, eat healthier, exercise more, sleep better, consume less alcohol and quit smoking/vaping” +/- “we’ll check some labs to make sure” is the correct answer for probably 95%+ of encounters so it’s not hard for an automated system to handle most simple things, even without AI.
Title: This state is testing out AI doctors—and actual doctors aren’t happy about it (Wall Street Journal)
New York state residents get the worst of it with the state bending over backwards to take away the people's rights.
Also, let's not confuse independence with AI. Independence is where I can do some extra bloodwork and diagnostics to satisfy my well-being. I don't need an AI limiting what I investigate either.
1. Having concerns over unvetted AI is not the same thing as running an exclusive protectionist racket.
2. Doctors who depend on revenue from forced visits to renew prescriptions are grifters and not close to a majority. The profession obviously isn’t perfect but the vast majority of physicians I know would be happy to lighten their rosters.
With that said it can’t be a system that creates problems and dumps them back on physicians to fix, hence #1
Point #1 is nonsense because the ratio of problems eliminated to those created will easily be 99:1. I estimate this ratio is closer to 50:50 for human doctors.
Ultimately, it should be about personal freedom. This is not a contagious disease we're dealing with here.
It’s called evidence based medicine for a reason. Medicine has long since moved away from making decisions based on whether a person thinks it will be better.
> Ultimately, it should be about personal freedom. This is not a contagious disease we're dealing with here.
The state regulates health and medical devices for a reason. Look at what’s happening with all of these prescription mill apps these days which are still theoretically overseen by a licensed healthcare professional - plenty of medical errors and harm in the name of increasing access (a good thing, when done well). We’ll see criminal investigations within a year.
> The brain is encased in bone, so you might get some penetration but it will be very limited.
Radiologist as well. Remember this is full wave inversion, not pulsed wave B mode. You can get much more useful information from both high and low frequencies, and by capturing transmitted waves.
There is promise with this and we use it for example with MRgFUS. With advanced computational models or patient specific CT/ZTE MR aberration correction it is theoretically very feasible to image the brain with ultrasound, whether that’s more useful than say portable low field strength MR is a different question altogether.
> This is cool, but ultrasound is not CT.
Not to be pedantic but since this is a tech forum I would clarify that FWI US is computed tomography by definition (at least in this and many applications). Gas degrades conventional CT too, it’s just worse with US as you have little to no forward propagation and of course innumerable interfaces in the lungs to reflect and scatter.
Yes, I was thinking about FUS as well! There are clearly ways to penetrate bone, but I have not seen it used for imaging, only for ablation. But I am not an expert there and it sounds like you have more knowledge in that area than I do.
What's the use of FWI for a moving target - which someone floating in a spa most definitely will be?
This is a fever dream, not a practical diagnostic technology.
I can imagine it being useful if you happened to own an imaging product and wanted to train a model on human body shapes and sizes.
But dunking people in a golden spa - why golden? why not pink or teal? - and waving a ring of US over them, at distances which make high resolution impractical, to tell them their medical fortunes is... eccentric.
I'm being scathing because there's no solid science behind this, and if there had been solid science MJ would have published it in a paper.
This is pure marketing. And if you're going to do that, you should at least include some dolphin petting.
Full wave inversion uses all of the information from the wave and more intense computational tomography to image structures that pulse wave B mode cannot, though gases are still a problem. Computationally, if you squint, it's similar to the work Midjourney does with AI image generation, as it progressively generates a structure that fits the data.
Ultrasonic waves can penetrate most structures in humans, including the brain. For example, with focused ultrasound (as they mentioned with MRgFUS) you can burn specific structures in the middle of the brain without any incision.
To use this for imaging, you need lots of transducers (MRgFUS typically uses 1024 for ablation, and Midjourney is proposing 358,000 for imaging) and massive advances in computational tomography capabilities. There will still likely be pockets of low confidence where there's a lot of air, like in the lungs. But with sufficient information on what's happening around those areas, you'd still have something that's medically useful.
Ultrasound researcher here: this work is almost certainly not full wave inversion. You're right that FWI can be done in this kind of setup. Notably, a lot of people are trying to use circular arrays of ultrasound sensors to do FWI, particularly in the brain or the breast. But, it's a challenging inverse problem to solve and as far as I know, it's still extremely slow - nowhere near realtime.
The data they show is almost certainly a normal beamforming algorithm like delay and sum, possibly with some simple speed of sound correction. The most similar paper I know of is here in Nature Biomedical Engineering from a team at Caltech. https://www.nature.com/articles/s41551-026-01660-4
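For anyone unfamiliar with delay and sum, here is a toy sketch of the idea, not the pipeline from the post or the linked paper. It assumes a receive-only setup where a point source emits at t = 0, a linear array at depth 0, and a uniform speed of sound of 1540 m/s; the function names, geometry, and sampling rate are all made up for illustration.

```python
import math

def simulate_point_source(element_x, source, c, fs, n_samples):
    """Toy data: each element records a unit impulse at the one-way arrival time."""
    signals = []
    for x in element_x:
        dist = math.hypot(source[0] - x, source[1])
        trace = [0.0] * n_samples
        idx = round(dist / c * fs)
        if idx < n_samples:
            trace[idx] = 1.0
        signals.append(trace)
    return signals

def delay_and_sum(signals, element_x, pixel, c, fs):
    """For one pixel, pick each element's sample at its time-of-flight delay and sum."""
    total = 0.0
    for x, trace in zip(element_x, signals):
        dist = math.hypot(pixel[0] - x, pixel[1])
        idx = round(dist / c * fs)
        if 0 <= idx < len(trace):
            total += trace[idx]
    return total

# Hypothetical geometry: 33 elements at 1 mm pitch, source 3 cm deep.
element_x = [i * 0.001 for i in range(-16, 17)]
c = 1540.0   # m/s, assumed uniform (this is the part speed of sound correction refines)
fs = 40e6    # samples per second, assumed
source = (0.0, 0.03)

signals = simulate_point_source(element_x, source, c, fs, 4096)
at_source = delay_and_sum(signals, element_x, source, c, fs)
off_source = delay_and_sum(signals, element_x, (0.005, 0.03), c, fs)
print(at_source, off_source)
```

The delays line up only at the true source position, so the sum peaks there. The whole thing is one pass over the data per pixel, which is why it is fast compared with solving an iterative inverse problem like FWI.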
This is ridiculously optimistic. The technology, USCT with full waveform inversion, is not new.
It’s already used in breast imaging (SoftVue) and hasn’t replaced mammography. A body part ideally suited for ultrasound.
More compute may minimize some of the fundamental limits of sound waves (bone and gas) but I would be shocked if they have useful images of 90% of the body parts we image with CT or MRI and even beyond that I question how much more useful it is than B-mode anyway.
It’s also quite slow, which means most things in the abdomen and chest will be motion degraded.
This may be useful in superficial areas but then why do whole body anyway. Might be some new niches and interesting research but hardly revolutionary in my opinion.
You’re assuming a diagnostic test can be designed for 100% accuracy and this is not possible as disease states are spectrums not discrete categories.
“Normal ranges” in lab values are just intervals that cover most (typically the central 95%) of a reference population, which by definition means that some normal people will have abnormal values and some patients with a disease will have normal values.
The same is true for imaging. For example we use size criteria a lot. There is nothing different about 4.1 cm adrenal nodules and 3.9 cm nodules to explain why the former gets surgery and the latter gets called benign other than pre-test probability and acceptable false positive and false negative rates, whether this is measured by a human or AI.
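The point about normal ranges can be put in numbers with a quick sketch. It assumes the usual convention that a reference range covers the central 95% of a healthy reference population, and it also assumes the tests are independent, which real lab panels are not, so treat the result as a rough illustration only. The function name is made up.

```python
def prob_at_least_one_flag(n_tests, coverage=0.95):
    """Chance a healthy person has at least one out-of-range result.

    Assumes each test's reference range covers `coverage` of healthy people
    and that test results are independent (a simplification).
    """
    return 1.0 - coverage ** n_tests

print(prob_at_least_one_flag(1))    # about 0.05
print(prob_at_least_one_flag(20))   # about 0.64
```

So under these assumptions, a healthy person getting a 20-test panel is more likely than not to have something flagged, without any measurement error at all.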
Eventually, diagnostic systems (whether AI or human+AI) will significantly outperform current human doctors.
If humans have different normal ranges, then the tests will be specific to the individual, based on their health history, DNA, tissue simulation in digital environment, etc. If adrenal nodules of similar diameter behave differently, then the tests will inspect more than just diameter.
The data to make the correct diagnosis is out there, we just don't have the tools or processing power to use it yet.
You’re loosely alluding to personalized medicine, but what you are envisioning is a very futuristic state that we are very slowly moving towards. What you suggest is great, but we are a few decades and several technological breakthroughs as well as new discoveries away from what you are talking about.
DNA is increasingly used in oncology, but is difficult to interpret elsewhere and in many tumors is not insightful.
> The data to make the correct diagnosis is out there, we just don't have the tools or processing power to use it yet.
Maybe, but we don’t know what or how to measure it.
> If adrenal nodules of similar diameter behave differently, then the tests will inspect more than just diameter.
Everything investigated so far, such as biopsies with histology, MR spectroscopy, and measuring the diffusivity of water molecules, has not been reliable in differentiating benign from malignant nodules, so we still use size. These are nontrivial problems. There are technical limitations to our measuring tools.
I believe these are very difficult but not impossible problems. There are technical limitations to our measuring tools, but I am optimistic that future medical advancements (maybe far in the future) will provide ways of measuring and diagnosing that may seem like science fiction today.
I am not attempting to trivialize the work that medical professionals do, or fault doctors for being fallible. I am attempting to encourage the development of medical technology to cut down on what I perceive to be a high rate of misdiagnosis (10-15%).
> There's a term I dislike but is apt: medical misogyny. Basically it's, "systemic, conscious, or unconscious gender biases [which] affect how a patient is treated by the healthcare system."
This is a loaded UK-centric policy/humanities term and I would suggest using sex/gender disparities instead which does not imply animus and is therefore much more useful for productive discussion.
Implicit and systemic biases in medicine are very real and supported by ample data.
> Systemic in particular is that basically the vast amount of knowledge amassed in the medical sciences has come from studying men. Comparatively little for those not assigned male at birth.
At least for the US this hasn’t been the case in clinical research for the past 15 years or so which in aggregate leans a bit more female than male if anything. Some specific fields still have sex disparity in clinical research for a variety of reasons but that’s the minority these days.
> This is a loaded UK-centric policy/humanities term
Yes, the implication of animus is the chief reason for my dislike of the term. The main failing of most alternatives is they don't roll off the tongue as easily or succinctly.
> this hasn’t been the case in clinical research for the past 15 years or so which in aggregate leans a bit more female than male if anything.
Oh yes, I didn't mean to imply the situation isn't improving (and an overcorrection in research at this point in time is probably a good thing, IMO (if it is in fact happening, which I struggle to believe (but that's my issue))).
The body of knowledge in medical science is a lot older than 15 years though, so I would think it will take a lot of time and effort to equalise.
Thanks for your response. I found it constructive and informative to my own thinking.
It’s actually quite a lot worse than even doctors in training except for highly constrained experimental settings and a few very nice applications that are mostly too tedious/impractical for a human to do or are very basic detection tasks.
I am a radiologist and researcher predominately focused on AI.
I work with pathologists and radiology is way ahead of us with AI use in clinical settings (but still not very far). The only things that get serious use are lab-developed (i.e. not commercial) image analysis algorithms for very limited (tedious, error-prone and ultimately not that often used) biomarkers. Don't believe the hype.
You could also look at the market, one of the biggest players, Paige, was acquired for about 30% of the money they raised.
I don’t think so, not beyond the current trend in medicine which is going up anyway.
For some things, like 3D volume segmentation of structure or disease (e.g. CVA/stroke volume, cardiac muscle mass, iron quantification) the bottleneck is the time it takes so we currently use approximations like single longest dimension, circular regions of interest, etc. AI will dramatically increase accuracy allowing for more accurate treatment and easier large scale research with quantitative endpoints.
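To show why a single longest dimension can be a poor stand-in for a segmented volume, here is a toy comparison. The lesion is a made-up ellipsoid with semi-axes of 2, 1 and 1 cm on a 1 mm voxel grid; nothing here comes from a real scan or a real segmentation tool, and the function names are hypothetical.

```python
import math

def sphere_volume_from_diameter(d_cm):
    """Approximation: treat the lesion as a sphere with its longest diameter."""
    return math.pi * d_cm ** 3 / 6.0

def ellipsoid_voxel_volume(a, b, c, voxel):
    """Stand-in for segmentation: count voxels whose centers lie inside the ellipsoid."""
    def centers(semi):
        n = math.ceil(semi / voxel)
        return [(i + 0.5) * voxel for i in range(-n, n)]

    xs, ys, zs = centers(a), centers(b), centers(c)
    count = 0
    for x in xs:
        for y in ys:
            for z in zs:
                if (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2 <= 1.0:
                    count += 1
    return count * voxel ** 3

segmented = ellipsoid_voxel_volume(2.0, 1.0, 1.0, voxel=0.1)   # cm^3
approximated = sphere_volume_from_diameter(4.0)                # cm^3
print(round(segmented, 2), round(approximated, 2))
```

For this elongated shape the sphere approximation comes out about four times larger than the voxel count, which is the kind of gap that matters when volume is a quantitative endpoint.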
Other things people think of like detection of aneurysms, fracture, lung nodules are not “hard” but AI has already added and will continue to add the second-reader benefit which will reduce detection errors. For this category the clinical benefit is as of yet unclear and we know that increased detection does not necessarily translate into improved patient outcomes and can in fact make them worse from over-diagnosis which means investigation related harms and over-treatment.
We were already in a phase of “over detection” in much of radiology with advances in imaging technology so the incremental benefit of current AI remains to be seen and I personally think is going to be much more limited. I had a case recently where a 2 mm brain aneurysm was missed on 3 CT scans over 10 years but was picked up by AI so now is being followed annually. This is too small to treat considering the risks and a serious argument could be made that 10 years of stability is proof enough that this is almost certainly clinically irrelevant for this patient.
Far more interesting areas of AI in imaging are in acquisition acceleration (i.e. the medical equivalent of upscaling), which can dramatically decrease costs and increase accessibility, as well as in analyzing imperceptible features.
It may not be a popular take here but in my opinion the future of radiology is like what we see in software engineering today - a skilled human equipped with AI will outperform humans without AI and AI without humans, the latter of which we are still several years away from prototyping due to various technical hurdles.
> in my opinion the future of radiology is like what we see in software engineering today - a skilled human equipped with AI will outperform humans without AI and AI without humans
I suspect this will be the case across the board. It's a useful tool, but it's just a tool. It's not a replacement.
A friend of mine, a dermatologist, told me that LLMs are quite performant for melanoma analysis. Based on their own statistics, LLMs are able to beat humans with ~10 years of experience in the field.
They will never beat the human instinct tho, but they can be great tools sometimes. Unfortunately, LLMs mostly produce garbage.
Whenever it comes to medical diagnosis I would caution anyone to be careful with what “beat humans” really means.
In real life pathology is a spectrum, not a binary, and physicians are not trained to be 100% accurate; instead they optimize sensitivity and specificity, considering pretest probability as well as the harms of overdiagnosis and underdiagnosis for a given scenario.
For something like melanoma which is relatively easy to diagnose with a superficial, extremely low risk skin biopsy and where early staging dramatically improves outcomes you would want to design around overcalling (high sensitivity) rather than maximize accuracy given the significant harms with false negatives and minimal harms with false positives.
An AI may be more accurate at classifying melanoma/not melanoma, but if it does not meaningfully improve on the clinical threshold of biopsy/no biopsy or result in fewer biopsies, that accuracy is wasted and may even be detrimental.
Note: I am just using this as an example to illustrate the considerations.
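A small sketch of the accuracy versus sensitivity trade-off described above. Every number is invented for illustration (2% prevalence among 10,000 suspicious lesions, and two made-up operating points); none of it is real melanoma performance data.

```python
def operating_point_summary(sensitivity, specificity, prevalence, n_lesions):
    """Expected counts for a classifier used as the biopsy/no biopsy trigger."""
    diseased = prevalence * n_lesions
    healthy = n_lesions - diseased
    true_pos = sensitivity * diseased
    false_neg = diseased - true_pos
    true_neg = specificity * healthy
    false_pos = healthy - true_neg
    return {
        "accuracy": (true_pos + true_neg) / n_lesions,
        "missed": false_neg,                       # melanomas not biopsied
        "biopsies": true_pos + false_pos,          # everything flagged positive
        "ppv": true_pos / (true_pos + false_pos),  # share of biopsies that are melanoma
    }

# Hypothetical operating points, assumed purely for illustration.
overcaller = operating_point_summary(0.98, 0.70, prevalence=0.02, n_lesions=10_000)
accurate = operating_point_summary(0.85, 0.95, prevalence=0.02, n_lesions=10_000)

for name, result in (("overcaller", overcaller), ("accurate", accurate)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

With these invented numbers the second operating point wins on accuracy and needs far fewer biopsies, but it misses several times as many melanomas, which is why "beats humans on accuracy" says little until you know where the threshold sits.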
The better question is are there any sources that AI is better than human readers? I haven’t heard anyone make this claim outside of single/few disease classification tasks and even those are mostly 2D.
Anecdotally, my practice has most FDA approved AI deployed as we are an evaluation site and very rarely is the AI result useful. Over the past few months we have been cancelling contracts as these cost quite a lot of money (in some cases eating >50% of the study interpretation cost) for little to no benefit and a LOT of noise.
I think you’re overstating the impact of interpretability here. Your earlier points, that latent reasoning models can’t be trained very well and that discretization may be load bearing rather than a readability tax, together with significant inference infra hurdles (e.g. batching, speculative decoding), have limited any serious attempts and reduced the theoretical advantage over CoT, at least in the near term.
> I think you’re overstating the impact of interpretability here
Outside of RLAIF, interpretability is the strongest way to do alignment right now. Alignment is important because otherwise LLMs are incentivized to learn power-seeking, dangerous behaviours [1]. A more down-to-earth example of alignment being important is that agents are incentivized to do tasks in the shortest way possible, and this way might not be what the user wants (I explain this further in another comment in this thread).
You’re putting the cart before the horse - alignment is an unsolved challenge (there are proposed approaches and active research on this) but it is still not established (beyond theory) that latent reasoning is more capable than CoT on hard language reasoning, particularly at scale.