Friday, August 28, 2020
Medical Physics Games
Counterfactual learning systems
Separating causation from correlation in AI systems is a challenge: most machine learning systems look for trends, not 'counterfactual' information, which is closer to the way we humans, and doctors in particular, think.
People, like doctors, make decisions based on what they know to be true and untrue, and build causal reasoning into a diagnosis. Most #machinelearning systems don't build in causality: they are built on associations and correlations. We don't care that the sky happens to be blue when you catch a cold, but we do care that your T-cell count is low when you catch a cold (correlation vs. causation). Counterfactual data? "Let's get a chest x-ray / ultrasound / CBC..." i.e., some data that rules out other possibilities, to see how a symptom relates to a disorder (directly or indirectly).
But what rules can you build into machine learning? (Un)surprisingly, this paper shows they can be simple (because that's probably how *our* brains work): the disease should be consistent with the diagnosis, rule out whatever isn't possible, and keep it simple: one diagnosis (Dx) that fits M symptoms beats N diagnoses that together fit those same M symptoms. They go on to define quantities called "expected disablement" and "expected sufficiency". The former is fairly intuitive; the latter is akin to "sufficient cause". They then state theorems, one of which is that disablement and sufficiency satisfy the rules above. But real data is noisy and muddies the variables, so there needs to be a way to account for noise. (Insert mathy stuff here.)
That's all fine, but the litmus test is: how does this compare to actual clinical decisions? In short, physicians achieve higher accuracy on simpler cases, while the algorithm outperforms them on more complex ones. That's good news for rare disease classification. That makes sense, as the story of #machinelearning and #AI in medical diagnosis suggests utility in a role as a 'decision support tool', but not a fully autonomous one. The difference here is that the model behaves more like a clinician would.
For you Bayesians... when you first learned Bayes' Theorem, I bet you pondered, "Why can't we do counterfactual inference in medical diagnosis? ...policy making? ...court decisions?" This article is a nice progression of how we can use AI based on causation, not just correlation. Don't believe me? Read for yourself.
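To make that parsimony rule concrete, here's a toy sketch of my own (not the paper's actual algorithm, which ranks diseases by counterfactual quantities like expected disablement and expected sufficiency): a tiny noisy-OR network where one disease that covers both symptoms beats a pair of diseases that split them. Every prior and link strength below is made up.

```python
# Toy illustration (not the paper's method): in a tiny noisy-OR network,
# brute-force the posterior over disease combinations and note that one
# disease explaining both symptoms beats two diseases splitting them.
# All priors and link strengths are made-up numbers.
from itertools import product

priors = {"flu": 0.10, "rhinitis": 0.10, "strep": 0.10}   # P(disease present)
# links[symptom][disease] = P(that disease alone turns the symptom on)
links = {
    "fever":      {"flu": 0.8, "strep": 0.7},
    "runny_nose": {"flu": 0.8, "rhinitis": 0.7},
}
leak = 0.01  # small chance a symptom appears with no modeled cause

def p_symptom_on(symptom, present):
    """Noisy-OR: the symptom stays off only if every active cause (and the leak) fails."""
    p_off = 1.0 - leak
    for disease, strength in links[symptom].items():
        if disease in present:
            p_off *= 1.0 - strength
    return 1.0 - p_off

observed = {"fever": 1, "runny_nose": 1}          # both symptoms present
posterior = {}
for combo in product([0, 1], repeat=len(priors)):
    present = {d for d, on in zip(priors, combo) if on}
    p = 1.0
    for d, prior in priors.items():
        p *= prior if d in present else 1.0 - prior
    for s, val in observed.items():
        p_on = p_symptom_on(s, present)
        p *= p_on if val else 1.0 - p_on
    posterior[frozenset(present)] = p

total = sum(posterior.values())
for combo, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(sorted(combo), round(p / total, 3))
# {'flu'} alone outranks {'rhinitis', 'strep'}: one cause covering both
# symptoms is the more parsimonious (and more probable) explanation.
```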
Extracellular vesicle and particle biomarkers and AI
A very interesting article on extracellular vesicle and particle biomarkers and how they might be used in cancer detection.
https://www.sciencedirect.com/science/article/pii/S0092867420308746?via%3Dihub
There are a gazillion authors from a bajillion institutions on this paper.
Collaboration!
The gold standard to confirm cancer and other ailments is a tissue biopsy, where a small sample of tissue is extracted from the suspicious growth. But extracting a tissue sample isn't possible in many situations, especially when there are other co-morbidities where the biopsy can introduce more problems than it attempts to solve.
So 'liquid' biopsies are another approach: stuff like drawing blood, lymphatic/bile fluid, etc., which is not as difficult. But that stuff isn't where the tumor is… it's stuff floating around the body. Some of the gunk that floats around outside the cell are EVPs... or 'extracellular vesicles and particles'. Basically they're goops of stuff that float outside the cell, originating from the 'sorters of things' in your cells. I (probably mistakenly) think of them as recipe pages floating outside the bookstore that sells recipe books. Except there are a gazillion recipes (actually, billions of EVPs) and a gazillion books: trying to figure out what page came from what book would seem an impossible task, right? Well… this is where the story gets interesting!
This team used machine learning techniques to sort through all the EVPs based on size and other subcategories (mouse/human, different cancers). They found that the relationship between 10,000+ EVPs and tumors in mice and humans was not the same (interesting, since mouse models are used in so much research). They then sifted through all these possible markers to see if they could be used as a cancer detector.
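For a flavor of what that sifting might look like, here's a hedged sketch of the general recipe, not the authors' actual pipeline: a classifier trained on EVP protein abundances, keeping the most important features as a candidate marker panel. The data below are random placeholders standing in for mass-spec profiles.

```python
# Hedged sketch of the general idea (not the authors' pipeline): train a
# classifier on EVP protein abundances and pull out a small marker panel.
# The data here are random placeholders; real work uses mass-spec profiles.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_proteins = 120, 1000           # tiny stand-in for thousands of EVP proteins
X = rng.normal(size=(n_samples, n_proteins))
y = rng.integers(0, 2, size=n_samples)      # 0 = control, 1 = cancer (fake labels)

forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
panel = np.argsort(forest.feature_importances_)[::-1][:13]   # keep a small panel (the paper lands on 13)
print("candidate marker indices:", panel)
print("cross-validated accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```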
How do you sort through literally tens of thousands of markers for trends? Reliably? #Machinelearning, of course. They found that the presence/absence of 13 common EVPs could be used to classify both lung and pancreatic cancers. But are those little floaters actually associated with tumors? In other words, is there a relationship between biopsy findings and the floaters?
While their dataset was kinda small, they could verify the biopsy findings with the floaters to better than 90% sensitivity / specificity. (Sensitivity is how well you can detect something, like how likely you are to stop at a sign that looks like a stop sign; specificity is how well you can rule all other possibilities out, like how well you ignore the sign that looks like a stop sign but really isn't.) They then attempted to ensure that what they saw wasn't just stuff you'd see normally... not a trivial task.
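If sensitivity and specificity still feel slippery, here's the arithmetic with made-up counts (not the paper's actual numbers):

```python
# Quick refresher with made-up numbers (not the paper's data): sensitivity
# and specificity from a 2x2 confusion matrix of test vs. biopsy results.
tp, fn = 45, 5    # cancers the EVP test caught vs. missed
tn, fp = 47, 3    # healthy samples correctly cleared vs. wrongly flagged

sensitivity = tp / (tp + fn)   # caught 45 of 50 true cancers -> 0.90
specificity = tn / (tn + fp)   # cleared 47 of 50 true negatives -> 0.94
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```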
What does it all mean? Maybe *earlier* cancer detection? Increased precision cancer detection? Dunno… but it is super cool that floaters in the blood could be so precise in detecting disease. These EVPs may be echoes of the body saying ‘something ain’t right’. We didn’t have the tools to be able to appreciate this signal until we developed the technology to detect the echoes.
Super cool.
Meet RoboBEER
Meet RoboBEER, a robotic beer pourer.
As you know, the demand for high-quality beers worldwide has exploded over the last few decades. What drives quality? Well, one way to discern quality is to objectively characterize features within the beer.
What features, you may ask? Some of them are visual, like the color and foam-ability: maximum foam volume, total foam lifetime, foam drainage, and the size of the bubbles in the foam. But not just any idiot can pour the beer, as a Guinness lover will tell you, since a good pour is crucial. Fortunately, RoboBEER can pull the 'perfect' pint: it pours 80 mL (+/- 10 mL) while monitoring the liquid temperature and assessing the alcohol and CO2 levels, all through your kid's Arduino control board and a Matlab interface (yay Matlab!).
But what about more important features like taste? Surely no robot could do that, right? No way. But… maybe you could predict things like mouthfeel from all the features obtained by RoboBEER? You could capture descriptions of taste from experts through a questionnaire with 10 basic categories: bitter, sweet, sour, aroma in grains, aroma in hops, aroma in yeast, viscosity, astringency, carbonation mouthfeel, and hops flavor. Then, have them sample twenty-two beers. (What I would do to be a part of this study!)
Could you train a neural network to predict what the beer would taste like just based off the data from RoboBEER?
A 'feedforward' neural network was designed where, essentially, you take all the inputs from RoboBEER (head size, color, etc.) and the outputs from the tasters (bitterness, sweetness, mouthfeel) and see if the network can predict the taste from those inputs. You do some fun math like principal component analysis to help sort out the data and patterns, pump it all into the network for training, and what do you get?
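If you want a feel for what that pipeline looks like, here's a hedged sketch (my own stand-in, not the paper's exact architecture or data): PCA on fake RoboBEER-style measurements feeding a small feedforward network that predicts ten sensory descriptors.

```python
# Hedged sketch of the idea (not the paper's exact model or data): reduce the
# RoboBEER measurements with PCA, then fit a small feedforward network that
# maps them to panel taste scores. All feature values below are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# stand-in physical features: foam volume, foam lifetime, drainage, bubble size,
# color, temperature, alcohol, CO2, ... (15 columns of fake data)
X = rng.normal(size=(22 * 3, 15))            # e.g. 22 beers x 3 pours each
y = rng.uniform(0, 10, size=(22 * 3, 10))    # 10 sensory descriptors from the tasters

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),                     # squeeze correlated measurements down
    MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0),
)
model.fit(X, y)
print("predicted descriptor scores for a new pour:", model.predict(X[:1]).round(2))
```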
For the independent testing data, the AI system could predict what a beer would taste like from RoboBEER data alone with an accuracy of 86%. What does this mean? Well… very likely, RoboBEER is a better judge of beer than you are. And it doesn't even have to taste the beer.
Don’t believe me? Read for yourself.
https://onlinelibrary.wiley.com/doi/epdf/10.1111/1750-3841.14114