
There isn't an obvious unsupervised task to train medical imaging models on.

What's the medical imaging equivalent to "predict the next word"?



It's the same thing. Predict the next pixel, predict the next token (handling the scans the same way you handle regular images), or infill missing tokens (masked autoencoders, MAE, are particularly cool lately). Those objectives induce the abstractions and understanding that later get tapped into.
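
For concreteness, here's a minimal sketch of masked-patch pretraining in PyTorch. It simplifies real MAE, which feeds only the visible patches to the encoder; here masked patches are zeroed and reconstructed in place, and the single-channel patches, layer sizes, and omitted positional embeddings are illustrative choices, not a tuned recipe.

    import torch
    import torch.nn as nn

    PATCH = 16
    DIM = PATCH * PATCH  # 16x16 single-channel patches, e.g. CT slices

    def patchify(images, patch=PATCH):
        # (B, 1, H, W) -> (B, num_patches, patch*patch)
        B, C, H, W = images.shape
        x = images.unfold(2, patch, patch).unfold(3, patch, patch)
        return x.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * patch * patch)

    # real systems add positional embeddings; omitted here for brevity
    model = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=DIM, nhead=8, batch_first=True),
        num_layers=4,
    )

    def masked_recon_loss(images, mask_ratio=0.75):
        tokens = patchify(images)                        # (B, N, D)
        mask = torch.rand(tokens.shape[:2]) < mask_ratio
        corrupted = tokens.masked_fill(mask.unsqueeze(-1), 0.0)
        recon = model(corrupted)
        # as in MAE, the loss is computed on masked positions only
        return ((recon - tokens)[mask] ** 2).mean()

    # usage: loss = masked_recon_loss(batch_of_slices); loss.backward()

The point is that no labels appear anywhere: the supervision signal is the image itself.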


There is none. But if a multimodal model is exposed to enough medical knowledge, it may be able to interpret images without task-specific training.


Labelling data is easier, I think. It will just take a while...


Predict the next entry in the medical chart?

Presumably all these images would be connected to what ended up happening with the patient months or years later.
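
A hedged sketch of what that objective could look like: serialize each patient's imaging studies and subsequent chart events into one chronological document and train an ordinary causal language model on it, so "predict the next token" implicitly becomes "predict the next chart entry". The record layout and field contents below are invented for illustration.

    # hypothetical per-patient timeline; dates and notes are made up
    record = [
        ("2019-03-02", "CT CHEST: 8 mm nodule, right upper lobe"),
        ("2019-09-14", "CT CHEST: nodule stable"),
        ("2021-01-20", "PATHOLOGY: benign granuloma"),
    ]

    def to_training_text(events):
        # one document per patient; a next-token LM trained on such
        # documents learns to continue the timeline
        return "\n".join(f"[{date}] {note}" for date, note in events)

    print(to_training_text(record))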


If you had this level of data, wouldn't it be trivial to label the images?


It's incredibly hard to disambiguate and accurately label images using the reports (this is my area of research).

Reports are also not analogous to ground truth labels, and you don't always have histopathologic/clinical outcomes.

You also have drift in knowledge and patient trends: people are on immunotherapy now, and we are seeing complications and patterns we didn't see five years ago. A renal cyst that would have warranted follow-up to exclude malignancy before 2018 is now read as definitively benign, so those older reports are not directly usable.

You would have to non-trivially connect this to a knowledge base of some form to disambiguate, one that doesn't currently exist.

And then there's hallucination.
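
A toy example of the drift problem above. The keyword rule and the 2018 cutoff echo the renal-cyst case and are purely illustrative, which is exactly why naive report-to-label pipelines break:

    # the same report text yields different labels depending on guideline era
    def label_from_report(text, study_year):
        text = text.lower()
        if "renal cyst" in text:
            # pre-2018 practice: follow up to exclude malignancy;
            # current practice: simple cysts are considered benign
            return "indeterminate" if study_year < 2018 else "benign"
        return "unknown"

    report = "Incidental simple renal cyst, left kidney."
    print(label_from_report(report, 2016))  # indeterminate
    print(label_from_report(report, 2022))  # benign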

Currently, if you could even just extract actionable findings, accurately summarize reports, and integrate that with the clinical workflow, you could have a billion-dollar company.

Nuance (now owned by Microsoft) can't even accurately autofill my dictation template by mapping free text to subject headings.



