I do not think this is an outlier
In which Scott Alexander bitches about badly-run studies and then creates his own badly-run study. I'm not a medical professional. I did acoustics. And I could tell you all the HIPAA-violating shit he tried to pull. This is fifteen hundred words of "I know better" and "so I was shocked to find out that the agency protecting my hospital from liability disagrees." HIPAA ain't about your ability to double-test suicidal patients to shit on someone else's tool. It's about protecting patients from abuse of their data. And if your patients are in an environment where you aren't comfortable giving them a pen, your patients are in an environment where their ability to consent can rightfully be questioned by a review board, who will question the validity of your study. Sure - it should be an easy study. It should cost less than a thousand dollars for a wheelbarrow tire when you put it on an airplane, too, but the FAA cares about such things because the stakes are high. And when HIPAA violations cost $10k to $50k per instance, the review board cares, too.
I've done several machine learning projects involving data with HIPAA requirements, and it was a pain in the ass because anything involving talking to a lawyer is a pain in the ass, but "I don't need anything personally identifiable for this project, so all the data I'm working with will be anonymized" was always enough after defining some terms. It's annoying, but not much more annoying than traffic lights and those do more good than harm too.
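For anyone wondering what "all the data I'm working with will be anonymized" looks like at its crudest, here is a minimal sketch. The field names are hypothetical and not from any real system, and dropping named columns is only the first step: free text, dates, and rare combinations of values can still identify someone, so treat this as an illustration and not as something that satisfies HIPAA on its own.

```python
# Hypothetical field names, chosen for illustration only.
IDENTIFYING_FIELDS = {"name", "ssn", "address", "phone", "date_of_birth"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the directly identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in IDENTIFYING_FIELDS
    }


# Made-up example record.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "diagnosis": "depression",
    "age_band": "30-39",
}

print(strip_identifiers(patient))
```

The "defining some terms" part is the real work: someone has to agree on which fields count as identifying before a filter like this means anything.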
I believe that in this example, the anonymization was the sticking point. Alexander did not understand (or pretended not to understand) why the data needed to be anonymous, so he throws up objections that are almost entirely about making the data anonymous. Had the data been anonymized by an entity capable of separating the hospital from the liability, there probably would have been a lot less drama.
That is where experiments conducted by the doctors get more difficult than experiments conducted by the techies; I'm not collecting any data, I just have to show I'm not taking indecent liberties with the data on servers I have root on. But I think you're getting more directly at the same point I was trying to, that HIPAA shouldn't be that much of an obstacle so long as you're respecting the patients' privacy.
A big part of the problem is PROVING that you are taking adequate security measures and that your study is needed. The people okaying these sorts of things are typically not super savvy in computers or medical science (even if they are MDs). They are savvy in bureaucracy and administration.
I don't think I couldn't, but they were the unsexy "there's tedious work to be done and I can probably write a script that figures out how to do it faster than my users could do it themselves" kind of ML project and I don't have anything interesting to say about them.
Mmmm, yes and no. As someone who conducts clinical research, there are definitely better ways to administrate these things. I know this because my institute is going through a paradigm shift in research initiation, funding and administration, the third one in a decade, because the programs hemorrhage money and investigator time left, right and center. For a reason as simple as 'people like money', stuff is going to shift around until we can find a more efficient way to poke and prod people for medical science. In the case of Mr. Alexander, it sounds like the research department at his institution was particularly ineffectual and bureaucratically twisted AND his lack of research into 'How do I legally conduct clinical research in a setting with mentally compromised individuals?' came back to bite him and his headcase PI in the butt, a nasty combo.
I'll bet they really didn't want to be in a position to hang their liability and malpractice insurance on the line to prove that a screening tool that explicitly states it isn't for diagnosis is a screening tool that shouldn't be used for diagnosis. I think I know the screening tool he's talking about. It's this, or something like it. It's been cited 1200 times. N of 198, 5 outpatient clinics (where "are you unstable enough that we shouldn't give you a pen" never comes up). Both of my parents took it, and the one that's bipolar to fuck got "bipolar to fuck" and the one that isn't didn't. Thing of it is, bipolar disorder is pretty squidgy and even once you have a diagnosis you're messing with meds forever. So it's not like a misdiagnosis with a screening tool is a catastrophe or anything anyway.
So, in my dreams, there is a revolution coming. A very quiet one. Computers keep getting smarter, and because of what I've seen Watson to be capable of, there will be a way to analyze Electronic Medical Record systems in an anonymous, bulk sort of way and extract meaningful research data from normal clinical operations of a hospital system. The problem here I think is law more than computing, and I suspect that there are going to be some radical changes in medical/data law in the next few years. There is a possibility that some of those changes will be beneficial.
Version 2 of EMR/EHR. I don't see it coming with what we have now. I'll bet you use a much more skookum EHR than my wife does. She uses one written for midwives. But then, she also does naturopathic medicine and that one requires a different set of fields so she uses two. And when a patient graduates from prenatal care to naturopathic care, the intake has to be done all over again. Both of them will allow you to dive through their data, but it's GIGO - if you call something "depression" with one patient and "sadness" with another, you get a report for "sadness" and "depression" each with an n of 1. This is something that came up with the discovery of the opioid epidemic - there was no agreed-upon standard of what you call an opioid overdose so they got reported under 20 different categories in Ohio alone. Once a researcher went through and screened everything that could be an opioid overdose and went back and looked at the autopsies and death certificates, they saw this: ...but they didn't see it until 2013. Metadata is a bitch. I have to plumb my own for Soundminer and every library uses their own tags. I ran into this when I was looking into stock: Getty and Corbis both use different tags. ICD10 standardizes the diagnosis for billing, insurance and statistical purposes but it doesn't standardize the history. I think you're right - there will be a push to make the terminology standardized so that it is more easily parse-able. But it's going to be the kind of herculean effort similar to the 12 years it took for homosexuality to cease to be a mental disorder between the DSM3 and DSM4.
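The "sadness" vs "depression" problem above is easy to show in a few lines. This is a minimal sketch, assuming a hand-built synonym map (hypothetical, not from any real EHR or coding standard); the hard part in practice is agreeing on the map, not running it.

```python
from collections import Counter

# Hypothetical synonym map; a real one would come from a curated vocabulary.
SYNONYMS = {
    "sadness": "depression",
    "depressed mood": "depression",
}


def normalize(term: str) -> str:
    """Lowercase and trim the term, then map known synonyms to one label."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def count_raw(terms):
    """Count terms exactly as entered (case and whitespace aside)."""
    return Counter(term.strip().lower() for term in terms)


def count_normalized(terms):
    """Count terms after collapsing synonyms into shared buckets."""
    return Counter(normalize(term) for term in terms)


# Two patients, same underlying problem, two different words in the chart.
chart_terms = ["Depression", "sadness"]

print(count_raw(chart_terms))         # two separate reports, each with n of 1
print(count_normalized(chart_terms))  # one bucket with n of 2
```

Without the map you get the GIGO report: two categories with one patient each. With it, both land in the same bucket.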
Beyond that though. Not just V2.0 of the EMR systems, but V2.0 of how we think about medical data. Yes, your personal data is your personal data AND each individual represents data points that are relevant research/epidemiological data for their age/sex/smoking class/alcohol class/etc. ad infinitum. I think the bigger part of the revolution is going to be in the legal aspects of how that population-level data is accessed and who is allowed to access it. With the example you listed about 'sadness' and 'depression', I think a sufficiently intelligent program like Watson would be able to parse all of that into meaningful 'buckets' like that researcher in Ohio, but at lightning speed, and report results in such a way that individual patients are protected. As part of one of my current protocols, when a patient gets admitted to the hospital, I have to generate a 'shadow chart' that details each day's testing, how much IV fluid and what kind the patient gets, how much and what exactly they eat, med lists, etc. I'm basically making an anonymized copy of the main medical record for their admission. Watson could do the exact same thing en masse if we gave it the permissions and guidance to do so. Then, with (God I hate using this word) standardized, anonymous descriptions of clinical courses of diseases, we could turn our super-doc program loose on those reports to extract statistically meaningful data both about disease and about the efficacy of contemporary treatments. I agree that I don't see it happening with what we have now. But I think that what we have now is unsustainable. Research institutions are scrambling to find a model that really WORKS and they are trying to maintain the facade that they know what they are doing, which, from my observations, they don't.
The high-level decision making positions are filled with geriatrics who at best don't understand the potential value that ML can bring, and at worst, are openly hostile to automation and technological advancement. There are old docs who maintain the mental and philosophical flexibility to be open to radical, systemic change, but it's not common.
the current system is sustained by government cash injections in the form of rewards for required compliance levels. emrs are damn expensive as you presumably know. one of the myriad factors in cost disease, but one that gets forgotten for some reason. still, one of the most necessary
It's so much better than what we had, though, d00d. My wife requested her records from the hospital she went to as a child. 240 pages of hand-written notes. No correlation in any of it. You see "unsustainable" because you didn't see the great leap forward.
I was a sick lil baby, now sick grown ass man. When 'The Great Leap Forward' happened, my 'chart' that was commonly trotted out was over 2000 pages, mostly front and back. I understand that where we are now is leaps and bounds ahead of what was, but it's still grossly lacking in what's possible. You don't organize digital files the way you organize paper files, and right now, broadly speaking, we have digital data being organized and squirreled away like paper. We are capable of much much more.
surprise surprise, not an outlier http://slatestarcodex.com/2017/08/31/highlights-from-the-comments-on-my-irb-nightmare/ i also found this classic post