Under AI Singapore (AISG)’s 100 Experiments Programme, my fellow apprentice and I were assigned to work with a regional kidney dialysis company to develop an AI model that predicts the hospitalisation of patients. This is a key component of our AI Apprenticeship Programme (AIAP), where we get to work on a real-world AI industry problem. Our model serves as a decision support tool and has helped the kidney dialysis company’s medical team achieve 36% better precision (i.e. fewer false positives). It is currently deployed in their dialysis centres.
In this article, I will share the key challenges, processes, and insights from developing my first medical AI model.
Ok… why are you interested in predicting the hospitalisation of dialysis patients?
Patients undergoing dialysis have higher morbidity and a high risk of hospitalisation. By the time they are hospitalised, their medical conditions have usually become full-blown and their mortality risk has increased. The ability to predict hospitalisation risk would allow early medical intervention.
Even though there is research on the key predictors of hospitalisation, the current assessment process is fuzzy and depends on the experience of the medical staff.
With the vast amount of data collected from each patient before, during, and after their dialysis, there is potential in using this data to train an AI model that predicts the hospitalisation of patients. The prediction from the model could be used as decision support for the medical team.
All you need is to feed those medical data into the model?
No. We need to pre-process the raw medical data to something useful for the model.
We put ourselves in the shoes of a medical professional and asked: how would a doctor assess the hospitalisation risk of patients? From this thought process, we realised we could teach medical knowledge to the model and supply it with patients’ medical histories.
We could teach medical knowledge to our model by incorporating established medical research into our data.
For patients’ medical histories, we had to find ways to aggregate patients’ medical parameters without losing excessive information. If a patient’s medical parameter is deteriorating, it usually foretells a grim outlook. This is what we want our model to know as well.
Cool… how do you teach medical knowledge to the model?
In the raw dataset given to us, most medical parameter readings are just numbers; their clinical significance is not apparent to the model.
There are guidelines on healthy ranges for most medical parameters. For example, an individual with hypertension has blood pressure above 140/90 mmHg. To give meaning to the raw blood pressure data, we converted it into categories of 0, 1, 2, or 3 for low, healthy, pre-hypertensive, and hypertensive blood pressure, respectively.
We did the same by converting other medical parameters into established categories.
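As a sketch of this binning step in pandas, the code below converts systolic blood pressure readings into the four ordinal categories. The cut-off values and column names here are illustrative assumptions; the project would have used the established clinical guideline ranges for each parameter.

```python
import pandas as pd

# Illustrative cut-offs for systolic blood pressure (mmHg); the actual
# clinical guideline ranges used in the project may differ.
bins = [0, 90, 120, 140, float("inf")]
labels = [0, 1, 2, 3]  # low, healthy, pre-hypertensive, hypertensive

systolic = pd.Series([85, 118, 135, 150])
bp_category = pd.cut(systolic, bins=bins, labels=labels)
print(bp_category.tolist())  # [0, 1, 2, 3]
```

The same `pd.cut` pattern extends to any parameter with published healthy ranges, replacing raw magnitudes with clinically meaningful categories.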
And for incorporating the history of patients’ medical information?
We took different approaches based on the type of data.
For continuous variables, such as patients’ blood pressure, we used an exponential moving average (EMA) across 12 periods. This acts as a sliding window over the 12 most recent readings, averaging them with more recent readings given higher weight.
Why 12? Each dialysis patient undergoes 3 sessions weekly, so 12 periods cover roughly the past month of dialysis data.
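A minimal sketch of the EMA feature in pandas, assuming one row per dialysis session; the column names are illustrative, not the project’s actual schema, and `ewm(span=12)` is the standard pandas way to express a 12-period EMA.

```python
import pandas as pd

# One row per dialysis session (3/week, so 12 rows ~ 1 month of history).
df = pd.DataFrame({
    "systolic_bp": [130, 128, 135, 140, 138, 142, 150,
                    145, 148, 152, 149, 155, 160],
})

# 12-period EMA: recent sessions weigh more than older ones.
df["systolic_bp_ema12"] = df["systolic_bp"].ewm(span=12, min_periods=1).mean()
```

Because the weights decay exponentially, a sudden deterioration in a parameter pulls the EMA away from its long-run level quickly, which is exactly the trend signal we wanted the model to see.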
For discrete variables, such as past hospitalisation count, we created a ‘cumulative count’ column that records the number of times a patient has been hospitalised. We increased this cumulative count by 1 each time the patient was hospitalised.
This is based on medical literature, which found that past hospitalisation count is a strong predictor of future hospitalisation (a frequently hospitalised patient is likely sicker and therefore has a higher chance of being hospitalised again).
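The cumulative count could be sketched as a grouped cumulative sum. The `patient_id`/`hospitalised` columns are hypothetical, and the `shift(1)` is my own assumption here: it makes each row count only *past* hospitalisations, so the feature does not leak the very event it helps predict.

```python
import pandas as pd

# Illustrative session records; hospitalised = 1 marks a session that
# was followed by a hospitalisation event.
df = pd.DataFrame({
    "patient_id":   ["A", "A", "A", "B", "B"],
    "hospitalised": [0,   1,   1,   0,   0],
})

# Running count of prior hospitalisations per patient. shift(1) ensures a
# row only sees events strictly before it (no target leakage).
df["hosp_count"] = (
    df.groupby("patient_id")["hospitalised"]
      .transform(lambda s: s.cumsum().shift(1).fillna(0))
)
print(df["hosp_count"].tolist())  # [0.0, 0.0, 1.0, 0.0, 0.0]
```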
Any interesting finding from your project?
We tried using NLP (natural language processing) to extract information from patients’ discharge notes written by doctors. We thought providing this additional information to the model would improve its performance, but we were wrong — the model’s performance did not increase.
Our hypothesis is that whatever information is contained in the discharge notes is already present in the patients’ medical parameters. For instance, for a doctor to write ‘high blood pressure’ in a discharge note, the doctor must have referred to the patient’s blood pressure reading.
Furthermore, adding NLP slowed the model down significantly due to the additional data processing required. We eventually decided to exclude patients’ discharge notes from our final AI model.
How do you know if your model is good enough?
We don’t. Everyone wants a perfect model, especially medical professionals, and rightly so. They are concerned about false negatives and false positives, which can negatively affect patients’ medical outcomes.
How did you then convince the medical team to implement your model?
We got creative. Instead of setting an arbitrary benchmark, we proposed a model validation exercise with the medical team. If our model can enable the medical team to make better predictions, then deploying our model will help the patients.
For one month, the medical team assessed patients and predicted which patients would be hospitalised. We did the same with our model. We then compared both sets of predictions against the actual hospitalisations.
After tallying the results, our AI model’s precision was 36% better. This means using our model as a decision support tool helps the kidney dialysis company’s medical team make fewer false positive predictions.
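For reference, precision is the fraction of patients flagged as “will be hospitalised” who actually were. A minimal sketch with made-up counts (these numbers are purely illustrative and are not the project’s actual figures):

```python
def precision(tp: int, fp: int) -> float:
    """Precision = true positives / all positive predictions."""
    return tp / (tp + fp)

# Illustrative only: team flags 40 patients, 20 correctly;
# model flags 30 patients, 20 correctly.
team_precision = precision(tp=20, fp=20)   # 0.5
model_precision = precision(tp=20, fp=10)  # ~0.667
```

Higher precision means fewer healthy patients are wrongly flagged, which matters when each flag triggers extra medical attention.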
Oh no! Did something bad happen?
No. On the contrary, something positive happened to the patients during the model validation period.
The hospitalisation rate of patients dropped significantly compared to the average rate before the model validation period, even though the medical team did not know our AI model’s predictions.
After the model validation period, the hospitalisation rate crept back up to its original average. This makes it unlikely that the drop was due to randomness or other confounding factors.
How do you explain this decrease in hospitalisation rate?
This phenomenon is known as social facilitation, where individual performance improves in the presence of others.
Interestingly, it seems that merely knowing an AI model was competing with them on hospitalisation prediction improved the medical team’s overall performance.
We suspected it could be due to the medical team taking a closer look at patients during this period. A little healthy competition never hurts.
Interesting… should I start telling my colleagues an AI model is running in the background (even if there isn’t) to improve their performance?
I will leave it to you to decide. A better alternative is to contact AISG and let us develop an AI model for your company.