pandas – Calculating Readmission Metrics in Python


I need to compute some hospital readmission variables using Python (pandas).

The input columns needed for these variables are:

  • ID: patient ID
  • Date_Admission: date of admission to the hospital
  • Date_Discharge: date of discharge from the hospital
  • Diagnosis_Primary: type of disease the person has
  • Hospital_ID: ID of the hospital

I need to compute the following metrics:

  • Simple Readmission:
    Compute variables for different periods: 3, 7, 14, 30 and 45 days after discharge from the index admission.
    Rule: a readmission is possible on the day after discharge from the index admission at the earliest. This means that if a patient is discharged and admitted on the same day, it is considered a transfer and not a readmission.

I was able to calculate this by using:

import pandas as pd

# Parse the dd/mm/yy strings before sorting, so the sort is chronological rather than lexicographic
df[['Date_Admission','Date_Discharge']] = df[['Date_Admission','Date_Discharge']].apply(lambda x: pd.to_datetime(x, format='%d/%m/%y'))

df = df.sort_values(['ID','Date_Admission'])

# Days between each admission and the same patient's previous discharge (NaN for a first stay)
gap = (df['Date_Admission'] - df.groupby('ID')['Date_Discharge'].shift(1)).dt.days

# Flag the readmission row; a gap of 0 days (same-day transfer) is excluded by the rule above
for n in [3, 7, 14, 30, 45]:
    df[f'Readmit{n}'] = gap.between(1, n).astype(int)

However, I am facing difficulties with:

  1. Hospital Readmission
    Variable indicating whether the readmission is to the same hospital or not (yes/no), within 3, 7, 14, 30 and 45 days from discharge from the index admission.

  2. Variable indicating whether the readmission is to one of a list of specific hospitals. List Hospital: [3,34]. These hospitals are not in the snippet; they are just a toy example.

  3. Variable indicating whether the readmission has the same diagnosis or not (yes/no), within 3, 7, 14, 30 and 45 days from discharge from the index admission.
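
One possible way to approach the three variables above is to compare each stay with the same patient's previous stay. The following is only a minimal sketch under several assumptions: the frame `toy` and its values are hypothetical (they are not taken from the snippet), flags are stored as 0/1 rather than yes/no, they are placed on the readmission row, and the output column names (`Readmit{n}_SameHosp`, `Readmit{n}_ListHosp`, `Readmit{n}_SameDiag`) are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from the snippet
toy = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "Date_Admission": ["01/01/20", "05/01/20", "20/01/20", "01/02/20", "10/02/20"],
    "Date_Discharge": ["03/01/20", "10/01/20", "25/01/20", "10/02/20", "15/02/20"],
    "Diagnosis_Primary": [650, 650, 431, 4848, 4848],
    "Hospital_ID": [1, 1, 3, 1, 34],
})

# Parse dd/mm/yy dates, then order each patient's stays chronologically
for col in ["Date_Admission", "Date_Discharge"]:
    toy[col] = pd.to_datetime(toy[col], format="%d/%m/%y")
toy = toy.sort_values(["ID", "Date_Admission"]).reset_index(drop=True)

# Values of the same patient's previous stay (NaN/NaT for a first stay)
prev = toy.groupby("ID")[["Date_Discharge", "Hospital_ID", "Diagnosis_Primary"]].shift(1)

# Days from the previous discharge to this admission
gap = (toy["Date_Admission"] - prev["Date_Discharge"]).dt.days

hospital_list = [3, 34]  # list of specific hospitals from the question
same_hosp = toy["Hospital_ID"].eq(prev["Hospital_ID"])
in_list = toy["Hospital_ID"].isin(hospital_list)
same_diag = toy["Diagnosis_Primary"].eq(prev["Diagnosis_Primary"])

for n in [3, 7, 14, 30, 45]:
    # A gap of 0 days is a transfer, so a readmission needs a gap of 1..n days
    readmit = gap.between(1, n)
    toy[f"Readmit{n}_SameHosp"] = (readmit & same_hosp).astype(int)
    toy[f"Readmit{n}_ListHosp"] = (readmit & in_list).astype(int)
    toy[f"Readmit{n}_SameDiag"] = (readmit & same_diag).astype(int)

print(toy.filter(like="Readmit14"))
```

In this toy frame, the last stay of patient 2 starts on the day of the previous discharge, so it counts as a transfer and none of its flags are set, even though hospital 34 is in the list. If yes/no labels are needed instead of 0/1, the integer columns could be mapped with `.map({1: "yes", 0: "no"})`.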

Here is the data snippet:


data = {'Date_Admission': ['19/04/20', '20/02/20', '06/04/20', '11/03/20', '11/04/20', '13/05/20', '10/01/20', '16/04/20', '08/02/20', '21/05/20', '06/04/20', '03/01/20', '15/05/20', '04/04/20', '13/01/20', '11/05/20', '19/02/20', '25/02/20', '14/05/20', '07/02/20', '14/03/20', '03/01/20', '14/02/20', '12/02/20', '09/05/20', '19/01/20', '07/04/20', '27/04/20', '14/05/20', '09/02/20', '23/03/20', '22/04/20', '14/02/20', '10/01/20', '05/03/20', '14/01/20', '04/04/20', '04/05/20', '22/05/20', '24/01/20', '11/02/20', '28/03/20', '03/05/20', '15/05/20', '02/01/20', '20/02/20', '13/01/20', '31/03/20', '16/04/20', '27/02/20', '10/02/20', '22/03/20', '15/05/20', '06/02/20', '05/04/20', '26/01/20', '28/05/20', '11/05/20', '29/04/20', '21/04/20', '13/01/20', '10/01/20', '27/05/20', '28/03/20', '27/01/20', '15/01/20', '16/03/20', '20/04/20', '10/03/20', '26/04/20', '28/01/20', '27/01/20', '26/04/20', '07/01/20', '28/04/20', '01/02/20', '18/02/20', '06/02/20', '18/03/20', '21/02/20', '01/04/20', '20/05/20', '03/02/20', '25/01/20', '23/03/20', '06/04/20', '13/05/20', '15/02/20', '20/02/20', '27/04/20', '02/03/20', '10/03/20', '19/05/20', '01/02/20', '26/05/20', '12/03/20', '17/02/20', '15/04/20'],
        'Date_Discharge': ['23/04/20', '25/02/20', '15/04/20', '04/04/20', '24/04/20', '15/05/20', '15/01/20', '25/04/20', '27/02/20', '24/05/20', '11/04/20', '05/01/20', '20/10/20', '08/04/20', '18/01/20', '11/05/20', '04/03/20', '03/03/20', '22/05/20', '29/02/20', '16/03/20', '07/01/20', '17/02/20', '13/03/20', '22/05/20', '22/01/20', '27/04/20', '19/05/20', '20/05/20', '15/02/20', '04/06/20', '30/04/20', '19/02/20', '16/01/20', '10/03/20', '20/01/20', '16/04/20', '18/05/20', '08/06/20', '29/01/20', '16/02/20', '01/04/20', '22/05/20', '23/05/20', '08/01/20', '20/02/20', '20/01/20', '10/04/20', '27/04/20', '85406', '13/02/20', '25/03/20', '28/05/20', '12/02/20', '20/04/20', '4848', '04/02/20', '19/06/20', '13/05/20', '581', '29/04/20', '03/05/20', '29532', '17/01/20', '01/02/20', '5849', '11/04/20', '42979', '22/01/20', '17/03/20', '4280', '11/03/20', '01/05/20', '40211', '06/02/20', '5400', '29/04/20', '29663', '06/05/20', '78039', '17/03/20', '51881', '24/05/20', '42781', '19/03/20', '10/04/20', '9962', '29/05/20', '18/02/20', '24/02/20', '29/04/20', '06/03/20', '17/03/20', '500', '30/05/20', '05/02/20', '27/05/20', '25/03/20', '22/02/20', '05/05/20'],
        'Diagnosis_Primary': [65421, 51881, 1889, 431, 431, 85400, 56081, 56211, 1912, 650, 1911, 5409, 51882, 650, 78609, 49301, 82321, 5119, 4111, 82020, 650, 30183, 41071, 9962, 28860, 650, 4848, 1398, 51881, 5111, 4848, 5789, 29690, 485, 5852, 419, 8244, 7994, 29020, 51881, 51881, 650, 43401, 4373, 80841, 5856, 1628, 1961, 1961, 85406, 4289, 40211, 82020, 46611, 4848, 81200, 1890, 591, 66981, 29532, 30502, 82001, 5849, 5168, 42979, 5609, 632, 4280, 60820, 5609, 40211, 5400, 650, 29663, 29642, 78039, 431, 51881, 42781, 51881, 650, 51881, 56089, 5118, 85220, 8832, 4848],
        'ID': [4, 16, 25, 42, 42, 50, 60, 64, 65, 67, 72, 77, 96, 101, 112, 116, 124, 146, 146, 154, 160, 161, 184, 185, 189, 192, 201, 201, 215, 234, 240, 248, 267, 286, 286, 292, 299, 309, 309, 318, 318, 340, 346, 354, 365, 367, 368, 368, 368, 385, 404, 420, 423, 431, 487, 492, 493, 519, 581, 598, 607, 620, 637, 646, 661, 664, 666, 672, 685, 723, 740, 744, 745, 751, 751, 753, 759, 760, 764, 774, 777, 779, 795, 807, 807, 817, 854, 862, 862, 865, 868, 868, 868, 873, 874, 877, 907],
        'Hospital_ID': [1] * 92}

df = pd.DataFrame(data)
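
Some of the Date_Discharge entries in the snippet (for example '85406') are not dd/mm/yy dates, so strict date parsing will fail on them. A minimal sketch of one way to locate such values, assuming they should become missing dates rather than raise an error; the series `sample` is a hypothetical three-value example, not the full column:

```python
import pandas as pd

# Hypothetical sample: two dd/mm/yy dates and one stray non-date value
sample = pd.Series(["23/04/20", "85406", "15/04/20"])

# errors="coerce" turns anything that does not match the format into NaT
parsed = pd.to_datetime(sample, format="%d/%m/%y", errors="coerce")

# Original strings that could not be parsed as dates
bad_rows = sample[parsed.isna()]
print(bad_rows)
```

Rows flagged this way can then be inspected or dropped before any day differences are computed.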

I do not know what to try next, as I am still a beginner.


