In this notebook, we explore the clinical data from the Adaptive COVID-19 Treatment Trial (ACTT) to evaluate the clinical efficacy of remdesivir relative to the control arm in patients hospitalized with COVID-19, as assessed by the time to recovery.
This notebook showcases exploratory analysis within the NIAID Clinical Trials Data Commons. The analysis is not intended to constitute advice, nor is it to be used as a substitute for decision making by a professional. The data used in this notebook are CONTROLLED ACCESS DATA and can be shared securely by granting users access to them in the NIAID Clinical Trials Data Commons. To run this notebook, you need access to project ACTT in the NIAID Clinical Trials Data Commons (AccessClinicalData).
Study design and data description.
Demographic and clinical characteristics of the patients at baseline.
Kaplan–Meier estimates of cumulative recoveries.
Recovery rate ratios and hazard ratios calculated from the stratified Cox model.
ACTT-1 (Adaptive COVID-19 Treatment Trial) is an adaptive, randomized, double-blind, placebo-controlled trial to evaluate the safety and efficacy of remdesivir (200 mg loading dose on day 1, followed by 100 mg daily for up to 9 additional days) in hospitalized adults diagnosed with COVID-19. Subjects will be assessed daily while hospitalized. If the subjects are discharged from the hospital, they will have a study visit at Days 15, 22, and 29.
The primary outcome is time to recovery by Day 29. The primary analysis will include data from both severity groups using a stratified log-rank test. A key secondary outcome evaluates treatment-related improvements in the 8-point ordinal scale at Day 15. As little is known about the clinical course of COVID-19, an evaluation of the pooled (i.e., blinded to treatment assignment) proportion recovered will be used to gauge whether the targeted total number of subjects in the recovered categories of the ordinal scale will be achieved with the planned sample size. The analysis of the pilot data will be blinded, allowing for the pilot data to be included in subsequent analyses.
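To make the stratified log-rank test mentioned above concrete, here is a minimal pure-Python sketch of the two-group statistic summed over strata. The function names and the toy data are illustrative assumptions, not part of the trial's analysis code, and the event coding (1 = recovery observed, 0 = censored) is chosen for the example.

```python
import math
from collections import defaultdict


def logrank_components(times, events, groups):
    """Return (O - E, V) for group 1 in a two-group log-rank comparison.

    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    groups : 0 or 1 (for example 0 = placebo, 1 = treatment)
    """
    o_minus_e = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t
        at_risk = [(ti, ei, gi) for ti, ei, gi in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == 1)
        d = sum(1 for ti, ei, _ in at_risk if ti == t and ei == 1)
        d1 = sum(1 for ti, ei, g in at_risk if ti == t and ei == 1 and g == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, variance


def stratified_logrank(times, events, groups, strata):
    """Sum O - E and V over strata, then form a 1-df chi-square statistic."""
    cells = defaultdict(list)
    for t, e, g, s in zip(times, events, groups, strata):
        cells[s].append((t, e, g))
    total_oe = 0.0
    total_var = 0.0
    for rows in cells.values():
        ts, es, gs = zip(*rows)
        oe, var = logrank_components(ts, es, gs)
        total_oe += oe
        total_var += var
    if total_var == 0:
        raise ValueError("no information to compare the groups")
    chi2 = total_oe ** 2 / total_var
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: two strata with identical follow-up patterns
toy_times = [1, 2, 3, 4, 1, 2, 3, 4]
toy_events = [1, 1, 1, 1, 1, 1, 1, 1]
toy_groups = [1, 1, 0, 0, 1, 1, 0, 0]
toy_strata = ["mild", "mild", "mild", "mild", "severe", "severe", "severe", "severe"]
chi2, p_value = stratified_logrank(toy_times, toy_events, toy_groups, toy_strata)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Summing within strata before forming the statistic means subjects are only ever compared with others in the same severity group, which is the point of stratification.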
We’ll first import all the packages that we need for this notebook:
numpy is the fundamental package for scientific computing in Python.
pandas is what we’ll use to manipulate our data.
matplotlib is a plotting library.
tableone is a package for creating summary statistics for a patient population.
lifelines is an open-source survival analysis library.
! pip install gen3 -U
! pip install tableone
! pip install --upgrade scipy
! pip install --upgrade lifelines
#! pip install patsy==0.5.2 --user
import zipfile
import pandas as pd
import numpy as np
import warnings
warnings.simplefilter("ignore")
%config InlineBackend.figure_format = 'svg'
%matplotlib inline
from tableone import TableOne, load_dataset
import matplotlib.pyplot as plt
import statistics
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test
from lifelines.plotting import add_at_risk_counts
from patsy import dmatrices
from scipy import stats
import string
Please download your credentials.json from https://accessclinicaldata.niaid.nih.gov/identity and upload it to the working directory of this notebook. The file name passed to --auth in the following command must match the name of the uploaded file.
!gen3 --endpoint accessclinicaldata.niaid.nih.gov --auth credentials-4.json drs-pull object dg.NACD/e77ab887-7dfc-4c3b-879f-5e2c2dc0bc5f --no-unpack-packages
path_to_zip_file = "ACTT1_Datasets.zip"
with zipfile.ZipFile(path_to_zip_file, 'r') as zip_ref:
    zip_ref.extractall()
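If the download step fails, the extraction above raises an error that does not point back to the cause. The following is a small sketch of a more defensive variant; the helper name extract_archive is hypothetical and not part of the notebook or of the gen3 client.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest="."):
    """Extract zip_path into dest and return the archive's member names."""
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        # Fail early with a message that points at the likely cause
        raise FileNotFoundError(f"{zip_path} not found; did the download step succeed?")
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        names = zip_ref.namelist()
        zip_ref.extractall(dest)
    return names
```

Calling extract_archive("ACTT1_Datasets.zip") would then return the list of extracted members, which is a quick way to confirm that the expected CSV file is present.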
Load the Dataset:
data = pd.read_csv("ACTT1_Datasets/ACTT_1_original/ACTT1.csv")
data.head()
| | USUBJID | TRTP | AGE | SEX | RACE | ETHNIC | BMI | REGION | STRATUM | ORDSCRG | ... | CANCERFL | IMMDFL | ASTHMAFL | COMORB1 | COMORB2 | ORDSCR15 | TTRECOV | RECCNSR | TTDEATH | DTHCNSR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | COV.00701 | Placebo | 74 | M | ASIAN | NOT HISPANIC OR LATINO | NaN | North America | Severe Disease | Baseline Clinical Status Score 5 | ... | N | N | N | No Comorbidities | No Comorbidities | 1 | 5 | 0 | 26 | 1 |
1 | COV.00702 | Remdesivir | 80 | M | WHITE | NOT HISPANIC OR LATINO | NaN | North America | Severe Disease | Baseline Clinical Status Score 5 | ... | N | N | N | Any Comorbidities | 1 Comorbidity | 2 | 6 | 0 | 28 | 1 |
2 | COV.00703 | Placebo | 36 | F | WHITE | NOT HISPANIC OR LATINO | 40.7 | North America | Severe Disease | Baseline Clinical Status Score 7 | ... | N | N | Y | Any Comorbidities | 2 or more Comorbidities | 7 | 25 | 1 | 26 | 1 |
3 | COV.00704 | Placebo | 75 | F | WHITE | NOT HISPANIC OR LATINO | 27.6 | North America | Severe Disease | Baseline Clinical Status Score 6 | ... | N | N | N | Any Comorbidities | 1 Comorbidity | 2 | 11 | 0 | 27 | 1 |
4 | COV.00705 | Placebo | 62 | F | WHITE | NOT HISPANIC OR LATINO | 32.5 | North America | Severe Disease | Baseline Clinical Status Score 6 | ... | N | N | N | Any Comorbidities | 2 or more Comorbidities | 5 | 26 | 0 | 28 | 1 |
5 rows × 31 columns
USUBJID: Subject ID.
TRTP: Treatment group.
AGE: Age (<40; 40-64; 65 and older).
SEX: Sex (Female; Male).
RACE: Race (White; Black/African American; Asian; Other).
ETHNIC: Ethnicity.
BMI: Body mass index.
REGION: Geographic region (North American sites; Asian sites; European sites).
STRATUM: Disease severity. "Severe disease" was defined as participants meeting one or more of the following criteria: requiring invasive or non-invasive mechanical ventilation, requiring supplemental oxygen, an SpO2 ≤ 94% on room air, or tachypnea (respiratory rate ≥ 24 breaths per minute). "Mild / moderate disease" was defined by an SpO2 > 94% and respiratory rate < 24 breaths per minute without supplemental oxygen requirement.
ORDSCRG: Baseline ordinal scale category (4; 5; 6; 7).
BDURSYMP: Duration of symptoms prior to enrollment.
HYPFL: Hypertension.
CADFL: Coronary artery disease.
CHFFL: Congestive heart failure.
CRDFL: Chronic respiratory disease.
CORFL: Chronic oxygen requirement.
CLDFL: Chronic liver disease.
CKDFL: Chronic kidney disease.
DIAB1FL: Type 1 diabetes.
DIAB2FL: Type 2 diabetes.
OBESIFL: Obesity.
CANCERFL: Cancer.
IMMDFL: Immune deficiency.
ASTHMAFL: Asthma.
COMORB1: Comorbidity presence (None; Any).
COMORB2: Number of comorbidities (None; One; Two or more).
ORDSCR15: The 8-point ordinal clinical status scale at Day 15.
TTRECOV: Time to recovery.
RECCNSR: Recovery censored.
TTDEATH: Time to death.
DTHCNSR: Death censored.
Subject screening will begin with a brief discussion with study staff. Some subjects will be excluded based on demographic data and medical history (e.g., pregnant, < 18 years of age, renal failure). To be eligible to participate in this study, a patient must meet all of the required criteria. Additional details on the study procedures can be found Here.
The table below summarizes age, sex, race, ethnicity, comorbidities, and baseline clinical status score by treatment group.
columns = [
    "AGE",
    "SEX",
    "RACE",
    "ETHNIC",
    "COMORB2",
    "DIAB2FL",
    "HYPFL",
    "OBESIFL",
    "ORDSCRG",
]
categorical = [
    "SEX",
    "RACE",
    "ETHNIC",
    "COMORB2",
    "DIAB2FL",
    "HYPFL",
    "OBESIFL",
    "ORDSCRG",
]
groupby = ["TRTP"]
labels = {
    "AGE": "Age",
    "SEX": "Sex",
    "RACE": "Race",
    "ETHNIC": "Ethnic group",
    "COMORB2": "No. of coexisting conditions",
    "DIAB2FL": "Type 2 diabetes",
    "HYPFL": "Hypertension",
    "OBESIFL": "Obesity",
    "ORDSCRG": "Score on ordinal scale",
}
mytable = TableOne(
    data,
    columns=columns,
    categorical=categorical,
    groupby=groupby,
    rename=labels,
    pval=False,
)
mytable
Grouped by TRTP

| | | Missing | Overall | Placebo | Remdesivir |
|---|---|---|---|---|---|
| n | | | 1062 | 521 | 541 |
| Age, mean (SD) | | 0 | 58.9 (15.0) | 59.2 (15.4) | 58.6 (14.6) |
| Sex, n (%) | F | 0 | 378 (35.6) | 189 (36.3) | 189 (34.9) |
| | M | | 684 (64.4) | 332 (63.7) | 352 (65.1) |
| Race, n (%) | AMERICAN INDIAN OR ALASKA NATIVE | 0 | 7 (0.7) | 3 (0.6) | 4 (0.7) |
| | ASIAN | | 135 (12.7) | 56 (10.7) | 79 (14.6) |
| | BLACK OR AFRICAN AMERICAN | | 226 (21.3) | 117 (22.5) | 109 (20.1) |
| | MULTIPLE | | 3 (0.3) | 1 (0.2) | 2 (0.4) |
| | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER | | 4 (0.4) | 2 (0.4) | 2 (0.4) |
| | UNKNOWN | | 121 (11.4) | 55 (10.6) | 66 (12.2) |
| | WHITE | | 566 (53.3) | 287 (55.1) | 279 (51.6) |
| Ethnic group, n (%) | HISPANIC OR LATINO | 0 | 250 (23.5) | 116 (22.3) | 134 (24.8) |
| | NOT HISPANIC OR LATINO | | 755 (71.1) | 373 (71.6) | 382 (70.6) |
| | NOT REPORTED | | 29 (2.7) | 14 (2.7) | 15 (2.8) |
| | UNKNOWN | | 28 (2.6) | 18 (3.5) | 10 (1.8) |
| No. of coexisting conditions, n (%) | 1 Comorbidity | 0 | 275 (25.9) | 137 (26.3) | 138 (25.5) |
| | 2 or more Comorbidities | | 579 (54.5) | 283 (54.3) | 296 (54.7) |
| | No Comorbidities | | 194 (18.3) | 97 (18.6) | 97 (17.9) |
| | Unknown | | 14 (1.3) | 4 (0.8) | 10 (1.8) |
| Type 2 diabetes, n (%) | N | 11 | 729 (69.4) | 361 (69.6) | 368 (69.2) |
| | Y | | 322 (30.6) | 158 (30.4) | 164 (30.8) |
| Hypertension, n (%) | N | 11 | 518 (49.3) | 255 (49.1) | 263 (49.4) |
| | Y | | 533 (50.7) | 264 (50.9) | 269 (50.6) |
| Obesity, n (%) | N | 13 | 573 (54.6) | 284 (54.8) | 289 (54.4) |
| | Y | | 476 (45.4) | 234 (45.2) | 242 (45.6) |
| Score on ordinal scale, n (%) | Baseline Clinical Status Score 4 | 11 | 138 (13.1) | 63 (12.2) | 75 (14.1) |
| | Baseline Clinical Status Score 5 | | 435 (41.4) | 203 (39.2) | 232 (43.5) |
| | Baseline Clinical Status Score 6 | | 193 (18.4) | 98 (18.9) | 95 (17.8) |
| | Baseline Clinical Status Score 7 | | 285 (27.1) | 154 (29.7) | 131 (24.6) |
The mean age of the patients was 58.9 years, and 64.4% were male.
On the basis of the evolving epidemiology of COVID-19 during the trial, 79.8% of patients were enrolled at sites in North America, 15.3% in Europe, and 4.9% in Asia.
Overall, 53.3% of the patients were White, 21.3% were Black, 12.7% were Asian, and 12.7% were designated as other or not reported; 250 (23.5%) were Hispanic or Latino.
Most patients had either one (25.9%) or two or more (54.5%) of the prespecified coexisting conditions at enrollment, most commonly hypertension (50.2%), obesity (44.8%), and type 2 diabetes mellitus (30.3%).
A total of 957 patients (90.1%) had severe disease at enrollment; 285 patients (26.8%) met category 7 criteria on the ordinal scale, 193 (18.2%) category 6, 435 (41.0%) category 5, and 138 (13.0%) category 4. Eleven patients (1.0%) had missing ordinal scale data at enrollment.
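Some percentages in the text differ slightly from those in the table (for example, 50.2% versus 50.7% for hypertension). One explanation that reproduces both sets of numbers is the denominator: the text appears to divide by all 1062 patients, whereas the table divides by the patients with a non-missing value. The short sketch below checks this arithmetic using counts taken from the table; the helper name percent is only illustrative.

```python
def percent(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * count / denominator, 1)


n_total = 1062

# (count with the condition, number of patients with a missing value)
conditions = {
    "Hypertension": (533, 11),
    "Obesity": (476, 13),
    "Type 2 diabetes": (322, 11),
}

for name, (count, missing) in conditions.items():
    of_all = percent(count, n_total)                # denominator: every patient
    of_known = percent(count, n_total - missing)    # denominator: non-missing only
    print(f"{name}: {of_all}% of all patients, {of_known}% of patients with data")
```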
The Kaplan-Meier estimator is a non-parametric estimate of the survival function S(t), the probability that the event of interest has not yet occurred by time t. In this analysis the event is recovery, so the plots show the cumulative proportion recovered, 1 - S(t). Here, we apply the standard Kaplan–Meier method to estimate the time to recovery for the remdesivir group and the placebo group in the overall population.
ix = data["TRTP"] == "Placebo"
ax = plt.subplot(111)
# RECCNSR flags censored observations, so 1 - RECCNSR is the recovery (event) indicator
kmf_exp = KaplanMeierFitter()
ax = kmf_exp.fit(
    data.loc[~ix]["TTRECOV"],
    (1 - data.loc[~ix]["RECCNSR"]),
    label="Remdesivir",
    timeline=range(0, 29),
).plot_cumulative_density(ax=ax)
kmf_control = KaplanMeierFitter()
ax = kmf_control.fit(
    data.loc[ix]["TTRECOV"],
    (1 - data.loc[ix]["RECCNSR"]),
    label="Placebo",
    timeline=range(0, 29),
).plot_cumulative_density(ax=ax)
ax.set_title("Overall Kaplan-Meier Estimates of Cumulative Recoveries")
ax.set_xlabel("Days", fontsize=10)
ax.set_ylabel("Proportion Recovered", fontsize=10)
ax.xaxis.set_ticks(np.arange(0, 29, 2))
ax.set_ylim(0, 1)
# Show the number at risk, censored, and with events below the plot
#add_at_risk_counts(kmf_exp, kmf_control, ax=ax, rows_to_show=['At risk'])
add_at_risk_counts(kmf_exp, kmf_control, ax=ax)
[Figure: Overall Kaplan-Meier Estimates of Cumulative Recoveries; x-axis: Days; y-axis: Proportion Recovered]
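To show what the fitted curves represent, here is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation. It is not the lifelines implementation, and the function names are illustrative. It assumes the same coding as the cell above, where a censoring flag of 1 means censored and 0 means recovery was observed, and it uses the five rows displayed by data.head() purely as a toy input.

```python
def kaplan_meier(durations, censored):
    """Return [(t, S(t))] at each distinct event time.

    durations : follow-up time per subject
    censored  : 1 if the subject was censored, 0 if the event was observed
                (assumed to match the coding of RECCNSR)
    """
    events = [1 - c for c in censored]
    survival = 1.0
    curve = []
    for t in sorted({d for d, e in zip(durations, events) if e == 1}):
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        # Product-limit step: multiply by the fraction without the event at t
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve


def cumulative_incidence(curve):
    """Convert S(t) into the cumulative proportion with the event, 1 - S(t)."""
    return [(t, 1 - s) for t, s in curve]


# TTRECOV and RECCNSR for the five rows shown by data.head()
ttrecov = [5, 6, 25, 11, 26]
reccnsr = [0, 0, 1, 0, 0]
for day, proportion in cumulative_incidence(kaplan_meier(ttrecov, reccnsr)):
    print(f"day {day}: proportion recovered = {proportion:.2f}")
```

The subject censored at day 25 leaves the risk set without contributing a recovery, which is why the estimate does not change at that time.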
Panel A shows the estimates (and 95% confidence bands) in the population with baseline ordinal scale = 4; Panel B in those with baseline ordinal scale = 5; Panel C in those with baseline ordinal scale = 6; Panel D in those with baseline ordinal scale = 7.
group1 = data[data["TRTP"] == "Remdesivir"]
group2 = data[data["TRTP"] == "Placebo"]
T = group1["TTRECOV"]
E = group1["RECCNSR"]
T1 = group2["TTRECOV"]
E1 = group2["RECCNSR"]
clinical_status_list = [x for x in data["ORDSCRG"].unique().tolist() if str(x) != "nan"]
clinical_status_list.sort()
fig, axes = plt.subplots(2, 2, figsize=(9, 7), constrained_layout=True)
fig.suptitle(
    "Kaplan–Meier Estimates of Cumulative Recoveries in Patients with Various Baseline Scores",
    fontsize=12,
)
axes = axes.reshape(4,)
for i, clinical_status in enumerate(clinical_status_list):
    ix = data["ORDSCRG"] == clinical_status
    kmf_control = KaplanMeierFitter()
    kmf_exp = KaplanMeierFitter()
    ax = kmf_exp.fit(
        T[ix], (1 - E[ix]), label="Remdesivir", timeline=range(0, 29)
    ).plot_cumulative_density(ax=axes[i])
    ax = kmf_control.fit(
        T1[ix], (1 - E1[ix]), label="Placebo", timeline=range(0, 29)
    ).plot_cumulative_density(ax=axes[i])
    ax.set_title(clinical_status, fontsize=10)
    ax.set_xlabel("Days", fontsize=9)
    ax.set_ylabel("Proportion Recovered", fontsize=9)
    ax.xaxis.set_ticks(np.arange(0, 29, 2))
    ax.set_ylim(0, 1)
    ax.text(
        -0.1,
        1.1,
        string.ascii_uppercase[i],
        transform=ax.transAxes,
        size=10,
        weight="bold",
    )
Patients in the remdesivir group had a shorter time to recovery than patients in the placebo group.
The Cox proportional hazards model is semi-parametric: the baseline hazard function is left unspecified, so it can take a different value at each unique event time. The model does assume, however, that the hazard ratio between any two groups remains constant throughout the study period. Leaving the baseline hazard unspecified makes the model more flexible than a fully parametric proportional hazards model, which additionally assumes that the baseline hazard follows a particular distribution of survival times.
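In symbols, the standard form of the model can be written as follows. The notation ($h_0$, $\beta$, $x$, and the stratum index $s$) is introduced here for illustration and does not appear elsewhere in the notebook.

$$
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^\top x\right)
$$

Here $h_0(t)$ is the unspecified baseline hazard and $x$ is the covariate vector. For two subjects who differ by one unit in covariate $j$ and are otherwise identical, the ratio of their hazards is

$$
\frac{h(t \mid x_j + 1)}{h(t \mid x_j)} = \exp(\beta_j),
$$

which does not depend on $t$; this is the proportional hazards assumption. In a stratified Cox model each stratum $s$ has its own baseline hazard while the coefficients are shared:

$$
h_s(t \mid x) = h_{0s}(t)\,\exp\!\left(\beta^\top x\right).
$$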
Here we calculate the hazard ratios using the stratified Cox model. Run the following cells to fit the Cox proportional hazards model using the lifelines package.
Data Wrangling:
# Collapse the raw category labels into the groups used in the model
di_race = {
    "WHITE": "White",
    "BLACK OR AFRICAN AMERICAN": "Black",
    "ASIAN": "Asian",
    "UNKNOWN": "Other",
    "AMERICAN INDIAN OR ALASKA NATIVE": "Other",
    "MULTIPLE": "Other",
    "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER": "Other",
}
di_ethnic = {
    "HISPANIC OR LATINO": "Hispanic or Latino",
    "NOT HISPANIC OR LATINO": "Not Hispanic or Latino",
    "UNKNOWN": "Unknown",
    "NOT REPORTED": "Unknown",
}
di_sex = {"M": "Male", "F": "Female"}
data1 = data.replace({"RACE": di_race, "ETHNIC": di_ethnic, "SEX": di_sex})
# Note: pd.cut uses right-closed bins by default, so the bins here are
# (18, 40], (40, 65], (65, 95]; an age of exactly 40 falls in the first bin
age_category = pd.cut(
    data1.AGE, bins=[18, 40, 65, 95], labels=["18 to <40 yr", "40 to <65 yr", "≥65 yr"]
)
data1["AGE"] = age_category
symp_category = pd.cut(
    data1.BDURSYMP, bins=[0, 10, 46], labels=["≤10 days", ">10 days"]
)
symp_category = symp_category.cat.add_categories(["Unknown"])
data1["BDURSYMP"] = symp_category
data1.loc[data1["BDURSYMP"].isnull(), "BDURSYMP"] = "Unknown"
# TTRECOV and RECCNSR appear on the right-hand side only so that they are
# carried into the design matrix X alongside the dummy-coded covariates
model_expr = "TTRECOV ~ STRATUM + TTRECOV + C(REGION) + C(RACE) + C(ETHNIC) + C(AGE) + C(SEX) + C(BDURSYMP) + C(ORDSCRG) + RECCNSR"
y, X = dmatrices(model_expr, data1, return_type="dataframe")
# Drop the dummy columns for the "Unknown" categories
X = X[X.columns.drop(list(X.filter(regex="Unknown")))]
X.head()
| | Intercept | STRATUM[T.Severe Disease] | C(REGION)[T.Europe] | C(REGION)[T.North America] | C(RACE)[T.Black] | C(RACE)[T.Other] | C(RACE)[T.White] | C(ETHNIC)[T.Not Hispanic or Latino] | C(AGE)[T.40 to <65 yr] | C(AGE)[T.≥65 yr] | C(SEX)[T.Male] | C(BDURSYMP)[T.>10 days] | C(ORDSCRG)[T.Baseline Clinical Status Score 5] | C(ORDSCRG)[T.Baseline Clinical Status Score 6] | C(ORDSCRG)[T.Baseline Clinical Status Score 7] | TTRECOV | RECCNSR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 5.0 | 0.0 |
1 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 6.0 | 0.0 |
2 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 25.0 | 1.0 |
3 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 11.0 | 0.0 |
4 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 26.0 | 0.0 |
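The age labels used above ("18 to <40 yr", "40 to <65 yr", "≥65 yr") describe left-closed intervals, while pd.cut closes each bin on the right unless right=False is passed. The following standard-library sketch illustrates how the two conventions differ at the boundary ages. The function names are hypothetical, and the sketch only mimics the binning logic rather than calling pandas.

```python
from bisect import bisect_left, bisect_right

EDGES = [40, 65]
LABELS = ["18 to <40 yr", "40 to <65 yr", "≥65 yr"]


def age_group_left_closed(age):
    """Bins [18, 40), [40, 65), [65, ...): the reading suggested by the labels."""
    return LABELS[bisect_right(EDGES, age)]


def age_group_right_closed(age):
    """Bins (..., 40], (40, 65], (65, ...): the pd.cut default convention."""
    return LABELS[bisect_left(EDGES, age)]


for age in (39, 40, 64, 65):
    print(age, "|", age_group_left_closed(age), "|", age_group_right_closed(age))
```

Only the boundary ages 40 and 65 are classified differently, so whether this matters in practice depends on how many patients have exactly those ages.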
Create the model and fit it to the data:
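warnings.simplefilter("ignore")
cph = CoxPHFitter()
# NOTE: the Kaplan-Meier cells pass 1 - RECCNSR as the event indicator, whereas
# RECCNSR is used directly as event_col here; the summary that follows reflects this coding
cph.fit(
    df=X,
    duration_col="TTRECOV",
    event_col="RECCNSR",
    strata=["STRATUM[T.Severe Disease]"],
)
# Plot the HR
cph.plot(hazard_ratios=True)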
[Figure: hazard ratio for each covariate; x-axis: HR (95% CI)]
# Have a look at the significance of the features
cph.print_summary()
model | lifelines.CoxPHFitter |
---|---|
duration col | 'TTRECOV' |
event col | 'RECCNSR' |
strata | [STRATUM[T.Severe Disease]] |
baseline estimation | breslow |
number of observations | 1051 |
number of events observed | 300 |
partial log-likelihood | -1456.67 |
time fit was run | 2022-06-27 18:27:32 UTC |
| | coef | exp(coef) | se(coef) | coef lower 95% | coef upper 95% | exp(coef) lower 95% | exp(coef) upper 95% | cmp to | z | p | -log2(p) |
---|---|---|---|---|---|---|---|---|---|---|---|
C(REGION)[T.Europe] | -1.08 | 0.34 | 0.45 | -1.95 | -0.20 | 0.14 | 0.82 | 0.00 | -2.40 | 0.02 | 5.94 |
C(REGION)[T.North America] | -1.01 | 0.36 | 0.41 | -1.83 | -0.20 | 0.16 | 0.82 | 0.00 | -2.44 | 0.01 | 6.10 |
C(RACE)[T.Black] | 0.24 | 1.27 | 0.26 | -0.27 | 0.76 | 0.76 | 2.13 | 0.00 | 0.92 | 0.36 | 1.48 |
C(RACE)[T.Other] | 0.13 | 1.14 | 0.31 | -0.48 | 0.74 | 0.62 | 2.10 | 0.00 | 0.42 | 0.67 | 0.57 |
C(RACE)[T.White] | 0.10 | 1.10 | 0.25 | -0.39 | 0.58 | 0.68 | 1.79 | 0.00 | 0.39 | 0.70 | 0.52 |
C(ETHNIC)[T.Not Hispanic or Latino] | -0.10 | 0.90 | 0.17 | -0.42 | 0.22 | 0.65 | 1.25 | 0.00 | -0.61 | 0.54 | 0.88 |
C(AGE)[T.40 to <65 yr] | -0.69 | 0.50 | 0.21 | -1.09 | -0.29 | 0.34 | 0.75 | 0.00 | -3.36 | <0.005 | 10.31 |
C(AGE)[T.≥65 yr] | -0.66 | 0.52 | 0.20 | -1.06 | -0.26 | 0.35 | 0.77 | 0.00 | -3.22 | <0.005 | 9.61 |
C(SEX)[T.Male] | 0.10 | 1.10 | 0.13 | -0.16 | 0.35 | 0.86 | 1.42 | 0.00 | 0.75 | 0.45 | 1.15 |
C(BDURSYMP)[T.>10 days] | 0.06 | 1.06 | 0.13 | -0.19 | 0.31 | 0.83 | 1.37 | 0.00 | 0.49 | 0.63 | 0.68 |
C(ORDSCRG)[T.Baseline Clinical Status Score 5] | -0.10 | 0.91 | 0.60 | -1.27 | 1.07 | 0.28 | 2.92 | 0.00 | -0.17 | 0.87 | 0.20 |
C(ORDSCRG)[T.Baseline Clinical Status Score 6] | -0.35 | 0.71 | 0.60 | -1.52 | 0.82 | 0.22 | 2.28 | 0.00 | -0.58 | 0.56 | 0.84 |
C(ORDSCRG)[T.Baseline Clinical Status Score 7] | -0.13 | 0.88 | 0.59 | -1.29 | 1.04 | 0.27 | 2.82 | 0.00 | -0.21 | 0.83 | 0.27 |
Concordance | 0.56 |
---|---|
Partial AIC | 2939.33 |
log-likelihood ratio test | 19.88 on 13 df |
-log2(p) of ll-ratio test | 3.35 |
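The columns of the summary table are linked by simple arithmetic: the hazard ratio is exp(coef), and the 95% confidence limits follow from coef ± 1.96 × se(coef). The sketch below recomputes the C(REGION)[T.Europe] row from its rounded coef and se(coef) values. The helper name wald_summary is illustrative, and small differences from the table are expected because the inputs are rounded.

```python
import math


def wald_summary(coef, se, z_crit=1.96):
    """Hazard ratio, Wald confidence limits, z statistic and two-sided p-value."""
    hazard_ratio = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    z = coef / se
    # Two-sided p-value under a standard normal reference distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return hazard_ratio, lower, upper, z, p_value


hr, lower, upper, z, p_value = wald_summary(coef=-1.08, se=0.45)
print(f"HR {hr:.2f} (95% CI {lower:.2f} to {upper:.2f}), z = {z:.2f}, p = {p_value:.2f}")
```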
This notebook replicates the analysis in a recently published randomized comparative trial evaluating remdesivir treatment for COVID-19. In the workspace of NIAID Clinical Trials Data Commons, researchers can apply the commonly used survival analysis techniques, such as the Kaplan–Meier method, Cox model, hazard ratio method, etc. Clinical investigators are encouraged to consider applying these methods for quantifying treatment effects in future studies of COVID-19 using NIAID Clinical Trials Data Commons.