
SoBigData Articles

Emerging risk factors for recurrent and persistent differentiated thyroid cancer

Introduction

Proper risk stratification of patients with differentiated thyroid cancer (DTC) is essential to avoid both unnecessary diagnostic procedures in low-risk patients and clinical inertia in cases that may require more aggressive treatment (1). One of the most widely used tools for prognostic stratification of patients with DTC is the risk stratification system included in the American Thyroid Association (ATA) guidelines. It was developed from a literature review of studies covering different populations, settings and timeframes, and has been validated in several retrospective single-center studies (2). In the prospective cohort study described below, we analyzed data from over 4000 cases of DTC managed in 40 different healthcare facilities in Italy. Our objectives were (i) to develop a comprehensive, data-driven predictive model based on the characteristics available at the time of initial treatment, (ii) to compare its performance with the current ATA risk score and (iii) to determine the relative weight of various potential predictors.

Data

The data used originate from the web-based database of the Italian Thyroid Cancer Observatory (ITCO), opened in 2013 at the Thyroid Cancer Center of the Sapienza University of Rome and including 49 other thyroid cancer centers in Italy. The data, collected on more than 10000 patients, contain demographic and biometric information, circumstances of diagnosis, tumor pathology, surgical and radioactive iodine treatments, and results of periodic follow-up examinations. For cases with a histological diagnosis of DTC, the response to initial treatment is classified as excellent, biochemically incomplete, structurally incomplete or indeterminate, based on data collected during the clinical evaluation at the 1-year follow-up visit.

Model

A Decision Tree model was chosen for its simplicity, which allows us to investigate in more detail the impact of different variables on the prediction, a critical requirement in the medical field, and to apply the results directly to clinical practice (3). With the task of assigning a risk index to each patient, we performed an extensive grid search with 3-fold cross-validation to optimize the model parameters. Two different models were adopted: the first employed all available variables, while the second excluded variables derived from radioiodine treatment. Because this treatment is prescribed by physicians, independence between the treatment variables and tumor recurrence cannot be assumed.
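As an illustration of what a grid search with 3-fold cross-validation does, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the source does not name the software or the parameter grid, so the toy one-feature stump, the `min_samples_leaf` parameter, the helper names and the synthetic data are all assumptions made for the example.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split indices into k interleaved folds (no shuffling, so the run is reproducible)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def grid_search_cv(X, y, fit, score, param_grid, k=3):
    """Return (best_params, best_mean_score) over every combination in param_grid."""
    folds = k_fold_indices(len(X), k)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(
                score(model, [X[i] for i in held_out], [y[i] for i in held_out])
            )
        candidate = mean(fold_scores)
        if candidate > best_score:  # ties keep the earlier combination
            best_params, best_score = params, candidate
    return best_params, best_score


def fit_stump(X, y, min_samples_leaf=1):
    """Toy stand-in for a decision tree: predict 1 when x > threshold."""
    majority = int(sum(y) * 2 >= len(y))
    best_correct = sum(label == majority for label in y)
    best_threshold = None  # None means "always predict the majority class"
    for t in sorted(set(X)):
        left = [label for x, label in zip(X, y) if x <= t]
        right = [label for x, label in zip(X, y) if x > t]
        if min(len(left), len(right)) < min_samples_leaf:
            continue  # split would leave a leaf that is too small
        correct = left.count(0) + right.count(1)
        if correct > best_correct:
            best_correct, best_threshold = correct, t
    return {"threshold": best_threshold, "majority": majority}


def accuracy(model, X, y):
    """Fraction of held-out samples the stump classifies correctly."""
    t = model["threshold"]
    preds = [model["majority"] if t is None else int(x > t) for x in X]
    return sum(p == label for p, label in zip(preds, y)) / len(y)


# Synthetic data (not patient data): label is 1 when the single feature is >= 15.
X = list(range(30))
y = [int(x >= 15) for x in X]
best_params, best_score = grid_search_cv(
    X, y, fit_stump, accuracy, {"min_samples_leaf": [1, 5, 20]}, k=3
)
print(best_params, round(best_score, 3))
```

In practice the folds would normally be shuffled or stratified by outcome, and the model would be a full decision tree with several tunable parameters; the loop structure stays the same.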

Results

The final model outperforms the ATA risk stratification system: the sensitivity of the high-risk classification for structural incomplete response increases from 37% to 49%. The negative predictive value for the low-risk patients also increases, although only by about 3%. If radioiodine-derived data are used as input, the sensitivity of the high-risk classification for structural disease increases to 54.8%.
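The two metrics quoted above use different dichotomizations of the risk classes: sensitivity asks how many patients with the event were placed in the high-risk class, while negative predictive value asks how many patients placed in the low-risk class were free of the event. The sketch below shows both calculations on a made-up list of ten patients; the function names and the data are illustrative assumptions and do not reproduce the study's figures.

```python
def sensitivity_of_class(risk, event, positive_class="high"):
    """Share of patients with the event who were assigned to positive_class."""
    classes_with_event = [r for r, e in zip(risk, event) if e]
    return sum(r == positive_class for r in classes_with_event) / len(classes_with_event)


def npv_of_class(risk, event, negative_class="low"):
    """Share of patients assigned to negative_class who did not have the event."""
    events_in_class = [e for r, e in zip(risk, event) if r == negative_class]
    return sum(not e for e in events_in_class) / len(events_in_class)


# Hypothetical toy cohort: risk class and whether a structural incomplete response occurred.
risk = ["low", "low", "low", "low",
        "intermediate", "intermediate", "intermediate",
        "high", "high", "high"]
event = [False, False, False, True,
         False, True, False,
         True, True, False]

high_risk_sensitivity = sensitivity_of_class(risk, event)
low_risk_npv = npv_of_class(risk, event)
print(high_risk_sensitivity, low_risk_npv)
```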

Feature Importance

The table shows the importance of the individual features according to the decision tree. Feature importance for the decision tree is measured as the reduction in impurity at a node weighted by the probability of reaching that node (i.e. number of samples reaching that node divided by total number of samples). 
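The definition above can be sketched for a tiny hand-built tree. The impurity measure is not named in the text, so Gini impurity is assumed here; the feature names, class counts and helper functions are hypothetical and chosen only to make the arithmetic visible.

```python
def gini(counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_impurity_decrease(n_total, parent, left, right):
    """Impurity reduction at a node, weighted by the probability of reaching it."""
    n_parent, n_left, n_right = sum(parent), sum(left), sum(right)
    reduction = (
        gini(parent)
        - (n_left / n_parent) * gini(left)
        - (n_right / n_parent) * gini(right)
    )
    return (n_parent / n_total) * reduction


# Hypothetical tree on 100 samples: (split feature, parent counts, left counts, right counts).
splits = [
    ("feature_a", [60, 40], [50, 10], [10, 30]),  # root node
    ("feature_b", [10, 30], [10, 0], [0, 30]),    # right child of the root
]

n_total = sum(splits[0][1])
raw = {}
for feature, parent, left, right in splits:
    raw[feature] = raw.get(feature, 0.0) + weighted_impurity_decrease(
        n_total, parent, left, right
    )

# Normalize so the importances sum to 100%.
importance = {feature: 100 * value / sum(raw.values()) for feature, value in raw.items()}
print({feature: round(value, 1) for feature, value in importance.items()})
```

A feature used in several nodes accumulates the weighted reductions of all of them, which is why a variable that splits many patients near the root tends to score higher than one used only deep in the tree.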

 

| Feature | No radioiodine-derived data (%) | All data (%) |
|---|---|---|
| Histological extra-thyroid extension | 17.1 | 5.6 |
| Histological tumor size (mm) | 12.4 | 6.7 |
| M status | 10.4 | 1.2 |
| Age at diagnosis | 9.7 | 7.4 |
| Number of metastatic lymph nodes | 8 | 6.1 |
| BMI | 7.7 | 3.8 |
| Surgical margins | 3.5 | 2.5 |
| Circumstances of thyroid nodule diagnosis | 2.7 | 1.2 |
| Histology subtypes | 2.6 | 1.2 |
| Pre-surgical cytology | 2.5 | 2.8 |
| Family history of thyroid nodules | 2.4 | 1.8 |
| Surgical approach | 2.3 | 1.9 |
| Sex | 2.1 | <1 |
| N status | 2.1 | 1.3 |
| Histology tumoral foci | 2 | <1 |
| Family history of thyroid cancers | 1.7 | 3.1 |
| Number of removed lymph nodes | 1.6 | <1 |
| Presence of (any) somatic mutation | 1.3 | <1 |
| Histology vascular invasion | 1.1 | 2.3 |
| Circumstances of thyroid cancer diagnosis (pre- vs. post-surgical) | 1.1 | 0.9 |
| Neck dissection | 0.9 | 2.5 |
| RAI therapy, patient preparation | / | 1.5 |
| RAI therapy, anti-Tg antibodies before treatment | / | 2.3 |
| Decision to perform RAI therapy | / | 5.6 |
| Stimulated Tg (ng/mL) before RAI therapy | / | 33.4 |

Table: Feature importance (%) according to the Decision Tree algorithm. Only features with importance greater than or equal to 1% in at least one model are shown. Highlighted rows include features not considered in the current ATA risk stratification system. The tree with access to data derived from radioiodine treatment assigns great importance to these variables (over 40% of the total) and uses fewer variables than the other. Note that many of the variables that lose importance (e.g. tumor size, extrathyroidal extension, presence of distant metastases) are probably highly correlated with the decision to perform radioiodine treatment.
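The "over 40%" figure can be checked directly against the table by summing the importances of the four radioiodine-derived features in the "All data" column. The short sketch below does only that arithmetic; the values are copied from the table, and the variable names are ours.

```python
# Importances (%) of the radioiodine-derived features, "All data" column of the table.
rai_importance = {
    "RAI therapy, patient preparation": 1.5,
    "RAI therapy, anti-Tg antibodies before treatment": 2.3,
    "Decision to perform RAI therapy": 5.6,
    "Stimulated Tg (ng/mL) before RAI therapy": 33.4,
}

# Round to one decimal to avoid floating-point noise in the printed total.
rai_share = round(sum(rai_importance.values()), 1)
print(f"Radioiodine-derived features: {rai_share}% of total importance")
```

Stimulated Tg alone accounts for most of this share, which is consistent with the drop in importance of the other variables once radioiodine-derived data are available to the tree.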

New risk stratification

The data derived from the decision tree were used to revise the ATA-proposed risk stratification: the presence of distant metastases and of lymph node metastases generates the first discriminations, and more specific groups are then derived from the number of lymph node metastases, tumor size, completeness of surgical resection, Hürthle cell histotype, body mass index, and age.

Although the role of overweight and obesity is disputed (4), our data show that body composition plays a role in predicting initial treatment response. BMI acts as a risk modifier, refining patient clusters only after other discriminating factors are applied. It should also be noted that both age and BMI take on opposite meanings in the different patient clusters of this cohort, being either a protective or risk characteristic. This suggests that their interaction with other variables is complex and that they may act as surrogate markers of other characteristics.

 

References
  1. Lamartina L, Grani G, Durante C, Filetti S 2018 Recent advances in managing differentiated thyroid cancer. F1000Res 7:86.
  2. Pitoia F, Jerkovich F, Urciuoli C, Schmidt A, Abelleira E, Bueno F, Cross G, Tuttle RM 2015 Implementing the Modified 2009 American Thyroid Association Risk Stratification System in Thyroid Cancer Patients with Low and Intermediate Risk of Recurrence. Thyroid 25:1235-1242.
  3. Podgorelec V, Kokol P, Stiglic B, Rozman I 2002 Decision trees: an overview and their use in medicine. J Med Syst 26:445-463.
  4. Matrone A, Ceccarini G, Beghini M, Ferrari F, Gambale C, D'Aqui M, Piaggi P, Torregrossa L, Molinaro E, Basolo F, Vitti P, Santini F, Elisei R 2020 Potential Impact of BMI on the Aggressiveness of Presentation and Clinical Outcome of Differentiated Thyroid Cancer. J Clin Endocrinol Metab 105.