Risk factors for metastatic cutaneous squamous cell carcinoma: Refinement and replication based on 2 nationwide nested case-control studies

INTRODUCTION
Cutaneous squamous cell carcinoma (cSCC) is one of the most common cancers worldwide with metastatic potential. 1,2 The high incidence of primary cSCC makes it challenging to correctly identify the small percentage (2%-5%) of patients who are at high risk of metastasis and would benefit from intense surveillance and/or adjuvant treatment strategies.][5][6] However, these studies were mainly based on single-center retrospective cohorts with relatively small numbers of metastatic cSCC, resulting in insufficient power to draw firm conclusions. Brantsch et al 5 concluded that large independent validation studies (involving >1500 patients) are needed to reliably assess risk factors for metastasis.][9][10] We aimed to analyze important patient- and tumor-based risk factors for metastasis using the largest data set, to our knowledge, of metastatic cSCC so far and thereafter to replicate our results in a geographically separate patient population.

Study design
We conducted 2 nested case-control studies using data from England and the Netherlands. Cases and controls were matched 1:1 on minimum follow-up time. Follow-up time for cases ended on the date of metastasis and, for controls, on the date of death or end of follow-up, whichever occurred first. Metastases from potential sources other than skin or of unknown origin were excluded.
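
As an illustration of 1:1 matching on minimum follow-up time, the sketch below pairs each case with one unused control whose follow-up is at least as long as the case's time to metastasis. The eligibility rule, the function name `match_controls`, the data layout, and the greedy random selection are assumptions made for this example and are not taken from the study's selection procedures.

```python
import random


def match_controls(cases, controls, seed=0):
    """Pair each case with one unused control followed for at least as long.

    cases:    dict of case id -> time to metastasis (days)
    controls: dict of control id -> follow-up time (days)
    Returns a dict of case id -> control id (None if no eligible control is left).
    All names and the matching rule are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    available = dict(controls)  # copy so the caller's dict is not modified
    pairs = {}
    # Handle cases with the longest time first: they have the fewest eligible controls.
    for case_id, t_case in sorted(cases.items(), key=lambda kv: -kv[1]):
        eligible = sorted(cid for cid, t in available.items() if t >= t_case)
        if not eligible:
            pairs[case_id] = None
            continue
        chosen = rng.choice(eligible)
        pairs[case_id] = chosen
        del available[chosen]  # each control is used at most once
    return pairs


example_pairs = match_controls(
    cases={"c1": 100, "c2": 400},
    controls={"k1": 50, "k2": 150, "k3": 500},
)
```

In this toy input, control `k1` is never eligible because its follow-up (50 days) is shorter than either case's time to metastasis.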

Patient populations
Main analyses (England). Data from all patients with a histopathologically confirmed primary cSCC with diagnosis and excision between January 1, 2013, and December 31, 2015, were included from the National Disease Registration Service, England. 11 Patient inclusion and metastasis selection procedures have been described previously. 11,12 Briefly, patients with metastatic cSCC (cases) were identified using an extensive algorithm, and all identified reports were reviewed by Z.C.V., with a second opinion from B.R. in ambiguous cases. Thereafter, controls were randomly selected from patients with a cSCC in 2013 with no metastasis occurrence until the end of follow-up (ie, December 31, 2015).
Replication (the Netherlands). The cases and controls were selected from a Dutch nationwide cohort of patients with a histopathologically confirmed first primary cSCC in 2007/2008, as registered by the Netherlands Cancer Registry, 13 which has been linked to the nationwide network and registry of histopathology and cytopathology (PALGA) for the retrieval of subsequent and metastatic cSCCs up to December 31, 2018. 14 The selection of metastatic cSCCs has been described before. 10 Briefly, metastases were identified using an algorithm based on the pathology reports. Thereafter, all selected reports from potential cases were reviewed manually, and nonmetastatic controls were selected from the remaining patients.

Risk factors
In the English data set, patient characteristics were derived from the patient administration systems. To assess for immunosuppression, registry data and hospital episode statistics were analyzed for diagnosis/operation codes associated with solid organ transplantations or hematologic malignancies before the date of primary tumor diagnosis or within 183 days. In the Dutch data set, data regarding age, sex, and hematologic malignancies were derived from the Netherlands Cancer Registry and data on solid organ transplantations through linkage with the Netherlands Organ Transplant Registry. 15 The number of previous cSCCs was retrieved from pathology reports and counted manually for each patient until the occurrence of the case or corresponding control.
All tumor characteristics were extracted from pathology (A.L.M.) reports for the English data set and included: tumor location (face, scalp and neck, trunk and limbs), macroscopic diameter as measured by a pathologist in millimeters, thickness in millimeters, differentiation grade (good/moderate vs poor/undifferentiated), morphology (acantholytic/desmoplastic/spindle vs none), perineural/lymphovascular invasion (yes/no), and a variable on the extent of tissue involvement (dermis/subcutaneous fat/beyond subcutaneous fat [ie, in muscle/cartilage/bone]). If the invasion depth was not stated in the pathology report, the tumor was assumed not to invade beyond subcutaneous fat. If a tumor was described as a "minimally invasive cSCC," it was assumed to be less than Clark level 5 and well differentiated. The method used for measuring thickness was assumed to follow the Royal College of Pathologists guidance. 16
In the Netherlands, formalin-fixed paraffin-embedded specimens of the excisions of all cases and controls were retrieved from the pathology archives. A new histopathologic slide was scored for the aforementioned risk factors by a dermatopathologist (A.L.M.) who was blinded to the outcome. Tumor diameter was the only variable extracted from the pathology reports and comprised the macroscopic diameter as measured by the pathologist. Tumor thickness was measured according to Breslow criteria from the granular layer of the skin to the deepest point of the tumor. Differentiation grade was scored following the adjusted Broder classification system 17 (<25% undifferentiated cells: well; 25%-75% undifferentiated cells: moderate; >75% undifferentiated cells: poor). In our analyses, we dichotomized differentiation grade into good/moderate vs poor differentiation.

CAPSULE SUMMARY
• Risk factors for cutaneous squamous cell carcinoma metastasis have previously been investigated in small data sets with relatively few metastases.
• Using 2 large nationwide data sets, diameter, thickness, poor differentiation, invasion in and beyond subcutaneous fat, perineural/lymphovascular invasion, male sex, and facial localization were determined to be the important risk factors for metastasis.

Statistical analyses
Conditional logistic regression analyses with backward stepwise selection identified the set of statistically significant metastasis risk factors in the English data set. A 2-sided statistical significance level of P = .10 was used in the backward stepwise selection to reduce optimism and selection bias. Variance inflation factors were calculated, with no evidence of multicollinearity. Missing values for covariates were imputed 20 times using multivariate imputation by chained equations. The imputation model included all covariates, the outcome, and, for the English data set, ethnicity and deprivation as auxiliary variables. For the continuous variables age, number of previous cSCCs, diameter, and thickness, restricted cubic splines with 3 knots were used to evaluate a possible nonlinear relationship with the metastasis outcome. To facilitate interpretation, nonlinear variables were categorized into clinically relevant categories on the basis of 2 criteria: (1) an increase of at least 2 odds ratio (OR) points per category per variable and (2) as little overlap as possible between the confidence intervals (CIs) of the categories within a variable. For comparison purposes, we categorized diameter and thickness following the American Joint Committee on Cancer eighth edition (AJCC8) 18 criteria and, for diameter, using the Brigham and Women's Hospital (BWH) 19 staging system. Since the BWH classification does not specify any criteria for thickness, this was included as a continuous variable.
The discriminative ability of the final set of risk factors was assessed by Harrell's concordance index (c-index) in the main analyses. The final risk factors were replicated in the Dutch data set to investigate whether similar ORs would be found. The c-indices of the models could not be compared because no absolute risk model was available. Our model fit was compared with that of the AJCC8 and BWH using Nagelkerke's pseudo R² measure, which quantifies the improvement in model likelihood over a null model and can be used to compare different models using the same data set. 20 This measure ranges from 0 to 1, and the higher the value, the better the model predicts the outcome.
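
To make the two central quantities concrete, the following sketch fits a conditional logistic regression for 1:1 matched pairs and computes Nagelkerke's pseudo R² from the fitted and null log-likelihoods. It is a simplified illustration, not the analysis code of this study: it omits stepwise selection, imputation, and splines, the function names are hypothetical, and taking n as the number of matched pairs in the R² formula is an assumption.

```python
import numpy as np


def fit_paired_clogit(x_case, x_control, n_iter=50, tol=1e-10):
    """Conditional logistic regression for 1:1 matched pairs (Newton-Raphson).

    With 1:1 matching, pair i contributes 1 / (1 + exp(-beta . d_i)) to the
    conditional likelihood, where d_i = x_case_i - x_control_i. This equals a
    logistic model on the within-pair differences without an intercept.
    Returns (beta, log-likelihood). Assumes the data are not perfectly separated.
    """
    d = np.asarray(x_case, dtype=float) - np.asarray(x_control, dtype=float)
    if d.ndim == 1:
        d = d[:, None]
    beta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-d @ beta))
        grad = d.T @ (1.0 - p)                        # score vector
        info = (d * (p * (1.0 - p))[:, None]).T @ d   # observed information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    loglik = -np.sum(np.log1p(np.exp(-d @ beta)))
    return beta, loglik


def nagelkerke_r2(loglik_model, loglik_null, n):
    """Cox-Snell R2 rescaled by its maximum so that the range is 0 to 1."""
    cox_snell = 1.0 - np.exp(2.0 * (loglik_null - loglik_model) / n)
    max_r2 = 1.0 - np.exp(2.0 * loglik_null / n)
    return cox_snell / max_r2


# Toy data with one binary risk factor: 30 pairs in which only the case is
# exposed, 10 pairs in which only the control is exposed, 20 concordant pairs.
x_case = np.array([1] * 30 + [0] * 10 + [1] * 20)
x_control = np.array([0] * 30 + [1] * 10 + [1] * 20)
beta, loglik = fit_paired_clogit(x_case, x_control)
n_pairs = len(x_case)
loglik_null = n_pairs * np.log(0.5)  # beta = 0: each pair member equally likely
r2 = nagelkerke_r2(loglik, loglik_null, n_pairs)
odds_ratio = float(np.exp(beta[0]))
```

For a single binary risk factor the estimated OR reduces to the ratio of discordant pairs (here 30/10), which offers a simple check on the fitting routine.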
Ethical approval and informed consent were not required for analyzing data from the National Disease Registration Service following section 251 of the National Health Service Act 2006. 21 Approval was obtained from the scientific committees of the Netherlands Cancer Registry, PALGA, and the Dutch Transplant Foundation, and a waiver of informed consent was granted by the Erasmus Medical Center (MEC-2020-0147). Statistical analyses were performed using SPSS 25.0 statistical software (SPSS Inc) and R statistical software version 3.4.1 with the clogit function of the survival package (R Core Team, 2017).

RESULTS
In total, 887 metastatic cases and 887 nonmetastatic controls (n = 1774) were included from the English data set for the main analyses, and 217 cases and 217 controls (n = 434) were included from the Dutch data set for the replication analyses (Table I; Supplementary Fig 1, available via Mendeley).

Categorizations of diameter and thickness
Table III ("English OR") shows the categorizations of the continuous variables diameter and thickness into clinically relevant categories as defined a priori, adjusted for all other covariates from the final model. For diameter, the reference category consisted of tumors measuring <15 mm, with 15 mm to 30 mm and ≥30 mm producing increasing ORs with distinct CIs: 2.29 (95% CI, 1.52-3.47) and 6.82 (95% CI, 3.58-13.00), respectively. For thickness, the reference category included all cSCCs with a thickness of <3 mm, followed by the categories 3.0 mm to 8.0 mm and ≥8.0 mm. Although the ORs per category showed an increasing trend, the CIs were slightly overlapping: 3.21 (95% CI, 1.98-5.22) and 5.59 (95% CI, 2.75-11.36), respectively. Categorizing the diameter and thickness variables did not change the c-index or the pseudo R² of our final model. For comparison, ORs for diameter and thickness with cutoff values from the AJCC8 and BWH classifications were also calculated (Supplementary Table III, available via Mendeley at https://doi.org/10.17632/6z4tpsmdwt.1).
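
The categories reported above can be written as a small lookup, sketched here for illustration. The function and dictionary names are hypothetical; the cutoffs and ORs are those reported for the English data set, and treating the upper bound of each middle category as exclusive follows from the "≥30 mm" and "≥8.0 mm" definitions. The ORs are relative to the reference category and adjusted for the other covariates, so they are not absolute risks.

```python
# Adjusted odds ratios reported for the English data set (reference = 1.0).
DIAMETER_OR = {"<15 mm": 1.0, "15 to <30 mm": 2.29, ">=30 mm": 6.82}
THICKNESS_OR = {"<3 mm": 1.0, "3 to <8 mm": 3.21, ">=8 mm": 5.59}


def diameter_category(diameter_mm):
    """Map a macroscopic tumor diameter (mm) to the study's diameter category."""
    if diameter_mm < 15:
        return "<15 mm"
    if diameter_mm < 30:
        return "15 to <30 mm"
    return ">=30 mm"


def thickness_category(thickness_mm):
    """Map a tumor thickness (mm) to the study's thickness category."""
    if thickness_mm < 3:
        return "<3 mm"
    if thickness_mm < 8:
        return "3 to <8 mm"
    return ">=8 mm"


# Hypothetical tumor: 22 mm in diameter and 8.5 mm thick.
example = (diameter_category(22), thickness_category(8.5))
```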

Replication
In the Dutch data set, similar effect estimates were observed for all metastasis risk factors (Table II, "Dutch OR"). The effect plots for the spline functions of diameter and thickness are shown in Supplementary Figs 2 and 3 (available via Mendeley at https://doi.org/10.17632/6z4tpsmdwt.1). However, replication of the categorized diameter and thickness variables failed to meet our predefined criteria in the multivariable model (Table III, "Dutch OR"): the diameter categories showed overlapping CIs and the thickness categories failed to produce an increasing trend, with almost equal ORs of 1.33 (95% CI, 0.69-2.57) and 1.47 (95% CI, 0.55-3.98), respectively. Nevertheless, univariable analyses showed increasing ORs with increasing diameter and thickness values, and the distribution of both variables was distinct between cases and controls, comparable to the pattern observed in the English data set (Supplementary Table IV and Supplementary Fig 4, available via Mendeley at https://doi.org/10.17632/6z4tpsmdwt.1). The pseudo R² measure of the replicated model was 0.52 (95% CI, 0.44-0.74) for the model with splines and 0.48 (95% CI, 0.40-0.71) for the model with the categorized diameter and thickness variables compared with 0.25 (95% CI, 0.14-0.42) for AJCC8 and 0.36 (95% CI, 0.23-0.53) for BWH.

DISCUSSION
We analyzed the most common risk factors for metastatic cSCC using 2 large nationwide data sets. We confirmed the previously found significant associations for diameter, thickness, poor differentiation, deep invasion, and perineural/lymphovascular invasion. Clinical parameters such as sex and body site were also significant risk factors, whereas immune status did not remain in the model. Replication of our risk factors produced similar effect estimates, supporting our findings.
5][6]22 Sex is not included in current staging systems but was an important risk factor in our study, as was previously also seen for melanoma. 23 This could be due to biological sex differences, delayed presentation, greater UV exposure secondary to less protection from hair coverage, or outdoor occupations/hobbies. Differentiation grade is included in the BWH but has been omitted from the AJCC8 because of reproducibility issues. 24 We obtained good model discrimination by dichotomizing differentiation grade and believe that removing the middle category may increase reproducibility.
6]10 This may be due to immunosuppressed patients having worse tumor characteristics rather than immunosuppression itself underlying the elevated risk. In our data set, we confirmed that immunosuppressed patients were significantly more likely to have tumors invading in/beyond subcutaneous fat and to have perineural/lymphovascular invasion (data not shown). Another explanation might be an underestimation of immunosuppressed patients in the English data set due to the use of diagnostic codes for immune suppression. However, in the Dutch data set, there was nationwide coverage of organ transplantation and hematologic malignancies, and yet immunosuppression still remained nonsignificant. Lastly, a lack of statistical power could explain this result, as only a small proportion of the patients (England: 10%; the Netherlands: 14%) were immunosuppressed.
Diameter and thickness were analyzed with a robust methodologic approach using splines. To apply the results easily in clinical practice, the variables were categorized thereafter, leaving the c-index of 0.96 unchanged. Although categorization of these variables failed to meet our predefined criteria in the Dutch data set, the ORs showed a more gradual increase per category, there was less overlap between the CIs, and our model fit was better than with the cutoffs from the AJCC8 and BWH. The risk estimates for diameter categories were comparable in both data sets; however, the risk estimates for thickness categories were lower in the Dutch data set than in the English data set. This could be due to the smaller sample size and correlation with other variables, as the ORs increased with increasing thickness categories in univariable analyses. Also, the English data set comprised larger and thicker tumors among cases and controls than the Dutch data set. This could be due to differences in health care systems and the number of outliers and may also be a reason for the observed differences in the Dutch data set.

Strengths and limitations
Important strengths are the magnitude of our data set and the availability of a second geographically separate data set for replication, which is essential to determine the reproducibility and generalizability. 25 Furthermore, the Dutch data set contained very few missing values as all histopathologic slides were reassessed by a dermatopathologist. Limitations included the use of routine pathology reports and the assumption of reporting according to the Royal College of Pathologists standards in the English data set without the possibility of reassessing histologic slides. An underestimation of thickness could have occurred in both data sets if the tumor was incompletely excised at the bottom. In the English data set, data on immune status were incomplete, and the number of previous cSCCs was assessed during 2013-2015, with incomplete access to earlier years. Moreover, perineural/lymphovascular invasion and several body sites were grouped together because sample sizes were otherwise too small, which hindered analyzing the effect of each variable separately. Also, no data on nerve diameter were available for perineural invasion. Lastly, our model is not suitable to provide absolute metastasis risks owing to its nested case-control design. The current challenge remains in translating the relative risks of this population-based model into an individual prediction model that provides absolute risks.
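
The last limitation can be made explicit for a single 1:1 matched pair. The notation below is introduced for illustration and does not appear in the original analysis: pair $i$ has a pair-specific baseline $\alpha_i$, a shared coefficient vector $\beta$, and covariates $x_{i1}$ (case) and $x_{i0}$ (control), under an assumed logistic model within each pair.

```latex
% Assumed logistic model within matched pair i (illustrative notation):
%   logit P(Y = 1 | x) = \alpha_i + \beta^{\top} x
% Probability that the observed case is the member with covariates x_{i1},
% given that exactly one member of the pair is a case:
\[
\Pr(\text{case} = x_{i1} \mid \text{one case in pair } i)
  = \frac{e^{\alpha_i + \beta^{\top} x_{i1}}}
         {e^{\alpha_i + \beta^{\top} x_{i1}} + e^{\alpha_i + \beta^{\top} x_{i0}}}
  = \frac{1}{1 + e^{-\beta^{\top}(x_{i1} - x_{i0})}} .
\]
```

The baseline term $\alpha_i$ cancels, so the conditional likelihood identifies only $\beta$ (log odds ratios) and carries no information on the baseline, and hence absolute, risk of metastasis.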

CONCLUSION
Using 2 large nationwide data sets with a total of 1104 metastatic cSCCs, we identified patient- and tumor-based risk factors with a c-index of 0.96 in the development data set. Comparison of our final set of risk factors with the AJCC8 and BWH showed higher pseudo R² measures in both data sets. Following tumor diameter and thickness, poor differentiation proved an important risk factor for metastasis, despite being omitted from the AJCC8, thereby emphasizing the importance of reviewing and refining current staging systems.

This work uses data that have been provided by patients and collected by the National Health Service as part of their care and support. The data are collated, maintained, and quality-assured by the National Disease Registration Service, which is part of Public Health England.

Fig 1. Effect plot of the spline function for diameter with the metastasis odds ratio.

Table I. Descriptive characteristics of the English (n = 1774) and Dutch (n = 434) data sets, stratified by metastasis outcome.

Table II. Final model with significantly remaining risk factors for metastatic cutaneous squamous cell carcinoma in the English data set, replicated in the Dutch data set. OR, Odds ratio.

Table III. Categorizations for the spline functions of diameter and thickness in the English data set, replicated in the Dutch data set. OR, Odds ratio. *Adjusted for all covariates in a multivariable model.