Agreement among Colposcopists on the Identification of Three Digital Images More Frequently Seen in Glandular Cervical Precursor Neoplasias

Background: Global strategies to eliminate cervical cancer will probably be followed by a drop in the prevalence of precursor cervical neoplasias, which may negatively affect colposcopic diagnostic performance and creates a need to improve it. The aim of this study was to assess agreement among five colposcopists regarding the presence of three isolated colposcopic images and the different degrees of colposcopic findings. Methods: In this retrospective study, two original colposcopists examined colposcopic images of patients treated between 2005 and 2018, classified them following the International Federation of Cervical Pathology and Colposcopy terminology, and evaluated them for the presence of obstructed dilated grouped glands, aceto-white villi with invaginated borders fused or not, and atypical vessels in cylindrical epithelium area. Subsequently, three independent colposcopists also classified those colposcopic findings. The degree of agreement between the findings of the three independent and the two original colposcopists was assessed using the Kappa (κ) coefficient. Results: Among the 822 included patients, 67.4% had a diagnosis of cervical intraepithelial neoplasia (CIN) grades 2 or 3, 6.8% of adenocarcinoma in situ, and 11.8% of CIN 1. The agreement for each image ranged from κ 0.14 to 0.37 (p < 0.001). The highest agreements occurred for aceto-white villi with invaginated borders (κ 0.15–0.37), major (κ 0.29–0.46), and minor (κ 0.14–0.36) colposcopic findings (p ≤ 0.001). Conclusions: The agreement among the three independent and the two original colposcopists was statistically significant, ranging from weak to regular for the identification of three isolated colposcopic images, and from weak to moderate for the identification of major and minor colposcopic findings.


Introduction
In Brazil, after excluding non-melanoma skin tumors, cervical cancer is the most frequent cancer in the North (22.47/100,000) and the second most frequent in the Northeast (17.62/100,000) and Midwest Regions (15.92/100,000) [1]. In the United States, carcinomas, i.e., tumors of epithelial origin, account for about 98% of cervical cancers; among these, squamous-type carcinomas represent 64.4% of the cases, while different subtypes of glandular carcinomas correspond to 28.9% of the cases [2].
The World Health Organization has proposed two main strategies to enable its ambitious project to reduce the incidence of cervical cancer to age-adjusted annual rates of less than 4/100,000 women by the end of the twenty-first century. The first is the broad immunization of girls under 15 against human papillomavirus (HPV), and the second is the implementation of high-sensitivity screening based on the detection of HPV DNA [3].
Adenocarcinoma in situ (AIS) has been recognized as the precursor to invasive cervical adenocarcinoma [4], and its detection by cervical cytology has traditionally been less than ideal [5]. The introduction of high-sensitivity screening should allow earlier diagnosis not only of squamous cervical precursor neoplasias but also, and especially, of those of glandular origin [5].
Women considered at risk of cervical precursor neoplasias during screening, including those being followed after treatment for cervical intraepithelial neoplasias (CIN), should undergo colposcopy, either with or without targeted biopsy [6]. At this stage of the diagnostic investigation, this exam is considered the gold standard [6], despite its intrinsic subjectivity [7] and its restricted efficiency in identifying glandular cervical precursor neoplasias [8,9]. It must also be taken into consideration that women treated for CIN remain at a higher risk of cervical malignancies than the general population [10].
Thus, strategies to improve the diagnostic performance of colposcopy should be undertaken so that smaller and more subtle neoplasias, more likely to be detected in patients screened earlier [11], are effectively identified. New studies on specific patterns of colposcopic images are, therefore, desirable, because publications of this nature have already contributed to increasing the specificity of colposcopy. The colposcopic signs named inner border and ridge sign were described in 2009 [12,13] and incorporated into the colposcopic nomenclature of the International Federation of Cervical Pathology and Colposcopy (IFCPC) in 2011 [14], precisely because of their high specificity for detecting CIN grades 2 and 3.
The agreement among colposcopists on the detection of the images described by the IFCPC terminology has been evaluated in a few classic studies and ranged from weak to substantial [15][16][17][18][19][20][21][22][23][24], again showing the need to expand the research in this field. However, this internationally accepted terminology does not include the colposcopic images more frequently found in glandular cervical precursor lesions or AIS. To the best of our knowledge, no previous studies have been conducted to evaluate the agreement on the detection of images related to AIS.
Considering the need to improve colposcopic diagnostic performance, the present study aimed to assess the agreement between three independent colposcopists, previously trained using a manual with digital images (Supplementary Material), and the consensual findings of two original colposcopists. Agreement was evaluated on the grading of colposcopic findings and on the detection of three specific colposcopic images: obstructed dilated grouped glands, aceto-white villi with invaginated borders fused or not, and atypical vessels in cylindrical epithelium area [25], hereafter referred to as grouped glands, aceto-white villi, and atypical vessels.

Materials and Methods
This retrospective cross-sectional study, approved by the Research Ethics Committee of the Hospital das Clínicas, Universidade Federal de Goiás (CAAE: 03421418.8.0000.5078), was conducted in a private colposcopy service and followed the principles of the Declaration of Helsinki [26]. Written, signed consent was waived since only digital images and medical records were reviewed, without identifying the patients included in this research.
Five experienced colposcopists reviewed filed digital images (640 × 456 pixels or 720 × 480 pixels) of patients who underwent colposcopy between 2005 and 2018. They classified the colposcopic findings as normal, minor findings, major findings, or suspicious for invasion, according to the terminology proposed by the IFCPC [14]. Furthermore, within the transformation zone (TZ) with major colposcopic findings, they sought to identify the three aforementioned colposcopic images: grouped glands, aceto-white villi, and atypical vessels (Fig. 1) [25].
Files of all patients examined between 2005 and 2018 and diagnosed with cervical intraepithelial neoplasias grades 1, 2, or 3, or AIS, after an excisional procedure were included. To this initial list of files, randomly selected digital image files obtained from patients with either normal or abnormal initial colposcopy, but without CIN, were added. Images with no visible squamocolumnar junction (SCJ) and/or insufficient quality for reading were excluded.
The two original colposcopists created a manual with digital images of 61 patients not included in this study, which was used for training the three independent colposcopists. The original colposcopists had access to all the data collected and jointly and consensually identified the cases presenting the three colposcopic images of interest. The three independent colposcopists, experts from other services, received a spreadsheet containing information on all cases, except the degrees of colposcopic findings and the histopathological diagnosis. Subsequently, they reviewed the filed digital images of the cases included in the study and recorded the presence of each of the three aforementioned images, the degree of colposcopic finding, and the quality of the images.
The cytological abnormalities were classified following the Bethesda Cytological Classification, updated in 2014 [27], whereas the colposcopic findings were categorized according to the terminology proposed by the IFCPC [14]. Histopathological examinations of the biopsy fragments and excisional specimens of the TZ were performed by a single examiner and classified according to the World Health Organization International Histological Classification of Tumors [28] and Richart's classification for cervical intraepithelial neoplasias [29].
The Statistical Package for the Social Sciences (SPSS) for Windows 21.0 (IBM Brasil, São Paulo, SP, Brasil) was used for descriptive and frequency distribution analysis of the collected clinical data, as well as for calculating the Kappa (κ) coefficient and p values. Agreement in the recognition of each of the three colposcopic images and the four degrees of colposcopic findings [14] by each of the three independent colposcopists, in relation to the consensual findings of the two original colposcopists (gold standard), was evaluated by applying Kappa statistics. Values of less than or equal to zero indicate no agreement; between 0.00 and 0.20 are considered weak; between 0.21 and 0.40, regular; between 0.41 and 0.60, moderate; between 0.61 and 0.80, substantial; and between 0.81 and 0.99, almost perfect.
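As an illustration of the agreement statistic described above, the following is a minimal sketch of Cohen's κ for one independent reader against a reference reading, together with the interpretation bands listed in the text. It is not the SPSS procedure used in the study; the function names `cohen_kappa` and `agreement_band` and the ten example ratings are hypothetical and do not come from the study data.

```python
from collections import Counter


def cohen_kappa(reference, reader):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(reference) != len(reader) or not reference:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(reference)
    # Proportion of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(reference, reader)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_ref = Counter(reference)
    freq_reader = Counter(reader)
    expected = sum(freq_ref[c] * freq_reader[c] for c in freq_ref) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the interpretation bands quoted in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "weak"
    if kappa <= 0.40:
        return "regular"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    # The text lists 0.81-0.99; a value of exactly 1 is lumped in here.
    return "almost perfect"


# Hypothetical ratings (1 = image present, 0 = image absent), not study data.
reference = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reader = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(reference, reader)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

In this hypothetical example the raters agree on 8 of 10 cases, chance agreement is 0.58, and κ is about 0.52, which falls in the moderate band.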

Discussion
In the present study, the degree of agreement between three independent and two original colposcopists on the detection of three colposcopic images, namely grouped glands, aceto-white villi, and atypical vessels, was statistically significant when each image was evaluated individually (p < 0.001). Among the three images, the best agreement was obtained for the detection of aceto-white villi, reaching values considered regular. However, the detection of grouped glands and atypical vessels reached agreement levels considered weak or regular. Regarding the colposcopic findings classified as major or minor, according to the terminology of the IFCPC [14], the degrees of agreement found were considered regular and moderate, except for colposcopist 2, who achieved weak agreement for the category of minor findings.
The degree of agreement between the different colposcopists on the identification of the images evaluated in this study showed great variability. However, this is inherent to methods based on the interpretation of images, as is also the case for the interpretation of cytological and histopathological images [30]. Despite these limitations, this study achieved levels of agreement between the independent and the original colposcopists that significantly (p < 0.001) exceeded those that would occur by chance, even reaching values considered regular. In addition, the differences between the independent colposcopists were not statistically significant, as the confidence intervals of their results overlapped with each other. Digital colposcopic image files are undoubtedly useful for documenting, training, and assessing expert proficiency. Nonetheless, another important consideration, in addition to the subjectivity of image interpretation, is that the analysis of filed digital images is not as accurate as the assessment of colposcopic examination in real time. In the former, it is not possible to change the focus or the magnification level. Furthermore, the mobilization of the cervix, the removal of blood or mucus, the longitudinal assessment of the aceto-whitening reaction, and the use of a green filter are also impossible [31]. Thus, the identification of these images in real-time colposcopy, rather than in filed digital images, would most probably result in better degrees of agreement among colposcopists.
The study that most closely resembles the present one [12] was the basis for the inclusion of the colposcopic signs inner border and ridge sign in the IFCPC terminology as findings that indicate the presence of major alterations [14]. In that study, the degree of agreement among three different colposcopists for the detection of the ridge sign in digital colposcopic images of 592 patients ranged from regular to moderate (κ 0.29-0.49). In the present investigation, the degree of agreement among three colposcopists for the assessment of aceto-white villi was similar, and ranged from weak to regular (κ 0.15; 95% CI: 0.06-0.25 to κ 0.37; 95% CI: 0.28-0.45; p < 0.001). Another publication also analyzed the inner border sign [12], which was subsequently introduced into the current colposcopic terminology of the IFCPC [14] due to its high specificity (97%), although the agreement among different colposcopists had not been evaluated at that time.
Three colposcopic images indicative of minor alterations, six indicative of major alterations, and two patterns corresponding to suspicious for invasion (atypical vessels and additional signs) are described in the IFCPC terminology [14]. Nevertheless, an increase in the number of images to be detected, or of categories and subcategories in which the findings should be included, may result in a decrease in the agreement indices [32]. This occurred in a study carried out in the United Kingdom, which obtained an agreement of κ 0.17 when categorizing the colposcopic impression into eight degrees using a non-standardized classification [23]. In the present study, the evaluation of three isolated colposcopic images suggestive of major findings resulted in degrees of agreement among colposcopists ranging from weak to regular, from κ 0.14 (95% CI: 0.06-0.21; p < 0.001) for atypical vessels by colposcopist 3 to κ 0.37 (95% CI: 0.28-0.45; p < 0.001) for aceto-white villi by colposcopist 1.
Similarly, several studies have reported lower degrees of agreement considering the identification of isolated images [12,15,18,24]. In a Canadian study, the authors found an agreement of κ 0.13 to κ 0.41, and κ 0.21 to κ 0.47, in relation to the characteristics of the neoplasia border and its color, respectively. However, for the categorical finding of abnormal TZ, the agreement obtained was higher (κ 0.34-0.36) [24]. In another study, an agreement of κ 0.23-0.28 was obtained for the Reid index [33], while the degree of agreement for the categorical finding of colposcopic impression was higher (κ 0.36; 95% CI: 0.33-0.39) [18].
In another publication by our team, the assessment of the diagnostic performance of the three aforementioned images showed that the area under the receiver operating characteristic curve, found when at least one of them was present, was 0.82 (95% CI: 0.77-0.88) for the diagnosis of AIS, and 0.60 (95% CI: 0.56-0.63) for the diagnosis of CIN 2 and 3 [25]. This performance, especially relevant for the identification of glandular precursor cervical neoplasias, could probably be similar in other colposcopy services, provided that the specialists receive specific additional training to identify these images.
The categories suspicious for invasion and normal colposcopic findings showed a low prevalence of positive findings in this study, 0.24% (2 cases) and 1.46% (12 cases), respectively. This low prevalence likely contributed to the lack of agreement in the former, as well as to the weak agreement in the latter, due to the prevalence effect [34]. This effect appears whenever the proportion of positive results is substantially different from 50%, and implies a variation in the Kappa coefficient inversely proportional to this difference [34].
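The prevalence effect described above can be illustrated numerically. The sketch below computes κ from two hypothetical 2 × 2 agreement tables that share the same observed agreement (90%) but differ in the proportion of positive readings; the function name `kappa_2x2` and all counts are assumptions chosen for illustration and are not data from this study.

```python
def kappa_2x2(both_pos, ref_only, reader_only, both_neg):
    """Cohen's kappa from the four cells of a 2 x 2 agreement table."""
    n = both_pos + ref_only + reader_only + both_neg
    observed = (both_pos + both_neg) / n
    # Marginal proportion of positive readings for each rater.
    ref_pos = (both_pos + ref_only) / n
    reader_pos = (both_pos + reader_only) / n
    # Chance agreement: both say positive, or both say negative.
    expected = ref_pos * reader_pos + (1 - ref_pos) * (1 - reader_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical tables with 100 cases and 90% observed agreement each.
balanced = kappa_2x2(both_pos=45, ref_only=5, reader_only=5, both_neg=45)
skewed = kappa_2x2(both_pos=1, ref_only=5, reader_only=5, both_neg=89)

print(f"positives near 50%: kappa = {balanced:.2f}")
print(f"positives near 6%:  kappa = {skewed:.2f}")
```

With these assumed counts, κ falls from 0.80 to about 0.11 although the raters agree on 90 of 100 cases in both tables, because chance agreement rises from 0.50 to about 0.89 when almost all readings are negative.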
The agreement between colposcopists on the major and minor colposcopic finding categories ranged from weak to moderate in this study, comparable to the findings of other publications [18,24,33]. Furthermore, the weak to moderate degrees of agreement obtained for major and minor colposcopic findings were also similar to those found for the detection of aceto-white villi images, as shown by the overlap of their 95% CI, except for colposcopist 3 regarding major colposcopic findings. An apparent difference in the level of agreement with the findings of the original colposcopists, achieved by each of the three independent colposcopists, was ruled out due to the overlap of the 95% CI found among their κ results for each of the studied images and categories.
This study showed levels of agreement between distinct colposcopists that significantly exceeded those expected by chance alone. Even though these outcomes suggest that new examiners can be trained to recognize the three images evaluated here, it remains necessary to take multiple biopsies of any abnormal colposcopic finding to maximize the sensitivity of this method, especially considering the widely known great variability in the final findings of exams based on the interpretation of images [7].

Conclusions
Three independent colposcopists, trained using a digital colposcopic imaging manual with cases different from those included in this study, reached statistically significant agreement in relation to the findings of the two original colposcopists. The degree of agreement that they obtained ranged from weak to regular for the identification of three isolated colposcopic images, namely grouped glands, aceto-white villi, and atypical vessels. Additionally, the agreement found for the detection of major and minor colposcopic findings varied between weak and moderate. The higher agreement in the identification of major and minor colposcopic findings compared to the lower agreement for the three images, which in a previous study showed higher performance for the identification of glandular intraepithelial neoplasias, suggests that new studies in the field are still necessary, especially to clarify whether improved detection of these images could lead to better rates of colposcopic diagnosis of AIS and, consequently, reduce the incidence of invasive cervical adenocarcinoma.

Fig. 1. Three colposcopic images investigated in filed digital images. (A) Obstructed dilated grouped glands. (B) Aceto-white villi with invaginated borders, fused or not. (C) Atypical vessels in cylindrical epithelium area [25]. Note that the third image of group A has been intentionally repeated as the fifth image of group C, to emphasize the diversity of aspects that are commonly seen in each colposcopic examination.