The full impacts of digital mammography and computer-aided diagnostic (CAD/QIA) systems on the performance of diagnostic mammography are yet to be realized. Lesion composition as described by its 3 compositional thicknesses of protein, lipid, and water (3CB) was recently discovered to be a strong descriptor of abnormal breast lesions.
The first five years of this project (March 1, 2013 to June 1, 2017) have been completed. The currently funded grant develops and evaluates novel mammographic biomarkers and lipid/protein/water signatures to determine whether composition alone (3CB), or combined with validated QIA/radiomics methods (q3CB), can improve the diagnostic accuracy of breast imaging and reduce the number of unnecessary biopsies.
The long-term goal of this project is to determine if biological diagnostic measures of mammographic lesions can be used to improve current CADe algorithms in estimating the probability of breast cancer. Our objective was to quantify lipid-protein-water signatures of mammographically suspicious breast lesions to better predict malignant findings. Our central hypothesis was that novel lipid-protein-water image biomarkers can be combined with existing QIA/radiomics methods to improve the sensitivity and specificity of cancer diagnosis and reduce the number of unnecessary biopsies. Our prior specific aims are listed below with the progress to date.
To investigate the sensitivity and specificity of localized 3CB (3-compartment breast) lipid-protein-water signatures to distinguish breast cancer from benign lesions on prospectively acquired diagnostic mammograms of women recommended to undergo biopsy.
Because of the 17% budget reduction, we reduced our recruitment goal from 600 to 498 FFDM patients. To date, we have recruited 425 women (215 at UCSF, 210 at Moffitt) and anticipate reaching our recruitment goal by the end of autumn 2017. Subtype findings confirmed by biopsy include 61 invasive ductal carcinomas (IDC), 40 DCIS, 66 fibroadenomas, and 324 benign findings. Our target is 130 IDC by summer 2018. We modified recruitment several times this past year to include BI-RADS 5 women with previous contralateral cancer. At our current rate, we anticipate 70 IDC and 50 DCIS for a total of 120 malignant findings by the end of year 5. We also created a q3CB phantom (Figure 1) that can be used with any FFDM or DBT system to create calibrated 3CB images.
Figure 1. Calibration phantom developed in yr 1-5 aim 1 consisting of 51 combinations of water, lipid (wax), and protein (Delrin) capable of calibrating any FFDM or DBT for 3CB.
One of our early q3CB analyses (35) included 45 lesions, in which we found that individual 3CB features were able to distinguish between lesion types: invasive cancer was distinguished from DCIS by the skewness of the lipid values in the lesion (area under the ROC curve, AUC = 0.71), from fibroadenomas by the texture (standard deviation) of the water distribution of the lesion relative to that of the background (AUC = 0.75), and from other benign lesions by the amount of water in the lesion periphery (AUC = 0.71). A combined 3CB signature distinguishing invasive cancer from all other findings had an AUC of 0.72 in a leave-one-case-out cross-validation (1). Figure 2 shows examples of the differences between IDC and benign lesions. In distinguishing between cancer (invasive + DCIS) and benign findings, we found AUC = 0.72 (standard error 0.07), 0.78 (0.06), and 0.65 (0.09) for mass, calcification, and asymmetry/architectural distortion cases, respectively.
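To make the reported quantities concrete, the following is a minimal sketch of how an AUC and an approximate standard error can be computed from classifier scores. It is an illustration only: the scores are made-up toy values, the function names are hypothetical, and the Hanley-McNeil approximation for the standard error is an assumption, since the report does not state which estimator was used.

```python
import math


def auc_mann_whitney(pos_scores, neg_scores):
    """AUC as the fraction of (malignant, benign) pairs in which the
    malignant case scores higher; ties count as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil approximation to the standard error of an AUC
    (an assumed estimator, not necessarily the one used in the study)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Toy signature outputs (hypothetical values, not study data).
malignant_scores = [0.9, 0.8, 0.4, 0.7]
benign_scores = [0.3, 0.5, 0.2, 0.6, 0.1]

toy_auc = auc_mann_whitney(malignant_scores, benign_scores)
toy_se = hanley_mcneil_se(toy_auc, len(malignant_scores), len(benign_scores))
print(f"AUC = {toy_auc:.2f} (standard error {toy_se:.2f})")
```

With so few cases the standard error is large, which is consistent with the wide uncertainty one would expect from subset analyses on small lesion counts.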
Compare the sensitivity and specificity of 3CB to an established CAD/QIA method and conventional morphological BI-RADS descriptors to distinguish between cancers and benign lesions.
Using the same interim analysis (1), we identified the strongest individual QIA feature to distinguish a) invasive cancer: spiculation (with invasive cancers being more spiculated), b) DCIS: circularity (with DCIS appearing less circular), c) fibroadenoma: radial gradient index (relating to shape and margin, with fibroadenomas having a more circular appearance with more distinct margins), and d) other benign findings: texture (with benign findings appearing less heterogeneous). A merged QIA/radiomics signature yielded an AUC of 0.81 (leave-one-case-out analysis) in the distinction between lesions that should and should not undergo biopsy (invasive and DCIS versus fibroadenomas and other benign findings). We also found that there was little correlation between the 3CB compositional information and the QIA features suggesting that they provide complementary information.
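The leave-one-case-out analysis of a merged signature can be sketched as follows. This is a simplified illustration under stated assumptions, not the study pipeline: it uses only two hypothetical features per case, treats each row as one case, merges the features with a two-class Fisher linear discriminant written out by hand, and runs on made-up toy values.

```python
def mean_vec(rows):
    """Mean of a list of two-feature rows."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def fit_fisher_lda(X, y):
    """Fit a two-feature Fisher linear discriminant; returns (w, bias)."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    m1, m0 = mean_vec(pos), mean_vec(neg)
    # Pooled within-class scatter matrix [[a, b], [b, d]].
    a = b = d = 0.0
    for rows, m in ((pos, m1), (neg, m0)):
        for r in rows:
            dx, dy = r[0] - m[0], r[1] - m[1]
            a += dx * dx
            b += dx * dy
            d += dy * dy
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^{-1} (m1 - m0), using the closed-form 2x2 inverse.
    w = [(d * diff[0] - b * diff[1]) / det, (a * diff[1] - b * diff[0]) / det]
    # Centre scores on the midpoint between the two class means.
    bias = -(w[0] * (m1[0] + m0[0]) + w[1] * (m1[1] + m0[1])) / 2.0
    return w, bias


def leave_one_case_out_scores(X, y):
    """Score each case with a model fitted on all the other cases."""
    scores = []
    for i in range(len(X)):
        w, bias = fit_fisher_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(w[0] * X[i][0] + w[1] * X[i][1] + bias)
    return scores


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly (ties = 1/2)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy features per case: [QIA feature, 3CB feature].
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.6, 0.7],
     [0.3, 0.2], [0.2, 0.4], [0.4, 0.1], [0.1, 0.3], [0.35, 0.45]]
y = [1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = should undergo biopsy, 0 = should not

loo_scores = leave_one_case_out_scores(X, y)
loo_auc = auc(loo_scores, y)
print(f"leave-one-case-out AUC = {loo_auc:.2f}")
```

The toy classes are deliberately well separated, so the resulting AUC says nothing about real performance; the point is only that every case is scored by a model that never saw it during fitting.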
This finding led to methods that merge the 3CB and QIA features into our new q3CB signatures, which our renewal incorporates.
We also developed methods to assess and classify microcalcifications through QIA for FFDM images. Two distinct approaches were developed and used: 1) a ‘conventional’ QIA approach (2) and 2) a deep learning approach (under review). For the former, we developed a ‘conventional’ QIA method based on identification, segmentation, and feature extraction of mammographic microcalcifications, partially using unsupervised methods to reduce database bias. For the latter, we used transfer learning to extract image features through a pre-trained deep convolutional neural net (i.e., a deep learning method). In the ‘conventional’ QIA method, image-based phenotypes (features) were extracted and investigated for distinguishing our 4 lesion subtypes (invasive cancers, in situ cancers, fibroadenomas, and other benign lesions).
In this calcification analysis of only participants, there were 7 invasive cancers, 14 in situ cancers, 13 fibroadenomas, and 48 other benign lesions. We found distinct phenotypes based on calcification size, inter-calcification distance, and X-ray attenuation, with areas under the ROC curve ranging from 0.69 (0.05) to 0.92 (0.09) depending on the classification task. The deep learning-based feature extraction method was developed to be used in combination with a ‘conventional’ classifier (in this case linear discriminant analysis) to classify for malignancy. The primary objective was to investigate the potential reduction in the number of unnecessary benign biopsies at 100% sensitivity. For a dataset of 99 biopsy-proven lesions, we found that the deep learning-based method outperformed study radiologists. The deep learning-based method could have avoided 21 biopsies of the 80 benign lesions in this study, versus only 8 avoidable biopsies based on the assessment of the study radiologists (p<0.001). These encouraging preliminary results provide yet another approach to identifying malignant lesions through calcifications, beyond q3CB composition and lesion mass morphology features.
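The "avoidable biopsies at 100% sensitivity" figure can be illustrated with a short sketch. It assumes a higher score means a higher estimated likelihood of malignancy; the function name and the toy scores are hypothetical, and the threshold here is chosen on the same cases it is evaluated on, which is optimistic compared with a threshold fixed on independent data.

```python
def avoidable_biopsies_at_full_sensitivity(scores, labels):
    """Return (threshold, count of benign lesions scoring below it).

    The threshold is the lowest score among malignant lesions, so every
    malignant lesion is still recommended for biopsy (100% sensitivity).
    """
    malignant = [s for s, label in zip(scores, labels) if label == 1]
    if not malignant:
        raise ValueError("at least one malignant lesion is required")
    threshold = min(malignant)
    avoided = sum(
        1 for s, label in zip(scores, labels) if label == 0 and s < threshold
    )
    return threshold, avoided


# Toy classifier outputs (hypothetical values, not study data).
toy_scores = [0.9, 0.6, 0.75, 0.2, 0.65, 0.4, 0.1, 0.7]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]  # 1 = malignant, 0 = benign

threshold, avoided = avoidable_biopsies_at_full_sensitivity(toy_scores, toy_labels)
n_benign = toy_labels.count(0)
print(f"threshold = {threshold}, avoidable benign biopsies = {avoided} of {n_benign}")
```

In this toy example three of the five benign lesions fall below the lowest malignant score, so those three biopsies could have been avoided without missing a cancer.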
Moreover, we also investigated the use of deep learning for q3CB (in combination with transfer learning) for feature extraction in the classification for malignancy when applied to FFDM images (3), 3CB images, or both (4). As before, the deep learning method (deep convolutional neural net) was pretrained on a large set of non-medical images and not re-trained or fine-tuned in any way for our application. It was used only to extract image features while classification was performed by linear discriminant analysis. We extracted image features from images of 195 women who participated in our 3CB study; in 58 women, the lesion manifested as a mass (13 malignant vs. 45 benign), in 87, as microcalcifications (19 vs. 68), and in 56, as focal asymmetry or architectural distortion (11 vs. 45). Six patients had both a mass and calcifications. For each mammogram and corresponding q3CB images (water, lipid, and protein images), a 128×128 region of interest containing the lesion was selected by an expert radiologist and used directly as input to the deep convolutional neural net for feature extraction.
We hypothesize that 3CB measures of suspicious mammographic lesions and established CAD/QIA measures are independent predictors of breast cancer and benign lesions, and that combining measures from the two methods with clinical risk factors and qualitative BI-RADS lesion descriptors in a single model will maximize the sensitivity and specificity of cancer detection.
We found that the combination of 3CB and QIA (i.e., q3CB) signatures in the classification of lesions that should and should not undergo biopsy (invasive and DCIS versus fibroadenomas and other benign findings) yielded an AUC of 0.86 (versus 0.71 for 3CB alone and 0.81 for QIA alone; p<0.05) (Figure 3a). In a mass-only subset analysis, the AUC increased from 0.83 (QIA alone) to 0.89 (QIA+3CB, p=0.162). For cases with microcalcifications and asymmetry/architectural distortion, the AUC increased from 0.84 to 0.91 (p=0.116) and from 0.61 to 0.87 (p=0.006), respectively.
We conclude that combining 3CB and QIA/radiomics improves the specificity of identifying malignant lesions prior to biopsy. Our results indicate great potential for deep learning methods in the diagnosis of breast cancer, and additional knowledge of biologic tissue composition appeared to improve performance, especially for lesions manifesting mammographically as asymmetries or architectural distortions. These strong results justify continuing and extending this project with the proposed aims and methods.
The focus of future studies will be on extending the technique to 3D tomosynthesis imaging and on reader studies to validate that adding 3CB knowledge at the time of diagnostic mammography influences radiologists' decision making and reduces the rate of unnecessary biopsies.