Breast cancer is one of the deadliest cancers causing about massive number of patients to die annually all over the world according to the WHO. It is a kind of cancer that develops when the tissues of the breast grow rapidly and unboundly. This fatality rate can be prevented if the cancer is detected before it gets malignant. Using automation for early-age detection of breast cancer, Artificial Intelligence and Machine Learning technologies can be implemented for the best outcome. In this study, we are using the Breast Cancer Image Classification dataset collected from the Kaggle depository, which comprises 9248 Breast Ultrasound Images and is classified into three categories: Benign, Malignant, and Normal which refers to non-cancerous, cancerous, and normal images.This research introduces three pretrained model featuring custom classifiers that includes ResNet50, MobileNet, and VGG16, along with a custom CNN model utilizing the ReLU activation function.The models ResNet50, MobileNet, VGG16, and a custom CNN recorded accuracies of 98.41%, 97.91%, 98.19%, and 92.94% on the dataset, correspondingly, with ResNet50 achieving the highest accuracy of 98.41%.This model, with its deep and
Breast cancer has reached the highest incidence rate worldwide among all malignancies since 2020. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcome of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex context of breast imaging modalities. Considering the rapid improvement in deep learning technology and the increasing severity of breast cancer, it is critical to summarize past progress and identify future challenges to be addressed. This paper provides an extensive review of deep learning-based breast cancer imaging research, covering studies on mammogram, ultrasound, magnetic resonance imaging, and digital pathology images over the past decade. The major deep learning methods and applications on imaging-based screening, diagnosis, treatment response prediction, and prognosis are elaborated and discussed. Drawn from the findings of this survey, we present a comprehensive discussion of the challenges and potential avenues for future research in deep learning-based breast cancer imaging.
Although the descriptive epidemiology of primary breast cancer is well characterized in the US, breast cancer recurrence rates have not been measured in an unselected population. The number of breast cancer survivors at risk for recurrence is growing each year, so recurrence surveillance is a pressing need. We used missing data methods to impute breast cancer recurrence and estimate the risk of recurrence in the Cancer Recurrence Information and Surveillance Program (CRISP) cohort in the Georgia Cancer Registry. The imputation model was based on an internal validation substudy and indicators recorded in the registry (e.g., pathology reports, imaging claims), prognostic variables (e.g., stage at diagnosis), and characteristics associated with missing data (e.g., insurance coverage). We pooled hazard ratios (HR) and 95% Confidence Intervals (CI) across 1000 imputed datasets, adjusted for age, stage, grade, subtype, race and ethnicity, marital status, and urban/rural county at diagnosis. There were 1,606 patients with a validated outcome (75% with breast cancer recurrence) and we imputed the outcome for the remaining 23,439 patients. We estimated an overall 7.2% incidence of recurrenc
Breast cancer is a major global health issue that affects millions of women worldwide. Classification of breast cancer as early and accurately as possible is crucial for effective treatment and enhanced patient outcomes. Deep transfer learning has emerged as a promising technique for improving breast cancer classification by utilizing pre-trained models and transferring knowledge across related tasks. In this study, we examine the use of a VGG, Vision Transformers (ViT) and Resnet to classify images for Invasive Ductal Carcinoma (IDC) cancer and make a comparative analysis of the algorithms. The result shows a great advantage of Resnet-34 with an accuracy of $90.40\%$ in classifying cancer images. However, the pretrained VGG-16 demonstrates a higher F1-score because there is less parameters to update. We believe that the field of breast cancer diagnosis stands to benefit greatly from the use of deep transfer learning. Transfer learning may assist to increase the accuracy and accessibility of breast cancer screening by allowing deep learning models to be trained with little data.
Breast cancer ranks as the most prevalent form of cancer diagnosed in women, and diagnosis faces several challenges, a change in the size, shape and appearance of breasts, dense breast tissue, lumps or thickening in the breast especially if in only one breast, lumps and nodules in the breast. The major challenge that faces deep learning diagnosis of breast cancer was its shape, size and position non-uniformity especially malignant cancer. This work proposed a deep learning system that increased the accuracy of classification of breast cancer types from ultrasound images. It reaches 99.29% accuracy, exceeding other previous work. First, image processing was applied to in enhance the quality of input images. Second, the image segmentation was performed using U-Net architecture. Third, many features are extracted using Mobilenet. Finally, the accuracy of proposed system was evaluated.
Cancer is one of the most dangerous diseases to humans, and yet no permanent cure has been developed for it. Breast cancer is one of the most common cancer types. According to the National Breast Cancer foundation, in 2020 alone, more than 276,000 new cases of invasive breast cancer and more than 48,000 non-invasive cases were diagnosed in the US. To put these figures in perspective, 64% of these cases are diagnosed early in the disease's cycle, giving patients a 99% chance of survival. Artificial intelligence and machine learning have been used effectively in detection and treatment of several dangerous diseases, helping in early diagnosis and treatment, and thus increasing the patient's chance of survival. Deep learning has been designed to analyze the most important features affecting detection and treatment of serious diseases. For example, breast cancer can be detected using genes or histopathological imaging. Analysis at the genetic level is very expensive, so histopathological imaging is the most common approach used to detect breast cancer. In this research work, we systematically reviewed previous work done on detection and treatment of breast cancer using genetic sequenci
Accurate extraction of breast cancer patients' phenotypes is important for clinical decision support and clinical research. Current models do not take full advantage of cancer domain-specific corpus, whether pre-training Bidirectional Encoder Representations from Transformer model on cancer-specific corpus could improve the performances of extracting breast cancer phenotypes from texts data remains to be explored. The objective of this study is to develop and evaluate the CancerBERT model for extracting breast cancer phenotypes from clinical texts in electronic health records. This data used in the study included 21,291 breast cancer patients diagnosed from 2010 to 2020, patients' clinical notes and pathology reports were collected from the University of Minnesota Clinical Data Repository (UMN). Results: About 3 million clinical notes and pathology reports in electronic health records for 21,291 breast cancer patients were collected to train the CancerBERT model. 200 pathology reports and 50 clinical notes of breast cancer patients that contain 9,685 sentences and 221,356 tokens were manually annotated by two annotators. 20% of the annotated data was used as a test set. Our CancerB
Mammographic breast density is a well-established risk factor for breast cancer. Recently there has been interest in breast MRI as an adjunct to mammography, as this modality provides an orthogonal and highly quantitative assessment of breast tissue. However, its 3D nature poses analytic challenges related to delineating and aggregating complex structures across slices. Here, we applied an in-house machine-learning algorithm to assess breast density on normal breasts in three MRI datasets. Breast density was consistent across different datasets (0.104 - 0.114). Analysis across different age groups also demonstrated strong consistency across datasets and confirmed a trend of decreasing density with age as reported in previous studies. MR breast density was correlated with mammographic breast density, although some notable differences suggest that certain breast density components are captured only on MRI. Future work will determine how to integrate MR breast density with current tools to improve future breast cancer risk prediction.
The ODELIA Breast MRI Challenge 2025 addresses a critical issue in breast cancer screening: improving early detection through more efficient and accurate interpretation of breast MRI scans. Even though methods for general-purpose whole-body lesion segmentation as well as multi-time-point analysis exist, breast cancer detection remains highly challenging, largely due to the limited availability of high-quality segmentation labels. Therefore, developing robust classification-based approaches is crucial for the future of early breast cancer detection, particularly in applications such as large-scale screening. In this write-up, we provide a comprehensive overview of our approach to the challenge. We begin by detailing the underlying concept and foundational assumptions that guided our work. We then describe the iterative development process, highlighting the key stages of experimentation, evaluation, and refinement that shaped the evolution of our solution. Finally, we present the reasoning and evidence that informed the design choices behind our final submission, with a focus on performance, robustness, and clinical relevance. We release our full implementation publicly at https://gi
Breast cancer screening plays a pivotal role in early detection and subsequent effective management of the disease, impacting patient outcomes and survival rates. This study aims to assess breast cancer screening rates nationwide in the United States and investigate the impact of social determinants of health on these screening rates. Data on mammography screening at the census tract level for 2018 and 2020 were collected from the Behavioral Risk Factor Surveillance System. We developed a large dataset of social determinants of health, comprising 13 variables for 72337 census tracts. Spatial analysis employing Getis-Ord Gi statistics was used to identify clusters of high and low breast cancer screening rates. To evaluate the influence of these social determinants, we implemented a random forest model, with the aim of comparing its performance to linear regression and support vector machine models. The models were evaluated using R2 and root mean squared error metrics. Shapley Additive Explanations values were subsequently used to assess the significance of variables and direction of their influence. Geospatial analysis revealed elevated screening rates in the eastern and northern U
Traditional breast cancer image classification methods require manual extraction of features from medical images, which not only require professional medical knowledge, but also have problems such as time-consuming and labor-intensive and difficulty in extracting high-quality features. Therefore, the paper proposes a computer-based feature fusion Convolutional neural network breast cancer image classification and detection method. The paper pre-trains two convolutional neural networks with different structures, and then uses the convolutional neural network to automatically extract the characteristics of features, fuse the features extracted from the two structures, and finally use the classifier classifies the fused features. The experimental results show that the accuracy of this method in the classification of breast cancer image data sets is 89%, and the classification accuracy of breast cancer images is significantly improved compared with traditional methods.
Cancer is the main cause of mortality at the developed countries and is the second cause of mortality at the developing countries and breast cancer is the most prevalent malignancy and the first cause of mortality among women of the world. At the US, 28% of new cancer cases and 15% of mortality caused by that in 2010 was caused by breast cancer. Moreover, breast cancer is the most common cancer among Iranian women, which includes 24.4% of overall malignancies and causes 3.3% of mortalities per 1.000 people. The disease is most spread in Tehran and including 25.5% of all cancers. According to the mentioned, the study has tried to assess the effect of nanotechnology in form of a review to diagnose and treat breast cancer. The method applied in this study is descriptive analytical method and library method has been used for data collection purpose. In the data collection method, 45 articles relevant to structure of types of nanoparticles and their uses in diagnosis, imaging and medicine delivery systems and breast cancer treatment have been used. Although there are still some challenges and limitations to use nanoparticles in medication, it is hope that nanoparticles can make wonderfu
The biggest Breast cancer is increasingly a major factor in female fatalities, overtaking heart disease. While genetic factors are important in the growth of breast cancer, new research indicates that environmental factors also play a substantial role in its occurrence and progression. The literature on the various environmental factors that may affect breast cancer risk, incidence, and outcomes is thoroughly reviewed in this study report. The study starts by looking at how lifestyle decisions, such as eating habits, exercise routines, and alcohol consumption, may affect hormonal imbalances and inflammation, two important factors driving the development of breast cancer. Additionally, it explores the part played by environmental contaminants such pesticides, endocrine-disrupting chemicals (EDCs), and industrial emissions, all of which have been linked to a higher risk of developing breast cancer due to their interference with hormone signaling and DNA damage. Algorithms for machine learning are used to express predictions. Logistic Regression, Random Forest, KNN Algorithm, SVC and extra tree classifier. Metrics including the confusion matrix correlation coefficient, F1-score, Preci
Breast cancer is the most common cancer in women both in developed and developing countries. More than half of all cancer mobile application concern breast cancer. Gamification is widely used in mobile software applications created for health-related services. Current prevalence of gamification in breast cancer apps is unknown and detection must be manually performed. The purpose of this study is to describe and produce a tool allowing automatic detection of apps which contain gamification elements and thus empowering researchers to study gamification using large data samples. Predictive logistic regression model was designed on data extracted from breast cancer apps' title and description text available in app stores. Model was validated comparing estimated and benchmark values, observed by gamification specialists. Study's outcome can be applied as a screening tool to efficiently identify gamification presence in breast cancer apps for further research.
With the advances in artificial intelligence (AI), data-driven algorithms are becoming increasingly popular in the medical domain. However, due to the nonlinear and complex behavior of many of these algorithms, decision-making by such algorithms is not trustworthy for clinicians and is considered a black-box process. Hence, the scientific community has introduced explainable artificial intelligence (XAI) to remedy the problem. This systematic scoping review investigates the application of XAI in breast cancer detection and risk prediction. We conducted a comprehensive search on Scopus, IEEE Explore, PubMed, and Google Scholar (first 50 citations) using a systematic search strategy. The search spanned from January 2017 to July 2023, focusing on peer-reviewed studies implementing XAI methods in breast cancer datasets. Thirty studies met our inclusion criteria and were included in the analysis. The results revealed that SHapley Additive exPlanations (SHAP) is the top model-agnostic XAI technique in breast cancer research in terms of usage, explaining the model prediction results, diagnosis and classification of biomarkers, and prognosis and survival analysis. Additionally, the SHAP mo
Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and of up to 0.696 on a human breast cancer dataset.
Among American women, the rate of breast cancer is only second to lung cancer. An estimated 12.4% women will develop breast cancer over the course of their lifetime. The widespread use of social media across the socio-economic spectrum offers unparalleled ways to facilitate information sharing, in particular as it pertains to health. Social media is also used by many healthcare stakeholders, ranging from government agencies to healthcare industry, to disseminate health information and to engage patients. The purpose of this study is to investigate people's perceptions and attitudes relate to breast cancer, especially those that are related to physical activities, on Twitter. To achieve this, we first identified and collected tweets related to breast cancer; and then used topic modeling and sentiment analysis techniques to understanding discussion themes and quantify Twitter users' perceptions and emotions w.r.t breast cancer to answer 5 research questions.
Interpretability is critical in high-stakes domains such as medical imaging, where understanding model decisions is essential for clinical adoption. In this work, we introduce Sparse Autoencoder (SAE)-based interpretability to breast imaging by analyzing {Mammo-CLIP}, a vision--language foundation model pretrained on large-scale mammogram image--report pairs. We train a patch-level \texttt{Mammo-SAE} on Mammo-CLIP to identify and probe latent features associated with clinically relevant breast concepts such as \textit{mass} and \textit{suspicious calcification}. Our findings reveal that top activated class level latent neurons in the SAE latent space often tend to align with ground truth regions, and also uncover several confounding factors influencing the model's decision-making process. Additionally, we analyze which latent neurons the model relies on during downstream finetuning for improving the breast concept prediction. This study highlights the promise of interpretable SAE latent representations in providing deeper insight into the internal workings of foundation models at every layer for breast imaging. The code will be released at https://krishnakanthnakka.github.io/MammoS
Cardiotoxicity induced by the breast cancer treatments (i.e., chemotherapy, targeted therapy and radiation therapy) is a significant problem for breast cancer patients. The cardiotoxicity risk for breast cancer patients receiving different treatments remains unclear. We developed and evaluated risk predictive models for cardiotoxicity in breast cancer patients using EHR data. The AUC scores to predict the CHF, CAD, CM and MI are 0.846, 0.857, 0.858 and 0.804 respectively. After adjusting for baseline differences in cardiovascular health, patients who received chemotherapy or targeted therapy appeared to have higher risk of cardiotoxicity than patients who received radiation therapy. Due to differences in baseline cardiac health across the different breast cancer treatment groups, caution is recommended in interpreting the cardiotoxic effect of these treatments.
Deep learning models have achieved promising results in breast cancer classification, yet their 'black-box' nature raises interpretability concerns. This research addresses the crucial need to gain insights into the decision-making process of convolutional neural networks (CNNs) for mammogram classification, specifically focusing on the underlying reasons for the CNN's predictions of breast cancer. For CNNs trained on the Mammographic Image Analysis Society (MIAS) dataset, we compared the post-hoc interpretability techniques LIME, Grad-CAM, and Kernel SHAP in terms of explanatory depth and computational efficiency. The results of this analysis indicate that Grad-CAM, in particular, provides comprehensive insights into the behavior of the CNN, revealing distinctive patterns in normal, benign, and malignant breast tissue. We discuss the implications of the current findings for the use of machine learning models and interpretation techniques in clinical practice.