Coronary artery disease (CAD) is the leading cause of death worldwide, and coronary angiography (CAG) serves as the gold standard for its assessment. Valvular heart diseases, such as severe aortic stenosis (AS) and severe mitral regurgitation (MR), frequently coexist with CAD yet are often underdiagnosed. Opportunistic screening for these conditions at the time of CAG could influence therapeutic strategies and improve prognosis. This study developed and validated a foundation model for the automated screening of severe AS and severe MR from CAG videos. The study presents CAGFound, a video-based foundation model that was pre-trained in a self-supervised manner on CAG sequences from seven medical centers and subsequently adapted to two downstream tasks: screening for severe AS and severe MR. The internal and external validation datasets were retrospectively collected from the First Medical Center and the Sixth Medical Center of Chinese PLA General Hospital, respectively. A total of 117,383 unlabeled CAG sequences were used to build CAGFound. For the detection of severe AS, CAGFound achieved an area under the receiver operating characteristic curve (AUROC) of 0.932 (sensitivity 0.767, specificity 0.921) on the internal test dataset and maintained robust performance on the external validation dataset, with an AUROC of 0.879 (sensitivity 0.800, specificity 0.955). For the detection of severe MR, the model achieved an AUROC of 0.933 (sensitivity 0.738, specificity 0.938) on the internal dataset and an AUROC of 0.896 (sensitivity 0.754, specificity 0.855) on the external cohort. The performance of CAGFound was also compared with that of two other video-based foundation models, VideoMAEv2 and Video Swin. CAGFound achieved the highest AUROC and the best calibration (Brier score 0.122, R² 0.478) compared with VideoMAEv2 (Brier score 0.159, R² 0.306) and Video Swin (Brier score 0.162, R² 0.306).
CAGFound enables accurate, automated screening for severe AS and severe MR during CAG. It has the potential to increase detection rates, facilitate timely clinical referral, and improve prognosis without requiring additional contrast administration or procedures.
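The reported metrics (AUROC, sensitivity, specificity, and Brier score) can be illustrated with a minimal sketch. The function names, the 0.5 decision threshold, and the toy labels and probabilities below are assumptions made for illustration only; they are not taken from the study, whose operating thresholds are not stated in the abstract.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity at a given probability threshold (assumed 0.5)."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the study.
labels = [1, 1, 0, 0, 1, 0]
probs = [0.9, 0.4, 0.2, 0.6, 0.8, 0.1]
print(brier_score(labels, probs))
print(sensitivity_specificity(labels, probs))
print(auroc(labels, probs))
```

A lower Brier score indicates better-calibrated probabilities, which is the sense in which the calibration results above are compared across models.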
Die-forging structural parts are widely used in the main load-bearing components of aircraft because of their excellent mechanical properties and fatigue resistance. However, the forming and heat treatment processes of die-forging structural parts are complex, leading to high levels of internal stress and a complex distribution of residual stress fields (RSFs), which affect the deformation, fatigue life, and failure of structural parts throughout their lifecycles. Hence, knowledge of the global RSF can provide a basis for process control. The existing RSF inference method based on deformation force data can utilize monitoring data to infer the global RSF of a regular part. However, owing to the irregular geometry of die-forging structural parts and the complexity of the RSF, the inference problem is ill-conditioned and challenging to solve, which makes it difficult to obtain the RSF accurately. This paper presents a global RSF inference method for die-forging structural parts based on the fusion of monitoring data and a distribution prior. The prior knowledge was derived from the RSF distribution trends obtained through finite element analysis. This enables a low-dimensional characterization of the RSF, reducing the number of parameters required to solve the equations. The effectiveness of this method was validated in both simulation and actual environments.
The vision transformer (ViT) architecture, with its attention mechanism based on multi-head attention layers, has been widely adopted in various computer-aided diagnosis tasks due to its effectiveness in processing medical image information. ViTs are notably recognized for their complex architecture, which requires high-performance GPUs or CPUs for efficient model training and deployment in real-world medical diagnostic devices. This renders them more intricate than convolutional neural networks (CNNs). The difficulty is compounded in histopathology image analysis, where the images are both limited and complex. In response to these challenges, this study proposes TokenMixer, a hybrid architecture that combines the strengths of CNNs and ViTs. This hybrid architecture aims to enhance feature extraction and classification accuracy with shorter training time and fewer parameters by minimizing the number of input patches employed during training, while incorporating tokenization of input patches using convolutional layers and encoder transformer layers to process patches across all network layers for fast and accurate breast cancer tumor subtype classification. The TokenMixer mechanism is inspired by the ConvMixer and TokenLearner models. First, the ConvMixer model dynamically generates spatial attention maps using convolutional layers, enabling the extraction of patches from input images to minimize the number of input patches used in training. Second, the TokenLearner model extracts relevant regions from the selected input patches, tokenizes them to improve feature extraction, and trains all tokenized patches in an encoder transformer network. We evaluated the TokenMixer model on the BreakHis public dataset, comparing it with ViT-based and other state-of-the-art methods. Our approach achieved strong results for both binary and multi-class classification of breast cancer subtypes across various magnification levels (40×, 100×, 200×, 400×).
The model achieved accuracies of 97.02% for binary classification and 93.29% for multi-class classification, with decision times of 391.71 and 1173.56 s, respectively. These results highlight the potential of our hybrid deep ViT-CNN architecture for advancing tumor classification in histopathological images. The source code is available at https://github.com/abimouloud/TokenMixer
Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and leveraging digital health advancements; however, existing prediction models face limitations, primarily relying on ensemble classification techniques with suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection and temporal predictive modeling. For facial feature extraction, this study developed a lightweight face-region convolutional neural network (FRegNet) specialized in detecting key facial components, such as the eyes and lips, in clinical patients. FRegNet incorporates a residual backbone (Rstem) to enhance feature representation and a facial path aggregated feature pyramid network for multi-resolution feature fusion. Comparative experiments reveal that FRegNet outperforms state-of-the-art target detection algorithms, achieving an average precision (AP) of 0.922, average recall of 0.933, mean average precision (mAP) of 0.987, and precision of 0.98, significantly surpassing other mask region-based convolutional neural network (RCNN) variants, such as mask RCNN-ResNeXt with an AP of 0.789 and mAP of 0.957. Based on the extracted facial features and clinical physiological indicators, this study proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance. Comparative results demonstrate that the ETED variant incorporating facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, precision of 0.850, recall of 0.895, F1 of 0.884, and area under the curve (AUC) of 0.947. It outperformed gradient boosting (accuracy of 0.922 but AUC of 0.669) and other classifiers in comprehensive metrics.
The results confirm that the multimodal dataset (facial features + physiological indicators) significantly enhances the prediction accuracy of the seven-day survival conditions of patients. Correlation analysis reveals that chronic health evaluation and mean arterial pressure are positively correlated with survival, while temperature, Glasgow Coma Scale, and fibrinogen are negatively correlated.
Active surveillance (AS) is the primary strategy for managing patients with low or favorable-intermediate risk prostate cancer (PCa). Identifying patients who may benefit from AS relies on unpleasant prostate biopsies, which entail the risk of bleeding and infection. In the current study, we aimed to develop a radiomics model based on prostate magnetic resonance images to identify AS candidates non-invasively. A total of 956 PCa patients with complete biopsy reports from six hospitals were included in the current multicenter retrospective study. The National Comprehensive Cancer Network (NCCN) guidelines were used as the reference standard to determine AS candidacy. To discriminate between AS and non-AS candidates, five radiomics models were developed and externally validated using three-fold cross-center validation, each based on a different classifier: eXtreme Gradient Boosting (XGBoost; the XGB-AS model), logistic regression (LR), random forest (RF), adaptive boosting (AdaBoost), and decision tree (DT). The area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE) were calculated to evaluate the performance of these models. XGB-AS exhibited an average AUC of 0.803, ACC of 0.693, SEN of 0.668, and SPE of 0.841, showing better overall performance than the other radiomics models. Additionally, the XGB-AS model showed promising performance in identifying AS candidates among the intermediate-risk cases and the ambiguous cases with diagnostic discordance between the NCCN guidelines and the Prostate Imaging-Reporting and Data System assessment.
These results suggest that the XGB-AS model has the potential to help identify patients who are suitable for AS and allow non-invasive monitoring of patients on AS, thereby reducing the number of annual biopsies and the associated risks of bleeding and infection.
Hypertensive retinopathy (HR) occurs when the blood vessels of the retina, the photosensitive layer at the back of the eye, are injured owing to high blood pressure. Artificial intelligence (AI) in retinal image analysis (RIA) for HR diagnosis involves the use of advanced computational algorithms and machine learning (ML) strategies to recognize and evaluate signs of HR in retinal images automatically. This review aims to advance the field of HR diagnosis by investigating the latest ML and deep learning techniques, and highlighting their efficacy and capability for early diagnosis and intervention. By analyzing recent advancements and emerging trends, this study seeks to inspire further innovation in automated RIA. In this context, AI shows significant potential for enhancing the accuracy, effectiveness, and consistency of HR diagnoses. This will eventually lead to better clinical results by enabling earlier intervention and precise management of the condition. Overall, the integration of AI into RIA represents a considerable step forward in the early identification and treatment of HR, offering substantial benefits to both healthcare providers and patients.
Early allograft dysfunction (EAD) significantly affects liver transplantation prognosis. This study evaluated the effectiveness of artificial intelligence (AI)-assisted methods in accurately diagnosing EAD and identifying its causes. The primary metric for assessing the accuracy was the area under the receiver operating characteristic curve (AUC). Accuracy, sensitivity, and specificity were calculated and analyzed to compare the performance of the AI models with each other and with radiologists. EAD classification followed the criteria established by Olthoff et al. A total of 582 liver transplant patients who underwent transplantation between December 2012 and June 2021 were selected. Among these, 117 patients (mean age 33.5 ± 26.5 years, 80 men) were evaluated. The ultrasound parameters, images, and clinical information of patients were extracted from the database to train the AI model. The AUC for the ultrasound-spectrogram fusion network constructed from four ultrasound images and medical data was 0.968 (95%CI: 0.940, 0.991), outperforming radiologists by 30% for all metrics. AI assistance significantly improved diagnostic accuracy, sensitivity, and specificity (P < 0.050) for both experienced and less-experienced physicians. EAD lacks efficient diagnosis and causation analysis methods. The integration of AI and ultrasound enhances diagnostic accuracy and causation analysis. By modeling only images and data related to blood flow, the AI model effectively analyzed patients with EAD caused by abnormal blood supply. Our model can assist radiologists in reducing judgment discrepancies, potentially benefitting patients with EAD in underdeveloped regions. Furthermore, it enables targeted treatment for those with abnormal blood supply.
Fluorescence endoscopy technology utilizes a light source of a specific wavelength to excite the fluorescence signals of biological tissues. This capability is extremely valuable for the early detection and precise diagnosis of pathological changes. Identifying a suitable experimental approach and metric for objectively and quantitatively assessing the imaging quality of fluorescence endoscopy is imperative to enhance the image evaluation criteria of fluorescence imaging technology. In this study, we propose a new set of standards for fluorescence endoscopy technology to evaluate the optical performance and image quality of fluorescence imaging objectively and quantitatively. This comprehensive set of standards encompasses fluorescence test models and imaging quality assessment protocols to ensure that the performance of fluorescence endoscopy systems meets the required standards. In addition, it aims to enhance the accuracy and uniformity of the results by standardizing testing procedures. The formulation of pivotal metrics and testing methodologies is anticipated to facilitate direct quantitative comparisons of the performance of fluorescence endoscopy devices. This advancement is expected to foster the harmonization of clinical and preclinical evaluations using fluorescence endoscopy imaging systems, thereby improving diagnostic precision and efficiency.
This study presents a novel visualization approach to explainable artificial intelligence for graph-based visual question answering (VQA) systems. The method focuses on identifying false answer predictions by the model and offers users the opportunity to directly correct mistakes in the input space, thus facilitating dataset curation. The decision-making process of the model is demonstrated by highlighting certain internal states of a graph neural network (GNN). The proposed system is built on top of a GraphVQA framework that implements various GNN-based models for VQA trained on the GQA dataset. The authors evaluated their tool through the demonstration of identified use cases, quantitative measures, and a user study conducted with experts from machine learning, visualization, and natural language processing domains. The authors' findings highlight the prominence of their implemented features in supporting the users with incorrect prediction identification and identifying the underlying issues. Additionally, their approach is easily extendable to similar models aiming at graph-based question answering.
Breast cancer, the most commonly diagnosed cancer among women, is a notable global health issue. It results from abnormal cells in the breast tissue growing out of control. Histopathology, the microscopic study of diseased tissue, has emerged as an important tool in breast cancer care because it plays a vital role in diagnosis and classification. Thus, considerable research on histopathology in medical and computer science has been conducted to develop effective methods for breast cancer treatment. In this study, a vision Transformer (ViT) was employed to classify tumors into two classes, benign and malignant, in the Breast Cancer Histopathological Database (BreakHis). To enhance the model performance, we introduced a novel multi-head locality large kernel self-attention during fine-tuning, achieving an accuracy of 95.94% at 100× magnification, thereby improving the accuracy by 3.34% compared to a standard ViT (which uses multi-head self-attention). In addition, the application of principal component analysis for dimensionality reduction led to an accuracy improvement of 3.34%, highlighting its role in mitigating overfitting and reducing the computational complexity. In the final phase, SHapley Additive exPlanations, Local Interpretable Model-agnostic Explanations, and Gradient-weighted Class Activation Mapping were used for the interpretability and explainability of the machine-learning models, aiding in understanding the feature importance and local explanations, and visualizing the model attention. In another experiment, ensemble learning with VGGIN further boosted the performance to 97.13% accuracy. Our approach exhibited a 0.98% to 17.13% improvement in accuracy compared with state-of-the-art methods, establishing a new benchmark for breast cancer histopathological image classification.
Epilepsy is a chronic neurological disorder characterized by recurrent seizures that can lead to death. Seizure treatment usually involves antiepileptic drugs and sometimes surgery, but patients with drug-resistant epilepsy often remain effectively untreated owing to the lack of targeted therapies. The development of a reliable technique for detecting and predicting epileptic seizures could significantly impact clinical treatment protocols and the care of patients with epilepsy. Over the years, researchers have developed various computational techniques using scalp electroencephalography (EEG), intracranial EEG, and other neuroimaging modalities, evolving from traditional signal processing methods (e.g., wavelet transforms and template matching) to advanced machine learning (ML, e.g., support vector machines and random forests) and deep learning (DL) algorithms (e.g., convolutional neural networks, recurrent neural networks, transformers, graph neural networks, and hybrid architectures). This review provides a detailed examination of epileptic seizure detection and prediction, covering the key aspects of signal processing, ML algorithms, and DL techniques applied to brainwave signals. We systematically categorized the techniques, analyzed key research trends, and identified critical challenges (e.g., data scarcity, model generalizability, and real-time processing). By highlighting the gaps in the literature, this review serves as a valuable resource for researchers and offers insights into future directions for improving the accuracy, interpretability, and clinical applicability of EEG-based seizure detection systems.
Federated learning (FL) has shown great potential in addressing data privacy issues in medical image analysis. However, varying data distributions across different sites can create challenges in aggregating client models and achieving good global model performance. In this study, we propose a novel personalized contrastive representation FL framework, named PCRFed, which leverages contrastive representation learning to address the non-independent and identically distributed (non-IID) challenge and dynamically adjusts the distance between local clients and the global model to improve each client's performance without incurring additional communication costs. The proposed weighted model-contrastive loss provides additional regularization for local models, optimizing their respective distributions while effectively utilizing information from all clients to mitigate performance challenges caused by insufficient local data. The PCRFed approach was evaluated on two non-IID medical image segmentation datasets, and the results show that it outperforms several state-of-the-art FL frameworks, achieving higher single-client performance while ensuring privacy preservation and minimal communication costs. Our PCRFed framework can be adapted to various encoder-decoder segmentation network architectures and holds significant potential for advancing the use of FL in real-world medical applications. Based on a multi-center dataset, our framework demonstrates superior overall performance and higher single-client performance, achieving a 2.63% increase in the average Dice score for prostate segmentation.
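The abstract does not give the formula for the weighted model-contrastive loss. As a rough illustration only, the sketch below uses a common general form of model-contrastive loss: the local representation is pulled toward the global model's representation and pushed away from the previous local model's representation. The scalar `weight`, the `temperature` value, and all function names are hypothetical stand-ins, not the PCRFed definition.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def model_contrastive_loss(
    z_local: Sequence[float],
    z_global: Sequence[float],
    z_previous: Sequence[float],
    temperature: float = 0.5,  # assumed value
    weight: float = 1.0,       # hypothetical per-client weight
) -> float:
    """Generic model-contrastive loss, scaled by an assumed per-client weight.

    Positive pair: (local, global). Negative pair: (local, previous local).
    """
    pos = math.exp(cosine_similarity(z_local, z_global) / temperature)
    neg = math.exp(cosine_similarity(z_local, z_previous) / temperature)
    return -weight * math.log(pos / (pos + neg))


# Hypothetical 2-D representations: the loss is small when the local
# representation matches the global one and large when it matches the old local one.
print(model_contrastive_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
print(model_contrastive_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

In a scheme of this kind, the term would be added to the local segmentation loss as a regularizer; how PCRFed sets and adapts the weight per client is not described in the abstract.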
This review examines the current applications, benefits, challenges, and future potential of artificial intelligence (AI) and immersive technologies in aviation. AI has been applied across various domains, including flight operations, air traffic control, maintenance, and ground handling. AI enhances aviation safety by enabling pilot assistance systems, mitigating human error, streamlining safety management systems, and aiding in accident analysis. Lightweight AI models are crucial for mobile applications in aviation, particularly for resource-constrained environments such as drones. Hardware considerations involve trade-offs between energy-efficient field-programmable gate arrays and power-consuming graphics processing units. Battery and thermal management are critical for mobile device applications. Although AI integration has numerous benefits, including enhanced safety, improved efficiency, and reduced environmental impact, it also presents challenges. Addressing algorithmic bias, ensuring cybersecurity, and managing the relationship between human operators and AI systems are crucial. The future of aviation will likely involve even more sophisticated AI algorithms, advanced hardware, and increased integration of AI with augmented reality and virtual reality, creating new possibilities for training and operations, and ultimately leading to a safer, more efficient, and more sustainable aviation industry.
Speech is a highly coordinated process that requires precise control over vocal tract morphology and motion to produce intelligible sounds while simultaneously generating unique exhaled flow patterns. The schlieren imaging technique visualizes airflows with subtle density variations. It is hypothesized that speech flows captured by schlieren imaging, when analyzed using a hybrid of a convolutional neural network (CNN) and a long short-term memory (LSTM) network, can be used to recognize letter pronunciations, thus facilitating automatic speech recognition and speech disorder therapy. This study evaluates the feasibility of using a CNN-based video classification network to differentiate speech flows corresponding to the first four letters of the alphabet: /A/, /B/, /C/, and /D/. A schlieren optical system was developed, and the speech flows of letter pronunciations were recorded for two participants at an acquisition rate of 60 frames per second. A total of 640 video clips, each lasting 1 s, were used to train and test a hybrid CNN-LSTM network. Acoustic analyses of the recorded sounds were conducted to understand the phonetic differences among the four letters. The hybrid CNN-LSTM network was trained separately on four datasets of varying sizes (i.e., 20, 30, 40, and 50 videos per letter), all achieving over 95% accuracy in classifying videos of the same participant. However, the network's performance declined when tested on speech flows from a different participant, with accuracy dropping to around 44%, indicating significant inter-participant variability in letter pronunciation. Retraining the network with videos from both participants improved accuracy to 93% on the second participant. Analysis of misclassified videos indicated that factors such as low video quality and disproportional head size affected accuracy.
These results highlight the potential of CNN-assisted speech recognition and speech therapy using articulation flows, although challenges remain in expanding the alphabet set and participant cohort.
This study conducted a computational investigation to explore the influence of clinical reference uncertainty on magnetic resonance imaging (MRI) radiomics feature selection, modelling, and performance. It used two sets of publicly available prostate cancer MRI radiomics data (Dataset 1: n = 260; Dataset 2: n = 100) with Gleason score clinical references. Each dataset was divided into training and hold-out testing datasets at a ratio of 7:3 and analysed independently. The clinical references of the training set were permuted at different levels (increments of 5%), and each permutation was repeated 20 times. Four feature selection algorithms and two classifiers were used to construct the models. Cross-validation was employed for training, while a separate hold-out testing set was used for evaluation. The Jaccard similarity coefficient was used to evaluate feature selection, while the area under the curve (AUC) and accuracy were used to assess model performance. An analysis of variance test with Bonferroni correction was conducted to compare the metrics of each model. The consistency of the feature selection performance decreased substantially with the clinical reference permutation. The AUCs of the models trained with permutation, particularly at levels of 20% and above, were significantly lower (Dataset 1 with ≥ 20% permutation: 0.67; Dataset 2 with ≥ 20% permutation: 0.74) than the AUCs of models trained without permutation (Dataset 1: 0.94; Dataset 2: 0.97). Model performance also showed larger uncertainties as the number of permuted clinical references increased. Clinical reference uncertainty can substantially influence MRI radiomic feature selection and modelling. Highly accurate clinical references should be helpful in building reliable and robust radiomic models. Careful interpretation of the model performance is necessary, particularly for high-dimensional data.
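The two core operations in this design, permuting a fraction of the training labels and comparing selected feature sets with the Jaccard similarity coefficient, can be sketched as follows. The exact permutation scheme is not described in the abstract; shuffling the labels of a randomly chosen subset among themselves is an assumption, and the function names, seed, and feature names are hypothetical.

```python
import random
from typing import List, Sequence, Set


def permute_labels(labels: Sequence[int], fraction: float, seed: int) -> List[int]:
    """Shuffle the labels of a randomly chosen fraction of samples among themselves.

    Assumed scheme: class counts are preserved; only the assignment changes.
    """
    rng = random.Random(seed)
    n_permute = round(len(labels) * fraction)
    chosen = rng.sample(range(len(labels)), n_permute)
    shuffled = [labels[i] for i in chosen]
    rng.shuffle(shuffled)
    permuted = list(labels)
    for index, value in zip(chosen, shuffled):
        permuted[index] = value
    return permuted


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical example: 20 binary references, permuted at 5% increments.
reference = [0, 1] * 10
for level in range(0, 30, 5):
    noisy = permute_labels(reference, level / 100, seed=0)
    changed = sum(1 for r, n in zip(reference, noisy) if r != n)
    print(level, changed)

# Feature sets selected with and without permutation (hypothetical names).
print(jaccard({"f1", "f2", "f3"}, {"f2", "f3", "f4"}))
```

A Jaccard value of 1 means the same features were selected with and without permutation; values near 0 indicate that label noise changed the selection almost entirely.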
Cataract is the leading cause of blindness and visual impairment globally. Deep neural networks (DNNs) have achieved promising cataract recognition performance based on anterior segment optical coherence tomography (AS-OCT) images; however, they offer poor explainability, limiting their clinical applications. In contrast, visual features extracted from original AS-OCT images and their transformed forms (e.g., AS-OCT-based histograms) are readily explainable but have not been fully exploited. Motivated by these observations, an explainable machine learning framework to recognize cataract severity levels automatically using AS-OCT images was proposed, consisting of three stages: visual feature extraction, feature importance explanation and selection, and recognition. First, the intensity histogram and intensity-based statistical methods are applied to extract visual features from original AS-OCT images and AS-OCT-based histograms. Subsequently, the SHapley Additive exPlanations and Pearson correlation coefficient methods are applied to analyze the feature importance and select significant visual features. Finally, an ensemble multi-class ridge regression method is applied to recognize the cataract severity levels based on the selected visual features. Experiments on a clinical AS-OCT-NC dataset demonstrate that the proposed framework not only achieves competitive performance in comparison with DNNs, but also has good explanation ability, meeting the requirements of clinical diagnostic practice.
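The first two stages can be pictured with a minimal sketch: a normalized intensity histogram as a visual feature vector, and the Pearson correlation coefficient as one way to relate a feature to severity labels. The bin count, the 8-bit intensity range, the function names, and the toy values are assumptions for illustration; the framework's actual feature definitions and selection thresholds are not given in the abstract.

```python
import math
from typing import List, Sequence


def intensity_histogram(
    pixels: Sequence[int], bins: int = 8, max_value: int = 255
) -> List[float]:
    """Normalized histogram of integer pixel intensities in [0, max_value]."""
    counts = [0] * bins
    for pixel in pixels:
        index = min(pixel * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    total = len(pixels)
    return [count / total for count in counts]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical pixel values and labels, not from the AS-OCT-NC dataset.
print(intensity_histogram([0, 31, 32, 255]))
mean_intensity = [40.0, 75.0, 110.0, 150.0]  # one feature value per image
severity_level = [1.0, 2.0, 3.0, 4.0]        # corresponding severity labels
print(pearson(mean_intensity, severity_level))
```

Features whose correlation with the severity label is strong in magnitude would be candidates for selection, alongside the SHAP-based importance analysis described above.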
In recent years, the application of artificial intelligence (AI) in medical image analysis has drawn increasing attention in clinical studies of gynecologic tumors. This study presents the development and prospects of AI applications to assist in the treatment of gynecological oncology. The Web of Science database was screened for articles published until August 2023. The following keywords were used: "artificial intelligence," "deep learning," "machine learning," "radiomics," "radiotherapy," "chemoradiotherapy," "neoadjuvant therapy," "immunotherapy," "gynecological malignancy," "cervical carcinoma," "cervical cancer," "ovarian cancer," "endometrial cancer," "vulvar cancer," and "vaginal cancer." Research articles related to AI-assisted treatment of gynecological cancers were included. A total of 317 articles were retrieved based on the search strategy, and 133 were selected by applying the inclusion and exclusion criteria, including 114 on cervical cancer, 10 on endometrial cancer, and 9 on ovarian cancer. Among the included studies, 44 (33%) focused on prognosis prediction, 24 (18%) on treatment response prediction, 13 (10%) on adverse event prediction, 5 (4%) on dose distribution prediction, and 47 (35%) on target volume delineation. Target volume delineation and dose prediction were performed using deep learning methods. For the prediction of treatment response, prognosis, and adverse events, 57 studies (70%) used conventional radiomics methods, 13 (16%) used deep learning methods, 8 (10%) used spatial-related unconventional radiomics methods, and 3 (4%) used temporal-related unconventional radiomics methods. In cervical and endometrial cancers, the prediction targets mostly included treatment response, overall survival, recurrence, radiotherapy-related toxicity, lymph node metastasis, and dose distribution. For ovarian cancer, the prediction targets included platinum sensitivity and postoperative complications.
The majority of the studies were single-center, retrospective, and small-scale; 101 studies (76%) had single-center data, 125 studies (94%) were retrospective, and 127 studies (95%) included fewer than 500 cases. The application of AI in assisting treatment in gynecological oncology remains limited. Although AI has shown strong results in predicting the response, prognosis, adverse events, and dose distribution in gynecological oncology, these results have not been validated on substantial data from multiple centers.
Web-based libraries, such as D3.js, ECharts.js, and G6.js, are widely used to generate node-link graph visualizations. These libraries allow users to call application programming interfaces (APIs) without identifying the details of the encapsulated techniques such as graph layout algorithms and graph rendering methods. Efficiency requirements, such as visualizing a graph with 3k nodes and 4k edges within 1 min at a frame rate of 30 fps, are crucial for selecting a proper library because libraries generally present different characteristics owing to the diversity of encapsulated techniques. However, existing studies have mainly focused on verifying the advantages of a new layout algorithm or rendering method from a theoretical viewpoint independent of specific web-based libraries. Their conclusions are difficult for end users to understand and utilize. Therefore, a trial-and-error selection process is required. This study addresses this gap by conducting an empirical experiment to evaluate the performance of web-based libraries. The experiment involves popular libraries and hundreds of graph datasets covering node scales from 100 to 200k and edge-to-node ratios from 1 to 10 (including complete graphs). The experimental results are the time costs and frame rates recorded using the libraries to visualize the datasets. The authors analyze the performance characteristics of each library in depth based on the results and organize the results and findings into application-oriented guidelines. Additionally, they present three usage cases to illustrate how the guidelines can be applied in practice. These guidelines offer user-friendly and reliable recommendations, aiding users in quickly selecting the desired web-based libraries based on their specific efficiency requirements for node-link graph visualizations.
This study presents an energy consumption (EC) forecasting method for laser melting manufacturing of metal artifacts based on fusionable transfer learning (FTL). To predict the EC of manufacturing products, particularly from scale-down to scale-up, a general paradigm was first developed by categorizing the overall process into three main sub-steps. The operating electrical power was further formulated as a combinatorial function, based on which an operator learning network was adopted to fit the nonlinear relations between the fabricating arguments and EC. Parallel-arranged networks were constructed to investigate the impacts of fabrication variables and devices on power. Considering the interconnections among these factors, the outputs of the neural networks were blended and fused to jointly predict the electrical power. Most innovatively, large artifacts can be decomposed into time-dependent laser-scanning trajectories, which can be further transformed into fusionable information via neural networks, an approach inspired by large language models. Accordingly, transfer learning can deal with either scale-down or scale-up forecasting, namely, FTL with scalability within artifact structures. The effectiveness of the proposed FTL was verified through physical fabrication experiments via laser powder bed fusion. The relative error of the average and overall EC predictions based on FTL was maintained below 0.83%. The melting fusion quality was examined using metallographic diagrams. The proposed FTL framework can forecast the EC of scaled structures, which is particularly helpful in price estimation and quotation of large metal products towards carbon peaking and carbon neutrality.
Photoacoustic imaging (PAI), a modality that combines the high contrast of optical imaging with the deep penetration of ultrasound, is rapidly transitioning from preclinical research to clinical practice. However, its widespread clinical adoption faces challenges such as the inherent trade-off between penetration depth and spatial resolution, along with the demand for faster imaging speeds. This review comprehensively examines the fundamental principles of PAI, focusing on three primary implementations: photoacoustic computed tomography, photoacoustic microscopy, and photoacoustic endoscopy. It critically analyzes their respective advantages and limitations to provide insights into practical applications. The discussion then extends to recent advancements in image reconstruction and artifact suppression, where both conventional and deep learning (DL)-based approaches have been highlighted for their role in enhancing image quality and streamlining workflows. Furthermore, this work explores progress in quantitative PAI, particularly its ability to precisely measure hemoglobin concentration, oxygen saturation, and other physiological biomarkers. Finally, this review outlines emerging trends and future directions, underscoring the transformative potential of DL in shaping the clinical evolution of PAI.