Medaka (Oryzias latipes) is a small freshwater teleost widely used as a vertebrate model organism. Existing medaka reference genomes, however, contain many gaps and unresolved repetitive regions, hindering precise genome annotation and comparative analyses. Here we present one complete and two near-complete genome assemblies for three inbred medaka strains derived from geographically distant populations. These assemblies provide a comprehensive view of highly repetitive sequences and chromosome-scale genome architecture in medaka. The fully resolved centromeres reveal an intriguing sequence organization characterized by short, distinct SF1+3 satellite arrays flanked by larger homogenized repeats. These short arrays are putatively hypomethylated and conserved across all acrocentric chromosomes, suggesting a functional role in centromere stability. The reconstructed 121 copies of the giant mobile element Teratorn retain complete genes of both a transposon and a herpesvirus, highlighting its unique persistence and impact on host genomes. Moreover, our assemblies reveal extensive structural divergence of medaka Y Chromosomes, yet identify a small (~24 kb) conserved region encompassing Dmy that may suffice for male determination. Collectively, these (near-)complete medaka genomes provide a powerful resource for exploring the biology of uncharacterized repetitive regions and the molecular basis of phenotypic diversity in vertebrates.
Causal therapy has achieved success in the treatment of epithelial tumors, which account for more than 80% of all cancers. Although frequently claimed as breakthroughs, cancer therapy has achieved limited increases in survival of only weeks to several months, and cancer incidence continues to increase while metastasis rates, which are primarily responsible for cancer mortality, remain constant. This reflects an incomplete understanding of carcinogenesis and metastatic progression. Over a 25-year timeframe, the series "Epistemology of the Origin of Cancer" has examined the biological basis of carcinogenesis and metastasis. Part I addressed carcinogenesis, Part II identified the first cancer cell, and Part III described the development of local pre-metastatic niches and traveling cancer satellites. In this fourth part, the conditions required for distant metastasis are discussed, including the sequential development of metastatic niches (MN-1, MN-2, and MN-3), transendothelial migration, dormancy, immune modulation, extracellular matrix remodeling, and metastatic niche maturation. The review proposes that metastatic progression depends on the formation of metastatic cancer satellites consisting of metastatic cancer cells, metastasis-associated fibroblasts (MAFs), stromal components, chemokine coatings, platelets, and neutrophil extracellular traps (NETs), each contributing to immune evasion and dissemination. This sequential distant metastatic niche model provides a biological framework explaining clinical observations including metastatic dormancy, relapse after surgery or anticancer therapy, tumor heterogeneity, and the limited long-term success of current therapeutic approaches.
Monitoring agricultural lands is crucial for achieving food security. Earth Observation (EO) has recently become an essential tool to reach this goal owing to advances in the spatial and temporal resolutions as well as the radiometric accuracy of current satellite sensors. Yet the size of the datasets and the need to preprocess them limit their dissemination to some scientific communities and to stakeholders. The dataset presented here is composed of multi-year time series of variables acquired at high spatial resolution (10-30 m) and medium to high temporal resolution (a few days when combining images from different swaths) by radar (Sentinel-1) and multispectral (Landsat-8 and Sentinel-2) EO satellites, along with commonly used vegetation indices and biophysical variables, averaged over more than 1,400 agricultural fields. Data were made available in the framework of a project aimed at developing the application of numerical tools to EO data for agroecology purposes. This dataset was made publicly available owing to its ease of use and its interest for agronomists, environmentalists, and economic and political stakeholders.
Amid an ongoing warming climate, long-term increases in surface ozone (O3) concentrations have been reported across East Asia, including South Korea, despite sustained efforts to reduce precursor emissions. In this study, we use an extreme heat episode as a physically plausible analog of near-future warming conditions to examine how climate-driven temperature increases may offset the air quality benefits of emission reductions in two major metropolitan areas in South Korea: Seoul and Busan. Specifically, we selected the unprecedented heatwave during the summer of 2018, which recorded the highest seasonal mean temperature over the past decade and represents meteorological conditions analogous to those expected under continued climate warming. Heatwave-specific biogenic volatile organic compound (BVOC) emissions were contrasted with those during non-heatwave conditions using surface observations, satellite-based formaldehyde (HCHO) retrievals from TROPOMI, and WRF-Chem simulations. The results show that temperature was positively correlated with both HCHO and O3 in Seoul (R = 0.59 and 0.69, respectively) and Busan (R = 0.60 and 0.57, respectively), consistent with enhanced temperature-driven BVOC emissions and subsequent oxidation reflected by HCHO. These findings indicate that rising temperatures intensify photochemical reactions and alter O3 production sensitivity in major megacities through the amplification of BVOC emissions, particularly under VOC-limited ozone formation regimes. Our results suggest that continued climate warming, represented here by historically observed extreme heat conditions, may impose a climate change penalty on urban ozone, implying that substantially greater emission reductions will be required to attain equivalent air quality benefits in major East Asian cities under future warming.
K-mer-based analysis of genomic data is ubiquitous, but the presence of repetitive k-mers continues to pose problems for the accuracy of many methods. For example, the Mash tool (Ondov et al., 2016) can accurately estimate the substitution rate between two low-repetitive sequences from their k-mer sketches; however, it is inaccurate on repetitive sequences such as the centromere of a human chromosome. Follow-up work by Blanca et al. (2021) has attempted to model how mutations affect k-mer sets based on strong assumptions that the sequence is non-repetitive and that mutations do not create spurious k-mer matches. However, the theoretical foundations for extending an estimator like Mash to work in the presence of repeat sequences have been lacking. In this work, we relax the non-repetitive assumption and propose a novel estimator for the mutation rate. We derive theoretical bounds on our estimator's bias. Our experiments show that it remains accurate for repetitive genomic sequences, such as the alpha satellite higher order repeats in centromeres. We demonstrate our estimator's robustness across diverse datasets and various ranges of the substitution rate and k-mer size. Finally, we show how sketching can be used to avoid dealing with large k-mer sets while retaining accuracy. Our software is available at https://github.com/medvedevgroup/Repeat-Aware_Substitution_Rate_Estimator.
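For context, the baseline that this abstract builds on can be sketched in a few lines. The block below is a minimal illustration of the published Mash distance, D = -(1/k) ln(2j / (1 + j)), computed from the Jaccard index j of two full k-mer sets. It is not the repeat-aware estimator proposed in the abstract, it skips sketching entirely, and the function names are hypothetical.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in seq (repeated k-mers collapse)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two k-mer sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(j: float, k: int) -> float:
    """Mash distance from a Jaccard index j and k-mer size k.

    This is the baseline formula from the Mash paper, which assumes
    non-repetitive sequences; it is NOT the repeat-aware estimator.
    """
    if j <= 0.0:
        return 1.0  # no shared k-mers: distance is capped at 1
    if j >= 1.0:
        return 0.0  # identical k-mer sets
    return -math.log(2.0 * j / (1.0 + j)) / k
```

Because `kmers` returns a set, a k-mer that occurs many times in a repetitive sequence is counted once, which gives one intuition for why set-based estimates can be off on repeats.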
This study investigated sex differences in speed, sub-technique selection and cycle characteristics during a sprint cross-country skiing time-trial qualification in the classical style. Thirty elite- to world-class cross-country skiers (15 women, 15 men; mean International Ski Federation [FIS] sprint points ∼58 in both groups) performed an FIS-regulated on-snow classical sprint competition. The initial 1.3-km time-trial qualification (prologue) was analyzed using a combined global navigation satellite system and inertial measurement unit to determine speed, sub-technique selection, and cycle characteristics. The men were ∼14% faster than the women (mean speed: 8.26 vs. 7.23 m·s⁻¹, P < 0.001). The sex difference in speed was greatest in the uphill sections (19%-20% difference) and smallest in the downhill sections (6%-9%), with intermediate differences in flat terrain (∼13%-14%). The men spent a larger portion of the time trial using double poling than the women (∼61% vs. ∼53% of total time, P < 0.001), whereas the women used diagonal stride more than the men (∼23% vs. ∼16% of total time, P < 0.001). Cycle analysis revealed that the men had 9%-11% longer cycle lengths than the women in both diagonal stride and double poling (both P < 0.05), while cycle rates were generally similar between sexes. Elite male cross-country skiers were ∼14% faster than performance-matched female skiers during a sprint time-trial qualification with equal distance. The most notable difference appeared on uphill terrain, where men relied more heavily on double poling and achieved longer cycle lengths across all sub-techniques. Consequently, female and male skiers face distinct sport-specific demands when competing over equal distances.
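The reported ∼14% speed difference can be checked from the two mean speeds, provided the women's speed is taken as the baseline (an assumption about how the authors computed it). A minimal sketch, with a hypothetical function name:

```python
def percent_faster(speed_a: float, speed_b: float) -> float:
    """Percentage by which speed_a exceeds speed_b, relative to speed_b."""
    if speed_b <= 0:
        raise ValueError("baseline speed must be positive")
    return 100.0 * (speed_a - speed_b) / speed_b


# Mean speeds from the abstract, in m/s (men, women).
difference = percent_faster(8.26, 7.23)
```

With the men's speed as the baseline instead, the same two numbers give a smaller percentage, so the choice of reference group matters when comparing such figures across studies.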
Long-term trajectories of the Normalized Difference Vegetation Index (NDVI) provide insights into ecosystem changes associated with vegetation productivity and land degradation. However, their interpretation may depend on land-use history. Here, we analyzed NDVI trajectories from Moderate Resolution Imaging Spectroradiometer (MODIS) data (2000-2022) across two contrasting regions in the Brazilian Cerrado: an established agricultural region and an expanding frontier. We evaluated whether NDVI trends differ between regions as a function of land-use dynamics and environmental/climatic conditions. NDVI trajectories were derived using Trends.Earth and correlated with MODIS Gross Primary Production (GPP). Common and region-specific trend drivers were analyzed using binary logistic regression and Wald-type Z tests. Results revealed a remote-sensing paradox. Despite pronounced differences between regions, the primary common driver of NDVI decline was the native savanna conversion to croplands and pastures. Paradoxically, the established agricultural region, where most clearing predated satellite observations, was the most stable, showing the smallest pixel proportion of NDVI decrease (7%) and the highest proportion of increase (52%). In contrast, the agricultural frontier, where native vegetation dominates and clearing largely coincided with satellite observations, showed greater NDVI decline (16%) and smaller increases (37%). Positive NDVI-GPP relationships were weaker in the established region. Region-specific drivers included precipitation, soil sand content, and fire frequency. Longer agricultural use duration contributed to NDVI increases, likely reflecting management practices that mask underlying degradation. Our findings indicate that NDVI-based assessments of land degradation in the Cerrado must consider early land-use history and agricultural management relative to the period of satellite observations.
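For readers unfamiliar with the index, NDVI is computed per pixel as (NIR - Red) / (NIR + Red), and a trajectory is a summary of how that value changes over time. The sketch below pairs the standard NDVI formula with an ordinary least-squares slope as a simple stand-in for a trend measure; it does not reproduce the Trends.Earth procedure, and the function names are hypothetical.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return math.nan  # undefined when both bands are zero
    return (nir - red) / denom


def ols_slope(xs: list[float], ys: list[float]) -> float:
    """Least-squares slope of ys against xs (e.g. annual NDVI against year)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A negative slope over 2000-2022 would correspond to an NDVI decline for that pixel; whether the decline is statistically meaningful needs a separate test that this sketch omits.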
Landfill fires pose severe environmental and public health risks, particularly in developing regions where waste accumulation is poorly managed. This study develops a multi-sensor machine-learning model for monitoring open landfill burning, using the 2023 Sarimukti Landfill fire in West Bandung, Indonesia, as a case study. UAV imagery and Sentinel-2 data were integrated to map fire-affected zones using Random Forest (RF) and Gradient Tree Boosting (GTB) algorithms. The model achieved high classification performance, with overall accuracies of 80.55% and 81.31% for two UAV datasets and high cross-validated AUC values that were comparable across classifiers and across the UAV-only and integrated inputs. A fire-mitigation priority model was then developed by combining the Burned Severity Index (BSI) and a Digital Surface Model (DSM) to evaluate the influence of elevation on fire persistence. The current study is limited to a single landfill site; future work should extend the model to multiple landfill locations and incorporate additional variables, such as thermal and meteorological data, to improve generalizability. These findings highlight the potential of UAV-satellite integration and machine learning for rapid fire monitoring and risk assessment in waste-management systems.
This study aimed to evaluate the role of contrast-enhanced mammography (CEM) in detecting multifocal/multicentric disease, as compared with MRI, in patients with breast carcinoma (BC). The study commenced after obtaining the institute's ethics approval. All included patients underwent CEM and MRI within an interval of 2 weeks. Images were interpreted as per the respective BIRADS lexicon. In addition to the baseline assessment of the index mass and its size, each modality was specifically evaluated for the presence of any additional findings such as satellite lesions or non-mass enhancement, which were categorized as multifocal or multicentric disease depending on the location of the additional lesions. Histopathology was considered the gold standard to ascertain the malignant nature of additional findings, wherever possible. The majority of the patients in our study had an early T stage, with clinical T2 (48.6%) followed by T1 (25.7%). Additional lesions could be identified in 34 (97.1%) patients on both CEM and MRI. On CEM, multifocality was seen in 23 patients (67.6%) with a median size of 1.2 cm and multicentricity in 11 patients (32.3%) with a median size of 1.45 cm. On MRI, multifocality was seen in 21 patients (61.7%) with a median size of 1.4 cm and multicentricity in 13 patients with a median size of 2 cm. CEM offers good correlation with MRI for determining the disease extent and estimating multifocal/multicentric disease. It is a new emerging modality, and preliminary results showed comparable performance of CEM with MRI. Both modalities have their inherent advantages and limitations, and longitudinal studies with large sample sizes are needed for further validation and clinical application.
Artificial light at night (ALAN) is a growing environmental pressure linked to socioeconomic development. This study examines ALAN trends from 2012 to 2024 across 165 countries using harmonized VIIRS satellite data. Globally, ALAN increased at an annual rate of 3.2%. Highest radiance levels occur in developed regions (Europe, North America, and East Asia), while remote areas remain dark. A few countries, including France and Venezuela, show declines. The study extends the VIIRS time series and applies a calibrated Kaya-identity framework integrating the human development index (HDI) and Gini coefficient. It introduces radiance density (RD) as a normalized metric linked to per-capita development for cross-national comparison. ALAN shows associations with macroeconomic indicators. Absolute ALAN correlates with total GDP (log), energy consumption, and CO2 emissions ($R^2$ = 0.73-0.81). RD and light pollution density better reflect development quality, correlating with HDI ($R^2$ = 0.65), GDP per capita (log) ($R^2$ = 0.63), and life expectancy ($R^2$ = 0.62). Predictive models achieve high explanatory power ($R^2$ = 0.87-0.89), with population and GDP per capita as the strongest determinants.
Medicinal plants remain a vital source of new therapeutic agents, yet many species remain underexplored. Echinops niveus Wall. ex Royle, a member of the Asteraceae family, has not been comprehensively evaluated for multifunctional bioactivity. This study aimed to assess the phytochemical composition, antioxidant, antimicrobial, anti-leishmanial, and cytotoxic properties of E. niveus extracts. The aerial parts of E. niveus were extracted with six different solvents: aqueous, methanolic, ethanolic, chloroform, ethyl acetate and n-hexane. They were characterized by phytochemical profiling and FT-IR spectroscopy. The antioxidant activity was measured by DPPH free radical scavenging and reducing power assays. Antimicrobial activity was assessed against Gram-positive bacteria (Bacillus subtilis, Staphylococcus aureus), Gram-negative bacteria (Escherichia coli, Pseudomonas aeruginosa, Klebsiella pneumoniae) and some fungal strains. The anti-leishmanial activity was assessed by the MTT assay and the cytotoxic activity was explored in the brine shrimp lethality assay and prostate cancer cell lines (PC3). The n-hexane extract showed the strongest DPPH scavenging activity (IC₅₀ = 104.76 ± 1.2 µg/mL). The chloroform extract demonstrated the highest reducing power (73.14 ± 1.47 mg AAE/g) and total antioxidant capacity (63.49 ± 1.46 mg AAE/g). The aqueous extract exhibited the best antibacterial potential against tested strains. Anti-leishmanial activity exceeded 50% inhibition for all extracts, with the aqueous extract showing 64% inhibition. The n-hexane extract was most cytotoxic against brine shrimp (LD₅₀ = 56.15 µg/mL) and PC3 cells. Echinops niveus possesses significant multifunctional pharmacological potential, supporting further investigation into its bioactive compounds and mechanisms as a candidate for drug discovery.
Satellite cells (SCs) are recognized as the resident stem cells of adult skeletal muscles, essential for post-natal skeletal muscle growth and regeneration following focal myotrauma. In healthy muscle, SCs are quiescent, but in response to damage or growth signals, they become activated to proliferate and differentiate to form new myofibers. A small population self-renews to replenish the basal pool for future demands. The ability of SCs to precisely balance quiescence, self-renewal, and myogenic commitment/differentiation is essential for ensuring long-term muscle homeostasis and tissue maintenance. Their state and functionality are strictly regulated by different intrinsic and extrinsic cues, the latter deriving from the microenvironment in which SCs reside, known as the niche. The niche is a dynamic compartment where extracellular matrix components, soluble factors, mechanical stimuli, and multiple interacting cell populations modulate the morphological, molecular, and electrophysiological properties of SCs. This review provides an updated overview of the morpho-functional features of SCs and of non-myogenic stromal interstitial cells, highlighting their reciprocal crosstalk within the regenerative niche. Such stromal cells play a dual role, acting as "good" or "bad" cells: while functioning as nursing cells for SCs during muscle repair/regeneration via juxtacrine and paracrine interactions, their excessive accumulation and adoption of a fibrotic/fat phenotype may lead to aberrant tissue repair, compromising muscle function. A deeper understanding of SC biology and of collaborative spatiotemporal cell interactions in the healthy, damaged, and regenerative niche is essential to identify potential novel targets and to better address interventions for maintaining, restoring, or enhancing muscle regeneration capacity and mitigating the deleterious effects of extended, severe, or pathological muscle damage.
Duchenne muscular dystrophy (DMD) is a devastating disease manifested in skeletal muscle by repetitious myonecrosis and regeneration. Because the regenerative process is closely linked to the cumulative severity of muscle damage, which is variably distributed within and between muscle groups, accurately quantifying muscle regeneration has remained a significant challenge. Myofibers were delineated by immunostaining for laminin, and subsequent image analysis was employed to generate a masked outline precisely within each myofiber boundary. Morphometric parameters including minimal Feret's diameter, cross-sectional area, and circularity were measured for each myofiber. In addition, the number of Pax7-expressing satellite cells was quantified. To evaluate regenerative activity, newly formed myofibers were identified by immunostaining for expression of embryonic myosin heavy chain (eMHC). Necrotic myofibers were enumerated by immunofluorescent detection of immunoglobulin G (IgG) infiltration. The Regenerative Index (RI) was calculated as the number of regenerating (eMHC+) myofibers divided by the number of necrotic (IgG+) myofibers. Determination of RI was performed on muscle biopsies obtained from 10 boys with DMD and 3 age-matched non-DMD controls. A trend toward increasing minimal Feret's diameter, cross-sectional area and circularity was observed with increasing age in DMD boys, with circularity showing the strongest trend. Furthermore, compared to DMD boys 7 to 8 years old, the boys 9 to 11 years old had increased myofiber circularity. Pax7-expressing cells per myofiber were elevated in DMD boys compared to control boys of similar ages, without any observation of age-related changes. The Regenerative Index in DMD boys exhibited a decline between 7 and 11 years of age, with an inverse correlation between RI and age.
The use of eMHC and IgG immunostaining to calculate RI appears to provide a way to assess regeneration across biopsies that differ in histopathologic severity. Using this approach, RI showed a negative correlation with age in DMD boys aged 7 to 11 years, a finding that requires further investigation.
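The Regenerative Index defined above is a simple ratio of two myofiber counts. A minimal sketch follows; the function name and the handling of a biopsy with no necrotic fibers are assumptions, since the abstract does not say how that case was treated.

```python
def regenerative_index(n_emhc_positive: int, n_igg_positive: int) -> float:
    """RI = regenerating (eMHC+) myofibers / necrotic (IgG+) myofibers."""
    if n_emhc_positive < 0 or n_igg_positive < 0:
        raise ValueError("counts must be non-negative")
    if n_igg_positive == 0:
        # Assumption: RI is treated as undefined without necrotic fibers.
        raise ValueError("RI is undefined when no IgG+ myofibers are counted")
    return n_emhc_positive / n_igg_positive
```

Normalizing regeneration by necrosis in this way is what lets the index be compared across biopsies with different amounts of damage.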
5G wireless networks have paved the way for intelligent, reliable communication networks with extremely high data rates, ultra-low latency, and highly reliable connectivity. With 6G wireless networks expected to offer up to 1 Tbps data rates with near-zero latency, wireless communication networks are expected to get smarter than ever. As a result, ultra-fast intelligent wireless networks are expected to power the hyper-connected intelligent world of tomorrow. Towards this direction, this paper provides a comprehensive overview of how machine learning (ML) techniques can be employed in Wireless Sensor Networks (WSNs) and Internet of Things (IoT) networks over futuristic 6G wireless networks. Machine learning would enable 6G-enabled IoT/WSNs to operate autonomously, detect anomalies, optimize energy use, and respond to real-time data they sense and collect. Enabling technologies such as edge Artificial Intelligence (AI), satellite-assisted 6G, Intelligent Reflecting Surfaces (IRS), and terahertz communications are discussed. Furthermore, a novel architecture employing federated and distributed learning for IoT communication is presented to demonstrate low-latency, energy-efficient, and secure communication for distributed ML tasks. Results indicate the superiority of the proposed architecture over contemporary 5G-based architectures in terms of network intelligence, latency, and reliability. Finally, the paper discusses major challenges and future directions for realizing the promise of 6G for ML-powered IoT and WSNs.
Monitoring the population dynamics of industrial methanotrophic bacterial consortia is critical for optimization of single-cell protein (SCP) production from natural gas. Traditional manual microscopy is labor-intensive, subjective, and limited in throughput. AI-based computer vision offers a promising alternative for automated, quantitative analysis of cell morphotypes in mixed cultures. A convolutional neural network (YOLO11x-seg) with a P2 activation layer for small-object detection and Focaler-MPDIoU loss function was trained on 250 phase-contrast micrographs of an industrial methanotrophic consortium based on Methylococcus capsulatus KN2, cultivated continuously at dilution rates of 0.15-0.25 h⁻¹. The dataset comprised nine cellular morphotypes across 50,410 annotated objects (training: 200 images; validation: 50 images), with synthetic data augmentation applied to reduce class imbalance in the tetracocci class. The trained model was then applied to a pure M. capsulatus KN2 culture during substrate-unlimited batch growth (μmax = 0.223 h⁻¹, td = 3.11 h) across three biological replicates and four time points (3, 5, 7, 9 h), analyzing 277 micrographs and 15,124 objects. On the validation set, the model achieved mAP@0.5:0.95 = 0.52, with class-weighted Precision = 0.87, Recall = 0.85, and F1 = 0.82. Per-morphotype F1 scores were: monococci 0.75, diplococci 0.89, tetracocci 0.65. In the industrial consortium, producer cells (M. capsulatus KN2) constituted 88.5% of the population (monococci 33.7%, diplococci 61.9%, tetracocci 4.4%), while satellite bacteria comprised 11.5%. During batch cultivation of the pure producer strain, the diplococci fraction increased from 61.0% to 68.6% over 7 h, negatively correlating with a decline in monococci from 35.4% to 28.8% (Pearson r = -0.996, p = 0.004). Tetracocci showed no statistically significant correlation with either morphotype and are considered a stochastic subpopulation of diplococci.
Cell cycle analysis revealed elongation of the M-phase from 71.6% to 78.4% of td. Monococci cell radius, intracellular volume, and periplasmic surface area all increased over 9 h, while the surface-to-volume ratio declined. The observed M-phase elongation is consistent with incipient substrate limitation (CH4 or O2) in gas-tight batch flasks, detectable through morphotype ratio shifts before standard process parameters register any change. The approach provides a proof of concept for real-time culture quality monitoring, early prediction of growth limitations, and optimization of SCP production in industrial bioreactors.
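The doubling time quoted for the batch culture follows from the maximum specific growth rate through the standard exponential-growth relation td = ln(2) / μmax. A minimal check of the reported pair of values, with a hypothetical function name:

```python
import math


def doubling_time(mu_per_hour: float) -> float:
    """Doubling time in hours for a specific growth rate in 1/h."""
    if mu_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_hour


# Growth rate reported in the abstract (0.223 per hour).
td_hours = doubling_time(0.223)
```

The relation assumes unrestricted exponential growth, which matches the substrate-unlimited batch phase described in the abstract.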
Stratospheric water vapour (SWV) is a key greenhouse gas that influences both global climate and stratospheric ozone chemistry [1-4]. Its abundance is strongly modulated by natural climate variability [1,5-8]. Volcanic eruptions have long been expected to humidify the stratosphere via tropopause warming [9,10], but observational confirmation has been lacking. Here we provide observational evidence that moderate volcanic eruptions and extreme wildfires since 2005 have systematically increased SWV. Both contribute through aerosol-induced tropopause warming; however, extreme wildfires reveal an additional self-lofting pathway that transports water vapour into the stratosphere. Complementary analysis of satellite observations and climate model simulations reveals an SWV enhancement of about 0.1 ppmv at 83 hPa, accumulating 76-203 million tons of water vapour during 2005-2021. This contribution explains 36 ± 7% of the observed SWV trend over this period, comparable to that from the global surface temperature increase. SWV changes induced by the surface temperature trend, moderate volcanic eruptions and extreme wildfire events have together effectively offset the sudden 10% SWV decrease observed around 2000. Episodic aerosol perturbations from moderate volcanic eruptions and extreme wildfires therefore emerge as a previously overlooked driver of SWV variability. Future projections of stratospheric composition, radiative forcing and ozone recovery should account for these aerosol-mediated processes, especially as extreme fires intensify in a warming world.
Energetic electrons in Earth's inner radiation belt pose significant hazards to spacecraft systems, with the strongest radiation in low-Earth orbit (LEO) mostly confined to the South Atlantic Anomaly (SAA) region. Once considered stable, the inner belt is now understood to exhibit significant variability. Using data from the low-Earth-orbit Macau Science Satellite-1 mission, we report transient distortions of the SAA radiation environments, observationally characterized by enhanced fluxes of energetic electrons outside the traditional SAA radiation region, appearing either attached to or detached from its boundary. We show that these distortions can be explained by large-scale electric-field perturbations that adiabatically alter the electron mirror heights, which can be further modulated by ultra-low-frequency waves. Test-particle simulations successfully reproduce the observational features and provide crucial constraints on properties of the associated electric fields. These findings reveal a distinct manifestation of inner-belt variability, extending the electron radiation risks beyond the expected boundaries of the SAA radiation environments.
Melt ponding on Arctic sea ice is a key indicator of the transition from a predominantly perennial to a seasonal sea-ice cover, yet quantitative data on pond depth remain limited. Here, we present the first analysis of melt-pond depth using Ice, Cloud, and land Elevation Satellite-2 (ICESat-2)'s Advanced Topographic Lidar Altimeter System (ATLAS). The Density-Dimension Algorithm for bifurcating sea-ice reflectors (DDA-bifurcate-seaice) automatically detects multiple surface returns in ICESat-2 photon data and estimates corresponding surface heights, enabling melt-pond-depth retrievals under varied noise conditions. Airborne lidar and imagery collected during the NASA ICESat-2 Project Arctic Summer Sea Ice Campaign (July 2022) provide near-coincident observations used to evaluate and optimize the algorithm's melt-pond detection. Evaluation of the melt-pond-depth quantile using Chiroptera data shows that the uniform value used in the ATL07 release 7 data product is near-optimal. We demonstrate DDA-bifurcate-seaice's capability to detect a wide range of melt feature morphologies, including smooth or rough bottoms, ridge-adjacent ponds, partial drainage and seawater intrusion. To further improve depth determination, we propose a depth-quantile function that reduces bias and mean-squared error by factors of 2.75 and 2.2, respectively. This work improves melt-pond-depth estimation using DDA-bifurcate-seaice, supporting Arctic- and Antarctic-wide mapping in the ICESat-2/ATLAS experimental sea-ice melt-pond data product on ATL07 (release 7).
Product demand and climate variability are progressively increasing the need for real-time, scalable crop monitoring to support varietal selection and in-season input optimisation. However, producers still have limited information on the temporal and spatial variability of cotton health and performance beyond point-scale field surveying. In addition, given cotton's high phenotypic plasticity, metrics derived in near real time are essential to improve input efficiency and strengthen the long-term sustainability of the cotton industry in Australia. Therefore, we proposed a functional integrated predictive sensing framework to estimate and predict cotton canopy morphological (i.e., height) and productivity traits (i.e., dry matter and lint yield) across large plots (12 m × 6 m). Scalability was validated by applying the proposed framework to estimate and map cotton yield across commercial fields. To do this, we explored the accuracy of high-resolution multispectral imagery from two platforms (unmanned aerial vehicle (UAV) and PlanetScope (PS)) collected across two 144-plot trials for two cotton seasons. These were designed with a large range of nitrogen (N) rates, shading, and two growth-regulator doses, thus creating variable environments. Sensing metrics were obtained from UAV imagery (1.3-1.6 cm pixel size) acquired once in 2022/23 and eight times in 2023/24, while PS composites (3 m pixel size) provided near-daily coverage in both seasons. Time-series gaps were imputed using Savitzky-Golay smoothing in thermal time (GDD), enabling extraction of growth dynamic metrics (GDMs) as single-date (SD; e.g., peak canopy) and multi-date (MD; e.g., daily average growth rate) metrics. After reducing collinearity and dimensionality, random forest (RF), support vector regression (SVR), and Gaussian process regression (GPR) models were trained and interpreted with SHAP for feature contribution.
UAV single-date models (SD_UAV) achieved strong accuracy for height (R² = 0.77), biomass (R² = 0.73), and yield (R² = 0.81). Incorporating UAV time-series metrics (MD_UAV) improved performance to R² = 0.87, 0.86, and 0.85 for height, biomass and yield, respectively. Application of the derived models to high-resolution satellite data (MD_PS) for different farming systems showed highly significant accuracy (R² = 0.67) in predicting cotton yield at the aggregated field scale, thereby enabling detailed spatial prediction of cotton yield within a field. It is anticipated that the proposed functional sensing framework will improve the estimation of key cotton production traits, supporting field- and within-field decision-making and ultimately contributing to more resilient and sustainable cotton production in Australia.
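The thermal-time axis mentioned above (growing degree days, GDD) is usually built by accumulating daily mean temperature above a crop-specific base. The sketch below uses the common averaging formula; the base temperature, the absence of an upper cutoff, and the function names are all assumptions, because the abstract does not state which GDD variant was used. The Savitzky-Golay smoothing step itself is not shown.

```python
def daily_gdd(t_max: float, t_min: float, t_base: float) -> float:
    """Growing degree days for one day, floored at zero."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def cumulative_gdd(
    daily_temps: list[tuple[float, float]], t_base: float
) -> list[float]:
    """Running GDD total for a sequence of (t_max, t_min) pairs in deg C."""
    total = 0.0
    running = []
    for t_max, t_min in daily_temps:
        total += daily_gdd(t_max, t_min, t_base)
        running.append(total)
    return running


# Hypothetical three-day record; t_base = 12.0 deg C is an illustrative value.
thermal_time = cumulative_gdd([(30.0, 20.0), (10.0, 8.0), (32.0, 22.0)], 12.0)
```

Indexing image dates by cumulative GDD instead of calendar days places observations from seasons with different temperatures on a comparable developmental scale before smoothing.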
Early recurrence within 24 months post-resection remains a primary driver of poor prognosis in hepatocellular carcinoma (HCC). In the absence of standardized adjuvant guidelines, robust postoperative risk stratification is critical. We evaluated explainable machine learning (ML) architectures to optimize risk modeling using readily accessible parameters. This retrospective, multicenter study analyzed 1,681 HCC patients undergoing curative-intent hepatectomy at Chang Gung institutions (2007-2020) as the training cohort. External validation was conducted using an independent cohort (n = 251) from Mackay Memorial Hospital. Four algorithms (random survival forest, Cox-nnet, LASSO, and extreme gradient boosting [XGBoost]) were trained using 5-fold cross-validation. Missing data were handled via k-nearest neighbors imputation. Discriminative capacity was assessed using the concordance index (C-index), and feature significance was decoded through SHAP values. The XGBoost framework yielded optimal discrimination, achieving a high training C-index of 0.98. During independent external validation, the C-index attenuated to a robust 0.72, reflecting expected adjustments for baseline institutional heterogeneities. Multivariable Cox and SHAP analyses consistently identified five pivotal predictors: sex, preoperative treatment, tumor size, satellite lesions, and vascular invasion. The derived nomogram enabled effective patient risk-tiering (p < 0.0001), although absolute recurrence probabilities were systematically overestimated in the external validation cohort. While the XGBoost model exhibits expected calibration shifts across disparate cohorts, it provides robust, cross-center discriminative generalizability for categorical risk stratification. Rather than serving as an absolute probability estimator, this explainable model functions as a reliable clinical tool to selectively identify high-risk candidates for intensive imaging surveillance.
Geographically and ethnically diverse prospective validation remains required prior to broader clinical deployment.
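The concordance index used above to compare models measures how often a higher predicted risk goes with an earlier observed event. The sketch below implements a basic form of Harrell's C-index on right-censored data; it is a plain illustration with a hypothetical function name, not the implementation used in the study, and it ignores pairs with tied event times.

```python
def concordance_index(
    times: list[float], events: list[bool], risks: list[float]
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's event or censoring time.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Because the index only scores ordering, a model can discriminate well while still misestimating absolute probabilities, which is the pattern the abstract reports for external validation.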