Context: Cybersecurity vendors often publish cyber threat intelligence (CTI) reports, referring to the written artifacts on technical and forensic analysis of the techniques used by the malware in APT attacks. Objective: The goal of this research is to inform cybersecurity practitioners about how adversaries form cyberattacks through an analysis of adversarial techniques documented in cyberthreat intelligence reports. Dataset: We use 594 adversarial techniques cataloged in MITRE ATT&CK. We systematically construct a set of 667 CTI reports that MITRE ATT&CK used as citations in the descriptions of the cataloged adversarial techniques. Methodology: We analyze the frequency and trend of adversarial techniques, followed by a qualitative analysis of the implementation of techniques. Next, we perform association rule mining to identify pairs of techniques recurring in APT attacks. We then perform qualitative analysis to identify the underlying relations among the techniques in the recurring pairs. Findings: The set of 667 CTI reports documents 10,370 techniques in total, and we identify 19 prevalent techniques accounting for 37.3% of documented techniques. We also identify 425
Peer review by experts is central to the evaluation of grant proposals, but little is known about how gender and disciplinary differences shape the content and tone of grant peer review reports. We analyzed 39,280 review reports submitted to the Swiss National Science Foundation between 2016 and 2023, covering 11,385 proposals for project funding across 21 disciplines from the Social Sciences and Humanities (SSH), Life Sciences (LS), and Mathematics, Informatics, Natural Sciences, and Technology (MINT). Using supervised machine learning, we classified over 1.3 million sentences by evaluation criteria and sentiment. Reviews in SSH were significantly longer and more critical, with less focus on the applicant's track record, while those in MINT were more concise and positive, with a higher focus on the track record, as compared to those in LS. Compared to male reviewers, female reviewers write longer reviews that more closely align with the evaluation criteria and express more positive sentiments. Female applicants tend to receive reviews with slightly more positive sentiment than male applicants. Gender and disciplinary culture influence how grant proposals are reviewed - shaping the
Supernovae (SNe) and kilonovae (KNe) are the most violent explosions in the cosmos, signalling the destruction of a massive star (core-collapse SN), a white dwarf (thermonuclear SN) and a neutron star (KN), respectively. The ejected debris in these explosions is believed to be the main cosmic source of most elements in the periodic table. However, decoding the spectra of these transients is a challenging task requiring sophisticated spectral synthesis modelling. Here, the techniques for such modelling are reviewed, with particular focus on the computational aspects. We build from a historical review of how methodologies evolved from modelling of stellar winds, to supernovae, to kilonovae, studying various approximations in use for the central physical processes. Similarities and differences in the numerical schemes employed by current codes are discussed, and the path towards improved models is laid out.
Large-scale cyberattacks, referred to as campaigns, are documented across multiple CTI reports from diverse sources, with some providing a high-level overview of attack techniques and others providing technical details. Extracting attack techniques from reports is essential for organizations to identify the controls required to protect against attacks. Manually extracting techniques at scale is impractical. Existing automated methods focus on single reports, leaving many attack techniques and their controls undetected, resulting in a fragmented view of campaign behavior. The goal of this study is to aid security researchers in extracting attack techniques and controls from a campaign by replicating and comparing the performance of the state-of-the-art ATT&CK technique extraction methods in a multi-report campaign setting compared to prior single-report evaluations. We conduct an empirical study of 29 methods to extract attack techniques, spanning named entity recognition (NER), encoder-based classification, and decoder-based LLM approaches. Our study analyzes 90 CTI reports across three major attack campaigns: SolarWinds, XZ Utils, and Log4j, using both quantitative performance
This document reviews the different types of clustering algorithms and substructure detection techniques used to study the spatial and kinematic clustering of stars and to detect the gas components in molecular clouds. It is the deliverable: Report on Optimal Substructure Techniques for Stellar, Gas and Combined Samples, for the EU H2020 (COMPET-5-2015 - Space) project (A Gaia and Herschel Study of the Density Distribution and Evolution of Young Massive Star Clusters), Grant Agreement Number: 687528, with abbreviated code name StarFormMapper (SFM) project. The document is organized in the following sections: 1. General Introduction 2. Clustering of Discrete Distributions 3. Clustering of Continuous Distributions 4. Clustering in Astrophysics 5. StarFormMapper 6. Summary and Conclusions
Multiferroics are a class of advanced materials with promising applications and stand at the forefront of modern science because they possess both charge-polar and magnetic order. Previous studies indicate that the family of RECrO3 (RE = rare earth) compounds is likely another rare candidate system holding both ferroelectricity and magnetism. However, many issues remain unsolved, fueling heated debate about whether RECrO3 is multiferroic or not. For example, an incompatibility exists between reported structural models and observed ferroelectric behaviors, and it is not easy to determine the spin canting degree. To address these questions, one key step is to grow single crystals because they can provide more reliable information than other forms of matter do. In this review, the parent and doped ferroelectric YCrO3 compounds are comprehensively reviewed based on scientific and patent literature from 1954 to 2022. The materials syntheses with different methods, including poly-, nano-, and single-crystalline samples and thin films, are summarized. The structural, magnetic, ferroelectric and dielectric, optical, and chemical-pressure (on Y and Cr sites by doping) dep
Optimizing the quality of software is a function of the degree of reviews made during the early life of a software development process. Reviews detect errors and potential errors early in the software development process. The errors detected during the early life cycle of software are the least expensive to correct. Efficient involvement in software inspections and technical reviews helps developers improve their own skills, thereby mitigating the occurrence of errors in the later stages of the software development process. The ideas gathered in this paper indicate that a properly implemented program of technical and management reviews drastically reduces the time as well as the cost required for testing, debugging, and reworking, and dramatically improves the quality of the resulting product. This paper, Optimization of Software Quality using management and technical Review Techniques, provides its readers with the opportunity to learn about and experience using these indispensable software quality tools.
This paper reviews the work done on black hole interior volume, entropy, and evaporation. An insight into the basics for understanding the interior volume is presented. A general analogy to investigate the interior volume of a black hole (BH), the associated quantum mode's entropy, and the evolution relation between the interior and exterior entropy is explained. Using this analogy, we predicted the future of information stored in a BH, its radiation, and evaporation. The results are noted in tables (\ref{tab:1}) and (\ref{tab:2}). To apply this analogy in BH space-time, we investigated the interior volume, entropy, and evolution relation for different types of BHs. Finally, we also investigated the nature of BH radiation and the probability of particle emission during the evaporation process.
(This paper was submitted as an invited paper to IEEE Reviews in Biomedical Engineering on April 6, 2020.) The pandemic of coronavirus disease 2019 (COVID-19) is spreading all over the world. Medical imaging such as X-ray and computed tomography (CT) plays an essential role in the global fight against COVID-19, whereas the recently emerging artificial intelligence (AI) technologies further strengthen the power of the imaging tools and help medical specialists. We hereby review the rapid responses in the community of medical imaging (empowered by AI) toward COVID-19. For example, AI-empowered image acquisition can significantly help automate the scanning procedure and also reshape the workflow with minimal contact to patients, providing the best protection to the imaging technicians. Also, AI can improve work efficiency by accurate delineation of infections in X-ray and CT images, facilitating subsequent quantification. Moreover, the computer-aided platforms help radiologists make clinical decisions, i.e., for disease diagnosis, tracking, and prognosis. In this review paper, we thus cover the entire pipeline of medical imaging and analysis techniques involved with COVID-19, including
Wind energy has emerged as a highly promising source of renewable energy in recent times. However, wind turbines regularly suffer from operational inconsistencies, leading to significant costs and challenges in operations and maintenance (O&M). Condition-based monitoring (CBM) and performance assessment/analysis of turbines are vital aspects for ensuring efficient O&M planning and cost minimisation. Data-driven decision making techniques have witnessed rapid evolution in the wind industry for such O&M tasks during the last decade, from applying signal processing methods in early 2010 to artificial intelligence (AI) techniques, especially deep learning in 2020. In this article, we utilise statistical computing to present a scientometric review of the conceptual and thematic evolution of AI in the wind energy sector, providing evidence-based insights into present strengths and limitations of data-driven decision making in the wind industry. We provide a perspective into the future and on current key challenges in data availability and quality, lack of transparency in black box-natured AI models, and prevailing issues in deploying models for real-time decision support, alo
Governments' net zero emission targets aim at increasing the share of renewable energy sources as well as influencing the behaviours of consumers to support the cost-effective balancing of energy supply and demand. These will be achieved by the advanced information and control infrastructures of smart grids, which allow interoperability among various stakeholders. Under these circumstances, an increasing number of consumers produce, store, and consume energy, giving them a new role as prosumers. The integration of prosumers and accommodation of incurred bidirectional flows of energy and information rely on two key factors: flexible structures of energy markets and intelligent operations of power systems. Blockchain and artificial intelligence (AI) are innovative technologies to fulfil these two factors: blockchain provides decentralised trading platforms for energy markets, and AI supports the optimal operational control of power systems. This paper attempts to address how to incorporate blockchain and AI in smart grids to facilitate the participation of prosumers in energy markets. To achieve this objective, first, this paper reviews how policy designs price
The portrayal of crowd accidents by the media can influence public understanding and emotional response, shaping societal perceptions and potentially impacting safety measures and preparedness strategies. This paper critically examines the portrayal of crowd accidents in news coverage by analyzing the texts of 372 media reports of crowd accidents spanning 26 diverse news sources from 1900 to 2019. We investigate how media representations of crowd accidents vary across time and geographical origins. Our methodology combines lexical analysis to unveil prevailing terminologies and sentiment analysis to discern the emotional tenor of the reports. The findings reveal the prevalence of the term "stampede" over "panic" in media descriptions of crowd accidents. Notably, divergent patterns are observable when comparing Western versus South Asian media (notably India and Pakistan), unveiling a cross-cultural dimension. Moreover, the analysis detects a gradual transition from "crowd stampede" to "crowd crush" in media and Wikipedia narratives in recent years, suggesting evolving lexical sensitivities. Sentiment analysis uncovers a consistent association with fear-related language, indicative
In the contemporary digital landscape, online reviews have become an indispensable tool for promoting products and services across various businesses. Marketers, advertisers, and online businesses have found incentives to create deceptive positive reviews for their products and negative reviews for their competitors' offerings. As a result, the writing of deceptive reviews has become an unavoidable practice for businesses seeking to promote themselves or undermine their rivals. Detecting such deceptive reviews has become an intense and ongoing area of research. This research paper proposes a machine learning model to identify deceptive reviews, with a particular focus on restaurants. This study delves into the performance of numerous experiments conducted on a dataset of restaurant reviews known as the Deceptive Opinion Spam Corpus. To accomplish this, an n-gram model and max features are developed to effectively identify deceptive content, particularly focusing on fake reviews. A benchmark study is undertaken to explore the performance of two different feature extraction techniques, which are then coupled with five distinct machine learning classification algorithms. The experimen
In this study, we investigate how supporting serendipitous discovery and analysis of online product reviews can encourage readers to explore reviews more comprehensively prior to making purchase decisions. We propose two interventions -- Exploration Metrics that can help readers understand and track their exploration patterns through visual indicators and a Bias Mitigation Model that intends to maximize knowledge discovery by suggesting sentiment and semantically diverse reviews. We designed, developed, and evaluated a text analytics system called Serendyze, where we integrated these interventions. We asked 100 crowd workers to use Serendyze to make purchase decisions based on product reviews. Our evaluation suggests that exploration metrics enabled readers to efficiently cover more reviews in a balanced way, and suggestions from the bias mitigation model influenced readers to make confident data-driven decisions. We discuss the role of user agency and trust in text-level analysis systems and their applicability in domains beyond review exploration.
Increasing demands on medical imaging departments are taking a toll on the radiologist's ability to deliver timely and accurate reports. Recent technological advances in artificial intelligence have demonstrated great potential for automatic radiology report generation (ARRG), sparking an explosion of research. This survey paper conducts a methodological review of contemporary ARRG approaches by way of (i) assessing datasets based on characteristics, such as availability, size, and adoption rate, (ii) examining deep learning training methods, such as contrastive learning and reinforcement learning, (iii) exploring state-of-the-art model architectures, including variations of CNN and transformer models, (iv) outlining techniques integrating clinical knowledge through multimodal inputs and knowledge graphs, and (v) scrutinising current model evaluation techniques, including commonly applied NLP metrics and qualitative clinical reviews. Furthermore, the quantitative results of the reviewed models are analysed, where the top performing models are examined to seek further insights. Finally, potential new directions are highlighted, with the adoption of additional datasets from other rad
Ultralight dark matter refers to the lightest potential dark matter candidates. We will focus on the mass range that has been studied using astrophysical and cosmological observations, corresponding to a mass $10^{-24} \, \mathrm{eV} \lesssim m \lesssim 10^{-18} \, \mathrm{eV}$. We will discuss the motivations for this mass range. The most studied model in this range corresponds to a minimally coupled, single, classical, spin-0 field comprising all dark matter. However, the work exploring extensions of this model (for example, higher spin, self-coupled, multiple field, and mixed models) will be one of the focuses of this review. The phenomenology associated with ultralight dark matter is rich and includes linear effects on the primordial power spectrum, core structures forming at the center of halos, nonlinear effects resulting in heating of stellar distributions, and non-relativistic effects relating to pulsar signals and black hole superradiance, to name a few. This set of effects has been studied using an equally extensive set of numerical tools. We will summarize the most common ones and discuss their applications and limitations. Ultralight dark matter also has a wide variety
This paper investigates sentiment classification of Steam game reviews using an attention-based Bidirectional Long Short-Term Memory (BiLSTM) model. Using a dataset of 50,000 reviews sampled from a larger Steam review corpus, the authors compare a traditional machine learning baseline based on TF-IDF and PyCaret AutoML with a deep learning approach implemented in PyTorch. The proposed BiLSTM+Attention model is trained with class-weighted cross-entropy to address class imbalance and achieves 83% accuracy and 85% weighted F1-score on the test set, with 90% recall for negative reviews. The paper also presents attention visualizations to show interpretability by highlighting sentiment-bearing words. The study concludes that the BiLSTM+Attention model is effective for analyzing user sentiment in Steam reviews and useful for helping developers understand player feedback.
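The class-weighted cross-entropy mentioned in the abstract above can be illustrated with a minimal sketch in plain Python. The inverse-frequency weighting scheme, the function names, and the toy batch are assumptions made for illustration only; the abstract does not state how the class weights were chosen, and this is not the authors' PyTorch implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the total weight."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy batch (hypothetical): class 0 = negative (rare), class 1 = positive (common).
labels = [1, 1, 1, 0]
probs = [
    {0: 0.1, 1: 0.9},
    {0: 0.2, 1: 0.8},
    {0: 0.3, 1: 0.7},
    {0: 0.6, 1: 0.4},
]

weights = inverse_frequency_weights(labels)
loss = weighted_cross_entropy(probs, labels, weights)
unweighted = sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
```

In this toy batch the rare negative class receives three times the weight of the common positive class, so the loss depends more strongly on how well the single negative example is predicted than the unweighted mean does.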
Purpose: We investigated the utilization of privacy-preserving, locally-deployed, open-source Large Language Models (LLMs) to extract diagnostic information from free-text cardiovascular magnetic resonance (CMR) reports. Materials and Methods: We evaluated nine open-source LLMs on their ability to identify diagnoses and classify patients into various cardiac diagnostic categories based on descriptive findings in 109 clinical CMR reports. Performance was quantified using standard classification metrics including accuracy, precision, recall, and F1 score. We also employed confusion matrices to examine patterns of misclassification across models. Results: Most open-source LLMs demonstrated exceptional performance in classifying reports into different diagnostic categories. Google's Gemma2 model achieved the highest average F1 score of 0.98, followed by Qwen2.5:32B and DeepseekR1-32B with F1 scores of 0.96 and 0.95, respectively. All other evaluated models attained average scores above 0.93, with Mistral and DeepseekR1-7B being the only exceptions. The top four LLMs outperformed our board-certified cardiologist (F1 score of 0.94) across all evaluation metrics in analyzing CMR reports.
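The classification metrics named in the abstract above (accuracy, precision, recall, F1 score, and a confusion matrix) can be sketched in plain Python as follows. The category labels `a`, `b`, `c`, the toy predictions, and the use of macro-averaging for the "average F1 score" are assumptions for illustration; the abstract does not specify the averaging method or the diagnostic categories.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Sparse confusion matrix: (true label, predicted label) -> count."""
    return Counter(zip(y_true, y_pred))


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for each class, one-vs-rest."""
    cm = confusion_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    metrics = {}
    for c in classes:
        tp = cm[(c, c)]
        fp = sum(cm[(t, c)] for t in classes if t != c)
        fn = sum(cm[(c, p)] for p in classes if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted category matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(metrics):
    """Unweighted mean of per-class F1 (assumed averaging method)."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Hypothetical reference labels and model outputs for six reports.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]

metrics = per_class_metrics(y_true, y_pred)
acc = accuracy(y_true, y_pred)
avg_f1 = macro_f1(metrics)
```

The off-diagonal entries of `confusion_counts` show which categories are confused with which, which is the kind of misclassification pattern the confusion matrices in the study are used to examine.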
This paper describes a rapid feasibility study of using GPT-4, a large language model (LLM), to (semi)automate data extraction in systematic reviews. Despite the recent surge of interest in LLMs there is still a lack of understanding of how to design LLM-based automation tools and how to robustly evaluate their performance. During the 2023 Evidence Synthesis Hackathon we conducted two feasibility studies. The first aimed to automatically extract study characteristics from human clinical, animal, and social science domain studies. We used two studies from each category for prompt development and ten for evaluation. In the second, we used the LLM to predict Participants, Interventions, Controls and Outcomes (PICOs) labelled within 100 abstracts in the EBM-NLP dataset. Overall, results indicated an accuracy of around 80%, with some variability between domains (82% for human clinical, 80% for animal, and 72% for studies of human social sciences). Causal inference methods and study design were the data extraction items with the most errors. In the PICO study, participants and intervention/control showed high accuracy (>80%), whereas outcomes were more challenging. Evaluation was done manually; scoring
When people experience harassment online, from individual threats or invective to coordinated campaigns of harassment, they have the option to report the harassers and content to the platform where the harassment has occurred. Platforms then evaluate harassment reports against terms of use and other policies to decide whether to remove content or take action against the alleged harasser--or not. On Twitter, harassing accounts can be deleted entirely, suspended (with content made unavailable pending appeal or specific changes), or sent a warning. Some platforms, including Twitter and YouTube, grant authorized reporters or trusted flaggers special privileges to identify and report inappropriate content on behalf of others. In November 2014, Twitter granted Women, Action, and the Media (WAM!) this authorized reporter status. In three weeks, WAM! reviewers assessed 811 incoming reports of harassment and escalated 161 reports to Twitter, ultimately seeing Twitter carry out 70 account suspensions, 18 warnings, and one deleted account. This document presents findings from this three-week project; it draws on both quantitative and qualitative methods. Findings focus on the people reporting