Hypernetworks, or hypernets for short, are neural networks that generate weights for another neural network, known as the target network. They have emerged as a powerful deep learning technique that allows for greater flexibility, adaptability, dynamism, faster training, information sharing, and model compression. Hypernets have shown promising results in a variety of deep learning problems, including continual learning, causal inference, transfer learning, weight pruning, uncertainty quantification, zero-shot learning, natural language processing, and reinforcement learning. Despite their success across different problem settings, there is currently no comprehensive review available to inform researchers about the latest developments and to assist in utilizing hypernets. To fill this gap, we review the progress in hypernets. We present an illustrative example of training deep neural networks using hypernets and propose categorizing hypernets based on five design criteria: inputs, outputs, variability of inputs and outputs, and the architecture of hypernets. We also review applications of hypernets across different deep learning problem settings, followed by a discussion of general
We propose an aspect-guided, multi-level perturbation framework to evaluate the robustness of Large Language Models (LLMs) in automated peer review. Our framework explores perturbations in three key components of the peer review process-papers, reviews, and rebuttals-across several quality aspects, including contribution, soundness, presentation, tone, and completeness. By applying targeted perturbations and examining their effects on both LLM-as-Reviewer and LLM-as-Meta-Reviewer, we investigate how aspect-based manipulations, such as omitting methodological details from papers or altering reviewer conclusions, can introduce significant biases in the review process. We identify several potential vulnerabilities: review conclusions that recommend a strong reject may significantly influence meta-reviews, negative or misleading reviews may be wrongly interpreted as thorough, and incomplete or hostile rebuttals can unexpectedly lead to higher acceptance rates. Statistical tests show that these biases persist under various Chain-of-Thought prompting strategies, highlighting the lack of robust critical evaluation in current LLMs. Our framework offers a practical methodology for diagnosin
Influencer marketing has become a crucial feature of digital marketing strategies. Despite its rapid growth and algorithmic relevance, the field of computational studies in influencer marketing remains fragmented, especially with limited systematic reviews covering the computational methodologies employed. This makes overarching scientific measurements in the influencer economy very scarce, to the detriment of interested stakeholders outside of platforms themselves, such as regulators, but also researchers from other fields. This paper aims to provide an overview of the state of the art of computational studies in influencer marketing by conducting a systematic literature review (SLR) based on the PRISMA model. The paper analyses 69 studies to identify key research themes, methodologies, and future directions in this research field. The review identifies four major research themes: Influencer identification and characterisation, Advertising strategies and engagement, Sponsored content analysis and discovery, and Fairness. Methodologically, the studies are categorised into machine learning-based techniques (e.g., classification, clustering) and non-machine-learning-based techniques
This paper reviews the current progress in applying machine learning (ML) tools to solve NP-hard combinatorial optimization problems, with a focus on routing problems such as the traveling salesman problem (TSP) and the vehicle routing problem (VRP). Due to the inherent complexity of these problems, exact algorithms often require excessive computational time to find optimal solutions, while heuristics can only provide approximate solutions without guaranteeing optimality. With the recent success of machine learning models, there is a growing trend in proposing and implementing diverse ML techniques to enhance the resolution of these challenging routing problems. We propose a taxonomy categorizing ML-based routing methods into construction-based and improvement-based approaches, highlighting their applicability to various problem characteristics. This review aims to integrate traditional OR methods with state-of-the-art ML techniques, providing a structured framework to guide future research and address emerging VRP variants.
Chronic pain is a widespread and debilitating condition that affects millions of individuals worldwide. Recent research has unveiled a connection between chronic pain and epigenetic processes, shedding light on the complex interplay between genetics, environment, and pain perception. This review synthesizes findings from several studies exploring the relationship between epigenetic modifications, pain intensity, disability, and various other factors in individuals with chronic pain. The studies encompass a range of chronic pain conditions, including chronic low back pain, knee osteoarthritis pain, and pain in the context of underlying health conditions. The review highlights key findings and implications from each study and discusses their collective contributions to our understanding of the epigenetic mechanisms underpinning chronic pain.
The Internet of Medical Things (IoMT) has transformed the healthcare industry by connecting medical devices in monitoring treatment outcomes of patients. This increased connectivity has resulted to significant security vulnerabilities in the case of malware and Distributed Denial of Service (DDoS) attacks. This literature review examines the vulnerabilities of IoMT devices, focusing on critical threats and exploring mitigation strategies. We conducted a comprehensive search across leading databases such as ACM Digital Library, IEEE Xplore, and Elsevier to analyze peer-reviewed studies published within the last five years (from 2019 to 2024). The review shows that inadequate encryption protocols, weak authentication methods, and irregular firmware updates are the main causes of risks associated with IoMT devices. We have identified emerging solutions like machine learning algorithms, blockchain technology, and edge computing as promising approaches to enhance IoMT security. This review emphasizes the pressing need to develop lightweight security measures and standardized protocols to protect patient data and ensure the integrity of healthcare services.
Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields and have shown significant potential in the academic peer-review process. However, existing applications are primarily limited to static review generation based on submitted papers, which fail to capture the dynamic and iterative nature of real-world peer reviews. In this paper, we reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers. We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources, including the top-tier conference and prestigious journal. This dataset is meticulously designed to facilitate the applications of LLMs for multi-turn dialogues, effectively simulating the complete peer-review process. Furthermore, we propose a series of metrics to evaluate the performance of LLMs for each role under this reformulated peer-review setting, ensuring fair and comprehensive evaluations. We believe this work provides a promising perspective on enhancing the LLM-driven peer-review process by incorporating dynamic, role-based interactio
An increasing number of scientific publications are created in open and transparent peer review models: a submission is published first, and then reviewers are invited, or a submission is reviewed in a closed environment but then these reviews are published with the final article, or combinations of these. Reasons for open peer review include giving better credit to reviewers and enabling readers to better appraise the quality of a publication. In most cases, the full, unstructured text of an open review is published next to the full, unstructured text of the article reviewed. This approach prevents human readers from getting a quick impression of the quality of parts of an article, and it does not easily support secondary exploitation, e.g., for scientometrics on reviews. While document formats have been proposed for publishing structured articles including reviews, integrated tool support for entire open peer review workflows resulting in such documents is still scarce. We present AR-Annotator, the Automatic Article and Review Annotator which employs a semantic information model of an article and its reviews, using semantic markup and unique identifiers for all entities of intere
Modern Code Review (MCR) is a standard practice in software engineering, yet it demands substantial time and resource investments. Recent research has increasingly explored automating core review tasks using machine learning (ML) and deep learning (DL). As a result, there is substantial variability in task definitions, datasets, and evaluation procedures. This study provides the first comprehensive analysis of MCR automation research, aiming to characterize the field's evolution, formalize learning tasks, highlight methodological challenges, and offer actionable recommendations to guide future research. Focusing on the primary code review tasks, we systematically surveyed 691 publications and identified 24 relevant studies published between May 2015 and April 2024. Each study was analyzed in terms of tasks, models, metrics, baselines, results, validity concerns, and artifact availability. In particular, our analysis reveals significant potential for standardization, including 48 task metric combinations, 22 of which were unique to their original paper, and limited dataset reuse. We highlight challenges and derive concrete recommendations for examples such as the temporal bias threa
Peer assessment has established itself as a critical pedagogical tool in academic settings, offering students timely, high-quality feedback to enhance learning outcomes. However, the efficacy of this approach depends on two factors: (1) the strategic allocation of reviewers and (2) the number of reviews per artifact. This paper presents a systematic literature review of 87 studies (2010--2024) to investigate how reviewer-assignment strategies and the number of reviews per submission impact the accuracy, fairness, and educational value of peer assessment. We identified four common reviewer-assignment strategies: random assignment, competency-based assignment, social-network-based assignment, and bidding. Drawing from both quantitative data and qualitative insights, we explored the trade-offs involved in each approach. Random assignment, while widely used, often results in inconsistent grading and fairness concerns. Competency-based strategies can address these issues. Meanwhile, social and bidding-based methods have the potential to improve fairness and timeliness -- existing empirical evidence is limited. In terms of review count, assigning three reviews per submission emerges as t
Context is an important factor in computer vision as it offers valuable information to clarify and analyze visual data. Utilizing the contextual information inherent in an image or a video can improve the precision and effectiveness of object detectors. For example, where recognizing an isolated object might be challenging, context information can improve comprehension of the scene. This study explores the impact of various context-based approaches to object detection. Initially, we investigate the role of context in object detection and survey it from several perspectives. We then review and discuss the most recent context-based object detection approaches and compare them. Finally, we conclude by addressing research questions and identifying gaps for further studies. More than 265 publications are included in this survey, covering different aspects of context in different categories of object detection, including general object detection, video object detection, small object detection, camouflaged object detection, zero-shot, one-shot, and few-shot object detection. This literature review presents a comprehensive overview of the latest advancements in context-based object detecti
This study examines whether there is any evidence of bias in two areas of common critique of open, non-anonymous peer review - and used in the post-publication, peer review system operated by the open-access scholarly publishing platform F1000Research. First, is there evidence of bias where a reviewer based in a specific country assesses the work of an author also based in the same country? Second, are reviewers influenced by being able to see the comments and know the origins of previous reviewer? Methods: Scrutinising the open peer review comments published on F1000Research, we assess the extent of two frequently cited potential influences on reviewers that may be the result of the transparency offered by a fully attributable, open peer review publishing model: the national affiliations of authors and reviewers, and the ability of reviewers to view previously-published reviewer reports before submitting their own. The effects of these potential influences were investigated for all first versions of articles published by 8 July 2019 to F1000Research. In 16 out of the 20 countries with the most articles, there was a tendency for reviewers based in the same country to give a more po
LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume that not only reviewers are using LLM-assistance, but also that authors use LLMs to revise their papers before submitting. In this work, we perform empirical experiments on papers from the 2025 ACL Rolling Review (ARR) to evaluate LLM reviews from both the author and the reviewer perspective. First, we identify a limited alignment of LLM reviews with human ones. In the best-case scenario, the alignment is reasonable. However, we also find that LLM-human alignment varies substantially across prompts and models. Finally, we investigate the scenario in which the author uses an iterative draft-revise workflow to improve the submission according to the LLM review. We find that this "gaming" of LLM reviews can be effective in specific scenarios, leading to a statistically significant increase of overall scores for up to 35\% of papers. We publish our code: https://github.com/uhh-hcds/reviewarcade.
Large language models (LLMs) and vision-augmented LLMs (VLMs) have significantly advanced medical informatics, diagnostics, and decision support. However, these models exhibit systematic biases, particularly age bias, compromising their reliability and equity. This is evident in their poorer performance on pediatric-focused text and visual question-answering tasks. This bias reflects a broader imbalance in medical research, where pediatric studies receive less funding and representation despite the significant disease burden in children. To address these issues, a new comprehensive multi-modal pediatric question-answering benchmark, PediatricsMQA, has been introduced. It consists of 3,417 text-based multiple-choice questions (MCQs) covering 131 pediatric topics across seven developmental stages (prenatal to adolescent) and 2,067 vision-based MCQs using 634 pediatric images from 67 imaging modalities and 256 anatomical regions. The dataset was developed using a hybrid manual-automatic pipeline, incorporating peer-reviewed pediatric literature, validated question banks, existing benchmarks, and existing QA resources. Evaluating state-of-the-art open models, we find dramatic performan
While there has been significant progress in object detection using conventional image processing and machine learning algorithms, exploring small and dim target detection in the IR domain is a relatively new area of study. The majority of small and dim target detection methods are derived from conventional object detection algorithms, albeit with some alterations. The task of detecting small and dim targets in IR imagery is complex. This is because these targets often need distinct features, the background is cluttered with unclear details, and the IR signatures of the scene can change over time due to fluctuations in thermodynamics. The primary objective of this review is to highlight the progress made in this field. This is the first review in the field of small and dim target detection in infrared imagery, encompassing various methodologies ranging from conventional image processing to cutting-edge deep learning-based approaches. The authors have also introduced a taxonomy of such approaches. There are two main types of approaches: methodologies using several frames for detection, and single-frame-based detection techniques. Single frame-based detection techniques encompass a d
Mobility-on-demand (MOD) services have the potential to significantly improve the adaptiveness and recovery of urban systems, in the wake of disruptive events. But there lacks a comprehensive review on using MOD services for such purposes in addition to serving regular travel demand. This paper presents a review that suggests a noticeable increase within recent years on this topic across four main areas: resilient MOD services, novel usage of MOD services for improving infrastructure and community resilience, empirical impact evaluation, and enabling and augmenting technologies. The review shows that MOD services have been utilized to support anomaly detection, essential supply delivery, evacuation and rescue, on-site medical care, power grid stabilization, transit service substitution during downtime, and infrastructure and equipment repair. Such a versatility suggests a comprehensive assessment framework and modeling methodologies for evaluating system design alternatives that simultaneously serve different purposes. The review also reveals that integrating suitable technologies, business models, and long-term planning efforts offers significant synergistic benefits.
Identifying key pathological features in brain MRIs is crucial for the long-term survival of glioma patients. However, manual segmentation is time-consuming, requiring expert intervention and is susceptible to human error. Therefore, significant research has been devoted to developing machine learning methods that can accurately segment tumors in 3D multimodal brain MRI scans. Despite their progress, state-of-the-art models are often limited by the data they are trained on, raising concerns about their reliability when applied to diverse populations that may introduce distribution shifts. Such shifts can stem from lower quality MRI technology (e.g., in sub-Saharan Africa) or variations in patient demographics (e.g., children). The BraTS-2024 challenge provides a platform to address these issues. This study presents our methodology for segmenting tumors in the BraTS-2024 SSA and Pediatric Tumors tasks using MedNeXt, comprehensive model ensembling, and thorough postprocessing. Our approach demonstrated strong performance on the unseen validation set, achieving an average Dice Similarity Coefficient (DSC) of 0.896 on the BraTS-2024 SSA dataset and an average DSC of 0.830 on the BraTS
The escalating volume of academic research, coupled with a shortage of qualified reviewers, necessitates innovative approaches to peer review. In this work, we propose: 1. ReviewEval, a comprehensive evaluation framework for AI-generated reviews that measures alignment with human assessments, verifies factual accuracy, assesses analytical depth, identifies degree of constructiveness and adherence to reviewer guidelines; and 2. ReviewAgent, an LLM-based review generation agent featuring a novel alignment mechanism to tailor feedback to target conferences and journals, along with a self-refinement loop that iteratively optimizes its intermediate outputs and an external improvement loop using ReviewEval to improve upon the final reviews. ReviewAgent improves actionable insights by 6.78% and 47.62% over existing AI baselines and expert reviews respectively. Further, it boosts analytical depth by 3.97% and 12.73%, enhances adherence to guidelines by 10.11% and 47.26% respectively. This paper establishes essential metrics for AIbased peer review and substantially enhances the reliability and impact of AI-generated reviews in academic research.
Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context constraints. Existing SAST-based checks are noisy and often miss immature, context-dependent vulnerabilities, while standalone Large Language Models (LLMs) are constrained by context windows and lack explicit tool use. Agentic AI, which combine LLMs with autonomous decision-making, tool invocation, and code navigation, offer a promising alternative, but their effectiveness for pre-commit secure code review is not yet well understood. In this work, we introduce AgenticSCR, an agentic AI for secure code review for detecting immature vulnerabilities during the pre-commit stage, augmented by security-focused semantic memories. Using our own curated benchmark of immature vulnerabilities, tailored to the pre-commit secure code review, we empirically evaluate how accurate is our AgenticSCR for localizing, detecting, and explaining immature vulnerabilities. Our results show that AgenticSCR achieves at least 153% relatively higher percentage of correct code review comments than the static LLM-based baseline, and also substantially surpasses SAST tools. M
Science is justly admired as a cumulative process ("standing on the shoulders of giants"), yet scientific knowledge is typically built on a patchwork of research contributions without much coordination. This lack of efficiency has specifically been addressed in clinical research by recommendations against avoidable research waste and for living systematic reviews and prospective meta-analysis. We propose to further those recommendations with ALL-IN meta-analysis: Anytime Live and Leading INterim meta-analysis. ALL-IN provides meta-analysis based on e-values and anytime-valid confidence intervals that can be updated at any time - reanalyzing after each new observation while retaining type-I error and coverage guarantees, live - no need to prespecify the looks, and leading - in the decisions on whether individual studies should be initiated, stopped or expanded, the meta-analysis can be the leading source of information without losing validity to accumulation bias. The analysis design requires no information about the trial sample sizes or the number of trials eventually included. So ALL-IN meta-analysis can be applied retrospectively as well as prospectively, to evaluate the evidenc