The partnership between the United States and Singapore is founded in no small part on the shared recognition of the value that technology has for national security. Over the last 55 years, Singapore has become an established purchaser of U.S. defense technology, but the past 20 years have also seen the U.S.-Singapore relationship mature into an increasingly collaborative one, tackling newer fields like cybersecurity and biosecurity. However, current geopolitical tensions present a challenge for Singapore, which strives to retain its strategic autonomy by maintaining positive relations with all parties. Paradoxically, the rise of non-traditional security threats may pave the way for greater bilateral cooperation by allowing Singapore to position itself as a hub for cooperation on regional security issues in Southeast Asia at large. In such spirit, this paper recommends that the United States and Singapore do the following: 1) in defense technology, co-develop niche capabilities in C4ISR and unmanned systems with peacetime applications; 2) in cybersecurity, improve their domestic resilience against sophisticated nation-state actors while also building regional capacity to counter cy
Multicultural Singapore hosts overlapping language publics (English, Chinese, and Malay) that discuss the same out-groups in parallel, a natural setting to ask whether online hate shares a structure across languages and whether what a community $\textit{produces}$ is what it $\textit{amplifies}$. From a Singapore-centric 2025 Facebook, Reddit, and YouTube corpus (31.0M items; 1.76M comments mentioning eleven identity groups), we benchmark eight open large language models as hate annotators against a human-adjudicated gold set, adopt the best (Phi-4: accuracy 0.95, Cohen's $κ$=0.91, recall 1.00 on an independent manual check), and replicate every finding under a second model. The results converge on one thesis, $\textit{layered cultural contingency}$: cross-lingual divergence falls monotonically as one moves from what a community hates to how and why it hates. Which out-groups are targeted is culturally specific (language $\times$ target $V$=0.25), but the threat frames and the binding moral grammar of hate (sanctity and loyalty, $55-75\%$, not fairness) are far more shared across languages, with divergence dropping to $V$=0.08 for moral foundations and 0.07 for emotion. Hate is con
Dengue, a mosquito-borne disease, continues to pose a persistent public health challenge in urban areas, particularly in tropical regions such as Singapore. Effective and affordable control requires anticipating where transmission risks are likely to emerge so that interventions can be deployed proactively rather than reactively. This study introduces a novel framework that uncovers and exploits latent transmission links between urban regions, mined directly from publicly available dengue case data. Instead of treating cases as isolated reports, we model how hotspot formation in one area is influenced by epidemic dynamics in neighboring regions. While mosquito movement is highly localized, long-distance transmission is often driven by human mobility, and in our case study, the learned network aligns closely with commuting flows, providing an interpretable explanation for citywide spread. These hidden links are optimized through gradient descent and used not only to forecast hotspot status but also to verify the consistency of spreading patterns, by examining the stability of the inferred network across consecutive weeks. Case studies on Singapore during 2013-2018 and 2020 show that
Legal citation in common-law systems depends not only on factual similarity, but also on the legal principle for which a precedent is invoked. However, existing benchmarks for legal citation retrieval use case facts, citation context, or full judgments as inputs, where the governing legal principle is often missing or only implicitly expressed and entangled with broader context. As a result, models may retrieve precedents that are factually similar yet doctrinally irrelevant. This limitation is particularly consequential in Singapore, where the legal system has evolved independently: only domestic precedents are binding, while foreign authorities serve merely as persuasive references. Thus, we propose a new retrieval paradigm that ranks cited cases based on queries integrating case facts and explicit legal principles, inspired by real-world legal reasoning workflows. To support this paradigm, we introduce SG-LegalCite, a dataset of 100,890 case-principle pairs extracted from 8,523 Singapore Supreme Court judgments spanning from 2000 to 2025. Experiments across 11 baselines demonstrate the effectiveness of our principle-augmented retrieval paradigm, showing that explicit legal princ
A resilient society is one capable of withstanding and thereafter recovering quickly from large shocks. Brought to the fore by the COVID-19 pandemic of 2020--2022, this social resilience is nevertheless difficult to quantify. In this paper, we measured how quickly the Singapore society recovered from the pandemic, by first modeling it as a dynamic social network governed by three processes: (1) random link addition between strangers; (2) social link addition between individuals with a friend in common; and (3) random link deletion . To calibrate this model, we carried out a survey of a representative sample of $N = 2,057$ residents and non-residents in Singapore between Jul and Sep 2022 to measure the numbers of random and social contacts gained over a fixed duration, as well as the number of contacts lost over the same duration, using phone contacts as proxy for social contacts. Lockdown simulations using the model that fits the survey results best suggest that Singapore would recover from such a disruption after 1--2 months.
While ex-ante screening and static price caps are global standards for mitigating price volatility, Singapore's electricity market employs a unique dual-defense mechanism integrating vesting contracts (VC) with a temporary price cap (TPC). Using high-frequency data from 2021 to 2024, this paper evaluates this mechanism and yields three primary findings. First, a structural trade-off exists within the VC framework: while VC quantity (VCQ) suppresses average prices, it paradoxically exacerbates instability via liquidity squeezes. Conversely, VC price (VCP) functions as a tail-risk anchor, dominating at extreme quantiles where VCQ efficacy wanes. Second, a structural break around the 2023 reform reveals a fundamental re-mapping of price dynamics; the previously positive pass-through from offer ratios to clearing prices was largely neutralized post-reform. Furthermore, diagnostics near the TPC threshold show no systematic evidence of strategic bid shading, confirming the TPC's operational integrity. Third, the dual-defense mechanism exhibits a critical synergy that resolves the volatility trade-off. The TPC reverses the volatility penalty of high VCQ, shifting the elasticity of conditi
We present Polyglot-Lion, a family of compact multilingual automatic speech recognition (ASR) models tailored for the linguistic landscape of Singapore, covering English, Mandarin, Tamil, and Malay. Our models are obtained by fine-tuning Qwen3-ASR-0.6B and Qwen3-ASR-1.7B exclusively on publicly available speech corpora, using a balanced sampling strategy that equalizes the number of training utterances per language and deliberately omits language-tag conditioning so that the model learns to identify languages implicitly from audio. On 12 benchmarks spanning the four target languages, Polyglot-Lion-1.7B achieves an average error rate of 14.85, competitive with MERaLiON-2-10B-ASR (14.32) - a model 6x larger - while incurring a training cost of \$81 on a single RTX PRO 6000 GPU compared to \$18,862 for the 128-GPU baseline. Inference throughput is approximately 20x faster than MERaLiON at 0.10 s/sample versus 2.02 s/sample. These results demonstrate that linguistically balanced fine-tuning of moderate-scale pretrained models can yield deployment-ready multilingual ASR at a fraction of the cost of larger specialist systems.
This technical report describes the MERaLiON-SpeechEncoder, a foundation model designed to support a wide range of downstream speech applications. Developed as part of Singapore's National Multimodal Large Language Model Programme, the MERaLiON-SpeechEncoder is tailored to address the speech processing needs in Singapore and the surrounding Southeast Asian region. The model currently supports mainly English, including the variety spoken in Singapore. We are actively expanding our datasets to gradually cover other languages in subsequent releases. The MERaLiON-SpeechEncoder was pre-trained from scratch on 200,000 hours of unlabelled speech data using a self-supervised learning approach based on masked language modelling. We describe our training procedure and hyperparameter tuning experiments in detail below. Our evaluation demonstrates improvements to spontaneous and Singapore speech benchmarks for speech recognition, while remaining competitive to other state-of-the-art speech encoders across ten other speech tasks. We commit to releasing our model, supporting broader research endeavours, both in Singapore and beyond.
The increasing number of vehicles highlights the need for efficient parking space management. Predicting real-time Parking Availability (PA) can help mitigate traffic congestion and the corresponding social problems, which is a pressing issue in densely populated cities like Singapore. In this study, we aim to collectively predict future PA across Singapore with complex factors from various domains. The contributions in this paper are listed as follows: (1) A New Dataset: We introduce the \texttt{SINPA} dataset, containing a year's worth of PA data from 1,687 parking lots in Singapore, enriched with various spatial and temporal factors. (2) A Data-Driven Approach: We present DeepPA, a novel deep-learning framework, to collectively and efficiently predict future PA across thousands of parking lots. (3) Extensive Experiments and Deployment: DeepPA demonstrates a 9.2% reduction in prediction error for up to 3-hour forecasts compared to existing advanced models. Furthermore, we implement DeepPA in a practical web-based platform to provide real-time PA predictions to aid drivers and inform urban planning for the governors in Singapore. We release the dataset and source code at https://g
Traditional online content moderation systems struggle to classify modern multimodal means of communication, such as memes, a highly nuanced and information-dense medium. This task is especially hard in a culturally diverse society like Singapore, where low-resource languages are used and extensive knowledge on local context is needed to interpret online content. We curate a large collection of 112K memes labeled by GPT-4V for fine-tuning a VLM to classify offensive memes in Singapore context. We show the effectiveness of fine-tuned VLMs on our dataset, and propose a pipeline containing OCR, translation and a 7-billion parameter-class VLM. Our solutions reach 80.62% accuracy and 0.8192 AUROC on a held-out test set, and can greatly aid human in moderating online contents. The dataset, code, and model weights have been open-sourced at https://github.com/aliencaocao/vlm-for-memes-aisg.
The proliferation of discussion about fatherhood in Singapore attests to its significance, indicating the need for an exploration of how fatherhood is framed, aiding policy-making around fatherhood in Singapore. Sound and holistic policy around fatherhood in Singapore may reduce stigma and apprehension around being a parent, critical to improving the nations flagging birth rate. We analyzed 15,705 articles and 56,221 posts to study how fatherhood is framed in Singapore across a range of online platforms (news outlets, parenting forums, Twitter). We used NLP techniques to understand these differences. While fatherhood was framed in a range of ways on the Singaporean online environment, it did not seem that fathers were framed as central to the Singaporean family unit. A strength of our work is how the different techniques we have applied validate each other.
Peer support plays a vital role in expanding access to mental health care by providing empathetic, community-based support outside formal clinical systems. As digital platforms increasingly mediate such support, the design and impact of these technologies remain under-examined, particularly in Asian contexts. This paper presents findings from an interview study with 20 peer supporters in Singapore, who operate across diverse online, offline, and hybrid environments. Through a thematic analysis, we unpack how participants start, conduct, and sustain peer support, highlighting their motivations, emotional labour, and the sociocultural dimensions shaping their practices. Building on this grounded understanding, we surface design directions for culturally responsive digital tools that scaffold rather than supplant relational care. Drawing insights from qualitative accounts, we offer a situated perspective on how AI might responsibly augment peer support. This research contributes to human-centred computing by articulating the lived realities of peer supporters and proposing design implications for trustworthy and context-sensitive AI in mental health.
Timely assessment of current conditions is essential especially for small, open economies such as Singapore, where external shocks transmit rapidly to domestic activity. We develop a real-time nowcasting framework for quarterly GDP growth using a high-dimensional panel of approximately 70 indicators, encompassing economic and financial indicators over 1990Q1-2023Q2. The analysis covers penalized regressions, dimensionality-reduction methods, ensemble learning algorithms, and neural architectures, benchmarked against a Random Walk, an AR(3), and a Dynamic Factor Model. The pipeline preserves temporal ordering through an expanding-window walk-forward design with Bayesian hyperparameter optimization, and uses moving block-bootstrap procedures both to construct prediction intervals and to obtain confidence bands for feature-importance measures. It adopts model-specific and XAI-based explainability tools. A Model Confidence Set procedure identifies statistically superior learners, which are then combined through simple, weighted, and exponentially weighted schemes; the resulting time-varying weights provide an interpretable representation of model contributions. Predictive ability is as
To address the limitations of current hate speech detection models, we introduce \textsf{SGHateCheck}, a novel framework designed for the linguistic and cultural context of Singapore and Southeast Asia. It extends the functional testing approach of HateCheck and MHC, employing large language models for translation and paraphrasing into Singapore's main languages, and refining these with native annotators. \textsf{SGHateCheck} reveals critical flaws in state-of-the-art models, highlighting their inadequacy in sensitive content moderation. This work aims to foster the development of more effective hate speech detection tools for diverse linguistic environments, particularly for Singapore and Southeast Asia contexts.
Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential -- it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. The "2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety" aimed to support research in this space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. This resulting report builds on the International AI Safety Report chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this report organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control).
The study assesses Hungary's National AI Strategy and its implementation through the analysis of strategic documents, publicly available financial records, and expert interviews with the Hungarian AI Coalition President and Chief Strategic Advisor to the Government Commissioner for AI. 22 goals from Hungary's strategy were evaluated through conceptual, governance, temporal, and financial dimensions before being benchmarked against Singapore's National AI Strategies (NAIS 1.0 and NAIS 2.0). Key findings include an estimated total of EUR 4.65 billion in AI-related public investment in Hungary. Openly available financial data was found for only half of the evaluated goals, and just three projects made up 98\% of all documented funding. The research also reveals Hungary's implementation challenges, including fragmented execution following ministerial reorganizations and the absence of designated biennial reviews since 2020. Furthermore, the paper provides targeted recommendations for Hungary's forthcoming AI strategy, drawing on Singapore's framework as a reference point. These include adapting to the era of large language models, restructuring the existing triple helix network to fost
Singapore has been striving to improve the provision of healthcare services to her people. In this course, the government has taken note of the deficiency in regulating and supervising people's nutrient intake, which is identified as a contributing factor to the development of chronic diseases. Consequently, this issue has garnered significant attention. In this paper, we share our experience in addressing this issue and attaining medical-grade nutrient intake information to benefit Singaporeans in different aspects. To this end, we develop the FoodSG platform to incubate diverse healthcare-oriented applications as a service in Singapore, taking into account their shared requirements. We further identify the profound meaning of localized food datasets and systematically clean and curate a localized Singaporean food dataset FoodSG-233. To overcome the hurdle in recognition performance brought by Singaporean multifarious food dishes, we propose to integrate supervised contrastive learning into our food recognition model FoodSG-SCL for the intrinsic capability to mine hard positive/negative samples and therefore boost the accuracy. Through a comprehensive evaluation, we present perfor
The advancement of Large Language Models (LLMs) has transformed natural language processing; however, their safety mechanisms remain under-explored in low-resource, multilingual settings. Here, we aim to bridge this gap. In particular, we introduce \textsf{SGToxicGuard}, a novel dataset and evaluation framework for benchmarking LLM safety in Singapore's diverse linguistic context, including Singlish, Chinese, Malay, and Tamil. SGToxicGuard adopts a red-teaming approach to systematically probe LLM vulnerabilities in three real-world scenarios: \textit{conversation}, \textit{question-answering}, and \textit{content composition}. We conduct extensive experiments with state-of-the-art multilingual LLMs, and the results uncover critical gaps in their safety guardrails. By offering actionable insights into cultural sensitivity and toxicity mitigation, we lay the foundation for safer and more inclusive AI systems in linguistically diverse environments.\footnote{Link to the dataset: https://github.com/Social-AI-Studio/SGToxicGuard.} \textcolor{red}{Disclaimer: This paper contains sensitive content that may be disturbing to some readers.}
We have integrated Easy JavaScript Simulation (EJSS) Data Analytics into the national Learning Management System for Singapore schools, known as the Singapore Student Learning Space (SLS). EJSS Data Analytics enhances the teaching and learning experience for educators and students by enabling educators to monitor and evaluate students interactions with interactive computer simulations. The data analytics and visualisation capabilities are delivered using the Moodle platform and version 1.3 of the specifications for Learning Tools Interoperability (LTI). In this paper, we showcase the potential for EJSS Data Analytics to identify students learning difficulties and misconceptions. Four examples of EJSS Data Analytics applications are provided to illustrate insights on aspects that include understanding a students sequential actions leading to specific task outcomes, the frequency of task attempts by each student, and the ratio of students achieving correct versus incorrect task completions. We identify five key considerations for designing the EJSS teacher dashboard. These considerations relate to Student Thought Process, Student Behaviour, Student Engagement, Student Choice, and Tea
As brain computer interfaces (BCIs) transition from experimental medical systems to consumer and military adjacent technologies, they introduce a novel security domain in which the human nervous system becomes a networked and contestable substrate. Existing frameworks for cybersecurity, biomedical safety, and data protection were not designed to address adversarial threats to neural signal integrity, creating a governance gap characterized by systemic misclassification. This paper argues that cognition is becoming strategic infrastructure and is situated between the market driven diffusion of neurotechnology in the United States and the state integrated fusion of AI and brain science in China. Using Singapore as a critical stress test and applying institutional classification analysis and regulatory mandate mapping, this paper identifies a structural paradox. A state with high regulatory capacity in both cyber and biomedical domains remains vulnerable at their intersection due to a failure to classify the human mind as infrastructure. This paper introduces the concept of cognitive sovereignty defined as the strategic capacity to protect neural processes from external modulation and