In this paper, we introduce SaudiBERT, a monodialect Arabic language model pretrained exclusively on Saudi dialectal text. To demonstrate the model's effectiveness, we compared SaudiBERT with six different multidialect Arabic language models across 11 evaluation datasets, which are divided into two groups: sentiment analysis and text classification. SaudiBERT achieved average F1-scores of 86.15\% and 87.86\% in these groups respectively, significantly outperforming all other models in the comparison. Additionally, we present two novel Saudi dialectal corpora: the Saudi Tweets Mega Corpus (STMC), which contains over 141 million tweets in Saudi dialect, and the Saudi Forums Corpus (SFC), which includes 15.2 GB of text collected from five Saudi online forums. Both corpora are used in pretraining the proposed model, and they are the largest Saudi dialectal corpora ever reported in the literature. The results confirm the effectiveness of SaudiBERT in understanding and analyzing Arabic text expressed in Saudi dialect, achieving state-of-the-art results in most tasks and surpassing other language models included in the study. The SaudiBERT model is publicly available at \url{https://huggingface.co/
Large language models (LLMs) for Arabic are still dominated by Modern Standard Arabic (MSA), with limited support for Saudi dialects such as Najdi and Hijazi. This underrepresentation hinders their ability to capture authentic dialectal variation. Using a privately curated Saudi Dialect Instruction dataset (Hijazi and Najdi; 5,466 synthetic instruction-response pairs; 50/50 split), we LoRA-tune ALLaM-7B-Instruct-preview, the first foundation model developed in Saudi Arabia, for Saudi dialect generation. We investigate two variants: (i) Dialect-Token training, which prepends an explicit dialect tag to the instruction, and (ii) No-Token training, which omits the tag at formatting time. Evaluation on a held-out test set combines an external dialect classifier with text fidelity metrics (chrF++ and BERTScore) and diversity measures. The Dialect-Token model achieves the best control, raising the Saudi rate from 47.97% to 84.21% and reducing MSA leakage from 32.63% to 6.21%; fidelity also improves (chrF++ +3.53, BERTScore +0.059). Both LoRA variants outperform strong generic instruction models (Falcon-7B-Instruct, Llama-3.1-8B-Instruct, Qwen-2.5-7B-Instruct, AceGPT-v2-8B-Chat, JAIS-13B-C
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing; however, they often struggle to accurately capture and reflect cultural nuances. This research addresses this challenge by focusing on Saudi Arabia, a country characterized by diverse dialects and rich cultural traditions. We introduce SaudiCulture, a novel benchmark designed to evaluate the cultural competence of LLMs within the distinct geographical and cultural contexts of Saudi Arabia. SaudiCulture is a comprehensive dataset of questions covering five major geographical regions (West, East, South, North, and Center), along with general questions applicable across all regions. The dataset encompasses a broad spectrum of cultural domains, including food, clothing, entertainment, celebrations, and crafts. To ensure a rigorous evaluation, SaudiCulture includes questions of varying complexity in open-ended, single-choice, and multiple-choice formats, with some requiring multiple correct answers. Additionally, the dataset distinguishes between common cultural knowledge and specialized regional aspects. We conduct extensive evaluations on five LLMs, including GPT-4, Llama
Large-scale restoration in drylands is widely promoted to address land degradation and biodiversity loss, yet many efforts rely on long-term irrigation, limiting sustainability in water-scarce regions. A key challenge is identifying locations where native vegetation can persist without intensive management while minimizing costly field campaigns. A scalable pre-screening framework is presented that integrates climate and remote sensing data to enable cost-efficient site selection in arid environments, using Saudi Arabia as a case study. A Climate Suitability Score (CSS), derived from machine learning models trained on expert-curated reference sites, captures the complex dependence of vegetation persistence on climate. Using multi-year ERA5-Land data for Saudi Arabia, national-scale prediction maps are generated and combined with vegetation indices to identify areas where climate is favorable but vegetation remains underdeveloped. Multi-criteria screening reduces candidates to thirteen priority locations. Climatically analogous intact ecosystems provide benchmarks for restoration targets and indicate that an average 2.5-fold increase in vegetation coverage is a realistic target for rest
Saudi Arabia's rapid economic growth and social evolution under Vision 2030 present a unique opportunity to track emerging trends in real time. Uncovering trends in real time can open up new avenues for business and investment. This paper explores how AI and social media analytics can uncover and monitor these trends across sectors such as sustainability, construction, food and beverage, tourism, technology, and entertainment. It focuses on the use of an AI-driven methodology to identify sustainability trends across Saudi Arabia. We processed millions of social media posts, news articles, and blogs to understand sustainability trends in the region. The paper presents an AI approach that can help economists, businesses, and governments understand sustainability trends and make better decisions around them. This approach offers both sector-specific and cross-sector insights, giving decision-makers a reliable, up-to-date snapshot of Saudi Arabia's market shifts. Beyond Saudi Arabia, this framework also shows potential for adaptation to other regions. Overall, our findings highlight how AI methodologies give decision-makers a reliable method to understand how initiative
Generative Artificial Intelligence (GenAI) is rapidly becoming embedded in Saudi Arabia's digital transformation under Vision 2030, yet public awareness, adoption, and concerns surrounding these tools remain underexplored. This study provides an early snapshot of GenAI engagement among Saudi nationals. Using a nationwide survey of 330 participants across regions, age groups, and employment sectors, we examine seven dimensions of GenAI use: awareness and understanding, adoption patterns, perceived impacts, training needs, risks and barriers, data-sharing behaviors, and future expectations. Findings show that 93% of respondents actively use GenAI primarily for text-based tasks, while more advanced uses such as programming or multimodal generation are less common. Despite the prevalence of use, overall awareness and conceptual understanding remain uneven, with many reporting limited technical knowledge. Participants recognize GenAI's benefits for productivity, work quality, and understanding complex information, yet caution that sustained reliance may undermine critical thinking and key professional skills. Trust in AI-generated outputs remains cautious, with widespread concerns about
Background and Context: Artificial intelligence (AI) tools have been reshaping computing and computer science education. Trust in AI is a determining factor in the adoption of these tools. Recent studies have shown different trust factors across gender and first-generation status among students. However, these studies have focused mainly on Western, Educated, Industrialized, Rich, and Democratic (WEIRD) populations, and their generalizability to other populations with different languages and cultures is unclear. Objective: This study aims to evaluate trust in AI among Middle Eastern computer science students and the factors that can impact it. Method: We replicate a recent study of trust in four universities in three Middle Eastern, Arabic-speaking countries: Saudi Arabia, Kuwait, and Jordan. We analyze trust among students across different factors such as gender and first-generation status. Findings: Our results suggest that language fluency can predict trust in AI. Moreover, unlike the results from the US population where female students tended to trust AI more than their male peers, female students in Saudi Arabia indicated lower trust compared to their male counterparts, and we
As large language models (LLMs) become increasingly central to Arabic NLP applications, evaluating their understanding of regional dialects and cultural nuances is essential, particularly in linguistically diverse settings like Saudi Arabia. This paper introduces Absher, a comprehensive benchmark specifically designed to assess LLMs' performance across major Saudi dialects. \texttt{Absher} comprises over 18,000 multiple-choice questions spanning six distinct categories: Meaning, True/False, Fill-in-the-Blank, Contextual Usage, Cultural Interpretation, and Location Recognition. These questions are derived from a curated dataset of dialectal words, phrases, and proverbs sourced from various regions of Saudi Arabia. We evaluate several state-of-the-art LLMs, including multilingual and Arabic-specific models, and provide detailed insights into their capabilities and limitations. Our results reveal notable performance gaps, particularly in tasks requiring cultural inference or contextual understanding. Our findings highlight the urgent need for dialect-aware training and culturally aligned evaluation methodologies to improve LLMs' performance in real-world Arabic applications.
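A benchmark of multiple-choice questions grouped into categories, as described above, is commonly scored as per-category accuracy. The following is a minimal sketch of that calculation, not the authors' evaluation code. The record fields (`category`, `gold`, `predicted`), the function name, and the toy records are all assumptions made for illustration and are not Absher data or results.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: fraction of records whose prediction matches the gold answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["category"]] += 1
        if rec["predicted"] == rec["gold"]:
            correct[rec["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy records invented for illustration only (hypothetical, not benchmark data).
toy_records = [
    {"category": "Meaning", "gold": "A", "predicted": "A"},
    {"category": "Meaning", "gold": "C", "predicted": "B"},
    {"category": "Cultural Interpretation", "gold": "D", "predicted": "D"},
    {"category": "Location Recognition", "gold": "B", "predicted": "B"},
]

scores = accuracy_by_category(toy_records)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

Reporting accuracy per category rather than as one pooled number is what makes gaps in a specific skill, such as cultural inference, visible.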
Facial Recognition Technology (FRT) is a pioneering field of mass surveillance that sparks privacy concerns and is considered a growing threat in the modern world. FRT has been widely adopted in the Kingdom of Saudi Arabia to improve public services and surveillance. Accordingly, this study aims to understand the privacy and security concerns, trust, and acceptance of FRT in Saudi Arabia. The validated Privacy Concerns (IUIPC-8), Security Attitudes (SA-6), and Security Behavior (SeBIS) scales are used, along with trust questions replicated from Pew Research Center studies and government trust questions. In addition, we examine potential differences between Saudis and Americans. To gain insights into these concerns, we conducted an online survey involving 53 Saudi Arabian citizens residing in the USA. We collected data in the US instead of Saudi Arabia to avoid the regulatory challenges of the Saudi Data & Artificial Intelligence Authority (SDAIA). Responses from closed-ended questions revealed that Saudis score much lower than Americans when it comes to security attitudes, whereas they score lower when it comes to privacy concerns. We found no significant differe
This study investigates the extent to which contemporary Text-to-Image artificial intelligence (AI) models perpetuate gender stereotypes and cultural inaccuracies when generating depictions of professionals in Saudi Arabia. We analyzed 1,006 images produced by ImageFX, DALL-E V3, and Grok for 56 diverse Saudi professions using neutral prompts. Two trained Saudi annotators evaluated each image on five dimensions: perceived gender, clothing and appearance, background and setting, activities and interactions, and age. A third senior researcher adjudicated whenever the two primary raters disagreed, yielding 10,100 individual judgements. The results reveal a strong gender imbalance, with ImageFX outputs being 85\% male, Grok 86.6\% male, and DALL-E V3 96\% male, indicating that DALL-E V3 exhibited the strongest overall gender stereotyping. This imbalance was most evident in leadership and technical roles. Moreover, cultural inaccuracies in clothing, settings, and depicted activities were frequently observed across all three models. Counter-stereotypical images often arise from cultural misinterpretations rather than genuinely progressive portrayals. We conclude that current models mirro
Autonomous vehicles (AVs) are emerging as a transformative innovation in transportation, offering potential benefits in safety, sustainability, and efficiency. Saudi Arabia's adoption of AVs aligns with Vision 2030, emphasizing smart mobility through initiatives such as the Riyadh Autonomous Metro and self-driving cars. This study explores Saudi citizens' perceptions of AVs before and after exposure to these technologies and examines whether age, gender, education level, and driving habits affect acceptance. Using quantitative methods, the findings provide insights into the broader influences shaping AV adoption, highlighting the importance of trust, perceived safety, and convenience. These results can inform policymakers and industry stakeholders on strategies to facilitate successful integration of AVs into the Saudi Arabian transportation ecosystem.
Sign language (SL) is an essential communication form for hearing-impaired and deaf people, enabling engagement within the broader society. Despite its significance, limited public awareness of SL often leads to inequitable access to educational and professional opportunities, thereby contributing to social exclusion, particularly in Saudi Arabia, where over 84,000 individuals depend on Saudi Sign Language (SSL) as their primary form of communication. Although certain technological approaches have helped to improve communication for individuals with hearing impairments, there continues to be an urgent requirement for more precise and dependable translation techniques, especially for Arabic sign language variants like SSL. Most state-of-the-art solutions have primarily focused on non-Arabic sign languages, resulting in a considerable absence of resources dedicated to Arabic sign language, specifically SSL. The complexity of the Arabic language and the prevalence of isolated sign language datasets that concentrate on individual words instead of continuous speech contribute to this issue. To address this gap, our research represents an important step in developing SSL resources. To ad
Not all nations on earth have previously been surveyed accurately enough to know for certain which peak is the national highpoint, the highest peak in the country. Knowledge of these peaks is important for understanding the physical geography of these countries in terms of natural resource availability, watershed management, and tourism potential. For this study, ground surveys were conducted between 2018 and 2025 with modern professional surveying equipment, including differential GPS units and Abney levels, to accurately determine the national highpoints in five African and Asian countries where uncertainty existed. New national highpoints were determined for Saudi Arabia (Jabal Ferwa), Uzbekistan (Alpomish), Gambia (Sare Firasu Hill), Guinea-Bissau (Mt Ronde), and Togo (Mt Atilakoutse). Elevations were measured with sub-meter vertical accuracy for candidate peaks in Saudi Arabia, Gambia, Guinea-Bissau, and Togo. Relative elevations were measured between contender peaks in Uzbekistan with sufficient accuracy to determine the highpoint.
As digital government platforms become central to public service delivery, understanding citizen assessment is crucial for enhancing usability, trust, and inclusivity. This study investigates citizen satisfaction with the e-government services in Saudi Arabia through a quality-in-use framework based on ISO/IEC 25010 and ISO/IEC 25022 standards, interpreted through the lens of the Unified Theory of Acceptance and Use of Technology (UTAUT). A structured questionnaire was administered to 500 citizens, yielding 276 valid responses. Satisfaction was evaluated across four dimensions: overall satisfaction, feature satisfaction, trust, and emotional engagement (pleasure). The findings demonstrate consistently high levels of satisfaction regarding usability and trust, aligning with Saudi Arabia's top-tier global ranking in e-government development. However, the results also highlight persistent challenges related to service clarity and system responsiveness. Emotional engagement was limited, indicating that users perceive these services primarily as functional tools rather than as engaging digital experiences. The study offers valuable insights for policymakers and contributes to the theore
Using an integrated framework rooted in the TOE model enhanced with AI, this study looks at ways to improve industrial performance and environmental sustainability in fragile and rapidly transforming contexts such as those found in Yemen and Saudi Arabia. Data for the research are field-based and were obtained from a total of 600 SMEs operating in both countries. Based on questionnaire responses from 294 managers, results from partial least squares structural equation modeling (PLS-SEM) indicate significant positive effects of AI-TOE on environmental performance (beta = 0.487) and manufacturing performance (beta = 0.759). Results indicate that AI acts as a transformative force, though its impact differs based on the maturity of infrastructure and organizational readiness. The Saudi SMEs gain from their institutional support and advanced technologies, while those in Yemen depend on low-cost adoption of AI and organizational flexibility to accommodate structural challenges. The PLS-SEM analysis showed that integrating AI into the TOE dimensions accelerates operational efficiency in order to support environmental performance. Industrial performance was fou
The present study provides the first-ever report on the language shift from Tibetan to Arabic among descendants of Tibetan families who migrated from the Tibet region to Saudi Arabia around 70 years ago. The aim of this study was to determine whether three age groups had adopted different practices in terms of maintaining Tibetan or shifting to Hijazi Arabic. To this end, 96 male and female members of the Tibetan community responded to a questionnaire in which they were asked about their code choice in different domains (home, neighbourhood, friends and relatives, expressing emotion, and performing religious rituals). The data revealed significant intergenerational differences between members of the community in terms of the extent of the shift to Arabic, with Tibetan rarely used by younger members and older members making only slightly more use of it. The difference between the three age groups was significant, at a p-value of .001.
Saudi Arabia has experienced swift economic growth and societal transformation under Vision 2030. This offers a unique opportunity to track emerging trends in the region, which will ultimately pave the way for new business and investment possibilities. This paper explores how AI and social media analytics can identify and track trends across sectors such as construction, food and beverage, tourism, technology, and entertainment, thereby helping businesses make informed decisions. By leveraging a tailored AI-driven methodology, we analyzed millions of social media posts each month, classifying discussions and calculating scores to track the trends. The approach not only uncovers emerging trends but also reveals diminishing ones. Our methodology is able to predict the emergence and growth of trends by utilizing social media data. This approach has potential for adaptation in other regions. Ultimately, our findings highlight how ongoing, AI-powered trend analysis can enable more effective, data-informed business and development strategies in an increasingly dynamic environment.
The integration of Internet of Things (IoT) technologies in agriculture holds promise for transforming farming practices, particularly in the Kingdom of Saudi Arabia (KSA). This study explores the adoption of smart farming practices among KSA farmers. KSA faces significant agricultural challenges due to its geographical location and natural environment. The objective of this research is to discuss how IoT will enhance agriculture in KSA and to identify its current usage through a study of Saudi farmers of varying ages, regions, and years of experience. The results indicate that 90% of the farmers encounter challenges in farming, and all of them express interest in adopting smart farming to address these issues. While 60% of farmers are currently utilizing IoT technologies, they encounter challenges in implementing smart farming practices. Thus, smart farming presents solutions to prevalent challenges including adverse weather, water scarcity, and labor shortages, though barriers include cost and educational challenges.
This paper introduces the Saudi Privacy Policy Dataset, a diverse compilation of Arabic privacy policies from various sectors in Saudi Arabia, annotated according to the 10 principles of the Personal Data Protection Law (PDPL). The PDPL was established to be compatible with the General Data Protection Regulation (GDPR), one of the most comprehensive data regulations worldwide. Data were collected from multiple sources, including the Saudi Central Bank, the Saudi Arabia National United Platform, the Council of Health Insurance, and general websites using Google and Wikipedia. The final dataset includes 1,000 websites belonging to 7 sectors, 4,638 lines of text, 775,370 tokens, and a corpus size of 8,353 KB. The annotated dataset offers significant reuse potential for assessing privacy policy compliance, benchmarking privacy practices across industries, and developing automated tools for monitoring adherence to data protection regulations. By providing a comprehensive and annotated dataset of privacy policies, this paper aims to facilitate further research and development in the areas of privacy policy analysis, natural language processing, and machine learning applications related to pr
This paper presents the preliminary findings of a study researching the diffusion and adoption of online retailing in Saudi Arabia. It reports new research that identifies and explores the key issues that positively and negatively influence the decision of Saudi customers to buy from online retailers in Saudi Arabia. Although Saudi Arabia has the largest and fastest-growing ICT marketplace in the Arab region, e-commerce activities are not progressing at the same speed. While the overall research project involves exploratory research using mixed methods, the focus of this paper is on a quantitative analysis of responses obtained from a survey of Saudi customers, with the design of the questionnaire instrument being based on the findings of a qualitative analysis reported in a previous paper. The main findings of the current analysis include a list of key factors that affect Saudi customers' purchases from Saudi online retailers, and quantitative indications of the relative strengths of the various relationships.