We develop a high-precision classifier to measure artificial intelligence (AI) patents by fine-tuning PatentSBERTa on manually labeled data from the USPTO's AI Patent Dataset. Our classifier substantially improves the existing USPTO approach, achieving 97.0% precision, 91.3% recall, and a 94.0% F1 score, and it generalizes well to Chinese patents based on citation and lexical validation. Applying it to granted U.S. patents (1976-2023) and Chinese patents (2010-2023), we document rapid growth in AI patenting in both countries and broad convergence in AI patenting intensity and subfield composition, even as China surpasses the United States in recent annual patent counts. The organization of AI innovation nevertheless differs sharply: U.S. AI patenting is concentrated among large private incumbents and established hubs, whereas Chinese AI patenting is more geographically diffuse and institutionally diverse, with larger roles for universities and state-owned enterprises. For listed firms, AI patents command a robust market-value premium in both countries. Cross-border citations show continued technological interdependence rather than decoupling, with Chinese AI inventors relying more
Scientific research is a key input into technological innovation, yet not all scientific knowledge is equally mobilized in patents. This paper examines how different scientific publishing models shape both the selection of scientific publications cited in patents and their cognitive alignment with patented technologies. Using large-scale data on non-patent references linking patents to scientific publications, combined with metadata from OpenAlex, we compare the Open Access (OA) structure of patent-cited science to that of the scientific literature. We then assess cognitive alignment using semantic similarity between patent abstracts and the abstracts of cited publications, distinguishing between citations appearing in the front section of patents and those embedded in the body of patent texts. We find that patent citations disproportionately draw on publications disseminated through highly visible and institutionally established publishing channels, particularly hybrid and bronze OA models, indicating strong selection effects. However, this dominance in citation counts does not translate into stronger cognitive alignment with patented technologies. On the contrary, publications in
In an age of fast-paced technological change, patents have become not only legal mechanisms of intellectual property but also structured repositories of knowledge, rich in metadata, categories, and formal innovation. This chapter proposes to reframe patents in the context of information science by treating them as knowledge artifacts that are fundamentally tied to the global movement of scientific and technological knowledge. Focusing on three areas (AI-generated inventions, biotech patents, and international patent competition), this work considers how new technologies are challenging traditional notions of inventorship, access, and moral accountability. The chapter provides a critical analysis of AI's implications for patent authorship and prior art searches, of issues in biotechnology ranging from ownership and proprietary claims to ethical dilemmas, and of the problem of using patents for strategic advantage in a global context of innovation competition. In this analysis, the chapter identifies the importance of organizing information, creating metadata standards about originality, implementing retrieval systems to access previous works, and ethi
Pharmaceutical patents play an important role: they protect innovations from being copied and also drive researchers to innovate, create new products, and promote disruptive innovations focused on collective health. The study of patent management usually requires an exhaustive manual search. This is because patent documents are complex, with many details regarding the claims and the methodology and results of the invention. To reduce the need for manual search, we propose PATopics, a framework specially designed to extract relevant information from pharmaceutical patents. PATopics is composed of four building blocks that extract textual information from the patents, build relevant topics capable of summarizing the patents, correlate these topics with useful patent characteristics, and then summarize the information in a friendly web interface for end users. The general contributions of PATopics are its ability to centralize patents and to organize them into groups based on their similarities. We extensively analyzed the framework using 4,832 pharmaceutical patents concerning 809 molecules patented by 478 companies. In our analysis, we evaluate the use of the framewo
The rapid growth of scientific techniques and knowledge is reflected in the exponential increase in new patents filed annually. While these patents drive innovation, they also present a significant burden for researchers and engineers, especially newcomers. To avoid the tedious work of navigating a vast and complex landscape to identify trends and breakthroughs, researchers urgently need efficient tools to summarize, evaluate, and contextualize patents, revealing their innovative contributions and underlying scientific principles. To address this need, we present EvoPat, a multi-LLM-based patent agent designed to assist users in analyzing patents through Retrieval-Augmented Generation (RAG) and advanced search strategies. EvoPat leverages multiple Large Language Models (LLMs), each performing specialized roles such as planning, identifying innovations, and conducting comparative evaluations. The system integrates data from local databases, including patents, literature, product catalogs, and company repositories, and online searches to provide up-to-date insights. The ability to automatically collect information not included in the original database is also implemented. Through extensiv
This article examines the complex trade-offs inherent in the patent system, exploring whether patents truly incentivize innovation or inadvertently hinder progress. It traces the historical evolution of patent rights from their origins in Renaissance Venice to the modern framework enshrined in constitutional and international law. By balancing the exclusivity granted to inventors with the need for public access to knowledge, the article highlights how patents stimulate R&D investments while sometimes limiting follow-on innovation due to prolonged monopolies. It discusses the economic rationales for patent protection, such as increasing private returns to encourage invention, against criticisms that patents restrict the free flow of ideas essential for collective progress. Ultimately, the review argues that although patents are designed to reward creativity and secure economic benefits, they remain a contentious instrument whose benefits and limitations must be carefully weighed in policy debates.
The purpose of this paper is to use patent-level characteristics to estimate the survival of resident patents (filed at the Indian Patent Office (IPO) and assigned to firms in India). This study uses the renewal information of firm-level patents applied for between 1 January 1995 and 31 December 2005 that were eventually granted. The data provided by the IPO consist of 2025 resident patents assigned to 266 firms (foreign subsidiary firms and domestic firms). The survival analysis is carried out via Kaplan-Meier estimation and Cox proportional hazards regression. The outcomes of this study suggest that the survival length of patents depends significantly on their technological scope and inventor size. Moreover, the patents of firms taking tax credit benefits exhibit a lower survival rate than the patents of the remaining firms. The study also finds that the patents filed by foreign firms with DSIR affiliation are getting more benefit from the R&D tax incentive policy.
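The Kaplan-Meier step mentioned above can be illustrated with a minimal sketch. It assumes each patent is described by the number of years it stayed in force and a flag saying whether it actually lapsed (an observed event) or was still alive when observation ended (censored). The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: years each patent stayed in force (hypothetical toy input)
    observed:  True if the patent lapsed at that time, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit update: multiply by the share surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (assumption): six patents, three of them censored.
durations = [3, 5, 5, 8, 10, 10]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"year {t}: S(t) = {s:.3f}")
```

Censored patents still count in the risk set until they leave observation, which is why they are subtracted from `at_risk` but never reduce `survival`.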
This study uses patent renewal information to estimate the private value of patents by technology and ownership status. Patent value refers to the economic reward that the inventor extracts from the patent by making, using, or selling an invention. Thus, we measure the value of the patent right (the private value of the patent) from the patentee's perspective. Our empirical analysis comprises 555 patents with application years from 1999 to 2002. The term of these patents either ended in 2018 or lapsed due to non-payment of the renewal fee. We model the patentee's renewal decision as an ordered probit in which the patent renewal fee increases with the age of the patent. Variables such as patent family size, technological scope, number of inventors, and grant lag are used as explanatory variables in the corresponding regression. Hence, this paper combines the patentee's renewal decision with patent characteristics and the renewal cost schedule to estimate the initial rent distribution. We find that a large number of patents expire at an early stage, leaving few patents with high value, corroborating the results of studies using European, American, and Chinese data. As expected, certain technology class patents enjoy high
Since the 1950s, TRIZ has shown that patents and the technical contradictions they solve are an important source of inspiration for the development of innovative products. However, TRIZ is a heuristic based on a historic patent analysis and does not make use of the ever-increasing number of recent technological solutions in current patents. Because of the huge number of patents, their length, and, last but not least, their complexity, modern patent retrieval and patent analysis need to go beyond keyword-oriented methods. Recent advances in patent retrieval and analysis mainly focus on dense vectors based on neural Transformer language models such as Google's BERT. They are used, for example, for dense retrieval, question answering, summarization, and key concept extraction. One research focus within patent summarization and key concept extraction is generic inventive concepts, or TRIZ concepts, such as problems, solutions, advantages of the invention, parameters, and contradictions. Succeeding rule-based approaches, fine-tuned BERT-like language models for sentence-wise classification represent the state of the art in inventive concept extraction. Whil
This chapter explores the role of patent protection in algorithmic surveillance and whether ordre public exceptions from patentability should apply to such patents, due to their potential to enable human rights violations. It concludes that in most cases, it is undesirable to exclude algorithmic surveillance patents from patentability, as the patent system is ill-equipped to evaluate the impacts of the exploitation of such technologies. Furthermore, the disclosure of such patents has positive externalities from the societal perspective by opening the black box of surveillance for public scrutiny.
Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective) are used to visualize details of innovations in patents. The classification of these images enables a more efficient search and allows for further analysis. So far, datasets for image type classification miss some important visualization types for patents. Furthermore, related work does not make use of recent deep learning approaches including transformers. In this paper, we adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images. We extend the CLEF-IP dataset for image type classification in patents to ten classes and provide manual ground truth annotations. In addition, we derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives. Experimental results have demonstrated the feasibility of the proposed approaches. Source code, models, and dataset will be made publicly ava
One of the most challenging problems in technological forecasting is to identify as early as possible those technologies that have the potential to lead to radical changes in our society. In this paper, we use the US patent citation network (1926-2010) to test our ability to identify a list of historically significant patents early through citation network analysis. We show that in order to effectively uncover these patents shortly after they are issued, we need to go beyond raw citation counts and take into account both the citation network topology and temporal information. In particular, an age-normalized measure of patent centrality, called rescaled PageRank, allows us to identify the significant patents earlier than citation count and PageRank score. In addition, we find that while high-impact patents tend to rely on other high-impact patents in a similar way as scientific papers, the patents' citation dynamics are significantly slower than those of papers, which makes the early identification of significant patents more challenging than that of significant papers.
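The idea of age-normalizing a centrality score can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: PageRank is computed by plain power iteration, and each score is then turned into a z-score relative to a window of patents of similar age. The damping factor, the window size, the function names, and the toy citation graph are all assumptions.

```python
from statistics import mean, pstdev


def pagerank(n, edges, damping=0.5, iters=100):
    """Power-iteration PageRank. edges are (citing, cited) pairs on nodes 0..n-1."""
    out = [[] for _ in range(n)]
    for src, dst in edges:
        out[src].append(dst)
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Nodes that cite nothing spread their score uniformly.
        dangling = sum(rank[i] for i in range(n) if not out[i])
        new = [(1 - damping) / n + damping * dangling / n] * n
        for i in range(n):
            if out[i]:
                share = damping * rank[i] / len(out[i])
                for j in out[i]:
                    new[j] += share
        rank = new
    return rank


def rescale_by_age(scores, window=4):
    """Z-score each node against `window` nodes of similar age.

    Assumes nodes are indexed by issue date (0 = oldest).
    """
    n = len(scores)
    half = window // 2
    rescaled = []
    for i in range(n):
        lo = max(0, min(i - half, n - window))
        peers = scores[lo:lo + window]
        sigma = pstdev(peers)
        rescaled.append((scores[i] - mean(peers)) / sigma if sigma > 0 else 0.0)
    return rescaled


# Toy citation graph (assumption): later patents cite earlier ones.
edges = [(1, 0), (2, 0), (3, 0), (3, 1), (4, 2), (5, 4)]
rank = pagerank(6, edges)
rescaled = rescale_by_age(rank)
for i, (p, r) in enumerate(zip(rank, rescaled)):
    print(f"patent {i}: PageRank = {p:.3f}, rescaled = {r:+.2f}")
```

Comparing a patent only with peers of similar age removes the advantage that old patents gain simply from having had more time to accumulate citations.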
Patent analysis has recently been recognized as a powerful technique that gives large companies worldwide insight into competition among various industries. This technique is considered a shortcut for developing countries since it can significantly accelerate their technology development. Patent analysis can therefore be used to monitor rival companies and diverse industries. This research employed a graph representation learning approach to create, analyze, and find similarities in the patent data registered in the Iranian Official Gazette. The patent records were scraped from the Iranian Official Gazette portal and wrangled. Afterward, the key entities were extracted from the scraped patents dataset to create the Iranian patents graph from scratch based on novel natural language processing and entity resolution techniques. Finally, using novel graph algorithms and text mining methods, we identified new areas of industry and research from Iranian patent data, which can be used extensively to prevent duplicate patents, familiarity with similar and connected inventions, awareness of legal entities supporting
One essential component in the construction of patent landscapes in biomedical research and development (R&D) is identifying the most seminal patents. Hitherto, the identification of seminal patents required subject matter experts within biomedical areas. In this brief communication, we report an analytical method and tool, Patent Citation Spectroscopy (PCS), for rapidly identifying landmark patents in user-specified areas of biomedical innovation. PCS mines the cited references within large sets of patents and provides an estimate of the most historically impactful prior work. The efficacy of PCS is shown in two case studies of biomedical innovation with clinical relevance: (1) RNA interference and (2) cholesterol. PCS mined and analyzed 4,065 cited references related to patents on RNA interference and correctly identified the foundational patent of this technology, as independently reported by subject matter experts on RNAi intellectual property. Secondly, PCS was applied to a broad set of patents dealing with cholesterol - a case study chosen to reflect a more general, as opposed to expert, patent search query. PCS mined through 11,326 cited references and identified the sem
The global patent application count has steadily increased, achieving eight consecutive years of growth. The global patent industry has shown a general trend of expansion. This is attributed to increasing innovation activity, particularly in the fields of technology, healthcare, and biotechnology. Some emerging market countries, such as China and India, have experienced significant growth in the patent domain, becoming important participants in global patent activities.
A long-standing discussion is to what extent patents can be used to monitor trends in innovation activity. This study quantifies the amount and quality of information about actual innovation contained in the patent system, based on 4,460 Swedish innovations (1970-2015) that have been matched to international patents. The results show that most innovations were not patented and that among those that were, 43.9% of all innovations, only a fraction can be identified with patent quality data. The best-performing models identify 17% of all information about innovations, equivalent to an information loss of at least 83%. Econometric tests also show that the fraction of innovations responding to strengthened patent laws during the period was on average 8 percent. The overlap between the patent and innovation systems is hence more modest than often assumed. This accentuates the need to develop, alongside patents, versatile approaches to induce and monitor various aspects of innovation.
While patents and standards have been identified as essential drivers of innovation and market growth, the inclusion of a patent in a standard poses many difficulties. These difficulties arise from the contradictory natures of patents and standards, which make their combination challenging, and also from the opposing business and market strategies of the different patent owners involved in the standardisation process. A varying set of policies has been adopted to address the issues arising from the unavoidable inclusion of patents in standards in industry sectors with a consistently high degree of innovation, such as telecommunications. As these policies have not always proven adequate, constant efforts are being made to improve and expand them. The intriguing and complicated relationship between patents and standards is finally examined through a review of the use cases of well-known standards of the telecommunications sector which include a growing set of essential patents.
The total number of patents produced by a country (or the number of patents produced per capita) is often used as an indicator for innovation. Here we present evidence that the distribution of patents amongst applicants within many OECD countries is well-described by power laws with exponents that vary between 1.66 (Japan) and 2.37 (Poland). Using simulations based on simple preferential attachment-type rules that generate power laws, we find we can explain some of the variation in exponents between countries, with countries that have larger numbers of patents per applicant generally exhibiting smaller exponents in both the simulated and actual data. Similarly we find that the exponents for most countries are inversely correlated with other indicators of innovation, such as R&D intensity or the ubiquity of export baskets. This suggests that in more advanced economies, which tend to have smaller values of the exponent, a greater proportion of the total number of patents are filed by large companies than in less advanced countries.
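A power-law exponent of the kind reported above can be estimated in several ways, and the abstract does not say which one the authors used. As one hedged illustration, the sketch below applies the standard continuous maximum-likelihood estimator, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$, to synthetic data. The function names, the choice of `x_min`, and the synthetic sample are assumptions. Patent counts per applicant are integers, so a discrete estimator may be more appropriate in practice.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only observations with x >= x_min are treated as part of the tail.
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one observation above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Draw n synthetic values from p(x) ~ x^(-alpha) by inverse transform."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic check (assumption): recover a known exponent of 2.0.
sample = sample_powerlaw(alpha=2.0, x_min=1.0, n=20000)
alpha_hat = powerlaw_alpha_mle(sample, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.3f}")
```

A smaller exponent means a heavier tail, which matches the abstract's reading that in such countries a greater share of patents is filed by a few large applicants.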
Patent similarity analysis plays a crucial role in evaluating the risk of patent infringement. Nonetheless, this analysis is predominantly conducted manually by legal experts, which often makes it time-consuming. Recent advances in natural language processing technology offer a promising avenue for automating this process. However, methods for measuring similarity between patents still rely on experts manually classifying patents. Owing to recent developments in artificial intelligence, much research now focuses on the semantic similarity of patents using natural language processing technology. However, it is difficult to accurately analyze patent data, which are legal documents representing complex technologies, using existing natural language processing technologies. To address these limitations, we propose a hybrid methodology that measures the similarity between patents by considering the semantic similarity of patents, the technical similarity between patents, and the bibliographic information of patents. Using natural language processing techniques, we measure semantic similarity based on pat
As the capabilities of Large Language Models (LLMs) continue to advance, the field of patent processing has garnered increased attention within the natural language processing community. However, the majority of research has been concentrated on classification tasks, such as patent categorization and examination, or on short text generation tasks like patent summarization and patent quizzes. In this paper, we introduce a novel and practical task known as Draft2Patent, along with its corresponding D2P benchmark, which challenges LLMs to generate full-length patents averaging 17K tokens based on initial drafts. Patents present a significant challenge to LLMs due to their specialized nature, standardized terminology, and extensive length. We propose a multi-agent framework called AutoPatent which leverages the LLM-based planner agent, writer agents, and examiner agent with PGTree and RRAG to generate lengthy, intricate, and high-quality complete patent documents. The experimental results demonstrate that our AutoPatent framework significantly enhances the ability to generate comprehensive patents across various LLMs. Furthermore, we have discovered that patents generated solely with t