Grid search and manual search are the most widely used strategies for hyper-parameter optimization. This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid. Empirical evidence comes from a comparison with a large previous study that used grid search and manual search to configure neural networks and deep belief networks. Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time. Granting random search the same computational budget, random search finds better models by effectively searching a larger, less promising configuration space. Compared with deep belief networks configured by a thoughtful combination of manual search and grid search, purely random search over the same 32-dimensional configuration space found statistically equal performance on four of seven data sets, and superior performance on one of seven. A Gaussian process analysis of the function from hyper-parameters to validation set performance reveals that for most data sets only a few of the hyper-parameters really matter, but that different hyper-parameters are important on different data sets. This phenomenon makes
OBJECTIVE: To develop an evidence-based guideline for Peer Review of Electronic Search Strategies (PRESS) for systematic reviews (SRs), health technology assessments, and other evidence syntheses. STUDY DESIGN AND SETTING: An SR, Web-based survey of experts, and consensus development forum were undertaken to identify checklists that evaluated or validated electronic literature search strategies and to determine which of their elements related to search quality or errors. RESULTS: Systematic review: No new search elements were identified for addition to the existing (2008-2010) PRESS 2015 Evidence-Based Checklist, and there was no evidence refuting any of its elements. Results suggested that structured PRESS could identify search errors and improve the selection of search terms. Web-based survey of experts: Most respondents felt that peer review should be undertaken after the MEDLINE search had been prepared but before it had been translated to other databases. Consensus development forum: Of the seven original PRESS elements, six were retained: translation of the research question; Boolean and proximity operators; subject headings; text word search; spelling, syntax and line numbers; and limits and filters. The seventh (skilled translation of the search strategy to additional databases) was removed, as there was consensus that this should be left to the discretion of searchers. An updated PRESS 2015 Guideline Statement was developed, which includes the following four documents: PRESS 2015 Evidence-Based Checklist, PRESS 2015 Recommendations for Librarian Practice, PRESS 2015 Implementation Strategies, and PRESS 2015 Guideline Assessment Form. CONCLUSION: The PRESS 2015 Guideline Statement should help to guide and improve the peer review of electronic literature search strategies.
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
This is the second half of a two part series devoted to the tabu search metastrategy for optimization problems. Part I introduced the fundamental ideas of tabu search as an approach for guiding other heuristics to overcome the limitations of local optimality, both in a deterministic and a probabilistic framework. Part I also reported successful applications from a wide range of settings, in which tabu search frequently made it possible to obtain higher quality solutions than previously obtained with competing strategies, generally with less computational effort. Part II, in this issue, examines refinements and more advanced aspects of tabu search. Following a brief review of notation, Part II introduces new dynamic strategies for managing tabu lists, allowing fuller exploitation of underlying evaluation functions. In turn, the elements of staged search and structured move sets are characterized, which bear on the issue of finiteness. Three ways of applying tabu search to the solution of integer programming problems are then described, providing connections also to certain nonlinear programming applications. Finally, the paper concludes with a brief survey of new applications of tabu search that have occurred since the developments reported in Part I. Together with additional comparisons with other methods on a wide body of problems, these include results of parallel processing implementations and the use of tabu search in settings ranging from telecommunications to neural networks. INFORMS Journal on Computing, ISSN 1091-9856, was published as ORSA Journal on Computing from 1989 to 1995 under ISSN 0899-1499.
This paper presents the fundamental principles underlying tabu search as a strategy for combinatorial optimization problems. Tabu search has achieved impressive practical successes in applications ranging from scheduling and computer channel balancing to cluster analysis and space planning, and more recently has demonstrated its value in treating classical problems such as the traveling salesman and graph coloring problems. Nevertheless, the approach is still in its infancy, and a good deal remains to be discovered about its most effective forms of implementation and about the range of problems for which it is best suited. This paper undertakes to present the major ideas and findings to date, and to indicate challenges for future research. Part I of this study indicates the basic principles, ranging from the short-term memory process at the core of the search to the intermediate and long term memory processes for intensifying and diversifying the search. Included are illustrative data structures for implementing the tabu conditions (and associated aspiration criteria) that underlie these processes. Part I concludes with a discussion of probabilistic tabu search and a summary of computational experience for a variety of applications. Part II of this study (to appear in a subsequent issue) examines more advanced considerations, applying the basic ideas to special settings and outlining a dynamic move structure to insure finiteness. Part II also describes tabu search methods for solving mixed integer programming problems and gives a brief summary of additional practical experience, including the use of tabu search to guide other types of processes, such as those of neural networks.
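The short-term memory process and aspiration criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the bit-flip neighborhood, the fixed tenure, the function name `tabu_search`, and the toy subset-sum cost are all assumptions chosen to keep the example small.

```python
from collections import deque


def tabu_search(cost, start, n_iters=100, tenure=3):
    """Minimize `cost` over bit tuples using single-bit-flip moves.

    Hypothetical helper for illustration: the tabu list stores the positions
    flipped most recently, and a tabu move is still admitted if it beats the
    best cost seen so far (a simple aspiration criterion).
    """
    current = tuple(start)
    best, best_cost = current, cost(current)
    tabu = deque(maxlen=tenure)  # short-term memory of recent moves

    for _ in range(n_iters):
        candidates = []
        for i in range(len(current)):
            neighbor = current[:i] + (1 - current[i],) + current[i + 1:]
            c = cost(neighbor)
            if i not in tabu or c < best_cost:
                candidates.append((c, i, neighbor))
        if not candidates:
            break
        # Take the best admissible move even when it worsens the current cost;
        # this is what lets the search leave a local optimum.
        c, i, current = min(candidates)
        tabu.append(i)
        if c < best_cost:
            best, best_cost = current, c
    return best, best_cost


# Toy problem (assumed): pick weights whose sum is as close to 9 as possible.
WEIGHTS = (7, 5, 4, 3)


def subset_cost(bits):
    return abs(sum(w * b for w, b in zip(WEIGHTS, bits)) - 9)


solution, solution_cost = tabu_search(subset_cost, (0, 0, 0, 0))
```

In this toy run a plain descent would stop at the selection 7 + 3 (cost 1), because every single flip from there is worse. The tabu list forbids undoing the recent flips, so the search is pushed through worse selections until it reaches 5 + 4.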
It is desirable to store data on data storage servers such as mail servers and file servers in encrypted form to reduce security and privacy risks. But this usually implies that one has to sacrifice functionality for security. For example, if a client wishes to retrieve only documents containing certain words, it was not previously known how to let the data storage server perform the search and answer the query, without loss of data confidentiality. We describe our cryptographic schemes for the problem of searching on encrypted data and provide proofs of security for the resulting crypto systems. Our techniques have a number of crucial advantages. They are provably secure: they provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext when only given the ciphertext; they provide query isolation for searches, meaning that the untrusted server cannot learn anything more about the plaintext than the search result; they provide controlled searching, so that the untrusted server cannot search for an arbitrary word without the user's authorization; they also support hidden queries, so that the user may ask the untrusted server to search for a secret word without revealing the word to the server. The algorithms presented are simple, fast (for a document of length n, the encryption and search algorithms only need O(n) stream cipher and block cipher operations), and introduce almost no space and communication overhead, and hence are practical to use today.
From the Publisher: This book explores the meta-heuristics approach called tabu search, which is dramatically changing our ability to solve a host of problems that stretch over the realms of resource planning, telecommunications, VLSI design, financial analysis, scheduling, space planning, energy distribution, molecular engineering, logistics, pattern classification, flexible manufacturing, waste management, mineral exploration, biomedical analysis, environmental conservation and scores of other problems. The major ideas of tabu search are presented with examples that show their relevance to multiple applications. Numerous illustrations and diagrams are used to clarify principles that deserve emphasis, and that have not always been well understood or applied. The book's goal is to provide "hands-on" knowledge and insight alike, rather than to focus exclusively either on computational recipes or on abstract themes. This book is designed to be useful and accessible to researchers and practitioners in management science, industrial engineering, economics, and computer science. It can appropriately be used as a textbook in a master's course or in a doctoral seminar. Because of its emphasis on presenting ideas through illustrations and diagrams, and on identifying associated practical applications, it can also be used as a supplementary text in upper division undergraduate courses. Finally, there are many more applications of tabu search than can possibly be covered in a single book, and new ones are emerging every day. The book's goal is to provide a grounding in the essential ideas of tabu search that will allow readers to create successful applications of their own. Along with the essential ideas, understanding of advanced issues is provided, enabling researchers to go beyond today's developments and create the methods of tomorrow.
We present a statistical model to estimate the accuracy of peptide assignments to tandem mass (MS/MS) spectra made by database search applications such as SEQUEST. Employing the expectation maximization algorithm, the analysis learns to distinguish correct from incorrect database search results, computing probabilities that peptide assignments to spectra are correct based upon database search scores and the number of tryptic termini of peptides. Using SEQUEST search results for spectra generated from a sample of known protein components, we demonstrate that the computed probabilities are accurate and have high power to discriminate between correctly and incorrectly assigned peptides. This analysis makes it possible to filter large volumes of MS/MS database search results with predictable false identification error rates and can serve as a common standard by which the results of different research groups are compared.
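The idea of using expectation maximization to turn search scores into probabilities of correctness can be sketched on a simplified case. The example below assumes a one-dimensional score and fits a mixture of two Gaussians. The abstract does not specify the distributions used, and the number of tryptic termini is ignored here. The function names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior(score, mu, sigma, prior_correct):
    """Probability that a score comes from the 'correct' component (index 1)."""
    p1 = prior_correct * normal_pdf(score, mu[1], sigma[1])
    p0 = (1.0 - prior_correct) * normal_pdf(score, mu[0], sigma[0])
    return p1 / (p0 + p1)


def fit_mixture(scores, n_iters=50):
    """Fit a two-Gaussian mixture to scores with EM (simplifying assumption)."""
    n = len(scores)
    mean = sum(scores) / n
    spread = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    mu = [min(scores), max(scores)]  # crude initial guess
    sigma = [spread, spread]
    prior_correct = 0.5
    for _ in range(n_iters):
        # E-step: responsibility of the 'correct' component for each score.
        resp = [posterior(s, mu, sigma, prior_correct) for s in scores]
        n1 = sum(resp)
        n0 = n - n1
        # M-step: re-estimate means, spreads, and the mixing proportion.
        mu = [
            sum((1.0 - r) * s for r, s in zip(resp, scores)) / n0,
            sum(r * s for r, s in zip(resp, scores)) / n1,
        ]
        sigma = [
            max(1e-3, math.sqrt(sum((1.0 - r) * (s - mu[0]) ** 2 for r, s in zip(resp, scores)) / n0)),
            max(1e-3, math.sqrt(sum(r * (s - mu[1]) ** 2 for r, s in zip(resp, scores)) / n1)),
        ]
        prior_correct = n1 / n
    return mu, sigma, prior_correct


# Synthetic scores (assumed): 300 incorrect matches near 0, 100 correct near 5.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(100)]
mu, sigma, prior_correct = fit_mixture(scores)
```

Once fitted, `posterior(score, mu, sigma, prior_correct)` gives a per-assignment probability, which is the quantity one would threshold to filter results at a chosen error rate.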
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points
This paper develops the multidimensional binary search tree (or k-d tree, where k is the dimensionality of the search space) as a data structure for storage of information to be retrieved by associative searches. The k-d tree is defined and examples are given. It is shown to be quite efficient in its storage requirements. A significant advantage of this structure is that a single data structure can handle many types of queries very efficiently. Various utility algorithms are developed; their proven average running times in an n-record file are: insertion, O(log n); deletion of the root, O(n^((k-1)/k)); deletion of a random node, O(log n); and optimization (guarantees logarithmic performance of searches), O(n log n). Search algorithms are given for partial match queries with t keys specified [proven maximum running time of O(n^((k-t)/k))] and for nearest neighbor queries [empirically observed average running time of O(log n)]. These performances far surpass the best currently known algorithms for these tasks. An algorithm is presented to handle any general intersection query. The main focus of this paper is theoretical. It is felt, however, that k-d trees could be quite useful in many applications, and examples of potential uses are given.
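As a rough illustration of the structure described here, the sketch below builds a balanced k-d tree by splitting on the median and answers a nearest neighbor query with pruning. It is a simplified reading, not the paper's algorithms: the names `Node`, `build`, and `nearest` are assumptions, and re-sorting at every level costs more than the O(n log n) optimization cited in the abstract.

```python
from typing import NamedTuple, Optional, Tuple

Point = Tuple[float, ...]


class Node(NamedTuple):
    point: Point
    axis: int                 # coordinate used as the discriminator at this node
    left: Optional["Node"]
    right: Optional["Node"]


def build(points, depth=0):
    """Build a balanced tree by cycling through the axes and splitting on the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis, build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to `target` (squared Euclidean distance)."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
closest = nearest(tree, (9, 2))
```

The pruning test on `diff` is what lets a query skip whole subtrees. It is also the step that stops helping in high dimensions, where the splitting plane is almost always closer than the current best match.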
A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.
"Direct Search" Solution of Numerical and Statistical Problems. Authors: Robert Hooke and T. A. Jeeves, Westinghouse Research Laboratories, Pittsburgh, Pennsylvania. Journal of the ACM, Volume 8, Issue 2, April 1961, pp. 212–229. https://doi.org/10.1145/321062.321069. Published: 01 April 1961.
MOTIVATION: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. RESULTS: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. AVAILABILITY: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.
A new theory of search and visual attention is presented. Results support neither a distinction between serial and parallel search nor between search for features and conjunctions. For all search materials, instead, difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets, producing a continuum of search efficiency. A parallel stage of perceptual grouping and description is followed by competitive interaction between inputs, guiding selective access to awareness and action. An input gains weight to the extent that it matches an internal description of that information needed in current behavior (hence the effect of target-nontarget similarity). Perceptual grouping encourages input weights to change together (allowing spreading suppression of similar nontargets). The theory accounts for harmful effects of nontargets resembling any possible target, the importance of local nontarget grouping, and many other findings.
An algorithm was developed which facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases. Because of the algorithm's efficiency on many microcomputers, sensitive protein database searches may now become a routine procedure for molecular biologists. The method efficiently identifies regions of similar sequence and then scores the aligned identical and differing residues in those regions by means of an amino acid replaceability matrix. This matrix increases sensitivity by giving high scores to those amino acid replacements which occur frequently in evolution. The algorithm has been implemented in a computer program designed to search protein databases very rapidly. For example, comparison of a 200-amino-acid sequence to the 500,000 residues in the National Biomedical Research Foundation library would take less than 2 minutes on a minicomputer, and less than 10 minutes on a microcomputer (IBM PC).
When Beacon Press first published Man's Search for Meaning in 1959, Carl Rogers called it "one of the outstanding contributions to psychological thought in the last fifty years." In the thirty-three years since then, this book - at once a memoir, a self-help book, and a psychology manual - has become a classic that has sold more than three million copies in English language editions. Man's Search for Meaning tells the chilling and inspirational story of eminent psychiatrist Viktor Frankl, who was imprisoned at Auschwitz and other concentration camps for three years during the Second World War. Immersed in great suffering and loss, Frankl began to wonder why some of his fellow prisoners were able not only to survive the horrifying conditions, but to grow in the process. Frankl's conclusion - that the most basic human motivation is the will to meaning - became the basis of his groundbreaking psychological theory, logotherapy. As Nietzsche put it, "He who has a why to live for can bear almost any how." In Man's Search for Meaning, Frankl outlines the principles of logotherapy, and offers ways to help each one of us focus on finding the purpose in our lives. This new edition of Man's Search for Meaning includes a new preface by the author, in which he explains his decision to remain in his native Austria during the Nazi invasion, a choice which eventually led to his imprisonment. It also includes an updated bibliography of books, articles, records, films, videotapes, and audio tapes about logotherapy.
David Goldberg's Genetic Algorithms in Search, Optimization and Machine Learning is by far the bestselling introduction to genetic algorithms. Goldberg is one of the preeminent researchers in the field--he has published over 100 research articles on genetic algorithms and is a student of John Holland, the father of genetic algorithms--and his deep understanding of the material shines through. The book contains a complete listing of a simple genetic algorithm in Pascal, which C programmers can easily understand. The book covers all of the important topics in the field, including crossover, mutation, classifier systems, and fitness scaling, giving a novice with a computer science background enough information to implement a genetic algorithm and describe genetic algorithms to a friend.
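For readers who want a feel for the topics listed in this description (selection, crossover, mutation), here is a minimal Python sketch of a simple genetic algorithm on bit strings. It is not the book's Pascal listing: the OneMax fitness, the parameter values, the function names, and the elitism step are assumptions made for illustration.

```python
import random


def one_max(bits):
    """Toy fitness (assumed): the number of 1 bits in the string."""
    return sum(bits)


def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           p_cross=0.7, p_mut=0.02, seed=0):
    """Run a simple GA and return the best individual plus a per-generation history."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        # Fitness-proportionate ("roulette wheel") selection; the +1 keeps
        # every weight positive for non-negative fitness values.
        weights = [s + 1 for s in scores]
        children = [elite[:]]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_bits)  # one-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            # Bitwise mutation: flip each bit with probability p_mut.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


best, history = evolve(one_max)
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness can never decrease, which makes the run easy to sanity-check.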
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the "multiple segment Viterbi" (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call "sparse rescaling". These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches.