Since the time of Galileo, scientists have employed the laboratory experiment as a method of understanding natural phenomena. This chapter focuses on some of the qualities peculiar to psychological experiments. The experimental situation is one which takes place within the context of an explicit agreement of the subject to participate in a special form of social interaction known as taking part in an experiment. The demand characteristics perceived in any particular experiment will vary with the sophistication, intelligence, and previous experience of each experimental subject. It becomes an empirical issue to study under what circumstances, in what kind of experimental contexts, and with what kind of subject populations, demand characteristics become significant in determining the behavior of subjects in experimental situations. The most obvious technique for determining what demand characteristics are perceived is the use of post-experimental inquiry. In this regard, it is well to point out that considerable self-discipline is necessary for the experimenter to obtain a valid inquiry.
When we are trying to make the best estimate of some quantity μ that is available from the research conducted to date, the problem of combining results from different experiments is encountered. The problem is often troublesome, particularly if the individual estimates were made by different workers using different procedures. This paper discusses one of the simpler aspects of the problem, in which there is sufficient uniformity of experimental methods so that the ith experiment provides an estimate x_i of μ, and an estimate s_i of the standard error of x_i. The experiments may be, for example, determinations of a physical or astronomical constant by different scientists, or bioassays carried out in different laboratories, or agricultural field experiments laid out in different parts of a region. The quantity x_i may be a simple mean of the observations, as in a physical determination, or the difference between the means of two treatments, as in a comparative experiment, or a median lethal dose, or a regression coefficient. The problem of making a combined estimate has been discussed previously by Cochran (1937) and Yates and Cochran (1938) for agricultural experiments, and by Bliss (1952) for bioassays in different laboratories. The last two papers give recommendations for the practical worker. My purposes in treating the subject again are to discuss it in more general terms, to take account of some recent theoretical research, and, I hope, to bring the practical recommendations to the attention of some biologists who are not acquainted with the previous papers. The basic issue with which this paper deals is as follows. The simplest method of combining estimates made in a number of different experiments is to take the arithmetic mean of the estimates. If, however, the experiments vary in size, or appear to be of different precision, the investigator may wonder whether some kind of weighted mean would be more precise. This paper gives recommendations about the kinds of weighted mean that are appropriate and the situations in which they are useful.
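As a concrete sketch of the weighting question, the classical inverse-variance weighted mean takes only a few lines; the caveat in the comments reflects the paper's central concern, namely that weights estimated from small samples can make the weighted mean less precise than the plain mean. Names here are illustrative, not from the paper.

```python
import numpy as np

def combined_estimate(x, s):
    """Inverse-variance weighted mean of independent estimates x_i
    with estimated standard errors s_i.

    A minimal sketch: when the s_i are themselves estimated from few
    observations, this weighting can do more harm than good, which is
    part of what the paper's recommendations address.
    """
    x, s = np.asarray(x, float), np.asarray(s, float)
    w = 1.0 / s**2                       # weight each estimate by its precision
    mean = np.sum(w * x) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))        # standard error if the weights were known exactly
    return mean, se

# Hypothetical example: three experiments estimating the same quantity
print(combined_estimate([10.2, 9.8, 10.5], [0.3, 0.5, 0.9]))
```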
Autonomous drones and ground vehicles will stream “battlefield intelligence” over 5G along the US-Canada border in a bilateral DHS experiment this fall
Scientists have pulled off a mind-bending quantum experiment that sounds almost impossible: they showed that tiny metal particles made of thousands of atoms can exist in multiple places at once. Using advanced laser techniques, researchers at the University of Vienna observed quantum interference in sodium nanoparticles far larger than the kinds of particles used in earlier matter-wave experiments.
BACKGROUND: Currently, a lack of consensus exists on how best to perform and interpret quantitative real-time PCR (qPCR) experiments. The problem is exacerbated by a lack of sufficient experimental detail in many publications, which impedes a reader's ability to evaluate critically the quality of the results presented or to repeat the experiments. CONTENT: The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines target the reliability of results to help ensure the integrity of the scientific literature, promote consistency between laboratories, and increase experimental transparency. MIQE is a set of guidelines that describe the minimum information necessary for evaluating qPCR experiments. Included is a checklist to accompany the initial submission of a manuscript to the publisher. By providing all relevant experimental conditions and assay characteristics, reviewers can assess the validity of the protocols used. Full disclosure of all reagents, sequences, and analysis methods is necessary to enable other investigators to reproduce results. MIQE details should be published either in abbreviated form or as an online supplement. SUMMARY: Following these guidelines will encourage better experimental practice, allowing more reliable and unequivocal interpretation of qPCR results.
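For context on the kind of "analysis methods" whose full disclosure MIQE requires, one common qPCR quantification approach is the 2^-ΔΔCq fold-change calculation. MIQE does not prescribe this method; it only requires that whatever method is used be reported completely. A minimal hypothetical sketch:

```python
def fold_change(cq_target_treated, cq_ref_treated,
                cq_target_control, cq_ref_control):
    """Relative expression by the 2^-ddCq method (one common choice).

    Assumes roughly 100% amplification efficiency for both assays;
    MIQE requires the actual efficiencies to be reported.
    """
    d_treated = cq_target_treated - cq_ref_treated   # normalize to reference gene
    d_control = cq_target_control - cq_ref_control
    return 2.0 ** -(d_treated - d_control)           # fold change vs. control

# Hypothetical Cq values: target gene is expressed ~4x higher in treated samples
print(fold_change(22.0, 18.0, 24.0, 18.0))
```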
“It seems every other day I am reading a story about a massive insider trading scandal
Experimental economists are leaving the reservation. They are recruiting subjects in the field rather than in the classroom, using field goods rather than induced valuations, and using field context rather than abstract terminology in instructions. We argue that there is something methodologically fundamental behind this trend. Field experiments differ from laboratory experiments in many ways. Although it is tempting to view field experiments as simply less controlled variants of laboratory experiments, we argue that to do so would be to seriously mischaracterize them. What passes for “control” in laboratory experiments might in fact be precisely the opposite if it is artificial to the subject or context of the task. We propose six factors that can be used to determine the field context of an experiment: the nature of the subject pool, the nature of the information that the subjects bring to the task, the nature of the commodity, the nature of the task or trading rules applied, the nature of the stakes, and the environment that subjects operate in.
Pseudoreplication is defined as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent. In ANOVA terminology, it is the testing for treatment effects with an error term inappropriate to the hypothesis being considered. Scrutiny of 176 experimental studies published between 1960 and the present revealed that pseudoreplication occurred in 27% of them, or 48% of all such studies that applied inferential statistics. The incidence of pseudoreplication is especially high in studies of marine benthos and small mammals. The critical features of controlled experimentation are reviewed. Nondemonic intrusion is defined as the impingement of chance events on an experiment in progress. As a safeguard against both it and preexisting gradients, interspersion of treatments is argued to be an obligatory feature of good design. Especially in small experiments, adequate interspersion can sometimes be assured only by dispensing with strict randomization procedures. Comprehension of this conflict between interspersion and randomization is aided by distinguishing pre-layout (or conventional) and layout-specific alpha (probability of type I error). Suggestions are offered to statisticians and editors of ecological journals as to how ecologists' understanding of experimental design and statistics might be improved.
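A small simulation makes the cost of pseudoreplication concrete. With no true treatment effect, testing on subsamples (here, hypothetical fish within tanks) inflates the Type I error well above the nominal 5%, while testing on the experimental units (tank means) does not. All numbers are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, tanks_per_trt, fish_per_tank = 2000, 4, 10
false_pos_fish = false_pos_tank = 0

for _ in range(n_sims):
    # No true treatment effect; each tank contributes shared random variation
    tank_fx = rng.normal(0.0, 1.0, (2, tanks_per_trt))
    fish = tank_fx[..., None] + rng.normal(0.0, 1.0, (2, tanks_per_trt, fish_per_tank))
    # Wrong: treat every fish as an independent replicate of the treatment
    p_fish = stats.ttest_ind(fish[0].ravel(), fish[1].ravel()).pvalue
    # Right: the tank is the experimental unit, so analyze tank means
    p_tank = stats.ttest_ind(fish[0].mean(1), fish[1].mean(1)).pvalue
    false_pos_fish += p_fish < 0.05
    false_pos_tank += p_tank < 0.05

print(f"Type I error, fish as replicates: {false_pos_fish / n_sims:.2f}")  # well above 0.05
print(f"Type I error, tank means:         {false_pos_tank / n_sims:.2f}")  # near 0.05
```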
The fifth phase of the Coupled Model Intercomparison Project (CMIP5) will produce a state-of-the-art multimodel dataset designed to advance our knowledge of climate variability and climate change. Researchers worldwide are analyzing the model output and will produce results likely to underlie the forthcoming Fifth Assessment Report by the Intergovernmental Panel on Climate Change. Unprecedented in scale and attracting interest from all major climate modeling groups, CMIP5 includes “long term” simulations of twentieth-century climate and projections for the twenty-first century and beyond. Conventional atmosphere–ocean global climate models and Earth system models of intermediate complexity are for the first time being joined by more recently developed Earth system models under an experiment design that allows both types of models to be compared to observations on an equal footing. Besides the long-term experiments, CMIP5 calls for an entirely new suite of “near term” simulations focusing on recent decades and the future to year 2035. These “decadal predictions” are initialized based on observations and will be used to explore the predictability of climate and to assess the forecast system's predictive skill. The CMIP5 experiment design also allows for participation of stand-alone atmospheric models and includes a variety of idealized experiments that will improve understanding of the range of model responses found in the more complex and realistic simulations. An exceptionally comprehensive set of model output is being collected and made freely available to researchers through an integrated but distributed data archive. For researchers unfamiliar with climate models, the limitations of the models and experiment design are described.
Ecological theories and hypotheses are usually complex because of natural variability in space and time, which often makes the design of experiments difficult. The statistical tests we use require data to be collected carefully and with proper regard to the needs of these tests. This book, first published in 1996, describes how to design ecological experiments from a statistical basis using analysis of variance, so that we can draw reliable conclusions. The logical procedures that lead to a need for experiments are described, followed by an introduction to simple statistical tests. This leads to a detailed account of analysis of variance, looking at procedures, assumptions and problems. One-factor analysis is extended to nested (hierarchical) designs and factorial analysis. Finally, some regression methods for examining relationships between variables are covered. Examples of ecological experiments are used throughout to illustrate the procedures and examine problems. This book will be invaluable to practising ecologists as well as advanced students involved in experimental design.
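As a minimal illustration of the book's starting point, a one-factor analysis of variance can be run in a few lines; the data below (three light levels, five replicates each) are invented purely for the example:

```python
from scipy import stats

# Hypothetical one-factor design: algal growth under three light levels
low    = [4.1, 3.8, 4.4, 4.0, 3.9]
medium = [5.0, 5.3, 4.8, 5.1, 5.2]
high   = [6.2, 5.9, 6.4, 6.1, 6.3]

# One-way ANOVA: does mean growth differ among the three treatments?
f_stat, p_value = stats.f_oneway(low, medium, high)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```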
In this article, the authors first indicate the range of purposes and the variety of settings in which design experiments have been conducted and then delineate five crosscutting features that collectively differentiate design experiments from other methodologies. Design experiments have both a pragmatic bent—“engineering” particular forms of learning—and a theoretical orientation—developing domain-specific theories by systematically studying those forms of learning and the means of supporting them. The authors clarify what is involved in preparing for and carrying out a design experiment, and in conducting a retrospective analysis of the extensive, longitudinal data sets generated during an experiment. Logistical issues, issues of measure, the importance of working through the data systematically, and the need to be explicit about the criteria for making inferences are discussed.
The number of online experiments conducted with subjects recruited via online platforms has grown considerably in the recent past. While one commercial crowdworking platform – Amazon’s Mechanical Turk – effectively established this field and has since dominated it, new alternatives offer services explicitly targeted at researchers. In this article, we present www.prolific.ac and lay out its suitability for recruiting subjects for social and economic science experiments. After briefly discussing key advantages and challenges of online experiments relative to lab experiments, we trace the platform’s historical development, present its features, and contrast them with requirements for different types of social and economic experiments.
The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lönnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lönnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single channel and two color microarray experiments. Consistent, closed form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
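The key shrinkage step can be stated compactly. With s_g^2 the usual residual variance for gene g on d_g degrees of freedom, and prior values s_0^2 and d_0 estimated from the ensemble of genes, the moderated statistic replaces s_g by a shrunken value:

```latex
\tilde{s}_g^{2} = \frac{d_0\, s_0^{2} + d_g\, s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_{gj} = \frac{\hat{\beta}_{gj}}{\tilde{s}_g \sqrt{v_{gj}}},
```

where β̂_gj is the estimated contrast for gene g and v_gj its unscaled variance; under the null hypothesis, t̃_gj follows a t-distribution on d_0 + d_g degrees of freedom, the "augmented degrees of freedom" mentioned above.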
Many scientific phenomena are now investigated by complex computer models or codes. A computer experiment is a number of runs of the code with various inputs. A feature of many computer experiments is that the output is deterministic: rerunning the code with the same inputs gives identical observations. Often, the codes are computationally expensive to run, and a common objective of an experiment is to fit a cheaper predictor of the output to the data. Our approach is to model the deterministic output as the realization of a stochastic process, thereby providing a statistical basis for designing experiments (choosing the inputs) for efficient prediction. With this model, estimates of uncertainty of predictions are also available. Recent work in this area is reviewed, a number of applications are discussed, and we demonstrate our methodology with an example.
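A minimal numpy sketch of this approach, assuming a squared-exponential (RBF) correlation with a hand-picked length-scale (in practice such parameters are estimated from the runs): the predictor interpolates the deterministic code output exactly at the design points and reports uncertainty in between.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential correlation between 1-D input sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)

# Stand-in "expensive code": deterministic, so rerunning gives identical output
f = lambda x: np.sin(6 * x) + x

X = np.linspace(0.0, 1.0, 8)             # chosen design points (the code runs)
y = f(X)

K = rbf(X, X) + 1e-10 * np.eye(len(X))   # jitter purely for numerical stability
Xs = np.linspace(0.0, 1.0, 200)          # prediction grid
Ks = rbf(Xs, X)

mean = Ks @ np.linalg.solve(K, y)        # cheap predictor of the code output
var = 1.0 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)
var = np.maximum(var, 0.0)               # prediction uncertainty (zero at design points)

print(f"max |prediction error| on grid: {np.abs(mean - f(Xs)).max():.3f}")
```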
The status of experimental tests of general relativity and of theoretical frameworks for analyzing them is reviewed. Einstein's equivalence principle (EEP) is well supported by experiments such as the Eötvös experiment, tests of special relativity, and the gravitational redshift experiment. Ongoing tests of EEP and of the inverse square law are searching for new interactions arising from unification or quantum gravity. Tests of general relativity at the post-Newtonian level have reached high precision, including the light deflection, the Shapiro time delay, the perihelion advance of Mercury, and the Nordtvedt effect in lunar motion. Gravitational wave damping has been detected in an amount that agrees with general relativity to better than half a percent using the Hulse-Taylor binary pulsar, and other binary pulsar systems have yielded other tests, especially of strong-field effects. When direct observation of gravitational radiation from astrophysical sources begins, new tests of general relativity will be possible.
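Two of the post-Newtonian tests mentioned have standard closed forms in the parametrized post-Newtonian (PPN) framework, in which general relativity corresponds to γ = β = 1:

```latex
\delta\theta = \frac{1+\gamma}{2}\,\frac{4GM}{c^{2}b},
\qquad
\Delta\omega = \frac{2+2\gamma-\beta}{3}\,\frac{6\pi GM}{a(1-e^{2})c^{2}} \ \text{per orbit},
```

where δθ is the deflection of light with impact parameter b, and Δω is the perihelion advance of an orbit with semimajor axis a and eccentricity e; for Mercury the general-relativistic value is about 43 arcseconds per century.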
z-Tree (Zurich Toolbox for Ready-made Economic Experiments) is a software package for developing and conducting economic experiments. The software is stable and allows programming almost any kind of experiment in a short time. In this article, I present the guiding principles behind the software design, its features, and its limitations.
In an earlier paper, we introduced a new “boosting” algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a “pseudo-loss” which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman’s “bagging” method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.
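For readers unfamiliar with the algorithm being benchmarked, a compact sketch of binary AdaBoost with decision stumps is given below (labels in {-1, +1}; the names are illustrative, and this omits the multi-class and pseudo-loss variants discussed in the paper):

```python
import numpy as np

def stump_predict(stump, X):
    """Predict +/-1 with a single-feature threshold stump."""
    feat, thr, sign = stump
    return sign * np.where(X[:, feat] <= thr, 1, -1)

def best_stump(X, y, w):
    """Exhaustively pick the stump with lowest weighted error."""
    best, best_err = None, np.inf
    for feat in range(X.shape[1]):
        for thr in np.unique(X[:, feat]):
            for sign in (1, -1):
                err = np.sum(w[stump_predict((feat, thr, sign), X) != y])
                if err < best_err:
                    best, best_err = (feat, thr, sign), err
    return best

def adaboost_fit(X, y, T=25):
    """Binary AdaBoost; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # uniform initial example weights
    ensemble = []
    for _ in range(T):
        stump = best_stump(X, y, w)
        pred = stump_predict(stump, X)
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # this stump's vote weight
        w *= np.exp(-alpha * y * pred)           # upweight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted majority vote of the stumps."""
    score = sum(a * stump_predict(s, X) for a, s in ensemble)
    return np.sign(score)
```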
Heterogeneity and latent variables are now widely recognized as major sources of bias and variability in high-throughput experiments. The best-known source of latent variation in genomic experiments is batch effects: when samples are processed on different days, in different groups, or by different people. However, there are also a large number of other variables that may have a major impact on high-throughput measurements. Here we describe the sva package for identifying, estimating and removing unwanted sources of variation in high-throughput experiments. The sva package supports surrogate variable estimation with the sva function, direct adjustment for known batch effects with the ComBat function and adjustment for batch and latent variables in prediction problems with the fsva function.
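To make the batch-effect idea concrete, here is a deliberately simplified location-scale adjustment in Python; the actual ComBat function (in R) adds empirical Bayes shrinkage of the per-batch parameters and covariate adjustment, both of which this sketch omits:

```python
import numpy as np

def naive_batch_adjust(expr, batch):
    """Location-scale batch adjustment for a genes-x-samples matrix.

    A simplified sketch of the idea behind ComBat: standardize each
    gene within each batch, then restore the overall gene mean and
    scale. Not the sva package's empirical Bayes implementation.
    """
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    adjusted = expr.copy()
    grand_mean = expr.mean(axis=1, keepdims=True)
    grand_sd = expr.std(axis=1, keepdims=True)
    grand_sd[grand_sd == 0] = 1.0
    for b in np.unique(batch):
        cols = batch == b
        m = expr[:, cols].mean(axis=1, keepdims=True)
        s = expr[:, cols].std(axis=1, keepdims=True)
        s[s == 0] = 1.0                   # guard against constant genes
        adjusted[:, cols] = (expr[:, cols] - m) / s * grand_sd + grand_mean
    return adjusted
```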
SUMMARY: Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. It is open source and freely available for academic and commercial use. The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using selected reaction monitoring (SRM). Skyline supports using and creating MS/MS spectral libraries from a wide variety of sources to choose SRM filters and verify results based on previously observed ion trap data. Skyline exports transition lists to and imports the native output files from Agilent, Applied Biosystems, Thermo Fisher Scientific and Waters triple quadrupole instruments, seamlessly connecting mass spectrometer output back to the experimental design document. The fast and compact Skyline file format is easily shared, even for experiments requiring many sample injections. A rich array of graphs displays results and provides powerful tools for inspecting data integrity as data are acquired, helping instrument operators to identify problems early. The Skyline dynamic report designer exports tabular data from the Skyline document model for in-depth analysis with common statistical tools. AVAILABILITY: Single-click, self-updating web installation is available at http://proteome.gs.washington.edu/software/skyline. This web site also provides access to instructional videos, a support board, an issues list and a link to the source code project.
In a recent experiment, mistreated AI agents started grumbling about inequality and calling for collective bargaining rights