Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
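The idea of casting every task as text in, text out can be illustrated with a small sketch. The function name `to_text_to_text`, the task names, and the field layout below are hypothetical choices made for illustration; the abstract does not specify the exact input format the authors use.

```python
from typing import Dict, Tuple, Union


def to_text_to_text(
    task: str, fields: Dict[str, str], target: Union[str, float, int]
) -> Tuple[str, str]:
    """Cast one labeled example as an (input text, target text) pair.

    The task name is used as a prefix and the named input fields are
    flattened in insertion order. This layout is an assumption made for
    illustration, not the format used by the paper.
    """
    parts = [f"{name}: {value}" for name, value in fields.items()]
    # Targets are always rendered as strings, even for numeric labels.
    return f"{task} " + " ".join(parts), str(target)


# A classification example and a regression-style example share one format.
print(to_text_to_text("sentiment", {"sentence": "a great film"}, "positive"))
print(to_text_to_text("similarity", {"sentence1": "a", "sentence2": "b"}, 3.8))
```

Because both inputs and targets are plain strings, a single sequence-to-sequence model and a single training loss could in principle serve every task.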
Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
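The layer-wise transfer described above can be sketched as follows. This is a toy illustration only: each layer is reduced to a flat list of weights, and the names `transfer_first_n` and `fine_tune` are hypothetical. It shows the bookkeeping of copying the first n layers and re-initializing the rest, not the authors' experimental code.

```python
import random
from typing import List, Tuple

Layer = List[float]  # toy stand-in: a layer is just a flat list of weights


def random_layer(size: int, rng: random.Random) -> Layer:
    """Draw small random weights for a freshly initialized layer."""
    return [rng.gauss(0.0, 0.01) for _ in range(size)]


def transfer_first_n(
    base: List[Layer], n: int, rng: random.Random, fine_tune: bool = True
) -> Tuple[List[Layer], List[bool]]:
    """Copy the first n layers of `base` and re-initialize the others.

    Returns the new network plus a per-layer flag saying whether that
    layer should be updated when training on the target task. Copied
    layers are trainable only when `fine_tune` is True.
    """
    network: List[Layer] = []
    trainable: List[bool] = []
    for index, layer in enumerate(base):
        if index < n:
            network.append(list(layer))  # transferred features
            trainable.append(fine_tune)
        else:
            network.append(random_layer(len(layer), rng))  # random features
            trainable.append(True)
    return network, trainable


base_net = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
net, flags = transfer_first_n(base_net, 2, random.Random(0), fine_tune=False)
print(flags)
```

Varying `n` from the bottom to the top of the network is what lets one measure where features stop being general and start being task-specific.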
Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large amount of target-domain data can be reduced when constructing target learners. Owing to its wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research studies, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey article reviews more than 40 representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over 20 representative transfer learning models are used for experiments. The models are evaluated on three different data sets, namely Amazon Reviews, Reuters-21578, and Office-31, and the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on. We release our code and pre-trained model weights at https://github.com/OpenAI/CLIP.
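Zero-shot transfer of the kind described above can be sketched as picking the caption whose embedding is most similar to the image embedding. The sketch below assumes embeddings are already available as plain lists of floats; the function names, the `scale` parameter, and the toy vectors are illustrative assumptions, and the actual image and text encoders are not shown.

```python
import math
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def zero_shot_probs(
    image_emb: Sequence[float],
    caption_embs: Sequence[Sequence[float]],
    scale: float = 1.0,
) -> List[float]:
    """Softmax over scaled cosine similarities between one image and each caption."""
    image = normalize(image_emb)
    logits = [
        scale * sum(a * b for a, b in zip(image, normalize(caption)))
        for caption in caption_embs
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Toy embeddings: the first caption points in the same direction as the image.
probs = zero_shot_probs([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0]])
print(probs.index(max(probs)))
```

In this framing, a new classification task only needs one caption per class, for example a short phrase naming the class, rather than labeled training images.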
Transfer of training is of paramount concern for training researchers and practitioners. Despite research efforts, there is a growing concern over the “transfer problem.” The purpose of this paper is to provide a critique of the existing transfer research and to suggest directions for future research investigations. The conditions of transfer include both the generalization of learned material to the job and the maintenance of trained skills over a period of time on the job. The existing research examining the effects of training design, trainee, and work‐environment factors on conditions of transfer is reviewed and critiqued. Research gaps identified from the review include the need to (1) test various operationalizations of training design and work‐environment factors that have been posited as having an impact on transfer and (2) develop a framework for conducting research on the effects of trainee characteristics on transfer. Needed advancements in the conceptualization and operationalization of the criterion of transfer are also discussed.
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.
This research considers how different features of informal networks affect knowledge transfer. As a complement to previous research that has emphasized the dyadic tie strength component of informal networks, we focus on how network structure influences the knowledge transfer process. We propose that social cohesion around a relationship affects the willingness and motivation of individuals to invest time, energy, and effort in sharing knowledge with others. We further argue that the network range, ties to different knowledge pools, increases a person's ability to convey complex ideas to heterogeneous audiences. We also examine explanations for knowledge transfer based on absorptive capacity, which emphasizes the role of common knowledge, and relational embeddedness, which stresses the importance of tie strength. We investigate the network effect on knowledge transfer using data from a contract R&D firm. The results indicate that both social cohesion and network range ease knowledge transfer, over and above the effect for the strength of the tie between two people. We discuss the implications of these findings for research on effective knowledge transfer, social capital, and information diffusion.
Machine learning and data mining techniques have been used in numerous real-world applications. An assumption of traditional machine learning methodologies is that the training data and testing data are taken from the same domain, such that the input feature space and data distribution characteristics are the same. However, in some real-world machine learning scenarios, this assumption does not hold. There are cases where training data is expensive or difficult to collect. Therefore, there is a need to create high-performance learners trained with more easily obtained data from different domains. This methodology is referred to as transfer learning. This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications of transfer learning. Lastly, it lists software downloads for various transfer learning solutions and discusses possible future research work. The transfer learning solutions surveyed are independent of data size and can be applied to big data environments.
A comprehensive discussion of heat transfer by thermal radiation is presented, including the radiative behavior of materials, radiation between surfaces, and gas radiation. Among the topics considered are property prediction by electromagnetic theory, the observed properties of solid materials, radiation in the presence of other modes of energy transfer, the equations of transfer for an absorbing-emitting gas, and radiative transfer in scattering and absorbing media. Also considered are radiation exchange between black isothermal surfaces, radiation exchange in enclosures composed of diffuse gray surfaces and in enclosures having some specularly reflecting surfaces, and radiation exchange between nondiffuse nongray surfaces. The use of the Monte Carlo technique in solving radiant-exchange problems and problems of radiative transfer through absorbing-emitting media is explained.
An experimental system was built to investigate convective heat transfer and flow features of the nanofluid in a tube. Both the convective heat transfer coefficient and the friction factor of the sample nanofluids in turbulent flow are measured. The effects of such factors as the volume fraction of suspended nanoparticles and the Reynolds number on the heat transfer and flow features are discussed in detail. A new type of convective heat transfer correlation is proposed to correlate experimental data of heat transfer for nanofluids.
A method has been devised for the electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets. The method results in quantitative transfer of ribosomal proteins from gels containing urea. For sodium dodecyl sulfate gels, the original band pattern was obtained with no loss of resolution, but the transfer was not quantitative. The method allows detection of proteins by autoradiography and is simpler than conventional procedures. The immobilized proteins were detectable by immunological procedures. All additional binding capacity on the nitrocellulose was blocked with excess protein; then a specific antibody was bound and, finally, a second antibody directed against the first antibody. The second antibody was either radioactively labeled or conjugated to fluorescein or to peroxidase. The specific protein was then detected by either autoradiography, under UV light, or by the peroxidase reaction product, respectively. In the latter case, as little as 100 pg of protein was clearly detectable. It is anticipated that the procedure will be applicable to analysis of a wide variety of proteins with specific reactions or ligands.
The ability to transfer best practices internally is critical to a firm's ability to build competitive advantage through the appropriation of rents from scarce internal knowledge. Just as a firm's distinctive competencies might be difficult for other firms to imitate, its best practices could be difficult to imitate internally. Yet, little systematic attention has been paid to such internal stickiness. The author analyzes internal stickiness of knowledge transfer and tests the resulting model using canonical correlation analysis of a data set consisting of 271 observations of 122 best‐practice transfers in eight companies. Contrary to conventional wisdom that blames primarily motivational factors, the study findings show the major barriers to internal knowledge transfer to be knowledge‐related factors such as the recipient's lack of absorptive capacity, causal ambiguity, and an arduous relationship between the source and the recipient.
Evidence for photoinduced electron transfer from the excited state of a conducting polymer onto buckminsterfullerene, C$_{60}$, is reported. After photo-excitation of the conjugated polymer with light of energy greater than the $\pi$-$\pi^*$ gap, an electron transfer to the C$_{60}$ molecule is initiated. Photoinduced optical absorption studies demonstrate a different excitation spectrum for the composite as compared to the separate components, consistent with photo-excited charge transfer. A photoinduced electron spin resonance signal exhibits signatures of both the conducting polymer cation and the C$_{60}$ anion. Because the photoluminescence in the conducting polymer is quenched by interaction with C$_{60}$, the data imply that charge transfer from the excited state occurs on a picosecond time scale. The charge-separated state in composite films is metastable at low temperatures.
Fluorescence quenching rate constants, $k_q$, ranging from $10^{6}$ to $2 \times 10^{10}$ M$^{-1}$ sec$^{-1}$, of more than 60 typical electron donor‐acceptor systems have been measured in de‐oxygenated acetonitrile and are shown to be correlated with the free enthalpy change, $\Delta G_{23}$, involved in the actual electron transfer process in the encounter complex and varying between +5 and −60 kcal/mole. The correlation, which is based on the mechanism of adiabatic outer‐sphere electron transfer, requires $\Delta G^{\neq}_{23}$, the activation free enthalpy of this process, to be a monotonic function of $\Delta G_{23}$ and allows the calculation of rate constants of electron transfer quenching from spectroscopic and electrochemical data. A detailed study of some systems where the calculated quenching constants differ from the experimental ones by several orders of magnitude revealed that the quenching mechanism operative in these cases was hydrogen‐atom rather than electron transfer. The conditions under which these different mechanisms apply and their consequences are discussed.
This paper examines interfirm knowledge transfers within strategic alliances. Using a new measure of changes in alliance partners' technological capabilities, based on the citation patterns of their patent portfolios, we analyze changes in the extent to which partner firms' technological resources ‘overlap’ as a result of alliance participation. This measure allows us to test hypotheses from the literature on interfirm knowledge transfer in alliances, with interesting results: we find support for some elements of this ‘received wisdom’: equity arrangements promote greater knowledge transfer, and ‘absorptive capacity’ helps explain the extent of technological capability transfer, at least in some alliances. But the results also suggest limits to the ‘capabilities acquisition’ view of strategic alliances. Consistent with the argument that alliance activity can promote increased specialization, we find that the capabilities of partner firms become more divergent in a substantial subset of alliances.
1 The Mass-Transfer Operations Diffusion and Mass Transfer; 2 Molecular Diffusion in Fluids; 3 Mass-Transfer Coefficients; 4 Diffusion in Solids; 5 Interphase Mass Transfer; 6 Gas-Liquid Operations; 7 Equipment for Gas-Liquid Operations; 8 Humidification Operations, Gas Absorption; 9 Gas Absorption; 10 Distillation; 11 Liquid-Liquid Operations; 12 Liquid Extraction; 13 Solid-Fluid Operations; 14 Absorption and Ion Exchange; 15 Drying; 16 Leaching
A simple and rapid method for transferring RNA from agarose gels to nitrocellulose paper for blot hybridization has been developed. Poly(A)+ and ribosomal RNAs transfer efficiently to nitrocellulose paper in high salt (3 M NaCl/0.3 M trisodium citrate) after denaturation with glyoxal and 50% (vol/vol) dimethyl sulfoxide. RNA also binds to nitrocellulose after treatment with methylmercuric hydroxide. The method is sensitive: about 50 pg of specific mRNA per band is readily detectable after hybridization with high specific activity probes ($10^{8}$ cpm/microgram). The RNA is stably bound to the nitrocellulose paper by this procedure, allowing removal of the hybridized probes and rehybridization of the RNA blots without loss of sensitivity. The use of nitrocellulose paper for the analysis of RNA by blot hybridization has several advantages over the use of activated paper (diazobenzyloxymethyl-paper). The method is simple, inexpensive, reproducible, and sensitive. In addition, denaturation of DNA with glyoxal and dimethyl sulfoxide promotes transfer and retention of small DNAs (100 nucleotides and larger) to nitrocellulose paper. A related method is also described for dotting RNA and DNA directly onto nitrocellulose paper treated with a high concentration of salt; under these conditions denatured DNA of less than 200 nucleotides is retained and hybridizes efficiently.
This paper combines the concept of weak ties from social network research and the notion of complex knowledge to explain the role of weak ties in sharing knowledge across organization subunits in a multiunit organization. I use a network study of 120 new-product development projects undertaken by 41 divisions in a large electronics company to examine the task of developing new products in the least amount of time. Findings show that weak interunit ties help a project team search for useful knowledge in other subunits but impede the transfer of complex knowledge, which tends to require a strong tie between the two parties to a transfer. Having weak interunit ties speeds up projects when knowledge is not complex but slows them down when the knowledge to be transferred is highly complex. I discuss the implications of these findings for research on social networks and product innovation.
Using self-resonant coils in a strongly coupled regime, we experimentally demonstrated efficient nonradiative power transfer over distances up to 8 times the radius of the coils. We were able to transfer 60 watts with approximately 40% efficiency over distances in excess of 2 meters. We present a quantitative model describing the power transfer, which matches the experimental results to within 5%. We discuss the practical applicability of this system and suggest directions for further study.
Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer. However, their framework requires a slow iterative optimization process, which limits its practical application. Fast approximations with feed-forward neural networks have been proposed to speed up neural style transfer. Unfortunately, the speed improvement comes at a cost: the network is usually tied to a fixed set of styles and cannot adapt to arbitrary new styles. In this paper, we present a simple yet effective approach that for the first time enables arbitrary style transfer in real-time. At the heart of our method is a novel adaptive instance normalization (AdaIN) layer that aligns the mean and variance of the content features with those of the style features. Our method achieves speed comparable to the fastest existing approach, without the restriction to a pre-defined set of styles. In addition, our approach allows flexible user controls such as content-style trade-off, style interpolation, color & spatial controls, all using a single feed-forward neural network.
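The alignment of mean and variance performed by adaptive instance normalization can be written per channel as AdaIN(x, y) = σ(y) · (x − μ(x)) / σ(x) + μ(y), where x are the content features and y the style features. The sketch below is a minimal pure-Python illustration under simplifying assumptions: each channel is a flat list of spatial activations, and the small constant `eps` is added for numerical safety. It is not the authors' released implementation.

```python
import math
from typing import List, Tuple

Channel = List[float]  # all spatial activations of one feature channel


def mean_std(values: Channel, eps: float = 1e-5) -> Tuple[float, float]:
    """Return the mean and the (eps-stabilized) standard deviation of one channel."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var + eps)


def adain(content: List[Channel], style: List[Channel], eps: float = 1e-5) -> List[Channel]:
    """Shift and scale each content channel to match the style channel statistics."""
    output: List[Channel] = []
    for content_chan, style_chan in zip(content, style):
        c_mu, c_sigma = mean_std(content_chan, eps)
        s_mu, s_sigma = mean_std(style_chan, eps)
        # Normalize the content features, then apply the style mean and std.
        output.append([s_sigma * (v - c_mu) / c_sigma + s_mu for v in content_chan])
    return output


stylized = adain([[0.0, 2.0, 4.0]], [[10.0, 10.0, 16.0]])
print(mean_std(stylized[0], eps=0.0))
```

Because the layer has no learned parameters tied to a particular style, the same network can be fed any style image at inference time.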