Prosody modification involves changing the pitch and duration of speech without affecting the message and naturalness. This paper proposes a method for prosody (pitch and duration) modification using the instants of significant excitation of the vocal tract system during the production of speech. The instants of significant excitation correspond to the instants of glottal closure (epochs) in the case of voiced speech, and to some random excitations like onset of burst in the case of nonvoiced speech. Instants of significant excitation are computed from the linear prediction (LP) residual of speech signals by using the property of average group-delay of minimum phase signals. The modification of pitch and duration is achieved by manipulating the LP residual with the help of the knowledge of the instants of significant excitation. The modified residual is used to excite the time-varying filter, whose parameters are derived from the original speech signal. Perceptual quality of the synthesized speech is good and is without any significant distortion. The proposed method is evaluated using waveforms, spectrograms, and listening tests. The performance of the method is compared with linear prediction pitch synchronous overlap and add (LP-PSOLA) method, which is another method for prosody manipulation based on the modification of the LP residual. The original and the synthesized speech signals obtained by the proposed method and by the LP-PSOLA method are available for listening at http://speech.cs.iitm.ernet.in/Main/result/prosody.html.
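The first step named in the abstract above, obtaining the LP residual by inverse filtering, can be sketched as follows. This is a minimal illustration only, assuming the autocorrelation method with a Levinson-Durbin recursion. The function names `lp_coefficients` and `lp_residual` are hypothetical, and the group-delay epoch extraction and the residual manipulation are not shown.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (Levinson-Durbin).

    Returns a with a[0] = 1, so the inverse filter is A(z) = sum_k a[k] z^-k.
    """
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy at order 0
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient for order i
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def lp_residual(frame, a):
    """Inverse-filter the frame: e[n] = sum_k a[k] * s[n - k]."""
    return np.convolve(frame, a)[:len(frame)]
```

In practice the analysis would be run frame by frame on windowed speech. The frame length and LP order are choices the abstract does not specify.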
This paper presents a new approach for solving optimal control problems for switched systems. We focus on problems in which a prespecified sequence of active subsystems is given. For such problems, we need to seek both the optimal switching instants and the optimal continuous inputs. In order to search for the optimal switching instants, the derivatives of the optimal cost with respect to the switching instants need to be known. The most important contribution of the paper is a method which first transcribes an optimal control problem into an equivalent problem parameterized by the switching instants and then obtains the values of the derivatives based on the solution of a two point boundary value differential algebraic equation formed by the state, costate, stationarity equations, the boundary and continuity conditions, along with their differentiations. This method is applied to general switched linear quadratic problems and an efficient method based on the solution of an initial value ordinary differential equation is developed. An extension of the method is also applied to problems with internally forced switching. Examples are shown to illustrate the results in the paper.
We present the Dynamic Programming Projected Phase-Slope Algorithm (DYPSA) for automatic estimation of glottal closure instants (GCIs) in voiced speech. Accurate estimation of GCIs is an important tool that can be applied to a wide range of speech processing tasks including speech analysis, synthesis and coding. DYPSA is automatic and operates using the speech signal alone without the need for an EGG signal. The algorithm employs the phase-slope function and a novel phase-slope projection technique for estimating GCI candidates from the speech signal. The most likely candidates are then selected using a dynamic programming technique to minimize a cost function that we define. We review and evaluate three existing methods of GCI estimation and compare the new DYPSA algorithm to them. Results are presented for the APLAWD and SAM databases, for which 95.7% and 93.1% of GCIs, respectively, are correctly identified.
Most R novices will start with Appendix A [A sample session], page 80. This should give some familiarity with the style of R sessions and more importantly some instant feedback on what actually happens. Many users will come to R mainly for its graphical facilities.
Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate. We reduce this cost with a versatile new input encoding that permits the use of a smaller network without sacrificing quality, thus significantly reducing the number of floating point and memory access operations: a small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through stochastic gradient descent. The multiresolution structure allows the network to disambiguate hash collisions, making for a simple architecture that is trivial to parallelize on modern GPUs. We leverage this parallelism by implementing the whole system using fully-fused CUDA kernels with a focus on minimizing wasted bandwidth and compute operations. We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds, and rendering in tens of milliseconds at a resolution of 1920×1080.
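The multiresolution hash table lookup described above can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' CUDA implementation. The XOR-of-primes spatial hash, the constants in `PRIMES`, and the parameters `base_res` and `growth` are assumptions made for the example, and the names `hash_coords` and `encode` are hypothetical.

```python
import numpy as np

# Assumed per-dimension hash multipliers (illustrative choice).
PRIMES = (1, 2654435761, 805459861)

def hash_coords(coords, table_size):
    """Map integer grid coordinates of shape (n, d) to indices in [0, table_size)."""
    coords = coords.astype(np.uint64)
    h = np.zeros(coords.shape[:-1], dtype=np.uint64)
    for i in range(coords.shape[-1]):
        h ^= coords[..., i] * np.uint64(PRIMES[i])
    return (h % np.uint64(table_size)).astype(np.int64)

def encode(x, tables, base_res=4, growth=2.0):
    """Encode points x of shape (n, d) in [0, 1]^d.

    tables holds one (T, F) feature array per resolution level. In training
    these would be the trainable parameters. Returns an (n, levels * F) array.
    """
    n, d = x.shape
    feats = []
    for level, table in enumerate(tables):
        res = int(np.floor(base_res * growth ** level))
        scaled = x * res
        lo = np.floor(scaled).astype(np.int64)
        w = scaled - lo  # fractional position inside the grid cell
        out = np.zeros((n, table.shape[1]))
        for corner in range(2 ** d):
            offs = np.array([(corner >> k) & 1 for k in range(d)])
            idx = hash_coords(lo + offs, table.shape[0])
            # d-linear interpolation weight of this cell corner
            weight = np.prod(np.where(offs == 1, w, 1.0 - w), axis=1)
            out += weight[:, None] * table[idx]
        feats.append(out)
    return np.concatenate(feats, axis=1)
```

The concatenated features would then be fed to the small network. Because each level hashes a different grid resolution, a collision at one level is unlikely to coincide with a collision at the others, which is what lets the network disambiguate them.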
Users have adopted a wide range of digital technologies into their communication repertoire. It remains unclear why they adopt multiple forms of communication instead of substituting one medium for another. It also raises the question: What type of need does each of these media fulfill? In the present article, the authors conduct comparative work that examines the gratifications obtained from Facebook with those from instant messaging. This comparison between media allows one to draw conclusions about how different social media fulfill user needs. Data were collected from undergraduate students through a multimethod study based on 77 surveys and 21 interviews. A factor analysis of gratifications obtained from Facebook revealed six key dimensions: pastime, affection, fashion, share problems, sociability, and social information. Comparative analysis showed that Facebook is about having fun and knowing about the social activities occurring in one’s social network, whereas instant messaging is geared more toward relationship maintenance and development. The authors discuss differences in the two technologies and outline a framework based on uses and gratifications theory as to why young people integrate numerous media into their communication habits.
We present an approach to easily remove the effects of haze from images. It is based on the fact that usually airlight scattered by atmospheric particles is partially polarized. Polarization filtering alone cannot remove the haze effects, except in restricted situations. Our method, however, works under a wide range of atmospheric and viewing conditions. We analyze the image formation process, taking into account polarization effects of atmospheric scattering. We then invert the process to enable the removal of haze from images. The method can be used with as few as two images taken through a polarizer at different orientations. This method works instantly, without relying on changes of weather conditions. We present experimental results of complete dehazing in far-from-ideal conditions for polarization filtering. We obtain a great improvement of scene contrast and correction of color. As a by-product, the method also yields a range (depth) map of the scene, and information about properties of the atmospheric particles.
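The inversion step mentioned above can be sketched under a simple assumed image formation model: the airlight is partially polarized with degree `p`, the directly transmitted scene light is unpolarized, and the airlight saturates at `a_inf` for very distant objects. The model, the parameter names, and the function `dehaze` are assumptions for illustration. The abstract does not give the equations, and in practice `p` and `a_inf` would have to be estimated, for example from sky pixels.

```python
import numpy as np

def dehaze(i_perp, i_par, p, a_inf, eps=1e-6):
    """Recover scene radiance and optical depth from two polarizer images.

    i_perp: image at the polarizer orientation with the most airlight
    i_par:  image at the orientation with the least airlight
    p:      assumed degree of polarization of the airlight (0 < p <= 1)
    a_inf:  assumed airlight radiance at infinite distance
    """
    total = i_perp + i_par              # intensity without a polarizer
    airlight = (i_perp - i_par) / p     # airlight estimate per pixel
    t = np.clip(1.0 - airlight / a_inf, eps, 1.0)  # transmittance
    radiance = (total - airlight) / t   # haze-free scene radiance
    optical_depth = -np.log(t)          # proportional to range
    return radiance, optical_depth
```

The second return value is the range map by-product up to the unknown scattering coefficient: under this model, optical depth equals the scattering coefficient times the distance.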
The debugging cycle is the most common methodology for finding and correcting errors in sequential programs. Cyclic debugging is effective because sequential programs are usually deterministic. Debugging parallel programs is considerably more difficult because successive executions of the same program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay. During program execution we save the relative order of significant events as they occur, not the data associated with such events. As a result, our approach requires less time and space to save the information needed for program replay than other methods. Our technique is not dependent on any particular form of interprocess communication. It provides for replay of an entire program, rather than individual processes in isolation. No centralized bottlenecks are introduced and there is no need for synchronized clocks or a globally consistent logical time. We describe a prototype implementation of Instant Replay on the BBN Butterfly™ Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs.
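The idea of saving the relative order of events rather than their data can be illustrated with per-object version numbers. The sketch below is a simplification made for illustration: it treats every access as exclusive and makes no distinction between readers and writers. The class `SharedObject` and the helper `run` are hypothetical names, not the prototype described in the abstract.

```python
import threading

class SharedObject:
    """Shared value whose access order can be recorded and later replayed."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0  # incremented on every access
        self._cond = threading.Condition()

    def access(self, update, history, replay=False):
        with self._cond:
            if replay:
                # Wait until the object reaches the version this access saw
                # during the recorded run (consumes one history entry).
                target = history.pop(0)
                self._cond.wait_for(lambda: self.version == target)
            else:
                # Record only the version number, not the data.
                history.append(self.version)
            self.value = update(self.value)
            self.version += 1
            self._cond.notify_all()

def run(histories, replay, steps=5):
    """Run one worker thread per history list; return the final shared value."""
    obj = SharedObject()

    def worker(tid):
        for _ in range(steps):
            # Order-dependent update, so different interleavings give different results.
            obj.access(lambda v: (v * 3 + tid) % 1000003, histories[tid], replay)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(len(histories))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.value
```

Each thread keeps its own history, so recording needs no central log. Replaying with copies of the recorded histories forces the same interleaving and therefore the same final value.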
An international association advancing the multidisciplinary study of informing systems. Founded in 1998, the Informing Science Institute (ISI) is a global community of academics shaping the future of informing science.
We present a study of anonymized data capturing a month of high-level communication activities within the whole of the Microsoft Messenger instant-messaging system. We examine characteristics and patterns that emerge from the collective dynamics of large numbers of people, rather than the actions and characteristics of individuals. The dataset contains summary properties of 30 billion conversations among 240 million people. From the data, we construct a communication graph with 180 million nodes and 1.3 billion undirected edges, creating the largest social network constructed and analyzed to date. We report on multiple aspects of the dataset and synthesized graph. We find that the graph is well-connected and robust to node removal. We investigate on a planetary scale the oft-cited report that people are separated by "six degrees of separation" and find that the average path length among Messenger users is 6.6. We find that people tend to communicate more with each other when they have similar age, language, and location, and that cross-gender conversations are both more frequent and of longer duration than conversations with the same gender.
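The average path length statistic reported above can be computed on a small graph with breadth-first search, as in the sketch below. The function names are hypothetical, and the exhaustive all-sources loop is only practical for small graphs. For a graph with hundreds of millions of nodes one would presumably sample source nodes instead, which is an assumption and not something the abstract states.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (adj: node -> neighbors)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0
```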
We present a fundamental procedure for instant rendering from the radiance equation. Operating directly on the textured scene description, the very efficient and simple algorithm produces photorealistic images without any finite element kernel or solution discretization of the underlying integral equation. Rendering rates of a few seconds are obtained by exploiting graphics hardware, the deterministic technique of the quasi-random walk for the solution of the global illumination problem, and the new method of jittered low discrepancy sampling.
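Quasi-random walks of the kind mentioned above are driven by low discrepancy point sets. One common choice is the Halton sequence, sketched below via the radical inverse. Treating Halton points as the sampler here is an assumption for illustration, the function names are hypothetical, and the jittering step is not shown.

```python
def radical_inverse(index, base):
    """Mirror the base-b digits of index around the radix point."""
    inv = 0.0
    f = 1.0 / base
    while index > 0:
        inv += (index % base) * f
        index //= base
        f /= base
    return inv

def halton(n, bases=(2, 3)):
    """First n Halton points in [0, 1)^d, one coprime base per dimension."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, n + 1)]
```

Unlike pseudo-random samples, successive points fill the unit square evenly, which is what makes the technique deterministic and keeps the number of required samples low.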