共找到 20 条结果
Humans often experience not just a single basic emotion at a time, but rather a blend of several emotions with varying salience. Despite the importance of such blended emotions, most video-based emotion recognition approaches are designed to recognize single emotions only. The few approaches that have attempted to recognize blended emotions typically cannot assess the relative salience of the emotions within a blend. This limitation largely stems from the lack of datasets containing a substantial number of blended emotion samples annotated with relative salience. To address this shortcoming, we introduce BLEMORE, a novel dataset for multimodal (video, audio) blended emotion recognition that includes information on the relative salience of each emotion within a blend. BLEMORE comprises over 3,000 clips from 58 actors, performing 6 basic emotions and 10 distinct blends, where each blend has 3 different salience configurations (50/50, 70/30, and 30/70). Using this dataset, we conduct extensive evaluations of state-of-the-art video classification approaches on two blended emotion prediction tasks: (1) predicting the presence of emotions in a given sample, and (2) predicting the relativ
Blending visual and textual concepts into a new visual concept is a unique and powerful trait of human beings that can fuel creativity. However, in practice, cross-modal conceptual blending for humans is prone to cognitive biases, like design fixation, which leads to local minima in the design space. In this paper, we propose a T2I diffusion adapter "IT-Blender" that can automate the blending process to enhance human creativity. Prior works related to cross-modal conceptual blending are limited in encoding a real image without loss of details or in disentangling the image and text inputs. To address these gaps, IT-Blender leverages pretrained diffusion models (SD and FLUX) to blend the latent representations of a clean reference image with those of the noisy generated image. Combined with our novel blended attention, IT-Blender encodes the real reference image without loss of details and blends the visual concept with the object specified by the text in a disentangled way. Our experiment results show that IT-Blender outperforms the baselines by a large margin in blending visual and textual concepts, shedding light on the new application of image generative models to augment human c
Visual blends combine elements from two distinct visual concepts into a single, integrated image, with the goal of conveying ideas through imaginative and often thought-provoking visuals. Communicating abstract concepts through visual blends poses a series of conceptual and technical challenges. To address these challenges, we introduce Creative Blends, an AI-assisted design system that leverages metaphors to visually symbolize abstract concepts by blending disparate objects. Our method harnesses commonsense knowledge bases and large language models to align designers' conceptual intent with expressive concrete objects. Additionally, we employ generative text-to-image techniques to blend visual elements through their overlapping attributes. A user study (N=24) demonstrated that our approach reduces participants' cognitive load, fosters creativity, and enhances the metaphorical richness of visual blend ideation. We explore the potential of our method to expand visual blends to include multiple object blending and discuss the insights gained from designing with generative AI.
Current text-conditioned diffusion editors handle single object replacement well but struggle when a new object and a new style must be introduced simultaneously. We present Twin-Prompt Attention Blend (TP-Blend), a lightweight training-free framework that receives two separate textual prompts, one specifying a blend object and the other defining a target style, and injects both into a single denoising trajectory. TP-Blend is driven by two complementary attention processors. Cross-Attention Object Fusion (CAOF) first averages head-wise attention to locate spatial tokens that respond strongly to either prompt, then solves an entropy-regularised optimal transport problem that reassigns complete multi-head feature vectors to those positions. CAOF updates feature vectors at the full combined dimensionality of all heads (e.g., 640 dimensions in SD-XL), preserving rich cross-head correlations while keeping memory low. Self-Attention Style Fusion (SASF) injects style at every self-attention layer through Detail-Sensitive Instance Normalization. A lightweight one-dimensional Gaussian filter separates low- and high-frequency components; only the high-frequency residual is blended back, impr
In this paper, we develop a blended dynamics framework for open quantum networks with diffusive couplings. The network consists of qubits interconnected through Hamiltonian couplings, environmental dissipation, and consensus-like diffusive interactions. Such networks commonly arise in spontaneous emission processes and non-Hermitian quantum computing, and their evolution follows a Lindblad master equation. Blended dynamics theory is well established in the classical setting as a tool for analyzing emergent behaviors in heterogeneous networks with diffusive couplings. Its key insight is to blend the local dynamics rather than the trajectories of individual nodes. Perturbation analysis then shows that, under sufficiently strong coupling, all node trajectories tend to stay close to those of the blended system over time. We first show that this theory extends naturally to the reduced-state dynamics of quantum networks, revealing classical-like clustering phenomena in which qubits converge to a shared equilibrium or a common trajectory determined by the quantum blended reduced-state dynamics. We then extend the analysis to qubit coherent states using quantum Laplacians and induced graph
Recent deepfake detection methods demonstrate improved cross-dataset generalization, yet the underlying mechanisms remain underexplored. We introduce the Alpha Blending Hypothesis, positing that state-of-the-art frame-based detectors primarily function as alpha blending searchers; rather than learning semantic anomalies or specific generative neural fingerprints, they localize low-level compositing artifacts introduced during the integration of manipulated faces into target frames. We experimentally validate the hypothesis, demonstrating that deepfake detectors exhibit high sensitivity to the so-called self-blended images (SBI) and non-generative manipulations. We propose the method BlenD that leverages a large-scale, diverse dataset of real-only facial images augmented with SBI. This approach achieves the best average cross-dataset generalization on 15 compositional deepfake datasets released between 2019 and 2025 without utilizing explicitly generated deepfakes during training. Furthermore, we show that predictions from explicit blending searchers and models resilient to blending shortcuts are highly complementary, yielding a state-of-the-art AUROC of 94.0% in an ensemble configu
Diffusion models have dramatically advanced text-to-image generation in recent years, translating abstract concepts into high-fidelity images with remarkable ease. In this work, we examine whether they can also blend distinct concepts, ranging from concrete objects to intangible ideas, into coherent new visual entities under a zero-shot framework. Specifically, concept blending merges the key attributes of multiple concepts (expressed as textual prompts) into a single, novel image that captures the essence of each concept. We investigate four blending methods, each exploiting different aspects of the diffusion pipeline (e.g., prompt scheduling, embedding interpolation, or layer-wise conditioning). Through systematic experimentation across diverse concept categories, such as merging concrete concepts, synthesizing compound words, transferring artistic styles, and blending architectural landmarks, we show that modern diffusion models indeed exhibit creative blending capabilities without further training or fine-tuning. Our extensive user study, involving 100 participants, reveals that no single approach dominates in all scenarios: each blending technique excels under certain conditio
Training a generative model on a single human skeletal motion sequence without being bound to a specific kinematic tree has drawn significant attention from the animation community. Unlike text-to-motion generation, single-shot models allow animators to controllably generate variations of existing motion patterns without requiring additional data or extensive retraining. However, existing single-shot methods do not explicitly offer a controllable framework for blending two or more motions within a single generative pass. In this paper, we present the first single-shot motion blending framework that enables seamless blending by temporally conditioning the generation process. Our method introduces a skeleton-aware normalization mechanism to guide the transition between motions, allowing smooth, data-driven control over when and how motions blend. We perform extensive quantitative and qualitative evaluations across various animation styles and different kinematic skeletons, demonstrating that our approach produces plausible, smooth, and controllable motion blends in a unified and efficient manner.
Understanding charge separation dynamics in organic semiconductor blends is crucial for optimizing the performance of organic photovoltaic solar cells. In this study, we explored the optoelectronic properties and charge separation dynamics of a PCE10:FOIC blend, by combining steady-state and time-resolved spectroscopies with high-level DFT calculations. Femtosecond transient absorption spectroscopy revealed a significant reduction of the exciton-exciton annihilation recombination rate in the acceptor when incorporated into the blend, compared to its pristine form. This reduction was attributed to a decrease in exciton density within the acceptor, driven by an efficient hole-separation process that was characterized by following the temporal evolution of the transient signals associated with the excited states of the donor when the acceptor was selectively excited within the blend. The analysis of these dynamics enabled the estimation of the hole separation time constant from the acceptor to the donor, yielding a time constant of (1.3 +- 0.3) ps. Additionally, this study allowed the quantification of exciton diffusion and revealed a charge separation efficiency of approximately 60%,
Polymers are an effective test-bed for studying topological constraints in condensed matter due to a wide array of synthetically-available chain topologies. When linear and ring polymers are blended together, emergent rheological properties are observed as the blend can be more viscous than either of the individual components. This emergent behavior arises since ring-linear blends can form long-lived topological constraints as the linear polymers thread the ring polymers. Here, we demonstrate how the Gauss linking integral can be used to efficiently evaluate the relaxation of topological constraints in ring-linear polymer blends. For majority-linear blends, the relaxation rate of topological constraints depends primarily on reptation of the linear polymers, resulting in the diffusive time $τ_{d,R}$ for rings of length $N_R$ blended with linear chains of length $N_l$ to scale as $τ_{d,R}\sim N_R^2N_L^{3.4}$.
Computer simulation studies on the miscibility behavior and single chain properties in binary polymer blends are reviewed. We consider blends of various architectures in order to identify important architectural parameters on a coarse grained level and study their qualitative consequences for the miscibility behavior. The phase diagram, the relation between the exchange chemical potential and the composition, and the intermolecular paircorrelation functions for symmetric blends of linear chains, blends of cyclic polymers, blends with an asymmetry in cohesive energies, blends with different chain lengths, blends with distinct monomer shapes, and blends with a stiffness disparity between the components are discussed. We investiagte the temperature and composition dependence of the single chain conformations in symmetric and asymmetric blends and compare our findings to scaling arguments and detailed SCF calculations. Two aspects of the single chain dynamics in blends are discussed: the dynamics of short non--entangled chains in a binary blend and irreversible reactions of a small fraction of reactive polymers at a strongly segregated interface. Pertinent off-lattice simulations and a
We present an image blending pipeline, \textit{IBURD}, that creates realistic synthetic images to assist in the training of deep detectors for use on underwater autonomous vehicles (AUVs) for marine debris detection tasks. Specifically, IBURD generates both images of underwater debris and their pixel-level annotations, using source images of debris objects, their annotations, and target background images of marine environments. With Poisson editing and style transfer techniques, IBURD is even able to robustly blend transparent objects into arbitrary backgrounds and automatically adjust the style of blended images using the blurriness metric of target background images. These generated images of marine debris in actual underwater backgrounds address the data scarcity and data variety problems faced by deep-learned vision algorithms in challenging underwater conditions, and can enable the use of AUVs for environmental cleanup missions. Both quantitative and robotic evaluations of IBURD demonstrate the efficacy of the proposed approach for robotic detection of marine debris.
In this paper we discuss a new method to blend fractal attractors using the code map for the IFS formed by the Hutchinson--Barnsley operators of a finite family of hyperbolic IFSs. We introduce a parameter called blending coefficient to measure the similarity between the blended set and each one of the original attractors. We also introduce a discrete approximation algorithm and prove a rigorous error estimation used to approximate these new objects. Several simulation results are provided illustrating our techniques.
Advances in deepfake research have led to the creation of almost perfect manipulations undetectable by human eyes and some deepfakes detection tools. Recently, several techniques have been proposed to differentiate deepfakes from realistic images and videos. This paper introduces a Frequency Enhanced Self-Blended Images (FSBI) approach for deepfakes detection. This proposed approach utilizes Discrete Wavelet Transforms (DWT) to extract discriminative features from the self-blended images (SBI) to be used for training a convolutional network architecture model. The SBIs blend the image with itself by introducing several forgery artifacts in a copy of the image before blending it. This prevents the classifier from overfitting specific artifacts by learning more generic representations. These blended images are then fed into the frequency features extractor to detect artifacts that can not be detected easily in the time domain. The proposed approach has been evaluated on FF++ and Celeb-DF datasets and the obtained results outperformed the state-of-the-art techniques with the cross-dataset evaluation protocol.
For the last decade, there has been a push to use multi-dimensional (latent) spaces to represent concepts; and yet how to manipulate these concepts or reason with them remains largely unclear. Some recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. To that end, we explore the task of concept blending through diffusion models. Diffusion models are based on a connection between a latent representation of textual prompts and a latent space that enables image reconstruction and generation. This task allows us to try different text-based combination strategies, and evaluate easily through a visual analysis. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.
Editing a local region or a specific object in a 3D scene represented by a NeRF or consistently blending a new realistic object into the scene is challenging, mainly due to the implicit nature of the scene representation. We present Blended-NeRF, a robust and flexible framework for editing a specific region of interest in an existing NeRF scene, based on text prompts, along with a 3D ROI box. Our method leverages a pretrained language-image model to steer the synthesis towards a user-provided text prompt, along with a 3D MLP model initialized on an existing NeRF scene to generate the object and blend it into a specified region in the original scene. We allow local editing by localizing a 3D ROI box in the input scene, and blend the content synthesized inside the ROI with the existing scene using a novel volumetric blending technique. To obtain natural looking and view-consistent results, we leverage existing and new geometric priors and 3D augmentations for improving the visual fidelity of the final result. We test our framework both qualitatively and quantitatively on a variety of real 3D scenes and text prompts, demonstrating realistic multi-view consistent results with much flex
Many IoT use cases demand both secure storage and secure communication. Resource-constrained devices cannot afford having one set of crypto protocols for storage and another for communication. Lightweight application layer security standards are being developed for IoT communication. Extending these protocols for secure storage can significantly reduce communication latency and local processing. We present BLEND, combining secure storage and communication by storing IoT data as pre-computed encrypted network packets. Unlike local methods, BLEND not only eliminates separate crypto for secure storage needs, but also eliminates a need for real-time crypto operations, reducing the communication latency significantly. Our evaluation shows that compared with a local solution, BLEND reduces send latency from 630 microseconds to 110 microseconds per packet. BLEND enables PKI based key management while being sufficiently lightweight for IoT. BLEND doesn't need modifications to communication standards used when extended for secure storage, and can therefore preserve underlying protocols' security guarantees.
Image composition is an important operation to create visual content. Among image composition tasks, image blending aims to seamlessly blend an object from a source image onto a target image with lightly mask adjustment. A popular approach is Poisson image blending, which enforces the gradient domain smoothness in the composite image. However, this approach only considers the boundary pixels of target image, and thus can not adapt to texture of target image. In addition, the colors of the target image often seep through the original source object too much causing a significant loss of content of the source object. We propose a Poisson blending loss that achieves the same purpose of Poisson image blending. In addition, we jointly optimize the proposed Poisson blending loss as well as the style and content loss computed from a deep network, and reconstruct the blending region by iteratively updating the pixels using the L-BFGS solver. In the blending image, we not only smooth out gradient domain of the blending boundary but also add consistent texture into the blending region. User studies show that our method outperforms strong baselines as well as state-of-the-art approaches when p
The MIR blend fluxes correlation between HCN and water can be explained as a consequence of dust evolution, namely, changes in the dust MIR opacity. Other disk properties, such as the disk inner radius and the disk flaring angle, can only partially cover the dynamic range of the HCN and water blend observations. At the same time, the dynamic range of the MIR SED slopes is better reproduced by the disk structure (e.g. inner radius, flaring) than by the dust evolution. Our model series do not reproduce the observed trend between continuum flux at 850 μm and the MIR HCN/H2O blend ratio. However, our models show that this continuum flux is not a unique indicator of disk mass and it should therefore be used jointly with complementary observational data for optimal results. The presence of an anti-correlation between MIR H2O blend fluxes and the MIR SED is consistent with a scenario where dust evolves in disks, producing lower opacity and stronger features in the Spitzer spectral regime, while the gas eventually becomes depleted at a later stage, leaving behind an inner cavity in the disk.
Organic solar cells (OSCs) based on ADA-type (acceptor-donor-acceptor) non-fullerene acceptors (NFAs) exhibit improved power conversion efficiency (PCE) compared to the conventional fullerene-based analogues. The optoelectronic properties of OSC active layer blends are correlated to their underlying structural dynamics and therefore influence the device performance. Using synergistically different neutron spectroscopy techniques, we studied the dynamics of binary and ternary blends made of the NFAs O-IDTBR and O-IDFBR and the regioregular donor polymer P3HT. Deuteration was considered for a contrast variation purpose. In addition to shedding light on the miscibilty and alloying characters of the blends, a main outcome of this work is the evidenced similar dynamical response of the blend components. This finding is in contrast with our previous neutron spectroscopy and molecular dynamics studies of the fullerene-based blend P3HT:PCBM, where we highlighted distinct behaviors of P3HT and PCBM in terms of the vitrification/frustration of P3HT and the plasticization of PCBM by P3HT upon blending. Alike P3HT vitrification is not presently observed. The absence or the weak vitrification e