Rapid advances in AI-generated content (AIGC) have enabled users to create photorealistic images and hold meaningful dialogues with foundation models. Even so, AIGC services still face challenges, including concept bleeding, hallucinations, and unsafe content generation.
The paper presents a novel method called Source Prompt Disentangled Inversion (SPDInv) to enhance image editability using diffusion models. Traditional approaches often struggle because the inverted latent noise code is closely tied to the source prompt, hindering effective editing with target prompts.
The paper introduces FreePIH, a novel method for painterly image harmonization using a pre-trained diffusion model without additional training. Unlike traditional methods that require fine-tuning or auxiliary networks, FreePIH leverages the denoising process as a plug-in module to transfer the style between the foreground and background images.
The ability to decentralize knowledge graphs (KGs) is important for exploiting the full potential of the Semantic Web and realizing the Web 3.0 vision. However, decentralization also makes KGs more prone to attacks that harm data integrity and query verifiability.
When generating multi-entity scenes, Stable Diffusion and its derivative models frequently suffer from entity overlap or fusion, primarily due to cross-attention leakage. To mitigate these issues, we propose performing differentiation and binarization on cross-attention maps to accurately locate entities within non-overlapping areas.
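The differentiation and binarization step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption: "differentiation" is taken to mean subtracting, at each pixel, the strongest competing entity's attention from the entity's own attention, and "binarization" is taken to mean thresholding that difference. The function name `differentiate_and_binarize`, the `margin` parameter, and the toy 2x2 maps are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

AttentionMap = List[List[float]]
BinaryMask = List[List[int]]


def differentiate_and_binarize(
    attn: Dict[str, AttentionMap], margin: float = 0.0
) -> Dict[str, BinaryMask]:
    """Turn per-entity cross-attention maps into non-overlapping binary masks.

    Assumed formulation (not taken from the paper): for each entity and pixel,
    subtract the strongest rival entity's attention (differentiation), then
    mark the pixel 1 if the difference exceeds `margin` (binarization).
    """
    names = list(attn)
    if len(names) < 2:
        raise ValueError("need at least two entities to differentiate")
    if margin < 0:
        raise ValueError("margin must be non-negative to keep masks disjoint")

    height = len(attn[names[0]])
    width = len(attn[names[0]][0])

    masks: Dict[str, BinaryMask] = {}
    for name in names:
        mask: BinaryMask = []
        for y in range(height):
            row: List[int] = []
            for x in range(width):
                # Strongest attention any other entity places on this pixel.
                rival = max(attn[other][y][x] for other in names if other != name)
                # Keep the pixel only where this entity clearly dominates.
                row.append(1 if attn[name][y][x] - rival > margin else 0)
            mask.append(row)
        masks[name] = mask
    return masks


# Toy 2x2 attention maps for two entities (illustrative values only).
toy_maps = {
    "cat": [[0.9, 0.6], [0.2, 0.1]],
    "dog": [[0.1, 0.5], [0.3, 0.8]],
}
toy_masks = differentiate_and_binarize(toy_maps)
print(toy_masks["cat"])  # [[1, 1], [0, 0]]
print(toy_masks["dog"])  # [[0, 0], [1, 1]]
```

Because a pixel is assigned to an entity only when that entity's attention strictly exceeds every rival's by more than a non-negative margin, at most one mask can claim any pixel, which is one simple way to obtain the non-overlapping areas the abstract refers to.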
PROTORE works by incorporating CLIP’s language-contrastive knowledge to identify the prototype of negative concepts, extract the negative features from outputs using the prototype as a prompt, and further refine the attention maps by retrieving negative features.