The unprecedented impact of foundation model technology, represented by ChatGPT, is driving a revolutionary paradigm shift in AI, bringing new opportunities and challenges to many industries. However, the high training, inference, and maintenance costs of foundation models limit their widespread adoption.
AI for Science (AI4Science) is an emerging field that explores the intersection of artificial intelligence (AI) and scientific research. It leverages the power of AI techniques and algorithms to analyze vast amounts of scientific data, accelerate discovery, and enhance our understanding of complex scientific phenomena.
In pursuit of building open, intelligent, and efficient large AI models, we aim to address the challenges posed by diverse data and resources distributed across edge devices, which can significantly impact the performance and scalability of large models.
The swift advancement of AI-generated content (AIGC) has empowered users to create photorealistic images and engage in meaningful dialogues with foundation models. Despite these advancements, AIGC services face challenges, including concept bleeding, hallucinations, and unsafe content generation.
Federated learning (FL) has emerged as a powerful paradigm for learning from decentralized data, and federated domain generalization further considers the setting where the test dataset (target domain) is absent from the decentralized training data (source domains).
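The decentralized training loop underlying FL can be sketched with the standard FedAvg aggregation rule: clients train locally on their own data, and a server averages the resulting models weighted by client data size. This is a minimal toy sketch in NumPy, assuming linear-regression clients and synthetic data; it is not the method of any particular paper above.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.05, epochs=5):
    """A few steps of local gradient descent on squared error (toy client training)."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Three hypothetical clients with different amounts of local data (non-IID sizes).
clients = [(rng.normal(size=(n, 3)), rng.normal(size=n)) for n in (20, 50, 30)]

global_w = np.zeros(3)
for _round in range(10):
    # Each client trains locally starting from the current global model...
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    # ...and the server aggregates by data-size-weighted averaging (FedAvg).
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    global_w = np.average(local_ws, axis=0, weights=sizes)
```

Note that the raw data never leaves the clients; only model parameters are exchanged, which is the core privacy motivation of FL.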
Fine-tuning the learnable prompt for a pre-trained vision-language model (VLM), such as CLIP, has demonstrated exceptional efficiency in adapting to a broad range of downstream tasks. However, existing prompt tuning methods for VLMs fail to distinguish spurious features introduced by biased training data from invariant features, and instead employ a uniform alignment process when adapting to unseen target domains.
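The "learnable prompt" structure can be illustrated with a toy forward pass: a small set of shared context vectors is prepended to each class embedding, the sequence is pooled into a text feature, and classification scores are cosine similarities against the image feature. This NumPy sketch uses random stand-ins for CLIP's frozen encoders and hypothetical toy dimensions; only `ctx` would be trainable in CoOp-style prompt tuning.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx, n_classes = 8, 4, 3          # embedding dim, context length, classes (toy sizes)

# Frozen pieces: a stand-in image feature and fixed class-name embeddings.
image_feat = rng.normal(size=d)
class_embed = rng.normal(size=(n_classes, d))

# The only learnable parameters: shared context ("prompt") vectors.
ctx = rng.normal(size=(n_ctx, d)) * 0.02

def text_features(ctx):
    # Stand-in for CLIP's text encoder: pool each [context; class] token sequence.
    prompts = np.concatenate([np.broadcast_to(ctx, (n_classes, n_ctx, d)),
                              class_embed[:, None, :]], axis=1)
    feats = prompts.mean(axis=1)
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

img = image_feat / np.linalg.norm(image_feat)
logits = text_features(ctx) @ img        # cosine similarity, one score per class
probs = np.exp(logits) / np.exp(logits).sum()
```

In actual prompt tuning, gradients flow only into `ctx` through the frozen text encoder, which is why the approach is so parameter-efficient.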
The paper introduces CRA5, a project that uses a Variational Autoencoder Transformer (VAEformer) to compress the ERA5 climate dataset from 226 TB to just 0.7 TB, a compression ratio of over 300×.
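A quick sanity check on the reported figures, using only the sizes stated in the abstract:

```python
# Values taken from the abstract above: ERA5 at 226 TB, CRA5 at 0.7 TB.
era5_tb, cra5_tb = 226, 0.7
ratio = era5_tb / cra5_tb
print(f"compression ratio ≈ {ratio:.0f}x")   # ≈ 323x, consistent with "over 300x"
```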
The paper presents the WEATHER-5K dataset, a comprehensive global weather station dataset designed to advance time-series weather forecasting benchmarks. WEATHER-5K includes data from 5,672 weather stations worldwide, covering a 10-year period at hourly intervals.
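The stated coverage implies a rough upper bound on the number of records; the estimate below ignores leap days and any missing observations, so the actual dataset may be somewhat smaller.

```python
# Rough upper bound on record count implied by the stated coverage:
# 5,672 stations, hourly data, 10 years (leap days and gaps ignored).
stations, years, hours_per_year = 5672, 10, 365 * 24
records = stations * years * hours_per_year
print(records)   # 496867200, i.e. nearly half a billion hourly station records
```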
The paper presents a novel method called Source Prompt Disentangled Inversion (SPDInv) to enhance image editability using diffusion models. Traditional approaches often struggle because the inverted latent noise code is closely tied to the source prompt, hindering effective editing with target prompts.
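The inversion mechanism SPDInv builds on is deterministic DDIM inversion: run the denoising update in reverse to map an image to a latent noise code, approximating the noise prediction at the next timestep by the prediction at the current one. This is a toy NumPy sketch, not the paper's method: the noise predictor is a constant stand-in for a diffusion U-Net (which makes the approximation exact and the invert/denoise round trip lossless); with a real prompt-conditioned predictor the symmetry breaks, which is the gap inversion-refinement methods target.

```python
import numpy as np

T = 50
abar = np.linspace(1.0, 0.02, T + 1)        # toy alpha-bar noise schedule (t = 0..T)

def eps_theta(z, t):
    # Dummy noise predictor standing in for the diffusion U-Net; constant here,
    # so the "same eps at adjacent timesteps" approximation holds exactly.
    return np.full_like(z, 0.3)

def ddim_step(z, t_from, t_to):
    """Deterministic DDIM transition between timesteps (works in both directions)."""
    eps = eps_theta(z, t_from)
    x0 = (z - np.sqrt(1 - abar[t_from]) * eps) / np.sqrt(abar[t_from])
    return np.sqrt(abar[t_to]) * x0 + np.sqrt(1 - abar[t_to]) * eps

image = np.linspace(-1, 1, 16)              # a toy "latent image"

z = image.copy()
for t in range(T):                          # inversion: image -> noise code z_T
    z = ddim_step(z, t, t + 1)
recon = z.copy()
for t in range(T, 0, -1):                   # denoising: z_T -> reconstruction
    recon = ddim_step(recon, t, t - 1)
```

With the constant predictor, `recon` matches `image` to floating-point precision; in practice the prompt-dependent prediction error accumulated over the inversion trajectory is what degrades editability.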
The paper introduces FreePIH, a novel method for painterly image harmonization using a pre-trained diffusion model without additional training. Unlike traditional methods that require fine-tuning or auxiliary networks, FreePIH leverages the denoising process as a plug-in module to transfer the style of the background painting to the foreground object.