MetaDB: Metadata-Guided Diffusion Bridge Model for High-Fidelity Medical Image Synthesis

EasyChair Preprint 16019, 9 pages • Date: May 5, 2026

Abstract

Medical image synthesis is pivotal in modern clinical workflows, where it addresses the problem of missing imaging modalities. While diffusion-based models have shown promise, existing approaches often neglect rich clinical metadata, leading to synthesized images that lack semantic fidelity and fail to maintain strict consistency with the target modality. To address these challenges, we propose a metadata-guided diffusion bridge model, termed MetaDB, a novel framework that leverages textual clinical priors to steer the source-to-target translation process. Our method introduces two key innovations to ensure high-fidelity synthesis. First, we design a text-guided adaptive normalization layer that dynamically modulates the feature statistics of the diffusion backbone using encoded clinical metadata. This mechanism explicitly aligns the synthesized features with the target modality's attributes, ensuring semantic consistency throughout the generation process. Second, to prevent semantic degradation during the iterative denoising steps, we propose a semantics reconstruction network. This auxiliary module imposes a constraint that forces the network to preserve deep semantic representations, further reinforcing the semantic consistency between the generated output and the target description. Extensive experiments on multiple medical imaging datasets demonstrate that our approach achieves state-of-the-art performance in terms of quantitative metrics and visual quality, generating images that are both anatomically accurate and semantically faithful to clinical protocols.

Keyphrases: Diffusion Bridge Model, Medical Image Synthesis, Metadata Guidance
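The text-guided adaptive normalization described in the abstract can be illustrated with a FiLM-style sketch: a feature map is instance-normalized per channel, then scaled and shifted by parameters predicted from the encoded metadata. This is a minimal NumPy illustration under assumed shapes, not the paper's actual layer; the function name and the linear projections `W_gamma`/`W_beta` are hypothetical.

```python
import numpy as np

def text_adaptive_norm(features, text_embedding, W_gamma, W_beta, eps=1e-5):
    """Sketch of a text-guided adaptive normalization layer (assumed form).

    features:       (C, H, W) feature map from the diffusion backbone
    text_embedding: (D,) encoded clinical-metadata vector
    W_gamma, W_beta: (C, D) hypothetical projections predicting per-channel
                     scale and shift from the metadata embedding
    """
    # Per-channel instance normalization of the feature map
    mu = features.mean(axis=(1, 2), keepdims=True)
    sigma = features.std(axis=(1, 2), keepdims=True)
    normed = (features - mu) / (sigma + eps)

    # Scale and shift derived from the clinical-metadata embedding
    gamma = W_gamma @ text_embedding  # (C,)
    beta = W_beta @ text_embedding    # (C,)
    return gamma[:, None, None] * normed + beta[:, None, None]
```

Because the normalized features have zero mean per channel, the per-channel mean of the output equals the metadata-derived shift `beta`, which is how the layer injects modality attributes into the feature statistics.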

