Abstract
Recent advances in diffusion models have significantly improved the synthesis of materials, textures, and 3D shapes. By conditioning these models via text or images, users can guide the generation, reducing the time required to create digital assets. In this paper, we address the synthesis of structured, stationary patterns, where diffusion models are generally less reliable and, more importantly, less controllable. Our approach leverages the generative capabilities of diffusion models specifically adapted for the pattern domain. It enables users to exercise direct control over the synthesis by expanding a partially hand-drawn pattern into a larger design while preserving the structure and details of the input. To enhance pattern quality, we fine-tune an image-pretrained diffusion model on structured patterns using Low-Rank Adaptation (LoRA), apply a noise rolling technique to ensure tileability, and utilize a patch-based approach to facilitate the generation of large-scale assets. We demonstrate the effectiveness of our method through a comprehensive set of experiments, showing that it outperforms existing models in generating diverse, consistent patterns that respond directly to user input.
Architecture
Starting from a hand-drawn input, we extend it to an arbitrarily large canvas, introducing variations while preserving structure and appearance. The pattern is centered and expanded outward in an “outpainting”-like process. To achieve this, we fine-tune a pre-trained latent diffusion model (LDM) for image generation by training a Low-Rank Adaptation (LoRA) on a dataset of procedurally generated patterns. To further enhance fidelity, we integrate an IP-Adapter for image-based conditioning. This ensures that the extended design remains visually consistent with the original input, which is loosely replicated to serve as a guidance image. We additionally use text prompts to constrain the generation to the structural regularity and solid-color look characteristic of our target domain. To enable seamless extension of patterns to arbitrarily large sizes, we adopt a latent replication strategy, which introduces controlled variations while preserving structural integrity. We also apply the noise rolling technique to achieve tileable pattern generation. Specifically, latent replication occurs after N iterations, while noise rolling and unrolling are applied before and after each diffusion step, respectively, as sketched below.
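To make the ordering of these operations concrete, the following sketch shows a minimal, hypothetical denoising loop in PyTorch that applies noise rolling before each step, unrolling after it, and a single 2x2 latent replication after a chosen iteration. The names used here (`roll`, `denoise_with_rolling`, `unet`, `scheduler`, `text_embeddings`, `replicate_at`) are illustrative placeholders for the components of a standard LDM pipeline, not the exact implementation used in this work.

```python
import torch

def roll(latents, dh, dw):
    # Circularly shift the latent grid along its spatial axes.
    return torch.roll(latents, shifts=(dh, dw), dims=(-2, -1))

@torch.no_grad()
def denoise_with_rolling(latents, unet, scheduler, text_embeddings, replicate_at=None):
    """Simplified denoising loop with noise rolling and optional latent replication.

    `unet` and `scheduler` are assumed to follow the usual LDM interfaces
    (e.g., a diffusers UNet2DConditionModel and a DDIM/DDPM scheduler);
    classifier-free guidance and IP-Adapter conditioning are omitted for brevity.
    """
    for i, t in enumerate(scheduler.timesteps):
        if replicate_at is not None and i == replicate_at:
            # Latent replication: tile the latent 2x2 to extend the canvas
            # outward while reusing the structure synthesized so far.
            latents = latents.repeat(1, 1, 2, 2)

        # Noise rolling: random circular shift before the step, so border
        # regions are also denoised at interior positions.
        dh = int(torch.randint(0, latents.shape[-2], (1,)))
        dw = int(torch.randint(0, latents.shape[-1], (1,)))
        latents = roll(latents, dh, dw)

        noise_pred = unet(latents, t, encoder_hidden_states=text_embeddings).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample

        # Noise unrolling: undo the shift so the latent stays aligned with
        # the guidance image across iterations.
        latents = roll(latents, -dh, -dw)

    return latents
```

Because the circular shift wraps content across the latent borders, former border regions are repeatedly denoised at interior positions, which is what allows the final pattern to tile seamlessly.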
Results