
NVIDIA Control Panel preset profiles in the Manage 3D Settings page

Memory & Performance Impact of Optimizers and Flags: the cross-attention optimizations below were tested using an NVIDIA RTX 3060 with CUDA 11.7. This is an example test using specific hardware and configuration; your mileage may vary.
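If you want to compare against your own machine, one quick check (a minimal sketch, not part of the original test) is to ask torch which build and GPU it is actually using. It assumes torch is importable from the command prompt you launch the webui from, which for a stock install usually means activating the webui's venv first, and it assumes a CUDA build of torch:

    python -c "import torch; print('torch', torch.__version__); print('cuda', torch.version.cuda); print('gpu', torch.cuda.get_device_name(0))"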


First, open the NVIDIA Control Panel, go to Manage 3D Settings, and change the power management mode to "Prefer maximum performance".

Beyond that, a number of optimizations can be enabled through command-line arguments:

- --opt-sdp-attention: may result in faster speeds than xFormers on some systems, but requires more VRAM (non-deterministic).
- --opt-sdp-no-mem-attention: same idea, but deterministic; slightly slower than --opt-sdp-attention and uses more VRAM.
- --xformers: uses the xFormers library. Great improvement to memory consumption and speed.
- --force-enable-xformers: enables xFormers regardless of whether the program thinks you can run it or not. Do not report bugs you get running this.
- --opt-split-attention: cross attention layer optimization that significantly reduces memory use for almost no cost (some report improved performance with it). On by default for torch.cuda, which includes both NVIDIA and AMD cards.
- --opt-sub-quad-attention: sub-quadratic attention, a memory-efficient cross attention layer optimization that can significantly reduce required memory, sometimes at a slight performance cost. Recommended if you get poor performance or failed generations with a hardware/software configuration that xFormers doesn't work for. On macOS this will also allow generation of larger images.
- --opt-split-attention-v1: uses an older version of the optimization above that is not as memory hungry (it will use less VRAM, but will be more limiting in the maximum size of pictures you can make).
- --medvram: makes the Stable Diffusion model consume less VRAM by splitting it into three parts - cond (for transforming text into a numerical representation), first_stage (for converting a picture into latent space and back), and unet (for the actual denoising of latent space) - so that only one part is in VRAM at any time, with the others sent to CPU RAM. Lowers performance, but only by a bit, except if live previews are enabled.
- --lowvram: an even more thorough optimization of the above, splitting unet into many modules, with only one module kept in VRAM.
- do-not-batch-cond-uncond: prevents batching of positive and negative prompts during sampling, which essentially lets you run at 0.5 batch size, saving a lot of memory. Not a command-line option, but an optimization implicitly enabled by using --medvram or --lowvram.
- --always-batch-cond-uncond: disables the optimization above. Only makes sense together with --medvram or --lowvram.
- --opt-channelslast: changes the torch memory type for Stable Diffusion to channels last. Effects not closely studied.
- --upcast-sampling: for NVIDIA and AMD cards normally forced to run with --no-half; should improve generation speed.

As of version 1.3.0, the cross attention optimization can be selected under Settings; xFormers still needs to be enabled via COMMANDLINE_ARGS.
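To tie this together, the flags above go into the COMMANDLINE_ARGS variable of the launcher script. Below is a minimal sketch of a webui-user.bat, assuming the stock AUTOMATIC1111 stable-diffusion-webui launcher on Windows; the particular flags are only an example, so pick the ones from the list that suit your card:

    @echo off
    rem webui-user.bat - example launcher settings; the flags are illustrative, not a recommendation
    set PYTHON=
    set GIT=
    set VENV_DIR=
    set COMMANDLINE_ARGS=--xformers --medvram
    call webui.bat

On Linux or macOS the same arguments go into webui-user.sh instead, as export COMMANDLINE_ARGS="--xformers --medvram".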










