Hmm… I considered the possibility of hardware damage, but if reinstalling ComfyUI would definitely break it, doesn’t that mean the ComfyUI environment is corrupted while the files themselves might still be intact?
Short answer:
Your LoRAs are almost certainly not “randomly rotting”. What is happening is usually one of:
- The files are getting slightly corrupted when you copy them to / from the external drive.
- ComfyUI changes (after reinstall/update) make the same LoRA + model combo start producing NaNs, even though the file itself is fine.
- Paths and folders get mixed up after reinstall so ComfyUI is not actually using the file you think it is.
You can fix this with:
- One safe, permanent “models” folder on your PC.
- Backups as ZIP/7z archives that you test.
- A quick hash check (file fingerprint) to prove if a backup is really identical.
- Treating ComfyUI as disposable, not your model files.
Below is a detailed, beginner-safe breakdown.
1. Background: what is actually stored where
1.1 Hugging Face side
Hugging Face libraries use two layers: (Hugging Face Forums)
- Cache
  - Hidden "scratch" area (like ~/.cache/huggingface/hub or equivalent on Windows).
  - Managed automatically.
  - hf_hub_download() explicitly says the paths it returns point into this cache and must not be edited, or the cache may break. (Hugging Face Forums)
- Your own model folders
  - Normal folders like D:\hf-models\sdxl-base or D:\AI\models\lora-name.
  - You choose these.
  - These are what you should back up and restore.
The Hugging Face “Local Model Backups” thread says exactly this: treat the cache as internal scratch space and only back up clean model folders you control. (Hugging Face Forums)
1.2 ComfyUI side
ComfyUI also uses a folder tree of model files: checkpoints, LoRAs, VAEs, etc. By default they sit under ComfyUI\models\.... The official docs say you can move your real models outside of ComfyUI and point ComfyUI at them via an extra_model_paths.yaml file. (ComfyUI)
Key idea:
Your LoRAs and checkpoints should live in one central folder that you own.
ComfyUI should just reference that folder.
Once you do that, reinstalling ComfyUI does not touch your model files at all.
2. Possible causes of your “corruption”
2.1 Partial or bad copies to / from the external drive
What happens
LoRA files (.safetensors) are big binary files. If:
- The USB cable is flaky.
- The external drive has errors.
- You unplug the drive too early.
- You copy while something is still downloading.
then the copy can be truncated (shorter than it should be) or slightly damaged.
The safetensors library then throws errors like:
“The safetensors file is incomplete. Check the file size and make sure you have copied/downloaded it correctly.”
This exact message appears in a ComfyUI GitHub issue where a model copied into ComfyUI/models/controlnet/... was incomplete; safetensors raised MetadataIncompleteBuffer. (GitHub)
Sometimes the file is “valid enough” to load, but the weights are garbage, and later the UNet math blows up and you see NaNs.
Why this matches your case
- Fresh download works.
- Copy to external and back → NaNs / black outputs.
- Re-download fixes it.
That is exactly the pattern of a copy that is not bit-identical to the original.
How to recognise
- Sometimes you get an explicit ComfyUI error like above instead of just NaNs. (GitHub)
- File size of a “bad” copy is smaller than the original.
- A checksum (hash) of the file changes after copy (more on this in solutions).
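If you want a quick way to tell a damaged copy from a healthy one without loading a whole workflow, a small Python script can try to open the file with the safetensors library; a truncated copy usually fails right at the header stage. This is only a minimal sketch, assuming the safetensors package is installed, and the path and function name are example placeholders:

# Sketch: check that a .safetensors file can be opened and its header parsed.
# A truncated or damaged copy typically fails here (e.g. MetadataIncompleteBuffer).
from safetensors import safe_open

def check_lora(path):
    try:
        with safe_open(path, framework="pt") as f:
            n_tensors = len(f.keys())   # forces the header to be read and parsed
        print(f"OK: {path} ({n_tensors} tensors)")
        return True
    except Exception as exc:            # safetensors raises on incomplete files
        print(f"FAILED: {path} -> {exc}")
        return False

check_lora(r"D:\AI\models\loras\my_lora.safetensors")   # example path

Note that this only proves the header and file length are consistent; it does not prove the weights are the ones you originally downloaded. That is what the hash check later in this answer is for.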
2.2 Backing up the wrong thing (cache vs real model folders)
If you ever copy:
- Random subfolders from ~/.cache/huggingface/hub (or Windows equivalent).
- Symlinked files instead of the real file.
- Temporary download folders.
you can easily end up with:
- Half a model.
- Only one shard of a sharded model.
- Symlinks that point to nothing on the new machine.
The Hugging Face docs and the local backups answer say:
- Cache paths returned by hf_hub_download are pointers into the cache and must not be edited. (Hugging Face Forums)
- For long-term storage you should create your own per-model folder using save_pretrained, snapshot_download(local_dir=..., local_dir_use_symlinks=False), or hf download --local-dir. (Hugging Face Forums)
If you back up random cache internals instead of those clean folders, you will see exactly the kind of weird behaviour you describe.
Your note “I just click download and save them to my drive” suggests you may be fine here, but it is worth knowing: do not copy HF cache directories around as “backups”.
2.3 ComfyUI reinstall changes the environment → NaNs
NaNs in the UNet are very common when:
- The GPU is near its limits.
- Precision is too low (lots of fp16 under tight VRAM).
- xFormers or other optimisations change.
- PyTorch / CUDA versions change.
The classic error text (from AUTOMATIC1111, Hugging Face and others) is: (GitHub)
“NansException: A tensor with all NaNs was produced in Unet.
This could be either because there’s not enough precision to represent the picture, or because your video card does not support half type. Try setting ‘Upcast cross attention layer to float32’ or using the --no-half command line argument…”
Many people see this after only changing:
- WebUI / ComfyUI version.
- Driver version.
- Optimisation settings.
They do not change the model file at all, but NaNs appear.
Why this fits your story
You said:
“As soon as I have to reinstall ComfyUI… they are broken and will create NaN errors.”
If after reinstall:
- The same LoRA file from before now triggers NaNs,
- And your backup copy has the same size and hash as the original,
then nothing is “corrupted”. The math environment changed and became less stable.
2.4 Wrong model + LoRA combination after restore
If you mix:
- LoRAs for SD1.5 with SDXL or Flux base models.
- ControlNet models built for different base models.
- Wrong VAEs.
you can get:
- Shape mismatch errors, or
- Very unstable activations that lead to NaNs.
After a reinstall, it is easy for ComfyUI to:
- Point to a different default checkpoint.
- Forget some custom paths.
- Load a different VAE or text encoder.
Then a workflow that used to be “LoRA X + base model Y” becomes “LoRA X + base model Z”, which might blow up numerically even if both files are individually fine.
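One way to double-check which family a LoRA was actually trained for is to read the metadata embedded in the .safetensors file. This is a hedged sketch: it assumes the LoRA was made with kohya-style training scripts, which often (but not always) store keys such as ss_base_model_version or ss_sd_model_name; the path is an example.

# Sketch: print embedded LoRA metadata to see which base model it was trained against.
# The ss_* keys are written by kohya-style trainers and are not guaranteed to exist.
from safetensors import safe_open

with safe_open(r"D:\AI\models\loras\my_lora.safetensors", framework="pt") as f:
    meta = f.metadata() or {}

for key in ("ss_base_model_version", "ss_sd_model_name", "ss_network_module"):
    print(key, "=", meta.get(key, "<not present>"))

If the metadata names an SD1.5 base but your workflow loads SDXL or Flux, that mismatch alone can explain the errors.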
2.5 External drive or cable issues
If the external drive or USB path is flaky, the damage tends to show up only on large files. Small text files are usually fine.
This is why it can feel like “only LoRAs get corrupted”:
- They are large.
- They are dense binary formats in which every byte matters.
- A tiny error in a large .safetensors often causes the loader to fail or the network to produce NaNs.
Tools like 7-Zip have a t (test) command specifically to check archive integrity. (7-zip.opensource.jp)
If testing a big .7z archive on your external drive ever shows errors, then the drive/cable/port is not safe for these backups.
3. Safe setup and solutions (PC-beginner friendly)
3.1 Step 1: Pick one permanent “models” folder
On Windows:
- Open File Explorer.
- Choose a drive with free space (for example D:).
- Create:
  D:\AI\
    models\
      checkpoints\
      loras\
      vae\
      controlnet\
This is your main model store. Do not put ComfyUI itself in here. This folder is what you back up and restore.
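If you prefer to set this up once from a script instead of clicking through File Explorer, a short Python snippet can create the same layout; the drive letter is just the example used above.

# Sketch: create the model folder layout in one go. Assumes D: exists and is writable.
from pathlib import Path

base = Path(r"D:\AI\models")
for sub in ("checkpoints", "loras", "vae", "controlnet"):
    (base / sub).mkdir(parents=True, exist_ok=True)
    print("ready:", base / sub)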
3.2 Step 2: Tell ComfyUI to use that folder
You want ComfyUI to look into D:\AI\models instead of storing everything inside its own directory.
ComfyUI docs and guides say to use extra_model_paths.yaml for this on Windows: (GitHub)
- Go to your ComfyUI install folder, something like:
  C:\ComfyUI_windows_portable\ComfyUI\
- Find extra_model_paths.yaml.example.
- Copy it and rename the copy to:
  extra_model_paths.yaml
- Open extra_model_paths.yaml in Notepad.
- Add something like:

my_models:
  base_path: D:/AI/models
  checkpoints: checkpoints
  loras: loras
  vae: vae
  controlnet: controlnet
Notes:
- Use / in paths, ComfyUI accepts that even on Windows. (Reddit)
- Move your existing models into these folders:
  - LoRAs → D:\AI\models\loras
  - Checkpoints → D:\AI\models\checkpoints
  - etc.
- Start ComfyUI and press r in the UI to refresh models. (Blender Neko)
Now:
- You can delete and reinstall ComfyUI as many times as you want.
- Your models stay in D:\AI\models and do not move.
3.3 Step 3: Download and store LoRAs correctly
For LoRAs from Civitai or similar:
- When you click download in the browser, save directly into D:\AI\models\loras.
- Wait for the download to fully finish before closing the browser or moving the file.
- Do not move the file again unless you really have to.
If you ever use Hugging Face Hub via Python or CLI, the “Local Model Backups” answer recommends: (Hugging Face Forums)
- Use snapshot_download(..., local_dir=..., local_dir_use_symlinks=False), or
- Use hf download <repo> --local-dir D:\AI\models\some-model
so that your real model copy is in your own folder, not just the cache.
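For reference, here is a minimal sketch of that pattern in Python; the repo id and target folder are placeholders, not real model names.

# Sketch: download a Hugging Face model into your own permanent folder.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="some-user/some-model",         # placeholder: replace with the real repo
    local_dir=r"D:\AI\models\some-model",   # your own folder, outside the cache
)
# Older huggingface_hub versions also needed local_dir_use_symlinks=False here,
# as mentioned above, so that real files rather than cache symlinks are written.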
3.4 Step 4: Back up models as archives and test them
Instead of copying hundreds of .safetensors individually:
- Install 7-Zip if you do not have it.
- Right-click D:\AI\models → 7-Zip → "Add to models_backup.7z".
- After it finishes, test the archive:
  - Open 7-Zip, select models_backup.7z, click Test.
  - Or from Command Prompt:
    "C:\Program Files\7-Zip\7z.exe" t "D:\AI\models_backup.7z"
    The t command tests archive integrity. (7-zip.opensource.jp)
- If the test passes, copy models_backup.7z to your external drive.
- On the external drive, run the test again.
If the test fails only on the external copy, the drive or cable is causing corruption.
When you need to restore:
- Copy models_backup.7z from the external drive back to your PC.
- Test it again with 7-Zip.
- If it passes, extract it into D:\AI\models.
Now all the LoRAs and checkpoints are back in the exact same paths ComfyUI expects.
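If you end up with several archives, you can run the same integrity test from a small script instead of clicking through 7-Zip each time. This sketch simply calls 7z.exe and checks its exit code (0 means the test passed); the paths are examples.

# Sketch: test a 7z archive by calling 7z.exe and checking the exit code.
import subprocess

SEVEN_ZIP = r"C:\Program Files\7-Zip\7z.exe"    # default install path, adjust if needed
archive = r"E:\backups\models_backup.7z"        # example archive on the external drive

result = subprocess.run([SEVEN_ZIP, "t", archive], capture_output=True, text=True)
if result.returncode == 0:
    print("archive test passed:", archive)
else:
    print("archive test FAILED:", archive)
    print(result.stdout[-500:])   # the tail of 7z's output usually names the bad file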
3.5 Step 5: Simple “file fingerprint” check (hash) on Windows
To prove a single LoRA file is identical before and after backup:
- Open Command Prompt.
- Run:
  certutil -hashfile "D:\AI\models\loras\my_lora.safetensors" SHA256
  This prints a SHA-256 hash (a long hex number). Windows tips and vendor docs show this exact pattern. (Qiita)
- Copy the same file to your external drive, e.g. E:\backups\my_lora.safetensors.
- Run:
  certutil -hashfile "E:\backups\my_lora.safetensors" SHA256
- Compare the two hashes:
  - If they are identical, the file is bit-perfect.
  - If they are different, the file was changed or corrupted during copy.
Do the same after copying back from external to see if the round trip is safe.
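If you would rather not run certutil by hand for every file, the same fingerprint check can be scripted in Python with hashlib; the paths below are the examples used above.

# Sketch: compare SHA-256 hashes of an original file and its backup copy.
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):   # read in 1 MB chunks to keep memory low
            h.update(chunk)
    return h.hexdigest()

original = sha256_of(r"D:\AI\models\loras\my_lora.safetensors")
backup = sha256_of(r"E:\backups\my_lora.safetensors")
print("original:", original)
print("backup:  ", backup)
print("identical" if original == backup else "DIFFERENT -> the copy is not bit-perfect")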
3.6 Step 6: When you see NaNs, separate “corruption” from “math problems”
When ComfyUI says:
“NansException: A tensor with all NaNs was produced in Unet…” (GitHub)
use this simple checklist:
- Test with a base model only
  - In ComfyUI, disable all LoRAs and ControlNets.
  - Use a well-known base checkpoint.
  - If this already gives NaNs, your environment (precision, drivers, etc.) is the problem, not the LoRA file.
- Add one LoRA at a time
  - Turn on one LoRA, try again.
  - When NaNs start with a specific LoRA, check:
    - Does its SHA-256 hash match the original file?
    - Does it live in the correct folder?
    - Is it for the right model family (SD1.5 vs SDXL vs Flux)?
- If hashes match but NaNs only appear on new ComfyUI builds
  - Then the LoRA file is not corrupt.
  - You are hitting the same numeric/precision issues other users report after updates. Solutions there involve changing precision or GPU settings, not re-downloading. (GitHub)
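As one more objective check for the second step, you can load the suspect LoRA directly and look for NaN or Inf values in its weights. This is a minimal sketch assuming torch and safetensors are installed; the path is the example used earlier. If the hash matches the original and nothing shows up here, the file is almost certainly fine and the environment is the problem.

# Sketch: scan a LoRA's tensors for NaN/Inf values.
import torch
from safetensors.torch import load_file

state = load_file(r"D:\AI\models\loras\my_lora.safetensors")
bad = [name for name, t in state.items()
       if t.is_floating_point() and not torch.isfinite(t).all()]

if bad:
    print("tensors containing NaN/Inf:", bad[:10])
else:
    print("no NaN/Inf values found in", len(state), "tensors")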
4. Very short checklist
- Cause 1 (common): copies to the external drive are incomplete or damaged → safetensors error or NaNs.
  - Fix: back up as .7z archives, use 7z t to test them, and use certutil -hashfile ... SHA256 to confirm files are identical. (GitHub)
- Cause 2: backing up or moving HF cache internals instead of clean model folders.
  - Fix: follow the HF "Local Model Backups" pattern: keep one per-model folder and back that up, do not touch cache internals. (Hugging Face Forums)
- Cause 3: ComfyUI reinstall / update changes GPU math and precision → NaNs with the same model.
  - Fix: treat ComfyUI as disposable, keep models in a central folder, and treat NaNs as environment issues when hashes match. (GitHub)
- Cause 4: wrong mixes (wrong base model + LoRA + ControlNet) after reinstall.
  - Fix: test the base model alone, then add components one by one and confirm they are for the same model family.
If you set up the single “models” folder, use archives for backup, and verify with hashes, you should reach a point where you never need to re-download the same LoRA again unless the original download itself was bad.