Recovery files overrunning disk space

I'm helping a colleague run some COMSOL jobs (multiphysics simulations involving a parametric sweep) on our university's computing cluster. These require several terabytes of RAM and/or temporary storage space, which we have access to on the compute nodes while the job is running; however, we do not have quite the same degree of storage available when the jobs are not actively running. This is a problem when COMSOL tries to write 1TB+ of recovery files to a filesystem that is less than 1TB away from its group quota.

What we observe is that COMSOL exits because the disk quota is exceeded; the next run then seems to find that the recovery files are not intact (hardly a surprise if the last attempt failed while trying to write them) and starts over.

For further context (as much as I can give without currently having access to the recovery files) one attempt produced hundreds of savepoint subdirectories in addition to about 50k .mph.bin files; I believe the latter is what took up most of the space (perhaps relevantly, this is on a striped filesystem with a large block size).
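To check whether the block size really is the culprit, something like the following could compare the apparent size of the files against the space actually allocated for them, grouped by suffix. This is only a minimal sketch: `usage_by_suffix` is a name made up for illustration, it relies on `st_blocks` (Unix-only, counted here in 512-byte units as on Linux), and how a given striped filesystem reports allocation is an assumption that would need checking.

```python
import os
from collections import defaultdict


def usage_by_suffix(root):
    """Return {suffix: (file_count, apparent_bytes, allocated_bytes)} under root.

    Hypothetical helper, not part of COMSOL. allocated_bytes assumes that
    st_blocks is reported in 512-byte units.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # the file disappeared or is unreadable; skip it
            # Keep the compound ".mph.bin" suffix together instead of just ".bin".
            if name.endswith(".mph.bin"):
                suffix = ".mph.bin"
            else:
                suffix = os.path.splitext(name)[1] or "(none)"
            entry = totals[suffix]
            entry[0] += 1                    # number of files
            entry[1] += st.st_size           # bytes of content
            entry[2] += st.st_blocks * 512   # bytes allocated on disk
    return {suffix: tuple(values) for suffix, values in totals.items()}
```

If the allocated total for `.mph.bin` turns out to be much larger than the apparent total, that would point to per-file overhead rather than the data itself.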

My questions, then, are:

  • What options can we invoke to reduce the amount of space COMSOL uses for recovery, short of simply not saving recovery files at all?
    • On the flip side, should we expect that the final output file would actually be roughly that big?
    • Can we get COMSOL to use fewer, larger files instead of saving so many small files?
    • Are there any files in this recovery folder that can be safely deleted after a certain point? How would these be identified?
    • If we were to run a process that (say) ran behind COMSOL zipping up recovery files once they were written, would this interfere with its operations?
  • Failing that, is it possible to impose a limit on the amount of disk space COMSOL will use so that it doesn't interfere with other users from our group who rely on the same scratch space?
  • As for the existing recovery files, are there shell commands to convert them to a combined (hopefully smaller) .mph file or export their contents?
    • What about only parsing some of them (especially if others are corrupted)?
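To make the disk-limit question concrete: as a stopgap, the job script could poll the recovery directory itself and react before the group quota is hit. The sketch below only measures; it does not cap COMSOL in any way, and `tree_size_bytes`, `headroom_bytes` and the budget value are hypothetical names and numbers chosen for illustration.

```python
import os


def tree_size_bytes(root):
    """Sum the sizes of all regular files under root without following symlinks."""
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            total += entry.stat(follow_symlinks=False).st_size
                    except OSError:
                        continue  # entry vanished while the job was writing
        except OSError:
            continue  # directory vanished or is unreadable
    return total


def headroom_bytes(root, budget_bytes):
    """Bytes left before root exceeds budget_bytes (negative when over budget)."""
    return budget_bytes - tree_size_bytes(root)
```

A wrapper could call `headroom_bytes(recovery_dir, budget)` every few minutes and alert us, or stop the job cleanly, once the result gets close to zero. Whether COMSOL tolerates being stopped at that point is exactly what I don't know.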

Thanks for any help.


2 Replies Last Post Mar 24, 2024, 11:47 a.m. EDT
Robert Koslover Certified Consultant

Posted: 1 month ago Mar 24, 2024, 12:15 a.m. EDT
Updated: 1 month ago Mar 24, 2024, 12:16 a.m. EDT

I don't think you have to save recovery files. Go to File, Preferences, Files, Recovery, and uncheck "Keep recovery files when the application terminates unexpectedly." You can also change the path and folder for recovery files. Search for "recovery file" in the Help system for more info. There's a section there called "Keeping and Opening Recovery Files."

-------------------
Scientific Applications & Research Associates (SARA) Inc.
www.comsol.com/partners-consultants/certified-consultants/sara

Posted: 1 month ago Mar 24, 2024, 11:47 a.m. EDT
Updated: 1 month ago Mar 24, 2024, 11:47 a.m. EDT

>I don't think you have to save recovery files. Go to File, Preferences, Files, Recovery, and uncheck "Keep recovery files when the application terminates unexpectedly." You can also change the path and folder for recovery files. Search for "recovery file" in the Help system for more info. There's a section there called "Keeping and Opening Recovery Files."

Thanks for the reply, but to be clear I would like to still keep recovery files; each failed attempt takes a decent chunk of our group's limited compute resources and I would prefer not to have to restart from scratch. (Additionally we learned how to change the tmp & recovery directories after the tmp files overran the much smaller home directory.)

