Module: Jobs on Legion

Learning Outcomes

How do I run and manage jobs?

To answer this question you will learn to:

  1. Understand what wallclock time is.

  2. Explain the assignment of resources to users to maintain fairness.

  3. Write and submit a simple job script that leaves some resources as default values.

  4. Use qstat to check on a job after submitting it, and qstat -j to see what resources it requested.

  5. Write and submit a job script that specifies all resources appropriately.

  6. Write and submit a job script that writes to $TMPDIR and copies data back.

  7. Understand that SGE’s working directory can differ from the working directory of the program you run inside the script.

  8. Explain that #Local2Scratch happens outside of wallclock time, while other copying methods happen inside it.

  9. Explain the difference in intended use for some of Legion’s nodes.

  10. Explain and justify why you might want to run on a specific node.

  11. Explain what common qstat statuses mean.

  12. Use qexplain to identify faults with a submitted job.

  13. Use qdel to delete a job.

  14. Use jobhist after a job has ended.
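
As a rough illustration of outcomes 3, 6 and 7, here is a minimal sketch of a job script. It assumes only standard SGE options (-N, -cwd, -l h_rt) and the standard SGE environment variables TMPDIR and SGE_O_WORKDIR. The job name, file names and time limit are hypothetical. Any resource not requested is left at its default, and Legion-specific requests are deliberately omitted.

```shell
#!/bin/bash
# Sketch only: the job name, time limit and file names are illustrative assumptions.
#$ -N example_job        # job name (hypothetical)
#$ -l h_rt=0:10:0        # wallclock time limit, hours:minutes:seconds
#$ -cwd                  # start the job in the directory it was submitted from

# Under SGE, TMPDIR points at a per-job temporary directory on the node.
# Fall back to a fresh temporary directory so the sketch also runs elsewhere.
workdir="${TMPDIR:-$(mktemp -d)}"

# SGE_O_WORKDIR is the directory qsub was run from; fall back to the current one.
results_dir="${SGE_O_WORKDIR:-$PWD}"

# The program's working directory now differs from SGE's working directory.
cd "$workdir" || exit 1
echo "hello from $(hostname)" > output.txt   # stand-in for the real program

# Copy results back before the job ends; this copy happens inside wallclock time.
cp output.txt "$results_dir/example_output.txt"
```

If saved as example.sh (a hypothetical name), the script could be submitted with qsub example.sh, checked with qstat or qstat -j followed by the job ID, and removed with qdel followed by the job ID.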

Readings

Experiential Learning