snsxt.util package
Submodules
snsxt.util.classes module
General utility classes for the program
-
class
snsxt.util.classes.
AnalysisItem
(id, extra_handlers=None)[source]¶ Bases:
snsxt.util.classes.LoggedObject
Base class for objects associated with a data analysis
-
add_file
(name, path) Add a single file to the analysis object’s ‘files’ dict. name = dict key; path = file path
-
add_files
(name, paths_list) Add files to the analysis object’s ‘files’ dict. name = dict key; paths_list = list of file paths
-
get_dirs
(name) Retrieve a dir by name from the object’s ‘dirs’ dict. name = dict key; i = index entry in dir list
-
get_files
(name) Retrieve a file by name from the object’s ‘files’ dict. name = dict key; i = index entry in file list
-
list_none
(l) Return None for an empty list, or the first element of a non-empty list. Convenience function for dealing with the object’s file lists
-
set_dir
(name, path) Add a single dir to the analysis object’s ‘dirs’ dict. name = dict key; path = dict value
-
set_dirs
(name, paths_list) Add dirs to the analysis object’s ‘dirs’ dict. name = dict key; paths_list = list of dir paths
-
-
class
snsxt.util.classes.
LoggedObject
(id, extra_handlers=None)[source]¶ Bases:
object
Base class for an object with its own custom logger
Requires an id to be passed. extra_handlers should be a list of handlers to add to the logger
snsxt.util.find module
Functions for finding files and dirs
-
snsxt.util.find.
find
(search_dir, inclusion_patterns=('*', ), exclusion_patterns=(), search_type='all', num_limit=None, level_limit=None, match_mode='any')[source]¶ Function to search for files and directories
Parameters: - search_dir (str) – path to the directory in which to search for files and subdirectories
- inclusion_patterns (list or tuple) – a list or tuple of patterns to match files/dirs against for inclusion in match output
- exclusion_patterns (list or tuple) – a list or tuple of patterns to match files/dirs against for exclusion from match output
- num_limit (int) – the number of matches to return; use None for no limit
- level_limit (int) – the number of directory levels to recurse; 0 is parent dir only
- match_mode – ‘any’ or ‘all’; matches any of the provided inclusion_patterns, or all of them
- search_type – ‘all’, ‘file’, or ‘dir’; type of items to find
Returns: a list of matching file or directory paths
Return type: list
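The parameters above can be illustrated with a minimal sketch built on the standard library. The name `find_sketch` and its internals are assumptions made for illustration; this is not the snsxt implementation, and it only covers the 'any' match mode.

```python
import fnmatch
import os
import tempfile


def find_sketch(search_dir, inclusion_patterns=('*',), exclusion_patterns=(),
                search_type='all', num_limit=None, level_limit=None):
    """Hypothetical stand-in for find(); matches basenames with fnmatch."""
    matches = []
    base_depth = search_dir.rstrip(os.sep).count(os.sep)
    for root, dirs, files in os.walk(search_dir):
        depth = root.rstrip(os.sep).count(os.sep) - base_depth
        if search_type == 'file':
            names = files
        elif search_type == 'dir':
            names = dirs
        else:
            names = dirs + files
        for name in sorted(names):
            included = any(fnmatch.fnmatch(name, p) for p in inclusion_patterns)
            excluded = any(fnmatch.fnmatch(name, p) for p in exclusion_patterns)
            if included and not excluded:
                matches.append(os.path.join(root, name))
                if num_limit is not None and len(matches) >= num_limit:
                    return matches
        # stop descending once the level limit is reached; 0 means top dir only
        if level_limit is not None and depth >= level_limit:
            del dirs[:]
    return matches


# build a small throwaway directory tree to search
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'sub'))
for rel in ('a.txt', 'b.log', os.path.join('sub', 'c.txt')):
    open(os.path.join(tmp, rel), 'w').close()

top_level_txt = find_sketch(tmp, inclusion_patterns=('*.txt',),
                            search_type='file', level_limit=0)
all_txt = find_sketch(tmp, inclusion_patterns=('*.txt',), search_type='file')
```

With `level_limit=0` only the files directly inside the search dir are considered, while leaving it as None lets the walk recurse into `sub`.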
-
snsxt.util.find.
find_files
(search_dir, search_filename)[source]¶ deprecated function that returns the paths to all files matching the supplied filename in the search dir
-
snsxt.util.find.
find_gen
(search_dir, inclusion_patterns=('*', ), exclusion_patterns=(), search_type='all', level_limit=None, match_mode='any')[source]¶ Generator function to return file matches. Used internally by find
Parameters: - search_dir (str) – path to the directory in which to search for files and subdirectories
- inclusion_patterns (list or tuple) – a list or tuple of patterns to match files/dirs against for inclusion in match output
- exclusion_patterns (list or tuple) – a list or tuple of patterns to match files/dirs against for exclusion from match output
- level_limit (int) – the number of directory levels to recurse; 0 is parent dir only
- match_mode – ‘any’ or ‘all’; matches any of the provided inclusion_patterns, or all of them
- search_type – ‘all’, ‘file’, or ‘dir’; type of items to find
-
snsxt.util.find.
multi_filter
(names, patterns, match_mode='any')[source]¶ Generator function which yields the names that match one or more of the patterns.
-
snsxt.util.find.
super_filter
(names, inclusion_patterns=('*', ), exclusion_patterns=(), match_mode='any')[source]¶ Enhanced version of fnmatch.filter() that accepts multiple inclusion and exclusion patterns.
Filter the input names by choosing only those that are matched by some pattern in inclusion_patterns _and_ not by any in exclusion_patterns.
Adapted from: https://codereview.stackexchange.com/questions/74713/filtering-with-multiple-inclusion-and-exclusion-patterns
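As a rough illustration of the inclusion/exclusion logic described above, the following sketch filters a list of names with `fnmatch`. The function name `super_filter_sketch` is hypothetical, and treating exclusion patterns as always matching in 'any' mode is an assumption, not something the original docs state.

```python
import fnmatch


def super_filter_sketch(names, inclusion_patterns=('*',), exclusion_patterns=(),
                        match_mode='any'):
    """Hypothetical re-implementation for illustration only."""
    # 'any': one inclusion pattern is enough; 'all': every pattern must match
    combine = all if match_mode == 'all' else any
    kept = []
    for name in names:
        included = combine(fnmatch.fnmatch(name, p) for p in inclusion_patterns)
        excluded = any(fnmatch.fnmatch(name, p) for p in exclusion_patterns)
        if included and not excluded:
            kept.append(name)
    return kept


names = ['sample1.bam', 'sample1.bam.bai', 'sample2.bam', 'notes.txt']
bams = super_filter_sketch(names, inclusion_patterns=('*.bam',),
                           exclusion_patterns=('sample2*',))
both = super_filter_sketch(names, inclusion_patterns=('sample1*', '*.bai'),
                           match_mode='all')
```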
-
snsxt.util.find.
walklevel
(some_dir, level=1)[source]¶ deprecated function that recursively searches a directory for all items up to a given depth
Examples
Example usage:
file_list = []
for item in pf.walklevel(some_dir):
    if item.endswith('my_file.txt') and os.path.isfile(item):
        file_list.append(item)
snsxt.util.git module
Functions for finding files and dirs
tested with python 2.7
snsxt.util.log module
Functions & items to set up the program loggers
-
snsxt.util.log.
add_missing_console_handler
(logger, *args, **kwargs) Adds a console StreamHandler if a handler named “console” is not already present in the logger
Examples
Example usage:
>>> import log
>>> import logging
>>> import qsub
>>> log.has_console_handler(qsub.logger)
False
>>> log.add_missing_console_handler(qsub.logger)
>>> log.has_console_handler(qsub.logger)
True
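The check-then-add behaviour can be sketched with the standard `logging` module alone. The two function names below are hypothetical stand-ins, and identifying the handler by its name attribute is an assumption based on the description above.

```python
import logging


def has_console_handler_sketch(logger):
    """Return True if any handler attached to the logger is named 'console'."""
    return any(h.get_name() == 'console' for h in logger.handlers)


def add_missing_console_handler_sketch(logger):
    """Attach a StreamHandler named 'console' unless one is already present."""
    if not has_console_handler_sketch(logger):
        handler = logging.StreamHandler()
        handler.set_name('console')
        handler.setLevel(logging.DEBUG)
        logger.addHandler(handler)


demo_logger = logging.getLogger('console_handler_demo')
before = has_console_handler_sketch(demo_logger)
add_missing_console_handler_sketch(demo_logger)
add_missing_console_handler_sketch(demo_logger)  # second call adds nothing
after = has_console_handler_sketch(demo_logger)
```

Calling the function twice leaves a single handler on the logger, which is the point of checking before adding.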
-
snsxt.util.log.
build_console_handler
(name='console', level=10, log_format='[%(asctime)s] %(levelname)s (%(name)s:%(funcName)s:%(lineno)d) %(message)s', datefmt='%Y-%m-%d %H:%M:%S') Returns a basic “console” StreamHandler
-
snsxt.util.log.
build_logger
(name, level=10, log_format='[%(asctime)s] %(levelname)s (%(name)s:%(funcName)s:%(lineno)d) %(message)s') Create a basic logger instance. Only a console handler is added by default
-
snsxt.util.log.
create_main_filehandler
(log_file, name='main', level=10, log_format='%(asctime)s:%(name)s:%(module)s:%(funcName)s:%(lineno)d:%(levelname)s:%(message)s')[source]¶ Return the ‘main’ file handler using globally set variables
-
snsxt.util.log.
email_log_filehandler
(log_file, name='emaillog', level=20, log_format='[%(levelname)-8s] %(message)s', datefmt='%Y-%m-%d %H:%M:%S')[source]¶ Return a fileHandler for a log meant to be used as the body of an email
-
snsxt.util.log.
get_all_handlers
(logger, types=('FileHandler', )) Get all logger handlers of the given types from the logger
types = ['FileHandler', 'StreamHandler']
x = [h for h in get_all_handlers(logger)]
-
snsxt.util.log.
get_logger_handler
(logger, handler_name, handler_type='FileHandler') Get the file handler object from a logger
-
snsxt.util.log.
has_console_handler
(logger) Searches a logger’s handlers to determine if a console handler is present
Parameters: logger (logging.Logger) – a logging.Logger object
-
snsxt.util.log.
log_all_handler_filepaths
(logger)[source]¶ Adds Info log messages for all filepaths for all file handlers
-
snsxt.util.log.
log_exception
(logger, errors)[source]¶ Create a log entry with the errors and traceback
-
snsxt.util.log.
log_setup
(config_yaml, logger_name) Set up the logger for the script using a YAML config file. config_yaml = path to YAML config file
-
snsxt.util.log.
logger_filepath
(logger, handler_name) Get the path to the file handler’s log file
-
snsxt.util.log.
logpath
(logfile='log.txt') Return the path to the main log file; needed by the logging.yml. Use this for dynamic output log file paths & names
-
snsxt.util.log.
print_filehandler_filepaths_to_log
(logger) Make a log entry with the paths to each file handler in the logger
snsxt.util.mutt module
This script provides a flexible wrapper for mailing files from a remote server with mutt
USAGE: mutt.py -s "Subject line" -r "address1@gmail.com, address2@gmail.com" -rt "my.address@internets.com" -m "This is my email message" /path/to/attachment1.txt /path/to/attachment2.txt
example mutt command which will be created:
# reply-to field; PUT YOUR EMAIL HERE
export EMAIL="kellys04@nyumc.org"
recipient_list="address1@gmail.com, address2@gmail.com"
mutt -s "$SUBJECT_LINE" -a "$attachment_file" -a "$summary_file" -a "$zipfile" -- "$recipient_list" <<E0F
email message HERE
E0F
-
snsxt.util.mutt.
get_reply_to_address
(server) Get the email address to use for the ‘reply to’ field in the email. Needs to be supplied with a server name
-
snsxt.util.mutt.
make_attachement_string
(attachment_files) Return a string to use in the mutt command to include attachment files, e.g.: -a "$attachment_file" -a "$summary_file" -a "$zipfile"
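A minimal sketch of how such an attachment string could be assembled is shown below. The name `make_attachment_string_sketch` is hypothetical, and the sketch does no shell escaping of the paths, so it should be read as an illustration of the output format only.

```python
def make_attachment_string_sketch(attachment_files):
    """Build one '-a "<path>"' argument per file, joined by spaces."""
    return ' '.join('-a "{0}"'.format(path) for path in attachment_files)


attachment_string = make_attachment_string_sketch(
    ['/path/to/attachment1.txt', '/path/to/attachment2.txt'])
```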
-
snsxt.util.mutt.
mutt_mail
(recipient_list, reply_to='', subject_line='[mutt.py]', message='~ This message was sent by the mutt.py email script ~', message_file=None, attachment_files=[], return_only_mode=False, quiet=False) Main control function for the program. Sends the message with mutt
recipient_list = character string; format is 'address1@gmail.com, address2@gmail.com'
snsxt.util.qsub module
A collection of functions and objects for submitting jobs to the NYUMC SGE compute cluster with qsub from within Python, and monitoring them until completion
This submodule can also be run as a stand-alone demo script
-
class
snsxt.util.qsub.
Job
(id, name=None, log_dir=None, debug=False)[source]¶ Bases:
object
Main object class for tracking and validating a compute job that has been submitted to the HPC cluster with the qsub command
Notes
The default action upon initialization is to query qstat to determine whether the job is currently running. After a job has completed, built-in methods can be used to query qacct -j to determine if the job finished with a successful exit status. Both qstat and qacct are queried by making system calls to the corresponding programs and parsing their stdout messages.
Many of the methods included with this object class have stand-alone functions of the same name, with the same usage & functionality.
Examples
Example usage:
x = qsub.Job('2379768')
x.running()
x.present()
-
__init__
(id, name=None, log_dir=None, debug=False)[source]¶ Parameters: - id (int) – numeric job ID, as returned by qsub at job submission
- name (str) – the name given to the compute job
- log_dir (str) – path to the directory used to hold log output by the compute job
- debug (bool) – initialize the job without immediately querying qstat to determine job status
Variables: - job_state_key (dict) – the module’s job_state_key object
- id (int) – a numeric ID for the Job object
- name (str) – a name for the Job
- log_dir (str) – path to the directory used to hold log output by the compute job
- log_paths (dict) – dictionary containing the types and paths to the job’s output logs
- completions (str) – character string used to describe the job and its completion states
-
_completions
()[source]¶ Makes a default ‘completions’ string attribute
Returns: character string describing the object and its qsub log paths
Return type: str
-
_debug_update
(qstat_stdout) Debug update mode, which requires a qstat_stdout to be passed manually after object initialization
-
error
() Returns whether or not the job is currently considered to be in an error state
Returns: True if in error, otherwise False
Return type: bool
-
filter_qacct
(qacct_dict=None, days_limit=7, username=None)[source]¶ Filters out ‘bad’ entries from the qacct output dictionary
Parameters: - qacct_dict (dict) – dictionary containing job records which represent qacct entries
- days_limit (int or None) – Maximum allowed age of a job. Defaults to 7 days, change this to None to disable date filtering
- username (str) – The username which qacct records must match, defaults to the current user’s name
Returns: a dictionary which should ideally contain only one qacct record, matching the intended compute job
Return type: dict
Notes
Filtering is required to remove historic job records from the qacct output; only one record can remain in order for the job’s completion status to be determined. This function will try to identify entries which are extraneous and do not represent the intended compute job. The default filtering criteria will first try to filter out records that contain usernames which do not match that of the current user. Next, records with a timestamp older than the provided days_limit will also be filtered out, in case the current user has multiple job entries for the given job_id. Note that the timestamp format used in the qacct output is inconsistent, so this type of filtering may be prone to errors.
-
get_is_error
(state, job_state_key) Checks if the job is considered to be in an error state
Returns: Return type: bool
-
get_is_present
(id, entry=None, qstat_stdout=None)[source]¶ Finds out if a job is present in qsub
Returns: Return type: bool
-
get_is_running
(state, job_state_key)[source]¶ Checks if the job is considered to be running
Returns: Return type: bool
-
get_log_file
(_type='stdout')[source]¶ Returns the expected path to the job’s log file
Parameters: _type (str) – either ‘stdout’ or ‘stderr’, representing the type of log path to generate
Notes
A stdout log file basename for a compute job with an ID of 4088513 and a name of python would look like this: python.o4088513. The corresponding stderr log name would look like: python.e4088513
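The naming convention in the note above can be sketched as follows. The function name `log_file_path_sketch` and its argument list are assumptions for illustration; only the `<name>.o<id>` and `<name>.e<id>` basenames come from the note.

```python
import os


def log_file_path_sketch(job_id, name, log_dir, _type='stdout'):
    """Build the expected log path: <log_dir>/<name>.o<id> or <name>.e<id>."""
    suffix = {'stdout': 'o', 'stderr': 'e'}[_type]
    return os.path.join(log_dir, '{0}.{1}{2}'.format(name, suffix, job_id))


stdout_log = log_file_path_sketch(4088513, 'python', 'logs')
stderr_log = log_file_path_sketch(4088513, 'python', 'logs', _type='stderr')
```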
-
get_qacct
(job_id=None)[source]¶ Gets the qacct entry for a completed qsub job, used to determine if the job completed successfully
Notes
This operation is extremely slow; it takes roughly 10 to 30+ seconds to complete
Returns: The character string representation of the stdout from the qacct -j command for the job
Return type: str
-
get_qacct_job_failed_status
(failed_entry) Special parsing for the ‘failed’ entry in qacct output, because it is not always a plain digit value; it can also contain a text description
Returns: the first int value found after splitting the text on the first whitespace
Return type: int
Examples
Example of weird ‘failed’ entry that needs to be parsed:
{'failed': '100 : assumedly after job'}
In this case, the value 100 would be returned
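The parsing described above amounts to splitting on the first run of whitespace and converting the leading token. A minimal sketch, using the hypothetical name `failed_status_sketch`:

```python
def failed_status_sketch(failed_entry):
    """Return the leading integer of a qacct 'failed' value."""
    first_token = str(failed_entry).split(None, 1)[0]
    return int(first_token)


record = {'failed': '100 : assumedly after job'}
failed_status = failed_status_sketch(record['failed'])
```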
-
get_state
(status, job_state_key)[source]¶ Gets the interpretation of the job’s status from the job_state_key, e.g. “Running”, etc.
Returns: Return type: str
-
get_status
(id, entry=None, qstat_stdout=None)[source]¶ Gets the status of the qsub job, e.g. “Eqw”, “r”, etc.
Returns: Return type: str
-
present
() Returns whether or not the job is currently in the qstat queue
Returns: True if present, otherwise False
Return type: bool
-
qacct2dict
(proc_stdout=None, entry_delim=None)[source]¶ Converts text output from qacct into a dictionary for parsing
Parameters: entry_delim (str) – character string delimiter to split entries in the qacct output, defaults to ‘==============================================================’
Returns: a dictionary of individual records containing metadata about the completion status of jobs with the matching job_id
Return type: dict
Notes
qacct returns multiple entries per job_id, because job_id values wrap around. Multiple historic jobs with the same job_id number will therefore also be returned, delimited by a long string of ===
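One way to picture the conversion is the sketch below, which splits the text on lines made of `=` characters and turns each "key value" line into a dict entry. The name `qacct2dict_sketch`, the integer keys of the outer dict, and the sample text are all assumptions for illustration, not the module's actual behaviour or real qacct output.

```python
import re


def qacct2dict_sketch(proc_stdout):
    """Split qacct-style text into records keyed by their position."""
    records = {}
    # each record is preceded by a line consisting only of '=' characters
    blocks = [b for b in re.split(r'(?m)^=+[ \t]*$', proc_stdout) if b.strip()]
    for index, block in enumerate(blocks):
        record = {}
        for line in block.strip().splitlines():
            parts = line.split(None, 1)
            if parts:
                record[parts[0]] = parts[1].strip() if len(parts) > 1 else ''
        records[index] = record
    return records


# illustrative text only, shaped like two records sharing one job number
sample = (
    "==============\n"
    "jobnumber    4088513\n"
    "failed       0\n"
    "exit_status  0\n"
    "==============\n"
    "jobnumber    4088513\n"
    "failed       100 : assumedly after job\n"
    "exit_status  137\n"
)
records = qacct2dict_sketch(sample)
```

Because two records come back for the same job number, some form of filtering is still needed before a single completion status can be read.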
-
running
() Returns whether or not the job is currently considered to be running
Returns: True if running, otherwise False
Return type: bool
-
update_completion_validations
(validation_dict)[source]¶ Updates the completion_validations dict of validation stats with a pretty printed view of the validations dictionary, along with the Job’s text string representation
-
update_log_files
(_type='stdout')[source]¶ Updates the paths to the log files in the log_paths attribute
-
validate_completion
(job_id=None, *args, **kwargs)[source]¶ Checks if the qsub job completed successfully. Multiple validation criteria are evaluated one at a time, and the results of each are added to a completion_validations dictionary attribute along with a verbose description of the criteria. After all the criteria have been evaluated, returns a boolean True or False to determine if all criteria passed validation. This determines if a compute job is considered to have completed successfully or not.
Returns: True or False, whether or not all job completion validation criteria passed
Return type: bool
-
-
snsxt.util.qsub.
demo_multi_qsub
(job_num=3)[source]¶ Demo of the qsub code functions. Submits multiple jobs and monitors them to completion.
-
snsxt.util.qsub.
demo_qsub
()[source]¶ Demo the qsub code functions
Examples
Example usages:
import qsub; job = qsub.submit(log_dir = "logs", print_verbose = True); qsub.monitor_jobs([job], print_verbose = True); job.validate_completion(); print(job.completions)
import qsub; job = qsub.submit(log_dir = "logs", print_verbose = True, monitor = True); job.validate_completion()
import qsub; job = qsub.submit(log_dir = "logs", print_verbose = True, monitor = True, validate = True)
-
snsxt.util.qsub.
filter_qacct
(qacct_dict, days_limit=7)[source]¶ Filters out ‘bad’ entries from the dict
-
snsxt.util.qsub.
find_all_job_id_names
(text) Searches a multi-line character string for all qsub job submission messages, where text represents the stdout from a series of shell commands which are assumed to have submitted a number of qsub jobs (e.g. by an external program)
Parameters: text (str) – a single character string, e.g. representing line(s) of text assumed to be stdout from a shell command that submitted qsub jobs
Notes
This function works by parsing the provided text for lines that look like this:
Your job 3947957 ("sns.wes.SeraCare-1to1-Positive") has been submitted
Examples
Example usage:
>>> text = '\n\n process sample SeraCare-1to1-Positive\n\n CMD: qsub -q all.q -cwd -b y -j y -N sns.wes.SeraCare-1to1-Positive -M kellys04@nyumc.org -m a -hard -l mem_free=64G -pe threaded 8-16 bash /ifs/data/molecpathlab/scripts/snsxt/sns_output/test/sns/routes/wes.sh /ifs/data/molecpathlab/scripts/snsxt/sns_output/test SeraCare-1to1-Positive\nYour job 3947957 ("sns.wes.SeraCare-1to1-Positive") has been submitted\n\n'
>>> [(job_id, job_name) for job_id, job_name in find_all_job_id_names(text)]
[('3947957', 'sns.wes.SeraCare-1to1-Positive')]
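A regular expression is one plausible way to pick out the submission messages. The pattern and the name `find_all_job_id_names_sketch` below are assumptions for illustration; the docs do not say how the real function parses the text.

```python
import re

# matches lines such as: Your job 3947957 ("some.job.name") has been submitted
JOB_SUBMITTED_PATTERN = re.compile(
    r'Your job (\d+) \("([^"]+)"\) has been submitted')


def find_all_job_id_names_sketch(text):
    """Yield (job_id, job_name) string pairs for every submission message."""
    for match in JOB_SUBMITTED_PATTERN.finditer(text):
        yield match.group(1), match.group(2)


text = ('\n process sample SeraCare-1to1-Positive\n'
        'Your job 3947957 ("sns.wes.SeraCare-1to1-Positive") has been submitted\n\n')
found = list(find_all_job_id_names_sketch(text))
```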
-
snsxt.util.qsub.
get_job_ID_name
(proc_stdout)[source]¶ Parses stdout text to find lines that match the output message from a qsub job submission
Returns: (<job id number>, <job name>)
Return type: tuple
Examples
Example usage:
proc_stdout = submit_job(return_stdout = True)  # 'Your job 1245023 ("python") has been submitted'
job_id, job_name = get_job_ID_name(proc_stdout)
-
snsxt.util.qsub.
get_qacct_job_failed_status
(failed_entry) Special parsing for the ‘failed’ entry in qacct output, because it is not always a plain digit value; sometimes it also contains a text description
Examples
Example text that needs parsing:
{'failed': '100 : assumedly after job'}
-
snsxt.util.qsub.
job_state_key
= defaultdict(<function <lambda>>, {'r': 'Running', 'dr': None, 'qw': 'Waiting', 'Eqw': 'Error', 't': None}) Dictionary containing possible qsub job states; default state is None
Format is key: value, where key is the character string representation of the job state provided by qstat output, and value is a description of the state.
Eqw: Error; the job is in an error status and never started running
r: Running; the job is currently running
qw: Waiting; the job is currently in the scheduler queue waiting to run
t: None; ???
dr: None; the job has been submitted for deletion and will be deleted
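The lookup behaviour of such a mapping can be sketched with `collections.defaultdict`. The helper names `is_running_sketch` and `is_error_sketch` are hypothetical; the state table mirrors the one listed above.

```python
from collections import defaultdict

# unknown qstat state codes fall back to None instead of raising KeyError
job_state_key_sketch = defaultdict(lambda: None)
job_state_key_sketch.update({
    'r': 'Running',
    'qw': 'Waiting',
    'Eqw': 'Error',
    't': None,
    'dr': None,
})


def is_running_sketch(status):
    """True when the qstat status code maps to the 'Running' state."""
    return job_state_key_sketch[status] == 'Running'


def is_error_sketch(status):
    """True when the qstat status code maps to the 'Error' state."""
    return job_state_key_sketch[status] == 'Error'
```

Note that indexing a defaultdict with an unknown code stores that code with the default value, so the dict can grow as new codes are seen.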
-
snsxt.util.qsub.
kill_job_ids
(job_ids) Kills qsub jobs by issuing the qdel command
Parameters: job_ids (list) – a list of job ID numbers
Examples
Example usage:
import qsub
job_ids = ['4104004', '4104006', '4104009']
qsub.kill_job_ids(job_ids = job_ids)
-
snsxt.util.qsub.
kill_jobs
(jobs) Kills qsub jobs by issuing the qdel command
Parameters: jobs (list) – a list of Job objects
-
snsxt.util.qsub.
monitor_jobs
(jobs=None, kill_err=True, print_verbose=False, **kwargs)[source]¶ Monitors a list of qsub Job objects for completion. Job monitoring is accomplished by calling each job’s present() and error() methods, then waiting for several seconds. Jobs that are no longer present in qstat or have an error state will be removed from the monitoring queue. The function will repeatedly check each job and then wait, removing absent or errored jobs, until no jobs remain in the monitoring queue. Optionally, jobs that had an error status will be killed with the qdel command, or else they will remain in qstat indefinitely.
This function allows your program to wait for jobs to finish running before continuing.
Parameters: - jobs (list) – a list of Job objects
- kill_err (bool) – True or False, whether or not jobs left in error state should be automatically killed. It is recommended to leave this True
- print_verbose (bool) – whether or not descriptions of the steps being taken should be printed to the console with Python’s print function
Returns: a tuple of lists containing Job objects, in the format: (completed_jobs, err_jobs)
Return type: tuple
Notes
This function will only check whether a job is present/absent in the qstat queue, or in an error state in the qstat queue; it does not actually check if a job is in a ‘Running’ state.
If a job is present and not in error state, it is assumed to either be ‘qw’ (waiting to run), or ‘r’ (running). In both cases, it is assumed that the job will eventually finish and leave the qstat queue, and subsequently be removed from this function’s monitoring queue.
Jobs in ‘Eqw’ error state are stuck and will not leave on their own so must be removed automatically by this function, or killed manually by the end user.
The jobs list is mutable and passed by reference; this means that upon completion of this function, the original jobs list will be depleted:
>>> import qsub
>>> jobs = []
>>> len(jobs)
0
>>> for i in range(5):
...     job = qsub.submit('sleep 20')
...     jobs.append(job)
...
>>> len(jobs)
5
>>> qsub.monitor_jobs(jobs = jobs)
([Job(id = 4098911, name = python, log_dir = None), Job(id = 4098913, name = python, log_dir = None), Job(id = 4098915, name = python, log_dir = None), Job(id = 4098912, name = python, log_dir = None), Job(id = 4098914, name = python, log_dir = None)], [])
>>> len(jobs)
0
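The polling loop described in these notes can be sketched without a cluster by using stand-in job objects. `FakeJob` and `monitor_jobs_sketch` are hypothetical names; the sketch omits the qdel step for errored jobs and uses a very short wait, so it illustrates only the queue-draining logic.

```python
import time


class FakeJob(object):
    """Stand-in for a Job: stays 'present' for a fixed number of polls."""

    def __init__(self, id, polls_until_done, errors=False):
        self.id = id
        self._polls_left = polls_until_done
        self._errors = errors

    def present(self):
        self._polls_left -= 1
        return self._polls_left >= 0

    def error(self):
        return self._errors


def monitor_jobs_sketch(jobs, wait_seconds=0.01):
    """Poll each job, removing absent or errored jobs until none remain."""
    completed_jobs, err_jobs = [], []
    while jobs:
        for job in list(jobs):  # iterate over a copy while mutating the original
            if job.error():
                err_jobs.append(job)
                jobs.remove(job)
            elif not job.present():
                completed_jobs.append(job)
                jobs.remove(job)
        time.sleep(wait_seconds)
    return completed_jobs, err_jobs


jobs = [FakeJob(1, polls_until_done=2), FakeJob(2, polls_until_done=0),
        FakeJob(3, polls_until_done=5, errors=True)]
completed_jobs, err_jobs = monitor_jobs_sketch(jobs)
```

As in the doctest above, the list passed in is emptied by the time the function returns.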
Examples
Example usage:
job = submit(print_verbose = True)
completed_jobs, err_jobs = monitor_jobs([job], print_verbose = True)
[job.validate_completion() for job in completed_jobs]
-
snsxt.util.qsub.
qacct2dict
(proc_stdout)[source]¶ Converts text output from qacct into a dictionary for parsing
-
snsxt.util.qsub.
submit
(verbose=False, log_dir=None, monitor=False, validate=False, *args, **kwargs)[source]¶ Submits a shell command to be run as a qsub compute job. Returns a Job object. Passes args and kwargs to submit_job. Compute jobs are created by assembling a qsub shell command using a bash heredoc wrapped around the provided shell command to be executed. The numeric job ID and job name echoed by qsub on stdout will be captured and used to generate a ‘Job’ object.
Parameters: - verbose (bool) – True or False, whether or not the generated qsub command should be printed in log output
- log_dir (str) – the directory to use for qsub job log output files, defaults to the current working directory
- monitor (bool) – whether the job should be immediately monitored until completion
- validate (bool) – whether or not the job should immediately be validated upon completion
- *args (list) – list of arguments to pass on to submit_job
- **kwargs (dict) – dictionary of args to pass on to submit_job
Returns: a Job object, representing a qsub compute job that has been submitted to the HPC cluster
Return type: Job
Examples
Example usage:
job = submit(command = 'echo foo')
job = submit(command = 'echo foo', log_dir = "logs", print_verbose = True, monitor = True, validate = True)
-
snsxt.util.qsub.
submit_job
(command='echo foo', params='-j y', name='python', stdout_log_dir=None, stderr_log_dir=None, return_stdout=False, verbose=False, pre_commands='set -x', post_commands='set +x', sleeps=0.5, print_verbose=False, **kwargs) Internal function for submitting compute jobs to the HPC cluster running SGE by using the qsub shell command. Use submit rather than calling this function directly; args and kwargs will be evaluated here. Creates a qsub shell command to be run in a subprocess, submitting the cluster job to SGE with a bash heredoc wrapper.
Parameters: - command (str) – shell commands to be run inside the compute job
- params (str) – extra params to be passed to qsub
- name (str) – the name of the qsub compute job
- stdout_log_dir (str) – the path to the directory to use for qsub log output; if None, defaults to the current working directory
- stderr_log_dir (str) – the path to the directory to use for qsub log output; if None, defaults to the current working directory
- return_stdout (bool) – whether or not the function should return the stdout of the qsub submission subprocess call; it is recommended to always leave this set to True, otherwise stdout will be printed to the program log output
- verbose (bool) – whether or not the generated qsub command should be printed in program log output
- pre_commands (str) – commands to run before the command inside the qsub job; defaults to ‘set -x’ in order to provide verbose qsub log output, you can also put environment modulation code here.
- post_commands (str) – commands to run after the command inside the qsub job; defaults to ‘set +x’
- sleeps (int) – number of seconds to sleep after submitting a qsub job; it is recommended to leave this set to a value >0 in order to avoid overwhelming the job scheduler with requests
- print_verbose (bool) – print the generated qsub command to the console with the Python print function (as opposed to logger output)
Returns: returns the stdout of the evaluated qsub shell command, assuming return_stdout = True was passed. Otherwise, returns nothing.
Return type: str
Notes
stdout_log_dir and stderr_log_dir should have trailing slashes in their paths, and are set to the same path by default using the log_dir arg in submit
Malformed or nonexistent stdout_log_dir and stderr_log_dir paths are a common source of compute job failure.
Use submit rather than calling this function directly.
This function generates a qsub shell command in a format such as this:
qsub -j y -N "python" -o :"/ifs/data/molecpathlab/scripts/snsxt/snsxt/util/" -e :"/ifs/data/molecpathlab/scripts/snsxt/snsxt/util/" <<E0F
set -x
cat /etc/hosts
sleep 10
set +x
E0F
The generated shell command will be evaluated by Python subprocess, and its stdout messages returned.
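Assembling a command string in the format shown above could look roughly like the sketch below. The name `build_qsub_command_sketch` is hypothetical, and the function only builds the text; it does not submit anything or call subprocess.

```python
def build_qsub_command_sketch(command='echo foo', params='-j y', name='python',
                              stdout_log_dir='logs/', stderr_log_dir='logs/',
                              pre_commands='set -x', post_commands='set +x'):
    """Return a qsub heredoc command string; nothing is executed here."""
    header = 'qsub {0} -N "{1}" -o :"{2}" -e :"{3}" <<E0F'.format(
        params, name, stdout_log_dir, stderr_log_dir)
    # the heredoc body is: pre-commands, the job's commands, post-commands
    return '\n'.join([header, pre_commands, command, post_commands, 'E0F'])


qsub_command = build_qsub_command_sketch(command='cat /etc/hosts\nsleep 10')
qsub_lines = qsub_command.splitlines()
```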
snsxt.util.sh module
snsxt.util.template module
Template Python script
snsxt.util.test module
Run all the unit tests
snsxt.util.test_find module
unit tests for the find module
snsxt.util.test_qsub module
unit tests for the qsub module
-
class
snsxt.util.test_qsub.
TestJob
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
-
test_debug_init_Job
()[source]¶ Make sure that the ‘debug’ init setting prevents attributes from being set
-
test_find_all_job_id_names1
()[source]¶ Test that job IDs and names can be parsed from a blob of text
-
test_running_job1
()[source]¶ Find running job id = ‘2495634’ self.qstat_stdout_r_Eqw_file
qstat_stdout_r_Eqw_file = "fixtures/qstat_stdout_r_Eqw.txt"
with open(qstat_stdout_r_Eqw_file, "rb") as f:
    qstat_stdout_r_Eqw_str = f.read()
from qsub import Job
x = Job(id = '2495634', debug = True)
-
snsxt.util.test_tools module
unit tests for the tools module
-
class
snsxt.util.test_tools.
TestItemExists
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
-
class
snsxt.util.test_tools.
TestNumLines
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
-
class
snsxt.util.test_tools.
TestSubprocessCmd
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
snsxt.util.tools module
General utility functions and classes for the program
-
class
snsxt.util.tools.
DirHop
(directory)[source]¶ Bases:
object
A class for executing commands in the context of a different working directory. Adapted from: https://mklammler.wordpress.com/2011/08/14/safe-directory-hopping-with-python/
with DirHop('/some/dir') as d:
    do_something()
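A context manager of this kind typically records the current directory on entry and restores it on exit. The class `DirHopSketch` below is a hypothetical minimal version for illustration, not the module's own code.

```python
import os
import tempfile


class DirHopSketch(object):
    """Change into a directory for the duration of a with-block."""

    def __init__(self, directory):
        self.directory = directory
        self.old_dir = None

    def __enter__(self):
        self.old_dir = os.getcwd()
        os.chdir(self.directory)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # always return to the starting directory, even after an exception
        os.chdir(self.old_dir)
        return False  # do not suppress exceptions


start_dir = os.getcwd()
target_dir = tempfile.mkdtemp()
with DirHopSketch(target_dir):
    inside_dir = os.getcwd()
end_dir = os.getcwd()
```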
-
class
snsxt.util.tools.
SubprocessCmd
(command)[source]¶ Bases:
object
A command to be run in subprocess
run_cmd = SubprocessCmd(command = 'echo foo').run()
-
snsxt.util.tools.
backup_file
(input_file, return_path=False, sys_print=False, use_logger=None) Back up a file by moving it to a folder called ‘old’ and appending a timestamp. use_logger is a logger object to log to
-
snsxt.util.tools.
compare
(x, y)¶
-
snsxt.util.tools.
copy_and_overwrite
(from_path, to_path) Copy a directory tree to a new location, overwriting it if it already exists
-
snsxt.util.tools.
item_exists
(item, item_type='any', n=False) Check that an item exists. item_type is ‘any’, ‘file’, or ‘dir’; n is True or False and negates ‘exists’
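The three item types and the negation flag map naturally onto `os.path` checks. The following is a minimal sketch under that assumption; `item_exists_sketch` is a hypothetical name and the real function may differ.

```python
import os
import tempfile


def item_exists_sketch(item, item_type='any', n=False):
    """Check existence by type; n=True inverts the result."""
    checks = {'any': os.path.exists, 'file': os.path.isfile, 'dir': os.path.isdir}
    exists = checks[item_type](item)
    return (not exists) if n else exists


# create a throwaway dir and file to check against
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, 'example.txt')
open(demo_file, 'w').close()
```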
-
snsxt.util.tools.
mkdirs
(path, return_path=False) Make a directory, and all parent dirs in the path
-
snsxt.util.tools.
my_debugger
(vars) Starts an interactive Python terminal at the current location in the script; very handy for debugging. Call this function with my_debugger(globals().copy()) anywhere in the body of the script, or my_debugger(locals().copy()) within a script function
-
snsxt.util.tools.
num_lines
(input_file, skip=0) Count the number of lines in a file. TODO: add tests for this one
-
snsxt.util.tools.
reply_to_address
(servername, username=None)[source]¶ Get the email address to use for the ‘reply to’ field in emails
-
snsxt.util.tools.
update_json
(data, input_file) Add new data to an existing JSON file, or create the file if it doesn’t exist
-
snsxt.util.tools.
write_dicts_to_csv
(dict_list, output_file)[source]¶ write a list of dicts to a CSV file
-
snsxt.util.tools.
write_tabular_overlap
(file1, ref_file, output_file, delim='\t', inverse=False) Find matching entries between two tabular files. Writes all the entries in ‘file1’ that are found in ‘ref_file’ to output_file. Both ‘file1’ and ‘ref_file’ must have headers in common. With inverse = True, writes the entries in file1 that are not in ref_file instead