The GOcats API Reference

The following are located in /GOcats/gocats.

The Gene Ontology Categories Suite (GOcats)

This module provides methods for the creation of directed acyclic concept subgraphs of Gene Ontology, along with methods for evaluating those subgraphs.

gocats.gocats.build_graph(args)[source]

Not yet implemented

Try build_graph_interpreter to create a GO graph object to explore within a Python interpreter.

gocats.gocats.build_graph_interpreter(database_file, supergraph_namespace=None, allowed_relationships=None, relationship_directionality='gocats')[source]

Creates a graph object of GO, which can be traversed and queried within a Python interpreter.

Parameters:
  • database_file (file_handle) – Ontology database file.
  • supergraph_namespace (str) – Optional - Filter graph to a sub-ontology namespace.
  • allowed_relationships (list) – Optional - Filter graph to use only those relationships listed.
  • relationship_directionality – Optional - Any string other than ‘gocats’ will retain all original GO relationship directionalities. Defaults to ‘gocats’, which reverses the has_part direction.
Returns:

A Graph object of the ontology provided.

Return type:

class

gocats.gocats.categorize_dataset(dataset_file, term_mapping, output_directory, mapped_dataset_filename, dataset_type='GAF', entity_col=0, go_col=1, retain_unmapped_annotations=False)[source]

Reads in a Gene Annotation File (GAF) and maps the annotations contained therein to the categories organized by GOcats or other methods. Outputs a mapped GAF and a list of unmapped genes in the specified output directory.

Parameters:
  • dataset_file – A file containing gene annotations.
  • term_mapping – A dictionary mapping category-defining ontology terms to their subgraph children terms. May be produced by GOcats or another method.
  • output_directory – The directory where the output file will be stored.
  • mapped_dataset_filename – The desired name of the mapped GAF.
  • dataset_type – Enter file type for dataset [GAF|TSV|CSV]. Defaults to “GAF”.
  • entity_col – If CSV or TSV file type, indicate which column the entity IDs are listed. Defaults to 0.
  • go_col – If CSV or TSV file type, indicate which column the GO IDs are listed. Defaults to 1.
  • retain_unmapped_annotations – If True, annotations that are not mapped to a concept are copied unchanged into the mapped dataset output file. Defaults to False.
Returns:

None

Return type:

None
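
The mapping step described for categorize_dataset can be sketched as follows. This is a minimal illustration, assuming term_mapping is a dict of category term IDs to lists of subgraph child term IDs and that the dataset has already been reduced to (entity, GO ID) pairs. The helper name map_annotations and the sample data are hypothetical and are not part of GOcats.

```python
from collections import defaultdict


def map_annotations(annotations, term_mapping, retain_unmapped=False):
    """Map (entity, go_id) pairs onto category terms (illustrative only)."""
    # Invert category -> children into child -> set of categories.
    child_to_categories = defaultdict(set)
    for category, children in term_mapping.items():
        for child in children:
            child_to_categories[child].add(category)

    mapped, unmapped_entities = [], set()
    for entity, go_id in annotations:
        categories = child_to_categories.get(go_id)
        if categories:
            mapped.extend((entity, category) for category in sorted(categories))
        else:
            unmapped_entities.add(entity)
            if retain_unmapped:
                mapped.append((entity, go_id))  # keep the original annotation
    return mapped, unmapped_entities


# Hypothetical sample data for illustration.
term_mapping = {"GO:0005634": ["GO:0005634", "GO:0005730"]}
annotations = [("geneA", "GO:0005730"), ("geneB", "GO:0005886")]
mapped, unmapped = map_annotations(annotations, term_mapping)
```

Here geneA is mapped to the category term, while geneB is reported as unmapped unless retain_unmapped is set.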

gocats.gocats.create_subgraphs(database_file, keyword_file, output_directory, supergraph_namespace=None, subgraph_namespace=None, supergraph_relationships=['is_a', 'part_of', 'has_part'], subgraph_relationships=['is_a', 'part_of', 'has_part'], map_supersets=False, output_termlist=False, go_basic_scoping=False, network_table_name=None, test=False)[source]

Creates a graph object of an ontology, processed into gocats.dag.OboGraph or to an object that inherits from gocats.dag.OboGraph, and then extracts subgraphs which represent concepts that are defined by a list of provided keywords. Each subgraph is processed into gocats.subdag.SubGraph.

Parameters:
  • database_file – Ontology database file.
  • keyword_file – A CSV file with two columns: column 1 naming categories, and column 2 listing search strings (no quotation marks, separated by semicolons).
  • output_directory – The directory where results are stored.
  • supergraph_namespace – a supergraph sub-ontology to filter e.g. cellular_component, optional
  • subgraph_namespace – a subgraph sub-ontology to filter e.g. cellular_component, optional
  • supergraph_relationships – a list of relationships to limit in the supergraph e.g. [‘is_a’, ‘part_of’], optional
  • subgraph_relationships – a list of relationships to limit in subgraphs e.g. [‘is_a’, ‘part_of’], optional
  • map_supersets – whether to allow subgraphs to subsume other subgraphs, logical, optional
  • output_termlist – whether to create a translation of ontology terms to their names to improve interpretability of dev test results, logical, optional
  • go_basic_scoping – whether to create a GO graph similar to go-basic with only scoping-type relationships (is_a and part_of), logical, optional
  • network_table_name – a custom name for the network table produced from the subgraphs (defaults to NetworkTable.csv), optional
Returns:

None

Return type:

None
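
The keyword_file layout described above (column 1 naming categories, column 2 listing semicolon-separated search strings) could be read with a sketch like the one below. The function name read_keywords and the sample rows are hypothetical; how GOcats itself reads the file is not shown here.

```python
import csv
import io


def read_keywords(handle):
    """Return {category: [search strings]} from a two-column CSV handle."""
    keywords = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        category, search_strings = row[0].strip(), row[1]
        keywords[category] = [s.strip() for s in search_strings.split(";") if s.strip()]
    return keywords


# Hypothetical keyword file contents.
sample = io.StringIO(
    "mitochondria,mitochondria;mitochondrial;mitochondrion\n"
    "nucleus,nucleus;nuclei;nuclear\n"
)
keywords = read_keywords(sample)
```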

gocats.gocats.find_category_subsets(subgraph_collection)[source]

Finds subgraphs which are subsets of other subgraphs to remove redundancy, when specified.

Parameters:subgraph_collection – A dictionary of subgraph objects (keys: subgraph name, values: subgraph object).
Returns:A dictionary relating which subgraph objects are subsets of other subgraphs (keys: subset subgraph, values: superset subgraphs).
Return type:dict
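
A minimal sketch of the subset search, assuming each subgraph can be reduced to the set of its node IDs. The real function receives subgraph objects, and its exact comparison criterion is not stated here, so find_subsets and the sample IDs are illustrative only.

```python
def find_subsets(node_ids_by_category):
    """Return {subset category: [superset categories]} for sets of node IDs."""
    subsets = {}
    for name, ids in node_ids_by_category.items():
        supersets = [
            other
            for other, other_ids in node_ids_by_category.items()
            if other != name and ids <= other_ids
        ]
        if supersets:
            subsets[name] = supersets
    return subsets


# Hypothetical categories with made-up node IDs.
categories = {
    "nucleus": {"EX:1", "EX:2", "EX:3"},
    "nucleolus": {"EX:2"},
    "cytoplasm": {"EX:4"},
}
result = find_subsets(categories)
```

Note that two categories with identical node sets would each be reported as a subset of the other under this rule.
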
gocats.gocats.json_format_graph(graph_object, graph_identifier)[source]

Creates a dictionary representing the edges in the graph and formats it in such a way that it can be encoded into JSON for comparing the graph objects between versions of GOcats.

gocats.gocats.remap_goterms(go_database, goa_gaf, ancestor_filename, namespace_filename, allowed_relationships, identifier_column)[source]

Reads in a Gene Ontology relationship file, and a Gene Annotation File (GAF), and follows the GOcats rules for allowed term-to-term relationships. Generates as output a new GAF, and a new term to ontology namespace mapping.

Parameters:
  • go_database – the gene ontology dataset
  • goa_gaf – the gene annotation file
  • ancestor_filename – the output file containing new gene to ontology mappings
  • namespace_filename – the output file containing the term to ontology mappings
  • allowed_relationships – what term to term relationships will be considered (is_a,part_of,has_part)
  • identifier_column – which column is being used for the gene identifiers (1)
Returns:

None

Return type:

None

Directed Acyclic Graph (DAG)

Contains necessary objects for creating a Directed Acyclic Graph (DAG) object to represent Open Biomedical Ontologies (OBO).

class gocats.dag.OboGraph(namespace_filter=None, allowed_relationships=None)[source]

A pythonic graph of a generic Open Biomedical Ontology (OBO) directed acyclic graph (DAG).

__init__(namespace_filter=None, allowed_relationships=None)[source]

OboGraph initializer. Leave namespace_filter and allowed_relationships as None to create the entire ontology graph. Otherwise, provide filters to limit what information is pulled into the graph.

Parameters:
  • namespace_filter (str) – Specify the namespace of a sub-ontology, if one is available for the ontology.
  • allowed_relationships (list) – Specify a list of relationships to utilize in the graph; other relationships will be ignored.
orphans

property defining a set of nodes in the graph which have no parents. When the graph is modified, calls _update_graph() to repopulate the sets of orphan and leaf nodes.

Returns:Set of ‘orphan’ gocats.dag.AbstractNode objects.
Return type:set
leaves

property defining a set of nodes in the graph which have no children. When the graph is modified, calls _update_graph() to repopulate the sets of orphan and leaf nodes.

Returns:Set of ‘leaf’ gocats.dag.AbstractNode objects.
Return type:set
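
The lazy refresh described for orphans and leaves can be sketched with a toy class. This is not the OboGraph implementation: the class name TinyGraph, its attributes, and the use of plain strings as nodes are assumptions made for illustration.

```python
class TinyGraph:
    """Toy graph with lazily refreshed orphan and leaf sets (not OboGraph)."""

    def __init__(self):
        self.parents = {}   # node -> set of parent nodes
        self.children = {}  # node -> set of child nodes
        self._orphans = set()
        self._leaves = set()
        self._modified = False

    def add_node(self, node):
        self.parents.setdefault(node, set())
        self.children.setdefault(node, set())
        self._modified = True

    def add_edge(self, child, parent):
        self.add_node(child)
        self.add_node(parent)
        self.parents[child].add(parent)
        self.children[parent].add(child)
        self._modified = True

    def _update_graph(self):
        # Recompute both sets once, then clear the modification flag.
        self._orphans = {n for n, p in self.parents.items() if not p}
        self._leaves = {n for n, c in self.children.items() if not c}
        self._modified = False

    @property
    def orphans(self):
        if self._modified:
            self._update_graph()
        return self._orphans

    @property
    def leaves(self):
        if self._modified:
            self._update_graph()
        return self._leaves


graph = TinyGraph()
graph.add_edge("nucleolus", "nucleus")
graph.add_edge("nucleus", "cell")
```

The sets are only recomputed when a property is read after a modification, so repeated reads of an unchanged graph cost nothing extra.
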
valid_node(node)[source]

Defines condition of a valid node. Node is valid if it is not obsolete and is contained within the given ontology namespace constraint.

Parameters:node – A gocats.dag.AbstractNode object
Returns:True if node is valid, False otherwise
Return type:bool
valid_edge(edge)[source]

Defines condition of a valid edge. Edge is valid if it is within the list of allowed edges and connects two nodes that are both contained in the graph in question.

Parameters:edge – A gocats.dag.AbstractEdge object
Returns:True if edge is valid, False otherwise
Return type:bool
_update_graph()[source]

Repopulates graph orphans and leaves sets.

Returns:None
Return type:None
add_node(node)[source]

Adds a node object to the graph, adds an object pointer to the vocabulary index to reference nodes to every word in the node name and definition. Sets modification state to True.

Parameters:node – A gocats.dag.AbstractNode object.
Returns:None
Return type:None
remove_node(node)[source]

Removes a node from the graph and deletes node references from all entries in the vocabulary index. Sets modification state to True.

Parameters:node – A gocats.dag.AbstractNode object.
Returns:None
Return type:None
add_edge(edge)[source]

Adds an edge object to the graph, and counts the edge relationship type. Sets modification state to True.

Parameters:edge – A gocats.dag.AbstractEdge object.
Returns:None
Return type:None
remove_edge(edge)[source]

Removes an edge object from the graph, and removes references to that edge from the node objects involved. Sets modification state to True.

Parameters:edge – A gocats.dag.AbstractEdge object.
Returns:None
Return type:None
add_relationship(relationship)[source]

Adds a gocats.dag.AbstractRelationship object to the graph’s relationship index, referenced by that relationship’s ID. Sets modification state to True.

Parameters:relationship – A gocats.dag.AbstractRelationship object.
Returns:None
Return type:None
instantiate_valid_edges()[source]

Add all edge references to their respective nodes and vice versa if both nodes of the edge are in the graph. This is carried out by AbstractEdge.connect_nodes(). Also adds gocats.dag.AbstractRelationship object reference to each edge. If both nodes are not in the graph, the edge is deleted from the graph. Sets modification state to True.

Returns:None
Return type:None
node_depth(sample_node)[source]

Returns an integer representing how many nodes are between the given node and the root node of the graph (depth level).

Parameters:sample_node – A gocats.dag.AbstractNode object.
Returns:Depth level.
Return type:int
filter_nodes(search_string_list)[source]

Returns a list of node objects that contain vocabulary matching the keywords provided in the search string list. Nodes are selected by searching through the vocabulary index.

Parameters:search_string_list – A list of search strings provided in the keyword_file provided to gocats.gocats.create_subgraphs().
Returns:A list of gocats.dag.AbstractNode objects.
Return type:list
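
The vocabulary index used by add_node and filter_nodes can be pictured as a mapping from words to the nodes whose name or definition contains them. The sketch below assumes simple lowercase word tokenization and plain node IDs; the function names and the sample names and definitions are hypothetical, not GOcats internals or actual GO text.

```python
import re
from collections import defaultdict

vocab_index = defaultdict(set)  # word -> set of node IDs


def index_node(node_id, name, definition):
    """Reference the node under every word of its name and definition."""
    for word in re.findall(r"[a-z0-9]+", f"{name} {definition}".lower()):
        vocab_index[word].add(node_id)


def filter_node_ids(search_strings):
    """Return IDs of nodes indexed under any of the search strings."""
    hits = set()
    for term in search_strings:
        hits |= vocab_index.get(term.lower(), set())
    return sorted(hits)


# Made-up IDs and paraphrased text for illustration.
index_node("EX:0001", "nucleus", "A membrane-bounded organelle of eukaryotic cells.")
index_node("EX:0002", "nucleolus", "A small, dense body within the nucleus.")
index_node("EX:0003", "plasma membrane", "The membrane surrounding a cell.")
```
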
filter_edges(filtered_nodes)[source]

Returns a list of edges in the graph that connect the nodes provided in the filtered nodes list.

Parameters:filtered_nodes – List of filtered nodes provided by filter_nodes().
Returns:A list of gocats.dag.AbstractEdge objects.
Return type:list
nodes_between(start_node, end_node)[source]

Returns a set of nodes that occur along all paths between the start node and the end node. If no paths exist, an empty set is returned.

Parameters:
Returns:

A set of gocats.dag.AbstractNode objects if there is at least one path between the parameters, an empty set otherwise.

Return type:

set
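
One way to picture nodes_between is as the intersection of the ancestors of the start node and the descendants of the end node. The sketch below works on a plain dict of child to parents and assumes that the start and end nodes themselves are excluded from the result; whether GOcats includes them, and how it traverses the graph, is not stated here.

```python
def reachable(start, neighbors):
    """Return every node reachable from start by following neighbors."""
    seen, stack = set(), [start]
    while stack:
        for nxt in neighbors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def nodes_between(start, end, parents):
    """Nodes strictly between start and end on any child-to-parent path."""
    # Build the reverse (parent -> children) view of the same edges.
    children = {}
    for child, node_parents in parents.items():
        for parent in node_parents:
            children.setdefault(parent, set()).add(child)
    above_start = reachable(start, parents)
    below_end = reachable(end, children)
    return above_start & below_end


# Hypothetical graph: d has parents b and c, which both have parent a.
parents = {
    "d": {"b", "c"},
    "b": {"a"},
    "c": {"a"},
    "e": {"c"},
}
```

With this graph, the nodes between "d" and "a" are "b" and "c", and asking for nodes between unconnected terms yields an empty set.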

__weakref__

list of weak references to the object (if defined)

class gocats.dag.AbstractNode[source]

A node containing all basic properties of an OBO node. The parsing object, gocats.ontologyparser.OboParser currently has direct access to data members (id, name, definition, namespace, edges, and obsolete) so that information from the database file can be added to the object.

__init__()[source]

AbstractNode initializer

descendants

property defining a set of nodes in the graph that are recursively reverse of a node with a scoping-type relationship. When the node is modified, calls gocats.dag.AbstractNode._update_node() to repopulate the sets of descendants and ancestors. This represents a “lazy” evaluation of node descendants.

Returns:Set of gocats.dag.AbstractNode objects
Return type:set
ancestors

property defining a set of nodes in the graph that are recursively forward of a node with a scoping-type relationship. When the node is modified, calls gocats.dag.AbstractNode._update_node() to repopulate the sets of descendants and ancestors. This represents a “lazy” evaluation of node ancestors.

Returns:Set of gocats.dag.AbstractNode objects
Return type:set
_update_node()[source]

Repopulates ancestor and descendant sets for a node. Sets modification state to True.

Returns:None
Return type:None
add_edge(edge, allowed_relationships)[source]

Adds a given gocats.dag.AbstractEdge to each of the gocats.dag.AbstractNode objects that the edge connects. If there is a filter for the types of relationships allowed, edges with non-allowed relationship types are not processed. Sets modification state to True.

Returns:None
Return type:None
remove_edge(edge)[source]

Removes a given gocats.dag.AbstractEdge from the gocats.dag.AbstractNode object. Also removes parent or child node references that the edge referenced. Sets modification state to True.

Returns:None
Return type:None
_update_descendants()[source]

Used for the lazy evaluation of graph descendants of the current gocats.dag.AbstractNode object. Creates internal set variable, descendant_set. Iterates through node children until the bottom of the graph is reached. The descendant_set is a set of all nodes across all paths encountered from the current node.

Returns:None
Return type:None
_update_ancestors()[source]

Used for the lazy evaluation of graph ancestors of the current gocats.dag.AbstractNode object. Creates internal set variable, ancestors_set. Iterates through node parents until the top of the graph is reached. The ancestors_set is a set of all nodes across all paths encountered from the current node.

Returns:None
Return type:None
__weakref__

list of weak references to the object (if defined)

class gocats.dag.AbstractEdge(node1_id, node2_id, relationship_id, node_pair=None)[source]

An OBO edge which links two ontology term nodes and contains a relationship type describing how the two nodes are related.

__init__(node1_id, node2_id, relationship_id, node_pair=None)[source]

AbstractEdge initializer. Node pair refers to a tuple of gocats.dag.AbstractNode objects that are connected by the edge. Defaults to None and is later populated.

Parameters:
  • node1_id (str) – The ID of the first term referenced from the ontology file’s relationship line.
  • node2_id (str) – The ID of the second term referenced from the ontology file’s relationship line.
  • relationship_id (str) – The ID of the relationship in the ontology file’s relationship line.
  • node_pair (tuple) – Default-None, provide a tuple containing two gocats.dag.AbstractNode objects if they are already created and able to be referenced.
json_edge

property which returns a tuple where position 0 is a unique string representation of the edge made by combining the ID of the reverse node and the id of the forward nodes and where position 1 is a list of two node IDs: the reverse and forward node.

Returns:tuple of a unique AbstractEdge ID and a list of that edge object’s reverse and forward node IDs, respectively. Returns an empty str at a position for which there are no forward or reverse nodes in the graph.
Return type:tuple
parent_id

property defining the ID of the node forward of the current gocats.dag.AbstractEdge object.

Returns:str ID of the forward node in the node_pair associated with the edge if the edge’s relationship is assigned, None otherwise.
Return type:str or None
child_id

property defining the ID of the node reverse of the current gocats.dag.AbstractEdge object.

Returns:str ID of the reverse node in the node_pair associated with the edge if the edge’s relationship is assigned, None otherwise.
Return type:str or None
forward_node

property defining the gocats.dag.AbstractNode object forward of the current gocats.dag.AbstractEdge object.

Returns:gocats.dag.AbstractNode object of the forward node in the node_pair associated with the edge if the edge’s relationship is assigned, the node_pair is assigned, and the type of relationship is instantiated by gocats.dag.DirectionalRelationship; None otherwise.
Return type:gocats.dag.AbstractNode or None
reverse_node

property defining the gocats.dag.AbstractNode object reverse of the current gocats.dag.AbstractEdge object.

Returns:gocats.dag.AbstractNode object of the reverse node in the node_pair associated with the edge if the edge’s relationship is assigned, the node_pair is assigned, and the type of relationship is instantiated by gocats.dag.DirectionalRelationship; None otherwise.
Return type:gocats.dag.AbstractNode or None
parent_node

property defining the gocats.dag.AbstractNode object forward of the current gocats.dag.AbstractEdge object. This designation will be unique to scoping-type relationships, although this is not yet specified.

Returns:gocats.dag.AbstractNode object of the forward node in the node_pair associated with the edge if the edge’s relationship is assigned, the node_pair is assigned, and the type of relationship is instantiated by gocats.dag.DirectionalRelationship; None otherwise.
Return type:gocats.dag.AbstractNode or None
child_node

property defining the gocats.dag.AbstractNode object reverse of the current gocats.dag.AbstractEdge object. This designation will be unique to scoping-type relationships, although this is not yet specified.

Returns:gocats.dag.AbstractNode object of the reverse node in the node_pair associated with the edge if the edge’s relationship is assigned, the node_pair is assigned, and the type of relationship is instantiated by gocats.dag.DirectionalRelationship; None otherwise.
Return type:gocats.dag.AbstractNode or None
actor_node

not yet implemented

Returns:None
Return type:None
recipient_node

not yet implemented

Returns:None
Return type:None
ordinal_prior_node

not yet implemented

Returns:None
Return type:None
ordinal_post_node

not yet implemented

Returns:None
Return type:None
other_node

not yet implemented

Returns:None
Return type:None
connect_nodes(node_pair, allowed_relationships)[source]

Adds the current edge object to the gocats.dag.AbstractNode objects that are connected by the edge. Populates the node_pair with gocats.dag.AbstractNode objects.

Returns:None
Return type:None
__weakref__

list of weak references to the object (if defined)

class gocats.dag.AbstractRelationship[source]

A relationship as defined by a [typedef] stanza in an OBO ontology and augmented by GOcats to better interpret semantic correspondence.

__init__()[source]

AbstractRelationship initializer.

__weakref__

list of weak references to the object (if defined)

class gocats.dag.DirectionalRelationship[source]

A singly-directional relationship edge connecting two nodes in the graph. The two nodes are designated ‘forward’ and ‘reverse.’ The ‘forward’ node semantically succeeds the ‘reverse’ node in a way that depends on the context of the type of relationship describing the edge to which it is applied.

__init__()[source]

DirectionalRelationship initializer.

forward(pair)[source]

Returns the forward node in a node pair that semantically succeeds the other and is independent of the directionality of the edge. Default position is the second position [1].

Parameters:pair (tuple) – A pair of gocats.dag.AbstractNode objects.
Returns:The forward gocats.dag.AbstractNode object as determined by the pre-defined semantic directionality of the relationship.
reverse(pair)[source]

Returns the reverse node in a node pair that semantically precedes the other and is independent of the directionality of the edge. Default position is the first position [0].

Parameters:pair (tuple) – A pair of gocats.dag.AbstractNode objects.
Returns:The reverse gocats.dag.AbstractNode object as determined by the pre-defined semantic directionality of the relationship.
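
The forward and reverse lookups can be sketched as indexing into the node pair with a stored direction. The class below is a toy stand-in: the attribute name direction is an assumption, and treating the first term of a has_part pair as the forward node is one reading of the has_part reversal mentioned for relationship_directionality, not a confirmed detail of GOcats.

```python
class Directional:
    """Toy stand-in for a directional relationship (names are assumptions)."""

    def __init__(self, direction=1):
        self.direction = direction  # index of the forward node in a pair

    def forward(self, pair):
        return pair[self.direction]

    def reverse(self, pair):
        # The reverse node is whichever position is not the forward one.
        return pair[1 - self.direction]


# For "term_1 is_a term_2", the second term is the forward (broader) node.
is_a = Directional(direction=1)
# Assumed reversal for "term_1 has_part term_2": the first term is forward.
has_part = Directional(direction=0)
pair = ("term_1", "term_2")
```
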
class gocats.dag.NonDirectionalRelationship[source]

A non-directional relationship whose edge directionality is either non-existent or semantically irrelevant.

__init__()[source]

NonDirectionalRelationship initializer.

Gene Ontology Directed Acyclic Graph (GODAG)

Defines a Gene Ontology-specific graph which may have special properties when compared to other OBO formatted ontologies.

class gocats.godag.GoGraph(namespace_filter=None, allowed_relationships=None)[source]

A Gene-Ontology-specific graph. GO-specific idiosyncrasies go here.

__init__(namespace_filter=None, allowed_relationships=None)[source]

GoGraph initializer. Inherits and specializes properties from gocats.dag.OboGraph.

Parameters:
  • namespace_filter (str) – Specify the namespace of a sub-ontology, if one is available for the ontology.
  • allowed_relationships (list) – Specify a list of relationships to utilize in the graph; other relationships will be ignored.
class gocats.godag.GoGraphNode[source]

Extends AbstractNode to include GO relevant information.

__init__()[source]

GoGraphNode initializer. Inherits all properties from gocats.dag.AbstractNode.

Directed Acyclic Subgraph (SubDAG)

A subgraph object of an OBOGraph object.

class gocats.subdag.SubGraph(super_graph, namespace_filter=None, allowed_relationships=None)[source]

A subgraph of a provided supergraph with node contents.

__init__(super_graph, namespace_filter=None, allowed_relationships=None)[source]

SubGraph initializer. Creates a subgraph object of gocats.dag.OboGraph. Leave namespace_filter and allowed_relationships as None to create the entire ontology graph. Otherwise, provide filters to limit what information is pulled into the subgraph.

Parameters:
  • super_graph (obj) – A supergraph object i.e. gocats.godag.GoGraph.
  • namespace_filter (str) – Specify the namespace of a sub-ontology, if one is available for the ontology.
  • allowed_relationships (list) – Specify a list of relationships to utilize in the graph; other relationships will be ignored.
root_id_mapping

Property describing a mapping dict that relates every ontology term ID of subgraphs in gocats.dag.OboGraph to a list of root gocats.subdag.CategoryNode IDs.

Returns:dict of gocats.subdag.SubGraphNode IDs mapped to a list of root gocats.subdag.CategoryNode IDs.
Return type:dict
root_node_mapping

Property describing a mapping dict that relates every ontology gocats.subdag.SubGraphNode object of subgraphs in gocats.subdag.SubGraph to a list of root gocats.subdag.CategoryNode objects.

Returns:dict of gocats.subdag.SubGraphNode objects mapped to a list of root gocats.subdag.CategoryNode objects.
Return type:dict
content_mapping

Property describing a mapping dict that relates every root gocats.subdag.CategoryNode ID of subgraphs in a gocats.subdag.SubGraph to a list of their subgraph nodes’ IDs.

Returns:dict of gocats.dag.AbstractNode IDs mapped to a list of gocats.dag.AbstractNode IDs.
Return type:dict
subnode(super_node)[source]

Defines a gocats.subdag.SubGraph node object. Calls add_node() to convert a supergraph node into a gocats.subdag.SubGraphNode and add this node to the subgraph.

Parameters:super_node – A node object from the supergraph i.e. gocats.godag.GoGraphNode.
Returns:A gocats.subdag.SubGraphNode object.
Return type:class
add_node(super_node)[source]

Converts a supergraph node into a gocats.subdag.SubGraphNode and adds this node to the subgraph. Sets modification state to True.

Parameters:super_node (obj) – A node object from the supergraph i.e. gocats.godag.GoGraphNode.
Returns:None
Return type:None
connect_subnodes()[source]

Analogous to gocats.dag.OboGraph.instantiate_valid_edges() and gocats.dag.AbstractEdge.connect_nodes(). Updates child and parent node sets for each gocats.subdag.SubGraphNode in the gocats.subdag.SubGraph. Adds edge object references to nodes and node object references to edges. Counts instances of relationship IDs and sets modification state to True.

Returns:None
Return type:None
greedily_extend_subgraph()[source]

Extends a seeded subgraph to include all supergraph descendants of the nodes. Searches through the supergraph to add new SubGraphNode objects.

Returns:None
Return type:None
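
Greedy extension amounts to adding every supergraph descendant of the seeded nodes. The sketch below assumes the supergraph is available as a dict of node ID to child IDs; the function name greedy_extend and the sample graph are hypothetical.

```python
def greedy_extend(seed_ids, children):
    """Return the seeds plus every descendant reachable through children."""
    subgraph_ids = set(seed_ids)
    stack = list(seed_ids)
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in subgraph_ids:
                subgraph_ids.add(child)
                stack.append(child)
    return subgraph_ids


# Hypothetical supergraph fragment.
children = {
    "cell": {"nucleus", "cytoplasm"},
    "nucleus": {"nucleolus", "nuclear envelope"},
    "nucleolus": set(),
}
extended = greedy_extend({"nucleus"}, children)
```
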
conservatively_extend_subgraph()[source]

Not currently in use. Needs to be updated to handle CategoryNode.

Extends a seeded subgraph to include only nodes in the supergraph that occur along paths between nodes in the subgraph. Searches through the supergraph to add new node objects.

Returns:None
Return type:None
remove_orphan_paths()[source]

Not currently in use. Needs to be updated to handle CategoryNode.

Removes nodes and their descendants from the subgraph which do not root to the category-representative node.

Returns:None
Return type:None
static find_representative_nodes(subgraph, search_string_list)[source]

Compiles a list of candidate gocats.subdag.SubGraphNode objects from the gocats.subdag.SubGraph object based on a list of search strings matching strings in the names of the nodes (using regular expressions). Returns a list containing a single candidate node with the highest number of descendants when possible, returns the sole node if the subgraph only contains one node, returns a list of all seeded nodes when choosing candidates is impossible, or aborts if the subgraph is empty.

Parameters:
Returns:

A list of one or more candidate gocats.subdag.SubGraphNode objects chosen as the subgraph’s representative ontology term(s).
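
The selection rules above can be approximated by the following sketch. It assumes node names and descendant counts are available as dicts keyed by node ID, raises an exception where the description says the method aborts, and returns all tied candidates rather than a single one. The function name pick_representatives and the sample data are hypothetical.

```python
import re


def pick_representatives(names, descendant_counts, search_strings):
    """Choose representative node IDs (illustrative selection rule only)."""
    if not names:
        raise ValueError("subgraph is empty")
    if len(names) == 1:
        return list(names)  # the sole node represents the subgraph
    pattern = re.compile(
        "|".join(re.escape(s) for s in search_strings), re.IGNORECASE
    )
    candidates = [node_id for node_id, name in names.items() if pattern.search(name)]
    if not candidates:
        return sorted(names)  # no candidate stands out: keep all seeded nodes
    best = max(descendant_counts[c] for c in candidates)
    return sorted(c for c in candidates if descendant_counts[c] == best)


# Made-up node IDs for illustration.
names = {"EX:a": "nucleus", "EX:b": "nuclear envelope", "EX:c": "chromatin"}
counts = {"EX:a": 2, "EX:b": 0, "EX:c": 0}
chosen = pick_representatives(names, counts, ["nucleus", "nuclear"])
```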

static from_filtered_graph(super_graph, subgraph_name, keyword_list, namespace_filter=None, allowed_relationships=None, extension='greedy')[source]

Staticmethod for extracting a subgraph from the supergraph by selecting nodes that contain vocabulary in the supplied keyword list. Leave namespace_filter and allowed_relationships as None to create the entire ontology graph. Otherwise, provide filters to limit what information is pulled into the subgraph. Graph extension variable defaults to ‘greedy’ which calls greedily_extend_subgraph() to add nodes to the subgraph after instantiation. Conversely, ‘conservative’ may be used to call conservatively_extend_subgraph() for this function.

Parameters:
  • super_graph (obj) – A supergraph object i.e. gocats.godag.GoGraph.
  • subgraph_name (str) – The name of the subgraph being created; will be used as the id of the gocats.subdag.CategoryNode.
  • keyword_list – A list of str entries used to query the supergraph for concepts to be extracted into subgraphs.
  • namespace_filter (str) – Specify the namespace of a sub-ontology, if one is available for the ontology.
  • allowed_relationships (list) – Specify a list of relationships to utilize in the graph; other relationships will be ignored.
  • extension (str) – Specify ‘greedy’ or ‘conservative’ to determine how subgraphs will be extended after creation (defaults to greedy).
Returns:

A gocats.subdag.SubGraph object.

class gocats.subdag.SubGraphNode(super_node=None, allowed_relationships=None)[source]

An instance of a node within a subgraph of an OBO ontology (supergraph)

__init__(super_node=None, allowed_relationships=None)[source]

SubGraphNode initializer. Inherits from gocats.dag.AbstractNode and contains a reference to the supergraph node it represents e.g. gocats.godag.GoGraphNode.

Parameters:
  • super_node – A node from the supergraph.
  • allowed_relationships – Not currently used. Used to specify a list of allowable relationships evaluated between nodes.
super_edges

property describing the set of edges referenced in the supergraph node, filtered to only those edges with nodes in the subgraph node.

Returns:A set of gocats.subdag.SubGraphNode edges that were copied from the supergraph node.
Return type:set
id

property describing the ID of the supernode

Returns:The ID of a supernode e.g. gocats.godag.GoGraphNode
Return type:str
name

property describing the name of the supernode

Returns:The name of a supernode e.g. gocats.godag.GoGraphNode
Return type:str
definition

property describing the definition of the supernode

Returns:The definition of a supernode e.g. gocats.godag.GoGraphNode
Return type:str
namespace

property describing the namespace of the supernode

Returns:A namespace of a supernode e.g. gocats.godag.GoGraphNode
Return type:str
obsolete

property describing whether or not supernode is marked as obsolete.

Returns:True or False
update_parents(parent_set)[source]

Updates the parent_node_set with a set of new parents provided. Sets modification state to True.

Parameters:parent_set – A set of parent nodes to be added to this object’s parent_node set.
Returns:None
Return type:None
update_children(child_set)[source]

Updates the child_node_set with a set of new children provided. Sets modification state to True.

Parameters:child_set – A set of child nodes to be added to this object’s child_node set.
Returns:None
Return type:None
class gocats.subdag.CategoryNode(category_name, representative_node_list, namespace_filter=None)[source]

A special node added to the subgraph which contains all representative nodes identified and serves as the single representative of the subgraph which represents a concept.

__init__(category_name, representative_node_list, namespace_filter=None)[source]

CategoryNode initializer.

Ontology Parser

A parser which reads ontologies in the OBO format and calls appropriate graph objects to store information in a graph representation. Separate parsing classes within this module operate on distinct ontologies in the OBO Foundry to handle any subtle differences among ontologies.

class gocats.ontologyparser.OboParser[source]

A scaffolding for parsing OBO formatted ontologies. Contains regular expressions for the basic stanzas and information pertinent for creating a graph object of an ontology.

__init__()[source]

OboParser initializer. Contains Regular Expressions for identifying crucial information from OBO formatted ontologies.

__weakref__

list of weak references to the object (if defined)

class gocats.ontologyparser.GoParser(database_file, go_graph, relationship_directionality='gocats')[source]

An ontology parser specific to Gene Ontology

__init__(database_file, go_graph, relationship_directionality='gocats')[source]

GoParser initializer. Parses a Gene Ontology database file and adds properties found therein to a godag.GoGraph object. Importantly: includes descriptions of semantic directionality of all GO relationships.

Parameters:
  • database_file (file_handle) – Specify the location of a Gene Ontology .obo file.
  • go_graph – gocats.godag.GoGraph object.
Returns:None
Return type:None

parse()[source]

Parses the ontology database file and accesses the ontology graph object to add information found in the database. Once all information is added, this function calls the graph’s instantiate_valid_edges function to connect all nodes in the graph by their edges.

Returns:None
Return type:None
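
To make the parsing step concrete, the sketch below reads [Term] stanzas from OBO-formatted text and collects nodes plus (node1_id, node2_id, relationship_id) edge tuples. It is a simplified illustration, not the GoParser implementation: the function name parse_terms is hypothetical, the term IDs and names are made up, and only the id, name, namespace, is_a, and relationship tags are handled.

```python
import re

# Made-up stanzas in OBO flat-file style.
OBO_TEXT = """\
[Term]
id: EX:0000001
name: example organelle
namespace: example_namespace
is_a: EX:0000003 ! example parent

[Term]
id: EX:0000002
name: example organelle part
namespace: example_namespace
relationship: part_of EX:0000001 ! example organelle
"""


def parse_terms(text):
    """Return (nodes, edges) from [Term] stanzas of OBO-formatted text."""
    nodes, edges = {}, []
    for stanza in re.split(r"\n\s*\n", text.strip()):
        lines = stanza.splitlines()
        if not lines or lines[0].strip() != "[Term]":
            continue  # ignore headers and [Typedef] stanzas in this sketch
        term, links = {}, []
        for line in lines[1:]:
            tag, _, value = line.partition(":")
            value = value.split("!")[0].strip()  # drop trailing comments
            if tag == "is_a":
                links.append(("is_a", value))
            elif tag == "relationship":
                rel, target = value.split()[:2]
                links.append((rel, target))
            elif tag in ("id", "name", "namespace"):
                term[tag] = value
        nodes[term["id"]] = term
        edges.extend((term["id"], target, rel) for rel, target in links)
    return nodes, edges


nodes, edges = parse_terms(OBO_TEXT)
```

Edges whose second node is absent from nodes (here EX:0000003) are the kind that a later validation step would need to discard.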

Tools

Functions for handling some file input and output and reformatting tasks in GOcats.

gocats.tools.json_save(obj, filename)[source]

Takes a Python object, converts it into a JSON serializable object (if it is not already), and saves it to a file that is specified.

Parameters:
  • obj – A Python obj.
  • filename (file_handle) – A path to output the resulting JSON file.
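
As a rough illustration of converting an object into a JSON serializable form before saving, the sketch below uses the default hook of the standard json module to turn sets into sorted lists. The helper names are hypothetical, and how gocats.tools.json_save performs its conversion is not shown here.

```python
import json
import os
import tempfile


def to_serializable(obj):
    """Fallback used by json.dump for objects it cannot encode natively."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def save_json(obj, filename):
    """Write obj to filename as JSON, converting sets along the way."""
    with open(filename, "w") as handle:
        json.dump(obj, handle, default=to_serializable, indent=2)


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")
    save_json({"nodes": {"b", "a"}}, path)
    with open(path) as handle:
        loaded = json.load(handle)
```
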
gocats.tools.jsonpickle_save(obj, filename)[source]

Takes a Python object, converts it into a JsonPickle string, and writes it out to a file.

Parameters:
  • obj – A Python obj
  • filename (file_handle) – A path to output the resulting JsonPickle file.
gocats.tools.jsonpickle_load(filename)[source]

Takes a JsonPickle file and loads the JsonPickle object into a Python object.

Parameters:filename (file_handle) – A path to a JsonPickle file.
gocats.tools.list_to_file(filename, data)[source]

Makes a text document from a list of data, with each line of the document being one item from the list and outputs the document into a file.

Parameters:
  • filename (file_handle) – A path to the output file.
  • data – A Python list.
gocats.tools.write_out_gaf(data, filename)[source]

Writes out an object representing a Gene Annotation File (GAF) to a file.

Parameters:
  • data (list) – A list object representing a GAF. Each item in the list represents a row.
  • filename (file_handle) – A path and name for the GAF.
gocats.tools.parse_gaf(filename)[source]

Converts a Gene Annotation File (GAF) into a list object where every item is a row from the GAF.

Parameters:filename (file_handle) – Specify the location of the GAF.
Returns:A list representing the GAF.
Return type:list
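
A GAF is a tab-separated text file in which header and comment lines begin with an exclamation mark. The sketch below shows one way to read such a file into a list of rows and write it back out. The function names are hypothetical, the sample row is made up, and skipping comment lines is an assumption about the desired behavior rather than a description of parse_gaf.

```python
import io


def read_gaf(handle):
    """Return a list of rows, each a list of tab-separated column values."""
    rows = []
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        rows.append(line.rstrip("\n").split("\t"))
    return rows


def write_gaf(rows, handle):
    """Write rows back out as tab-separated lines."""
    for row in rows:
        handle.write("\t".join(row) + "\n")


# Made-up, shortened annotation row for illustration.
sample = io.StringIO("!gaf-version: 2.1\nDB\tID1\tSYMBOL1\t\tEX:0000001\n")
rows = read_gaf(sample)
out = io.StringIO()
write_gaf(rows, out)
```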