gklearn.kernels.treeletKernel
@author: linlin
@references:
[1] Gaüzère B, Brun L, Villemin D. Two new graphs kernels in chemoinformatics. Pattern Recognition Letters. 2012 Nov 1;33(15):2038-47.
- find_all_paths(G, length, is_directed)
Find all paths with a certain length in a graph. A recursive depth first search is applied.
Parameters
- G : NetworkX graph
The graph in which paths are searched.
- length : integer
The length of paths.
Return
- path : list of list
List of paths retrieved, where each path is represented by a list of nodes.
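As an illustration of the search described above, here is a minimal sketch that collects paths from every node. It is not the library's code: it assumes the graph is a plain adjacency dict rather than a NetworkX graph, that length counts edges, and that is_directed (which the parameter list leaves undocumented) decides whether a path and its reverse count as the same path. The names paths_from and all_paths are hypothetical.

```python
def paths_from(adj, node, length):
    """Simple paths with `length` edges starting at `node` (recursive DFS)."""
    if length == 0:
        return [[node]]
    return [
        [node] + tail
        for nbr in adj[node]
        for tail in paths_from(adj, nbr, length - 1)
        if node not in tail  # never revisit the current node
    ]


def all_paths(adj, length, is_directed=False):
    """Collect paths from every node; undirected graphs keep one orientation."""
    paths = [p for node in adj for p in paths_from(adj, node, length)]
    if is_directed:
        return paths
    seen, unique = set(), []
    for p in paths:
        # A path and its reverse share the same orientation-free key.
        key = min(tuple(p), tuple(reversed(p)))
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique


# Undirected path graph 0 - 1 - 2 as an adjacency dict (assumed input format).
adj = {0: [1], 1: [0, 2], 2: [1]}
two_edge_paths = all_paths(adj, 2)
```

In this sketch the undirected search reports the single two-edge path once, while passing is_directed=True keeps both orientations.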
- find_paths(G, source_node, length)
Find all paths of a certain length that start from a source node. A recursive depth-first search is applied.
Parameters
- G : NetworkX graph
The graph in which paths are searched.
- source_node : integer
The identifier of the node from which all paths start.
- length : integer
The length of paths.
Return
- path : list of list
List of paths retrieved, where each path is represented by a list of nodes.
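The single-source search can be sketched as a recursive depth-first search with backtracking. This is an illustrative variant under stated assumptions, not the library's implementation: the graph is an adjacency dict, length is counted in edges, and find_paths_sketch is a hypothetical name.

```python
def find_paths_sketch(adj, source_node, length):
    """Return all simple paths with `length` edges that start at `source_node`."""
    found = []

    def extend(path):
        # A path with `length` edges holds `length + 1` nodes.
        if len(path) == length + 1:
            found.append(list(path))
            return
        for neighbor in adj[path[-1]]:
            if neighbor not in path:   # keep the path simple
                path.append(neighbor)
                extend(path)
                path.pop()             # backtrack

    extend([source_node])
    return found


# Star graph: node 0 joined to nodes 1, 2 and 3.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
paths_len1 = find_paths_sketch(adj, 0, 1)
paths_len2 = find_paths_sketch(adj, 1, 2)
```

Each returned path is a list of nodes, matching the return format documented above.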
- get_canonkeys(G, node_label, edge_label, labeled, is_directed)
Generate canonical keys of all treelets in a graph.
Parameters
- G : NetworkX graph
The graph in which keys are generated.
- node_label : string
Node attribute used as label. The default node label is atom.
- edge_label : string
Edge attribute used as label. The default edge label is bond_type.
- labeled : boolean
Whether the graphs are labeled. The default is True.
Return
- canonkey/canonkey_l : dict
For unlabeled graphs, canonkey is a dictionary that records the number of occurrences of each tree pattern. For labeled graphs, canonkey_l is a dictionary that records the number of occurrences of each treelet.
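To illustrate what a canonical key does, consider the simplest treelets, labeled linear paths. A path can be read from either end, so one common approach is to keep the lexicographically smaller of the two label sequences as its key. The sketch below shows that idea only. The tuple key format, the label values, and the helper names path_canonkey and count_path_treelets are assumptions for illustration, not the key format the library produces.

```python
from collections import Counter


def path_canonkey(labels):
    """Return an orientation-independent key for a labeled path."""
    forward = tuple(labels)
    backward = forward[::-1]
    # Both readings of the same path map to the smaller sequence.
    return min(forward, backward)


def count_path_treelets(label_sequences):
    """Count how often each canonical key occurs."""
    return Counter(path_canonkey(seq) for seq in label_sequences)


# Label sequences (node, edge, node) read along three paths; values are made up.
sequences = [
    ["C", "1", "O"],
    ["O", "1", "C"],  # the same kind of path read from the other end
    ["C", "2", "N"],
]
canonkey_l = count_path_treelets(sequences)
```

The resulting dictionary maps each key to its count, which is the shape of result described for canonkey_l.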
- treeletkernel(*args, sub_kernel, node_label='atom', edge_label='bond_type', parallel='imap_unordered', n_jobs=None, chunksize=None, verbose=True)
Compute treelet graph kernels between graphs.
Parameters
- Gn : list of NetworkX graphs
List of graphs between which the kernels are computed.
- G1, G2 : NetworkX graphs
Two graphs between which the kernel is computed.
- sub_kernel : function
The sub-kernel between two real-valued vectors. Each vector counts the occurrences of each treelet (up to isomorphism) in a graph.
- node_label : string
Node attribute used as label. The default node label is atom.
- edge_label : string
Edge attribute used as label. The default edge label is bond_type.
- parallel : string/None
Which parallelization method is applied to compute the kernel. The following choices are available:
‘imap_unordered’: use Python’s multiprocessing.Pool.imap_unordered method.
None: no parallelization is applied.
- n_jobs : int
Number of jobs for parallelization. The default is to use all computational cores. This argument is only valid when one of the parallelization methods is applied.
Return
- Kmatrix : NumPy matrix
Kernel matrix, each element of which is the treelet kernel between two graphs.
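The role of sub_kernel can be sketched with a Gaussian kernel applied to treelet-count vectors. This is a simplified illustration, not the library's implementation. It assumes each graph has already been reduced to a dictionary of treelet counts, and that two graphs are compared only on the treelets they share. The treelet names and counts are made up, and the helper names are hypothetical.

```python
import math


def gaussian_sub_kernel(x, y, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x - y||^2) between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def treelet_kernel_sketch(counts1, counts2, sub_kernel):
    """Apply the sub-kernel to count vectors aligned on shared treelet keys."""
    # Assumption: only treelets present in both graphs are compared.
    keys = sorted(set(counts1) & set(counts2))
    v1 = [counts1[k] for k in keys]
    v2 = [counts2[k] for k in keys]
    return sub_kernel(v1, v2)


def kernel_matrix(all_counts, sub_kernel):
    """Fill a symmetric matrix of pairwise kernel values."""
    n = len(all_counts)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = treelet_kernel_sketch(all_counts[i], all_counts[j], sub_kernel)
            K[i][j] = K[j][i] = value
    return K


# Made-up treelet counts for two graphs.
counts = [{"path2": 3, "star3": 1}, {"path2": 1, "ring": 2}]
K = kernel_matrix(counts, gaussian_sub_kernel)
```

Because the kernel is symmetric, the sketch evaluates only the upper triangle and mirrors each value.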