gklearn.kernels.treeletKernel

@author: linlin

@references:

[1] Gaüzère B, Brun L, Villemin D. Two new graphs kernels in chemoinformatics. Pattern Recognition Letters. 2012 Nov 1;33(15):2038-47.

find_all_paths(G, length, is_directed)

Find all paths of a given length in a graph. A recursive depth-first search is applied.

Parameters

G : NetworkX graph

The graph in which paths are searched.

length : integer

The length of paths.

Return

path : list of list

List of paths retrieved, where each path is represented by a list of nodes.

find_paths(G, source_node, length)

Find all paths of a given length that start from a source node. A recursive depth-first search is applied.

Parameters

G : NetworkX graph

The graph in which paths are searched.

source_node : integer

The identifier of the node from which all paths start.

length : integer

The length of paths.

Return

path : list of list

List of paths retrieved, where each path is represented by a list of nodes.
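
The two path searches described above can be illustrated with a minimal sketch. It is not the library's code: the names `paths_from` and `all_paths` are hypothetical, a plain adjacency dictionary stands in for the NetworkX graph, and `length` is assumed to count edges. For undirected graphs the sketch also assumes that a path and its reverse should be reported only once.

```python
from typing import Dict, Hashable, List

# Hypothetical stand-in for a NetworkX graph: node -> list of neighbours.
Adjacency = Dict[Hashable, List[Hashable]]


def paths_from(adj: Adjacency, source: Hashable, length: int) -> List[list]:
    """Return all simple paths with `length` edges that start at `source`."""
    if length == 0:
        return [[source]]
    paths = []
    for neighbor in adj[source]:
        # Recurse from each neighbour, then reject tails that revisit `source`.
        for tail in paths_from(adj, neighbor, length - 1):
            if source not in tail:
                paths.append([source] + tail)
    return paths


def all_paths(adj: Adjacency, length: int, is_directed: bool = False) -> List[list]:
    """Collect the paths found from every node of the graph."""
    found = []
    for node in adj:
        found.extend(paths_from(adj, node, length))
    if is_directed:
        return found
    # Assumption: in an undirected graph each path is discovered twice,
    # once from each end, so only the first orientation is kept.
    seen = set()
    unique = []
    for path in found:
        key = tuple(path)
        if key[::-1] in seen:
            continue
        seen.add(key)
        unique.append(path)
    return unique


# Undirected path graph 0 - 1 - 2.
adj = {0: [1], 1: [0, 2], 2: [1]}
print(paths_from(adj, 0, 2))  # [[0, 1, 2]]
print(all_paths(adj, 1))      # [[0, 1], [1, 2]]
```

The recursion depth equals `length`, so this style of search is practical for the short paths that treelets involve.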

get_canonkeys(G, node_label, edge_label, labeled, is_directed)

Generate canonical keys of all treelets in a graph.

Parameters

G : NetworkX graph

The graph in which keys are generated.

node_label : string

Node attribute used as label. The default node label is atom.

edge_label : string

Edge attribute used as label. The default edge label is bond_type.

labeled : boolean

Whether the graphs are labeled. The default is True.

Return

canonkey/canonkey_l : dict

For unlabeled graphs, canonkey is a dictionary that records the number of occurrences of each tree pattern. For labeled graphs, canonkey_l is a dictionary that records the number of occurrences of each treelet.

treeletkernel(*args, sub_kernel, node_label='atom', edge_label='bond_type', parallel='imap_unordered', n_jobs=None, chunksize=None, verbose=True)

Compute treelet graph kernels between graphs.

Parameters

Gn : List of NetworkX graphs

List of graphs between which the kernels are computed.

G1, G2 : NetworkX graphs

Two graphs between which the kernel is computed.

sub_kernel : function

The sub-kernel between two real-valued vectors. Each vector holds the counts of isomorphic treelets in a graph.

node_label : string

Node attribute used as label. The default node label is atom.

edge_label : string

Edge attribute used as label. The default edge label is bond_type.

parallel : string/None

Which parallelization method is applied to compute the kernel. The following choices are available:

‘imap_unordered’: use Python’s multiprocessing.Pool.imap_unordered method.

None: no parallelization is applied.

n_jobs : int

Number of jobs for parallelization. The default is to use all computational cores. This argument is only valid when one of the parallelization methods is applied.

Return

Kmatrix : NumPy matrix

Kernel matrix, each element of which is the treelet kernel between two graphs.
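
To make the role of sub_kernel more concrete, here is a minimal sketch of how a kernel value for one pair of graphs could be obtained from two treelet-count dictionaries. It is an illustration, not the library's implementation: `gaussian_sub_kernel` and `kernel_from_counts` are hypothetical names, the dictionary keys are made-up pattern labels, and comparing only the treelets that occur in both graphs is an assumption of this sketch.

```python
import math
from typing import Callable, Dict, Sequence


def gaussian_sub_kernel(x: Sequence[float], y: Sequence[float],
                        gamma: float = 1.0) -> float:
    """Gaussian kernel exp(-gamma * ||x - y||^2) between two count vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def kernel_from_counts(counts1: Dict[str, int], counts2: Dict[str, int],
                       sub_kernel: Callable[[Sequence[float], Sequence[float]], float]) -> float:
    """Apply `sub_kernel` to the count vectors of two graphs."""
    # Assumption: only treelets present in both graphs are compared.
    shared = sorted(set(counts1) & set(counts2))
    vector1 = [float(counts1[key]) for key in shared]
    vector2 = [float(counts2[key]) for key in shared]
    return sub_kernel(vector1, vector2)


# Hypothetical treelet counts for two graphs (keys are made-up labels).
counts_a = {"0": 3, "1": 2, "2": 1}
counts_b = {"0": 4, "1": 2, "3": 1}
value = kernel_from_counts(counts_a, counts_b, gaussian_sub_kernel)
print(value)  # exp(-1), since the shared vectors are [3, 2] and [4, 2]
```

Any function that accepts two equal-length vectors and returns a real number could be passed in the same position, which is why the sub-kernel is left as a parameter.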

wrapper_get_canonkeys(node_label, edge_label, labeled, is_directed, itr_item)
wrapper_treeletkernel_do(sub_kernel, itr)