### Graph Statistics and Kernel Methods

## Node degree (equation)

## Eigenvector centrality: objective

Per-node scalar proprotional to the average centrality of its neighbours

## Eigenvector centrality:

## Eigenvector centrality: description of each node's ec-value

The corresponding element of the adjacency matrix's (positive) eigenvector

## Eigenvector centrality: how can we guarantee positive centrality values?

By taking the eigenvector corresponding to 's largest eigenvalue

(see Perron-Frobenius theorem)

## Clustering coefficient: objective

Per-node scalar indicating the fraction of neighbours that are themselves connected

## Clustering coefficient: equation

## Clustering coefficient: geometric interpretation

Measures the proportion of closed triangles (vs theoretical max) in a node's neighbourhood.

## Ego graph

The subgraph containing a node, its neighbours, and all the edges between its neighbours.

## Motif / graphlet

Patterns / subgraphs within a graph

## Idea behind generalised clustering coefficient

Measure the proportion of arbitrary motifs within a node's ego graph.

## Three basic statistical features for nodes in a graph (Hamilton book)

Degree, eigenvector centrality, clustering coefficient

#### Graph-level features and graph kernels

## Bag-of-nodes approach to graph-level features

Aggregate node-level statistics (e.g. histograms of degrees, centralities, etc.)

## What does it mean for two graphs to be isomorphic?

There is a bijection (one-to-one mapping) between nodes of the graphs where adjacency is preserved

## Term for one-to-one node mapping between graphs that preserves adjacency.

Isomorphism

## What is the canonical algorithm for determining isomorphism

The Weisfeiler-Lehman (WL) algorithm

## Weisfeiler-Lehman (WL) algorithm

- Assign initial labels to each node (typically degree)

- Until the groups of nodes with similar labels doesn't change (or set limit):
- For each node:
- Assign a new label to each node based on the
*multi*-set of (previous) labels in its neighbourhood

## How is the WL algorithm adapted to generate the WL Kernel

Running the WL algorithm for a set number of iterations (or to completion?) and computing summary statistics (e.g. histogram) over the labels

## Graphlet kernel

Count the number of times a given graphlet occurs in the full graph (hard combinatorial problem)

## Random walk kernel (graphs)

Run random walks and count the occurrence of different degree sequences

## What is a kernel in the context of graphs?

An algorithm for computing statistics about the entire graph

## Give four examples of graph kernels

- Bag-of-nodes kernel

- WL kernel

- Graphlet kernel

- Random walk kernel

#### Neighbourhood overlap detection

## What do we term a matrix containing node neighbourhood overlap statistics?

A similarity matrix

## What do similarity matrices tell us that graph kernel methods don't

Information about relationships between nodes - i.e. edges

## What kind of method is the sorensen index?

A neighbourhood overlap / node similarity method

## Sorensen index: equation

## Sorensen index: description

Normalizes the count of common neighbors by the sum of the node degrees

## What kind of method is the Resource Allocation (RA) index?

A neighbourhood overlap / node similarity method

## Resource Allocation (RA) index: equation

## Resource Allocation (RA) index: description

Sums the inverse degrees of the common neighbors