```
library(backbone)
#> ____
#> | _ \ backbone v1.5.0
#> |#|_) | Cite: Domagalski, R., Neal, Z. P., & Sagan, B. (2021). Backbone: An
#> |# _ < R package for extracting the backbone of bipartite projections.
#> |#|_) | PLoS ONE. https://doi.org/10.1371/journal.pone.0244363
#> |____/ For help: type vignette("backbone"); email zpneal@msu.edu; github domagal9/backbone
```

Thank you for your interest in the backbone package! This vignette illustrates how to use the functions in this package to extract the backbone of a bipartite projection. For more details on these functions and methods, please see our latest manuscripts on backbone here:

“Domagalski R., Neal Z.P., and Sagan B. (2021). Backbone: an R package for extracting the backbone of bipartite projections. PLoS ONE 16(1): e0244363.” https://doi.org/10.1371/journal.pone.0244363

"Neal, Z. P., Domagalski, R, Sagan B. (2021). Comparing models for extracting the backbone of bipartite projections. https://arxiv.org/abs/2105.13396

For additional resources on how to use the backbone package, please see https://www.zacharyneal.com/backbone

In a graph \(G\), edges are either present (i.e. \(G_{ij}=1\)) or absent (i.e. \(G_{ij}=0\)). However in a weighted or valued graph, edges can take a range of values that may capture such properties as the strength or capacity of the edge. Although weighted graphs contain a large amount of information, there are some cases (e.g. visualization, application of statistical models not developed for weighted graphs) where it is useful to reduce this information by focusing on an unweighted subgraph that contains only the most important edges. We call this subgraph the backbone of \(G\), which we denote as \(G’\).

Extracting \(G’\) from \(G\) requires deciding which edges to preserve. This usually involves selecting a threshold \(T_{ij}\) such that edges are preserved if they are above the threshold (i.e. \(G_{ij}’=1\) if \(G_{ij} > T_{ij}\)), and omitted if they are below the threshold (i.e. \(G_{ij}’=0\) if \(G_{ij} < T_{ij}\)). It is also possible to extract a signed backbone by selecting upper \(T^+_{ij}\) and lower \(T^-_{ij}\) thresholds such that \(G_{ij}’=1\) if \(G_{ij} > T^+_{ij}\), \(G_{ij}’=-1\) if \(G_{ij} < T^-_{ij}\), and \(G_{ij}’=0\) if \(G_{ij} > T^-_{ij}\) and \(G_{ij} < T^+_{ij}\). The key to all backbone extraction methods lies in the selection of \(T\). The backbone package provides several different methods for selecting \(T\) and thus extracting \(G’\) from \(G\).

We outline the use of the backbone package with Davis, Gardner, and Gardner’s Southern Women Dataset (Davis, Gardner, and Gardner 1941), which can be accessed via (Repository 2006). This data takes the form of a bipartite graph \(B\) containing 18 women (rows) and 14 social events (columns) taking place over a nine month period. In \(B\), \(B_{ij} = 1\) if women \(i\) attended event \(j\), and otherwise is 0. Let’s take a look at the Davis dataset included in this package to see that it is bipartite.

```
data(davis) #load the dataset
op <- options(width = 100)
davis #view the dataset
#> 6/27 3/2 4/12 9/26 2/25 5/19 3/15 9/16 4/8 6/10 2/23 4/7 11/21 8/3
#> EVELYN 1 1 1 1 1 1 0 1 1 0 0 0 0 0
#> LAURA 1 1 1 0 1 1 1 1 0 0 0 0 0 0
#> THERESA 0 1 1 1 1 1 1 1 1 0 0 0 0 0
#> BRENDA 1 0 1 1 1 1 1 1 0 0 0 0 0 0
#> CHARLOTTE 0 0 1 1 1 0 1 0 0 0 0 0 0 0
#> FRANCES 0 0 1 0 1 1 0 1 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 1 1 1 1 0 0 0 0 0 0
#> PEARL 0 0 0 0 0 1 0 1 1 0 0 0 0 0
#> RUTH 0 0 0 0 1 0 1 1 1 0 0 0 0 0
#> VERNE 0 0 0 0 0 0 1 1 1 0 0 1 0 0
#> MYRNA 0 0 0 0 0 0 0 1 1 1 0 1 0 0
#> KATHERINE 0 0 0 0 0 0 0 1 1 1 0 1 1 1
#> SYLVIA 0 0 0 0 0 0 1 1 1 1 0 1 1 1
#> NORA 0 0 0 0 0 1 1 0 1 1 1 1 1 1
#> HELEN 0 0 0 0 0 0 1 1 0 1 1 1 0 0
#> DOROTHY 0 0 0 0 0 0 0 1 1 0 0 0 0 0
#> OLIVIA 0 0 0 0 0 0 0 0 1 0 1 0 0 0
#> FLORA 0 0 0 0 0 0 0 0 1 0 1 0 0 0
options(op)
```

We see that our two sets of vertices are women and events attended.

A weighted graph \(G\) can be constructed from \(B\) via bipartite projection, where \(G = BB^T\) and \(G_{ij}\) contains the number of events that both woman \(i\) and woman \(j\) attended. Looking at the matrix of southern women and events attended above, we see that Evelyn and Charlotte have attended three of the same events. This means that \(G_{15} = 3\) in the projection, shown below.

```
davis%*%t(davis) #The projected davis dataset
#> EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
#> EVELYN 8 6 7 6 3 4 3 3 3
#> LAURA 6 7 6 6 3 4 4 2 3
#> THERESA 7 6 8 6 4 4 4 3 4
#> BRENDA 6 6 6 7 4 4 4 2 3
#> CHARLOTTE 3 3 4 4 4 2 2 0 2
#> FRANCES 4 4 4 4 2 4 3 2 2
#> ELEANOR 3 4 4 4 2 3 4 2 3
#> PEARL 3 2 3 2 0 2 2 3 2
#> RUTH 3 3 4 3 2 2 3 2 4
#> VERNE 2 2 3 2 1 1 2 2 3
#> MYRNA 2 1 2 1 0 1 1 2 2
#> KATHERINE 2 1 2 1 0 1 1 2 2
#> SYLVIA 2 2 3 2 1 1 2 2 3
#> NORA 2 2 3 2 1 1 2 2 2
#> HELEN 1 2 2 2 1 1 2 1 2
#> DOROTHY 2 1 2 1 0 1 1 2 2
#> OLIVIA 1 0 1 0 0 0 0 1 1
#> FLORA 1 0 1 0 0 0 0 1 1
#> VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
#> EVELYN 2 2 2 2 2 1 2 1 1
#> LAURA 2 1 1 2 2 2 1 0 0
#> THERESA 3 2 2 3 3 2 2 1 1
#> BRENDA 2 1 1 2 2 2 1 0 0
#> CHARLOTTE 1 0 0 1 1 1 0 0 0
#> FRANCES 1 1 1 1 1 1 1 0 0
#> ELEANOR 2 1 1 2 2 2 1 0 0
#> PEARL 2 2 2 2 2 1 2 1 1
#> RUTH 3 2 2 3 2 2 2 1 1
#> VERNE 4 3 3 4 3 3 2 1 1
#> MYRNA 3 4 4 4 3 3 2 1 1
#> KATHERINE 3 4 6 6 5 3 2 1 1
#> SYLVIA 4 4 6 7 6 4 2 1 1
#> NORA 3 3 5 6 8 4 1 2 2
#> HELEN 3 3 3 4 4 5 1 1 1
#> DOROTHY 2 2 2 2 1 1 2 1 1
#> OLIVIA 1 1 1 1 2 1 1 2 2
#> FLORA 1 1 1 1 2 1 1 2 2
```

In this vignette, we demonstrate using the backbone package to extract the backbone of \(G\), which involves deciding whether to preserve an edge between Evelyn and Charlotte in \(G’\), and similarly for all other edges in \(G\).

In this section, we will describe backbone methods that can be applied to any weighted graph, whether the weights are present in a natively unipartite graph, or are the result of a bipartite projection (as is the case in our example data). All of the methods described can accept inputs of matrices, sparse matrices, igraph objects, edgelists, and network objects. For the sake of these examples, we use matrices.

The simplest approach to backbone extraction applies a single threshold \(T\) to all edges, and is achieved using the `universal()`

function. The `universal()`

function allows the user to extract a binary backbone by selecting a single threshold \(T\), or extract a signed backbone by selecting upper and lower thresholds \(T^+\) and \(T^-\).

The `universal( )`

function has four parameters:

- M, graph: Bipartite graph object of class matrix, sparse matrix, igraph, edgelist, or network object.
- upper, Real or FUN: upper threshold value or function to be applied to the edge weights. Default is 0.
- lower, Real or FUN: lower threshold value or function to be applied to the edge weights. Default is NULL.
- bipartite Boolean: TRUE if bipartite matrix, FALSE if weighted matrix. Default is FALSE.

The function `universal()`

returns a `backbone`

object containing the backbone graph, with either signed (or binary) edge weights, and a data frame called `summary`

, containing the model name (universal threshold), number of rows in M, skew of row sums of M, number of columns of M, skew of column sums of M, and running time. The `universal()`

function can be used in a variety of different ways, demonstrated in the following examples.

Using the `davis`

dataset, if we input the projected matrix `G <- davis%*%t(davis)`

, we can use the universal threshold on the weighted matrix `G`

. If we set an upper threshold of 0, then if two women have attended any event together (co-attendance > 0), there will be an edge between the two. We can plot this graph with the `igraph`

package.

```
G <- davis%*%t(davis) #projected davis dataset, a weighted graph
universal_bb <- universal(G, upper = 0)
#> This matrix object looks like a weighted undirected network containing 18 nodes.
#> Warning in universal(G, upper = 0): The input data is treated as unipartite
universal_bb$backbone
#> EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
#> EVELYN 0 1 1 1 1 1 1 1 1
#> LAURA 1 0 1 1 1 1 1 1 1
#> THERESA 1 1 0 1 1 1 1 1 1
#> BRENDA 1 1 1 0 1 1 1 1 1
#> CHARLOTTE 1 1 1 1 0 1 1 0 1
#> FRANCES 1 1 1 1 1 0 1 1 1
#> ELEANOR 1 1 1 1 1 1 0 1 1
#> PEARL 1 1 1 1 0 1 1 0 1
#> RUTH 1 1 1 1 1 1 1 1 0
#> VERNE 1 1 1 1 1 1 1 1 1
#> MYRNA 1 1 1 1 0 1 1 1 1
#> KATHERINE 1 1 1 1 0 1 1 1 1
#> SYLVIA 1 1 1 1 1 1 1 1 1
#> NORA 1 1 1 1 1 1 1 1 1
#> HELEN 1 1 1 1 1 1 1 1 1
#> DOROTHY 1 1 1 1 0 1 1 1 1
#> OLIVIA 1 0 1 0 0 0 0 1 1
#> FLORA 1 0 1 0 0 0 0 1 1
#> VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
#> EVELYN 1 1 1 1 1 1 1 1 1
#> LAURA 1 1 1 1 1 1 1 0 0
#> THERESA 1 1 1 1 1 1 1 1 1
#> BRENDA 1 1 1 1 1 1 1 0 0
#> CHARLOTTE 1 0 0 1 1 1 0 0 0
#> FRANCES 1 1 1 1 1 1 1 0 0
#> ELEANOR 1 1 1 1 1 1 1 0 0
#> PEARL 1 1 1 1 1 1 1 1 1
#> RUTH 1 1 1 1 1 1 1 1 1
#> VERNE 0 1 1 1 1 1 1 1 1
#> MYRNA 1 0 1 1 1 1 1 1 1
#> KATHERINE 1 1 0 1 1 1 1 1 1
#> SYLVIA 1 1 1 0 1 1 1 1 1
#> NORA 1 1 1 1 0 1 1 1 1
#> HELEN 1 1 1 1 1 0 1 1 1
#> DOROTHY 1 1 1 1 1 1 0 1 1
#> OLIVIA 1 1 1 1 1 1 1 0 1
#> FLORA 1 1 1 1 1 1 1 1 0
universal_bb$summary
#> Model Summary
#> Model Universal Threshold
#> Input Class matrix
#> Bipartite FALSE
#> Symmetric TRUE
#> Weighted TRUE
#> Number of Rows 18
#> Number of Columns 18
```

```
graph <- igraph::graph_from_adjacency_matrix(universal_bb$backbone, mode = "undirected")
op <- par(mar=c(0,0,0,0))
lo <- igraph::layout_(graph, igraph::with_fr())
plot(graph, vertex.label = 1:18, layout = lo)
```

We can also use the `universal()`

function on the original bipartite data. When inputting bipartite data, we set parameter `bipartite = TRUE`

. The bipartite matrix will be multiplied by its transpose before the threshold is applied. Below, we input the bipartite matrix `davis`

with the same threshold values as before, returning the same backbone matrix.

```
universal_bb <- universal(davis, upper = 0, bipartite = TRUE)
#> This matrix object looks like an unweighted bipartite network of 18 agents and 14 artifacts.
universal_bb$summary
#> Model Summary
#> Model Universal Threshold
#> Input Class matrix
#> Bipartite TRUE
#> Symmetric FALSE
#> Weighted FALSE
#> Number of Rows 18
#> Number of Columns 14
graph <- igraph::graph_from_adjacency_matrix(universal_bb$backbone, mode = "undirected")
op <- par(mar=c(0,0,0,0))
plot(graph, vertex.label = 1:18, layout = lo)
```

To create a signed backbone, we can apply both an upper and lower threshold value. For instance, we could choose to retain a positive edge if the women attended more than 4 events together, and a negative edge if they attended less than 2 events together (co-attendance of 0 or 1 events). We can do this with the following code. Note that the returned backbone matrix now has both \(+1\) and \(-1\) values.

```
universal_bb <- universal(davis, upper = 4, lower = 2, bipartite = TRUE)
#> This matrix object looks like an unweighted bipartite network of 18 agents and 14 artifacts.
universal_bb$backbone
#> EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
#> EVELYN 0 1 1 1 0 0 0 0 0
#> LAURA 1 0 1 1 0 0 0 0 0
#> THERESA 1 1 0 1 0 0 0 0 0
#> BRENDA 1 1 1 0 0 0 0 0 0
#> CHARLOTTE 0 0 0 0 0 0 0 -1 0
#> FRANCES 0 0 0 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 0 0 0 0 0
#> PEARL 0 0 0 0 -1 0 0 0 0
#> RUTH 0 0 0 0 0 0 0 0 0
#> VERNE 0 0 0 0 -1 -1 0 0 0
#> MYRNA 0 -1 0 -1 -1 -1 -1 0 0
#> KATHERINE 0 -1 0 -1 -1 -1 -1 0 0
#> SYLVIA 0 0 0 0 -1 -1 0 0 0
#> NORA 0 0 0 0 -1 -1 0 0 0
#> HELEN -1 0 0 0 -1 -1 0 -1 0
#> DOROTHY 0 -1 0 -1 -1 -1 -1 0 0
#> OLIVIA -1 -1 -1 -1 -1 -1 -1 -1 -1
#> FLORA -1 -1 -1 -1 -1 -1 -1 -1 -1
#> VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
#> EVELYN 0 0 0 0 0 -1 0 -1 -1
#> LAURA 0 -1 -1 0 0 0 -1 -1 -1
#> THERESA 0 0 0 0 0 0 0 -1 -1
#> BRENDA 0 -1 -1 0 0 0 -1 -1 -1
#> CHARLOTTE -1 -1 -1 -1 -1 -1 -1 -1 -1
#> FRANCES -1 -1 -1 -1 -1 -1 -1 -1 -1
#> ELEANOR 0 -1 -1 0 0 0 -1 -1 -1
#> PEARL 0 0 0 0 0 -1 0 -1 -1
#> RUTH 0 0 0 0 0 0 0 -1 -1
#> VERNE 0 0 0 0 0 0 0 -1 -1
#> MYRNA 0 0 0 0 0 0 0 -1 -1
#> KATHERINE 0 0 0 1 1 0 0 -1 -1
#> SYLVIA 0 0 1 0 1 0 0 -1 -1
#> NORA 0 0 1 1 0 0 -1 0 0
#> HELEN 0 0 0 0 0 0 -1 -1 -1
#> DOROTHY 0 0 0 0 -1 -1 0 -1 -1
#> OLIVIA -1 -1 -1 -1 0 -1 -1 0 0
#> FLORA -1 -1 -1 -1 0 -1 -1 0 0
```

We can also choose a threshold that is a multiple of some function, such as mean, max, or min. The function is applied to the edge weights, and then multiplied by the upper and lower thresholds. Any \(G_{ij}\) values above the upper threshold are counted as a positive \(+1\) value in the backbone, and any below the lower threshold are counted as a negative \(-1\) value in the backbone. The following code will return a backbone where the positive edges indicate two women attended more than 1 standard deviation above the mean number of events and negative edges indicate two women attended less than 1 standard deviation below the mean number of events.

```
universal_bb <- universal(davis,
upper = function(x)mean(x)+sd(x),
lower=function(x)mean(x)-sd(x),
bipartite = TRUE)
#> This matrix object looks like an unweighted bipartite network of 18 agents and 14 artifacts.
```

Here, the `davis`

matrix has first been projected. Then, the standard deviation of the \(G_{ij}\) entries is calculated and added to (or subtracted from) to the mean of the \(G_{ij}\) values. This value is then used to threshold the projected matrix for the positive (or negative) entries.

The methods described above can be applied to any weighted graph \(G\). In this section we describe methods that are designed for weighted graphs that are the result of bipartite projections. They differ from other methods because they take into account the information contained in the original bipartite graph \(B\). Specifically, these methods are conditioned on the bipartite graph’s two degree sequences: the row vertex degrees (i.e. row marginals) and column vertex degrees (i.e. column marginals). We compare the values of \(G_{ij} = (BB^T)_{ij}\) to the probability distributions that describe \(G^*_{ij} = (B^*B^{*T})_{ij}\) for all bipartite graphs \(B^*\) that satisfy the row and column vertex degree restrictions we choose.

The backbone package lets the user choose which of the row and column vertex degrees they would like to restrict by specifying a null model:

`fixedrow()`

- row degrees in \(B^*\) exactly match those in \(B\)`fixedcol()`

- column degrees in \(B^*\) exactly match those in \(B\)`fdsm()`

- row and column degrees in \(B^*\) exactly match those in \(B\)`sdsm()`

- expected row and column degrees in \(B^*\) match those in \(B\) (recommended)`fixedfill()`

- \(B^*\) contains the same number of 1s as \(B\)

The backbone can then be extracted for a given \(\alpha\) level using the `backbone.extract()`

function. In this section, we first describe `backbone.extract()`

, then illustrate its use for each of functions mentioned above.

The null model functions `fdsm()`

, `sdsm()`

, `fixedrow()`

, `fixedcol()`

, and `fixedfill()`

return a `backbone`

class object containing two matrices: a `positive`

matrix containing the probability that (or in the case of `fdsm()`

, the proportion of times that) \(G^*_{ij}\) was greater than or equal to \(G_{ij}\), and a `negative`

matrix containing the number of times \(G^*_{ij}\) was less than or equal to \(G_{ij}\). The `backbone.extract()`

function allows the user to take these positive and negative matrices and return a binary or signed backbone.

The `backbone.extract()`

function has six parameters: `matrix`

, `signed`

, a significance test value `alpha`

, `fwer`

, `class`

, and `narrative`

. The `matrix`

parameter takes in the entire backbone object which is the output of null model functions `fdsm()`

, `sdsm()`

, `fixedrow()`

, `fixedcol()`

, or `fixedfill()`

. If the `signed`

parameter is set to `TRUE`

(the default) a signed backbone is returned, if `FALSE`

a binary backbone is returned.

One can adjust the precision of the significance test, `alpha`

, to refine their backbone results. The value of `alpha`

should be between `0`

and `1`

. The default is `alpha=0.05`

. The statistical test is two-tailed with an area of `alpha/2`

in each tail.

Extracting the backbone of a bipartite projection involves applying this significance test to each of the \(N(N-1)/2\) edges in the projection. Because each of these tests is independent, this can inflate the familywise error rate beyond the desired `alpha`

. The `fwer`

parameter, which is set to NULL by default, offers two ways to correct for this. When `fwer = bonferroni`

, the classical Bonferroni correction is applied. When `fwer = holm`

, the more powerful Holm-Bonferroni correction is applied.

If an entry in the `positive`

matrix is less than or equal to the `alpha`

/2 value, it is considered a `+1`

edge in the backbone. If an entry in the `negative`

matrix is less than or equal to the `alpha`

/2 value, it is considered a `-1`

edge in the backbone. All other values are `0`

in the backbone graph. The `backbone.extract()`

function will return a backbone graph of the same class and input parameter `class`

. This can be one of “original”, “matrix”, “sparseMatrix”, “igraph”, “network”, or “edgelist”. If “original”, the backbone graph returned is of the same class as the data inputted in one of null model functions.

When `narrative`

is set to `TRUE`

, `backbone.extract()`

will provide text describing the generated backbone graph that could be included in a manuscript. This text includes citations for the applied backbone methods.

We demonstrate this function’s use in the following sections.

To compare the observed bipartite projection to projections arising from an ensemble of bipartite graphs where the row degrees of \(B^*\) exactly match the row degrees of \(B\), but the column degrees are unconstrained, one can use `fixedrow()`

. This function applies the hypergeometric distribution to the bipartite graph `B`

, and in earlier versions was called the Hypergeometric Model using `hyperg()`

.

The FRM compares an edge’s observed weight, \(G_{ij}\) to the distribution of weights expected in a projection obtained from a random bipartite network where the row vertex degrees are fixed, but the column vertex degrees are allowed to vary. This method of backbone extraction was developed in (Tumminello et al. 2011) and later in (Neal 2013), which showed that the distribution of \(G^*_{ij}\) when only vertex degrees are fixed is given by the hypergeometric distribution. For documentation on the hypergeometric distribution, see `stats::phyper`

.

The `fixedrow()`

function has one parameter,

- B, graph: Bipartite graph object of class matrix, sparse matrix, igraph, edgelist, or network object.

Following the `fixedrow()`

function, the user must use the `backbone.extract()`

function to find the backbone at a given significance value `alpha`

.

To compare the observed bipartite projection to projections arising from an ensemble of bipartite graphs where the column degrees of \(B^*\) exactly match the column degrees of \(B\), but the row degrees are unconstrained, one can use `fixedcol()`

. This function applies the Poisson binomial distribution to the bipartite graph `B`

.

The FCM compares an edge’s observed weight, \(G_{ij}\) to the distribution of weights expected in a projection obtained from a random bipartite network where the column vertex degrees are fixed, but the row vertex degrees are allowed to vary.

The `fixedcol()`

function has two parameters,

- B, graph: Bipartite graph object of class matrix, sparse matrix, igraph, edgelist, or network object.
- method string: Specifies the method of the Poisson Binomial distribution computation used by .

The probability of edge weights being above or below the observed values are computed using the Poisson Binomial distribution. These values are approximated using a Refined Normal Approximation . The user can change the parameter `method`

to use different methods for computing these values: “RefinedNormal” gives quick, very accurate approximations, while “DivideFFT” gives the quickest exact computations.

Following the `fixedcol()`

function, the user must use the `backbone.extract()`

function to find the backbone at a given significance value `alpha`

.

To compare the observed bipartite projection to projections arising from an ensemble of bipartite graphs where both the row degrees and column degrees of \(B^*\) exactly match the row degrees and column degrees of \(B\), one can use `fdsm(B, trials = 1000)`

where the number of trials can be any positive integer. This function applies the fixed degree sequence model to the bipartite graph `B`

.

The FDSM compares an edge’s observed weight, \(G_{ij}\), to the distribution of weights expected in a projection obtained from a random bipartite network where both the row vertex degrees and column vertex degrees are fixed. This method of backbone extraction was developed in (Zweig and Kaufmann 2011), however the challenge lies in randomly sampling from the space of \(B^*\) with fixed degree sequences. The `fdsm()`

function uses the curveball algorithm (Strona et al. 2014), which is proven to do so (Carstens 2015).

The `fdsm( )`

function has four parameters,

- B, graph: Bipartite graph object of class matrix, sparse matrix, igraph, edgelist, or network object.
- trials, Integer: Number of random bipartite graphs generated. Default is 1000.
- dyad, vector length 2: two row entries i,j. Saves each value of \(G^*_{ij}\), which is useful for visualizing an example of the empirical null edge weight distribution generated by the model. These correspond to the row and column indices of a cell in the projected matrix , and can be written as their string row names or as numeric values. Default is NULL.
- progress, Boolean: If
`utils::txtProgressBar`

should be used to measure progress. Default is FALSE.

In addition to the normal outputs of a bipartite backbone function, when `fdsm()`

is used, one can also return a list of `dyad_values`

. These are a list of edge weights for a given pair \(i,j\) of \(G^*\), during each of the trials. To get these values, we add in the parameter \(dyad\) and specify the two vertices to keep track of.

We can find the backbone using the fixed degree sequence model as follows:

```
fdsm <- fdsm(davis, trials = 100, dyad = c(1,5))
#> This matrix object looks like an unweighted bipartite network of 18 agents and 14 artifacts.
#> Estimated time to complete is 0.2 secs for 100 trials
```

```
fdsm$dyad_values
#> [1] 1 3 4 2 4 3 3 2 3 3 4 3 3 2 3 2 3 2 4 2 3 3 2 3 2 2 3 2 2 1 3 3 3 3 3 2 3
#> [38] 3 3 1 3 3 2 2 1 3 4 2 3 2 3 2 3 2 3 4 3 3 3 3 2 4 1 2 2 1 4 3 3 4 3 2 3 3
#> [75] 3 3 2 3 3 2 2 2 2 3 2 3 3 2 2 2 1 1 2 3 3 2 3 1 2 2
fdsm_bb <- backbone.extract(fdsm, signed = TRUE, alpha = 0.1)
fdsm_bb
#> EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
#> EVELYN 0 0 1 0 0 0 0 0 0
#> LAURA 0 0 0 1 0 0 0 0 0
#> THERESA 1 0 0 0 0 0 0 0 0
#> BRENDA 0 1 0 0 0 0 0 0 0
#> CHARLOTTE 0 0 0 0 0 0 0 0 0
#> FRANCES 0 0 0 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 0 0 0 0 0
#> PEARL 0 0 0 0 0 0 0 0 0
#> RUTH 0 0 0 0 0 0 0 0 0
#> VERNE 0 0 0 0 0 0 0 0 0
#> MYRNA 0 0 0 0 -1 0 0 0 0
#> KATHERINE 0 -1 0 -1 -1 0 0 0 0
#> SYLVIA -1 -1 0 0 0 0 0 0 0
#> NORA -1 -1 0 -1 0 0 0 0 0
#> HELEN -1 0 0 0 0 0 0 0 0
#> DOROTHY 0 0 0 0 0 0 0 0 0
#> OLIVIA 0 0 0 0 0 0 0 0 0
#> FLORA 0 0 0 0 0 0 0 0 0
#> VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
#> EVELYN 0 0 0 -1 -1 -1 0 0 0
#> LAURA 0 0 -1 -1 -1 0 0 0 0
#> THERESA 0 0 0 0 0 0 0 0 0
#> BRENDA 0 0 -1 0 -1 0 0 0 0
#> CHARLOTTE 0 -1 -1 0 0 0 0 0 0
#> FRANCES 0 0 0 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 0 0 0 0 0
#> PEARL 0 0 0 0 0 0 0 0 0
#> RUTH 0 0 0 0 0 0 0 0 0
#> VERNE 0 0 0 0 0 0 0 0 0
#> MYRNA 0 0 0 0 0 0 0 0 0
#> KATHERINE 0 0 0 1 0 0 0 0 0
#> SYLVIA 0 0 1 0 0 0 0 0 0
#> NORA 0 0 0 0 0 0 0 0 0
#> HELEN 0 0 0 0 0 0 0 0 0
#> DOROTHY 0 0 0 0 0 0 0 0 0
#> OLIVIA 0 0 0 0 0 0 0 0 1
#> FLORA 0 0 0 0 0 0 0 1 0
```

The `fdsm_props$dyad_values`

output is a list of the \(G_{1,5}^*\) values for each of the 100 trials, which in these data corresponds to the number of parties Evelyn and Charlotte would be expected to simultaneously attend if: (a) the number of parties attended by Evelyn was fixed, (b) the number of parties attended by Charlotte was fixed, and (c) the number of attendees at each party was fixed. Because we have provided both a `positive`

and `negative`

matrix, `backbone.extract()`

returns a signed backbone matrix by conducting a two-tailed significance test in which `alpha`

is \(0.05\) on each end of the distribution.

To compare the observed bipartite projection to projections arising from an ensemble of bipartite graphs where both the expected row degrees and expected column degrees of \(B^*\) match the row degrees and column degrees of \(B\), one can use `sdsm()`

. This function applies the stochastic degree sequence model and Poisson binomial distribution to the bipartite graph `B`

. This is the model recommended for most bipartite projections (Neal, Domagalski, and Sagan 2021).

The SDSM compares an edge’s observed weight, \(G_{ij}\) to the distribution of weights expected in a projection obtained from a random bipartite network where both the row vertex degrees and column vertex degrees are *approximately* fixed. This method of backbone extraction was developed in (Neal 2014). The distribution of \(G^*_{ij}\) is given by the Poisson binomial distribution (Hong 2013). In order to apply the Poisson binomial distribution we need to have \(P(B_{ij}=1)\) for all values of \(B\). These probabilities are given by the Bipartite Configuration Model (BiCM) (Saracco et al. 2015, 2017). The matrix \(G^*\) is then constructed via the Poisson-Binomial distribution, where the \((i,j)\) entry of \(G\) is the probability of an edge weight begin above or below the observed value in the projection of \(B\).

The `sdsm( )`

function has two parameters,

- method string: Specifies the method of the Poisson Binomial distribution computation used by .

The probability of edge weights being above or below the observed values are computed using the Poisson Binomial distribution. These values are approximated using a Refined Normal Approximation . The user can change the parameter `method`

to use different methods for computing these values: “RefinedNormal” gives quick, very accurate approximations, while “DivideFFT” gives the quickest exact computations.

```
sdsm <- sdsm(davis)
#> This matrix object looks like an unweighted bipartite network of 18 agents and 14 artifacts.
```

The backbone package allows for two different types of family-wise error rate correction: Holm-Bonferroni and Bonferroni. To use Holm-Bonferroni correction, add parameter `fwer = "holm"`

to `backbone.extract()`

, and to use Bonferroni correction, add `fwer = "bonferroni"`

. Note in this case, the Holm-Bonferroni is too restrictive and leaves us with no edges in our backbone graph.

```
sdsm_bb <- backbone.extract(sdsm, signed = FALSE, alpha = 0.1, fwer = "bonferroni")
sdsm_bb
#> EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
#> EVELYN 0 0 0 0 0 0 0 0 0
#> LAURA 0 0 0 0 0 0 0 0 0
#> THERESA 0 0 0 0 0 0 0 0 0
#> BRENDA 0 0 0 0 0 0 0 0 0
#> CHARLOTTE 0 0 0 0 0 0 0 0 0
#> FRANCES 0 0 0 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 0 0 0 0 0
#> PEARL 0 0 0 0 0 0 0 0 0
#> RUTH 0 0 0 0 0 0 0 0 0
#> VERNE 0 0 0 0 0 0 0 0 0
#> MYRNA 0 0 0 0 0 0 0 0 0
#> KATHERINE 0 0 0 0 0 0 0 0 0
#> SYLVIA 0 0 0 0 0 0 0 0 0
#> NORA 0 0 0 0 0 0 0 0 0
#> HELEN 0 0 0 0 0 0 0 0 0
#> DOROTHY 0 0 0 0 0 0 0 0 0
#> OLIVIA 0 0 0 0 0 0 0 0 0
#> FLORA 0 0 0 0 0 0 0 0 0
#> VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
#> EVELYN 0 0 0 0 0 0 0 0 0
#> LAURA 0 0 0 0 0 0 0 0 0
#> THERESA 0 0 0 0 0 0 0 0 0
#> BRENDA 0 0 0 0 0 0 0 0 0
#> CHARLOTTE 0 0 0 0 0 0 0 0 0
#> FRANCES 0 0 0 0 0 0 0 0 0
#> ELEANOR 0 0 0 0 0 0 0 0 0
#> PEARL 0 0 0 0 0 0 0 0 0
#> RUTH 0 0 0 0 0 0 0 0 0
#> VERNE 0 0 0 0 0 0 0 0 0
#> MYRNA 0 0 0 0 0 0 0 0 0
#> KATHERINE 0 0 0 0 0 0 0 0 0
#> SYLVIA 0 0 0 0 0 0 0 0 0
#> NORA 0 0 0 0 0 0 0 0 0
#> HELEN 0 0 0 0 0 0 0 0 0
#> DOROTHY 0 0 0 0 0 0 0 0 0
#> OLIVIA 0 0 0 0 0 0 0 0 0
#> FLORA 0 0 0 0 0 0 0 0 0
```

To compare the observed bipartite graph to a distribution where the number of ones in \(B^*\) match the number of ones in \(B\), one can use `fixedfill()`

.

The FFM compares an edge’s observed weight, \(G_{ij}\) to the distribution of weights expected in a projection obtained from a random bipartite network with the same number of edges as `B`

.

The `fixedfill()`

function has one parameter,

Following the `fixedfixed()`

function, the user must use the `backbone.extract()`

function to find the backbone at a given significance value `alpha`

.

Carstens, C. J. 2015. “Proof of Uniform Sampling of Binary Matrices with Fixed Row Sums and Column Sums for the Fast Curveball Algorithm.” *Physical Review E* 91 (4). https://doi.org/10.1103/PhysRevE.91.042812.

Davis, Allison, Burleigh B Gardner, and Mary R Gardner. 1941. *Deep South: A Social Anthropological Study of Caste and Class*. University of Chicago Press. https://doi.org/10.1177/0002716242220001105.

Hong, Yili. 2013. “On Computing the Distribution Function for the Poisson Binomial Distribution.” *Computational Statistics & Data Analysis* 59 (March): 41–51. https://doi.org/10.1016/j.csda.2012.10.006.

Neal, Zachary. 2013. “Identifying Statistically Significant Edges in One-Mode Projections.” *Social Network Analysis and Mining* 3 (4): 915–24. https://doi.org/10.1007/s13278-013-0107-y.

———. 2014. “The Backbone of Bipartite Projections: Inferring Relationships from Co-Authorship, Co-Sponsorship, Co-Attendance and Other Co-Behaviors.” *Social Networks* 39 (October): 84–97. https://doi.org/10.1016/j.socnet.2014.06.001.

Neal, Z. P., R. Domagalski, and B. Sagan. 2021. “Comparing Models for Extracting the Backbone of Bipartite Projections.” *arXiv*, 2105.13396. https://arxiv.org/abs/2105.13396.

Repository, UCI Network Data. 2006. “Southern Women Data Set.” https://networkdata.ics.uci.edu/netdata/html/davis.html.

Saracco, Fabio, Riccardo Di Clemente, Andrea Gabrielli, and Tiziano Squartini. 2015. “Randomizing Bipartite Networks: The Case of the World Trade Web.” *Scientific Reports* 5 (11): 10595. https://doi.org/10.1038/srep10595.

Saracco, Fabio, Mika J. Straka, Riccardo Di Clemente, Andrea Gabrielli, Guido Caldarelli, and Tiziano Squartini. 2017. “Inferring Monopartite Projections of Bipartite Networks: An Entropy-Based Approach.” *New Journal of Physics* 19 (5): 053022. https://doi.org/10.1088/1367-2630/aa6b38.

Strona, Giovanni, Domenico Nappo, Francesco Boccacci, Simone Fattorini, and Jesus San-Miguel-Ayanz. 2014. “A Fast and Unbiased Procedure to Randomize Ecological Binary Matrices with Fixed Row and Column Totals.” *Nature Communications* 5 (June): 4114. https://doi.org/10.1038/ncomms5114.

Tumminello, Michele, Salvatore Miccichè, Fabrizio Lillo, Jyrki Piilo, and Rosario N. Mantegna. 2011. “Statistically Validated Networks in Bipartite Complex Systems.” *PLOS ONE* 6 (3): e17994. https://doi.org/10.1371/journal.pone.0017994.

Zweig, Katharina Anna, and Michael Kaufmann. 2011. “A Systematic Approach to the One-Mode Projection of Bipartite Graphs.” *Social Network Analysis and Mining* 1 (3): 187–218. https://doi.org/10.1007/s13278-011-0021-0.