Identification and Estimation of Semi-Parametric Link Formation Models with Externalities

May 23rd, 2025

Context

Summary

Novel semi-parametric framework

Identifying link formation with externalities in socio-economic networks

Recursive estimation integrating kernel density and method of moments elements.

Motivation

Making friends is easy!

Image: Generated by Microsoft Designer

Motivation

Making friends was easy!

Image: Generated by Microsoft Designer

Evidence

Florentine marriages and business dealings
  • Clustering: 0.46
  • Random: 0.18
Data: Padgett and Ansell (1993); Image: Matthew Jackson

Evidence

Credit: Matthew Jackson

Previous Work

  • Mele (2017); Miyauchi (2016); Sheng (2020); Ridder and Sheng (2015); Ridder and Sheng (2020); Menzel (2015)
  • Jackson and Rogers (2007); Goldsmith-Pinkham and Imbens (2013)

Model

\(G_{ij} = {\Large 𝟙}\bigg\{\mathrm{H}_{i} + \mathrm{H}_{j} \ge U_{ij}\bigg\}\)

  • \(G_{ij}\): binary link formation variable.
  • \(\mathrm{H}= {(\mathrm{H}_{i})}_{i\in[n]}\): i.i.d. unobservable heterogeneity
    • \(\mathrm{supp}\left(\mathrm{H}_{i}\right)=\mathcal{H}=\mathbb{R}\).
  • \(U = {(U_{ij})}_{i, j\in[n]} \overset{i.i.d.}{\sim} F_{U}\): unobservable shocks

Model

\(G_{ij} = {\Large 𝟙}\bigg\{h(X_{i}, X_{j}) + \mathrm{H}_{i} + \mathrm{H}_{j} \ge U_{ij}\bigg\}\)

  • \(X = {(X_{i})}_{i\in[n]} \overset{i.i.d.}{\sim} F_{X}\) observable characteristics
    • \(\mathrm{supp}\left(X_{i}\right) \subseteq\mathbb{R}^{k}\).
  • \(h\colon {\mathcal{X}}^{2} \to \mathbb{R}\) unknown function
    • symmetric
    • \(h(X_{i}, X_{j}) = h_{d}\) for all \(X_{i}=X_{j}\).

Model

\(G_{ij} = {\Large 𝟙}\bigg\{h(X_{i}, X_{j}) + \mathrm{H}_{i} + \mathrm{H}_{j} \ge U_{ij}\bigg\}\)

\(h\colon {\mathcal{X}}^{2} \to \mathbb{R}\) examples:

  • \(h(X_{i}, X_{j}) = \alpha\, \lVert X_{i} - X_{j} \rVert_{2}\)
  • \(h(X_{i}, X_{j}) = \alpha\, \mathrm{e}^{-{\lVert X_{i} - X_{j} \rVert}^{2}_{2}}\)
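
The two example homophily functions can be written down directly. The following is a minimal sketch, assuming vector-valued characteristics passed as tuples; the function names and the default values of `alpha` are illustrative choices, not taken from the slides. Both functions are symmetric and constant on the diagonal, with \(h_{d}=0\) for the distance form and \(h_{d}=\alpha\) for the exponential form.

```python
import math


def h_distance(x_i, x_j, alpha=-1.0):
    """alpha * ||x_i - x_j||_2 (alpha = -1.0 is an illustrative default)."""
    return alpha * math.dist(x_i, x_j)


def h_gaussian(x_i, x_j, alpha=1.0):
    """alpha * exp(-||x_i - x_j||_2^2) (alpha = 1.0 is an illustrative default)."""
    return alpha * math.exp(-math.dist(x_i, x_j) ** 2)


# Symmetry: swapping the arguments does not change the value.
same = h_distance((0.0, 0.0), (3.0, 4.0)) == h_distance((3.0, 4.0), (0.0, 0.0))

# Diagonal values: identical characteristics give the constant h_d.
h_d_distance = h_distance((1.0, 2.0), (1.0, 2.0))
h_d_gaussian = h_gaussian((1.0, 2.0), (1.0, 2.0))
```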

Model

\(G_{ij} = {\Large 𝟙}\bigg\{h(X_{i}, X_{j}) + \mathrm{H}_{i} + \mathrm{H}_{j} + \beta \displaystyle\sum_{k\in\gamma_{n}(i, j)} \mathrm{H}_{k} \ge U_{ij}\bigg\}\)

  • \(\gamma_{n}\colon D_{n} \to \mathcal{P}\left([n]\right)\) known functional form
    • \(D_{n}=\left\{(i, j)\in{[n]}^{2}\colon i\neq j\right\}\)
    • symmetric
    • \(i, j \notin \gamma_{n}(i, j)\) for all \((i, j)\in D_{n}\)
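
The link rule with externalities can be evaluated for a single pair as follows. This is a sketch under stated assumptions: `link_formed` and its argument names are hypothetical, the heterogeneity terms are stored in a dictionary `eta`, and the set `gamma_ij` is supplied by the caller.

```python
def link_formed(i, j, eta, h_ij, beta, gamma_ij, u_ij):
    """Return 1 if h_ij + eta_i + eta_j + beta * sum_{k in gamma_ij} eta_k >= u_ij."""
    # The selection function must exclude the pair itself.
    if i in gamma_ij or j in gamma_ij:
        raise ValueError("gamma_ij must not contain i or j")
    index = h_ij + eta[i] + eta[j] + beta * sum(eta[k] for k in gamma_ij)
    return int(index >= u_ij)


eta = {0: 0.5, 1: -0.2, 2: 1.0}  # illustrative values

# Without an externality the index is 0.3, below the shock of 0.5.
no_externality = link_formed(0, 1, eta, h_ij=0.0, beta=0.25, gamma_ij=set(), u_ij=0.5)

# With agent 2 in gamma(0, 1) the index rises to 0.55 and the link forms.
with_externality = link_formed(0, 1, eta, h_ij=0.0, beta=0.25, gamma_ij={2}, u_ij=0.5)
```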

Externalities

  • Common Friends
  • Cliques
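
For the common-friends case, one possible selection function collects the agents linked to both \(i\) and \(j\) in a previously observed network. The sketch below assumes that reading and a symmetric 0/1 adjacency matrix given as nested lists; `common_friends` and `past_adj` are hypothetical names.

```python
def common_friends(past_adj, i, j):
    """Agents other than i and j that are linked to both in the past network."""
    n = len(past_adj)
    return {
        k
        for k in range(n)
        if k != i and k != j and past_adj[i][k] and past_adj[j][k]
    }


# Small symmetric past network on four agents (illustrative).
past_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

gamma_01 = common_friends(past_adj, 0, 1)  # agent 2 is linked to both 0 and 1
gamma_23 = common_friends(past_adj, 2, 3)  # no third agent is linked to both
```

By construction the returned set is symmetric in \((i, j)\) and never contains \(i\) or \(j\), matching the two requirements on \(\gamma_{n}\).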

Identification

Assumptions

  1. \(F_{U}\) is continuous and strictly increasing
  2. \((X, \mathrm{H}) \perp U\)
  3. \(\mathrm{supp}\left(\mathrm{H}\mid X, \gamma\right) = \mathbb{R}\)
  4. \(F_{U} = F_{U\mid \gamma}\) and \(F_{X} = F_{X\mid \gamma}\)
  5. For all \(i,j\in\mathbb{N}\), the sequence \((\gamma_{n}(i,j))_{n\ge i,j}\) is eventually constant.
  6. \(\liminf_{n \to \infty} \{i\in[n]\colon \forall j \neq i,\; \gamma_{n}(i,j) = \emptyset\}\) is countably infinite

Identification up to Normalization

Theorem 1. Under assumptions (A1)-(A6) and a known hyper-diagonal value \(h_{d}\), the fixed effects, the error distribution, the externality parameter, and the homophily function are asymptotically uniquely identifiable up to an interquantile normalization.

Estimation

Initialization

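  • Notation: \(v_{ij}=\eta_i+\eta_j+\beta \gamma_{ij}\), where \(\eta_i\) denotes the fixed effect \(\mathrm{H}_i\) and \(\gamma_{ij}\) abbreviates \(\sum_{k\in\gamma_{n}(i,j)}\eta_k\)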
  • Issues:
    • We need \(F_u\) to estimate \(v_{ij}\) and \(v_{ij}\) to estimate \(F_u\)
    • The functional form of \(v_{ij}\) is different for each pair (due to \(\gamma_{ij}\))

Initialization

  • Select pairs without externalities \(\mathcal{M} \subseteq \mathcal{L}\)
  • Pairs in \(\mathcal{M}\) satisfy \(v_{ij}=\eta_i+\eta_j\)
  • And the sufficiency condition \[\mathbb{P}\left(g_{ij}=1 \mid \boldsymbol{\eta}\right) =F_u(v_{ij})\]
  • \(F_u\) and \(\boldsymbol{\eta}\) can be estimated simultaneously (KS or DBMM estimators)

Initialization

\[\begin{aligned} F_u(v) &= \mathbb{P}\left(G_{ij}=1 \mid \boldsymbol{\eta},\beta\right) \\ &= \frac{\mathbb{P}\left(G_{ij}=1 \right) f_{v\mid G_{ij}=1}(v)}{\mathbb{P}\left(G_{ij}=1 \right) f_{v\mid G_{ij}=1}(v) + \mathbb{P}\left(G_{ij}=0 \right) f_{v\mid G_{ij}=0}(v)} \\ &\overset{\mathrm{def}}{=} \frac{p_{1}(v)}{p_{1}(v) + p_{0}(v)} \end{aligned}\]

Initialization

  • Pick normalized candidate \(\boldsymbol{\eta}\) and bandwidth \(b^{0}\)

    \(p_{1} (v_{ij};\, \boldsymbol{\eta})= \frac{1}{b^{0}(|\mathcal{M}|-1)} \displaystyle\sum_{km \in \mathcal{M} \setminus \{ij\}} {\Large 𝟙}_{\{g_{km}=1\}} K \left(\frac{v_{ij}-v_{km}}{b^{0}} \right)\)

    \(p_{0} (v_{ij};\, \boldsymbol{\eta})= \frac{1}{b^{0}(|\mathcal{M}|-1)} \displaystyle\sum_{km \in \mathcal{M} \setminus \{ij\}} {\Large 𝟙}_{\{g_{km}=0\}} K \left(\frac{v_{ij}-v_{km}}{b^{0}} \right)\)
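
The leave-one-out quantities \(p_{1}\) and \(p_{0}\) and the implied link probability \(p_{1}/(p_{1}+p_{0})\) can be computed as in the sketch below. The slides do not specify the kernel \(K\); a Gaussian kernel is assumed here, and the function names and the flat-list data layout are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Standard normal density (ASSUMPTION: the slides leave K unspecified)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_link_probability(v, g, idx, bandwidth):
    """Leave-one-out estimate of F_u at v[idx], computed as p1 / (p1 + p0).

    v: index values v_km for the pairs in M, as a flat list
    g: link indicators g_km in the same order
    """
    m = len(v)
    p1 = 0.0
    p0 = 0.0
    for other in range(m):
        if other == idx:
            continue  # leave the pair itself out
        weight = gaussian_kernel((v[idx] - v[other]) / bandwidth)
        if g[other] == 1:
            p1 += weight
        else:
            p0 += weight
    scale = bandwidth * (m - 1)
    p1 /= scale
    p0 /= scale
    total = p1 + p0
    # Guard against numerical underflow far from all other observations.
    return p1 / total if total > 0.0 else float("nan")


# Three pairs with identical index values: the estimate is the share of
# linked pairs among the other two.
prob_first = loo_link_probability([0.0, 0.0, 0.0], [1, 1, 0], idx=0, bandwidth=0.5)
prob_last = loo_link_probability([0.0, 0.0, 0.0], [1, 1, 0], idx=2, bandwidth=0.5)
```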

Initialization

  • \(\displaystyle\max_{\boldsymbol{\eta}} \mathbb{P}\left(G \mid \boldsymbol{\eta}, p_{1}(\cdot;\,\boldsymbol{\eta}), p_{0}(\cdot;\,\boldsymbol{\eta})\right)\)
  • Obtain \(\boldsymbol{\hat\eta^{0}}\) and \(\hat F_u^{0}\)
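
One way to read the maximization step is as a Bernoulli pseudo-log-likelihood in which each pair's link probability is replaced by its kernel estimate. The sketch below assumes that reading and treats links as conditionally independent given \(\boldsymbol{\eta}\); the function name and the trimming constant `trim` are hypothetical additions that keep the logarithms finite.

```python
import math


def pseudo_log_likelihood(g, prob, trim=1e-6):
    """Sum of g * log(p) + (1 - g) * log(1 - p) over pairs.

    g: observed link indicators (0 or 1)
    prob: estimated link probabilities p1 / (p1 + p0), one per pair
    trim: ASSUMPTION, probabilities are clipped to [trim, 1 - trim]
    """
    total = 0.0
    for g_ij, p_ij in zip(g, prob):
        p = min(max(p_ij, trim), 1.0 - trim)
        total += g_ij * math.log(p) + (1 - g_ij) * math.log(1.0 - p)
    return total


# A candidate that fits the observed links better scores higher.
good_fit = pseudo_log_likelihood([1, 0, 1], [0.9, 0.1, 0.8])
poor_fit = pseudo_log_likelihood([1, 0, 1], [0.5, 0.5, 0.5])
```

An optimizer would evaluate this objective at each candidate \(\boldsymbol{\eta}\), recomputing the kernel estimates every time because they depend on \(\boldsymbol{\eta}\) through the index values.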

Initialization

  • Obtain \(\boldsymbol{\hat\eta^{0}}\) and \(\hat F_u^{0}\)

\(\hat p_{1} (v;\, \boldsymbol{\hat\eta^{0}})= \frac{1}{b^0|\mathcal{M}|} \displaystyle{\sum_{km \in \mathcal{M}}} {\Large 𝟙}_{\{g_{km}=1\}} K \left(\frac{v-\hat\eta_{k}^{0} -\hat\eta_{m}^{0}}{b^0} \right)\)

\(\hat p_{0} (v;\, \boldsymbol{\hat\eta^{0}})= \frac{1}{b^0|\mathcal{M}|} \displaystyle{\sum_{km \in\mathcal{M}}} {\Large 𝟙}_{\{g_{km}=0\}} K \left(\frac{v-\hat\eta_{k}^{0} -\hat\eta_{m}^{0}}{b^0} \right)\)

Step

  • The first-step estimates are consistent but inefficient
  • With an estimate of \(F_u\), we can use all links in \(\mathcal{L}\)
  • \(\displaystyle\max_{\boldsymbol{\eta},\beta} \mathbb{P}\left(G \mid \boldsymbol{\eta},\hat{F}^{0}_u(\boldsymbol{\eta},\beta)\right)\)
  • Obtain \(\boldsymbol{\hat\eta}^1,\hat\beta^1\) and \(\hat{F}_u^{1}\)

Step

  • Obtain \(\boldsymbol{\hat\eta}^1,\hat\beta^1\), and \(\hat{F}_u^{1}\)

\(\hat p_{1} (v;\, \boldsymbol{\hat\eta^{1}}, \hat\beta^1)= \frac{1}{b^1|\mathcal{L}|} \displaystyle{\sum_{km \in \mathcal{L}}} {\Large 𝟙}_{\{g_{km}=1\}} K \left(\frac{v-\hat\eta_{k}^{1} - \hat\eta_{m}^{1} - \hat\beta^1\sum_{j\in\gamma(k,m)}\hat\eta_{j}^{1} }{b^1} \right)\)

\(\hat p_{0} (v;\, \boldsymbol{\hat\eta^{1}}, \hat\beta^1)= \frac{1}{b^1|\mathcal{L}|} \displaystyle{\sum_{km \in\mathcal{L}}} {\Large 𝟙}_{\{g_{km}=0\}} K \left(\frac{v-\hat\eta_{k}^{1} - \hat\eta_{m}^{1} - \hat\beta^1\sum_{j\in\gamma(k,m)}\hat\eta_{j}^{1} }{b^1} \right)\)
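
The updated estimate of \(F_u\) uses every pair in \(\mathcal{L}\), with the externality term entering each pair's index. A minimal sketch follows, assuming a Gaussian kernel (the slides leave \(K\) unspecified) and dictionaries keyed by pairs; all names are hypothetical. The common factor \(1/(b^{1}|\mathcal{L}|)\) cancels in the ratio and is therefore omitted.

```python
import math


def gaussian_kernel(z):
    """Standard normal density (ASSUMPTION: K is not specified in the slides)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def pair_index(k, m, eta, beta, gamma):
    """eta_k + eta_m + beta * sum of eta_j over j in gamma(k, m)."""
    return eta[k] + eta[m] + beta * sum(eta[j] for j in gamma.get((k, m), ()))


def f_u_hat(v, pairs, g, eta, beta, gamma, bandwidth):
    """Kernel estimate p1(v) / (p1(v) + p0(v)) using all pairs."""
    p1 = 0.0
    p0 = 0.0
    for k, m in pairs:
        weight = gaussian_kernel((v - pair_index(k, m, eta, beta, gamma)) / bandwidth)
        if g[(k, m)] == 1:
            p1 += weight
        else:
            p0 += weight
    total = p1 + p0
    return p1 / total if total > 0.0 else float("nan")


# Illustrative three-agent example.
eta = [0.0, 1.0, -1.0]
pairs = [(0, 1), (0, 2), (1, 2)]
gamma = {(0, 1): {2}, (0, 2): {1}, (1, 2): {0}}
g = {(0, 1): 1, (0, 2): 0, (1, 2): 1}

high = f_u_hat(0.75, pairs, g, eta, beta=0.25, gamma=gamma, bandwidth=0.5)
low = f_u_hat(-0.75, pairs, g, eta, beta=0.25, gamma=gamma, bandwidth=0.5)
```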

Recursion

  • Iteration between estimation of \(F_u\) and estimation of \(\beta, \boldsymbol{\eta}\)
  • Each step: efficiency gain
  • Iteration until convergence
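
The recursion can be organized as a generic fixed-point loop. The sketch below is a skeleton only: `update` stands in for one full pass (re-estimate \(F_u\) at the current parameters, then re-maximize over \(\beta, \boldsymbol{\eta}\)), and the stopping rule based on the largest parameter change is an assumption, since the slides do not state a convergence criterion.

```python
def iterate_until_convergence(theta0, update, tol=1e-6, max_iter=100):
    """Apply `update` repeatedly until the largest parameter change is below tol.

    theta0: initial parameter vector, e.g. the eta values followed by beta
    update: callable mapping a parameter list to a new parameter list
    Returns the final parameters and the number of iterations performed.
    """
    theta = list(theta0)
    for iteration in range(1, max_iter + 1):
        new_theta = list(update(theta))
        change = max(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if change < tol:
            return theta, iteration
    return theta, max_iter


# Toy update with fixed point 2.0, standing in for one estimation pass.
theta_hat, n_iter = iterate_until_convergence([0.0, 5.0], lambda t: [0.5 * x + 1.0 for x in t])
```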

Simulation

  • \(n=100\) draws
  • \(\eta \overset{i.i.d.}{\sim} \mathcal{U}(-2,2)\)
  • \(\mathrm{e}^u \overset{i.i.d.}{\sim} \mathcal{N}(0,1)\)
  • \(\beta = 0.25\)
  • \(h\equiv 0\)
  • Erdős–Rényi past network with \(p=0.3\)
  • Common friends selection
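
A data-generating process along these lines can be sketched as follows, taking \(n\) as the number of agents. It uses the listed values for \(\beta\), the \(\eta\) distribution, \(h\equiv 0\), and an Erdős–Rényi past network with common-friends selection. The shock distribution in the code is a stand-in (standard normal \(u\)) chosen for illustration and is an assumption of this sketch; `simulate_network` and its arguments are hypothetical names.

```python
import random


def simulate_network(n=100, beta=0.25, p_past=0.3, seed=0):
    """Simulate one network with common-friends externalities and h = 0."""
    rng = random.Random(seed)
    eta = [rng.uniform(-2.0, 2.0) for _ in range(n)]

    # Erdős–Rényi past network: each pair is linked independently with p_past.
    past = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            past[i][j] = past[j][i] = int(rng.random() < p_past)

    g = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # gamma(i, j): agents linked to both i and j in the past network.
            common = [
                k for k in range(n)
                if k != i and k != j and past[i][k] and past[j][k]
            ]
            index = eta[i] + eta[j] + beta * sum(eta[k] for k in common)
            u = rng.gauss(0.0, 1.0)  # ASSUMPTION: stand-in shock distribution
            g[i][j] = g[j][i] = int(index >= u)
    return eta, past, g


eta, past, g = simulate_network(n=20, seed=1)
link_share = sum(map(sum, g)) / (20 * 19)
```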

Simulation

[Figures: simulation results for the initialization and for step 1]

Identification and Estimation of Semi-Parametric Link Formation Models with Externalities

  • Novel externality formulation accommodating many commonly occurring patterns in socio-economic networks
  • Semi-parametric approach relying on a spartan set of assumptions
  • Identification up to normalization of link formation models with externalities
  • Recursive estimation combining kernel density and method of moments components
Pantelis Karapanagiotis
p.karapanagiotis@rug.nl
Sanna Stephan
l.s.stephan@rug.nl

References

Adamic, Lada A. 1999. “The Small World Web.” In Research and Advanced Technology for Digital Libraries, edited by Gerhard Goos, Juris Hartmanis, Jan Van Leeuwen, Serge Abiteboul, and Anne-Marie Vercoustre, 1696:443–52. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-48155-9_27.
Goldsmith-Pinkham, Paul, and Guido W Imbens. 2013. “Social Networks and the Identification of Peer Effects.” Journal of Business & Economic Statistics 31 (3): 253–64. https://doi.org/10.1080/07350015.2013.801251.
Goyal, Sanjeev, Marco J. Van Der Leij, and José Luis Moraga-González. 2006. “Economics: An Emerging Small World.” Journal of Political Economy 114 (2): 403–12. https://doi.org/10.1086/500990.
Jackson, Matthew O, and Brian W Rogers. 2007. “Meeting Strangers and Friends of Friends: How Random Are Social Networks?” American Economic Review 97 (3): 890–915. https://doi.org/10.1257/aer.97.3.890.
MacRae, Duncan. 1960. “Direct Factor Analysis of Sociometric Data.” Sociometry 23 (4): 360. https://doi.org/10.2307/2785690.
Mele, Angelo. 2017. “A Structural Model of Dense Network Formation.” Econometrica 85 (3): 825–50. https://doi.org/10.3982/ECTA10400.
Menzel, Konrad. 2015. “Strategic Network Formation with Many Agents.”
Miyauchi, Yuhei. 2016. “Structural Estimation of Pairwise Stable Networks with Nonnegative Externality.” Journal of Econometrics 195 (2): 224–35. https://doi.org/10.1016/j.jeconom.2016.08.001.
Newman, M. E. J. 2001. “The Structure of Scientific Collaboration Networks.” Proceedings of the National Academy of Sciences 98 (2): 404–9. https://doi.org/10.1073/pnas.98.2.404.
Padgett, John F., and Christopher K. Ansell. 1993. “Robust Action and the Rise of the Medici, 1400-1434.” American Journal of Sociology 98 (6): 1259–319. https://doi.org/10.1086/230190.
Ridder, Geert, and Shuyang Sheng. 2015. “Estimation of Large Network Formation Games.” Work. Pap., Univ. South. Calif., Los Angeles.
———. 2020. “Two-Step Estimation of a Strategic Network Formation Model with Clustering.” arXiv Preprint arXiv:2001.03838. https://doi.org/10.48550/arXiv.2001.03838.
Sheng, Shuyang. 2020. “A Structural Econometric Analysis of Network Formation Games Through Subnetworks.” Econometrica 88 (5): 1829–58. https://doi.org/10.3982/ECTA12558.