Overlapping Community Detection Using Seed Set Expansion


The Program

We propose an efficient overlapping community detection algorithm using a seed expansion approach. In particular, we develop effective seeding strategies for a personalized PageRank scheme that optimizes the conductance community score. The key idea of our algorithm is to find good seeds, and then greedily expand these seeds based on a community metric. We name our algorithm 'NISE' by abbreviating our main idea, Neighborhood-Inflated Seed Expansion.
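To make the idea concrete, the following is a minimal MATLAB sketch of seed expansion on a toy graph: it computes a personalized PageRank vector from one seed, then runs a sweep cut that keeps the prefix of nodes with the lowest conductance. It is an illustration under stated assumptions, not the code shipped in this package. The toy graph, the helper `conductance`, the value of `alpha`, and the exact linear solve are all choices made for this example; the released implementation may compute the PageRank vector differently.

```matlab
% Toy undirected graph: triangles {1,2,3} and {4,5,6} joined by edge 3-4.
edges = [1 2; 1 3; 2 3; 3 4; 4 5; 4 6; 5 6];
n = 6;
A = sparse(edges(:,1), edges(:,2), 1, n, n);
A = A + A';                                  % symmetric adjacency matrix
deg = full(sum(A, 2));                       % node degrees

% Illustrative helper (not part of the NISE package):
% conductance(S) = cut(S) / min(vol(S), vol(V \ S)).
cutS = @(S) full(sum(sum(A(S, setdiff(1:n, S)))));
conductance = @(S) cutS(S) / min(sum(deg(S)), sum(deg) - sum(deg(S)));

% Personalized PageRank from a single seed, solved exactly for this toy
% graph: p = (1 - alpha) * s + alpha * A * D^{-1} * p.
alpha = 0.85;                                % illustrative value
seed = 1;
s = zeros(n, 1);  s(seed) = 1;
P = A * spdiags(1 ./ deg, 0, n, n);          % column-stochastic walk matrix
p = (speye(n) - alpha * P) \ ((1 - alpha) * s);

% Sweep cut: order nodes by degree-normalized PageRank score and keep
% the prefix with the lowest conductance.
[~, order] = sort(p ./ deg, 'descend');
phi = zeros(n - 1, 1);
for j = 1:n-1
    phi(j) = conductance(order(1:j));
end
[best_phi, best_j] = min(phi);
best_set = sort(order(1:best_j))';

fprintf('community = %s, conductance = %.4f\n', mat2str(best_set), best_phi);
```

On this graph the sweep recovers the seed's triangle, nodes 1 to 3, whose conductance is 1/7: one edge leaves the set and the set's volume is 7.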


Download

The code is released under the GNU General Public License (GPL). Our algorithm is written in a mixture of C++ and MATLAB; the high-level interface is written in MATLAB.

Download and extract the files. Then start MATLAB, change to the main directory, and type 'compile' at the prompt.


Usage

C = nise(A, k);
Input:
	A: adjacency matrix
	k: the number of communities
C = nise(A, k, seeding, ego, expansion, nworkers, fid);
Optional Input:
	seeding: seeding strategy ('hrc_graclus' or 'sphub')
	ego: neighborhood inflation (true or false) -- set this to true for large networks (e.g., more than 400 nodes).
	expansion: expansion method ('ppr' or 'vppr')
	nworkers: the number of threads (parallel expansion of the seeds)
	fid: output file ID
Output:
	C: assignment matrix (no. of nodes x no. of clusters)
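
As a hedged usage sketch, the snippet below builds a small adjacency matrix, calls nise(A, k) when the package is on the MATLAB path, and lists the members of each community. The graph is made-up example data. The stand-in matrix used when nise is not found is hand-written only so that the snippet runs on its own, and reading nonzero entries of C as membership is an assumption based on the output description above.

```matlab
% Made-up example graph: cliques {1,2,3,4} and {4,5,6,7} sharing node 4.
edges = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4; 4 5; 4 6; 4 7; 5 6; 5 7; 6 7];
n = 7;
A = sparse(edges(:,1), edges(:,2), 1, n, n);
A = A + A';                        % symmetric adjacency matrix
k = 2;                             % requested number of communities

if exist('nise', 'file') ~= 0
    C = nise(A, k);                % basic call documented above
    % Full form with the documented optional inputs; passing 1 (standard
    % output) as fid is an assumption:
    % C = nise(A, k, 'sphub', true, 'ppr', 4, 1);
else
    % Hand-written stand-in (not NISE output) with the documented shape,
    % no. of nodes x no. of clusters, so the rest of the snippet runs.
    C = sparse([1 2 3 4 4 5 6 7], [1 1 1 1 2 2 2 2], 1, n, k);
end

% Assumption: a nonzero C(i, c) means node i belongs to community c.
for c = 1:size(C, 2)
    members = find(C(:, c))';
    fprintf('community %d: %s\n', c, mat2str(members));
end

% Nodes assigned to more than one community.
overlap = find(sum(C ~= 0, 2) > 1)';
fprintf('overlapping nodes: %s\n', mat2str(overlap));
```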
	

Citation

Please acknowledge the use of the code with a citation.

Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion, J. J. Whang, D. F. Gleich, and I. S. Dhillon, IEEE Transactions on Knowledge and Data Engineering (TKDE), May 2016.
@article{whang-tkde2016,
  author  = {Joyce Jiyoung Whang and David F. Gleich and Inderjit S. Dhillon},
  title   = {Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion},
  journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
  year    = 2016,
  number  = 5,
  pages   = {1272--1284},
  month   = May,
  volume  = 28
}
Overlapping Community Detection Using Seed Set Expansion, J. J. Whang, D. F. Gleich, and I. S. Dhillon, ACM International Conference on Information and Knowledge Management (CIKM), October 2013.
@inproceedings{whang_cikm13,
  title ={Overlapping Community Detection Using Seed Set Expansion},
  author={Joyce Jiyoung Whang and David F. Gleich and Inderjit S. Dhillon},
  booktitle = {ACM International Conference on Information and Knowledge Management},
  pages = {2099--2108},
  year = {2013}
}
Bug reports and comments are always appreciated.