Posts

Image
  Distributed Graph Embedding with Information-Oriented Random Walks A summary of the PVLDB 2023 research  paper  by Peng Fang, Arijit Khan, Siqiang Luo, Fang Wang, Dan Feng, Zhenli Li, Wei Yin, and Yuchao Cao Background [Graph Embedding]: Graph embedding maps graph nodes to low-dimensional vectors and is widely adopted in machine learning tasks such as link prediction [1], node classification [2], clustering [3], and recommendation [4]. Sequential graph embedding techniques [5] fall into three categories. (1) Matrix factorization-based algorithms [6, 7, 8, 9, 10] construct feature representations based on the adjacency or Laplacian matrix and involve spectral techniques [11]. (2) Graph neural networks (GNNs)-based approaches [12, 13, 14, 15, 16] focus on generalizing graph spectra into semi-supervised or supervised graph learning. Both techniques incur high computational overhead and DRAM dependencies, limiting their scalability to large graphs. (3) A plethora of random walk-base
Image
  Neighborhood-based Hypergraph Core Decomposition A summary of the PVLDB 2023 research paper  by Naheed Anjum Arafat, Arijit Khan, Arpit Kumar Rai, and Bishwamittra Ghosh Background: Many real-world relations consist of polyadic entities, e.g., relations between individuals in co-authorships [1], legislators in parliamentary voting [2], items in e-shopping carts [3], proteins in protein complexes, and metabolites in a metabolic process [4, 5]. In these scenarios, the hypergraph is a natural data model where an edge may connect more than two entities. Recently, there has been a growing interest in hypergraph data management [6, 7, 8, 9, 10, 11]. We study neighborhood-based hypergraph core decomposition (Figure 1(a)): a novel way of decomposing hypergraphs into hierarchical neighborhood-cohesive subhypergraphs. It decomposes a hypergraph into nested, strongly-induced maximal subhypergraphs such that all the nodes in every subhypergraph have at least a certain number of neighbors i
Image
Most Probable Densest Subgraphs A summary of the IEEE International Conference on Data Engineering (ICDE) 2023 research  paper  by Arkaprava Saha, Xiangyu Ke, Arijit Khan, and Cheng Long Background:  The discovery of dense subgraphs has attracted extensive attention in the data management community [1], [2], [3], [4], [5].They may correspond to communities [6], filter bubbles and echo chambers [7], [8] in social networks, brain regions responding to stimuli [9] or related to diseases [10], and commercial value motifs in financial domains [11]. They also have wide applications in graph compression and visualization [12], [13], [14], indexing for reachability and distance queries [15], [16], and social piggybacking [17]. Densest subgraphs usually maximize some notion of density in a given graph, e.g., the edge density [1], defined as the ratio of the number of induced edges to the number of nodes in a subgraph. Although there are an exponential number of subgraphs, a densest subgraph