Andra Lungu created FLINK-2661:
----------------------------------
Summary: Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
Key: FLINK-2661
URL: https://issues.apache.org/jira/browse/FLINK-2661
Project: Flink
Issue Type: Task
Components: Gelly
Affects Versions: 0.10
Reporter: Andra Lungu
Assignee: Andra Lungu
Skewed graphs pose particular challenges for computation models such as Gelly's vertex-centric or GSA (gather-sum-apply) iterations, mainly because these approaches process all vertices uniformly, regardless of their degree.
In a vertex-centric iteration, for instance, a skewed (high-degree) node takes longer to process its neighbors than the other nodes in the graph. It acts as a straggler, leaving the other nodes idle until it finishes its computation.
This issue can be mitigated by splitting a high-degree node into subnodes and distributing its edges evenly among the resulting subnodes. The computation is then performed on the subnodes rather than on the original vertex.
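To make the effect of splitting concrete, here is a minimal, framework-independent sketch in Python. It is not Gelly code: the function names split_vertex and max_out_degree, the (vertex, i) naming of subnodes, and the round-robin distribution are all assumptions chosen for illustration. It only redistributes the out-edges of the split vertex; how edges pointing at the original vertex are rerouted is left out.

```python
from collections import defaultdict


def split_vertex(edges, vertex, num_subnodes):
    """Redistribute the out-edges of `vertex` round-robin over subnodes.

    Subnodes are named (vertex, i); this naming is an assumption of the sketch.
    Edges that do not start at `vertex` are left untouched.
    """
    result = []
    seen = 0
    for src, dst in edges:
        if src == vertex:
            result.append(((vertex, seen % num_subnodes), dst))
            seen += 1
        else:
            result.append((src, dst))
    return result


def max_out_degree(edges):
    """Largest number of out-edges any single vertex has to process."""
    degree = defaultdict(int)
    for src, _ in edges:
        degree[src] += 1
    return max(degree.values())


# Toy star graph: "hub" has 8 out-neighbors, every other vertex has one.
edges = [("hub", f"n{i}") for i in range(8)] + [(f"n{i}", "hub") for i in range(8)]
split_edges = split_vertex(edges, "hub", 4)

print(max_out_degree(edges), max_out_degree(split_edges))  # prints "8 2"
```

In this toy graph the heaviest vertex drops from 8 out-edges to 2 per subnode, which is the kind of load reduction the splitting technique is after.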
To this end, we should add a Splitting API on top of Gelly which can help:
- determine skewed nodes
- split them
- merge them back at the end of the computation, given a user-defined combiner.
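The three steps listed above could look roughly as follows. This is a plain-Python sketch under stated assumptions, not the proposed Gelly API: find_skewed, split, neighbor_sum, and merge are hypothetical names, the degree threshold and the number of parts are arbitrary, and the per-vertex computation is a toy sum over out-neighbor values.

```python
from collections import defaultdict


def find_skewed(edges, threshold):
    """Step 1: return the vertices whose out-degree exceeds `threshold`."""
    degree = defaultdict(int)
    for src, _ in edges:
        degree[src] += 1
    return {v for v, d in degree.items() if d > threshold}


def split(edges, skewed, parts):
    """Step 2: spread each skewed vertex's out-edges round-robin over (vertex, i) subnodes."""
    counters = defaultdict(int)
    out = []
    for src, dst in edges:
        if src in skewed:
            out.append(((src, counters[src] % parts), dst))
            counters[src] += 1
        else:
            out.append((src, dst))
    return out


def neighbor_sum(edges, values):
    """Toy per-vertex computation: sum the values of each vertex's out-neighbors."""
    acc = defaultdict(int)
    for src, dst in edges:
        acc[src] += values[dst]
    return dict(acc)


def merge(partials, skewed, combiner):
    """Step 3: fold subnode results back into the original vertex using `combiner`."""
    merged = {}
    for key, value in partials.items():
        # Subnode keys are (vertex, i) tuples; plain vertex ids here are strings.
        original = key[0] if isinstance(key, tuple) and key[0] in skewed else key
        merged[original] = combiner(merged[original], value) if original in merged else value
    return merged


edges = [("hub", f"n{i}") for i in range(6)] + [("n0", "n1"), ("n1", "n2")]
values = {f"n{i}": i + 1 for i in range(6)}

skewed = find_skewed(edges, threshold=3)
partials = neighbor_sum(split(edges, skewed, parts=3), values)
merged = merge(partials, skewed, combiner=lambda a, b: a + b)

# The merged result matches the computation on the unsplit graph.
assert merged == neighbor_sum(edges, values)
```

In this sketch the merged result only matches the unsplit computation because the combiner (addition) is associative and commutative; a combiner without those properties would make the result depend on how the edges were distributed.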
To illustrate the usage of these methods, we should add an example as well as a separate entry in the documentation.