[jira] [Created] (FLINK-11595) Gelly addEdge in certain circumstances still include duplicate vertices.


Shang Yuanchun (Jira)
Calvin Han created FLINK-11595:
----------------------------------

             Summary: Gelly addEdge in certain circumstances still include duplicate vertices.
                 Key: FLINK-11595
                 URL: https://issues.apache.org/jira/browse/FLINK-11595
             Project: Flink
          Issue Type: Bug
          Components: Gelly
    Affects Versions: 1.7.1
         Environment: MacOS, intelliJ
            Reporter: Calvin Han


Assuming a base graph constructed by:

```
public class GraphCorn {

    public static Graph<String, VertexLabel, EdgeLabel> gc;

    public GraphCorn(String filename) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple6<String, String, String, String, String, String>> csvInput = env.readCsvFile(filename)
                .types(String.class, String.class, String.class, String.class, String.class, String.class);

        DataSet<Vertex<String, VertexLabel>> srcTuples = csvInput.project(0, 2)
                .map(new MapFunction<Tuple, Vertex<String, VertexLabel>>() {
                    @Override
                    public Vertex<String, VertexLabel> map(Tuple tuple) throws Exception {
                        VertexLabel lb = new VertexLabel(Util.hash(tuple.getField(1)));
                        return new Vertex<>(tuple.getField(0), lb);
                    }
                }).returns(new TypeHint<Vertex<String, VertexLabel>>(){});

        DataSet<Vertex<String, VertexLabel>> dstTuples = csvInput.project(1, 3)
                .map(new MapFunction<Tuple, Vertex<String, VertexLabel>>() {
                    @Override
                    public Vertex<String, VertexLabel> map(Tuple tuple) throws Exception {
                        VertexLabel lb = new VertexLabel(Util.hash(tuple.getField(1)));
                        return new Vertex<>(tuple.getField(0), lb);
                    }
                }).returns(new TypeHint<Vertex<String, VertexLabel>>(){});

        DataSet<Vertex<String, VertexLabel>> vertexTuples = srcTuples.union(dstTuples).distinct(0);

        DataSet<Edge<String, EdgeLabel>> edgeTuples = csvInput.project(0, 1, 4, 5)
                .map(new MapFunction<Tuple, Edge<String, EdgeLabel>>() {
                    @Override
                    public Edge<String, EdgeLabel> map(Tuple tuple) throws Exception {
                        EdgeLabel lb = new EdgeLabel(Util.hash(tuple.getField(2)), Long.parseLong(tuple.getField(3)));
                        return new Edge<>(tuple.getField(0), tuple.getField(1), lb);
                    }
                }).returns(new TypeHint<Edge<String, EdgeLabel>>(){});

        this.gc = Graph.fromDataSet(vertexTuples, edgeTuples, env);
    }

}
```

Base graph CSV:

```

0,1,a,b,c,0
0,2,a,d,e,1
1,2,b,d,f,2

```

Attempt to add edges using the following function:

```
try(BufferedReader br = new BufferedReader(new FileReader(this.fileName))) {
    for(String line; (line = br.readLine()) != null; ) {
        String[] attributes = line.split(",");
        assert(attributes.length == 6);
        String srcID = attributes[0];
        String dstID = attributes[1];
        String srcLb = attributes[2];
        String dstLb = attributes[3];
        String edgeLb = attributes[4];
        String ts = attributes[5];

        Vertex<String, VertexLabel> src = new Vertex<>(srcID, new VertexLabel(Util.hash(srcLb)));
        Vertex<String, VertexLabel> dst = new Vertex<>(dstID, new VertexLabel(Util.hash(dstLb)));
        EdgeLabel edge = new EdgeLabel(Util.hash(edgeLb), Long.parseLong(ts));

        GraphCorn.gc = GraphCorn.gc.addEdge(src, dst, edge);
    }
} catch (Exception e) {
    System.err.println(e.getMessage());
}
```

The graph components to add are:

```

0,4,a,d,k,3
1,3,b,a,g,3
2,3,d,a,h,4

```
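For reference, the two CSV snippets together mention five distinct vertex IDs. The following is a minimal plain-Java sketch (no Flink involved) that derives that set from the rows quoted in this report. The class and method names are made up for illustration and are not part of Gelly or of the reporter's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ExpectedVertexIds {

    // Rows copied from this report: srcID,dstID,srcLabel,dstLabel,edgeLabel,timestamp
    static final List<String> BASE_ROWS = Arrays.asList(
            "0,1,a,b,c,0",
            "0,2,a,d,e,1",
            "1,2,b,d,f,2");

    static final List<String> ADDED_ROWS = Arrays.asList(
            "0,4,a,d,k,3",
            "1,3,b,a,g,3",
            "2,3,d,a,h,4");

    // Collects the distinct source and destination IDs (columns 0 and 1), sorted.
    @SafeVarargs
    static Set<String> distinctVertexIds(List<String>... rowGroups) {
        Set<String> ids = new TreeSet<>();
        for (List<String> rows : rowGroups) {
            for (String row : rows) {
                String[] fields = row.split(",");
                ids.add(fields[0]);
                ids.add(fields[1]);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // prints [0, 1, 2, 3, 4]
        System.out.println(distinctVertexIds(BASE_ROWS, ADDED_ROWS));
    }
}
```

So a graph without duplicate vertices would be expected to hold exactly one vertex for each of the IDs 0 through 4 after the three edges are added.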

GraphCorn.gc will contain duplicate vertices 0, 1, and 2 (those that already exist in the base graph), which should not be the case according to the documentation.
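To make the symptom easy to check, here is a small plain-Java sketch that reports which IDs occur more than once in a list of vertex IDs. It is only an illustration: the helper name is hypothetical, and the sample list in `main` is hand-written to mirror the duplicates described above rather than taken from an actual run. In a Flink job the IDs could presumably be obtained with something like `GraphCorn.gc.getVertexIds().collect()`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DuplicateVertexCheck {

    // Returns the IDs that appear more than once in the given list, sorted.
    static Set<String> duplicatedIds(List<String> vertexIds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String id : vertexIds) {
            counts.merge(id, 1, Integer::sum);
        }
        Set<String> duplicates = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical input mirroring the report: 0, 1 and 2 each appear twice.
        List<String> ids = Arrays.asList("0", "0", "1", "1", "2", "2", "3", "4");
        // prints [0, 1, 2]
        System.out.println(duplicatedIds(ids));
    }
}
```

An empty result would mean every vertex ID is unique, which is what the documentation suggests should hold after `addEdge`.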


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)