Supporting Multi-tenancy in a Flink

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Supporting Multi-tenancy in a Flink

Aparup Banerjee (apbanerj)
We are building a Stream processing system using Apache beam on top of Flink using the Flink Runner. Our pipelines take Kafka streams as sources , and can write to multiple sinks. The system needs to be tenant aware. Tenants can share same Kafka topic. Tenants can write their own pipelines. We am providing a small framework to write pipelines (on top of beam),  so we have control of what data stream is available to pipeline developer. I am looking for some strategies for following :


  1.  How can I partition / group the data in a way that pipeline developers don’t need to care about tenancy , but the data integrity is maintained ?
  2.  Ways in which I can assign compute(work nodes for e.g) to different jobs based on Tenant configuration.

Thanks,
Aparup