Hi,
I have the following use case to implement in my organization. We have a huge relational database (about 1,000 tables for each of our 30k customers) in our monolith setup. We want to reduce the load on the DB and stop applications from hitting it for the latest events, so an extract is taken from the redo logs into Kafka. On top of those table updates (read from Kafka) we need to build a streaming platform that forms events and sends them to consumers. Each consumer may be interested in the same table but in different updates/columns, depending on their business needs, and wants delivery to their own endpoint / Kinesis / SQS / a Kafka topic. So the case here is *1* table update : *m* events : *n* sinks.

The expected peak load is easily 100k to a million table updates per second (all customers put together). The latency most customers expect is under a second, mostly in the 100-500 ms range.

Is this use case suited to Flink? I went through the Flink book and documentation, and these are the questions I have:

1) For a *1* table update : *m* events : *n* sinks situation, is it better to write our own microservice or to implement it with Flink?

1 a) How does checkpointing happen when we have *1* input : *n* outputs?

1 b) There are no heavy transformations; at most we check that the required columns are present in the DB update and decide whether to create an event. So there is an alternative line of thought to write a service in Node, since the work is more I/O than processing.

2) I see that we write a job, deploy it, and Flink takes care of the rest: parallelism, latency and throughput. But what I need is a generic framework that can handle any table structure, so we do not end up writing one job driver for each case. There are at least 200 types of events in the existing monolith that might move to this new system once it is built.

3) How do we keep the Flink cluster highly available? From the book I understand that internal task-level failures are handled gracefully, but what if the whole Flink cluster goes down, how do we make sure it is HA? I had earlier worked with Spark and we had issues managing it (not a big problem there, since the latency requirement was 15 minutes and we could ramp another cluster up within that time). These are strictly real-time cases and we cannot miss even one message/event.

There are also thoughts about using Kafka Streams / Apache Storm for the same use case (being investigated by a different set of folks).

Thanks,
Prasanna.
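P.S. To make question 1 (and the generic-framework part of question 2) concrete, below is a rough sketch of the kind of topology I have in mind, using Flink's side outputs to fan one table update out to m consumer-specific streams, each with its own sink. Everything here is a placeholder/assumption on my part: the topic name, the column names, the routing-rule map, and the naive string check on the raw CDC record (real code would parse it properly and load the rules from configuration rather than hard-code them). It is only meant to show the shape, not a worked-out implementation.

```java
import java.util.List;
import java.util.Map;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class CdcFanOutJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Periodic state snapshots; end-to-end guarantees then depend on the sinks.
        env.enableCheckpointing(10_000);

        // One side-output tag per downstream consumer (would be generated from config).
        final OutputTag<String> consumerA = new OutputTag<String>("consumer-a") {};
        final OutputTag<String> consumerB = new OutputTag<String>("consumer-b") {};

        // Hypothetical routing rules: which columns each consumer cares about.
        // In practice this would come from a config store or broadcast stream,
        // so adding an event type means new configuration, not a new job.
        final Map<OutputTag<String>, List<String>> rules = Map.of(
                consumerA, List.of("order_id", "status"),
                consumerB, List.of("order_id", "amount"));

        // Source: the redo-log extract that lands in Kafka (topic name is made up).
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("redo-log-extract")
                .setGroupId("cdc-fanout")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> updates =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "table-updates");

        // One pass over each table update; emit 0..m events via side outputs.
        SingleOutputStreamOperator<String> routed = updates.process(
                new ProcessFunction<String, String>() {
                    @Override
                    public void processElement(String update, Context ctx, Collector<String> out) {
                        for (Map.Entry<OutputTag<String>, List<String>> rule : rules.entrySet()) {
                            // Naive column check on the raw JSON string, just for illustration.
                            boolean wanted = rule.getValue().stream()
                                    .allMatch(col -> update.contains("\"" + col + "\""));
                            if (wanted) {
                                ctx.output(rule.getKey(), update);
                            }
                        }
                    }
                });

        // Each consumer stream would get its own sink (Kafka, Kinesis, SQS, HTTP
        // endpoint, etc.); print() is only a stand-in here.
        routed.getSideOutput(consumerA).print("to-consumer-a");
        routed.getSideOutput(consumerB).print("to-consumer-b");

        env.execute("cdc-fan-out-sketch");
    }
}
```

The intent is that the routing rules, not the job code, carry the per-consumer table/column logic, which is what I mean by a generic framework in question 2.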