Processing events based on weights

Processing events based on weights

Vijay Srinivasaraghavan
Hello,
I would like to understand the options available for designing an ingestion pipeline that supports the following requirements.
1) Events come from various sources and, depending on the event type, are stored in specific Kafka topics (say we have 4 topics).
2) Each topic is assigned a weight (Topic1: 0.6, Topic2: 0.1, Topic3: 0.2 and Topic4: 0.1).
3) The events are to be processed (consumed and enriched) according to these weights. For example, if I read 10 events from each topic, I should process 6 events from Topic1, 1 event from Topic2, 2 events from Topic3 and 1 event from Topic4. I am basically trying to do something similar to this implementation: https://github.com/flipkart-incubator/priority-kafka-client
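
To make the arithmetic in requirement 3 concrete, here is a rough sketch of how a batch could be split across topics by weight. It is only an illustration: the function name batch_quotas, the WEIGHTS dictionary and the largest-remainder rounding are assumptions made for this example, not something taken from the linked project or from any Kafka or Flink API. It also assumes the weights sum to 1.

```python
from fractions import Fraction
from math import floor


def batch_quotas(weights, batch_size):
    """Split batch_size across topics in proportion to their weights.

    Hypothetical helper; uses largest-remainder rounding so the quotas
    always add up to batch_size (assuming the weights sum to 1).
    """
    # Go through str() so that 0.6 becomes exactly 3/5 instead of a binary float.
    exact = {t: Fraction(str(w)) * batch_size for t, w in weights.items()}
    quotas = {t: floor(share) for t, share in exact.items()}

    # Slots lost to rounding down go to the topics with the largest fractional part.
    leftover = batch_size - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - quotas[t], reverse=True)
    for topic in by_remainder[:leftover]:
        quotas[topic] += 1
    return quotas


WEIGHTS = {"Topic1": 0.6, "Topic2": 0.1, "Topic3": 0.2, "Topic4": 0.1}
print(batch_quotas(WEIGHTS, 10))
# {'Topic1': 6, 'Topic2': 1, 'Topic3': 2, 'Topic4': 1}
```

With a batch size that does not divide evenly (for example 7), the rounding step decides which topics get the extra slots, so the split is only approximately proportional for small batches.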
Questions:

1) Should I handle the weighted distribution in a (custom) source connector, or use a window after the data has been read?
2) When reading from multiple Kafka topics, how does the source connector enforce the batch read? If the batch size is 100, will it try to read 100 messages from each topic at once, or read round-robin (try to get 100 from Topic1 first, then move on to the next topics until the batch size is reached)?
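
For reference, one way the "weights applied after reading" option from question 1 could look is a smooth weighted round-robin over per-topic buffers. The sketch below only decides which topic to serve in each slot. It does not use any Kafka or Flink API, and the names weighted_order and INT_WEIGHTS are hypothetical. The weights are written as integers (6, 1, 2, 1) to keep the arithmetic exact.

```python
from collections import Counter


def weighted_order(weights, n):
    """Return the topic to serve in each of n slots (smooth weighted round-robin)."""
    current = {topic: 0 for topic in weights}
    total = sum(weights.values())
    order = []
    for _ in range(n):
        # Every topic earns credit equal to its weight in each slot.
        for topic, weight in weights.items():
            current[topic] += weight
        # Serve the topic with the most credit, then charge it the total weight.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        order.append(chosen)
    return order


INT_WEIGHTS = {"Topic1": 6, "Topic2": 1, "Topic3": 2, "Topic4": 1}
slots = weighted_order(INT_WEIGHTS, 10)
print(Counter(slots))
```

Over any full cycle of 10 slots each topic is served in proportion to its weight, and the high-weight topic is interleaved with the others rather than served in one burst. What should happen when a chosen topic has no buffered events (skip it, or wait) is left open here.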
Appreciate your inputs.

Thanks,
Vijay
Re: Processing events based on weights

Vijay Srinivasaraghavan
 Resending email again...
Thanks,
Vijay

The original message was sent on Monday, December 16, 2019, 08:20:31 PM PST.
 