[jira] [Created] (FLINK-22894) Window Top-N should allow n=1


Shang Yuanchun (Jira)
David Anderson created FLINK-22894:
--------------------------------------

             Summary: Window Top-N should allow n=1
                 Key: FLINK-22894
                 URL: https://issues.apache.org/jira/browse/FLINK-22894
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.13.1
            Reporter: David Anderson


I tried to reimplement the Hourly Tips exercise from the DataStream training using Flink SQL. The objective of this exercise is to find the one taxi driver who earned the most in tips during each hour, and report that driver's driverId and the sum of their tips. 

This can be expressed as a window top-n query, where n=1, as in

{{SELECT *}}
{{FROM (}}
{{  SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY sumOfTips DESC) as rownum}}
{{  FROM (}}
{{    SELECT driverId, window_start, window_end, sum(tip) as sumOfTips}}
{{    FROM TABLE(}}
{{      TUMBLE(TABLE fares, DESCRIPTOR(startTime), INTERVAL '1' HOUR))}}
{{    GROUP BY driverId, window_start, window_end}}
{{  )}}
{{) WHERE rownum = 1;}}

This fails because the {{WindowRankOperatorBuilder}} insists on {{rankEnd > 1}}. So, in other words, while it is possible to report the top 2 drivers, or the driver in 2nd place, it's not possible to report only the top driver.

This appears to be an off-by-one error in the range checking.
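
To illustrate the suspected off-by-one, here is a minimal sketch of the two range checks side by side. This is not the actual Flink source: the class {{RankRangeCheck}} and both method names are hypothetical, and the only detail taken from the report is that the builder requires {{rankEnd > 1}}. The sketch assumes the intended rule is simply that both bounds are at least 1 and that {{rankEnd}} is not below {{rankStart}}.

```java
public class RankRangeCheck {

    // Hypothetical check mirroring the reported behavior:
    // rankEnd must be strictly greater than 1, so n=1 is rejected.
    static boolean strictAccepts(long rankStart, long rankEnd) {
        return rankStart >= 1 && rankEnd > 1 && rankEnd >= rankStart;
    }

    // Hypothetical corrected check (an assumption about the intended rule):
    // rankEnd only needs to be at least 1, so n=1 is accepted.
    static boolean inclusiveAccepts(long rankStart, long rankEnd) {
        return rankStart >= 1 && rankEnd >= 1 && rankEnd >= rankStart;
    }

    public static void main(String[] args) {
        // rownum = 1 (top driver only): rankStart = 1, rankEnd = 1
        System.out.println("strict,    [1, 1]: " + strictAccepts(1, 1));
        System.out.println("inclusive, [1, 1]: " + inclusiveAccepts(1, 1));

        // rownum <= 2 (top 2 drivers): rankStart = 1, rankEnd = 2
        System.out.println("strict,    [1, 2]: " + strictAccepts(1, 2));

        // rownum = 2 (driver in 2nd place): rankStart = 2, rankEnd = 2
        System.out.println("strict,    [2, 2]: " + strictAccepts(2, 2));
    }
}
```

Under these assumptions the strict check rejects only the range [1, 1], which matches the behavior described above: the top 2 and the 2nd place both pass, but the top 1 does not.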


--
This message was sent by Atlassian Jira
(v8.3.4#803005)