[jira] [Created] (FLINK-17173) Supports query hint to config "IdleStateRetentionTime" per query in SQL


Shang Yuanchun (Jira)
Danny Chen created FLINK-17173:
----------------------------------

             Summary: Supports query hint to config "IdleStateRetentionTime" per query in SQL
                 Key: FLINK-17173
                 URL: https://issues.apache.org/jira/browse/FLINK-17173
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / API
    Affects Versions: 1.11.0
            Reporter: Danny Chen


Motivation (copied from the user mailing list, [~qzhzm173227]):

In some of our users' use cases, there are a couple of complex join queries where the key domain keeps evolving, and we definitely want some sort of state retention for those queries. But there are others where the key domain doesn't evolve over time, and there is no real guarantee on the maximum gap between two records of the same key appearing in the stream, so we don't want to accidentally invalidate the state for those keys.

Because queries with different requirements can coexist in the same pipeline, I think we have to configure `IDLE_STATE_RETENTION_TIME` per operator.

Just wondering, has a similar requirement (being able to set table / query configuration inside SQL queries) not come up much for SQL users before?

We are also a little concerned because 'toRetractStream(Table, Class, QueryConfig)' is now deprecated, and relying on the fact that TableConfig is read during toDataStream feels like relying on an implementation detail that just happens to work; there is no guarantee that it will keep working in future versions.

Demo syntax:

{code:sql}
CREATE TABLE `/output` AS
SELECT /*+ IDLE_STATE_RETENTION_TIME(minTime ='5m', maxTime ='11m') */ *
FROM `/input1` a
INNER JOIN `/input2` b
ON a.column_name = b.column_name;
{code}
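
To make the option format in the demo concrete, here is a rough sketch of how the hint could be read as a per-query (minTime, maxTime) pair. This is not Flink code: the helper names, the accepted unit suffixes (s, m, h, d) and the "minTime must not exceed maxTime" check are all assumptions made for illustration; the actual parsing and validation would be up to the planner.

```python
import re
from datetime import timedelta

# Hypothetical helpers, not part of Flink: they only illustrate how the
# hint options in the demo query could be interpreted.
HINT_RE = re.compile(
    r"/\*\+\s*IDLE_STATE_RETENTION_TIME\s*\((?P<opts>[^)]*)\)\s*\*/"
)
OPTION_RE = re.compile(r"(\w+)\s*=\s*'([^']*)'")

# Assumed unit suffixes; the issue only shows 'm' (minutes).
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Turn a string such as '5m' into a timedelta."""
    match = re.fullmatch(r"(\d+)\s*([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{UNITS[match.group(2)]: int(match.group(1))})


def parse_retention_hint(sql):
    """Return (min_time, max_time) from the hint, or None if there is no hint."""
    match = HINT_RE.search(sql)
    if match is None:
        return None
    options = dict(OPTION_RE.findall(match.group("opts")))
    missing = {"minTime", "maxTime"} - options.keys()
    if missing:
        raise ValueError(f"missing hint options: {sorted(missing)}")
    min_time = parse_duration(options["minTime"])
    max_time = parse_duration(options["maxTime"])
    # Assumed sanity check: the minimum must not exceed the maximum.
    if min_time > max_time:
        raise ValueError("minTime must not be greater than maxTime")
    return min_time, max_time
```

With the demo query, `parse_retention_hint` would yield 5 minutes and 11 minutes for that query only, while a query without the hint yields `None` and would keep whatever retention is configured globally.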
