Timestamp(timezone) conversion bug in non blink Table/SQL runtime

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Timestamp(timezone) conversion bug in non blink Table/SQL runtime

Shuyi Chen
Hi all,

Currently, in the non-blink table/SQL runtime, Flink used
SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time
(in long) to java.sql.Timestamp. However, as discussed in the recent
Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp()
assumes the input timestamp value is in the current JVM’s default timezone
(which is unusual), NOT milliseconds since epoch. And
SqlFunctions.internalToTimestamp() is used to convert timestamp value in
the current JVM’s default timezone to milliseconds since epoch, which
java.sql.Timestamp constructor takes. Therefore, the results will not only
be wrong, but change if the job runs in machines on different timezones as
well. (The only exception is that all your production machines uses UTC
timezone.)

Here is an example, if the user input value is 0 (00:00:00 UTC on 1 January
1970), and the table/SQL runtime runs in a machine in PST (UTC-8), the
output sql.Timestamp after SqlFunctions.internalToTimestamp() will become
28800000 millisec since epoch (08:00:00 UTC on 1 January 1970); And with
the same input, if the table/SQL runtime runs again in a different machine
in EST (UTC-5), the output sql.Timestamp after
SqlFunctions.internalToTimestamp() will become 18000000 millisec since
epoch (05:00:00 UTC on 1 January 1970).

More details are captured in
https://issues.apache.org/jira/browse/FLINK-13372. Please let me know your
thoughts and correct me if I am wrong. Thanks a lot.

Shuyi
Reply | Threaded
Open this post in threaded view
|

Re: Timestamp(timezone) conversion bug in non blink Table/SQL runtime

Lasse Nedergaard
Hi.

I have encountered the same problem when you input epoch time to window table function and then use window.start and window.end the out doesn’t output in epoch but local time and I located the problem to the same internal function as you.

Med venlig hilsen / Best regards
Lasse Nedergaard


> Den 22. jul. 2019 kl. 20.46 skrev Shuyi Chen <[hidden email]>:
>
> Hi all,
>
> Currently, in the non-blink table/SQL runtime, Flink used SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time (in long) to java.sql.Timestamp. However, as discussed in the recent Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp() assumes the input timestamp value is in the current JVM’s default timezone (which is unusual), NOT milliseconds since epoch. And SqlFunctions.internalToTimestamp() is used to convert timestamp value in the current JVM’s default timezone to milliseconds since epoch, which java.sql.Timestamp constructor takes. Therefore, the results will not only be wrong, but change if the job runs in machines on different timezones as well. (The only exception is that all your production machines uses UTC timezone.)
>
> Here is an example, if the user input value is 0 (00:00:00 UTC on 1 January 1970), and the table/SQL runtime runs in a machine in PST (UTC-8), the output sql.Timestamp after SqlFunctions.internalToTimestamp() will become 28800000 millisec since epoch (08:00:00 UTC on 1 January 1970); And with the same input, if the table/SQL runtime runs again in a different machine in EST (UTC-5), the output sql.Timestamp after SqlFunctions.internalToTimestamp() will become 18000000 millisec since epoch (05:00:00 UTC on 1 January 1970).
>
> More details are captured in https://issues.apache.org/jira/browse/FLINK-13372. Please let me know your thoughts and correct me if I am wrong. Thanks a lot.
>
> Shuyi
Reply | Threaded
Open this post in threaded view
|

Re: Timestamp(timezone) conversion bug in non blink Table/SQL runtime

Shuyi Chen
Hi Lasse,

Thanks for the reply. If your input is in epoch time, you are not getting
local time, instead, you are getting a wrong time that does not make sense.
For example,  if the user input value is 0 (which means 00:00:00 UTC on 1
January 1970), and your local timezone is UTC-8, converting 00:00:00 UTC on
1 January 1970 to your local timezone should yield 16:00:00 Dec 31, 1969.
But actually, you will be getting 08:00:00 UTC on 1 January 1970  from
Table/SQL runtime, which 00:00:00 on 1 January 1970 in your local timezone
(UTC-8). Your input time just get shifted by 8 hours in output.

Shuyi

On Mon, Jul 22, 2019 at 12:49 PM Lasse Nedergaard <[hidden email]>
wrote:

> Hi.
>
> I have encountered the same problem when you input epoch time to window
> table function and then use window.start and window.end the out doesn’t
> output in epoch but local time and I located the problem to the same
> internal function as you.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> Den 22. jul. 2019 kl. 20.46 skrev Shuyi Chen <[hidden email]>:
>
> Hi all,
>
> Currently, in the non-blink table/SQL runtime, Flink used
> SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time
> (in long) to java.sql.Timestamp. However, as discussed in the recent
> Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp()
> assumes the input timestamp value is in the current JVM’s default timezone
> (which is unusual), NOT milliseconds since epoch. And
> SqlFunctions.internalToTimestamp() is used to convert timestamp value in
> the current JVM’s default timezone to milliseconds since epoch, which
> java.sql.Timestamp constructor takes. Therefore, the results will not only
> be wrong, but change if the job runs in machines on different timezones as
> well. (The only exception is that all your production machines uses UTC
> timezone.)
>
> Here is an example, if the user input value is 0 (00:00:00 UTC on 1
> January 1970), and the table/SQL runtime runs in a machine in PST (UTC-8),
> the output sql.Timestamp after SqlFunctions.internalToTimestamp() will
> become 28800000 millisec since epoch (08:00:00 UTC on 1 January 1970); And
> with the same input, if the table/SQL runtime runs again in a different
> machine in EST (UTC-5), the output sql.Timestamp after
> SqlFunctions.internalToTimestamp() will become 18000000 millisec since
> epoch (05:00:00 UTC on 1 January 1970).
>
> More details are captured in
> https://issues.apache.org/jira/browse/FLINK-13372. Please let me know
> your thoughts and correct me if I am wrong. Thanks a lot.
>
> Shuyi
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Timestamp(timezone) conversion bug in non blink Table/SQL runtime

Rong Rong
Hi Shuyi,

I think there were some discussions in the mailing list [1,2] and JIRA
tickets [3,4] that might be related.
Since the table-blink planner doesn't produce such error, I think this
problem is valid and should be fixed.

Thanks,
Rong

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/event-time-timezone-is-not-correct-tt26457.html
[2]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TimeZone-shift-problem-in-Flink-SQL-td25666.html#a25739
[3] https://issues.apache.org/jira/browse/FLINK-8353
[4] https://issues.apache.org/jira/browse/FLINK-8169

On Mon, Jul 22, 2019 at 1:49 PM Shuyi Chen <[hidden email]> wrote:

> Hi Lasse,
>
> Thanks for the reply. If your input is in epoch time, you are not getting
> local time, instead, you are getting a wrong time that does not make sense.
> For example,  if the user input value is 0 (which means 00:00:00 UTC on 1
> January 1970), and your local timezone is UTC-8, converting 00:00:00 UTC on
> 1 January 1970 to your local timezone should yield 16:00:00 Dec 31, 1969.
> But actually, you will be getting 08:00:00 UTC on 1 January 1970  from
> Table/SQL runtime, which 00:00:00 on 1 January 1970 in your local timezone
> (UTC-8). Your input time just get shifted by 8 hours in output.
>
> Shuyi
>
> On Mon, Jul 22, 2019 at 12:49 PM Lasse Nedergaard <
> [hidden email]> wrote:
>
>> Hi.
>>
>> I have encountered the same problem when you input epoch time to window
>> table function and then use window.start and window.end the out doesn’t
>> output in epoch but local time and I located the problem to the same
>> internal function as you.
>>
>> Med venlig hilsen / Best regards
>> Lasse Nedergaard
>>
>>
>> Den 22. jul. 2019 kl. 20.46 skrev Shuyi Chen <[hidden email]>:
>>
>> Hi all,
>>
>> Currently, in the non-blink table/SQL runtime, Flink used
>> SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time
>> (in long) to java.sql.Timestamp. However, as discussed in the recent
>> Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp()
>> assumes the input timestamp value is in the current JVM’s default timezone
>> (which is unusual), NOT milliseconds since epoch. And
>> SqlFunctions.internalToTimestamp() is used to convert timestamp value in
>> the current JVM’s default timezone to milliseconds since epoch, which
>> java.sql.Timestamp constructor takes. Therefore, the results will not only
>> be wrong, but change if the job runs in machines on different timezones as
>> well. (The only exception is that all your production machines uses UTC
>> timezone.)
>>
>> Here is an example, if the user input value is 0 (00:00:00 UTC on 1
>> January 1970), and the table/SQL runtime runs in a machine in PST (UTC-8),
>> the output sql.Timestamp after SqlFunctions.internalToTimestamp() will
>> become 28800000 millisec since epoch (08:00:00 UTC on 1 January 1970); And
>> with the same input, if the table/SQL runtime runs again in a different
>> machine in EST (UTC-5), the output sql.Timestamp after
>> SqlFunctions.internalToTimestamp() will become 18000000 millisec since
>> epoch (05:00:00 UTC on 1 January 1970).
>>
>> More details are captured in
>> https://issues.apache.org/jira/browse/FLINK-13372. Please let me know
>> your thoughts and correct me if I am wrong. Thanks a lot.
>>
>> Shuyi
>>
>>