[ANNOUNCE] Weekly Community Update 2019/33-36


Konstantin Knauf-3
Dear Community,

I am happy to share this "week's" community update, back after a three-week
summer break. It has been a very busy time in the Flink community, as a lot
of FLIP discussions and votes for Apache Flink 1.10 are under way. I will
try to cover a good part of them in this update, along with bugs in Flink
1.9.0 and more...

Flink Development
==============

* [roadmap] There are currently two great resources to get an overview
of *Flink's Roadmap* for 1.10 and beyond. The first one is the recently
updated roadmap on the project website [1] and the other one is a
discussion thread launched by Gary on the features for Flink 1.10 [2]. Gary
and Yu Li stepped up as release managers for Flink 1.10 and proposed a
feature freeze around the end of November 2019 and a release at the
beginning of January 2020. Most of the FLIP discussions covered in this
update are mentioned on these roadmaps.

* [releases] The vote for *Apache Flink 1.8.2* RC1 [3] is currently
ongoing. Check out the corresponding discussion thread [4] for a list of
fixes.

* [development] Following up on the repository split discussion, the
community is now looking into other ways to *reduce the build time* of
Apache Flink. Chesnay has proposed several options, some of which are
being investigated in more detail at the time of writing. Among these are
sharing JVMs between tests for more modules, moving to Gradle as a build
system (better incremental builds) and moving to a different CI system
(Azure Pipelines?). [5]

* [state] Yu Li proposes to add a new state backend to Flink, the
*SpillableHeapStatebackend* [6]. State will primarily live on the Java
heap, but the coldest state will be spilled to disk if memory becomes
scarce. The vote has already passed. [7]

* [python] Jincheng has started a discussion on adding support for
*user-defined functions* in the Python Table API. The high-level
architecture follows the approach of Beam's portability framework of
executing user-defined functions in a separate language-specific
environment. The first FLIP (FLIP-58) will only deal with stateless
user-defined functions and will lay the groundwork. [8]

* [sql] Xu Forward has started a discussion on adding functions to
*construct and query JSON* objects in Flink SQL. The proposal has generally
been well-received, but there is no FLIP yet. [9]

* [sql] Bowen has started a discussion on reworking the *function catalog*.
Among other goals, the rework aims to support external built-in functions
(Hive), to revisit the resolution order of function names and to support
fully qualified function names. [10]

* [connectors] Yijie Shen proposes to contribute the *Apache Pulsar
connector* (currently in Apache Pulsar) back to Apache Flink. While
everyone agrees that a strong Apache Pulsar connector is a valuable
contribution to the project, there are concerns about build time,
maintainability in the long run and dependencies on FLIP-27 (New Source
Interface). The discussion is ongoing. [11]

* [connectors] From Apache Flink 1.10 onwards, the *Kinesis Connector* will
be part of the Apache Flink release. In the past this was blocked by the
license of its dependencies, which has recently been changed to Apache
2.0. [12]

* [recovery] Till has published two small FLIPs on *Flink's restart
strategies*. The first one, FLIP-61, proposes to change how the restart
strategy is determined: strategy-specific configuration properties are
ignored unless the corresponding restart strategy was set via
"restart-strategy". The other one, FLIP-62, proposes to change the default
restart delay for all strategies from 0s to 1s. The vote has passed for
both of them. [13, 14]

* [resource management] Following up on FLIP-49, Xintong Song has started a
discussion on FLIP-53 to add *fine-grained operator resource management* to
Flink [15]. If I understand it correctly, the feature will only be
available via the Blink Planner of the Table API at first, and might later
be extended to the DataStream API. The DataSet API will not be affected.
The vote [16] is currently ongoing.

* [configuration] Dawid introduced a FLIP that adds support for configuring
ExecutionConfig (and similar classes) from a file or, more generally, from
layers above the StreamExecutionEnvironment, to which you currently need
access in order to change these settings. [17]

* [development] Stephan proposed to switch to *Java's Duration class*
instead of Flink's Time class for non-API parts of Flink (API maybe in
Flink 2.0). [18]

* [development] Gyula started a discussion to unify the implementation of
the *Builder pattern in Flink*. Following the discussion he will add some
guidelines to the code style guide. [19]

* [releases] *Apache Flink-shaded 8.0* has been released. [20]
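
To make the two restart-strategy FLIPs above a bit more concrete, here is a
minimal Java sketch of the behavior as it is described in this update. It is
not Flink's implementation: the class RestartStrategySketch, the method
resolve and the simplified delay format (a plain number of seconds) are
assumptions made for illustration. Only the configuration keys
"restart-strategy" and "restart-strategy.fixed-delay.delay" are taken from
Flink's configuration.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

final class RestartStrategySketch {

    // FLIP-62 proposes 1s as the default restart delay (previously 0s).
    static final Duration DEFAULT_DELAY = Duration.ofSeconds(1);

    /**
     * Hypothetical resolution logic: returns a description of the restart
     * strategy that would be picked for the given configuration.
     */
    static String resolve(Map<String, String> conf) {
        String strategy = conf.get("restart-strategy");
        if (strategy == null) {
            // FLIP-61 idea: without an explicit "restart-strategy" entry,
            // strategy-specific keys are ignored.
            return "not configured";
        }
        if (strategy.equals("fixed-delay")) {
            // Simplification: the delay is read as a plain number of seconds.
            String raw = conf.get("restart-strategy.fixed-delay.delay");
            Duration delay = (raw == null)
                    ? DEFAULT_DELAY
                    : Duration.ofSeconds(Long.parseLong(raw));
            return "fixed-delay, delay=" + delay.getSeconds() + "s";
        }
        // Other strategies are passed through unchanged in this sketch.
        return strategy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("restart-strategy.fixed-delay.delay", "10");
        System.out.println(resolve(conf)); // not configured

        conf.put("restart-strategy", "fixed-delay");
        System.out.println(resolve(conf)); // fixed-delay, delay=10s

        conf.remove("restart-strategy.fixed-delay.delay");
        System.out.println(resolve(conf)); // fixed-delay, delay=1s
    }
}
```

The first call shows the FLIP-61 idea (a delay without "restart-strategy" has
no effect), the last one the FLIP-62 default of 1s.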
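
As a small illustration of the Duration proposal above, the following sketch
shows what time handling with java.time.Duration from the JDK looks like. The
class DurationSketch and the method isTimedOut are hypothetical names chosen
for this example and do not come from the Flink code base.

```java
import java.time.Duration;

final class DurationSketch {

    // Hypothetical helper: true if more time has elapsed than the timeout allows.
    static boolean isTimedOut(Duration elapsed, Duration timeout) {
        return elapsed.compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(10);
        // 5 x 2500 ms = 12.5 s
        Duration elapsed = Duration.ofMillis(2500).multipliedBy(5);

        System.out.println(timeout.toMillis());                  // 10000
        System.out.println(isTimedOut(elapsed, timeout));        // true
        System.out.println(Duration.parse("PT1M").getSeconds()); // 60
    }
}
```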

[1] https://flink.apache.org/roadmap.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-10-tp32824p32844.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-2-release-candidate-1-tp32808.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-2-tp32402.html
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Reducing-build-times-tp31800.html
[6]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-50-Spill-able-Heap-State-Backend-tp31896.html
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-tp31673.html
[9]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-JSON-functions-in-Flink-SQL-tp32674.html
[10]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-tp32291p32542.html
[11]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tp32538.html
[12]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Kinesis-connector-becomes-part-of-Flink-releases-tp32459.html
[13]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-61-Simplify-Flink-s-cluster-level-RestartStrategy-configuration-tp32634.html
[14]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-62-Set-default-restart-delay-for-FixedDelay-and-FailureRateRestartStrategy-to-1s-tp32635.html
[15]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
[16]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
[17]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-tp32359.html
[18]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Use-Java-s-Duration-instead-of-Flink-s-Time-tp32163.html
[19]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/CODE-STYLE-Builder-pattern-tp32225.html
[20]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Release-flink-shaded-8-0-tp31860.html

Notable Bugs
==========

For this update, I will focus on new bugs in Flink 1.9.0.

* [FLINK-13386] [FLINK-13799] [FLINK-13591] [1.9.0] A couple of issues with
the new dashboard have already been filed. If you experience any friction
with it, check if these tickets already address the issue. Otherwise please
create a new issue. [21,22,23]

* [FLINK-13568] [1.9.0] It is currently not possible to create a table with
a "String" data type via the SQL DDL. Resolved. [24]

* [FLINK-13940] [1.9.0] [1.8.1] Since Flink 1.8.0 the StreamingFileSink
cleans up some temporary files in S3 during recovery. If a job fails during
recovery after the cleanup, subsequent recovery attempts also fail because
the files have already been cleaned up. This results in data loss. Fixed
with a workaround for 1.9.1 and 1.8.2. [25]

* [FLINK-13526] [1.9.0] When switching to a non-existent catalog or
database in the SQL Client, the client crashes. [26]

* [FLINK-13737] flink-examples-table is missing from the binary
distribution of Flink 1.9.0. Fixed for 1.9.1. [27]

* [FLINK-13958] [1.9.0] [1.8.1] [1.7.2] A native library can only be loaded
by a single classloader per JVM. This may be a problem if a native library
is loaded via Flink's user classloader, because the library might be
reloaded after recovery by a new user classloader. The discussion on a
possible resolution is ongoing. [28]

[21] https://issues.apache.org/jira/browse/FLINK-13591
[22] https://issues.apache.org/jira/browse/FLINK-13386
[23] https://issues.apache.org/jira/browse/FLINK-13799
[24] https://issues.apache.org/jira/browse/FLINK-13568
[25] https://issues.apache.org/jira/browse/FLINK-13940
[26] https://issues.apache.org/jira/browse/FLINK-13526
[27] https://issues.apache.org/jira/browse/FLINK-13737
[28] https://issues.apache.org/jira/browse/FLINK-13958

Events, Blog Posts, Misc
===================

* *Kostas Kloudas* is now a member of the Apache Flink PMC.
Congratulations! [29]
* *Andrey Zagrebin* is now an Apache Flink Committer. Congrats! [30]
* Flink Forward Europe *training registration closes* on September 30th.
This time there are four different full-day training options (Dev, Ops,
SQL, Tuning & Troubleshooting). [31]
* Upcoming Meetups
    * *Enrico Canzonieri* of Yelp and *David Massart* of Tentative will
share their Apache Flink user stories of Yelp and BNP Paribas at the next
*Bay Area Apache Flink Meetup* on the 24th of September. [32]
    * On the 23rd of September there will be another edition of the
*London Flink Meetup* with a talk by Yelp on how they run Flink on K8s. [33]

[29]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Kostas-Kloudas-joins-the-Flink-PMC-tp32810.html
[30]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Andrey-Zagrebin-becomes-a-Flink-committer-tp31735p31931.html
[31] https://europe-2019.flink-forward.org/training-program
[32] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262680261/
[33] https://www.meetup.com/Apache-Flink-London-Meetup/events/264123672/

Cheers,

Konstantin

--

Konstantin Knauf | Solutions Architect
