Search

Suggested keywords:
  • Java
  • Docker
  • Git
  • React
  • NextJs
  • Spring boot
  • Laravel

Apache Tez - A Framework for YARN-based, Data Processing Applications In Hadoop

  • Share this:
post-title
Apache Tez is an extensible framework for building high performance batch and interactive data processing applications, coordinated by YARN in Apache Hadoop. Tez improves the MapReduce paradigm by dramatically improving its speed, while maintaining MapReduce’s ability to scale to petabytes of data. Important Hadoop ecosystem projects like Apache Hive and Apache Pig use Apache Tez, as do a growing number of third party data access applications developed for the broader Hadoop ecosystem.

Tez can execute complex directed acyclic graphs of general data processing tasks. In many ways it can be thought of as a more flexible and powerful successor of the map-reduce framework. By allowing projects like Apache Hive and Apache Pig to run a complex DAG of tasks, Tez can be used to process data, that earlier took multiple MR jobs, now it can be done in a single Tez job.
https://tez.apache.org/
https://github.com/apache/tez
License:
Tech: