To use the Kite modules in a Java project, add the Cloudera repository to your Maven POM:

<repository>
  <id>cdh.repo</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
  <name>Cloudera Repositories</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>

Then add a dependency for each module you want to use, taking the coordinates from the Dependency Information pages listed below. You can also view the transitive dependencies for each module.
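As a rough sketch of what such a declaration might look like, the fragment below adds the Kite Data core module. The coordinates `org.kitesdk:kite-data-core` are an assumption based on the project's usual naming and should be confirmed on the module's Dependency Information page. The version matches the release this page documents.

```xml
<!-- Assumed coordinates; confirm on the module's Dependency Information page -->
<dependency>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-data-core</artifactId>
  <version>0.12.0</version>
</dependency>
```

The fragment goes inside the `<dependencies>` element of your POM, alongside the repository declaration above.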

Hadoop Component Dependencies

As a general rule, Kite modules declare their Hadoop component dependencies with provided scope, because in many cases the container the code runs in supplies them.

For example:

  • Kite Data has a provided dependency on the core Hadoop libraries
  • Kite Crunch has a provided dependency on Crunch and the core Hadoop libraries
  • Kite HCatalog has a provided dependency on HCatalog

The following containers provide the dependencies listed:

  • The Kite Maven Plugin goal kite:run-tool provides the Hadoop and HCatalog dependencies.
  • The Kite Maven Plugin goal kite:run-job provides the Hadoop dependencies. HCatalog should be added as a runtime dependency (example).
  • The hadoop jar command provides the Hadoop dependencies.
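For the kite:run-job case, a runtime-scoped HCatalog dependency might be declared along the following lines. This is a sketch under stated assumptions: the coordinates `org.apache.hcatalog:hcatalog-core` and the `hcatalog.version` property are illustrative and not taken from this page, so check them against the HCatalog release shipped with your cluster.

```xml
<!-- Assumed coordinates; hcatalog.version is a hypothetical property defined in your own POM -->
<dependency>
  <groupId>org.apache.hcatalog</groupId>
  <artifactId>hcatalog-core</artifactId>
  <version>${hcatalog.version}</version>
  <!-- runtime: not needed to compile, but present on the classpath when the job runs -->
  <scope>runtime</scope>
</dependency>
```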

However, there are some cases where you may have to provide the relevant Hadoop component dependencies yourself:

  • Crunch programs (even those running in the containers listed above) (example)
  • Standalone Java programs, not run using kite:run-tool or hadoop jar (example)
  • Web apps (example)
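In these cases the provided-scope dependencies are not on the classpath at run time, so the project has to declare them itself. A minimal sketch for a standalone program, assuming the standard `org.apache.hadoop:hadoop-client` artifact and a hypothetical `hadoop.version` property set to match your cluster:

```xml
<!-- hadoop.version is a hypothetical property; set it to the Hadoop version of your cluster -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
  <!-- no <scope> element: the default compile scope puts the libraries on the runtime classpath -->
</dependency>
```

Leaving out the scope is the point of the example: with the default compile scope, Maven includes the Hadoop libraries when the program runs or is packaged, which is what a container would otherwise have done.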

Kite Data Modules

Kite Morphlines Modules

Kite Tools Modules


Version: 0.12.0. Last Published: 2014-03-11.
