Dependency Information
The simplest way to specify Kite and Hadoop dependencies is to use the Kite App Parent POM. This ensures that you inherit a compatible set of dependencies and that the Kite plugins are suitably configured. Add the following to your POM:
<parent>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-app-parent-cdh4</artifactId>
  <version>1.1.0</version>
</parent>
Alternatively, if you choose not to use the Kite App Parent POM, add the Cloudera repository to the repositories section of your Maven POM:
<repository>
  <id>cdh.repo</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
  <name>Cloudera Repositories</name>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>
Then add a dependency for each module you want to use, referring to the Dependency Information pages listed below. You can also view the transitive dependencies for each module.
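As an illustration, a module dependency might look like the sketch below. The artifact ID kite-data-core is assumed here from the module name Kite Data Core, and the version is assumed to match the 1.1.0 used elsewhere on this page; confirm both on the module's Dependency Information page.

```xml
<dependencies>
  <!-- Assumed coordinates for the Kite Data Core module; verify them on
       the module's Dependency Information page before use. -->
  <dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-data-core</artifactId>
    <version>1.1.0</version>
  </dependency>
</dependencies>
```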
Hadoop Component Dependencies
As a general rule, Kite modules mark Hadoop component dependencies as having provided scope, since in many cases the dependencies are provided by the container that the code is running in.
For example,
- Kite Data has a provided dependency on the core Hadoop libraries
- Kite Crunch has a provided dependency on Crunch and the core Hadoop libraries
- Kite HCatalog has a provided dependency on Hive
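To illustrate what provided scope means in Maven: a dependency declared this way is on the compile and test classpath but is not passed on transitively, and the runtime container is expected to supply it. The sketch below is a generic example, not an excerpt from a Kite POM; the hadoop-client artifact and the hadoop.version property are assumptions chosen for illustration.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- hadoop.version is a hypothetical property defined elsewhere in the POM -->
  <version>${hadoop.version}</version>
  <!-- provided: used for compiling and testing, supplied by the container at runtime -->
  <scope>provided</scope>
</dependency>
```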
The following containers provide the dependencies listed:
- The hadoop jar command provides the core Hadoop dependencies.
- The MapReduce task environment provides the core Hadoop dependencies.
- When used from the Kite App Parent POM, the Kite Maven Plugin provides the Hadoop, HBase, and Hive dependencies. If the Kite App Parent POM is not being used, these dependencies should be specified in the plugin's dependencies section of the POM.
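When the Kite App Parent POM is not used, the plugin's dependencies section might be filled in roughly as follows. This is a sketch under assumptions: the plugin coordinates (kite-maven-plugin at version 1.1.0) are assumed, and the CDH 5 Hadoop grouping dependency described later in this section is used only as one possible choice. HBase and Hive dependencies, if needed, would require analogous entries.

```xml
<build>
  <plugins>
    <plugin>
      <!-- Assumed coordinates for the Kite Maven Plugin -->
      <groupId>org.kitesdk</groupId>
      <artifactId>kite-maven-plugin</artifactId>
      <version>1.1.0</version>
      <dependencies>
        <!-- Hadoop dependencies made available to the plugin itself;
             pick the grouping artifact that matches your distribution -->
        <dependency>
          <groupId>org.kitesdk</groupId>
          <artifactId>kite-hadoop-cdh5-dependencies</artifactId>
          <version>1.1.0</version>
          <type>pom</type>
        </dependency>
      </dependencies>
    </plugin>
  </plugins>
</build>
```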
However, there are some cases where you may have to provide the relevant Hadoop component dependencies yourself, and Kite has grouping dependencies for this purpose.
There is a grouping dependency for each flavor of Hadoop distribution; they differ by Maven artifact ID:
- Apache Hadoop 2 (the default), kite-hadoop2-dependencies
- Apache Hadoop 1, kite-hadoop1-dependencies
- CDH 4, kite-hadoop-cdh4-dependencies
- CDH 5, kite-hadoop-cdh5-dependencies
This is how you would specify a dependency on the CDH 5 dependencies:
<dependency>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-hadoop-cdh5-dependencies</artifactId>
  <version>1.1.0</version>
  <type>pom</type>
  <scope>compile</scope>
</dependency>
There is an analogous set of grouping dependencies for HBase:
- Apache Hadoop 2 (the default), kite-hbase2-dependencies
- Apache Hadoop 1, kite-hbase1-dependencies
- CDH 4, kite-hbase-cdh4-dependencies
- CDH 5, kite-hbase-cdh5-dependencies
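By analogy with the Hadoop example, a dependency on the CDH 5 HBase grouping might be declared as sketched here. The version, type, and scope are assumed to mirror the Hadoop grouping dependency; only the artifact ID comes from the list above.

```xml
<dependency>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-hbase-cdh5-dependencies</artifactId>
  <!-- version, type, and scope assumed to match the Hadoop grouping example -->
  <version>1.1.0</version>
  <type>pom</type>
  <scope>compile</scope>
</dependency>
```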
Here are some scenarios in which you need to provide Hadoop component dependencies:
- Crunch jobs, even those running in the containers listed above. However, if you use the Kite App Parent POM, Crunch is provided (example)
- Standalone Java programs that are not run using kite:run-tool or hadoop jar (example)
- Web apps (example)
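For a standalone Java program, for instance, a Kite module and a grouping dependency might be combined as in the sketch below. The kite-data-core artifact ID is an assumption based on the module name, and CDH 5 is chosen only for illustration; substitute the module and grouping artifact that match your application and distribution.

```xml
<dependencies>
  <!-- Kite module used by the application (assumed artifact ID) -->
  <dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-data-core</artifactId>
    <version>1.1.0</version>
  </dependency>
  <!-- Hadoop libraries that no container supplies for a standalone program -->
  <dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-hadoop-cdh5-dependencies</artifactId>
    <version>1.1.0</version>
    <type>pom</type>
    <scope>compile</scope>
  </dependency>
</dependencies>
```

Because no container supplies the Hadoop libraries in this case, the grouping dependency uses compile scope so that the libraries end up on the runtime classpath.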
Kite Data Modules
- Kite Data Core - Dependency Information, Dependencies
- Kite Data Hive - Dependency Information, Dependencies
- Kite Data HBase - Dependency Information, Dependencies
- Kite Data Crunch - Dependency Information, Dependencies
- Kite Data MapReduce - Dependency Information, Dependencies
Kite Morphlines Modules
- Kite Morphlines Core - Dependency Information, Dependencies
- Kite Morphlines Avro - Dependency Information, Dependencies
- Kite Morphlines JSON - Dependency Information, Dependencies
- Kite Morphlines protobuf - Dependency Information, Dependencies
- Kite Morphlines Hadoop Core - Dependency Information, Dependencies
- Kite Morphlines Hadoop Parquet Avro - Dependency Information, Dependencies
- Kite Morphlines Hadoop RC File - Dependency Information, Dependencies
- Kite Morphlines Hadoop Sequence File - Dependency Information, Dependencies
- Kite Morphlines Maxmind - Dependency Information, Dependencies
- Kite Morphlines Metrics Servlets - Dependency Information, Dependencies
- Kite Morphlines Saxon - Dependency Information, Dependencies
- Kite Morphlines Solr Core - Dependency Information, Dependencies
- Kite Morphlines Solr Cell - Dependency Information, Dependencies
- Kite Morphlines Tika Core - Dependency Information, Dependencies
- Kite Morphlines Tika Decompress - Dependency Information, Dependencies
- Kite Morphlines Twitter - Dependency Information, Dependencies
- Kite Morphlines UserAgent - Dependency Information, Dependencies
- Kite Morphlines All - Dependency Information, Dependencies
- Kite Morphlines All Except Solr - Dependency Information, Dependencies
Kite Tools Modules
- Kite Tools - Dependency Information, Dependencies