kite:create-dataset

Full name:

org.kitesdk:kite-maven-plugin:1.0.0:create-dataset

Description:

Create a named dataset whose entries conform to a defined schema.

Attributes:

  • Requires dependency resolution of artifacts in scope: compile.

Optional Parameters

Name Type Since Description
avroSchemaFile String - The file containing the Avro schema. If no file with the specified name is found on the local filesystem, the classpath is searched for a matching resource. Either this property or kite.avroSchemaReflectClass must be specified.
User property is: kite.avroSchemaFile.
avroSchemaReflectClass String - The fully qualified class name of the Avro reflect class used to generate a schema. The class must be available on the classpath. Either this property or kite.avroSchemaFile must be specified.
User property is: kite.avroSchemaReflectClass.
columnDescriptorFile String - (no description)
User property is: kite.columnDescriptorFile.
datasetName String - The name of the dataset to create. Ignored if kite.uri is set.
User property is: kite.datasetName.
datasetNamespace String - The namespace of the dataset to create. Ignored if kite.uri is set.
Default value is: default.
User property is: kite.datasetNamespace.
format String - The file format (avro or parquet).
User property is: kite.format.
hadoopConfiguration Properties - Hadoop configuration properties.
User property is: kite.hadoopConfiguration.
hcatalog boolean - If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem.
User property is: kite.hcatalog.
partitionExpression String - The partition expression, in JEXL format (experimental).
User property is: kite.partitionExpression.
partitionStrategyFile String - (no description)
User property is: kite.partitionStrategyFile.
repositoryUri String - The URI specifying the dataset repository, e.g. repo:hdfs://host:8020/data. Optional, but if specified then kite.rootDirectory and kite.hcatalog are ignored.
User property is: kite.repositoryUri.
rootDirectory String - The root directory of the dataset repository. Optional if using HCatalog for metadata storage.
User property is: kite.rootDirectory.
uri String - A Kite dataset URI.
User property is: kite.uri.
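
As a rough illustration of how these parameters fit together, the following POM fragment sketches one possible plugin configuration. The plugin coordinates come from the full name above; the directory, dataset name, and schema path are hypothetical placeholders, not values defined by the plugin.

```xml
<plugin>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-maven-plugin</artifactId>
  <version>1.0.0</version>
  <configuration>
    <!-- Hypothetical values; replace them with your own. -->
    <rootDirectory>/tmp/data</rootDirectory>
    <datasetName>events</datasetName>
    <!-- Looked up on the local filesystem first, then on the classpath. -->
    <avroSchemaFile>src/main/avro/event.avsc</avroSchemaFile>
    <format>avro</format>
  </configuration>
</plugin>
```

Each element name matches a parameter name in the table; the same values could presumably be supplied through the corresponding user properties (for example, kite.datasetName) instead.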

Parameter Details

avroSchemaFile:

The file containing the Avro schema. If no file with the specified name is found on the local filesystem, the classpath is searched for a matching resource. Either this property or kite.avroSchemaReflectClass must be specified.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.avroSchemaFile

avroSchemaReflectClass:

The fully qualified class name of the Avro reflect class used to generate a schema. The class must be available on the classpath. Either this property or kite.avroSchemaFile must be specified.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.avroSchemaReflectClass
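
Because the schema can come from either a file or a reflect class, a configuration would normally set only one of the two. The fragment below is a minimal sketch of the reflect-class variant; com.example.Event is a hypothetical class name and must be replaced with a class that is actually on the project's classpath.

```xml
<configuration>
  <!-- Hypothetical dataset name. -->
  <datasetName>events</datasetName>
  <!-- Hypothetical class; used here in place of avroSchemaFile. -->
  <avroSchemaReflectClass>com.example.Event</avroSchemaReflectClass>
</configuration>
```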

columnDescriptorFile:

(no description)
  • Type: java.lang.String
  • Required: No
  • User Property: kite.columnDescriptorFile

datasetName:

The name of the dataset to create. Ignored if kite.uri is set.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.datasetName

datasetNamespace:

The namespace of the dataset to create. Ignored if kite.uri is set.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.datasetNamespace
  • Default: default

format:

The file format (avro or parquet).
  • Type: java.lang.String
  • Required: No
  • User Property: kite.format

hadoopConfiguration:

Hadoop configuration properties.
  • Type: java.util.Properties
  • Required: No
  • User Property: kite.hadoopConfiguration
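
Since this parameter has type java.util.Properties, it can be set in the POM using Maven's standard mapping for Properties parameters (nested property elements with name and value children). The sketch below assumes you want to set the Hadoop property fs.defaultFS; the host and port are hypothetical.

```xml
<configuration>
  <hadoopConfiguration>
    <property>
      <!-- A standard Hadoop setting; the value is a placeholder. -->
      <name>fs.defaultFS</name>
      <value>hdfs://host:8020</value>
    </property>
  </hadoopConfiguration>
</configuration>
```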

hcatalog:

If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem.
  • Type: boolean
  • Required: No
  • User Property: kite.hcatalog

partitionExpression:

The partition expression, in JEXL format (experimental).
  • Type: java.lang.String
  • Required: No
  • User Property: kite.partitionExpression

partitionStrategyFile:

(no description)
  • Type: java.lang.String
  • Required: No
  • User Property: kite.partitionStrategyFile

repositoryUri:

The URI specifying the dataset repository, e.g. repo:hdfs://host:8020/data. Optional, but if specified then kite.rootDirectory and kite.hcatalog are ignored.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.repositoryUri
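
A configuration that uses a repository URI might look like the sketch below. The URI is the example from the description above; the dataset name and schema path are hypothetical. Note that rootDirectory and hcatalog are left out, since they would be ignored when repositoryUri is set.

```xml
<configuration>
  <repositoryUri>repo:hdfs://host:8020/data</repositoryUri>
  <!-- Hypothetical values. -->
  <datasetName>events</datasetName>
  <avroSchemaFile>src/main/avro/event.avsc</avroSchemaFile>
</configuration>
```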

rootDirectory:

The root directory of the dataset repository. Optional if using HCatalog for metadata storage.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.rootDirectory

uri:

A Kite dataset URI.
  • Type: java.lang.String
  • Required: No
  • User Property: kite.uri
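
As a hedged example, the fragment below sets a dataset URI directly. The URI shown assumes the dataset:hdfs form with the namespace and dataset name as the last two path segments; the host, path, namespace, and name are all hypothetical, so check the Kite dataset URI documentation for the exact form your storage requires. Because datasetName and datasetNamespace are ignored when kite.uri is set, they are omitted here.

```xml
<configuration>
  <!-- Assumed URI form; every component is a placeholder. -->
  <uri>dataset:hdfs://host:8020/data/default/events</uri>
  <!-- Hypothetical schema path. -->
  <avroSchemaFile>src/main/avro/event.avsc</avroSchemaFile>
</configuration>
```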