kite:create-dataset
Full name:
org.kitesdk:kite-maven-plugin:0.13.0:create-dataset
Description:
Create a named dataset whose entries conform to a defined schema.
Attributes:
- Requires dependency resolution of artifacts in scope: compile.
Required Parameters
Name | Type | Since | Description |
---|---|---|---|
datasetName | String | - | The name of the dataset to create. User property is: kite.datasetName. |
Optional Parameters
Name | Type | Since | Description |
---|---|---|---|
avroSchemaFile | String | - | The file containing the Avro schema. If no file with the specified name is found on the local filesystem, the classpath is searched for a matching resource. Either this property or kite.avroSchemaReflectClass must be specified. User property is: kite.avroSchemaFile. |
avroSchemaReflectClass | String | - | The fully qualified class name of the Avro reflect class used to generate a schema. The class must be available on the classpath. Either this property or kite.avroSchemaFile must be specified. User property is: kite.avroSchemaReflectClass. |
format | String | - | The file format (avro or parquet). User property is: kite.format. |
hadoopConfiguration | Properties | - | Hadoop configuration properties. User property is: kite.hadoopConfiguration. |
hcatalog | boolean | - | If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem. User property is: kite.hcatalog. |
partitionExpression | String | - | The partition expression, in JEXL format (experimental). User property is: kite.partitionExpression. |
repositoryUri | String | - | The URI specifying the dataset repository, e.g. repo:hdfs://host:8020/data. Optional, but if specified then kite.rootDirectory and kite.hcatalog are ignored. User property is: kite.repositoryUri. |
rootDirectory | String | - | The root directory of the dataset repository. Optional if using HCatalog for metadata storage. User property is: kite.rootDirectory. |
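
As a minimal sketch of how the parameters above might be set in a project's pom.xml, the fragment below declares the plugin with one required and two optional parameters. The dataset name `events`, the schema path `src/main/avro/event.avsc`, and the root directory `/tmp/data` are hypothetical values chosen for illustration; only the plugin coordinates and parameter names come from this page.

```xml
<!-- Goes inside <build><plugins> of the project's pom.xml. -->
<plugin>
  <groupId>org.kitesdk</groupId>
  <artifactId>kite-maven-plugin</artifactId>
  <version>0.13.0</version>
  <configuration>
    <!-- Required: name of the dataset to create (hypothetical value). -->
    <datasetName>events</datasetName>
    <!-- Schema source: a local file, falling back to a classpath resource (hypothetical path). -->
    <avroSchemaFile>src/main/avro/event.avsc</avroSchemaFile>
    <!-- Root directory of the dataset repository (hypothetical path). -->
    <rootDirectory>/tmp/data</rootDirectory>
  </configuration>
</plugin>
```

With the plugin declared this way, the goal should be invocable as `mvn kite:create-dataset`. Because every parameter has a user property, the same values can usually be supplied on the command line instead, for example `-Dkite.datasetName=events`.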
Parameter Details
avroSchemaFile:
The file containing the Avro schema. If no file with the specified name is found on the local filesystem, the classpath is searched for a matching resource. Either this property or kite.avroSchemaReflectClass must be specified.
- Type: java.lang.String
- Required: No
- User Property: kite.avroSchemaFile
avroSchemaReflectClass:
The fully qualified class name of the Avro reflect class used to generate a schema. The class must be available on the classpath. Either this property or kite.avroSchemaFile must be specified.
- Type: java.lang.String
- Required: No
- User Property: kite.avroSchemaReflectClass
datasetName:
The name of the dataset to create.
- Type: java.lang.String
- Required: Yes
- User Property: kite.datasetName
format:
The file format (avro or parquet).
- Type: java.lang.String
- Required: No
- User Property: kite.format
hadoopConfiguration:
Hadoop configuration properties.
- Type: java.util.Properties
- Required: No
- User Property: kite.hadoopConfiguration
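
Because this parameter is a java.util.Properties, it would normally be written in the POM using Maven's standard name/value mapping for Properties parameters. The sketch below assumes that mapping applies here unchanged. The Hadoop key `fs.defaultFS` and the host `hdfs://host:8020` are illustrative choices, not values prescribed by this page.

```xml
<!-- Fragment of the plugin's <configuration> element. -->
<configuration>
  <hadoopConfiguration>
    <!-- Each <property> becomes one entry in the java.util.Properties object. -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://host:8020</value>
    </property>
  </hadoopConfiguration>
</configuration>
```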
hcatalog:
If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem.
- Type: boolean
- Required: No
- User Property: kite.hcatalog
partitionExpression:
The partition expression, in JEXL format (experimental).
- Type: java.lang.String
- Required: No
- User Property: kite.partitionExpression
repositoryUri:
The URI specifying the dataset repository, e.g. repo:hdfs://host:8020/data. Optional, but if specified then kite.rootDirectory and kite.hcatalog are ignored.
- Type: java.lang.String
- Required: No
- User Property: kite.repositoryUri
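
To illustrate how a repository URI takes the place of the root directory and HCatalog settings, here is a hedged configuration sketch. The URI is the example given above. The dataset name `events` and the class `com.example.Event` are hypothetical placeholders.

```xml
<!-- Fragment of the plugin's <configuration> element. -->
<configuration>
  <datasetName>events</datasetName>
  <!-- Schema generated by Avro reflection; the class must be on the classpath (hypothetical class). -->
  <avroSchemaReflectClass>com.example.Event</avroSchemaReflectClass>
  <!-- When set, rootDirectory and hcatalog are ignored. -->
  <repositoryUri>repo:hdfs://host:8020/data</repositoryUri>
  <!-- Optional file format: avro or parquet. -->
  <format>parquet</format>
</configuration>
```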
rootDirectory:
The root directory of the dataset repository. Optional if using HCatalog for metadata storage.
- Type: java.lang.String
- Required: No
- User Property: kite.rootDirectory