@Immutable public class DatasetDescriptor extends Object
The structural definition of a Dataset. Each Dataset has an associated Schema and optional PartitionStrategy defined at the time of creation. You use instances of this class to hold this information. You are strongly encouraged to use the inner DatasetDescriptor.Builder to create new instances.
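The builder recommendation can be illustrated with a minimal, self-contained sketch of the pattern: a fluent builder collects optional settings and produces an immutable object. Everything here is a hypothetical stand-in (SketchDescriptor, schemaLiteral, partitionField, property); these are not the Kite classes, and the method names of the real DatasetDescriptor.Builder are not listed on this page.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT the Kite DatasetDescriptor.
final class SketchDescriptor {
    private final String schemaLiteral;           // stands in for a Schema
    private final String partitionField;          // stands in for a PartitionStrategy; null if unpartitioned
    private final Map<String, String> properties; // unmodifiable defensive copy

    // Private: instances are created only through the Builder.
    private SketchDescriptor(String schemaLiteral, String partitionField,
                             Map<String, String> properties) {
        this.schemaLiteral = schemaLiteral;
        this.partitionField = partitionField;
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }

    String getSchemaLiteral() { return schemaLiteral; }
    boolean isPartitioned() { return partitionField != null; }
    String getProperty(String name) { return properties.get(name); }

    // Fluent builder: each setter returns this so that calls can be chained.
    static final class Builder {
        private String schemaLiteral;
        private String partitionField;
        private final Map<String, String> properties = new LinkedHashMap<>();

        Builder schemaLiteral(String literal) { this.schemaLiteral = literal; return this; }
        Builder partitionField(String field) { this.partitionField = field; return this; }
        Builder property(String name, String value) { properties.put(name, value); return this; }

        // The schema is the only required setting in this sketch.
        SketchDescriptor build() {
            if (schemaLiteral == null) {
                throw new IllegalStateException("a schema is required");
            }
            return new SketchDescriptor(schemaLiteral, partitionField, properties);
        }
    }
}
```

With a builder, optional settings are simply omitted instead of being passed as null in a long positional argument list, which is the main readability gain over calling a constructor directly.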
Modifier and Type | Class and Description |
---|---|
static class | DatasetDescriptor.Builder: A fluent builder to aid in the construction of DatasetDescriptors. |
Constructor and Description |
---|
DatasetDescriptor(Schema schema, URI schemaUri, Format format, URI location, Map<String,String> properties, PartitionStrategy partitionStrategy, ColumnMapping columnMapping, CompressionType compressionType): Create an instance of this class with the supplied Schema, optional schema URI, Format, optional location URI, optional PartitionStrategy, optional ColumnMapping, and optional CompressionType. |
DatasetDescriptor(Schema schema, URL schemaUrl, Format format, URI location, Map<String,String> properties, PartitionStrategy partitionStrategy): Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, and optional PartitionStrategy. |
DatasetDescriptor(Schema schema, URL schemaUrl, Format format, URI location, Map<String,String> properties, PartitionStrategy partitionStrategy, ColumnMapping columnMapping): Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, optional PartitionStrategy, and optional ColumnMapping. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
ColumnMapping | getColumnMapping(): Get the ColumnMapping. |
CompressionType | getCompressionType(): Get the CompressionType. |
Format | getFormat(): Get the associated Format the data is stored in. |
URI | getLocation(): Get the location URI where the data for this Dataset is stored (optional). |
PartitionStrategy | getPartitionStrategy(): Get the PartitionStrategy, if this dataset is partitioned. |
String | getProperty(String name): Get a named property. |
Schema | getSchema(): Get the associated Schema. |
URL | getSchemaUrl(): Get a URL from which the Schema can be retrieved (optional). |
int | hashCode() |
boolean | hasProperty(String name): Check if a named property exists. |
boolean | isColumnMapped(): Returns true if an associated dataset is column mapped (that is, has an associated ColumnMapping), false otherwise. |
boolean | isPartitioned(): Returns true if an associated dataset is partitioned (that is, has an associated PartitionStrategy), false otherwise. |
Collection<String> | listProperties(): List the names of all custom properties set. |
String | toString() |
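The accessors summarized above can be combined defensively: check isPartitioned() before asking for the partition strategy, treat the schema URL as possibly null, and iterate listProperties() to read custom properties. The sketch below assumes a hypothetical stand-in interface, DescriptorView, that borrows a few method names from the table; it is not the Kite class, and String replaces the real PartitionStrategy and URL types to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in that mirrors a few accessor names from the table above.
// It is NOT the Kite class; String replaces the real return types.
interface DescriptorView {
    boolean isPartitioned();
    String getPartitionStrategy();        // only valid when isPartitioned() is true
    String getSchemaUrl();                // may be null
    Collection<String> listProperties();
    String getProperty(String name);
}

final class DescriptorReport {
    // Builds a human-readable summary without calling any accessor
    // that is invalid for the descriptor's current state.
    static List<String> describe(DescriptorView d) {
        List<String> lines = new ArrayList<>();

        // Guard: only ask for the strategy when the dataset is partitioned.
        if (d.isPartitioned()) {
            lines.add("partitioned by " + d.getPartitionStrategy());
        } else {
            lines.add("not partitioned");
        }

        // The schema URL is optional, so handle null explicitly.
        String url = d.getSchemaUrl();
        lines.add(url == null ? "schema URL: none" : "schema URL: " + url);

        // Custom properties are looked up by name.
        for (String name : d.listProperties()) {
            lines.add(name + "=" + d.getProperty(name));
        }
        return lines;
    }
}
```

The same guard applies by analogy to isColumnMapped() and getColumnMapping(), although this page does not state what getColumnMapping() does when no mapping is present.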
public DatasetDescriptor(Schema schema, @Nullable URL schemaUrl, Format format, @Nullable URI location, @Nullable Map<String,String> properties, @Nullable PartitionStrategy partitionStrategy)
Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, and optional PartitionStrategy.

public DatasetDescriptor(Schema schema, @Nullable URL schemaUrl, Format format, @Nullable URI location, @Nullable Map<String,String> properties, @Nullable PartitionStrategy partitionStrategy, @Nullable ColumnMapping columnMapping)
Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, optional PartitionStrategy, and optional ColumnMapping.

public DatasetDescriptor(Schema schema, @Nullable URI schemaUri, Format format, @Nullable URI location, @Nullable Map<String,String> properties, @Nullable PartitionStrategy partitionStrategy, @Nullable ColumnMapping columnMapping, @Nullable CompressionType compressionType)
Create an instance of this class with the supplied Schema, optional schema URI, Format, optional location URI, optional PartitionStrategy, optional ColumnMapping, and optional CompressionType.

public Schema getSchema()
Get the associated Schema. Depending on the underlying storage system, this schema can be simple (that is, records made up of only scalar types) or complex (that is, containing other records, lists, and so on). Validation of the supported schemas is performed by the managing repository, not the dataset or descriptor itself.

@Nullable public URL getSchemaUrl()
Get a URL from which the Schema can be retrieved (optional). This method might return null if the schema is not stored at a persistent URL (for example, if it was constructed from a literal string).

public Format getFormat()
Get the associated Format the data is stored in.

@Nullable public URI getLocation()
Get the location URI where the data for this Dataset is stored (optional).

@Nullable public String getProperty(String name)
Get a named property.
name - the String property name to get.

public boolean hasProperty(String name)
Check if a named property exists.
name - the String property name.

public Collection<String> listProperties()
List the names of all custom properties set.

public PartitionStrategy getPartitionStrategy()
Get the PartitionStrategy, if this dataset is partitioned. Calling this method on a non-partitioned dataset is an error. Instead, check isPartitioned() before invoking it.

public ColumnMapping getColumnMapping()
Get the ColumnMapping.

public CompressionType getCompressionType()
Get the CompressionType.

public boolean isPartitioned()
Returns true if an associated dataset is partitioned (that is, has an associated PartitionStrategy), false otherwise.

public boolean isColumnMapped()
Returns true if an associated dataset is column mapped (that is, has an associated ColumnMapping), false otherwise.

Copyright © 2013–2014. All rights reserved.