org.kitesdk.data
Class DatasetDescriptor

java.lang.Object
  extended by org.kitesdk.data.DatasetDescriptor

@Immutable
public class DatasetDescriptor
extends Object

The structural definition of a Dataset.

Each Dataset has an associated Schema and an optional PartitionStrategy, both defined at the time of creation. Instances of this class hold this information. Users are strongly encouraged to create new instances with the nested DatasetDescriptor.Builder rather than by calling the constructor directly.


Nested Class Summary
static class DatasetDescriptor.Builder
          A fluent builder to aid in the construction of DatasetDescriptors.
 
Constructor Summary
DatasetDescriptor(Schema schema, URL schemaUrl, Format format, URI location, Map<String,String> properties, PartitionStrategy partitionStrategy)
          Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, optional properties, and optional PartitionStrategy.
 
Method Summary
 boolean equals(Object obj)
           
 Format getFormat()
          Get the associated Format that the data is stored in.
 URI getLocation()
          Get the URI of the location where the data for this Dataset is stored (optional).
 PartitionStrategy getPartitionStrategy()
          Get the PartitionStrategy, if this dataset is partitioned.
 String getProperty(String name)
          Get a named property.
 Schema getSchema()
          Get the associated Schema.
 URL getSchemaUrl()
          Get a URL from which the Schema may be retrieved (optional).
 int hashCode()
           
 boolean hasProperty(String name)
          Check if a named property exists.
 boolean isPartitioned()
          Returns true if the associated dataset is partitioned (that is, has an associated PartitionStrategy), false otherwise.
 Collection<String> listProperties()
          List the names of all custom properties set.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DatasetDescriptor

public DatasetDescriptor(Schema schema,
                         @Nullable
                         URL schemaUrl,
                         Format format,
                         @Nullable
                         URI location,
                         @Nullable
                         Map<String,String> properties,
                         @Nullable
                         PartitionStrategy partitionStrategy)
Create an instance of this class with the supplied Schema, optional schema URL, Format, optional location URI, optional properties, and optional PartitionStrategy. Parameters marked @Nullable may be passed as null.

Method Detail

getSchema

public Schema getSchema()
Get the associated Schema. Depending on the underlying storage system, this schema may be simple (i.e. records made up of only scalar types) or complex (i.e. containing other records, lists, and so on). Validation of the supported schemas is performed by the managing repository, not the dataset or descriptor itself.

Returns:
the schema

getSchemaUrl

@Nullable
public URL getSchemaUrl()
Get a URL from which the Schema may be retrieved (optional). This method may return null if the schema is not stored at a persistent URL, e.g. if it was constructed from a literal string.

Returns:
a URL from which the schema may be retrieved
Since:
0.3.0

getFormat

public Format getFormat()
Get the associated Format that the data is stored in.

Returns:
the format
Since:
0.2.0

getLocation

@Nullable
public URI getLocation()
Get the URI of the location where the data for this Dataset is stored (optional).

Returns:
a location URI, or null if one is not set
Since:
0.8.0
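
Because getLocation() is annotated @Nullable, callers need to handle the case where no location is set. The following is a minimal sketch of one way to do that with java.util.Optional. It does not use the Kite classes: HasLocation is a hypothetical stand-in that mirrors only the documented getLocation() signature, and the example URI and fallback text are made up for illustration.

```java
import java.net.URI;
import java.util.Optional;

public class LocationExample {

    // Hypothetical stand-in: mirrors only the documented getLocation() signature.
    interface HasLocation {
        URI getLocation(); // may return null, as documented
    }

    // Wraps the nullable result so the "not set" case is explicit for callers.
    static Optional<URI> locationOf(HasLocation descriptor) {
        return Optional.ofNullable(descriptor.getLocation());
    }

    // Returns the location as a string, or the supplied fallback when none is set.
    static String describeLocation(HasLocation descriptor, String fallback) {
        return locationOf(descriptor).map(URI::toString).orElse(fallback);
    }

    public static void main(String[] args) {
        // Illustrative values only; not taken from any real repository.
        HasLocation withLocation = () -> URI.create("hdfs://namenode/data/events");
        HasLocation withoutLocation = () -> null;

        System.out.println(describeLocation(withLocation, "(no location set)"));
        System.out.println(describeLocation(withoutLocation, "(no location set)"));
    }
}
```

The same wrapping applies to the other @Nullable getters, such as getSchemaUrl() and getProperty(String).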

getProperty

@Nullable
public String getProperty(String name)
Get a named property.

Parameters:
name - the String property name to get.
Returns:
the String value of the property, or null if it does not exist.
Since:
0.8.0

hasProperty

public boolean hasProperty(String name)
Check if a named property exists.

Parameters:
name - the String property name.
Returns:
true if the property exists, false otherwise.
Since:
0.8.0

listProperties

public Collection<String> listProperties()
List the names of all custom properties set.

Returns:
a Collection of String property names.
Since:
0.8.0
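
The three property methods (getProperty, hasProperty, listProperties) are typically used together. The sketch below shows two small helpers built on them: a lookup with a default value, and a copy of all custom properties into a sorted map. It is an illustration under stated assumptions, not Kite code: HasProperties is a hypothetical stand-in that mirrors only the documented signatures, fromMap exists only so the sketch can run on its own, and the property names are invented.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PropertiesExample {

    // Hypothetical stand-in: mirrors the three documented property methods.
    interface HasProperties {
        String getProperty(String name);      // null if the property does not exist
        boolean hasProperty(String name);
        Collection<String> listProperties();
    }

    // Map-backed stand-in, used only so this sketch is runnable without Kite.
    static HasProperties fromMap(Map<String, String> source) {
        final Map<String, String> copy =
            Collections.unmodifiableMap(new LinkedHashMap<>(source));
        return new HasProperties() {
            @Override public String getProperty(String name) { return copy.get(name); }
            @Override public boolean hasProperty(String name) { return copy.containsKey(name); }
            @Override public Collection<String> listProperties() { return copy.keySet(); }
        };
    }

    // Reads a property, falling back to a default when it is not set.
    static String propertyOrDefault(HasProperties d, String name, String defaultValue) {
        return d.hasProperty(name) ? d.getProperty(name) : defaultValue;
    }

    // Copies every custom property into a sorted map, for example for logging.
    static Map<String, String> snapshot(HasProperties d) {
        Map<String, String> out = new TreeMap<>();
        for (String name : d.listProperties()) {
            out.put(name, d.getProperty(name));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("owner", "etl");            // invented property names
        props.put("retention.days", "30");
        HasProperties descriptor = fromMap(props);

        System.out.println(propertyOrDefault(descriptor, "owner", "unknown"));
        System.out.println(propertyOrDefault(descriptor, "team", "unknown"));
        System.out.println(snapshot(descriptor));
    }
}
```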

getPartitionStrategy

public PartitionStrategy getPartitionStrategy()
Get the PartitionStrategy, if this dataset is partitioned. Calling this method on a non-partitioned dataset is an error; call isPartitioned() first to check whether a strategy is present.
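
The check-before-call pattern for getPartitionStrategy() can be sketched as follows. This is an illustration, not Kite code: Partitionable is a hypothetical stand-in that mirrors only the two documented methods, the type parameter S stands in for PartitionStrategy, and the IllegalStateException is an assumption, since this page does not say which error the real method raises.

```java
import java.util.Optional;

public class PartitionGuardExample {

    // Hypothetical stand-in: mirrors isPartitioned() and getPartitionStrategy().
    // S stands in for PartitionStrategy so the sketch needs no Kite classes.
    interface Partitionable<S> {
        boolean isPartitioned();
        S getPartitionStrategy(); // documented as an error when not partitioned
    }

    // Builds a stand-in descriptor; a null strategy means "not partitioned".
    static <S> Partitionable<S> of(final S strategy) {
        return new Partitionable<S>() {
            @Override public boolean isPartitioned() { return strategy != null; }
            @Override public S getPartitionStrategy() {
                if (strategy == null) {
                    // Assumption: the exception type is chosen for this sketch only.
                    throw new IllegalStateException("dataset is not partitioned");
                }
                return strategy;
            }
        };
    }

    // Guards the call with isPartitioned() so the error path is never reached.
    static <S> Optional<S> strategyOf(Partitionable<S> descriptor) {
        return descriptor.isPartitioned()
            ? Optional.of(descriptor.getPartitionStrategy())
            : Optional.empty();
    }

    public static void main(String[] args) {
        Partitionable<String> partitioned = of("year/month"); // placeholder strategy
        Partitionable<String> flat = of(null);

        System.out.println(strategyOf(partitioned).orElse("(not partitioned)"));
        System.out.println(strategyOf(flat).orElse("(not partitioned)"));
    }
}
```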


isPartitioned

public boolean isPartitioned()
Returns true if the associated dataset is partitioned (that is, has an associated PartitionStrategy), false otherwise.


hashCode

public int hashCode()
Overrides:
hashCode in class Object

equals

public boolean equals(Object obj)
Overrides:
equals in class Object

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2013–2014. All rights reserved.