org.kitesdk.data
Interface Dataset<E>

Type Parameters:
E - The type of entities stored in this Dataset.
All Superinterfaces:
RefinableView<E>, View<E>
All Known Subinterfaces:
RandomAccessDataset<E>

@Immutable
public interface Dataset<E>
extends RefinableView<E>

A logical representation of a set of data entities.

Every dataset has two generic properties: a name, and a descriptor that holds information such as the dataset's schema and its partitioning information. Concrete implementations of Dataset can support additional properties, mandatory or optional, as needed. Datasets are not normally instantiated directly; they are managed by a repository, which is also implementation-specific.

Implementations of Dataset are immutable.
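
As a rough illustration of the two generic properties and the immutability contract described above, the following sketch uses stand-in classes that are not part of Kite. DatasetSketch, SketchDescriptor, and SketchDataset are hypothetical names, and a real DatasetDescriptor holds more than a schema string and a list of partition field names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DatasetSketch {

    // Hypothetical stand-in for a descriptor: schema text plus partition field names.
    public static final class SketchDescriptor {
        private final String schema;
        private final List<String> partitionFields;

        public SketchDescriptor(String schema, List<String> partitionFields) {
            this.schema = schema;
            // Defensive copy: later changes to the caller's list are not visible here.
            this.partitionFields =
                Collections.unmodifiableList(new ArrayList<String>(partitionFields));
        }

        public String getSchema() { return schema; }
        public List<String> getPartitionFields() { return partitionFields; }
    }

    // Hypothetical stand-in for a dataset: both properties are fixed at construction.
    public static final class SketchDataset {
        private final String name;
        private final SketchDescriptor descriptor;

        public SketchDataset(String name, SketchDescriptor descriptor) {
            this.name = name;
            this.descriptor = descriptor;
        }

        public String getName() { return name; }
        public SketchDescriptor getDescriptor() { return descriptor; }
    }

    public static void main(String[] args) {
        List<String> fields = new ArrayList<String>();
        fields.add("year");
        SketchDataset events =
            new SketchDataset("events", new SketchDescriptor("string", fields));
        fields.add("month"); // does not affect the dataset that was already built
        System.out.println(events.getName() + " "
            + events.getDescriptor().getPartitionFields());
    }
}
```

Because there are no setters and the list is copied and wrapped, an instance cannot change after construction, which is the property the immutability requirement asks of real implementations.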

See Also:
View, DatasetRepository, DatasetWriter, DatasetReader, PartitionStrategy, DatasetDescriptor, Schema

Method Summary
 void dropPartition(org.kitesdk.data.spi.PartitionKey key)
          Deprecated. Will be removed in 0.16.0; use View.deleteAll() on an appropriate view instead.
 DatasetDescriptor getDescriptor()
          Get the DatasetDescriptor associated with this dataset.
 String getName()
          Get the name of a Dataset.
 Dataset<E> getPartition(org.kitesdk.data.spi.PartitionKey key, boolean autoCreate)
          Deprecated. Will be removed in 0.16.0; use RefinableView methods instead.
 Iterable<? extends Dataset<E>> getPartitions()
          Deprecated. Will be removed in 0.16.0; use RefinableView methods instead.
 URI getUri()
          Return a URI for this Dataset.
 
Methods inherited from interface org.kitesdk.data.RefinableView
from, fromAfter, to, toBefore, with
 
Methods inherited from interface org.kitesdk.data.View
deleteAll, getDataset, getType, includes, isEmpty, newReader, newWriter
 

Method Detail

getName

String getName()
Get the name of a Dataset. No guarantees are made about the format of this name.


getDescriptor

DatasetDescriptor getDescriptor()
Get the DatasetDescriptor associated with this dataset.


getPartition

@Deprecated
Dataset<E> getPartition(org.kitesdk.data.spi.PartitionKey key,
                                   boolean autoCreate)
Deprecated. Will be removed in 0.16.0; use RefinableView methods instead.

Get a partition for a PartitionKey, optionally creating the partition if it doesn't already exist. You can obtain the PartitionKey using PartitionStrategy.partitionKey(Object...) or PartitionStrategy.partitionKeyForEntity(Object).

Parameters:
key - The key used to look up the partition.
autoCreate - If true, automatically creates the partition if it doesn't exist.
Throws:
DatasetException

dropPartition

@Deprecated
void dropPartition(org.kitesdk.data.spi.PartitionKey key)
Deprecated. Will be removed in 0.16.0; use View.deleteAll() on an appropriate view instead.

Drop the partition for a PartitionKey. Dropping a partition that doesn't exist throws a DatasetException.

Parameters:
key - The key used to look up the partition.
Throws:
DatasetException
Since:
0.2.0
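
The deprecation note suggests narrowing to a view and deleting through it. The sketch below models that idea with a hypothetical in-memory class, not the Kite API. InMemoryView and its single-value with(field, value) are simplifications invented for illustration, and the boolean result of deleteAll() here is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DeleteViewSketch {

    // Hypothetical in-memory view; each entity is a map of field name to value.
    public static final class InMemoryView {
        private final List<Map<String, Object>> store;   // shared backing data
        private final Map<String, Object> constraints;   // field -> required value

        public InMemoryView(List<Map<String, Object>> store) {
            this(store, new HashMap<String, Object>());
        }

        private InMemoryView(List<Map<String, Object>> store,
                             Map<String, Object> constraints) {
            this.store = store;
            this.constraints = constraints;
        }

        // Returns a new, narrower view; this view is left unchanged.
        public InMemoryView with(String field, Object value) {
            Map<String, Object> narrowed = new HashMap<String, Object>(constraints);
            narrowed.put(field, value);
            return new InMemoryView(store, narrowed);
        }

        // True if the entity satisfies every constraint of this view.
        public boolean includes(Map<String, Object> entity) {
            for (Map.Entry<String, Object> c : constraints.entrySet()) {
                if (!c.getValue().equals(entity.get(c.getKey()))) {
                    return false;
                }
            }
            return true;
        }

        // Removes every entity this view includes; reports whether any was removed.
        public boolean deleteAll() {
            boolean removed = false;
            for (Iterator<Map<String, Object>> it = store.iterator(); it.hasNext();) {
                if (includes(it.next())) {
                    it.remove();
                    removed = true;
                }
            }
            return removed;
        }
    }

    // Helper for building a one-field entity in examples.
    public static Map<String, Object> entity(String field, Object value) {
        Map<String, Object> e = new HashMap<String, Object>();
        e.put(field, value);
        return e;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> store = new ArrayList<Map<String, Object>>();
        store.add(entity("year", 2013));
        store.add(entity("year", 2014));
        InMemoryView all = new InMemoryView(store);
        // Equivalent in spirit to dropping the "year=2014" partition.
        all.with("year", 2014).deleteAll();
        System.out.println(store.size());
    }
}
```

Unlike the deprecated method, this style does not need a PartitionKey: the caller describes the data to remove in terms of entity fields.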

getPartitions

@Deprecated
Iterable<? extends Dataset<E>> getPartitions()
Deprecated. Will be removed in 0.16.0; use RefinableView methods instead.

Return partitions, if this dataset is partitioned.

Note that, depending on the implementation, the returned iterable can hold system resources until exhausted and/or finalized.

Returns:
an iterable over all partitions of this dataset
Throws:
DatasetException
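
The note about held resources implies that callers should iterate to the end rather than abandon the iterable partway. The following sketch shows the pattern with a hypothetical TrackedIterable that counts unfinished iterators. It is a stand-in for illustration only and does not reflect how any Kite implementation manages resources.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class DrainSketch {

    // Hypothetical iterable that "holds a resource" per iterator until exhausted.
    public static final class TrackedIterable<T> implements Iterable<T> {
        private final List<T> items;
        private int openIterators = 0;

        public TrackedIterable(List<T> items) {
            this.items = new ArrayList<T>(items);
        }

        @Override
        public Iterator<T> iterator() {
            openIterators++; // resource acquired
            final Iterator<T> delegate = items.iterator();
            return new Iterator<T>() {
                private boolean released = false;

                @Override
                public boolean hasNext() {
                    boolean more = delegate.hasNext();
                    if (!more && !released) {
                        released = true;
                        openIterators--; // resource released on exhaustion
                    }
                    return more;
                }

                @Override
                public T next() {
                    return delegate.next();
                }

                @Override
                public void remove() {
                    throw new UnsupportedOperationException();
                }
            };
        }

        public int openIterators() {
            return openIterators;
        }
    }

    // Copies every element into a list, which exhausts the underlying iterator.
    public static <T> List<T> drain(Iterable<T> source) {
        List<T> out = new ArrayList<T>();
        for (T item : source) {
            out.add(item);
        }
        return out;
    }
}
```

Copying into a list trades memory for a bounded resource lifetime; for large partition counts, iterating to completion in place has the same releasing effect in this sketch.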

getUri

URI getUri()
Return a URI for this Dataset. The returned URI should load a copy of this dataset when passed to Datasets.load(java.net.URI, java.lang.Class).

Returns:
a URI that identifies this dataset
Since:
0.15.0
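
The round-trip expectation stated for getUri() can be pictured with a small stand-in. In the sketch below, Registry and NamedDataset are hypothetical classes, Registry.load only imitates the role of Datasets.load, and the URI string is an invented example rather than a format defined by this page.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class UriRoundTripSketch {

    // Hypothetical immutable dataset handle: a name and the URI that identifies it.
    public static final class NamedDataset {
        private final String name;
        private final URI uri;

        public NamedDataset(String name, URI uri) {
            this.name = name;
            this.uri = uri;
        }

        public String getName() { return name; }
        public URI getUri() { return uri; }
    }

    // Hypothetical registry; load(URI) plays the role of a loader keyed by URI.
    public static final class Registry {
        private final Map<URI, NamedDataset> byUri = new HashMap<URI, NamedDataset>();

        public NamedDataset create(String name, URI uri) {
            NamedDataset created = new NamedDataset(name, uri);
            byUri.put(uri, created);
            return created;
        }

        public NamedDataset load(URI uri) {
            NamedDataset found = byUri.get(uri);
            if (found == null) {
                throw new IllegalArgumentException("No dataset at " + uri);
            }
            // Return a fresh copy rather than the stored instance.
            return new NamedDataset(found.getName(), found.getUri());
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // The URI below is an invented example value.
        NamedDataset events =
            registry.create("events", URI.create("dataset:hdfs:/data/events"));
        NamedDataset copy = registry.load(events.getUri());
        System.out.println(copy.getName().equals(events.getName()));
    }
}
```

The point of the contract is that the URI alone is enough to get back an equivalent dataset, so it can be stored or passed between processes in place of the object.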


Copyright © 2013–2014. All rights reserved.