org.kitesdk.data
Interface Dataset<E>

Type Parameters:
E - The type of entities stored in this Dataset.
All Superinterfaces:
RefinableView<E>, View<E>
All Known Subinterfaces:
RandomAccessDataset<E>

@Immutable
public interface Dataset<E>
extends RefinableView<E>

A logical representation of a set of data entities.

Every dataset has two generic properties: a name, and a descriptor that holds information such as the dataset's schema and its partitioning. Concrete implementations of Dataset can support additional properties, mandatory or optional, as needed. Datasets are normally not instantiated directly; they are managed by a repository, which is also implementation-specific.

Implementations of Dataset are immutable.
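
As a rough illustration of the two generic properties and of the immutability requirement, here is a minimal sketch. ImmutableDatasetSketch and its withDescriptor helper are hypothetical and not part of the Kite API, and a plain String stands in for DatasetDescriptor.

```java
/** Hypothetical stand-in for an immutable dataset; not a Kite class. */
final class ImmutableDatasetSketch {
  private final String name;
  // A plain string stands in for DatasetDescriptor (schema, partitioning).
  private final String descriptor;

  ImmutableDatasetSketch(String name, String descriptor) {
    this.name = name;
    this.descriptor = descriptor;
  }

  String getName() {
    return name;
  }

  String getDescriptor() {
    return descriptor;
  }

  /** Hypothetical helper: a "change" yields a new instance and leaves this one untouched. */
  ImmutableDatasetSketch withDescriptor(String newDescriptor) {
    return new ImmutableDatasetSketch(name, newDescriptor);
  }
}
```

Because every field is final and no setter exists, an instance can be shared freely between callers; anything resembling an update has to produce a new object.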

See Also:
View, DatasetRepository, DatasetWriter, DatasetReader, PartitionStrategy, DatasetDescriptor, Schema

Method Summary
 void dropPartition(PartitionKey key)
          Drop a partition for a PartitionKey.
 DatasetDescriptor getDescriptor()
          Get the DatasetDescriptor associated with this dataset.
 String getName()
          Get the name of a Dataset.
 Dataset<E> getPartition(PartitionKey key, boolean autoCreate)
          Get a partition for a PartitionKey, optionally creating the partition if it doesn't already exist.
 Iterable<Dataset<E>> getPartitions()
           Return partitions, if this dataset is partitioned.
 
Methods inherited from interface org.kitesdk.data.RefinableView
from, fromAfter, to, toBefore, with
 
Methods inherited from interface org.kitesdk.data.View
deleteAll, getDataset, includes, newReader, newWriter
 

Method Detail

getName

String getName()
Get the name of a Dataset. No guarantees are made about the format of this name.


getDescriptor

DatasetDescriptor getDescriptor()
Get the DatasetDescriptor associated with this dataset.


getPartition

Dataset<E> getPartition(PartitionKey key,
                        boolean autoCreate)
Get a partition for a PartitionKey, optionally creating the partition if it doesn't already exist. You can obtain the PartitionKey using PartitionStrategy.partitionKey(Object...) or PartitionStrategy.partitionKeyForEntity(Object).

Parameters:
key - The key used to look up the partition.
autoCreate - If true, automatically creates the partition if it doesn't exist.
Throws:
DatasetException
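
The effect of the autoCreate flag can be pictured with a small in-memory sketch. PartitionLookupSketch is hypothetical: string keys stand in for PartitionKey and lists stand in for Dataset<E>. This page does not say what happens when the partition is missing and autoCreate is false; the sketch assumes null is returned in that case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in; not part of the Kite API. */
final class PartitionLookupSketch {
  // Partitions keyed by a string that stands in for a PartitionKey.
  private final Map<String, List<String>> partitions = new HashMap<>();

  /**
   * Mirrors the shape of getPartition(key, autoCreate).
   * Assumption: a missing partition yields null when autoCreate is false.
   */
  List<String> getPartition(String key, boolean autoCreate) {
    List<String> partition = partitions.get(key);
    if (partition == null && autoCreate) {
      // Create the partition on demand and remember it for later lookups.
      partition = new ArrayList<>();
      partitions.put(key, partition);
    }
    return partition;
  }

  int partitionCount() {
    return partitions.size();
  }
}
```

A second lookup with the same key returns the partition created by the first call, so autoCreate only has an effect the first time a key is seen.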

dropPartition

void dropPartition(PartitionKey key)
Drop a partition for a PartitionKey. Dropping a partition that doesn't exist throws a DatasetException.

Parameters:
key - The key used to look up the partition.
Throws:
DatasetException - if the partition does not exist
Since:
0.2.0
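
The drop contract described above, where a missing partition is an error rather than a no-op, can be sketched as follows. PartitionDropSketch and SketchDatasetException are hypothetical stand-ins, string keys replace PartitionKey, and the exception is assumed to be unchecked.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in; not part of the Kite API. */
final class PartitionDropSketch {
  /** Stand-in for DatasetException; assumed here to be unchecked. */
  static final class SketchDatasetException extends RuntimeException {
    SketchDatasetException(String message) {
      super(message);
    }
  }

  // Keys of the partitions that currently exist.
  private final Set<String> partitions = new HashSet<>();

  void addPartition(String key) {
    partitions.add(key);
  }

  boolean hasPartition(String key) {
    return partitions.contains(key);
  }

  /** Removes the partition, or throws if there was nothing to remove. */
  void dropPartition(String key) {
    if (!partitions.remove(key)) {
      throw new SketchDatasetException("No such partition: " + key);
    }
  }
}
```

Callers that cannot be sure a partition exists would need to either check first or be prepared to catch the exception.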

getPartitions

Iterable<Dataset<E>> getPartitions()

Return partitions, if this dataset is partitioned.

Note that, depending on the implementation, the returned iterable can hold system resources until exhausted and/or finalized.

Returns:
an iterable over all partitions of this dataset
Throws:
DatasetException
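
Given the note about held resources, one cautious pattern is to iterate the result to the end and keep a copy rather than abandon the iterable halfway. The helper below is a generic sketch with a hypothetical name; whether exhausting the iterable actually releases anything depends on the implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the Kite API. */
final class PartitionIterationSketch {
  private PartitionIterationSketch() {
  }

  /** Copies every element into a list, exhausting the iterable in the process. */
  static <T> List<T> drain(Iterable<T> partitions) {
    List<T> copy = new ArrayList<>();
    for (T partition : partitions) {
      copy.add(partition);
    }
    return copy;
  }
}
```

With a real dataset, the call would look like drain(dataset.getPartitions()), at the cost of holding all partition handles in memory at once.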


Copyright © 2013–2014. All rights reserved.