org.kitesdk.data
Interface Dataset<E>

Type Parameters:
E - The type of entities stored in this Dataset.
All Superinterfaces:
RefineableView<E>, View<E>
All Known Subinterfaces:
RandomAccessDataset<E>

@Immutable
public interface Dataset<E>
extends RefineableView<E>

A logical representation of a set of data entities.

Logically, every dataset has two generic properties: a name, and a descriptor that holds information such as the dataset's schema and its partitioning information. Concrete implementations of Dataset may support additional properties, mandatory or optional, as needed. Datasets are not normally instantiated directly; they are managed by a repository, which is also implementation-specific.

Implementations of Dataset are immutable.

See Also:
View, DatasetRepository, DatasetWriter, DatasetReader, PartitionStrategy, DatasetDescriptor, Schema

Method Summary
 void dropPartition(PartitionKey key)
          Drop a partition for a PartitionKey.
 DatasetDescriptor getDescriptor()
          Get the DatasetDescriptor associated with this dataset.
 String getName()
          Get the name of a Dataset.
 Dataset<E> getPartition(PartitionKey key, boolean autoCreate)
          Get a partition for a PartitionKey, possibly creating the partition if it doesn't already exist.
 Iterable<Dataset<E>> getPartitions()
          Return partitions, if this dataset is partitioned.
 
Methods inherited from interface org.kitesdk.data.RefineableView
from, fromAfter, to, toBefore, with
 
Methods inherited from interface org.kitesdk.data.View
getDataset, includes, newReader, newWriter
 

Method Detail

getName

String getName()
Get the name of a Dataset. No guarantees are made about the format of this name.


getDescriptor

DatasetDescriptor getDescriptor()
Get the DatasetDescriptor associated with this dataset.


getPartition

Dataset<E> getPartition(PartitionKey key,
                        boolean autoCreate)
Get a partition for a PartitionKey, possibly creating the partition if it doesn't already exist. A PartitionKey may be obtained using PartitionStrategy.partitionKey(Object...) or PartitionStrategy.partitionKeyForEntity(Object).

Parameters:
key - The key used to look up the partition.
autoCreate - If true, automatically create the partition if it doesn't exist.
Throws:
DatasetException
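
The lookup-or-create behavior described for getPartition can be illustrated with a small in-memory model. This is only a sketch: SketchDataset, SketchDatasetException, and the static partitionKey helper are hypothetical stand-ins written for this example, not the Kite classes. The sketch also assumes that a missing partition yields null when autoCreate is false, which the description above does not specify.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in; these are NOT the Kite SDK classes.
final class GetPartitionSketch {

    /** Stand-in for DatasetException (modeled as unchecked in this sketch). */
    static final class SketchDatasetException extends RuntimeException {
        SketchDatasetException(String message) {
            super(message);
        }
    }

    /** Stand-in for a partitioned dataset that tracks partitions by key. */
    static final class SketchDataset {
        private final String name;
        private final Map<List<Object>, SketchDataset> partitions = new HashMap<>();

        SketchDataset(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        /** Stand-in for building a key from partition field values. */
        static List<Object> partitionKey(Object... values) {
            return Arrays.asList(values.clone());
        }

        SketchDataset getPartition(List<Object> key, boolean autoCreate) {
            if (key == null) {
                throw new SketchDatasetException("key must not be null");
            }
            SketchDataset existing = partitions.get(key);
            if (existing != null || !autoCreate) {
                // Assumption: a missing partition is reported as null.
                return existing;
            }
            SketchDataset created = new SketchDataset(name + "/" + key);
            partitions.put(key, created);
            return created;
        }
    }

    public static void main(String[] args) {
        SketchDataset events = new SketchDataset("events");
        List<Object> key = SketchDataset.partitionKey(2014, 1);

        // Lookup without autoCreate leaves the dataset unchanged.
        System.out.println("before create: " + events.getPartition(key, false));

        // Lookup with autoCreate creates the partition on first use.
        SketchDataset created = events.getPartition(key, true);
        System.out.println("created: " + created.getName());

        // A later lookup returns the same partition object.
        System.out.println("same: " + (events.getPartition(key, false) == created));
    }
}
```

Passing autoCreate as false suits read paths, where a missing partition should not be created as a side effect. Passing true suits write paths that populate partitions on demand.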

dropPartition

void dropPartition(PartitionKey key)
Drop a partition for a PartitionKey. Dropping a partition that doesn't exist results in a DatasetException being thrown.

Parameters:
key - The key used to look up the partition.
Throws:
DatasetException
Since:
0.2.0
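
The contract above, where dropping a missing partition raises an exception, can be modeled as follows. This is a sketch under stated assumptions: SketchDataset, SketchDatasetException, key, addPartition, and tryDrop are hypothetical names introduced for illustration and are not part of the Kite API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in; these are NOT the Kite SDK classes.
final class DropPartitionSketch {

    /** Stand-in for DatasetException (modeled as unchecked in this sketch). */
    static final class SketchDatasetException extends RuntimeException {
        SketchDatasetException(String message) {
            super(message);
        }
    }

    /** Stand-in for a dataset that only remembers which partition keys exist. */
    static final class SketchDataset {
        private final Set<List<Object>> partitionKeys = new HashSet<>();

        void addPartition(List<Object> key) {
            partitionKeys.add(key);
        }

        void dropPartition(List<Object> key) {
            // Set.remove returns false when the key was not present.
            if (!partitionKeys.remove(key)) {
                throw new SketchDatasetException("No such partition: " + key);
            }
        }
    }

    /** Builds a key from partition field values. */
    static List<Object> key(Object... values) {
        return Arrays.asList(values.clone());
    }

    /** Returns true if the partition was dropped, false if it did not exist. */
    static boolean tryDrop(SketchDataset dataset, List<Object> key) {
        try {
            dataset.dropPartition(key);
            return true;
        } catch (SketchDatasetException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SketchDataset events = new SketchDataset();
        events.addPartition(key(2014, 1));

        System.out.println("first drop succeeded: " + tryDrop(events, key(2014, 1)));
        System.out.println("second drop succeeded: " + tryDrop(events, key(2014, 1)));
    }
}
```

Because the drop is not idempotent, callers that may race or retry need to decide whether a failed drop is an error or can be ignored, as the tryDrop helper does here.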

getPartitions

Iterable<Dataset<E>> getPartitions()

Return partitions, if this dataset is partitioned.

Note that, depending on the implementation, the returned iterable may hold system resources until exhausted and/or finalized.

Returns:
an iterable over all partitions of this dataset
Throws:
DatasetException
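
The note about held resources suggests iterating the result to the end rather than abandoning it partway. The sketch below models that with a hypothetical TrackedPartitions iterable that counts iterators not yet exhausted. Neither it nor the drain helper is part of the Kite API, and a real implementation may manage its resources differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical model of an iterable that holds a resource until exhausted.
final class GetPartitionsSketch {

    /** Counts iterators that were started but have not yet been exhausted. */
    static final class TrackedPartitions implements Iterable<String> {
        private final List<String> names;
        private int openIterators = 0;

        TrackedPartitions(String... names) {
            this.names = Arrays.asList(names.clone());
        }

        int openIterators() {
            return openIterators;
        }

        @Override
        public Iterator<String> iterator() {
            openIterators++; // the "resource" is acquired when iteration starts
            return new Iterator<String>() {
                private int index = 0;
                private boolean released = false;

                @Override
                public boolean hasNext() {
                    boolean more = index < names.size();
                    if (!more && !released) {
                        released = true;
                        openIterators--; // released once the end is reached
                    }
                    return more;
                }

                @Override
                public String next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return names.get(index++);
                }
            };
        }
    }

    /** Copies every element into a list, which exhausts the iterator. */
    static <T> List<T> drain(Iterable<T> source) {
        List<T> copy = new ArrayList<>();
        for (T item : source) {
            copy.add(item);
        }
        return copy;
    }

    public static void main(String[] args) {
        TrackedPartitions partitions = new TrackedPartitions("year=2013", "year=2014");

        // Abandoning an iterator early leaves its resource held in this model.
        partitions.iterator().next();
        System.out.println("open after partial read: " + partitions.openIterators());

        // Draining a fresh iterator releases its own resource when it finishes.
        List<String> all = drain(partitions);
        System.out.println("drained " + all.size() + " partitions, open: "
                + partitions.openIterators());
    }
}
```

Copying the partitions into a list up front trades memory for a predictable release point, which may be reasonable when the number of partitions is small.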


Copyright © 2013–2014. All rights reserved.