org.kitesdk.data
Class PartitionStrategy

java.lang.Object
  extended by org.kitesdk.data.PartitionStrategy

@Immutable
public class PartitionStrategy
extends Object

The strategy used to determine how a dataset is partitioned.

A PartitionStrategy is configured with one or more FieldPartitioners when it is created. A Dataset that is configured with a partition strategy is said to be partitioned. Each entity written to a partitioned dataset is evaluated against the dataset's PartitionStrategy, which produces a PartitionKey that the dataset implementation uses to select the proper partition.

Users should use the inner PartitionStrategy.Builder to create new instances.

See Also:
FieldPartitioner, PartitionKey, DatasetDescriptor, Dataset

Nested Class Summary
static class PartitionStrategy.Builder
          A fluent builder to aid in the construction of PartitionStrategy instances.
 
Constructor Summary
PartitionStrategy(List<FieldPartitioner> partitioners)
          Deprecated. Will be removed in 0.12.0; use PartitionStrategy.Builder instead.
 
Method Summary
 boolean equals(Object o)
           
 int getCardinality()
           Return the cardinality produced by the contained field partitioners.
 List<FieldPartitioner> getFieldPartitioners()
           Get the list of field partitioners used for partitioning.
 int hashCode()
           
 PartitionKey partitionKey(Object... values)
           Construct a partition key with a variadic array of values corresponding to the field partitioners in this partition strategy.
 PartitionKey partitionKeyForEntity(Object entity)
           Construct a partition key for the given entity.
 PartitionKey partitionKeyForEntity(Object entity, PartitionKey reuseKey)
           Construct a partition key for the given entity, reusing the supplied key if not null.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PartitionStrategy

@Deprecated
public PartitionStrategy(List<FieldPartitioner> partitioners)
Deprecated. Will be removed in 0.12.0; use PartitionStrategy.Builder instead.

Construct a partition strategy with a list of field partitioners.

Method Detail

getFieldPartitioners

public List<FieldPartitioner> getFieldPartitioners()

Get the list of field partitioners used for partitioning.

FieldPartitioners are returned in the same order they are used during partition selection.


getCardinality

public int getCardinality()

Return the cardinality produced by the contained field partitioners.

This can be used to help estimate the resources required by certain operations. For example, when writing data to a partitioned dataset, this method can be used to estimate (or discover exactly, depending on the partition functions) how many leaf partitions exist.

Warning: This method is allowed to lie and should be treated only as a hint. Some partition functions are fixed (e.g. hash modulo number of buckets), while others are open-ended (e.g. discrete value) and depend on the input data.

Returns:
The estimated (or possibly concrete) number of leaf partitions.
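
As a rough illustration of how such a hint could be derived, the sketch below multiplies per-partitioner cardinalities to bound the number of leaf partitions. The class CardinalitySketch and the method estimateLeafPartitions are hypothetical names invented for this example and are not part of the Kite API. That the strategy combines its partitioners' cardinalities as a simple product is an assumption, not something this page states.

```java
import java.util.Arrays;
import java.util.List;

public class CardinalitySketch {

    // Hypothetical helper (not part of the Kite API): assumes the number of
    // leaf partitions is bounded by the product of the cardinalities of the
    // individual field partitioners.
    public static int estimateLeafPartitions(List<Integer> partitionerCardinalities) {
        int total = 1;
        for (int cardinality : partitionerCardinalities) {
            total *= cardinality;
        }
        return total;
    }

    public static void main(String[] args) {
        // For example, 16 hash buckets followed by 12 months.
        List<Integer> cardinalities = Arrays.asList(16, 12);
        System.out.println(estimateLeafPartitions(cardinalities)); // prints 192
    }
}
```

For open-ended partition functions the per-partitioner value is itself only a guess, so a product like this inherits the same uncertainty.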

partitionKey

public PartitionKey partitionKey(Object... values)

Construct a partition key with a variadic array of values corresponding to the field partitioners in this partition strategy.

It is permitted to have fewer values than field partitioners, in which case the key matches all subpartitions in the unspecified parts of the key.

Null values are not permitted.
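
The matching rule for partial keys can be pictured as a prefix comparison. The sketch below is a self-contained model, not the Kite implementation: the class PartialKeySketch and the method matches are hypothetical, and partitions are represented simply as lists of values, one per field partitioner.

```java
import java.util.Arrays;
import java.util.List;

public class PartialKeySketch {

    // Hypothetical model (not part of the Kite API): a key matches a partition
    // when every value the key specifies equals the partition's value at the
    // same position. Positions the key leaves unspecified match anything.
    public static boolean matches(List<?> keyValues, List<?> partitionValues) {
        if (keyValues.size() > partitionValues.size()) {
            return false;
        }
        for (int i = 0; i < keyValues.size(); i++) {
            // Key values are assumed to be non-null, as required above.
            if (!keyValues.get(i).equals(partitionValues.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> partition = Arrays.asList(2014, 3); // e.g. year, month
        System.out.println(matches(Arrays.asList(2014), partition));    // prints true
        System.out.println(matches(Arrays.asList(2014, 4), partition)); // prints false
    }
}
```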


partitionKeyForEntity

public PartitionKey partitionKeyForEntity(Object entity)

Construct a partition key for the given entity.

This is a convenient way to find the partition that a given entity would be written to, or to find a partition using objects from the entity domain.


partitionKeyForEntity

public PartitionKey partitionKeyForEntity(Object entity,
                                          @Nullable
                                          PartitionKey reuseKey)

Construct a partition key for the given entity, reusing the supplied key if not null.

This is a convenient way to find the partition that a given entity would be written to, or to find a partition using objects from the entity domain.
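
The key-reuse pattern can be sketched with a small self-contained model. Everything here is hypothetical: the Key class stands in for a mutable key object, keyFor stands in for this method, and the modulo and identity computations are placeholders for real partition functions. The point is only that a caller can pass the same key object repeatedly to avoid allocating one per entity, on the assumption that the supplied key is overwritten in place.

```java
public class ReuseKeySketch {

    // Hypothetical mutable key holding one value per field partitioner.
    public static final class Key {
        public final Object[] values;

        public Key(int size) {
            this.values = new Object[size];
        }
    }

    // Hypothetical stand-in for partitionKeyForEntity(entity, reuseKey):
    // fills in and returns reuseKey when it is non-null, otherwise allocates
    // a new key. The entity is modeled as {id, year}.
    public static Key keyFor(int[] entity, Key reuseKey) {
        Key key = (reuseKey != null) ? reuseKey : new Key(2);
        key.values[0] = entity[0] % 4; // placeholder for a hash-bucket function
        key.values[1] = entity[1];     // placeholder for an identity function
        return key;
    }

    public static void main(String[] args) {
        int[][] entities = { {10, 2013}, {7, 2014} };
        Key reusable = null;
        for (int[] entity : entities) {
            // After the first iteration the same Key instance is reused.
            reusable = keyFor(entity, reusable);
            System.out.println(reusable.values[0] + "/" + reusable.values[1]);
        }
    }
}
```

Because the reused key is overwritten on each call, a caller following this pattern would need to copy the key before storing it anywhere that outlives the loop iteration.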


equals

public boolean equals(Object o)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2013–2014. All rights reserved.