org.kitesdk.data.hcatalog
Class HCatalogDatasetRepository
java.lang.Object
org.kitesdk.data.spi.AbstractDatasetRepository
org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository
org.kitesdk.data.hcatalog.HCatalogDatasetRepository
- All Implemented Interfaces:
- DatasetRepository
public class HCatalogDatasetRepository
- extends org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository
A DatasetRepository that uses the Hive/HCatalog metastore for metadata and stores data in a Hadoop FileSystem.
The location of the data directory is either chosen by Hive/HCatalog (so-called "managed tables") or specified when creating an instance of this class by providing a FileSystem and a root directory in the constructor ("external tables").
The primary methods of interest are create(String, DatasetDescriptor), FileSystemDatasetRepository.load(String), and delete(String), which create a new dataset, load an existing dataset, and delete an existing dataset, respectively. Once a dataset has been created or loaded, users can invoke the appropriate Dataset methods to get a reader or writer as needed.
- See Also:
DatasetRepository, Dataset
Fields inherited from class org.kitesdk.data.spi.AbstractDatasetRepository
REPOSITORY_URI_PROPERTY_NAME
Methods inherited from class org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository
exists, getUri, list, load, partitionKeyForPath, toString, update
Methods inherited from class org.kitesdk.data.spi.AbstractDatasetRepository
addRepositoryUri
create
public <E> Dataset<E> create(String name, DatasetDescriptor descriptor)
- Description copied from interface: DatasetRepository
- Create a Dataset with the supplied descriptor. Depending on the underlying dataset storage, some schema types or configurations might not be supported. If you supply an illegal schema, the implementing class throws an exception. It is illegal to create more than one dataset with the same name. If you provide a duplicate name, the implementing class throws an exception.
- Specified by:
create in interface DatasetRepository
- Overrides:
create in class org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository
- Parameters:
name - The fully qualified dataset name
descriptor - A descriptor that describes the schema and other properties of the dataset
- Returns:
- The newly created dataset
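The create contract described above (reject an illegal schema, reject a duplicate name) can be illustrated with a small in-memory sketch. This is not Kite code: the class CreateContractSketch is hypothetical, a String stands in for DatasetDescriptor, the "illegal schema" check is reduced to an empty-text test, and the standard Java exception types are placeholders, since this page does not name the exceptions the implementing class throws.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a repository; not part of Kite. */
final class CreateContractSketch {
    // Maps dataset name to its schema text (a String stands in for DatasetDescriptor).
    private final Map<String, String> schemas = new HashMap<>();

    /** Registers a dataset, rejecting unusable schemas and duplicate names. */
    String create(String name, String schema) {
        if (schema == null || schema.trim().isEmpty()) {
            // Placeholder for "illegal schema"; a real repository validates far more.
            throw new IllegalArgumentException("Illegal schema for dataset: " + name);
        }
        if (schemas.containsKey(name)) {
            // Placeholder exception type for the duplicate-name case.
            throw new IllegalStateException("Dataset already exists: " + name);
        }
        schemas.put(name, schema);
        return name; // stands in for the newly created Dataset
    }

    /** Reports whether a dataset with this name has been registered. */
    boolean exists(String name) {
        return schemas.containsKey(name);
    }

    public static void main(String[] args) {
        CreateContractSketch repo = new CreateContractSketch();
        repo.create("users", "{\"type\": \"record\"}");
        try {
            repo.create("users", "{\"type\": \"record\"}");
        } catch (IllegalStateException e) {
            System.out.println("Second create rejected: " + e.getMessage());
        }
    }
}
```

Callers that cannot rule out a name collision may prefer to check the inherited exists method first rather than rely on catching the exception.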
delete
public boolean delete(String name)
- Description copied from interface: DatasetRepository
- Delete data for the Dataset named name and remove its DatasetDescriptor from the underlying metadata provider. After this method is called, there is no Dataset with the given name, unless an exception is thrown. If either data or metadata is removed, this method returns true. If there is no Dataset corresponding to the given name, this method makes no changes and returns false.
- Specified by:
delete in interface DatasetRepository
- Overrides:
delete in class org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository
- Parameters:
name - The name of the dataset to delete.
- Returns:
true if any data or metadata is removed, false if no action is taken.
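The return-value rule for delete (true if either data or metadata was removed, false if nothing matched) can be modeled with a minimal in-memory sketch. The class DeleteContractSketch and its register helper are hypothetical and exist only for illustration; two sets stand in for the data directory and the metastore, and nothing here reflects how the actual class is implemented.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of the delete contract; not part of Kite. */
final class DeleteContractSketch {
    private final Set<String> metadata = new HashSet<>(); // stands in for the metastore
    private final Set<String> data = new HashSet<>();     // stands in for the data directory

    /** Illustration-only helper: records metadata and, optionally, data for a name. */
    void register(String name, boolean withData) {
        metadata.add(name);
        if (withData) {
            data.add(name);
        }
    }

    /** Removes data and metadata for the name; true if either one was present. */
    boolean delete(String name) {
        // Evaluate both removals before combining them. Writing
        // data.remove(name) || metadata.remove(name) would short-circuit
        // and leave metadata behind whenever data was removed.
        boolean removedData = data.remove(name);
        boolean removedMetadata = metadata.remove(name);
        return removedData || removedMetadata;
    }

    /** Reports whether any data or metadata remains for the name. */
    boolean exists(String name) {
        return metadata.contains(name) || data.contains(name);
    }

    public static void main(String[] args) {
        DeleteContractSketch repo = new DeleteContractSketch();
        repo.register("users", true);
        System.out.println("first delete: " + repo.delete("users"));
        System.out.println("second delete: " + repo.delete("users"));
    }
}
```

Because the second call finds nothing to remove, it returns false without throwing, which makes repeated deletes of the same name safe in this model.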
Copyright © 2013–2014. All rights reserved.