java.lang.Object
  org.kitesdk.data.DatasetRepositories
public class DatasetRepositories
Convenience methods for working with DatasetRepository instances.
Constructor Summary | |
---|---|
DatasetRepositories() | |
Method Summary | |
---|---|
static DatasetRepository | open(String uri): Synonym for open(java.net.URI) for String URIs. |
static DatasetRepository | open(URI repositoryUri): Open a DatasetRepository for the given URI. |
static RandomAccessDatasetRepository | openRandomAccess(String uri): Synonym for openRandomAccess(java.net.URI) for String URIs. |
static RandomAccessDatasetRepository | openRandomAccess(URI repositoryUri): Synonym for open(java.net.URI) for RandomAccessDatasetRepository instances. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DatasetRepositories()
Method Detail |
---|
public static DatasetRepository open(String uri)
Synonym for open(java.net.URI) for String URIs.
Parameters:
uri - a String URI
Throws:
IllegalArgumentException - If the String cannot be parsed into a valid URI.
public static DatasetRepository open(URI repositoryUri)
Open a DatasetRepository for the given URI.
This method provides a simpler way to connect to a DatasetRepository while providing information about the appropriate MetadataProvider and other options to use. For almost all cases, this is the preferred method of retrieving an instance of a DatasetRepository.
The format of a repository URI is as follows.
repo:[storage component]
The [storage component] indicates the underlying metadata and, in some cases, physical storage of the data, along with any options. The supported storage backends are:
file:[path] where [path] is a relative or absolute filesystem path to be used as the dataset repository root directory in which to store dataset data. When specifying an absolute path, the null authority form (e.g. file:///my/path) may be used. Alternatively, the authority section may be omitted entirely (e.g. file:/my/path). Either way, it is illegal to provide an authority (e.g. file://this-part-is-illegal/my/path). This storage backend will produce a DatasetRepository that stores both data and metadata on the local operating system filesystem. See FileSystemDatasetRepository for more information.
hdfs://[host]:[port]/[path] where [host] and [port] indicate the location of the Hadoop NameNode, and [path] is the dataset repository root directory in which to store dataset data. This form will load the Hadoop configuration information per the usual methods (i.e. searching the process's classpath for the various configuration files). This storage backend will produce a DatasetRepository that stores both data and metadata in HDFS. See FileSystemDatasetRepository for more information.
hive and hive://[metastore-host]:[metastore-port]/ will connect to the Hive MetaStore. Dataset locations will be determined by Hive as managed tables.
hive:/[path] and hive://[metastore-host]:[metastore-port]/[path] will also connect to the Hive MetaStore, but tables will be external and stored under [path]. The repository storage layout will be the same as hdfs and file repositories. HDFS connection options can be supplied by adding hdfs-host and hdfs-port query options to the URI (see examples).
repo:hbase:[zookeeper-host1]:[zk-port],[zookeeper-host2],... will open an HBase-backed DatasetRepository. This URI may also be passed to openRandomAccess(URI) to obtain a RandomAccessDatasetRepository.
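The URI format described above nests one URI inside another, and java.net.URI can take the two layers apart. The following is a minimal sketch using only the JDK; it is not Kite's own parsing code, and the class RepoUriSketch and its methods are hypothetical names introduced for illustration.

```java
import java.net.URI;

/** Illustrative only: splits a repository URI into the layers the format describes. */
final class RepoUriSketch {

    /** Returns the [storage component] of a "repo:" URI, parsed as its own URI. */
    static URI storageComponent(String repoUri) {
        URI outer = URI.create(repoUri);
        if (!"repo".equals(outer.getScheme())) {
            throw new IllegalArgumentException("Not a repo: URI: " + repoUri);
        }
        // Nothing after "repo:" starts with '/', so java.net.URI treats the outer
        // URI as opaque and exposes the rest as the scheme-specific part.
        return URI.create(outer.getRawSchemeSpecificPart());
    }

    /** True if the storage component names an authority, which the file: form forbids. */
    static boolean hasAuthority(URI storage) {
        return storage.getRawAuthority() != null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "repo:file:/my/path",
            "repo:file:///my/path",
            "repo:file://this-part-is-illegal/my/path",
            "repo:hdfs://localhost:8020/data"
        };
        for (String sample : samples) {
            URI storage = storageComponent(sample);
            System.out.println(sample + " -> scheme=" + storage.getScheme()
                    + ", authority=" + storage.getAuthority()
                    + ", path=" + storage.getPath());
        }
        // A relative form such as file:foo/bar is itself opaque, so getPath() is null
        // and the relative path is read from the scheme-specific part instead.
        URI relative = storageComponent("repo:file:foo/bar");
        System.out.println("relative path=" + relative.getSchemeSpecificPart());
    }
}
```

java.net.URI reports a null authority for both file:///my/path and file:/my/path, so the two legal absolute forms are indistinguishable once parsed; only the form with a real authority is flagged by hasAuthority.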
Example URI | Description |
---|---|
repo:file:foo/bar | Store data+metadata on the local filesystem in the directory ./foo/bar. |
repo:file:///data | Store data+metadata on the local filesystem in the directory /data. |
repo:hdfs://localhost:8020/data | Same as above, but stores data+metadata on HDFS. |
repo:hive | Connects to the Hive MetaStore and creates managed tables. |
repo:hive://meta-host:9083/ | Connects to the Hive MetaStore at thrift://meta-host:9083, and creates managed tables. This only matches when the path is /. Any non-root path will match the external Hive URIs. |
repo:hive:/path?hdfs-host=localhost&hdfs-port=8020 | Connects to the default Hive MetaStore and creates external tables stored in hdfs://localhost:8020/ at path. hdfs-host and hdfs-port are optional. |
repo:hive://meta-host:9083/path?hdfs-host=localhost&hdfs-port=8020 | Connects to the Hive MetaStore at thrift://meta-host:9083/ and creates external tables stored in hdfs://localhost:8020/ at path. hdfs-host and hdfs-port are optional. |
repo:hbase:zk1,zk2,zk3 | Connects to HBase via the given ZooKeeper quorum nodes. |
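The Hive rows above differ only in whether the path is the root and in the optional hdfs-host and hdfs-port query options. The sketch below shows one way to read that distinction with java.net.URI. It is an illustration under stated assumptions, not Kite's matching logic: HiveUriSketch and its methods are hypothetical names, and treating an empty path like / is an assumption the table does not cover.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: classifies hive repository URIs as in the examples table. */
final class HiveUriSketch {

    /** Returns the part after "repo:" parsed as its own URI. */
    static URI storage(String repoUri) {
        return URI.create(URI.create(repoUri).getRawSchemeSpecificPart());
    }

    /** True for the managed-table forms: bare "hive" or a root path. */
    static boolean isManaged(String repoUri) {
        URI storage = storage(repoUri);
        // Bare "repo:hive" leaves just "hive", which parses as a relative path.
        if (storage.getScheme() == null && "hive".equals(storage.getPath())) {
            return true;
        }
        if (!"hive".equals(storage.getScheme())) {
            throw new IllegalArgumentException("Not a hive repository URI: " + repoUri);
        }
        String path = storage.getPath();
        // Assumption: an empty path is treated like "/" here.
        return path == null || path.isEmpty() || "/".equals(path);
    }

    /** Splits "k1=v1&k2=v2" into a map; no percent-decoding is attempted. */
    static Map<String, String> queryOptions(URI storage) {
        Map<String, String> options = new LinkedHashMap<>();
        String query = storage.getRawQuery();
        if (query == null || query.isEmpty()) {
            return options;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        String uri = "repo:hive://meta-host:9083/path?hdfs-host=localhost&hdfs-port=8020";
        System.out.println("managed=" + isManaged(uri));
        System.out.println("options=" + queryOptions(storage(uri)));
    }
}
```

Because hdfs-host and hdfs-port are optional, queryOptions returns an empty map rather than failing when the URI carries no query.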
Parameters:
repositoryUri - The repository URI
Returns:
a DatasetRepository
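The String overloads on this page, open(String) and openRandomAccess(String), are documented to throw IllegalArgumentException when the String is not a valid URI. java.net.URI.create has the same contract, so the sketch below uses it as a stand-in to check a string before passing it on. Whether Kite delegates to URI.create internally is an assumption, and StringUriSketch is a hypothetical name.

```java
import java.net.URI;

/** Illustrative only: pre-checks a repository URI string with the JDK parser. */
final class StringUriSketch {

    /** Returns true if the string parses as a URI, false if URI.create rejects it. */
    static boolean parses(String uri) {
        try {
            URI.create(uri);
            return true;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException in this type.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("repo:file:///data"));
        // An unescaped space is not a legal URI character.
        System.out.println(parses("repo:file:///my data"));
    }
}
```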
public static RandomAccessDatasetRepository openRandomAccess(String uri)
Synonym for openRandomAccess(java.net.URI) for String URIs.
Parameters:
uri - a String URI
Returns:
a RandomAccessDatasetRepository
Throws:
IllegalArgumentException - If the String cannot be parsed into a valid URI.
public static RandomAccessDatasetRepository openRandomAccess(URI repositoryUri)
Synonym for open(java.net.URI) for RandomAccessDatasetRepository instances.
This method connects to a DatasetRepository the same way open(java.net.URI) does, but instead returns an implementation of type RandomAccessDatasetRepository. This method should be used when one needs to access RandomAccessDataset instances to take advantage of the random access methods.
The format of a repository URI is as follows.
repo:[storage component]
The [storage component] indicates the underlying metadata and, in some cases, physical storage of the data, along with any options. The supported storage backends are:
repo:hbase:[zookeeper-host1]:[zk-port],[zookeeper-host2],... will open an HBase-backed DatasetRepository. This URI may also be passed to openRandomAccess(URI) to obtain a RandomAccessDatasetRepository.
Example URI | Description |
---|---|
repo:hbase:zk1,zk2,zk3 | Connects to HBase via the given ZooKeeper quorum nodes. |
Parameters:
repositoryUri - The repository URI
Returns:
a RandomAccessDatasetRepository
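The hbase storage component packs a comma-separated ZooKeeper quorum into the URI. As a rough sketch of how such a string breaks down, the code below splits it into host and optional port entries using only the JDK. HBaseUriSketch and Node are hypothetical names, and reading the port per entry (with -1 meaning "not given") is an assumption; the page does not say how Kite applies [zk-port] to the other hosts.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: splits the ZooKeeper quorum out of a repo:hbase: URI. */
final class HBaseUriSketch {

    /** One quorum entry; port is -1 when the URI gives none for this host. */
    static final class Node {
        final String host;
        final int port;

        Node(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    static List<Node> quorum(String repoUri) {
        // Both layers are opaque URIs: "repo:" wraps "hbase:", which wraps the host list.
        URI storage = URI.create(URI.create(repoUri).getRawSchemeSpecificPart());
        if (!"hbase".equals(storage.getScheme())) {
            throw new IllegalArgumentException("Not an hbase repository URI: " + repoUri);
        }
        List<Node> nodes = new ArrayList<>();
        for (String entry : storage.getRawSchemeSpecificPart().split(",")) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                nodes.add(new Node(entry, -1));
            } else {
                int port = Integer.parseInt(entry.substring(colon + 1));
                nodes.add(new Node(entry.substring(0, colon), port));
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        for (Node node : quorum("repo:hbase:zk1:2181,zk2,zk3")) {
            System.out.println(node.host + " port=" + node.port);
        }
    }
}
```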