java.lang.Object
  org.kitesdk.data.DatasetRepositories
public class DatasetRepositories
Convenience methods for working with DatasetRepository instances.
| Constructor Summary |
|---|
| DatasetRepositories() |
| Method Summary | |
|---|---|
| static DatasetRepository | open(String uri): Synonym for open(java.net.URI) for String URIs. |
| static DatasetRepository | open(URI repositoryUri): Open a DatasetRepository for the given URI. |
| static RandomAccessDatasetRepository | openRandomAccess(String uri): Synonym for openRandomAccess(java.net.URI) for String URIs. |
| static RandomAccessDatasetRepository | openRandomAccess(URI repositoryUri): Equivalent to open(java.net.URI), but returns a RandomAccessDatasetRepository. |
| Methods inherited from class java.lang.Object | 
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public DatasetRepositories()
| Method Detail | 
|---|
public static DatasetRepository open(String uri)
Synonym for open(java.net.URI) for String URIs.
Parameters:
uri - a String URI
Throws:
IllegalArgumentException - If the String cannot be parsed into a valid URI.

public static DatasetRepository open(URI repositoryUri)
 Open a DatasetRepository for the given URI.
 
 This method provides a simpler way to connect to a DatasetRepository
 while providing information about the appropriate MetadataProvider
 and other options to use. For almost all cases, this is the preferred method
 of retrieving an instance of a DatasetRepository.
 
The format of a repository URI is as follows.
repo:[storage component]
 
 The [storage component] indicates the underlying metadata and,
 in some cases, physical storage of the data, along with any options. The
 supported storage backends are:
 
 file:[path] where [path] is a relative or absolute
 filesystem path to be used as the dataset repository root directory in which
 to store dataset data. When specifying an absolute path, the
 null authority
 (i.e. file:///my/path)
 form may be used. Alternatively, the authority section may be omitted
 entirely (e.g. file:/my/path). Either way, it is illegal to
 provide an authority (i.e.
 file://this-part-is-illegal/my/path). This storage backend
 will produce a DatasetRepository that stores both data and metadata
 on the local operating system filesystem. See
 FileSystemDatasetRepository for more information.
 
 hdfs://[host]:[port]/[path] where [host] and
 [port] indicate the location of the Hadoop NameNode, and
 [path] is the dataset repository root directory in which to
 store dataset data. This form will load the Hadoop configuration
 information per the usual methods (i.e. searching the process's classpath
 for the various configuration files). This storage backend will produce a
 DatasetRepository that stores both data and metadata in HDFS. See
 FileSystemDatasetRepository for more information.
 
 hive and
 hive://[metastore-host]:[metastore-port]/ will connect to the
 Hive MetaStore.  Dataset locations will be determined by Hive as managed
 tables.
 
 hive:/[path] and
 hive://[metastore-host]:[metastore-port]/[path] will also
 connect to the Hive MetaStore, but tables will be external and stored
 under [path]. The repository storage layout will be the same
 as hdfs and file repositories. HDFS connection
 options can be supplied by adding hdfs-host and
 hdfs-port query options to the URI (see examples).
 
 repo:hbase:[zookeeper-host1]:[zk-port],[zookeeper-host2],...
  will open an HBase-backed DatasetRepository. This URI may also be
 passed to openRandomAccess(URI) to obtain a RandomAccessDatasetRepository.
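
As a rough illustration of the URI shapes listed above, the following sketch uses only java.net.URI to strip the repo: prefix and inspect the storage component. It is not Kite's implementation: the class RepoUriSketch and its method names are hypothetical, and the file check only mirrors the documented rule that a file: component must not carry an authority.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of the Kite SDK.
final class RepoUriSketch {

    /** Returns the storage component, i.e. everything after the "repo:" prefix. */
    static URI storageComponent(String repoUri) {
        // URI.create throws IllegalArgumentException for a malformed string.
        URI uri = URI.create(repoUri);
        if (!"repo".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Not a repository URI: " + repoUri);
        }
        return URI.create(uri.getSchemeSpecificPart());
    }

    /** Names the backend: the inner scheme, or the bare word for a form such as "repo:hive". */
    static String backend(URI storage) {
        return storage.getScheme() != null
                ? storage.getScheme()
                : storage.getSchemeSpecificPart();
    }

    /** Mirrors the documented rule: a file: component may not provide an authority. */
    static boolean isLegalFileComponent(URI storage) {
        return "file".equals(storage.getScheme()) && storage.getAuthority() == null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "repo:file:foo/bar",
            "repo:file:///data",
            "repo:hdfs://localhost:8020/data",
            "repo:hive"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> backend " + backend(storageComponent(sample)));
        }
    }
}
```

Both file:/my/path and file:///my/path pass the check in this sketch because java.net.URI reports no authority for either form, while file://this-part-is-illegal/my/path is rejected.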
 
| Example URI | Result |
|---|---|
| repo:file:foo/bar | Store data+metadata on the local filesystem in the directory ./foo/bar. |
| repo:file:///data | Store data+metadata on the local filesystem in the directory /data. |
| repo:hdfs://localhost:8020/data | Same as above, but stores data+metadata on HDFS. |
| repo:hive | Connects to the Hive MetaStore and creates managed tables. |
| repo:hive://meta-host:9083/ | Connects to the Hive MetaStore at thrift://meta-host:9083, and creates managed tables. This only matches when the path is /. Any non-root path will match the external Hive URIs. |
| repo:hive:/path?hdfs-host=localhost&hdfs-port=8020 | Connects to the default Hive MetaStore and creates external tables stored in hdfs://localhost:8020/ at path. hdfs-host and hdfs-port are optional. |
| repo:hive://meta-host:9083/path?hdfs-host=localhost&hdfs-port=8020 | Connects to the Hive MetaStore at thrift://meta-host:9083/ and creates external tables stored in hdfs://localhost:8020/ at path. hdfs-host and hdfs-port are optional. |
| repo:hbase:zk1,zk2,zk3 | Connects to HBase via the given zookeeper quorum nodes. |
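
The Hive examples above distinguish managed from external tables by the path, and carry HDFS options in the query string. The sketch below shows one way to read those parts with java.net.URI alone. The class HiveUriSketch and its methods are hypothetical and only restate the documented matching rule (bare hive or a path of exactly / means managed tables); how Kite itself matches these URIs may differ.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Kite SDK.
final class HiveUriSketch {

    /** Returns the part of the repository URI after the "repo:" prefix. */
    static URI storageComponent(String repoUri) {
        URI uri = URI.create(repoUri);
        if (!"repo".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Not a repository URI: " + repoUri);
        }
        return URI.create(uri.getSchemeSpecificPart());
    }

    /** Managed tables: the bare "hive" form, or a hive URI whose path is exactly "/". */
    static boolean isManaged(URI storage) {
        if (storage.getScheme() == null) {
            // "repo:hive" leaves just the word "hive" with no scheme of its own.
            return "hive".equals(storage.getSchemeSpecificPart());
        }
        return "hive".equals(storage.getScheme()) && "/".equals(storage.getPath());
    }

    /** Splits query options such as hdfs-host and hdfs-port into a map. */
    static Map<String, String> queryOptions(URI storage) {
        Map<String, String> options = new LinkedHashMap<>();
        String query = storage.getQuery();
        if (query == null || query.isEmpty()) {
            return options; // both options are optional
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        URI external = storageComponent(
                "repo:hive://meta-host:9083/path?hdfs-host=localhost&hdfs-port=8020");
        System.out.println("managed: " + isManaged(external));
        System.out.println("metastore: " + external.getHost() + ":" + external.getPort());
        System.out.println("options: " + queryOptions(external));
    }
}
```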
Parameters:
repositoryUri - The repository URI
Returns:
DatasetRepository

public static RandomAccessDatasetRepository openRandomAccess(String uri)
Synonym for openRandomAccess(java.net.URI) for String URIs.
Parameters:
uri - a String URI
Returns:
RandomAccessDatasetRepository
Throws:
IllegalArgumentException - If the String cannot be parsed into a valid URI.

public static RandomAccessDatasetRepository openRandomAccess(URI repositoryUri)
 Equivalent to open(java.net.URI), but returns a RandomAccessDatasetRepository.
 
 This method provides a simpler way to connect to a DatasetRepository the same
 way open(java.net.URI) does, but instead returns an implementation of type
 RandomAccessDatasetRepository. This method should be used when one needs to
 access RandomAccessDatasets to take advantage of the random access methods.
 
The format of a repository URI is as follows.
repo:[storage component]
 
 The [storage component] indicates the underlying metadata and,
 in some cases, physical storage of the data, along with any options. The
 supported storage backends are:
 
 repo:hbase:[zookeeper-host1]:[zk-port],[zookeeper-host2],...
  will open an HBase-backed DatasetRepository. Passing this URI to
 openRandomAccess(URI) returns a RandomAccessDatasetRepository.
 
| Example URI | Result |
|---|---|
| repo:hbase:zk1,zk2,zk3 | Connects to HBase via the given zookeeper quorum nodes. |
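
To make the quorum syntax concrete, the sketch below splits the host list of an hbase repository URI into host and optional port pairs. It is an illustration under stated assumptions, not Kite's parser: the class HBaseUriSketch, the QuorumNode type, and the use of -1 for an omitted port are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Kite SDK.
final class HBaseUriSketch {

    /** One quorum entry; port is -1 when the URI does not give one (an assumption of this sketch). */
    static final class QuorumNode {
        final String host;
        final int port;

        QuorumNode(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Splits "repo:hbase:host1:port,host2,..." into its quorum entries. */
    static List<QuorumNode> quorum(String repoUri) {
        String prefix = "repo:hbase:";
        if (!repoUri.startsWith(prefix)) {
            throw new IllegalArgumentException("Not an HBase repository URI: " + repoUri);
        }
        List<QuorumNode> nodes = new ArrayList<>();
        for (String entry : repoUri.substring(prefix.length()).split(",")) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                nodes.add(new QuorumNode(entry, -1));
            } else {
                // Integer.parseInt throws NumberFormatException for a non-numeric port.
                int port = Integer.parseInt(entry.substring(colon + 1));
                nodes.add(new QuorumNode(entry.substring(0, colon), port));
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        for (QuorumNode node : quorum("repo:hbase:zk1:2181,zk2,zk3")) {
            System.out.println(node.host + " port=" + node.port);
        }
    }
}
```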
Parameters:
repositoryUri - The repository URI
Returns:
RandomAccessDatasetRepository