Type Parameters:
E - The type of entity accepted by this writer.
@NotThreadSafe public interface DatasetWriter<E> extends Flushable, Closeable
A stream-oriented dataset writer.
Implementations of this interface write data to a Dataset.
Writers are use-once objects that serialize entities of type E and write them to the underlying storage system. Normally, you are not expected to instantiate implementations directly. Instead, use the containing dataset's View.newWriter() method to get an appropriate implementation. You should receive an instance of this interface from a dataset, invoke write(Object) and flush() (or sync()) as necessary, and call close() when you are done or no more data exists.
Implementations can hold system resources until the close() method is called, so you must follow the normal try / finally pattern to ensure these resources are properly freed when the writer is no longer needed. Do not rely on implementations automatically invoking the close() method upon object finalization (implementations must not do so). All implementations must silently ignore multiple invocations of close() as well as a close of an unopened writer.
If any method throws an exception, the writer is no longer valid, and the only method that can be subsequently called is close().
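The try / finally lifecycle described above can be sketched as follows. This is a minimal illustration, not the library's own code: `SketchWriter` and `InMemoryWriter` are hypothetical stand-ins that mirror the methods listed on this page so the example runs without a real dataset. In real code the writer would come from View.newWriter().

```java
import java.io.Closeable;
import java.io.Flushable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WriterLifecycleSketch {

  // Hypothetical stand-in mirroring the methods documented on this page.
  interface SketchWriter<E> extends Flushable, Closeable {
    void write(E entity);
    @Override void flush();
    @Override void close();
    boolean isOpen();
  }

  // Hypothetical in-memory implementation, present only so the sketch runs.
  static class InMemoryWriter<E> implements SketchWriter<E> {
    final List<E> stored = new ArrayList<>();
    private boolean open = true;

    @Override public void write(E entity) {
      if (!open) {
        throw new IllegalStateException("writer is closed");
      }
      stored.add(entity);
    }

    @Override public void flush() { /* nothing is buffered in this sketch */ }

    @Override public void close() { open = false; }

    @Override public boolean isOpen() { return open; }
  }

  // Write every entity, then always close the writer, even if write() throws.
  static <E> void writeAll(SketchWriter<E> writer, List<E> entities) {
    try {
      for (E entity : entities) {
        writer.write(entity);
      }
      writer.flush();
    } finally {
      // After an exception, close() is the only call that is still allowed.
      writer.close();
    }
  }

  public static void main(String[] args) {
    InMemoryWriter<String> writer = new InMemoryWriter<>();
    writeAll(writer, Arrays.asList("a", "b", "c"));
    System.out.println(writer.stored.size() + " entities, open=" + writer.isOpen());
  }
}
```

Placing close() in the finally block means resources are released on both the normal and the exceptional path, which matches the rule that a writer is invalid once any method has thrown.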
Implementations of DatasetWriter are typically not thread-safe; that is, the behavior when accessing a single instance from multiple threads is undefined.
Modifier and Type | Method and Description |
---|---|
void | close() Close the writer and release any system resources. |
void | flush() Force or commit any outstanding buffered data to the underlying stream (optional operation). |
boolean | isOpen() |
void | sync() Ensure that data in the underlying stream has been written to disk (optional operation). |
void | write(E entity) Write an entity of type E to the associated dataset. |
void write(E entity)
Write an entity of type E to the associated dataset.
Implementations can buffer entities internally (see the flush() and sync() methods). All instances of entity must conform to the dataset's schema. If they don't, implementations should throw an exception, although this is not required.
Parameters:
entity - The entity to write
Throws:
DatasetWriterException
void flush()
Force or commit any outstanding buffered data to the underlying stream (optional operation).
Note: Some implementations may not implement this method depending on the guarantees available to the underlying storage system. In particular, when using HDFS-backed datasets the Parquet format does not implement flush() by default, and calling it has no effect.
Specified by:
flush in interface Flushable
Throws:
DatasetWriterException
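To illustrate why write(E) alone does not guarantee that data has reached the underlying stream, here is a minimal sketch of internal buffering. `BufferingWriter` and its `sink` list are hypothetical and do not correspond to any actual implementation; the sink simply stands in for the underlying stream.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingSketch {

  // Hypothetical writer that holds entities in memory until flush() is called.
  static class BufferingWriter<E> {
    private final List<E> buffer = new ArrayList<>();
    private final List<E> sink; // stands in for the underlying stream

    BufferingWriter(List<E> sink) {
      this.sink = sink;
    }

    void write(E entity) {
      buffer.add(entity); // not yet visible in the sink
    }

    void flush() {
      sink.addAll(buffer); // push buffered entities to the sink
      buffer.clear();
    }

    void close() {
      flush(); // this sketch flushes on close so nothing is lost
    }
  }

  public static void main(String[] args) {
    List<String> sink = new ArrayList<>();
    BufferingWriter<String> writer = new BufferingWriter<>(sink);
    writer.write("a");
    writer.write("b");
    System.out.println("before flush: " + sink.size()); // prints "before flush: 0"
    writer.flush();
    System.out.println("after flush: " + sink.size());  // prints "after flush: 2"
    writer.close();
  }
}
```

An implementation for which flush() is a no-op, as described for Parquet above, would behave like this sketch with the transfer deferred until close().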
void sync()
Ensure that data in the underlying stream has been written to disk (optional operation).
Note: Some implementations may not implement this method depending on the guarantees available to the underlying storage system. In particular, when using HDFS-backed datasets the Parquet format does not implement sync() by default, and calling it has no effect.
Throws:
DatasetWriterException
void close()
Close the writer and release any system resources. If this method returns without throwing an exception, then every entity that was successfully written with write(Object) has been stored to stable storage.
No further operations of this interface (other than additional calls to this method) can be performed; however, implementations can choose to permit other method calls. See implementation documentation for details.
Specified by:
close in interface AutoCloseable
Specified by:
close in interface Closeable
Throws:
DatasetWriterException
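The requirement that repeated calls to close() be silently ignored can be met with a simple state check. The following is a sketch only: `CountingWriter` and its `releases` counter are hypothetical names used to make the behavior observable, not part of this API.

```java
import java.io.Closeable;

public class IdempotentCloseSketch {

  // Hypothetical writer that counts how often its resources are released.
  static class CountingWriter implements Closeable {
    private boolean open = true;
    int releases = 0;

    public boolean isOpen() {
      return open;
    }

    @Override
    public void close() {
      if (!open) {
        return; // already closed: ignore the call silently
      }
      open = false;
      releases++; // real code would release files, streams, and so on here
    }
  }

  public static void main(String[] args) {
    CountingWriter writer = new CountingWriter();
    writer.close();
    writer.close(); // second call has no effect
    System.out.println("releases=" + writer.releases + ", open=" + writer.isOpen());
  }
}
```

Because the second call returns early, callers can close a writer in a finally block without first checking whether an earlier code path already closed it.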
boolean isOpen()
Copyright © 2013–2014. All rights reserved.