java.lang.Object
  org.debellor.core.DataObject
      org.debellor.core.Sample
public final class Sample
extends DataObject
Sample of data, also known as an instance/object/vector,
the basic unit of data transfer between cells (see Cell.Stream.next()).
Sample is composed of input data and an associated decision (output data).
Samples are constant (immutable), like String objects, so you may freely share them
without risk of accidental modification.
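The immutability described above can be sketched with stand-in classes. Everything below is hypothetical: DataObjectSketch, TextData and SampleSketch are illustrative names, not the library's source, and the copy-on-set behaviour of setData and setDecision is an assumption suggested by their Sample return type, not something this page states.

```java
// Hypothetical stand-ins for org.debellor.core types; not the library's source.
abstract class DataObjectSketch { }

// Illustrative data type holding a piece of text.
final class TextData extends DataObjectSketch {
    final String text;
    TextData(String text) { this.text = text; }
}

final class SampleSketch {
    public final DataObjectSketch data;      // input data, may be null
    public final DataObjectSketch decision;  // associated decision, may be null

    SampleSketch(DataObjectSketch data, DataObjectSketch decision) {
        this.data = data;
        this.decision = decision;
    }

    // Assumed behaviour: return a copy with the data replaced; this sample is untouched.
    SampleSketch setData(DataObjectSketch newData) {
        return new SampleSketch(newData, decision);
    }

    // Assumed behaviour: return a copy with the decision replaced; this sample is untouched.
    SampleSketch setDecision(DataObjectSketch newDecision) {
        return new SampleSketch(data, newDecision);
    }
}

class ImmutableSampleDemo {
    public static void main(String[] args) {
        SampleSketch original = new SampleSketch(new TextData("spam offer"), null);
        SampleSketch labelled = original.setDecision(new TextData("spam"));

        // The original keeps its null decision; the copy shares the same data object.
        System.out.println("original decision is null: " + (original.decision == null));
        System.out.println("data object is shared: " + (labelled.data == original.data));
    }
}
```

Because both fields are final and the setters build new objects, a sample handed to several consumers cannot be changed behind their backs.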
In contrast to some other data mining systems, e.g. Weka,
Debellor's samples may contain various types of data and decisions,
not necessarily vectors.
The data and decision fields are declared as references to the base DataObject class,
so it is possible to add new data types by defining new subclasses of DataObject.
When a cell receives a sample, it usually has to manually downcast
the contained DataObject instances to the specific subclasses it expects
in order to process the sample.
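A minimal sketch of the downcast step follows. The classes BaseData, VectorData, LabelData and Item are hypothetical stand-ins defined here so the example compiles on its own; the real library types and any helper methods they offer may differ.

```java
// Hypothetical stand-ins; the real base class is org.debellor.core.DataObject.
abstract class BaseData { }

// Illustrative data type: a plain vector of doubles.
final class VectorData extends BaseData {
    final double[] values;
    VectorData(double... values) { this.values = values; }
}

// Illustrative data type: a text label.
final class LabelData extends BaseData {
    final String label;
    LabelData(String label) { this.label = label; }
}

// Stand-in for Sample: both fields are typed with the base class.
final class Item {
    final BaseData data;
    final BaseData decision;
    Item(BaseData data, BaseData decision) {
        this.data = data;
        this.decision = decision;
    }
}

class DowncastDemo {
    // A consumer whose contract says: input data must be a VectorData.
    static double sum(Item sample) {
        if (!(sample.data instanceof VectorData)) {
            throw new IllegalArgumentException("expected VectorData as input data");
        }
        VectorData vector = (VectorData) sample.data; // the manual downcast
        double total = 0.0;
        for (double v : vector.values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        Item ok = new Item(new VectorData(1.0, 2.0, 3.5), new LabelData("yes"));
        System.out.println("sum = " + sum(ok));
    }
}
```

Checking the type before casting turns a ClassCastException deep inside the algorithm into an early, descriptive failure at the cell's boundary.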
It is up to the cell which fields (data, decision) of the sample it actually uses.
The cell may choose to read and/or write both, only one, or neither of them.
This depends on the type of the cell (is it a decision system? a preprocessing algorithm? etc.),
its parameters (e.g., a cell could take a parameter that controls whether
the processing is applied to data or decision),
and whether the sample is presented at the input
or generated at the output of the cell.
Every cell should define a contract which specifies
what type of samples is expected at the input
and what type of samples is generated at the output.
If the cell wants to know in advance what type of samples will be generated
by Stream.next() of the input stream, it may read the Sample.SampleType
from the Cell.Stream.sampleType field. Its value
is available immediately after the stream is opened,
so the cell may prepare internal structures as necessary for a given data type,
e.g., arrays of appropriate length if the data will be composed of vectors.
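The idea of sizing internal buffers from the sample type can be sketched as follows. This is a loose illustration under heavy assumptions: TypeInfoSketch, its vectorLength field, VectorStreamSketch and the end-of-stream convention (next() returning null) are all invented here, and this page does not describe what Sample.SampleType or Cell.Stream actually contain.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical type descriptor that records only the vector length.
final class TypeInfoSketch {
    final int vectorLength;
    TypeInfoSketch(int vectorLength) { this.vectorLength = vectorLength; }
}

// Hypothetical stream: sampleType is set by open(); next() returns null at the end.
final class VectorStreamSketch {
    TypeInfoSketch sampleType; // null until open() is called
    private final int length;
    private final List<double[]> rows;
    private Iterator<double[]> iterator;

    VectorStreamSketch(int length, List<double[]> rows) {
        this.length = length;
        this.rows = rows;
    }

    void open() {
        sampleType = new TypeInfoSketch(length);
        iterator = rows.iterator();
    }

    double[] next() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ColumnMeansDemo {
    static double[] columnMeans(VectorStreamSketch in) {
        in.open();
        // The type is known right after opening, so the buffer is sized before any next() call.
        double[] sums = new double[in.sampleType.vectorLength];
        int count = 0;
        for (double[] row = in.next(); row != null; row = in.next()) {
            for (int i = 0; i < sums.length; i++) {
                sums[i] += row[i];
            }
            count++;
        }
        for (int i = 0; i < sums.length; i++) {
            sums[i] = (count == 0) ? 0.0 : sums[i] / count;
        }
        return sums;
    }

    public static void main(String[] args) {
        List<double[]> rows = Arrays.asList(new double[] {1.0, 10.0}, new double[] {3.0, 30.0});
        System.out.println(Arrays.toString(columnMeans(new VectorStreamSketch(2, rows))));
    }
}
```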
On the other hand, before the cell starts generating output samples,
it should create a sampleType object describing the samples to be produced
as precisely as possible. This object should be returned from
the overridden Cell.onOpen().
Providing a meaningful (non-null) sampleType object is not obligatory,
but without it the usability of the cell is low,
because most cells that could be connected to the given cell as consumers
would fail at runtime due to an unhandled type of input data.
Algorithms from the Weka and Rseslib libraries
operate on samples whose data field is a DataVector
composed of NumericFeature or SymbolicFeature objects,
while the decision is a single feature object.
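The vector-of-features layout described above might look like this in outline. The classes ObjSketch, NumericSketch, SymbolicSketch, VectorSketch and RowSketch are hypothetical stand-ins for DataObject, NumericFeature, SymbolicFeature, DataVector and Sample; their constructors and fields are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; the real classes live in org.debellor.core and may differ.
abstract class ObjSketch { }

final class NumericSketch extends ObjSketch {
    final double value;
    NumericSketch(double value) { this.value = value; }
}

final class SymbolicSketch extends ObjSketch {
    final String value;
    SymbolicSketch(String value) { this.value = value; }
}

// A vector whose elements are numeric or symbolic features.
final class VectorSketch extends ObjSketch {
    final List<ObjSketch> features;
    VectorSketch(ObjSketch... features) { this.features = Arrays.asList(features); }
}

final class RowSketch {
    final ObjSketch data;      // expected to be a VectorSketch
    final ObjSketch decision;  // expected to be a single feature
    RowSketch(ObjSketch data, ObjSketch decision) {
        this.data = data;
        this.decision = decision;
    }
}

class VectorLayoutDemo {
    // Counts the numeric features in a sample whose data is a feature vector.
    static int countNumeric(RowSketch row) {
        VectorSketch vector = (VectorSketch) row.data;
        int count = 0;
        for (ObjSketch feature : vector.features) {
            if (feature instanceof NumericSketch) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        RowSketch row = new RowSketch(
                new VectorSketch(new NumericSketch(21.5), new SymbolicSketch("sunny"), new NumericSketch(0.4)),
                new SymbolicSketch("play"));
        System.out.println(countNumeric(row) + " numeric features, decision = "
                + ((SymbolicSketch) row.decision).value);
    }
}
```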
See Also: Cell.Stream.next(), Cell.onNext()
Nested Class Summary | |
---|---|
static class | Sample.SampleType: Describes common properties of all Sample objects in a given data Cell.Stream. |
Field Summary | |
---|---|
DataObject | data: Input data on which data processing algorithms will primarily work. |
DataObject | decision: Decision (also known as target/prediction/output value) associated with the data. |
Constructor Summary |
---|
Sample(DataObject data, DataObject decision) |
Method Summary | |
---|---|
boolean | equals(java.lang.Object obj): Must be implemented by every subclass. |
int | hashCode(): Must be implemented by every subclass. |
Sample | setData(DataObject data) |
Sample | setDecision(DataObject decision) |
java.lang.String | toString() |
Methods inherited from class org.debellor.core.DataObject |
---|
asDataVector, asNumericFeature, asSymbolicFeature |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public final DataObject data
Input data on which data processing algorithms will primarily work.
Can be null for some or all samples in a data set.
May have an associated decision.
public final DataObject decision
Decision (also known as target/prediction/output value) associated with the data.
Either assigned by a supervisor (ground truth / target)
or predicted by a decision system (prediction / output value).
Can be null for some or all samples in a data set.
Constructor Detail |
---|
public Sample(DataObject data, DataObject decision)
Method Detail |
---|
public Sample setData(DataObject data)
public Sample setDecision(DataObject decision)
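public java.lang.String toString()
Overrides: toString in class java.lang.Object
public boolean equals(java.lang.Object obj)
Description copied from class: DataObject
Must be implemented by every subclass.
Overrides: equals in class DataObject
public int hashCode()
Description copied from class: DataObject
Must be implemented by every subclass.
Overrides: hashCode in class DataObject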