java.lang.Object
es.upm.fi.cig.multictbnc.data.representation.Dataset
Represents a time series dataset, which stores sequences and provides methods to access and modify their
 information.
- 
Constructor Summary
- Creates an empty dataset with the name of the time variable.
- Creates an empty dataset with the names of the time variable and class variables.
- Creates a dataset with a list of sequences.
- 
Method Summary
- void addFeatureVariable(String nameFeatureVariable): Registers the name of a feature in the dataset to allow the inclusion of sequences that contain it.
- void addFeatureVariable(String nameFeatureVariable, Dataset dataset): Adds a new feature variable to the dataset given the sequences containing the transitions of the variable.
- boolean addSequence(Sequence sequence): Receives a Sequence to add it to the dataset.
- boolean addSequence(List<String[]> data): Receives a list of Strings (a sequence) from which a Sequence is created and adds it to the dataset.
- boolean addSequence(List<String[]> data, String filePath): Receives a list of Strings (a sequence) and the path of the file from which it was extracted.
- void checkVarianceFeatures(boolean removeZeroVariance): Removes from the dataset those feature variables with zero variance.
- getLabelPowerset(): Returns a multi-class dataset generated from the multidimensional dataset.
- getNameAllVariables(): Returns the names of all the variables, including the time variable.
- getNameClassVariables(): Returns the names of the class variables.
- getNameFeatureVariables(): Returns the names of the feature variables.
- getNameTimeVariable(): Returns the name of the time variable.
- getNameVariables(): Returns the names of all the variables except the time variable.
- int getNumClassVariables(): Returns the number of class variables.
- int getNumDataPoints(): Returns the number of data points.
- int getNumObservation(): Returns the number of observations in the dataset, i.e., the number of observations that occur in all the sequences.
- getPossibleStatesVariable(String nameVariable): Returns the possible states of the specified variable.
- getSequences(): Returns the sequences of the dataset.
- State[] getStatesClassVariables(): Gets the states of the class variables for each of the sequences.
- getStatesVariables(): Gets the possible states of all variables.
- void initialiazeStatesClassVariables(): Retrieves the states of the class variables and stores them in a Map.
- void removeFeatureVariable(String nameFeatureVariable): Removes the specified feature variable from the dataset.
- void removeFeatureVariables(List<String> namesFeatureVariables): Removes the specified feature variables from the dataset.
- void setIgnoredClassVariables(List<String> ignoredClassVariables): Sets the class variables to be ignored.
- void setStatesVariables(Map<String, List<String>> statesVariables): Sets the states of all variables.
- 
Constructor Details
Dataset
Creates an empty dataset with the names of the time variable and class variables.
- Parameters:
- nameTimeVariable - name of the time variable
- nameClassVariables - names of the class variables
 
- 
Dataset
Creates an empty dataset with the name of the time variable.
- Parameters:
- nameTimeVariable - name of the time variable
 
- 
Dataset
Creates a dataset with a list of sequences.
- Parameters:
- sequences - list of Sequence
 
 
- 
- 
Method Details
addFeatureVariable
Registers the name of a feature in the dataset to allow the inclusion of sequences that contain it. This method requires a fill state that will be used to include the variable in those sequences.
- Parameters:
- nameFeatureVariable - feature variable name
 
- 
getSequences
Returns the sequences of the dataset.
- Returns:
- list with the sequences of the dataset
 
- 
addFeatureVariable
Adds a new feature variable to the dataset given the sequences containing the transitions of the variable. It is assumed that the given sequences and those of the dataset have the same length. This method ignores the time variable.
- Parameters:
- nameFeatureVariable - name of the feature variable to add
- dataset - dataset containing sequences with the new feature variable
 
- 
addSequence
Receives a list of Strings (a sequence) from which a Sequence is created and adds it to the dataset. The first array of Strings has to contain the names of the variables.
- Parameters:
- data - list of Strings (a sequence) where the first array contains the names of the variables
- Returns:
- true if the sequence was successfully added to the dataset; false otherwise.
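The expected input layout can be sketched as follows. This is a minimal, self-contained illustration rather than the library's code: the class `SequenceDataSketch`, its helper methods, and the variable names (`t`, `X1`, `C1`) are hypothetical, and the row-length check is only an assumption about what a well-formed input looks like.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class SequenceDataSketch {

    // Builds the List<String[]> layout described above:
    // the variable names first, followed by one array per observation.
    static List<String[]> build(String[] variableNames, String[][] observations) {
        List<String[]> data = new ArrayList<>();
        data.add(variableNames);
        for (String[] observation : observations) {
            data.add(observation);
        }
        return data;
    }

    // Assumed sanity check: every row holds as many values as there are variable names.
    static boolean isRectangular(List<String[]> data) {
        if (data.isEmpty()) {
            return false;
        }
        int numVariables = data.get(0).length;
        for (String[] row : data) {
            if (row.length != numVariables) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> data = build(
                new String[] {"t", "X1", "C1"},
                new String[][] {{"0.0", "a", "yes"}, {"0.5", "b", "yes"}});
        System.out.println("Rows including header: " + data.size());
        System.out.println("Rectangular: " + isRectangular(data));
    }
}
```

A list built this way has the shape that `addSequence(List<String[]> data)` is documented to receive.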
 
- 
addSequence
Receives a list of Strings (a sequence) and the path of the file from which it was extracted. Then, it creates a Sequence and adds it to the dataset. The first array of Strings representing the sequence has to contain the names of the variables.
- Parameters:
- data - list of Strings (a sequence) where the first array contains the names of the variables
- filePath - path of the file from which the sequence was extracted
- Returns:
- true if the sequence was successfully added to the dataset; false otherwise.
 
- 
getNumDataPoints
public int getNumDataPoints()
Returns the number of data points. In this case, this is the number of sequences.
- Returns:
- number of sequences
 
- 
getNameAllVariables
Returns the names of all the variables, including the time variable.
- Returns:
- names of all the variables
 
- 
getNameTimeVariable
Returns the name of the time variable.
- Returns:
- name of the time variable
 
- 
getNameVariables
Returns the names of all the variables except the time variable. The returned list contains the feature variables first, followed by the class variables.
- Returns:
- names of all the variables except the time variable
 
- 
getNameFeatureVariables
Returns the names of the feature variables.
- Returns:
- list with the names of the feature variables
 
- 
getNameClassVariables
Returns the names of the class variables. Class variables marked as ignored are filtered out.
- Returns:
- list with the names of the class variables
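The filtering of ignored class variables can be pictured with the sketch below. It is an assumption-based illustration only: `IgnoredClassVariablesSketch` and `filterIgnored` are hypothetical names, and the library's actual filtering logic is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class IgnoredClassVariablesSketch {

    // Returns the class variable names that are not listed as ignored,
    // preserving their original order.
    static List<String> filterIgnored(List<String> classVariables, List<String> ignored) {
        List<String> kept = new ArrayList<>();
        for (String name : classVariables) {
            if (ignored == null || !ignored.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> kept = filterIgnored(Arrays.asList("C1", "C2", "C3"), Arrays.asList("C2"));
        System.out.println(kept);
    }
}
```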
 
- 
addSequence
Receives a Sequence to add it to the dataset.
- Parameters:
- sequence - a Sequence
- Returns:
- true if the sequence was successfully added to the dataset; false otherwise.
 
- 
checkVarianceFeatures
public void checkVarianceFeatures(boolean removeZeroVariance)
Removes from the dataset those feature variables with zero variance. Call this method only after the entire dataset has been read, since sequences added later could change the variance of a variable.
- Parameters:
- removeZeroVariance - true to remove variables with no variance, false otherwise
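The idea of a zero-variance feature can be sketched as follows, assuming that for a discrete variable "zero variance" means it takes a single state across all observations. `ZeroVarianceSketch` and `zeroVarianceFeatures` are hypothetical names, and the map of observed values stands in for the dataset's sequences.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class ZeroVarianceSketch {

    // Returns the names of the features whose observed values never change.
    // Assumption: a feature with at most one distinct value has zero variance.
    static List<String> zeroVarianceFeatures(Map<String, List<String>> valuesPerFeature) {
        List<String> constant = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : valuesPerFeature.entrySet()) {
            if (new HashSet<>(entry.getValue()).size() <= 1) {
                constant.add(entry.getKey());
            }
        }
        return constant;
    }

    public static void main(String[] args) {
        Map<String, List<String>> values = new LinkedHashMap<>();
        values.put("X1", Arrays.asList("a", "b", "a"));
        values.put("X2", Arrays.asList("k", "k", "k"));
        System.out.println(zeroVarianceFeatures(values));
    }
}
```

Running such a check before all sequences are loaded could flag a feature that later turns out to vary, which is why the check belongs after reading is complete.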
 
- 
getPossibleStatesVariable
Returns the possible states of the specified variable. The states of the variable are extracted once and stored in a map to avoid recomputation. To avoid returning a reference to the same State list on every call, the State objects from the map are copied.
- Parameters:
- nameVariable - variable name
- Returns:
- states of the variable
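The caching behaviour described above (compute once, return copies) can be sketched like this. The sketch uses plain strings in place of State objects, and `StateCacheSketch`, its extractor function and its counter are hypothetical; with immutable strings, copying the list is enough, whereas the library is described as copying the State objects themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of "extract once, hand out copies".
class StateCacheSketch {
    private final Map<String, List<String>> cache = new HashMap<>();
    private final Function<String, List<String>> extractor;
    private int extractions = 0;

    StateCacheSketch(Function<String, List<String>> extractor) {
        this.extractor = extractor;
    }

    // Extracts the states on the first request only, then serves copies
    // so that callers cannot modify the cached list.
    List<String> getPossibleStates(String nameVariable) {
        List<String> cached = cache.computeIfAbsent(nameVariable, name -> {
            extractions++;
            return new ArrayList<String>(extractor.apply(name));
        });
        return new ArrayList<String>(cached);
    }

    int getExtractions() {
        return extractions;
    }

    public static void main(String[] args) {
        StateCacheSketch sketch = new StateCacheSketch(name -> Arrays.asList("a", "b"));
        sketch.getPossibleStates("X1").add("tampered");
        System.out.println(sketch.getPossibleStates("X1").size());
        System.out.println(sketch.getExtractions());
    }
}
```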
 
- 
getLabelPowerset
Returns a multi-class dataset generated from the multidimensional dataset.
- Returns:
- multi-class dataset
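As background, a label powerset transformation treats each distinct combination of class-variable states as one state of a single class variable. The sketch below illustrates that general idea only: `LabelPowersetSketch`, `toSingleClass` and the separator are hypothetical, and how the library names or stores the combined states is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the general label powerset idea.
class LabelPowersetSketch {

    // Joins the class states of each sequence into a single combined label.
    static List<String> toSingleClass(List<String[]> classStatesPerSequence, String separator) {
        List<String> labels = new ArrayList<>();
        for (String[] states : classStatesPerSequence) {
            labels.add(String.join(separator, states));
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String[]> classStates = Arrays.asList(
                new String[] {"yes", "low"},
                new String[] {"no", "low"},
                new String[] {"yes", "low"});
        System.out.println(toSingleClass(classStates, "_"));
    }
}
```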
 
- 
getNumClassVariables
public int getNumClassVariables()
Returns the number of class variables.
- Returns:
- number of class variables.
 
- 
getNumObservation
public int getNumObservation()
Returns the number of observations in the dataset, i.e., the number of observations that occur in all the sequences.
- Returns:
- number of observations
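The difference between data points (sequences) and observations can be sketched as follows. `DatasetCountsSketch` is hypothetical, and each inner list is assumed to hold one array per observation, without a header row.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two counts.
class DatasetCountsSketch {

    // One data point per sequence.
    static int numDataPoints(List<List<String[]>> sequences) {
        return sequences.size();
    }

    // Observations are summed over all sequences.
    static int numObservations(List<List<String[]>> sequences) {
        int total = 0;
        for (List<String[]> sequence : sequences) {
            total += sequence.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String[]> first = Arrays.asList(new String[] {"0.0", "a"}, new String[] {"0.4", "b"});
        List<String[]> second = Arrays.asList(
                new String[] {"0.0", "b"}, new String[] {"0.2", "a"}, new String[] {"0.9", "b"});
        List<List<String[]>> sequences = Arrays.asList(first, second);
        System.out.println(numDataPoints(sequences) + " sequences, "
                + numObservations(sequences) + " observations");
    }
}
```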
 
- 
getStatesClassVariables
Gets the states of the class variables for each of the sequences.
- Returns:
- array of State objects
 
- 
getStatesVariables
Gets the possible states of all variables.
- Returns:
- array of State objects
 
- 
setStatesVariables
Sets the states of all variables. This method is used when training and test datasets are defined, and the training dataset needs to know all possible states.
- Parameters:
- statesVariables - a Map linking the names of the variables with their possible states
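One way to obtain such a map is to merge the states seen in the training and test data. The sketch below is an assumption about how a caller might prepare the argument: `StatesUnionSketch` and `union` are hypothetical names, and the library may build this map differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that prepares a Map<String, List<String>> of states.
class StatesUnionSketch {

    // Merges the states of both maps, keeping first-seen order and no duplicates.
    static Map<String, List<String>> union(Map<String, List<String>> training,
                                           Map<String, List<String>> test) {
        Map<String, List<String>> merged = new LinkedHashMap<>();
        addAll(merged, training);
        addAll(merged, test);
        return merged;
    }

    private static void addAll(Map<String, List<String>> merged, Map<String, List<String>> source) {
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            List<String> states = merged.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            for (String state : entry.getValue()) {
                if (!states.contains(state)) {
                    states.add(state);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> training = new LinkedHashMap<>();
        training.put("X1", Arrays.asList("a", "b"));
        Map<String, List<String>> test = new LinkedHashMap<>();
        test.put("X1", Arrays.asList("b", "c"));
        System.out.println(union(training, test));
    }
}
```

The merged map has the type that `setStatesVariables(Map<String, List<String>> statesVariables)` is documented to accept.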
 
- 
initialiazeStatesClassVariables
public void initialiazeStatesClassVariables()
Retrieves the states of the class variables and stores them in a Map.
- 
removeFeatureVariables
Removes the specified feature variables from the dataset.
- Parameters:
- namesFeatureVariables - names of the feature variables
 
- 
removeFeatureVariable
Removes the specified feature variable from the dataset.
- Parameters:
- nameFeatureVariable - feature variable name
 
- 
setIgnoredClassVariables
Sets the class variables to be ignored.
- Parameters:
- ignoredClassVariables - names of the class variables to ignore
 
 
-