Class DataStreamExperiment
The class includes functionalities to:
1. Load and process data from specified paths.
2. Train an initial model on a given training dataset.
3. Evaluate and adapt the model in response to incoming data.
4. Implement different strategies for handling concept drift in the data.
5. Calculate and track various performance metrics over data batches.
6. Optionally, display results visually using line charts.
7. Save experimental results to external files for further analysis.
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
List<Map<String,Double>> | executeExperiment(String pathExperiment, double detectionThreshold) | Executes a single data stream experiment for a specified path and detection threshold.
double | getMeanGlobalAccuracy() | Retrieves the mean global accuracy across all processed data batches.
double | getMeanGlobalBrierScore() | Retrieves the mean global Brier score across all processed data batches.
double | getMeanMacroAveragedF1Score() | Retrieves the mean macro-averaged F1 score across all processed data batches.
double | getMeanMeanAccuracy() | Retrieves the average mean accuracy across all processed data batches.
double | getMeanMicroAveragedF1Score() | Retrieves the mean micro-averaged F1 score across all processed data batches.
int | getNumTimesModelUpdated() | Retrieves the number of times the model was updated during the experiment.
String[] | getPathsExperiments() | Retrieves the array of dataset paths for the experiments.
void | main | The main method to execute the data stream experiments.
void | setAreResultsSaved(boolean areResultsSaved) | Sets the flag indicating whether the results of the experiment should be saved.
void | setParametersExperiment(Queue<String> args) | Sets up the parameters for a data stream experiment using a queue of arguments.
-
Constructor Details
-
DataStreamExperiment
public DataStreamExperiment()
-
-
Method Details
-
getMeanGlobalAccuracy
public double getMeanGlobalAccuracy()
Retrieves the mean global accuracy across all processed data batches.
Returns:
mean global accuracy
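The "mean across all processed data batches" can be illustrated with a minimal sketch. It assumes the per-batch metrics are available in the same shape that executeExperiment returns (a list of maps, one per batch). The class name BatchMetricAverager and the metric key "globalAccuracy" are hypothetical and are not taken from this class; how the getter computes its value internally is not documented here.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DataStreamExperiment.
final class BatchMetricAverager {

    // Averages one metric over all batches. Batches that do not contain
    // the key are skipped; an empty input yields NaN.
    static double meanOf(List<Map<String, Double>> batches, String metricKey) {
        return batches.stream()
                .map(batch -> batch.get(metricKey))
                .filter(value -> value != null)
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Two made-up batches; the key name is an assumption for illustration.
        List<Map<String, Double>> batches = List.of(
                Map.of("globalAccuracy", 0.5),
                Map.of("globalAccuracy", 1.0));
        System.out.println(meanOf(batches, "globalAccuracy"));
    }
}
```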
-
getMeanMeanAccuracy
public double getMeanMeanAccuracy()
Retrieves the average mean accuracy across all processed data batches.
Returns:
average mean accuracy
-
getMeanMacroAveragedF1Score
public double getMeanMacroAveragedF1Score()
Retrieves the mean macro-averaged F1 score across all processed data batches.
Returns:
mean macro-averaged F1 score
-
getMeanMicroAveragedF1Score
public double getMeanMicroAveragedF1Score()
Retrieves the mean micro-averaged F1 score across all processed data batches.
Returns:
mean micro-averaged F1 score
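The difference between macro and micro averaging can be shown with a small sketch using the textbook definitions: the macro average is the unweighted mean of per-class F1 scores, while the micro average pools the counts over all classes before computing a single F1. The class F1Averaging and the counts below are hypothetical; this page does not document how the class computes either score internally.

```java
// Hypothetical helper, not part of DataStreamExperiment.
final class F1Averaging {

    // F1 from raw counts: 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
    static double f1(long tp, long fp, long fn) {
        long denominator = 2 * tp + fp + fn;
        return denominator == 0 ? 0.0 : (2.0 * tp) / denominator;
    }

    // Macro average: compute F1 per class, then take the unweighted mean.
    static double macroF1(long[] tp, long[] fp, long[] fn) {
        if (tp.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int c = 0; c < tp.length; c++) {
            sum += f1(tp[c], fp[c], fn[c]);
        }
        return sum / tp.length;
    }

    // Micro average: pool the counts over all classes, then compute one F1.
    static double microF1(long[] tp, long[] fp, long[] fn) {
        long tpSum = 0;
        long fpSum = 0;
        long fnSum = 0;
        for (int c = 0; c < tp.length; c++) {
            tpSum += tp[c];
            fpSum += fp[c];
            fnSum += fn[c];
        }
        return f1(tpSum, fpSum, fnSum);
    }

    public static void main(String[] args) {
        // Made-up counts for two classes: a frequent one and a rare one.
        long[] tp = {8, 1};
        long[] fp = {2, 1};
        long[] fn = {1, 2};
        System.out.println("macro F1 = " + macroF1(tp, fp, fn));
        System.out.println("micro F1 = " + microF1(tp, fp, fn));
    }
}
```

With imbalanced classes the two averages can differ noticeably, because the macro average gives the rare class the same weight as the frequent one.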
-
getMeanGlobalBrierScore
public double getMeanGlobalBrierScore()
Retrieves the mean global Brier score across all processed data batches.
Returns:
mean global Brier score
-
getNumTimesModelUpdated
public int getNumTimesModelUpdated()
Retrieves the number of times the model was updated during the experiment.
Returns:
number of times the model was updated
-
main
The main method to execute the data stream experiments.
Parameters:
args - queue of strings representing parameters needed for setting up and executing experiments
Throws:
ErroneousValueException - if there is an issue with the values of the provided parameters
-
setParametersExperiment
Sets up the parameters for a data stream experiment using a queue of arguments. This method initializes the experiment with the necessary configurations required for its execution. Additionally, this method initializes the learning algorithms for both parameter and structure learning based on the specified parameters.
The method processes the following parameters in sequence:
1. Path to data: extracts the path to the dataset, which includes a training set and a data stream folder.
2. Batch size: determines the size of the data batches to be processed in each iteration of the experiment.
3. Concept drift adaptation strategy: specifies the method of adaptation to concept drift, such as 'LOCAL', 'GLOBAL' or no adaptation.
4. Detection thresholds: sets the thresholds for the Page-Hinkley test used in concept drift detection.
5. Magnitude threshold: defines the magnitude threshold for concept drift detection.
6. Reset after concept drift: a boolean value indicating whether the model should be reset after detecting concept drift.
7. Window size: determines the window size for the concept drift detection algorithm.
8. Class variables: specifies the names of the class variables.
9. Time variable: specifies the name of the time variable.
10. Concept drift score function: defines the scoring function to be used in concept drift detection.
11. Display charts: a boolean value indicating whether to display charts for visualizing the results.
Parameters:
args - a queue of strings containing the parameters needed to configure the experiment
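As an illustration of the argument order, the following sketch builds a queue with one entry per documented parameter. Every value is a hypothetical placeholder: the accepted spellings, separators (for example for several thresholds or class variables) and score function names are not specified on this page and are assumptions here. The class ExperimentArgsSketch is likewise hypothetical.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Hypothetical helper, not part of DataStreamExperiment.
final class ExperimentArgsSketch {

    // Builds the argument queue in the documented order.
    // All values are placeholders chosen for illustration only.
    static Queue<String> buildArgs() {
        return new ArrayDeque<>(List.of(
                "data/my_experiment", // 1. path to data (hypothetical path)
                "100",                // 2. batch size
                "LOCAL",              // 3. concept drift adaptation strategy
                "3.0",                // 4. detection thresholds (format assumed)
                "0.1",                // 5. magnitude threshold
                "false",              // 6. reset after concept drift
                "5",                  // 7. window size
                "C1,C2",              // 8. class variables (separator assumed)
                "t",                  // 9. time variable
                "Log-likelihood",     // 10. concept drift score function (name assumed)
                "false"));            // 11. display charts
    }

    public static void main(String[] args) {
        Queue<String> experimentArgs = buildArgs();
        System.out.println(experimentArgs.size() + " arguments, first: " + experimentArgs.peek());
        // With the library on the classpath, the queue would then be passed on:
        // new DataStreamExperiment().setParametersExperiment(experimentArgs);
    }
}
```

Because the parameters are consumed in sequence, the order of the entries matters more than anything else in this sketch.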
-
getPathsExperiments
Retrieves the array of dataset paths for the experiments.
Returns:
an array of strings, each representing a path to datasets
-
executeExperiment
public List<Map<String,Double>> executeExperiment(String pathExperiment, double detectionThreshold) throws ErroneousValueException
Executes a single data stream experiment for a specified path and detection threshold. This method encompasses the entire process of running an experiment, including training the initial model, setting up concept drift adaptation strategies, iterating over the data stream and saving the results.
Parameters:
pathExperiment - path to the experiment data, including training and streaming data
detectionThreshold - threshold value used for detecting concept drift
Returns:
list of maps, each map containing performance metrics for a data batch in the stream
Throws:
ErroneousValueException - if there is an issue with the provided data or configuration settings
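To give an idea of how a detection threshold and a magnitude threshold typically interact in a Page-Hinkley test, here is a minimal sketch of the textbook form of the test for detecting an increase in a monitored value. It is not taken from this class: the name PageHinkleySketch, the choice of monitored value and the update rule are assumptions, and the actual implementation may differ (for example by using a window or by monitoring decreases).

```java
// Hypothetical sketch of a textbook Page-Hinkley test, not the class's own code.
final class PageHinkleySketch {

    private final double magnitudeThreshold; // tolerated deviation from the running mean
    private final double detectionThreshold; // drift is signaled above this value
    private long count;
    private double mean;
    private double cumulativeSum;
    private double minimumSum;

    PageHinkleySketch(double magnitudeThreshold, double detectionThreshold) {
        this.magnitudeThreshold = magnitudeThreshold;
        this.detectionThreshold = detectionThreshold;
    }

    // Feeds one observation and reports whether drift is signaled.
    boolean update(double value) {
        count++;
        mean += (value - mean) / count;                         // running mean
        cumulativeSum += value - mean - magnitudeThreshold;     // cumulative deviation
        minimumSum = Math.min(minimumSum, cumulativeSum);       // smallest sum seen so far
        return cumulativeSum - minimumSum > detectionThreshold; // Page-Hinkley statistic
    }

    // Clears the state, for example after the model has been adapted.
    void reset() {
        count = 0;
        mean = 0.0;
        cumulativeSum = 0.0;
        minimumSum = 0.0;
    }

    public static void main(String[] args) {
        PageHinkleySketch test = new PageHinkleySketch(0.05, 2.0);
        // Made-up stream: a stable phase followed by a jump in the monitored value.
        for (int i = 0; i < 60; i++) {
            double value = i < 50 ? 0.0 : 1.0;
            if (test.update(value)) {
                System.out.println("drift signaled at observation " + i);
                break;
            }
        }
    }
}
```

In this form, a larger detection threshold makes the test slower to react but less prone to false alarms, while the magnitude threshold controls how large a deviation is ignored as noise.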
-
setAreResultsSaved
public void setAreResultsSaved(boolean areResultsSaved)
Sets the flag indicating whether the results of the experiment should be saved.
Parameters:
areResultsSaved - a boolean value specifying whether to save the experiment results. `true` to save results, `false` otherwise.
-