java.lang.Object
es.upm.fi.cig.multictbnc.experiments.implementationsexperiments.DataStreamExperiment

public class DataStreamExperiment extends Object
Represents an experiment for evaluating continuous-time Bayesian network classifiers on streaming data. The class runs experiments on data streams and covers stream processing, concept drift detection and adaptation. It is configured through parameters such as the batch size, the concept drift adaptation strategy and the drift detection threshold, so it can be applied to different streaming data scenarios.

The class includes functionality to:
1. Load and process data from specified paths.
2. Train an initial model on a given training dataset.
3. Evaluate and adapt the model in response to incoming data.
4. Implement different strategies for handling concept drift in the data.
5. Calculate and track various performance metrics over data batches.
6. Optionally, display results visually using line charts.
7. Save experimental results to external files for further analysis.

  • Constructor Details

    • DataStreamExperiment

      public DataStreamExperiment()
  • Method Details

    • getMeanGlobalAccuracy

      public double getMeanGlobalAccuracy()
      Retrieves the mean global accuracy across all processed data batches.
      Returns:
      mean global accuracy
    • getMeanMeanAccuracy

      public double getMeanMeanAccuracy()
      Retrieves the average mean accuracy across all processed data batches.
      Returns:
      average mean accuracy
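      The difference between global accuracy and mean accuracy can be sketched as follows, assuming the usual multi-dimensional classification definitions: global accuracy counts an instance as correct only when every class variable is predicted correctly, whereas mean accuracy averages the per-class-variable accuracies. The class `AccuracySketch` and its methods are hypothetical and are not part of this library.

```java
import java.util.Arrays;

public class AccuracySketch {

    // Fraction of instances whose class variables are ALL predicted correctly.
    public static double globalAccuracy(String[][] actual, String[][] predicted) {
        int exactMatches = 0;
        for (int i = 0; i < actual.length; i++) {
            if (Arrays.equals(actual[i], predicted[i])) {
                exactMatches++;
            }
        }
        return (double) exactMatches / actual.length;
    }

    // Average, over class variables, of the accuracy obtained for each variable.
    public static double meanAccuracy(String[][] actual, String[][] predicted) {
        int numClassVariables = actual[0].length;
        double sumOfAccuracies = 0.0;
        for (int c = 0; c < numClassVariables; c++) {
            int correct = 0;
            for (int i = 0; i < actual.length; i++) {
                if (actual[i][c].equals(predicted[i][c])) {
                    correct++;
                }
            }
            sumOfAccuracies += (double) correct / actual.length;
        }
        return sumOfAccuracies / numClassVariables;
    }

    public static void main(String[] args) {
        // Four instances, two class variables each (made-up labels).
        String[][] actual = {{"a", "x"}, {"b", "y"}, {"a", "y"}, {"b", "x"}};
        String[][] predicted = {{"a", "x"}, {"b", "x"}, {"a", "y"}, {"a", "y"}};
        System.out.println("Global accuracy: " + globalAccuracy(actual, predicted));
        System.out.println("Mean accuracy:   " + meanAccuracy(actual, predicted));
    }
}
```

      Under these assumed definitions, global accuracy can never exceed mean accuracy, because an exact match on all class variables implies a match on each one.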
    • getMeanMacroAveragedF1Score

      public double getMeanMacroAveragedF1Score()
      Retrieves the mean macro-averaged F1 score across all processed data batches.
      Returns:
      mean macro-averaged F1 score
    • getMeanMicroAveragedF1Score

      public double getMeanMicroAveragedF1Score()
      Retrieves the mean micro-averaged F1 score across all processed data batches.
      Returns:
      mean micro-averaged F1 score
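      As a rough illustration of how the macro- and micro-averaged F1 scores differ, the sketch below uses the standard definitions over per-group confusion counts: macro-averaging takes the mean of the per-group F1 scores, while micro-averaging pools the counts first. Whether the groups are classes or class variables in this library is an assumption left open here, and `F1Sketch` is a hypothetical helper.

```java
public class F1Sketch {

    // F1 from raw counts; returns 0 when there are no positives at all.
    public static double f1(int tp, int fp, int fn) {
        int denominator = 2 * tp + fp + fn;
        return denominator == 0 ? 0.0 : 2.0 * tp / denominator;
    }

    // Macro-average: compute F1 per group, then take the unweighted mean.
    public static double macroF1(int[] tp, int[] fp, int[] fn) {
        double sum = 0.0;
        for (int i = 0; i < tp.length; i++) {
            sum += f1(tp[i], fp[i], fn[i]);
        }
        return sum / tp.length;
    }

    // Micro-average: pool the counts of all groups, then compute a single F1.
    public static double microF1(int[] tp, int[] fp, int[] fn) {
        int totalTp = 0, totalFp = 0, totalFn = 0;
        for (int i = 0; i < tp.length; i++) {
            totalTp += tp[i];
            totalFp += fp[i];
            totalFn += fn[i];
        }
        return f1(totalTp, totalFp, totalFn);
    }

    public static void main(String[] args) {
        // Made-up counts for two groups of different size.
        int[] tp = {1, 4};
        int[] fp = {1, 0};
        int[] fn = {1, 0};
        System.out.println("Macro-averaged F1: " + macroF1(tp, fp, fn));
        System.out.println("Micro-averaged F1: " + microF1(tp, fp, fn));
    }
}
```

      With imbalanced groups the two averages diverge, since the micro-average is dominated by the groups that contribute the most counts.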
    • getMeanGlobalBrierScore

      public double getMeanGlobalBrierScore()
      Retrieves the mean global Brier score across all processed data batches.
      Returns:
      mean global Brier score
    • getNumTimesModelUpdated

      public int getNumTimesModelUpdated()
      Retrieves the number of times the model was updated during the experiment.
      Returns:
      number of times the model was updated
    • main

      public void main(Queue<String> args) throws ErroneousValueException
      The main method to execute the data stream experiments.
      Parameters:
      args - queue of strings representing parameters needed for setting up and executing experiments.
      Throws:
      ErroneousValueException - if there is an issue with the values of the provided parameters
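      A minimal sketch of how the argument queue might be assembled, assuming it follows the parameter order documented for setParametersExperiment. Every value is a placeholder and the accepted string formats (for example, how several class variables or thresholds are encoded) are assumptions; `ExperimentArgsSketch` is a hypothetical helper that does not call the library.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class ExperimentArgsSketch {

    // Builds a queue in the order listed for setParametersExperiment.
    // All values are placeholders; their formats are assumptions, not the library's.
    public static Queue<String> buildArgs() {
        return new ArrayDeque<>(Arrays.asList(
                "path/to/dataset",      // 1. path to data (training set and stream folder)
                "100",                  // 2. batch size
                "LOCAL",                // 3. concept drift adaptation strategy
                "0.5",                  // 4. detection threshold(s) for the Page-Hinkley test
                "0.1",                  // 5. magnitude threshold
                "false",                // 6. reset after concept drift
                "5",                    // 7. window size
                "C1,C2",                // 8. class variables (separator is an assumption)
                "t",                    // 9. time variable
                "SCORE_FUNCTION_NAME",  // 10. concept drift score function (placeholder)
                "false"));              // 11. display charts
    }

    public static void main(String[] args) {
        Queue<String> experimentArgs = buildArgs();
        int position = 1;
        // Arguments are consumed from the head of the queue, one at a time.
        while (!experimentArgs.isEmpty()) {
            System.out.println(position++ + ": " + experimentArgs.poll());
        }
    }
}
```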
    • setParametersExperiment

      public void setParametersExperiment(Queue<String> args)
      Sets up the parameters for a data stream experiment from a queue of arguments, providing the configuration required for its execution. The method also initializes the parameter and structure learning algorithms according to the specified parameters.

      The method processes the following parameters in sequence:
      1. Path to data: extracts the path to the dataset, which includes a training set and a data stream folder.
      2. Batch size: determines the size of the data batches to be processed in each iteration of the experiment.
      3. Concept drift adaptation strategy: specifies the method of adaptation to concept drift, such as 'LOCAL', 'GLOBAL' or no adaptation.
      4. Detection thresholds: sets the thresholds for the Page-Hinkley test used in concept drift detection.
      5. Magnitude threshold: defines the magnitude threshold for concept drift detection.
      6. Reset after concept drift: a boolean value indicating whether the model should be reset after detecting concept drift.
      7. Window size: determines the window size for the concept drift detection algorithm.
      8. Class variables: specifies the names of the class variables.
      9. Time variable: specifies the name of the time variable.
      10. Concept drift score function: defines the scoring function to be used in concept drift detection.
      11. Display charts: a boolean value indicating whether to display charts for visualizing the results.

      Parameters:
      args - a queue of strings containing the parameters needed to configure the experiment
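      To illustrate how a detection threshold and a magnitude threshold typically interact in a Page-Hinkley test, here is a textbook version of the test for an upward shift in the mean of a monitored value. It is a generic sketch, not this library's implementation: the class `PageHinkleySketch` is hypothetical, the mapping of the two thresholds onto parameters 4 and 5 is an assumption, and the window size parameter is not modelled.

```java
public class PageHinkleySketch {

    private final double magnitudeThreshold; // tolerated deviation from the running mean
    private final double detectionThreshold; // alarm level for the test statistic
    private long numObservations = 0;
    private double runningMean = 0.0;
    private double cumulativeDeviation = 0.0;
    private double minCumulativeDeviation = 0.0;

    public PageHinkleySketch(double magnitudeThreshold, double detectionThreshold) {
        this.magnitudeThreshold = magnitudeThreshold;
        this.detectionThreshold = detectionThreshold;
    }

    // Feeds one observation; returns true when an increase in the mean is signalled.
    public boolean update(double value) {
        numObservations++;
        runningMean += (value - runningMean) / numObservations;
        cumulativeDeviation += value - runningMean - magnitudeThreshold;
        minCumulativeDeviation = Math.min(minCumulativeDeviation, cumulativeDeviation);
        return cumulativeDeviation - minCumulativeDeviation > detectionThreshold;
    }

    // Clears the accumulated statistics, e.g. after a drift has been handled.
    public void reset() {
        numObservations = 0;
        runningMean = 0.0;
        cumulativeDeviation = 0.0;
        minCumulativeDeviation = 0.0;
    }

    public static void main(String[] args) {
        PageHinkleySketch detector = new PageHinkleySketch(0.1, 3.0);
        // Made-up stream: 50 stable values followed by a shift to a higher level.
        for (int i = 0; i < 70; i++) {
            double value = i < 50 ? 0.0 : 1.0;
            if (detector.update(value)) {
                System.out.println("Drift signalled at observation " + i);
                detector.reset();
                break;
            }
        }
    }
}
```

      In this formulation a larger detection threshold reduces false alarms at the cost of slower detection, while the magnitude threshold sets how large a deviation is ignored as noise.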
    • getPathsExperiments

      public String[] getPathsExperiments()
      Retrieves the array of dataset paths for the experiments.
      Returns:
      an array of strings, each representing a path to datasets
    • executeExperiment

      public List<Map<String,Double>> executeExperiment(String pathExperiment, double detectionThreshold) throws ErroneousValueException
      Executes a single data stream experiment for a specified path and detection threshold. This method encompasses the entire process of running an experiment, including training the initial model, setting up concept drift adaptation strategies, iterating over the data stream and saving the results.
      Parameters:
      pathExperiment - path to the experiment data, including training and streaming data
      detectionThreshold - threshold value used for detecting concept drift
      Returns:
      list of maps, each map containing performance metrics for a data batch in the stream
      Throws:
      ErroneousValueException - if there is an issue with the provided data or configuration settings
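      The returned list of per-batch metric maps can be summarized by averaging each metric over the batches, which is presumably what the getMean... accessors report. The sketch below shows one way to do this on plain collections; `BatchMetricsSketch` and the metric name used in it are hypothetical, and the actual keys of the maps are not documented here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchMetricsSketch {

    // Averages every metric over the batches in which it appears.
    public static Map<String, Double> meanPerMetric(List<Map<String, Double>> resultsPerBatch) {
        Map<String, Double> sums = new LinkedHashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, Double> batchResults : resultsPerBatch) {
            for (Map.Entry<String, Double> metric : batchResults.entrySet()) {
                sums.merge(metric.getKey(), metric.getValue(), Double::sum);
                counts.merge(metric.getKey(), 1, Integer::sum);
            }
        }
        Map<String, Double> means = new LinkedHashMap<>();
        for (Map.Entry<String, Double> sum : sums.entrySet()) {
            means.put(sum.getKey(), sum.getValue() / counts.get(sum.getKey()));
        }
        return means;
    }

    public static void main(String[] args) {
        // Made-up results for two batches; the key is a placeholder metric name.
        List<Map<String, Double>> resultsPerBatch = new ArrayList<>();
        Map<String, Double> firstBatch = new HashMap<>();
        firstBatch.put("Global accuracy", 0.5);
        Map<String, Double> secondBatch = new HashMap<>();
        secondBatch.put("Global accuracy", 1.0);
        resultsPerBatch.add(firstBatch);
        resultsPerBatch.add(secondBatch);
        System.out.println(meanPerMetric(resultsPerBatch));
    }
}
```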
    • setAreResultsSaved

      public void setAreResultsSaved(boolean areResultsSaved)
      Sets the flag indicating whether the results of the experiment should be saved.
      Parameters:
      areResultsSaved - a boolean value specifying whether to save the experiment results. `true` to save results, `false` otherwise.