Methods
- class gatree.gatree.GATree(max_depth=None, random=None, fitness_function=None, n_jobs=1, random_state=None)
Bases:
BaseEstimator
Evolutionary decision tree classifier. The GATree classifier is a decision tree classifier that is trained using a genetic algorithm. The genetic algorithm is used to evolve a population of trees over multiple generations. The fitness of each tree is evaluated using a fitness function, which is used to select the best trees for crossover and mutation.
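The description above mentions crossover between trees without naming the operator. A common choice in tree-based evolutionary methods is subtree crossover, sketched below on a hypothetical ToyNode class. This is an illustration under that assumption, not code from the gatree source; ToyNode, all_nodes and subtree_crossover are names invented for the example.

```python
import copy
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyNode:
    """Hypothetical tree node: a label plus optional children."""
    label: str
    left: Optional["ToyNode"] = None
    right: Optional["ToyNode"] = None


def all_nodes(root: ToyNode) -> List[ToyNode]:
    """Collect every node of the tree in pre-order."""
    nodes = [root]
    for child in (root.left, root.right):
        if child is not None:
            nodes.extend(all_nodes(child))
    return nodes


def subtree_crossover(a: ToyNode, b: ToyNode, rng: random.Random) -> ToyNode:
    """Return a copy of `a` in which one random subtree is replaced
    by a copy of a random subtree of `b`. Neither parent is modified."""
    child = copy.deepcopy(a)
    target = rng.choice(all_nodes(child))
    donor = copy.deepcopy(rng.choice(all_nodes(b)))
    # Overwrite the target node in place so its parent link stays valid.
    target.label, target.left, target.right = donor.label, donor.left, donor.right
    return child


parent_a = ToyNode("a0", ToyNode("a1"), ToyNode("a2"))
parent_b = ToyNode("b0", ToyNode("b1"), ToyNode("b2"))
offspring = subtree_crossover(parent_a, parent_b, random.Random(0))
labels = [n.label for n in all_nodes(offspring)]
```

Working on deep copies keeps both parents intact, so they can take part in further selection rounds within the same generation.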
- Parameters:
max_depth (int, optional) – Maximum depth of the tree.
random (Random, optional) – Random number generator.
fitness_function (function, optional) – Fitness function for the genetic algorithm.
n_jobs (int, optional) – Number of jobs to run in parallel.
random_state (int, optional) – Seed for reproducibility.
- max_depth
Maximum depth of the tree.
- Type:
int, optional
- random
Random number generator.
- Type:
Random
- X
Training data.
- Type:
pandas.DataFrame
- y
Target values.
- Type:
pandas.Series
- att_indexes
Array of attribute indexes.
- Type:
numpy.ndarray
- att_values
Dictionary of attribute values.
- Type:
dict
- class_count
Number of classes.
- Type:
int
- fitness_function
Fitness function for the genetic algorithm.
- Type:
function
- n_jobs
Number of jobs to run in parallel.
- Type:
int
- random_state
Seed for reproducibility.
- Type:
int
- _best_fitness
List of best fitness values for each iteration.
- Type:
list
- _avg_fitness
List of average fitness values for each iteration.
- Type:
list
- static default_fitness_function(root, **fitness_function_kwargs)
Default fitness function for the genetic algorithm.
- Parameters:
root (Node) – Root node of the tree.
- Returns:
The fitness value.
- Return type:
float
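As a rough illustration of the documented signature (root, **fitness_function_kwargs) returning a float, the sketch below scores a tree by its error rate plus a size penalty. Everything about the root object here is an assumption: ToyRoot, its y_true, y_pred and n_nodes attributes, and the size_penalty keyword are hypothetical stand-ins. The sketch also assumes that lower values are better, which this page does not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyRoot:
    """Hypothetical stand-in for a tree root; these attributes are assumptions."""
    y_true: List[int] = field(default_factory=list)
    y_pred: List[int] = field(default_factory=list)
    n_nodes: int = 1


def size_penalized_error(root, **fitness_function_kwargs) -> float:
    """Error rate plus a per-node penalty (lower is better in this sketch)."""
    # `size_penalty` is a hypothetical keyword used only for illustration.
    penalty = fitness_function_kwargs.get("size_penalty", 0.01)
    wrong = sum(t != p for t, p in zip(root.y_true, root.y_pred))
    error = wrong / len(root.y_true)
    return error + penalty * root.n_nodes


root = ToyRoot(y_true=[0, 1, 1, 0], y_pred=[0, 1, 0, 0], n_nodes=5)
score = size_penalized_error(root, size_penalty=0.02)  # 1/4 error + 0.02 * 5 nodes
```

Accepting **fitness_function_kwargs lets extra settings be supplied at fit time without changing the function's positional signature.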
- fit(X, y, population_size=150, max_iter=2000, mutation_probability=0.1, elite_size=1, selection_tournament_size=2, fitness_function_kwargs={})
Fit a tree to a training set. The population size, maximum iterations, mutation probability, elite size, and selection tournament size can be specified.
- Parameters:
X (pandas.DataFrame) – Training data.
y (pandas.Series) – Target values.
population_size (int, optional) – Size of the population.
max_iter (int, optional) – Maximum number of iterations.
mutation_probability (float, optional) – Probability of mutation.
elite_size (int, optional) – Number of elite trees.
selection_tournament_size (int, optional) – Number of trees in tournament.
fitness_function_kwargs (dict, optional) – Additional keyword arguments to be passed to the fitness_function.
- Returns:
The fitted tree.
- Return type:
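To make the role of each fit hyperparameter concrete, here is a minimal generic genetic algorithm loop that uses the same parameter names. It evolves bit lists rather than decision trees, minimises its fitness, and uses one-point crossover and single-bit mutation. All of those are simplifying assumptions for the sketch and are not taken from the gatree implementation.

```python
import random
from typing import Callable, List

Genome = List[int]


def tournament(pop: List[Genome], fitness: Callable[[Genome], float],
               size: int, rng: random.Random) -> Genome:
    """Draw `size` random candidates and return the one with the lowest score."""
    return min(rng.sample(pop, size), key=fitness)


def evolve(fitness: Callable[[Genome], float], genome_len: int = 12,
           population_size: int = 30, max_iter: int = 40,
           mutation_probability: float = 0.1, elite_size: int = 1,
           selection_tournament_size: int = 2, seed: int = 0) -> Genome:
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(population_size)]
    for _ in range(max_iter):
        pop.sort(key=fitness)
        next_pop = [g[:] for g in pop[:elite_size]]  # elites survive unchanged
        while len(next_pop) < population_size:
            a = tournament(pop, fitness, selection_tournament_size, rng)
            b = tournament(pop, fitness, selection_tournament_size, rng)
            cut = rng.randrange(1, genome_len)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_probability:  # flip one random bit
                i = rng.randrange(genome_len)
                child[i] = 1 - child[i]
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=fitness)


def count_zeros(genome: Genome) -> float:
    """Toy objective: fewer zeros is better."""
    return float(genome.count(0))


best = evolve(count_zeros)
```

In this sketch a larger selection_tournament_size increases selection pressure, while elite_size guarantees that the best score never gets worse from one iteration to the next.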
- plot(node=None, prefix='')
Plot the decision tree with nodes and leaves.
- Parameters:
node (Node, optional) – Current node to plot.
prefix (str, optional) – Prefix for the current node.
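The node and prefix arguments suggest a recursive text rendering in which the prefix grows with depth. The sketch below shows that general pattern on a hypothetical ToyNode class. It is an assumption about the style of output, not a reproduction of what plot actually prints.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyNode:
    """Hypothetical node: a text label plus optional children."""
    label: str
    left: Optional["ToyNode"] = None
    right: Optional["ToyNode"] = None


def render(node: ToyNode, prefix: str = "") -> List[str]:
    """Return one line per node, indenting children with a longer prefix."""
    lines = [prefix + node.label]
    for child in (node.left, node.right):
        if child is not None:
            lines.extend(render(child, prefix + "  "))
    return lines


tree = ToyNode(
    "x0 > 2.5",
    ToyNode("class A"),
    ToyNode("x1 > 1.0", ToyNode("class B"), ToyNode("class C")),
)
lines = render(tree)
print("\n".join(lines))
```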
- predict(X)
Predict classes for the given data.
- Parameters:
X (pandas.DataFrame) – Data to predict.
- Returns:
Predicted classes.
- Return type:
list
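Prediction with a decision tree generally means walking each sample from the root to a leaf. The following sketch shows that traversal on a hypothetical ToyNode with numeric threshold tests, using plain dictionaries in place of a pandas.DataFrame. The node layout and the "greater than" split rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyNode:
    """Hypothetical node: internal nodes test row[att] > threshold, leaves hold a class."""
    att: Optional[str] = None
    threshold: float = 0.0
    left: Optional["ToyNode"] = None   # followed when the test is False
    right: Optional["ToyNode"] = None  # followed when the test is True
    cls: Optional[str] = None          # set only on leaves


def predict_one(node: ToyNode, row: Dict[str, float]) -> str:
    """Descend from the root until a leaf is reached and return its class."""
    while node.cls is None:
        node = node.right if row[node.att] > node.threshold else node.left
    return node.cls


def predict(root: ToyNode, rows: List[Dict[str, float]]) -> List[str]:
    """Predict one class per row, returning a list as the documented method does."""
    return [predict_one(root, row) for row in rows]


tree = ToyNode(
    "petal_len", 2.5,
    left=ToyNode(cls="setosa"),
    right=ToyNode("petal_wid", 1.7,
                  left=ToyNode(cls="versicolor"),
                  right=ToyNode(cls="virginica")),
)
rows = [
    {"petal_len": 1.4, "petal_wid": 0.2},
    {"petal_len": 4.5, "petal_wid": 1.5},
    {"petal_len": 5.8, "petal_wid": 2.2},
]
predicted = predict(tree, rows)  # ['setosa', 'versicolor', 'virginica']
```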
- set_fit_request(*, elite_size: bool | None | str = '$UNCHANGED$', fitness_function_kwargs: bool | None | str = '$UNCHANGED$', max_iter: bool | None | str = '$UNCHANGED$', mutation_probability: bool | None | str = '$UNCHANGED$', population_size: bool | None | str = '$UNCHANGED$', selection_tournament_size: bool | None | str = '$UNCHANGED$') GATree
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
elite_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for elite_size parameter in fit.
fitness_function_kwargs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fitness_function_kwargs parameter in fit.
max_iter (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for max_iter parameter in fit.
mutation_probability (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for mutation_probability parameter in fit.
population_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for population_size parameter in fit.
selection_tournament_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for selection_tournament_size parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
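The four request options (True, False, None, or an alias string) can be read as a small decision table. The toy function below models that table in plain Python to show which user-supplied values would reach fit. It is a simplified illustration, not scikit-learn's routing code: route_fit_metadata is a hypothetical name, and the ValueError stands in for whatever error the real meta-estimator raises.

```python
from typing import Any, Dict, Union

Request = Union[bool, None, str]


def route_fit_metadata(requests: Dict[str, Request],
                       provided: Dict[str, Any]) -> Dict[str, Any]:
    """Decide which user-supplied metadata would be forwarded to fit (toy model)."""
    routed: Dict[str, Any] = {}
    for param, request in requests.items():
        if request is True:
            # Requested: forward it if the user supplied it, otherwise ignore.
            if param in provided:
                routed[param] = provided[param]
        elif request is False:
            # Not requested: never forwarded.
            continue
        elif request is None:
            # Not requested, and supplying it is treated as an error.
            if param in provided:
                raise ValueError(f"{param!r} was passed but is not requested")
        else:
            # A string is an alias: look the value up under that name.
            if request in provided:
                routed[param] = provided[request]
    return routed


requests = {"max_iter": True, "population_size": False, "elite_size": "n_elite"}
provided = {"max_iter": 500, "population_size": 80, "n_elite": 3}
routed = route_fit_metadata(requests, provided)  # {'max_iter': 500, 'elite_size': 3}
```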
- class gatree.methods.gatreeclassifier.GATreeClassifier(max_depth=None, random=None, fitness_function=None, n_jobs=1, random_state=None)
Bases:
ClassifierMixin, GATree
Evolutionary decision tree classifier. The GATree classifier is a decision tree classifier that is trained using a genetic algorithm. The genetic algorithm is used to evolve a population of trees over multiple generations. The fitness of each tree is evaluated using a fitness function, which is used to select the best trees for crossover and mutation.
- Parameters:
max_depth (int, optional) – Maximum depth of the tree.
random (Random, optional) – Random number generator.
fitness_function (function, optional) – Fitness function for the genetic algorithm.
n_jobs (int, optional) – Number of jobs to run in parallel.
random_state (int, optional) – Seed for reproducibility.
- max_depth
Maximum depth of the tree.
- Type:
int, optional
- random
Random number generator.
- Type:
Random
- X
Training data.
- Type:
pandas.DataFrame
- y
Target values.
- Type:
pandas.Series
- att_indexes
Array of attribute indexes.
- Type:
numpy.ndarray
- att_values
Dictionary of attribute values.
- Type:
dict
- class_count
Number of classes.
- Type:
int
- fitness_function
Fitness function for the genetic algorithm.
- Type:
function
- n_jobs
Number of jobs to run in parallel.
- Type:
int
- random_state
Seed for reproducibility.
- Type:
int
- _best_fitness
List of best fitness values for each iteration.
- Type:
list
- _avg_fitness
List of average fitness values for each iteration.
- Type:
list
- static default_fitness_function(root, **fitness_function_kwargs)
Default fitness function for the genetic algorithm.
- Parameters:
root (Node) – Root node of the tree.
- Returns:
The fitness value.
- Return type:
float
- fit(X, y, population_size=150, max_iter=2000, mutation_probability=0.1, elite_size=1, selection_tournament_size=2, fitness_function_kwargs={})
Fit a tree to a training set. The population size, maximum iterations, mutation probability, elite size, and selection tournament size can be specified.
- Parameters:
X (pandas.DataFrame) – Training data.
y (pandas.Series) – Target values.
population_size (int, optional) – Size of the population.
max_iter (int, optional) – Maximum number of iterations.
mutation_probability (float, optional) – Probability of mutation.
elite_size (int, optional) – Number of elite trees.
selection_tournament_size (int, optional) – Number of trees in tournament.
fitness_function_kwargs (dict, optional) – Additional keyword arguments to be passed to the fitness_function.
- Returns:
The fitted tree.
- Return type:
- set_fit_request(*, elite_size: bool | None | str = '$UNCHANGED$', fitness_function_kwargs: bool | None | str = '$UNCHANGED$', max_iter: bool | None | str = '$UNCHANGED$', mutation_probability: bool | None | str = '$UNCHANGED$', population_size: bool | None | str = '$UNCHANGED$', selection_tournament_size: bool | None | str = '$UNCHANGED$') GATreeClassifier
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
elite_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for elite_size parameter in fit.
fitness_function_kwargs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fitness_function_kwargs parameter in fit.
max_iter (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for max_iter parameter in fit.
mutation_probability (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for mutation_probability parameter in fit.
population_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for population_size parameter in fit.
selection_tournament_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for selection_tournament_size parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') GATreeClassifier
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
- class gatree.methods.gatreeclustering.GATreeClustering(max_depth=None, random=None, fitness_function=None, n_jobs=1, random_state=None, min_clusters=2, max_clusters=10)
Bases:
ClusterMixin, GATree
Evolutionary decision tree clustering. GATreeClustering is a decision tree clustering model that is trained using a genetic algorithm. The genetic algorithm is used to evolve a population of trees over multiple generations. The fitness of each tree is evaluated using a fitness function, which is used to select the best trees for crossover and mutation.
- Parameters:
max_depth (int, optional) – Maximum depth of the tree.
random (Random, optional) – Random number generator.
fitness_function (function, optional) – Fitness function for the genetic algorithm.
n_jobs (int, optional) – Number of jobs to run in parallel.
random_state (int, optional) – Seed for reproducibility.
min_clusters (int, optional) – Minimum number of clusters.
max_clusters (int, optional) – Maximum number of clusters.
- max_depth
Maximum depth of the tree.
- Type:
int, optional
- random
Random number generator.
- Type:
Random
- X
Training data.
- Type:
pandas.DataFrame
- y
Target values.
- Type:
pandas.Series
- att_indexes
Array of attribute indexes.
- Type:
numpy.ndarray
- att_values
Dictionary of attribute values.
- Type:
dict
- class_count
Number of classes.
- Type:
int
- fitness_function
Fitness function for the genetic algorithm.
- Type:
function
- n_jobs
Number of jobs to run in parallel.
- Type:
int
- random_state
Seed for reproducibility.
- Type:
int
- min_clusters
Minimum number of clusters.
- Type:
int
- max_clusters
Maximum number of clusters.
- Type:
int
- _best_fitness
List of best fitness values for each iteration.
- Type:
list
- _avg_fitness
List of average fitness values for each iteration.
- Type:
list
- static default_fitness_function(root, **fitness_function_kwargs)
Default fitness function for the genetic algorithm.
- Parameters:
root (Node) – Root node of the tree.
- Returns:
The fitness value.
- Return type:
float
- fit(X, population_size=150, max_iter=2000, mutation_probability=0.1, elite_size=1, selection_tournament_size=2, fitness_function_kwargs={})
Fit a tree to a training set. The population size, maximum iterations, mutation probability, elite size, and selection tournament size can be specified.
- Parameters:
X (pandas.DataFrame) – Training data.
population_size (int, optional) – Size of the population.
max_iter (int, optional) – Maximum number of iterations.
mutation_probability (float, optional) – Probability of mutation.
elite_size (int, optional) – Number of elite trees.
selection_tournament_size (int, optional) – Number of trees in tournament.
fitness_function_kwargs (dict, optional) – Additional keyword arguments to be passed to the fitness_function.
- Returns:
The fitted tree.
- Return type:
- set_fit_request(*, elite_size: bool | None | str = '$UNCHANGED$', fitness_function_kwargs: bool | None | str = '$UNCHANGED$', max_iter: bool | None | str = '$UNCHANGED$', mutation_probability: bool | None | str = '$UNCHANGED$', population_size: bool | None | str = '$UNCHANGED$', selection_tournament_size: bool | None | str = '$UNCHANGED$') GATreeClustering
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
elite_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for elite_size parameter in fit.
fitness_function_kwargs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fitness_function_kwargs parameter in fit.
max_iter (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for max_iter parameter in fit.
mutation_probability (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for mutation_probability parameter in fit.
population_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for population_size parameter in fit.
selection_tournament_size (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for selection_tournament_size parameter in fit.
- Returns:
self – The updated object.
- Return type:
object