pylearn.clustering package
Submodules
pylearn.clustering.clustering module
- class pylearn.clustering.clustering.Clustering(k=3)
Bases: object
An abstract base class for clustering algorithms. Defines the basic properties and methods that concrete algorithms can use.
- Attributes:
- k (int):
Number of clusters
- centroids (numpy.ndarray):
Matrix of centroids of all k clusters
- data_points (numpy.ndarray):
Matrix of all data points
- data_points_to_cluster (list):
List of each data point’s assigned cluster
- clusters (list):
List of all k clusters
- assigned_clusters(clusters: list | str | int) -> list[tuple]
Returns all data points assigned to the given cluster(s).
- Parameters:
- clusters (list | str | int):
Cluster name(s)
- Returns:
List of the data points
- static euclidean_distance(x: ndarray, centroids: ndarray) -> ndarray
Calculates the Euclidean distance from a data point x to each of the k centroids.
- Parameters:
- x (numpy.ndarray):
Data point (vector)
- centroids (numpy.ndarray):
Centroids in a matrix (each row is one centroid)
- Returns:
Array of the distances
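The distance computation described above can be sketched with NumPy broadcasting. This is a standalone illustration, not the pylearn source: the function below is a hypothetical stand-in that only mirrors the documented signature.

```python
import numpy as np

def euclidean_distance(x: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Illustrative stand-in: distance from one point x to every centroid row."""
    # Broadcasting subtracts x from each row of centroids;
    # the norm is then taken row by row (axis=1).
    return np.linalg.norm(centroids - x, axis=1)

x = np.array([0.0, 0.0])
centroids = np.array([[3.0, 4.0], [0.0, 1.0], [0.0, 0.0]])
distances = euclidean_distance(x, centroids)
print(distances)  # [5. 1. 0.]
```

The result has one entry per centroid, so taking its argmin gives the index of the closest centroid.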
pylearn.clustering.k_means module
- class pylearn.clustering.k_means.KMeans(k=3)
Bases: Clustering
The k-means algorithm computes clusters by calculating the mean of the points in each cluster.
- Attributes:
- k (int):
Number of clusters
- centroids (numpy.ndarray):
Matrix of centroids of all k clusters
- data_points (numpy.ndarray):
Matrix of all data points
- data_points_to_cluster (list):
List of each data point’s assigned cluster
- clusters (list):
List of all k clusters
- fit(X: ndarray, max_iterations=500, threshold=0.001) -> list
Assigns each data point to its closest cluster by calculating its distance to every centroid.
- Parameters:
- X (numpy.ndarray):
Matrix of data points (each row is one data point)
- max_iterations (int, optional):
Maximum number of iterations for updating the centroids, default: 500
- threshold (float, optional):
Stopping criterion to interrupt the update iterations, default: 0.001
- Returns:
A list of the clusters assigned to the data points
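As a rough illustration of how such a fit loop might work, here is a minimal k-means sketch in plain NumPy. It is not the pylearn implementation. Several details are assumptions made for the example: the function name k_means_sketch is hypothetical, the initial centroids are simply the first k rows of X, threshold is interpreted as the largest centroid movement allowed before stopping, and the centroids are returned alongside the labels.

```python
import numpy as np

def k_means_sketch(X, k=3, max_iterations=500, threshold=0.001):
    """Hypothetical stand-in for KMeans.fit; not the pylearn implementation."""
    # Assumption: the first k data points serve as the initial centroids.
    centroids = X[:k].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iterations):
        # Assignment step: distances of every point to every centroid, shape (n, k).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: mean of each cluster's points;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Assumption: stop once no centroid moved farther than threshold.
        shift = np.linalg.norm(new_centroids - centroids, axis=1).max()
        centroids = new_centroids
        if shift < threshold:
            break
    return labels.tolist(), centroids

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = k_means_sketch(X, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Taking the first k rows keeps the example deterministic. Random or spread-out initialization is common in practice because k-means only finds a local optimum.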
pylearn.clustering.k_medoids module
- class pylearn.clustering.k_medoids.KMedoids(k=3)
Bases: Clustering
The k-medoids algorithm computes clusters by calculating the median of the points in each cluster. Each centroid must itself be a data point.
- Attributes:
- k (int):
Number of clusters
- centroids (numpy.ndarray):
Matrix of centroids of all k clusters
- data_points (numpy.ndarray):
Matrix of all data points
- data_points_to_cluster (list):
List of each data point’s assigned cluster
- clusters (list):
List of all k clusters
- fit(X: ndarray, max_iterations=500, threshold=0.001) -> list
- Parameters:
- X (numpy.ndarray):
Matrix of data points (each row is one data point)
- max_iterations (int, optional):
Maximum number of iterations for updating the centroids, default: 500
- threshold (float, optional):
Stopping criterion to interrupt the update iterations, default: 0.001
- Returns:
A list of the clusters assigned to the data points
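The constraint that every centroid must be a data point can be illustrated with a small k-medoids sketch. It is not the pylearn implementation, and it rests on assumptions: the name k_medoids_sketch is hypothetical, the first k rows of X are the initial medoids, and the update uses the common medoid rule (the cluster member with the smallest total distance to the other members), which may differ from how pylearn picks its centroids. Because medoids are chosen from a finite set, the sketch stops when they no longer change and leaves out the threshold parameter.

```python
import numpy as np

def k_medoids_sketch(X, k=3, max_iterations=500):
    """Hypothetical stand-in for KMedoids.fit; not the pylearn implementation."""
    # Assumption: the first k data points serve as the initial medoids.
    medoid_idx = np.arange(k)
    # Pairwise distances between all data points, shape (n, n).
    pairwise = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its closest medoid.
        labels = pairwise[:, medoid_idx].argmin(axis=1)
        new_idx = medoid_idx.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # an empty cluster keeps its previous medoid
            # Update step: the member with the smallest summed distance
            # to the other members becomes the new medoid.
            within = pairwise[np.ix_(members, members)].sum(axis=0)
            new_idx[j] = members[within.argmin()]
        if np.array_equal(new_idx, medoid_idx):
            break  # medoids are stable, so the assignment cannot change
        medoid_idx = new_idx
    return labels.tolist(), X[medoid_idx]

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, medoids = k_medoids_sketch(X, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike the k-means update, the returned medoids are always rows of X, which makes the result less sensitive to outliers.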