pylearn.clustering package

Submodules

pylearn.clustering.clustering module

class pylearn.clustering.clustering.Clustering(k=3)

Bases: object

An abstract base class for clustering algorithms. It defines the attributes and methods shared by the concrete algorithms.

Attributes:

k (int): Number of clusters
centroids (numpy.ndarray): Matrix of centroids of all k clusters
data_points (numpy.ndarray): Matrix of all data points
data_points_to_cluster (list): List of each data point’s assigned cluster
clusters (list): List of all k clusters

assigned_clusters(clusters: list | str | int) → list[tuple]

Returns all data points assigned to the given cluster(s).

Parameters:

clusters (list | str | int): Name of a single cluster, or a list of cluster names

Returns:

List of the data points
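
As a rough illustration of the lookup described above, the following sketch filters data points by their assigned cluster name. The function assigned_points and its explicit arguments are hypothetical stand-ins for the attributes data_points and data_points_to_cluster; it is not the library's implementation.

```python
import numpy as np


def assigned_points(data_points, data_points_to_cluster, clusters):
    """Hypothetical helper: collect the points whose label is in `clusters`."""
    # Accept either a single cluster name (str or int) or a list of names.
    wanted = clusters if isinstance(clusters, list) else [clusters]
    return [
        tuple(point.tolist())
        for point, label in zip(data_points, data_points_to_cluster)
        if label in wanted
    ]


data = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
labels = ["a", "b", "a"]  # one cluster name per row of `data`

points_in_a = assigned_points(data, labels, "a")
points_in_both = assigned_points(data, labels, ["a", "b"])
```

Each point is returned as a tuple, matching the list[tuple] return annotation in the signature.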

static euclidean_distance(x: ndarray, centroids: ndarray) → ndarray

Calculates the Euclidean distance from a data point x to each of the k centroids.

Parameters:

x (numpy.ndarray): Data point (vector)
centroids (numpy.ndarray): Matrix of centroids (each row is one centroid)

Returns:

Array of the distances
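
A minimal NumPy sketch of this computation, assuming x has shape (d,) and centroids has shape (k, d). It mirrors the documented signature but is written independently of the library's source.

```python
import numpy as np


def euclidean_distance(x: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Distance from one point x to every row of centroids."""
    # Broadcasting subtracts x from each centroid row;
    # the norm is then taken along each row.
    return np.linalg.norm(centroids - x, axis=1)


x = np.array([0.0, 0.0])
centroids = np.array([[3.0, 4.0], [0.0, 1.0], [6.0, 8.0]])
distances = euclidean_distance(x, centroids)  # one distance per centroid
```

For the values above, the distances are 5, 1 and 10, so the point would be assigned to the second centroid.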

static median(x: ndarray) → ndarray

Determines the median point of a cluster, that is, the data point with the smallest overall distance to all other data points in the cluster. The result is always one of the data points themselves.

Parameters:

x (numpy.ndarray): Matrix of all data points in one cluster

Returns:

Data point as one-element array
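
One common way to pick such a point is to take the row with the smallest summed Euclidean distance to all other rows. The sketch below assumes that reading; the function name medoid is hypothetical, and the exact criterion and return shape used by the library may differ.

```python
import numpy as np


def medoid(x: np.ndarray) -> np.ndarray:
    """Return the row of x with the smallest summed distance to all rows."""
    # Pairwise distance matrix of shape (n, n) via broadcasting.
    pairwise = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    # Summed distance from each point to every other point.
    total = pairwise.sum(axis=1)
    return x[total.argmin()]


cluster = np.array([[0.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0], [20.0, 0.0]])
center = medoid(cluster)
```

Unlike the mean, which is pulled toward the outlier at (20, 0), the selected point stays inside the dense part of the cluster and is an actual data point.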

rename(old_clusters: list, new_clusters: list) → list

Renames the clusters.

Parameters:

old_clusters (list): List of the existing cluster names to be renamed
new_clusters (list): List of the new cluster names

Returns:

A list of the data points

pylearn.clustering.k_means module

class pylearn.clustering.k_means.KMeans(k=3)

Bases: Clustering

The K-Means algorithm computes clusters by calculating the mean of the points in each cluster.

Attributes:

k (int): Number of clusters
centroids (numpy.ndarray): Matrix of centroids of all k clusters
data_points (numpy.ndarray): Matrix of all data points
data_points_to_cluster (list): List of each data point’s assigned cluster
clusters (list): List of all k clusters

fit(X: ndarray, max_iterations=500, threshold=0.001) → list

Assigns each data point to its best-matching cluster based on the distances to the centroids.

Parameters:

X (numpy.ndarray): Matrix of data points (each row is one data point)
max_iterations (int, optional): Maximum number of centroid update iterations, default: 500
threshold (float, optional): Stopping criterion that ends the update iterations early, default: 0.001

Returns:

A list of the clusters assigned to the data points
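
The general shape of such a fit loop can be sketched in plain NumPy as follows. This is a standalone illustration, not the library's code. It assumes the first k rows seed the centroids (to keep the example deterministic) and that threshold is compared against the largest centroid movement per iteration; both are assumptions, and k_means_sketch is a hypothetical name.

```python
import numpy as np


def k_means_sketch(X, k=3, max_iterations=500, threshold=0.001):
    # Assumption: the first k data points serve as initial centroids.
    centroids = X[:k].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid becomes the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid for an empty cluster
                new_centroids[j] = members.mean(axis=0)
        # Assumption: stop once no centroid moved by threshold or more.
        shift = np.linalg.norm(new_centroids - centroids, axis=1).max()
        centroids = new_centroids
        if shift < threshold:
            break
    return labels.tolist(), centroids


X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = k_means_sketch(X, k=2)
```

On this toy data the loop stops after a few iterations, long before max_iterations is reached, with one centroid in each of the two groups.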

pylearn.clustering.k_medoids module

class pylearn.clustering.k_medoids.KMedoids(k=3)

Bases: Clustering

The K-Medoids algorithm computes clusters by calculating the median point of the points in each cluster. Each centroid must itself be a data point.

Attributes:

k (int): Number of clusters
centroids (numpy.ndarray): Matrix of centroids of all k clusters
data_points (numpy.ndarray): Matrix of all data points
data_points_to_cluster (list): List of each data point’s assigned cluster
clusters (list): List of all k clusters

fit(X: ndarray, max_iterations=500, threshold=0.001) → list

Parameters:

X (numpy.ndarray): Matrix of data points (each row is one data point)
max_iterations (int, optional): Maximum number of centroid update iterations, default: 500
threshold (float, optional): Stopping criterion that ends the update iterations early, default: 0.001

Returns:

A list of the clusters assigned to the data points
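
To show how restricting centroids to actual data points changes the update step, here is a standalone NumPy sketch of a simple alternating K-Medoids loop. It is not the library's implementation. The name k_medoids_sketch, the seeding from the first k rows, and the use of threshold as a bound on medoid movement are all assumptions made for illustration.

```python
import numpy as np


def k_medoids_sketch(X, k=3, max_iterations=500, threshold=0.001):
    # Assumption: the first k data points serve as initial medoids.
    medoids = X[:k].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iterations):
        # Assignment step: index of the nearest medoid for every point.
        distances = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: the new medoid is the member with the smallest
        # summed distance to the other members of its cluster.
        new_medoids = medoids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                pairwise = np.linalg.norm(
                    members[:, None, :] - members[None, :, :], axis=2
                )
                new_medoids[j] = members[pairwise.sum(axis=1).argmin()]
        # Assumption: stop once no medoid moved by threshold or more.
        shift = np.linalg.norm(new_medoids - medoids, axis=1).max()
        medoids = new_medoids
        if shift < threshold:
            break
    return labels.tolist(), medoids


X = np.array([[0.0, 0.0], [10.0, 0.0], [1.0, 0.0],
              [2.0, 0.0], [11.0, 0.0], [12.0, 0.0]])
labels, medoids = k_medoids_sketch(X, k=2)
```

Every returned medoid is a row of X, which is the property that distinguishes this approach from taking cluster means.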

PyLearn 1.0.0 documentation