Technical Journal - Spectral Clustering for Matlab

       After our meeting with Ethan, Dr. Hassibi's graduate student, we were tasked with implementing spectral clustering on the adjacency-matrix dataset. To evaluate the algorithm's accuracy, we track false positives and false negatives. As mentioned, the first step was to clean the data and run spectral clustering on it. We soon realized that the spectral clustering function is only available in R2019b, so we downloaded R2019b, which recognized spectralcluster as a function. Over the weekend, I did some self-study and taught myself MATLAB syntax, which I found similar to Python 3.8. Below is a video that I found helpful.

The error displayed in the console when we tried to use spectralcluster on R2019a
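Since spectralcluster shipped with R2019b, a quick check like the following (a sketch added for illustration, not part of our original script) shows whether the running release provides it:

    % Report the running release and whether spectralcluster is on the path
    % (spectralcluster was introduced in R2019b).
    disp(version('-release'))                 % e.g. '2019a' or '2019b'
    if exist('spectralcluster', 'file') == 2
        disp('spectralcluster is available')
    else
        disp('spectralcluster not found; upgrade to R2019b or later')
    end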



       The first step is to load the dataset, which is stored as a struct: a block of memory with physically grouped variables. We assign observed and rawAdj to CAdj and Adj, respectively, then entrywise multiply the two to form Adj1. We use the eye function to get an n-by-n matrix with ones on the main diagonal and zeros elsewhere. To get a full-storage matrix, we apply the full function. Then spectralcluster is run on the Laplacian of the graph form of gAdj, with 3 as the number of classes. To get our results, we compare against the ground truth with confusionmat.

Code for spectral clustering
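Since the screenshot of our script is not reproduced here, the sketch below reconstructs the steps from the paragraph above. The variable names (observed, rawAdj, CAdj, Adj, Adj1, gAdj) come from the post; the file name and the exact spectralcluster options are assumptions, and our original code may have differed.

    % Load the dataset (stored as a struct); the file name here is hypothetical.
    S = load('adjacency_data.mat');
    CAdj = S.observed;                 % observed entries
    Adj  = S.rawAdj;                   % raw adjacency matrix
    Adj1 = CAdj .* Adj;                % entrywise product of the two

    n = size(Adj1, 1);
    gAdj = full(Adj1) .* (1 - eye(n)); % full-storage copy; eye(n) assumed here to zero the diagonal

    % spectralcluster builds the graph Laplacian internally when the input is
    % treated as a precomputed similarity matrix; 3 is the number of classes.
    idx = spectralcluster(gAdj, 3, 'Distance', 'precomputed');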

       The program did not work; by that I mean it had a very poor accuracy score. In contrast to when we used spectral clustering on the Karate Club dataset, it did a horrendous job, leading me to believe that some information was lost along the way and threw everything off. As mentioned before, we used a confusion matrix to show the results. The thought that "false positives and false negatives work well for binary classification problems" seemed reasonable, but a confusion matrix can quantify the results for any number of clusters. The rows are the true classes and the columns are the predicted classes. Ideally, only the diagonal should be populated, with each entry counting the nodes correctly assigned to that class, but in our case every node was classified as class 1. As a result, we had poor results.
Confusion matrix used to show the results
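For reference, here is a minimal sketch of how confusionmat compares the predicted labels against the ground truth; the variable names groundTruth and idx are assumptions:

    % Rows are true classes, columns are predicted classes; the counts of
    % correctly classified nodes lie on the diagonal.
    C = confusionmat(groundTruth, idx);
    disp(C)
    confusionchart(groundTruth, idx);  % heatmap view of the same matrix (R2018b+)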

       In conclusion, while our implementation of spectral clustering was unsuccessful, we learned our new goals for the next few weeks. As I mentioned before, minimum cut could have some potential, despite the dataset being very large; it is worth a shot if we cannot get good results from spectral clustering. I am also curious about what Dr. Ramya, the Ph.D. student who provided the dataset, did to predict the dataset correctly. Below is a video on minimum cut:
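In the meantime, here is a tiny illustration (not code we ran on our dataset) of MATLAB's built-in minimum cut, which falls out of maxflow by max-flow/min-cut duality; the toy graph below is made up:

    % Toy weighted directed graph: 1 -> 2 -> 4 and 1 -> 3 -> 4
    G = digraph([1 1 2 3], [2 3 4 4], [3 1 2 4]);
    % Max flow from node 1 to node 4; cs and ct give the two sides of a minimum cut
    [mf, ~, cs, ct] = maxflow(G, 1, 4);
    fprintf('min cut value = %d\n', mf)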

