Clustering Algorithms - 9/16-27/19


The past two weeks have been more focused on our field of study than the previous ones. Prior to the second meeting with Dr. Hassibi, the Machine Learning group was given the task of learning about clustering algorithms. We spent time studying algorithms such as Expectation-Maximization, k-means, and Gaussian Mixture Models. The objective with each subtopic was to understand its purpose, what makes it different from other methods, and the math behind it, which was much more ambiguous than simply learning what it does. The learning process was somewhat uncoordinated because there were situations where you needed to learn one idea just to understand another, leading to more and more topics for us to pick up. Once all the algorithms had been studied thoroughly, each student was given a topic to write a concept map on.
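As a quick illustration of the simplest of these, here is a minimal k-means sketch in Python (my own toy code, not something we wrote for the program): it alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points.

```python
import numpy as np

def kmeans(points, k=2, iters=20, seed=0):
    """Toy k-means: points is an (n, d) array. Assumes no cluster
    goes empty during the iterations (fine for a small example)."""
    rng = np.random.default_rng(seed)
    # start from k distinct points chosen from the data
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # assignment step: label each point with its nearest centroid
        dists = np.linalg.norm(points[:, None] - centroids, axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each centroid to the mean of its cluster
        centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids
```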

Below is an extensive video going over what Clustering is in Machine Learning:


I was assigned the Expectation-Maximization algorithm, an iterative method for finding maximum likelihood estimates of parameters in models that depend on unobserved latent variables. It alternates between two steps until the Gaussians converge: the E-step, which computes how likely each data point is to belong to each Gaussian, and the M-step, which adjusts each Gaussian's mean and variance to better fit the points assigned to it. Overall, my feeling about this type of work has been positive. It was super interesting to look at the different algorithms used to cluster data points and to struggle through them together. Every single person was a crutch for someone else in understanding a concept, which was poetic, to say the least.
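To make the two steps concrete, here is a minimal sketch of EM for a one-dimensional, two-component Gaussian mixture (my own toy example in Python with NumPy, not code from the program):

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50, seed=0):
    """Minimal EM for a 1-D Gaussian mixture: the E-step scores each point
    against every Gaussian, the M-step refits means/variances/weights."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, k)          # initial means picked from the data
    var = np.full(k, x.var())      # start every component at the data variance
    pi = np.full(k, 1.0 / k)       # equal mixing weights to begin with

    for _ in range(iters):
        # E-step: responsibility of each Gaussian for each point,
        # i.e. the density of the point under component j, weighted by pi_j
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-estimate each Gaussian from its responsibility-weighted points
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

# two overlapping Gaussians as toy data
data = np.concatenate([np.random.normal(0, 1, 200), np.random.normal(5, 1.5, 200)])
print(em_gmm_1d(data, k=2))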


This was a video that helped me understand the math behind the EM-Algorithm:

On Thursday we visited Dr. Hassibi and went over what we had learned with him. The purpose of doing this was to see whether we had gotten any concepts wrong and, if so, what could be corrected. In my opinion, since we have about an hour or less to talk with him, the time would be better spent on something other than us presenting what we learned, which cuts into the time Dr. Hassibi has to talk. After we each had a turn to demonstrate our understanding, Dr. Hassibi noted that none of the topics we had learned involved graphs. He introduced graph clustering, which operates on graphs, data structures made of nodes connected by edges, instead of plain data points. An example Dr. Hassibi mentioned was the 1970s karate club network problem (Zachary's Karate Club), which asks: if a karate club were to split into two clubs, which members would be likely to side with one club over the other?
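We haven't written the code yet, but as a preview of what graph clustering could look like, here is a sketch (my own, using NetworkX and NumPy, not the program's approach) that bisects the karate club network spectrally, splitting nodes by the sign of the Fiedler vector of the graph Laplacian:

```python
import numpy as np
import networkx as nx

# Zachary's Karate Club: 34 members, edges = who interacted outside the club.
# Each node carries a 'club' attribute recording the real-world split.
G = nx.karate_club_graph()

# graph Laplacian L = D - A, where D is the degree matrix and A the adjacency matrix
A = nx.to_numpy_array(G)
L = np.diag(A.sum(axis=1)) - A

# the eigenvector of the second-smallest eigenvalue (the Fiedler vector)
# gives a natural two-way split: partition nodes by the sign of their entry
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
fiedler = eigvecs[:, 1]
predicted = fiedler > 0

for node in G.nodes:
    actual = G.nodes[node]["club"]  # 'Mr. Hi' or 'Officer'
    print(f"member {node:2d}: predicted group {int(predicted[node])}, actual {actual}")
```

Spectral bisection is just one way to attack this; modularity-based community detection is another common choice for the same problem.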


These past two weeks have been much more engaging because of our group's focus on Machine Learning, which has really gotten me excited. We are getting into the theory behind these ML subtopics, which makes implementing them in projects much easier. Over the next few weeks, Mr. Lee and the rest of us hope to thoroughly understand graph clustering and start writing code for the karate club problem. I don't know about the others, but I'm incredibly eager to do this, and the program has only just begun.
