Saturday, February 16, 2013

Algorithm Analysis: The Basic K-means Clustering Algorithm


The basic k-means clustering algorithm is a simple algorithm that separates the given data space into different clusters based on centroids calculation using some proximity function. Using this algorithm, we first choose the k- points as initial centroids and then each point is assigned to a cluster with the closest centroid. The algorithm is formally described as follows:
Input: A data set D containing m objects (points) with n attributes in an              Euclidean space
Output: Partitioning of m objects into k-clusters C1, C2, C3, …, Ck, i.e. Ci D and Ci ∩ Cj = ᶲ  (for 1 ≤ i, j ≤ k)


Procedure:

  1. Select k points as initial centroid
  2. Repeat
  3. Form k clusters by assigning each point to its closest centroid
  4.  Re-compute the centroid of each cluster
  5. Until the centroids don’t change

Time Complexity of the algorithm:

 To analyze the complexity of the algorithm in terms of time required, we need to identify the operations required in each step and in each iteration of the algorithm. Here, for k-means algorithm, the operations for each iteration are:
  • Calculation of distances: To calculate the distance from a point to the centroid, we can use the squared Euclidean proximity function. For this, we need two subtractions, one summation, two multiplications and one square - root operations, i.e., 6-operations.
  • Comparisons between distances
  • Calculation of centroids

So, the total number of operations for an iteration
= distance calculation operations + comparison operations + centroids calculation operations
= 6*[k*m*n] operations + [(k-1)*m*n] operations + [k*((m-1) + 1)*n] operations

Assume that the algorithm converges after I iterations, then total number of operations for the algorithm
= 6*[I*k*m*n] operations + [I*(k-1)*m*n] operations + [I*k*((m-1) + 1)*n] operations
= [6Ikmn + I(k-1)mn + Ikmn] operations
≈ O (I*k*m*n)

Here, we’ve applied the basic assumption of each operation requiring the same amount of time. Since, normally, k << m & n << m, the analysis shows that the k-means is linear in m, the number of points, and is scalable and efficient in processing large data sets.

Space Complexity of the Algorithm:

The space requirement for k-means are modest because only the data points and centroids are needed to be stored. Specifically, the storage required is O((m+k)n), where m is the number of objects (points) & n is the number of attributes considering n-dimensional objects.

References
[1] Vipin Kumar, Pang-Ning Tan & Michael Steinbach.(2005). "Introduction to Data Mining." Pearson/Addison Wesley, ISBN 0-321-32136-7.

96 comments:

  1. Very interesting blog post.Quite informative and very helpful.This indeed is one of the recommended blog for learners.Thank you for providing such nice piece of article. I'm glad to leave a comment. Expect more articles in future. You too can check this R Programming tutorial for updated knowledge on R Programming.
    https://www.youtube.com/watch?v=gXb9ZKwx29U

    ReplyDelete
  2. Thank you for sharing such valuable information and tips. This can give insights and inspirations for us; very helpful and informative! Would love to see more updates from you in the future.
    Big Data Training in Chennai
    Big Data Training
    Loadrunner Training in Chennai
    iOS Training in Chennai
    Selenium Training in Chennai
    Selenium Training

    ReplyDelete
  3. Thank you for sharing such great information with us. I really appreciate everything that you’ve done here and am glad to know that you really care about the world that we live in.
    Big Data Course in Chennai
    Big Data Hadoop Training in Chennai
    Hadoop Course in Chennai
    best big data training in chennai
    Big Data Hadoop Training
    Hadoop Training in Chennai

    ReplyDelete
  4. Nice idea,keep sharing your ideas with us.i hope this information's will be helpful for the new learners.
    Best AWS Training in Bangalore
    AWS Training in Nolambur
    AWS Training in Ashok Nagar

    ReplyDelete
  5. More informative,thanks for sharing with us.
    this blog makes the readers more enjoyable.keep add more info on your page.
    angularjs classes in bangalore
    angularjs tutorial in bangalore
    AngularJS Training in Mogappair
    AngularJS Training in Nungambakkam

    ReplyDelete
  6. There was a time when it was difficult with money. I decided to turn to gambling for all kinds of slot machines and the like. Now knowing this site splendid casino games online for money across and opposite to me is no longer scary

    ReplyDelete
  7. Great Article ,Great blog! lot of valid information.
    Keep sharing more.
    www.apponix.com

    ReplyDelete
  8. Great Article ,Great blog! lot of valid information.
    Keep sharing more.
    JAVA Tutorials

    ReplyDelete
  9. This is really too useful and have more ideas and keep sharing many techniques. Eagerly waiting for your new blog keep doing more.
    Regards,
    Tableau training in Chennai | Tableau Courses Training in Chennai | Tableau training Institute in Chennai

    ReplyDelete
  10. This comment has been removed by the author.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. Very useful blog thanks for sharing With over a three decade of beauty expertise at our fingertips, we believed that everyone has the right to be beautiful. And so began the journey of our very own Pearl’s Beautician course in Chennai.

    ReplyDelete
  13. Great information to say the least. I really do appreciate everything so much from this great article.
    qr code generator

    ReplyDelete
  14. We as a team of real-time industrial experience with a lot of knowledge in developing applications in python programming (7+ years) will ensure that we will deliver our best in python training in vijayawada. , and we believe that no one matches us in this context.

    ReplyDelete
  15. Really impressed! Everything is very open and very clear clarification of issues. It contains truly facts. Your website is very valuable. Thanks for sharing.
    Please check this ExcelR Pune Digital Marketing Course

    ReplyDelete
  16. Thanks for sharing this informations.
    data science course in coimbatore

    data science training in coimbatore

    android training institutes in coimbatore

    ios training in coimbatore

    aws training in coimbatore

    amazon web services training in coimbatore

    big data training in coimbatore

    ReplyDelete
  17. I really enjoyed your blog Thanks for sharing such an informative post.

    digital marketing course in hubli

    ReplyDelete
  18. Such a very useful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article.

    Big Data Online Training

    Big Data Classes Online

    Big Data Training Online

    Online Big Data Course

    Big Data Course Online

    ReplyDelete
  19. Very interesting blog Thank you for sharing such a nice and interesting blog and really very helpful article.

    Sap Ariba Online Training

    Sap Ariba Classes Online

    Sap Ariba Training Online

    Online Sap Ariba Course

    Sap Ariba Course Online

    ReplyDelete
  20. Nice information, valuable and excellent design, as share good stuff with good ideas and concepts, lots of great information and inspiration, both of which I need, thanks to offer such a helpful information here....
    hardware and networking training in chennai

    hardware and networking training in tambaram

    xamarin training in chennai

    xamarin training in tambaram

    ios training in chennai

    ios training in tambaram

    iot training in chennai

    iot training in tambaram

    ReplyDelete
  21. Thank you for excellent article.You made an article that is interesting.
    Rajasthan Budget Tours

    ReplyDelete
  22. Amazing Post. The content is very interesting. Waiting for your future updates.
    DevOps Training in Chennai

    DevOps Course in Chennai

    ReplyDelete
  23. Thanks For Sharing The Information The Information Shared Is Very Valuable Please Keep Updating Us Time Just Went On Reading The article
    by cognex is the AWS Training in Chennai. cognex offer so many courses they are microsoft azure, prince2 foundation, ITI V4 foundation,etc,

    ReplyDelete
  24. I pay a visit every day a few blogs and sites to read articles, but this weblog offers feature based writing.
    leather sofa set

    ReplyDelete
  25. I read this article fully on the topic of the resemblance of most recent and
    preceding technologies, it’s remarkable article.

    Study in Ireland Consultants

    ReplyDelete
  26. Fantastic blog extremely good well enjoyed with the incredible informative content which surely activates the learners to gain the enough knowledge. Which in turn makes the readers to explore themselves and involve deeply in to the subject. Wish you to dispatch the similar content successively in future as well.

    Data Science Training in Raipur

    ReplyDelete
  27. Really good blog. Well-written and knowledgeable blog. Thanks for sharing such a wonderful blog. Keep sharing!
    Data Science Course in Hyderabad

    ReplyDelete
  28. very interesting to read.thanks for sharing .Angular training in Chennai

    ReplyDelete
  29. Excellent website. I was always checking this article, and I’m impressed! Extremely useful and valuable info specially the last part, I care for such information a lot. Thanks buddy.
    Data Science Training in Hyderabad
    Data Science Course in Hyderabad

    ReplyDelete
  30. Excellent post. I want to thank you for this informative read, I really appreciate sharing this great post. Keep up your work.
    DevOps Training in Hyderabad
    DevOps Course in Hyderabad

    ReplyDelete
  31. Extraordinary blog went amazed with the content that they have developed in a very descriptive manner. This type of content surely ensures the participants to explore themselves. Hope you deliver the same near the future as well. Gratitude to the blogger for the efforts.

    ReplyDelete
  32. Thanks for your marvelous posting! I really enjoyed reading it, you could be a great author. I will be sure to bookmark your blog and definitely will come back in the foreseeable future. I want to encourage you to continue your great posts, have a nice afternoon! Best Digital Marketing Courses in Bangalore



    ReplyDelete
  33. This post is so helpfull and informative.Keep updating more information....
    RPA Training Cost
    Learn RPA

    ReplyDelete
  34. Really wonderful blog! Thanks for taking your valuable time to share this with us.
    What is AWS
    Benefits Of AWS

    ReplyDelete
  35. Amazing Post. Thanks for sharing. It gives a great insight into the topic. Please keep updating.I also want to share few points about Best MicroNutrients Company in India

    ReplyDelete


  36. Hey friend, it is very well written article, thank you for the valuable and useful information you provide in this post. Keep up the good work! FYI, please check these depression, stress and anxiety related articles.
    Federal Bank Signet Credit Card 2021 Review , The High Five Habit Free pdf Download , Sadhguru Karma Book PDF Download

    ReplyDelete
  37. Thank you so much for sharing this nice article.
    Introducing our Next Level Complete Laundry Management & Dry-Cleaning software. Connect with us to get accelerated and cost-effective Laundry Management Software.
    Visit Now: https://syswash.net/

    ReplyDelete
  38. Here at this site is really a fastidious material collection so that everybody can enjoy a lot.
    data analytics training in hyderabad

    ReplyDelete
  39. I’ve read some good stuff here. Definitely worth bookmarking for revisiting. I surprise how much effort you put to create such a great informative website. data scientist course in mysore

    ReplyDelete
  40. Pinnacle Game Profiler Crack is a valuable application that allows you to play many games on your computer through the use of diverse game Pinnacle Game Profiler Windows 10

    ReplyDelete
  41. MEGAsync, you are given a tool to synchronize your PC with your file storage hosted at MEGA. As a cloud service, you may also use it Mega Download Link

    ReplyDelete
  42. They say the best of all gifts around any Christmas tree is the presence of a happy family all wrapped up in each other. Wishing you a very Merry Christmas Christmas MSG To Loved Ones

    ReplyDelete
  43. Such a very useful article. Very interesting to read this article. Join our online English classes in Riyadh, designed specifically for learners in Saudi Arabia who want to improve their spoken English skills.
    For more info visit Best English Speaking Course in Riyadh or Call +971505593798

    ReplyDelete
  44. Nice blog was really feeling good to read it. Thanks for this information. Ziyyara Edutech’s private tuition classes for Class 11 provide comprehensive and personalized learning experiences, tailored specifically to meet the needs of Class 11 students.
    For more info Contact us: +91-9654271931, +971-505593798 or visit online tutoring sites for class 11

    ReplyDelete
  45. Great experience I got good information from your blog. Ziyyara’s expert phonics tutors, available for personalized one-to-one sessions, create a harmonious learning experience that tunes into your child's unique needs.
    visit phonics tutor near me

    ReplyDelete