If someone stopped me in the street and asked what Machine Learning means, my quick, off-the-top-of-my-head answer would be: it’s the art of gathering and extracting valuable information from past data using statistical tools, in order to come up with a model or understanding that can predict the future. It could be something like studying data on old market trends to come up with a way to predict future trends.
In this post I’m doing some topic modelling. Topic modelling is a way of finding abstract topics in a collection of documents. I’m using the Sherlock Holmes stories and trying to find out how much each word contributes to telling what a story is about. One way to think about it is that certain words play an important role in defining what a story is about, and the frequency of those words plays a vital role in our task.
In this post I’m going to show some cool features of purrr. purrr is an R package for functional programming. I have always been fascinated by functional programming. I first heard about it while I was learning Scala. This approach makes our code not only more succinct but also more expressive. There are other ways to achieve the same results, using loops or functions like sapply and lapply, but let’s not go in that direction.
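To give a rough idea of what "more succinct" can look like, here is a minimal sketch that squares a few numbers, first with a loop and then with purrr's `map_dbl()`. The vector `x` and the variable names are made up for illustration and are not taken from the post itself.

```r
library(purrr)

# Hypothetical input: a few numbers to square.
x <- c(1, 2, 3)

# Loop version: pre-allocate the result, then fill it element by element.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# purrr version: apply the function to each element and
# collect the results in a double vector.
squares_map <- map_dbl(x, ~ .x^2)
```

Both versions give the same numbers. The purrr one says what is being computed without the bookkeeping of an index and a pre-allocated vector.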
Recently I have been watching the TidyTuesday screencasts from David Robinson. In each screencast he picks a dataset he has never seen before and analyses it using R. Today I’m going to follow in his footsteps. In this blog post I’m going to analyse a dataset and visualize the results using ggplot2. The dataset was collected from medium.com. I’m going to break down the titles of the articles into individual words and see which word is used the most across all of them.
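The idea of splitting titles into words and counting them can be sketched in a few lines of base R. The titles below are invented stand-ins for the medium.com data, and the post's own approach may well use different tools. This is only meant to show the shape of the computation.

```r
# Hypothetical titles standing in for the medium.com data.
titles <- c(
  "Machine Learning for Beginners",
  "Deep Learning with R",
  "Learning R the Hard Way"
)

# Lower-case the titles, split on anything that is not a letter,
# and flatten the result into a single character vector.
words <- unlist(strsplit(tolower(titles), "[^a-z]+"))
words <- words[words != ""]  # drop any empty strings left by the split

# Count each word and sort from most to least frequent.
word_counts <- sort(table(words), decreasing = TRUE)
head(word_counts, 3)
```

From here, a table like `word_counts` could be turned into a data frame and passed to ggplot2 for a bar chart of the most common words.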