One of the biggest challenges for any marketer is to create content that is relevant to their target market. Because content is the basis of all marketing today, getting it right is very important. Have the wrong content, and you repel the people you're trying to attract. But have the right content, and you're golden.

Since becoming the Director of Engagement at Intridea, I've been working to figure out what the right content is for us. That pursuit has led me to gather information from a variety of sources in order to get a full picture of what's "relevant" to our clients. One of those sources is LinkedIn.

Scraping Data From LinkedIn With Python

A short time ago I open sourced one of the scripts I created to gather company information from LinkedIn: LinkedIn Data Miner.

In the first version, given a MongoDB database of client records that include the URL of each client's LinkedIn company page, the script goes to LinkedIn, scrapes the page, and updates the client's record with the following:

  1. Description
  2. Specialties
  3. Address
  4. Website
  5. Company Type
  6. Founded
  7. Industry
  8. Company Size
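The scrape-and-update step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LinkedIn Data Miner's actual code: the sample HTML, the `dt`/`dd` markup, and the names `CompanyFieldParser` and `build_update` are all invented for the example, and LinkedIn's real page structure will differ.

```python
from html.parser import HTMLParser

# Hypothetical markup: assume each field is a <dt> label followed by a <dd> value.
# LinkedIn's real company pages are structured differently.
SAMPLE_HTML = """
<dl>
  <dt>Industry</dt><dd>Information Technology and Services</dd>
  <dt>Company Size</dt><dd>51-200 employees</dd>
  <dt>Founded</dt><dd>2007</dd>
</dl>
"""

# The eight fields listed above.
WANTED = {
    "Description", "Specialties", "Address", "Website",
    "Company Type", "Founded", "Industry", "Company Size",
}


class CompanyFieldParser(HTMLParser):
    """Collect label/value pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # "dt" or "dd" while inside one of those elements
        self._label = None  # most recent <dt> text

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return
        if self._tag == "dt":
            self._label = text
        elif self._label in WANTED:
            self.fields[self._label] = text


def build_update(html):
    """Return a MongoDB-style "$set" document for the fields found in html."""
    parser = CompanyFieldParser()
    parser.feed(html)
    # Turn labels such as "Company Size" into keys such as "company_size".
    fields = {
        label.lower().replace(" ", "_"): value
        for label, value in parser.fields.items()
    }
    return {"$set": fields}


update = build_update(SAMPLE_HTML)
print(update)
```

With pymongo, a call along the lines of `collection.update_one({"_id": client_id}, update)` would then write these fields to the client's record; `collection` and `client_id` are placeholders here.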

Once you have the data in your database, you can do a lot with it, including:

  1. Creating simple charts in Excel or Numbers
  2. Mashing it up with other client information and visualizing it in Tableau
  3. Adding it to the client data you already have in your CRM

The point is to get the information out of LinkedIn and in with the rest of your client data.
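One simple way to get the scraped records into Excel, Numbers, or Tableau is to flatten them to CSV. The sketch below assumes the records have already been read out of MongoDB into plain dictionaries; the sample records, the column names, and the `records_to_csv` helper are hypothetical.

```python
import csv
import io

# Hypothetical records, shaped like documents read back from MongoDB.
records = [
    {"name": "Acme Corp", "industry": "Retail", "company_size": "51-200 employees"},
    {"name": "Globex", "industry": "Financial Services"},  # no company_size scraped
]

COLUMNS = ["name", "industry", "company_size"]


def records_to_csv(rows, columns):
    """Return CSV text with one row per record; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",             # value used when a record lacks a column
        extrasaction="ignore",  # silently skip keys not listed in columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = records_to_csv(records, COLUMNS)
print(csv_text)
```

Writing `csv_text` to a file gives you something any of those tools can open directly.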

What You Can Do With This

One of the reasons I created this script was that I wanted to see what industries our clients were in. Once I had scraped the data I created a tree map in Tableau:

[Image: Tableau tree map of client records by industry]

The darker and larger the square, the more client records I had for that industry. And yes, my chart has labels on it.

A few other charts I created were:

  1. Client map: see how many clients we had in each state, on a map
  2. Clients by company size: another tree map like the one above, only showing how many clients we have of each size (per LinkedIn)
  3. Client industries by state: how many clients we have in each industry, in each state
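The counts behind charts like these are simple to compute before the data ever reaches Tableau. Here is a minimal sketch using Python's `collections.Counter`; the client records and field names are made up for illustration.

```python
from collections import Counter

# Hypothetical client records; the field names are assumptions for this example.
clients = [
    {"name": "Acme Corp", "state": "VA", "industry": "Retail"},
    {"name": "Globex", "state": "VA", "industry": "Financial Services"},
    {"name": "Initech", "state": "TX", "industry": "Retail"},
    {"name": "Hooli", "state": "VA", "industry": "Retail"},
]

# Clients per industry (what the industry tree map encodes as size and shade).
by_industry = Counter(c["industry"] for c in clients)

# Clients per state (the client map).
by_state = Counter(c["state"] for c in clients)

# Clients per (state, industry) pair (client industries by state).
by_state_industry = Counter((c["state"], c["industry"]) for c in clients)

for industry, count in by_industry.most_common():
    print(industry, count)
```

The same pattern works for company size: count on that field instead of `industry`.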

All of this is the tip of the iceberg. I've also created some Python scripts to get data out of Google Analytics. The real power comes when you start to mash up data from all of these sources.

Your Turn

Now it's your turn to scrape LinkedIn. Go ahead and install MongoDB, get your client records in there along with their LinkedIn company page URLs, download the LinkedIn Data Miner script, and go to town!

Let me know how it works for you.


Having worked in computer vision (CV) for many years in varying capacities, and knowing some rather heavyweight people working in the field today, I was interested in this book from a review perspective.

Practical Computer Vision with SimpleCV

I found it encouraging that the author dives right in with high-level, informative definitions, common challenges, and practical use cases. Less encouraging is that the book misses the mark on low-level detail, theory, and any real in-depth explanation of computer vision itself.

For a beginner, this is a decent title. Be aware, though, that as a beginner you will need to embrace a quick rhythm and progression throughout. While the book may lack "an education in theory", it makes up for it with "an education in application".

A few examples in the book did catch my attention, such as the Xbox Kinect material, which is quite relevant given the buzz surrounding the technology and its accessibility. A bonus here is the escape from the Microsoft tools that some may feel make the Kinect undesirable. The clear examples in Python should address concerns about using it for practical training and application outside Microsoft's developer tools ecosystem.

I enjoyed Chapter 7 (Drawing on Images), as this is where I spent a lot of time in the past, working on imaging annotations for medical applications: the ability to work with layers, objects, lines, and so on. The author did a good job of describing the canvas but fell short in the actual drawing sections. The lollipop example was rather crude, and SimpleCV's support for drawing is demonstrably more robust than that example suggests.

A lot of time was spent on histograms, which does make sense to me. I believe, though, that too much time was spent on them, which will likely clash with the quick progression I mentioned earlier. I realize histograms are a vital concept in CV and deserve attention, just not this much in this book.

I was disappointed to find very little in the book about SURF and SIFT, which are used for feature detection (arguably the most prominent CV industry application). Arguments about which algorithm performs better or worse may have prevented their inclusion. Also, while SimpleCV doesn't have an implementation of SIFT, it does have one for SURF. In practice, feature detection initiatives will typically use one of the two. The following link points to some open source SURF and SIFT work with OpenCV that a friend and I did a few years back, specific to feature detection around brand logos. LINK

Generally speaking, when you ask beginners why they are interested in computer vision, the answer will most likely be something along the lines of facial recognition and/or object recognition. The author does provide examples and lessons on those more interesting facets, so the book hits the mark for its intended reader.

This book was provided free of charge by O'Reilly Media for the purpose of review.
