Author: Matthew A. Russell
Pages: 360
Publisher: O'Reilly
ISBN: 1449388345


Introduction
The only way you could have missed the fact that the social networking boom has made huge amounts of social data available to knowledgeable searchers is if you haven't been using a computer and the Internet at all. This book will show you how to discover who's talking to whom, what they are talking about, and where they are located in the real world - in short, how to mine useful data from social networks, blogs and email.
About the author
Matthew Russell, Vice President of Engineering at Digital Reasoning Systems and Principal at Zaffra, is a computer scientist who is passionate about data mining, open source, and web application technologies.
Inside the book
If you pick up this book, you should know something about programming in general and Python in particular. Otherwise, I guarantee you won't understand most of what you are meant to.
The book dives straight into setting up a development environment with Python, and into collecting and analyzing Twitter data. To do that, you need an account of your own and enough followers whose data you want to mine. You will learn how to see what people are talking about at a given moment, how to extract relationships from tweets, and how to visualize the graphs you discover.
Later, by learning how to harness Twitter's API, you will be able to answer questions such as "Given all my followers and all their followers, what is my potential influence if I get retweeted?"
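To make that question concrete, here is a minimal Python sketch of one way to read "potential influence": the number of distinct accounts that could see a tweet if every one of your followers retweeted it. It is not taken from the book and makes no Twitter API calls. The function name `potential_reach` and the sample follower graph are hypothetical, and the follower lists are assumed to have been collected already.

```python
def potential_reach(my_followers, followers_of):
    """Count the distinct accounts reached if every follower retweets.

    my_followers: iterable of account names that follow you.
    followers_of: dict mapping an account name to a list of its followers.
    """
    audience = set(my_followers)
    for follower in my_followers:
        # Followers with no known follower list contribute only themselves.
        audience.update(followers_of.get(follower, ()))
    return len(audience)


# Hypothetical, hand-written follower graph used only for illustration.
followers_of = {
    "alice": ["carol", "dave"],
    "bob": ["dave", "erin", "alice"],
}

print(potential_reach(["alice", "bob"], followers_of))  # prints 5
```

Using a set means that an account following several of your followers is counted only once, which is why the result is 5 rather than 7.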
Next, you'll learn about microformats and how they allow the searcher to take existing content and make the data in it explicit and standardized so that it can be collected and made sense of. You will also see how you can mine your own mailbox(es) and which tools to use to have a clear overview of all the data and perform further analyses.
Other mining targets covered in this book include LinkedIn, Facebook, blogs and more. Each differs from the others in some way, so each requires a different approach and offers different possible results. LinkedIn, for example, doesn't show how people are connected to each other, but you can cluster contacts by job title or location.
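As a rough illustration of clustering contacts by job title, the Python sketch below groups contacts whose titles match once case and spacing are normalized. This is a deliberately simple stand-in and not necessarily the technique the book uses. The function name `cluster_by_title` and the sample contacts are hypothetical.

```python
from collections import defaultdict


def cluster_by_title(contacts):
    """Group (name, job_title) pairs by a normalized form of the title."""
    clusters = defaultdict(list)
    for name, title in contacts:
        # Normalize case and collapse repeated whitespace so that
        # "Software Engineer" and "software  engineer" land together.
        key = " ".join(title.lower().split())
        clusters[key].append(name)
    return dict(clusters)


# Hypothetical contact list used only for illustration.
contacts = [
    ("Ann", "Software Engineer"),
    ("Raj", "software  engineer"),
    ("Lee", "CEO"),
]

for title, names in cluster_by_title(contacts).items():
    print(title, names)
```

Exact matching after normalization will not merge near-synonyms such as "Sr. Engineer" and "Senior Engineer". Handling those would need some similarity measure between titles.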
Facebook is an especially rich trove of interesting data, since users are encouraged to share so many aspects of their lives and use it to chat, keep in touch, share photos and thoughts, and much more. To mine the data collected on this social network, you'll have to build an application that does it for you, even though you can already see all that information simply by visiting your friends' profiles.
Final thoughts
Prepare to be sidetracked many times while reading this book - it touches on so many technologies and techniques that you're bound to go searching for more about them on the Web.
This book is a good choice for the budding data analyst. People who are more interested in finding out stuff about their friends and colleagues out of sheer curiosity should skip this tome.