For Everyone to See
Winter ended late this year in Boston and people are still trying to figure out how to dress. The Bruins are in the playoffs, which means we're in one of those weird years here where hockey and baseball seasons overlap. Should we be wearing Red and Blue, as summer fashion suggests, or put on one more week's worth of black and yellow to show the B's some spirit? (I'm wearing Red Sox under my college sweatshirt; statisticians have always been partial to baseball.)
In this way, our own Dr. Nolker and I could not be more different. I coach special needs baseball; he plays amateur hockey. Sometimes it seems like baseball and hockey people are from different worlds. Ever wonder what data science has to say about it? I looked at sociometric data from 100 thousand US households to try to come up with an answer. Here's what I found:
(1) Hockey fans are twice as likely (20% vs 11%) to ski recreationally. No surprises there.
(2) Hockey fans are more likely to gamble at a casino (24% vs 19%). Baseball folks don't gamble, unless it's in the locker room. (Remember Pete Rose?)
(3) Hockey fans are more likely to smoke cigars or premium tobacco (17.5% vs 13%). Chewing tobacco was not factored.
(4) Baseball fans are significantly more religious. 38% read religious or inspirational books and magazines, vs. 29% of hockey fans. (We had 93 seasons without a series win to get some religion.)
(5) Hockey fans are better off, financially at least. They are more likely to have an "upscale lifestyle" (53% vs 48%) and a credit card from a premium department store (57% vs 50%). They'll need one to buy all those sweaters and ski equipment.
At least there was one thing we could both agree on. Only two baseball fans and absolutely no hockey fans (0% vs 0%) reported that they watch professional soccer. Guess we're not so different after all.
Do you have an interesting topic you'd like us to research and write about? Send us your ideas for future topics.
InsideAnalysis.com published an article this morning describing the big data work that Analyze has been doing to help combat illegal fishing. Check out the article.
Recognizing that if you're reading our blog or our social media posts, it’s unlikely that you're attending Dr. Nolker’s presentation at the Sentiment Analysis Symposium, we’ve posted a paper on our site that you can download for free entitled, “Social Computing and Weighting to Identify Member Roles in Online Communities.” This paper was the genesis of what has become a groundbreaking approach for pulling meaning out of social networks.
For those who would rather get the meat without sifting through the paper, here’s a summary of the paper and how it's useful when doing social network analysis:
A. Not everything that everyone says in an online social network is worth analyzing. We've all met that guy or gal who posts things that don't matter.
B. Structure mining provides a means for finding and weighting which individuals are most worth analyzing and which individuals we should ignore in an automated fashion.
C. Not surprising to those that work in teams, the most important people to analyze are Influencers and Motivators (defined more specifically in the paper).
D. You can detect roles (like Influencers and Motivators) in online communities (like Facebook, LinkedIn, Twitter, or other more specialized forums like those for Hackers) and sift out individuals that detract from a community by measuring things like:
- The number of one and two way conversations,
- Whether those conversations or posts are directed at individual persons,
- The number of different people users converse with, and
- How close a user's connections are (first, second, or third level) in a given social network, among others.
E. This type of analysis can help businesses target to whom they market, social networks measure how healthy their communities are, and data scientists choose whom to target for more in-depth sentiment, natural language, or link analysis.
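The measures listed above can be illustrated with a minimal sketch. This is not the method from the paper; it only assumes that a community's activity is available as a list of (sender, recipient) pairs, with a recipient of None standing for a post directed at no one in particular. The function name interaction_measures and the field names are hypothetical.

```python
from collections import defaultdict


def interaction_measures(messages):
    """Count simple per-user conversation measures.

    messages: iterable of (sender, recipient) pairs. A recipient of None
    means the post was not directed at an individual (an assumption made
    for this sketch, not a format taken from the paper).
    """
    directed = defaultdict(int)    # posts aimed at a specific person
    undirected = defaultdict(int)  # posts aimed at no one in particular
    partners = defaultdict(set)    # distinct people each user addressed
    pairs = set()                  # every (sender, recipient) seen

    for sender, recipient in messages:
        if recipient is None:
            undirected[sender] += 1
            continue
        directed[sender] += 1
        partners[sender].add(recipient)
        pairs.add((sender, recipient))

    result = {}
    # Only users who posted at least once are reported.
    for user in set(directed) | set(undirected):
        # A conversation is two-way when the other person also wrote back.
        two_way = {p for p in partners[user] if (p, user) in pairs}
        result[user] = {
            "directed_posts": directed[user],
            "undirected_posts": undirected[user],
            "distinct_partners": len(partners[user]),
            "two_way_partners": len(two_way),
            "one_way_partners": len(partners[user]) - len(two_way),
        }
    return result


if __name__ == "__main__":
    sample = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("dan", None)]
    for user, measures in sorted(interaction_measures(sample).items()):
        print(user, measures)
```

Counts like these are only raw inputs. How they are weighted to label someone an Influencer or a Motivator is defined in the paper and is not reproduced here.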
Download the paper for more detail.
Dr. Robert Nolker, Analyze’s Vice President of Research and Development, will be presenting at the Sentiment Analysis Symposium in New York this week, March 5, during the Technology and Innovation workshops. Dr. Nolker will be presenting his groundbreaking research in identifying user roles within social networks using structure mining approaches. Dr. Nolker’s approach provides two primary benefits. First, a user’s role provides insight into how much weight their opinions or comments should be given in text and sentiment analysis. Second, role identification can be used to reduce the size of your dataset, an important step to reducing processing costs when doing text analysis. Dr. Nolker will demonstrate these structure mining techniques on cybersecurity networks, more specifically software vulnerability research forums, in order to demonstrate how to choose the most important targets for additional sentiment and text analysis.
Analyze successfully uses advanced analytics to improve marketing return on investment, reduce operational labor costs, and improve cybersecurity by providing businesses next generation analytics using machine learning, graph theory, and structure mining techniques.
Read more about Dr. Nolker at http://analyzecorp.com/executiveteam
Illegal fishing is a significant economic and environmental challenge for countries around the world. Up to 40% of fishing catch in certain parts of the world is unlawful or unregulated, resulting in approximately $10B to $20B in economic losses and significantly depleting international food stocks.
Using geospatial position information, data scientists at Analyze provided a reliable method for characterizing fishing behaviors among ships on the high seas. These methods have the potential to significantly improve interdiction of illegal fishing on the high seas.
Using data transmitted from the Automatic Identification System (AIS), Analyze studied nearly 500,000,000 data points for 110,000 vessels. They analyzed time-codes, vessel identity and motion data including: navigational status, rate of turn, speed over ground, lat/long, true heading, true bearing and more. The hypothesis was that unique motion behaviors could be associated with fishing activity using motion analytics. For example, circular and duplicative motions could indicate fishing behavior.
Analyze's research consisted of identifying and characterizing this unique motion behavior. To accomplish this, we employed a basic “big data” analytics strategy consisting of data acquisition, data extraction, transformation and loading, data analysis using statistical and machine learning approaches, predictive analytics and visualization.
Once the data set was identified for a specific geo-fenced area, Analyze utilized a number of analytics from the Mercury Motion Analytics Module that would aid in the discovery of motion behaviors including position information, boundaries & geocoding, distance, velocity, acceleration, motion primitives, shape conformance and consistency of motion. Measures were derived from this data.
We noticed that frequent and significant changes in the vessel's compass heading (erratic heading) and erratic changes in velocity were strong predictors of fishing activity. The vessels themselves use a navigational status of 7 to self-report fishing activity, but this status was both under- and over-reported throughout the data set. Analyze was able to derive a fishing prediction function using candidate analytics to positively identify fishing behavior on the open seas.
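To make the idea of erratic heading and erratic velocity concrete, here is a minimal sketch. It is not Analyze's fishing prediction function, which is not described in detail here. It assumes a vessel track is a time-ordered list of (heading in degrees, speed in knots) readings, and the function names and both thresholds are hypothetical values chosen only for illustration.

```python
import statistics


def heading_change(a, b):
    """Smallest absolute difference between two compass headings, in degrees."""
    diff = abs(a - b) % 360
    # 350 -> 10 is a 20 degree turn, not a 340 degree turn.
    return min(diff, 360 - diff)


def motion_features(track):
    """Average turn and average speed change between consecutive readings.

    track: time-ordered list of (heading_degrees, speed_knots) tuples.
    """
    if len(track) < 2:
        raise ValueError("need at least two readings")
    headings = [h for h, _ in track]
    speeds = [s for _, s in track]
    turns = [heading_change(a, b) for a, b in zip(headings, headings[1:])]
    speed_steps = [abs(b - a) for a, b in zip(speeds, speeds[1:])]
    return {
        "mean_turn": statistics.fmean(turns),
        "mean_speed_step": statistics.fmean(speed_steps),
    }


def looks_like_fishing(track, turn_threshold=30.0, speed_threshold=1.0):
    """Flag a track whose heading and speed both change erratically.

    Both thresholds are made-up illustration values, not tuned on any data.
    """
    features = motion_features(track)
    return (features["mean_turn"] >= turn_threshold
            and features["mean_speed_step"] >= speed_threshold)


if __name__ == "__main__":
    transit = [(90, 12.0), (91, 12.1), (90, 12.0), (89, 11.9)]
    trawling = [(10, 3.0), (80, 5.0), (300, 2.0), (200, 4.5)]
    print(looks_like_fishing(transit), looks_like_fishing(trawling))
```

A real predictor would presumably be fitted against labeled tracks rather than fixed cutoffs, and would not rely on the self-reported navigational status, since that field was unreliable in the data set.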
Data Scientists working on this analysis would be willing to discuss the process and methods used in this analysis. If you happen to be attending Strata 2014 in Santa Clara, visit Analyze in booth 928 in the Innovators Pavilion.
Data Science describes the processes, techniques, and tools used to extract deeper, non-obvious meaning from data of all kinds. Whether an organization is attempting to understand its customers, operations, competition, or market, data science draws from best practices in computer science and statistics to find more meaning in the world.
While most organizations already make basic observations about their data by tracking sales, operations, productivity, and customer satisfaction, these organizations don't realize how much data science can improve decision making.
For example, several years ago I was asked to analyze a company attempting to address staffing problems. After gathering data, I was impressed with this organization's breadth of understanding of its sales cycle and staffing. The sales department knew exactly who its customers were, how frequently they purchased, and when they purchased, and the company knew how many hours it paid employees and for which projects it paid them. With data fusion and trend analysis techniques, this data yielded much more depth, such as trigger events in the sales cycle that could be used to plan staffing and supply chain events.
You can read many more examples of how data science produces deeper insights into data on our Case Studies page.
Big data is code for difficult data. More precisely, it is any data set where traditional techniques (databases and software) are inadequate -- whether trying to store, query, manipulate, analyze, or otherwise use the data.
Because (by definition) Big Data is difficult, an industry is springing up, with various database, software products and analytical techniques to address the most common problems with traditional techniques. These problems are often described using the three (3) V's: Volume, Variety, and Velocity. Essentially, these are data sets that become too big, contain incongruous data types (such as video files, images, documents, and text and numerical values) or require real time storage (such as click behavior online or sensor outputs from satellites, cell phones, and vehicles).
The Big Data industry as a whole, in its effort to solve the 3Vs, is still working out how to draw additional, non-obvious meaning from difficult data, and this is giving rise to Data Science and Big Data Analytics.
In response to the requests we've received, we've created a simple video demonstration of our virtual Cybersecurity Training curriculum. You can read a sample list of our training curriculum here. Having developed more than 400 hours of the most advanced training curriculum and interactive cyber exercises for customers such as the Department of State and Department of Defense, Analyze's instructors offer both a mentored approach to teaching advanced cyber skills to practitioners and specific cyber demonstrations for managers and executives looking to stay abreast of cyber threats.
On June 19, Analyze hosted the first Illegal, Unreported, and Unregulated (IUU) Fishing Roundtable. With attendees from the United States, United Kingdom, and Israel, the Roundtable brought together the most influential players in the campaign against IUU fishing, including the National Oceanic and Atmospheric Administration, Pew Charitable Trusts, Google, SpaceQuest, Greenline Systems, IHS Fairplay, Windward Maritime Solutions, OrbComm, and SkyTruth. Read more about the meeting by downloading Analyze's report.
Innovation is a way of looking at the world. Analyze takes pride in being innovative outside the bounds of any industry, product, or service. Some of our most innovative ideas come from the shower, the gym, nap time, family time, and anywhere but work. We are passionate about our ideas, love to share them, and are thrilled to see them come to reality. Whether they make us a $M or $0.99, we love our ideas.
This is the mobile app for iPhone and Android designed to ensure that your phone never embarrasses you again. Open the app, click NOT HERE, and your phone will go silent. Whenever you return to that location, your phone remembers and will ensure you're not interrupted. Important meetings, Movies, Theaters, Parent Teacher Conferences, Church... you choose ONCE and you'll never be interrupted again. Download for Android from the Google Play Store here.