(Honestly, this is the *main* reason I petitioned Twitter for a verified account.)
Twitter is diluted and hard to parse, but it is still my easiest and fastest source of live, up-to-the-minute news from around the world.
Just this morning, I knew about a terrorist attack in London as it was happening thanks to my feed, and I am routinely alerted to similar important events just as quickly.
A few good rules of thumb: run Twitter searches that exclude Retweets and “RT”, drop tweets that have an overabundance of hashtags, and, if you are using an external API, safely drop tweets with links posted by *certain* automated subscription services, as they just regurgitate content.
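Here is a rough sketch of those rules of thumb in Python, not a definitive filter. It assumes each tweet has already been fetched and reduced to a plain dict with hypothetical `text`, `is_retweet`, and `source` keys (your API client will name these differently). The hashtag threshold and the `AUTOMATED_SOURCES` set are placeholders I made up for illustration; tune both to your own data.

```python
import re

# Assumed threshold: more hashtags than this counts as "an overabundance".
MAX_HASHTAGS = 3

# Hypothetical names of automated posting services; fill in your own list.
AUTOMATED_SOURCES = {"example-autoposter"}

HASHTAG_RE = re.compile(r"#\w+")
LINK_RE = re.compile(r"https?://\S+")
RT_RE = re.compile(r"^\s*RT\b")  # old-style manual retweets start with "RT"


def is_noise(tweet):
    """Return True if the tweet trips any of the rules of thumb."""
    text = tweet["text"]
    # Rule 1: native retweets and manual "RT" copies.
    if tweet.get("is_retweet") or RT_RE.match(text):
        return True
    # Rule 2: too many hashtags.
    if len(HASHTAG_RE.findall(text)) > MAX_HASHTAGS:
        return True
    # Rule 3: links posted through a known automated service.
    if tweet.get("source") in AUTOMATED_SOURCES and LINK_RE.search(text):
        return True
    return False


def filter_signal(tweets):
    """Keep only the tweets that pass every rule."""
    return [t for t in tweets if not is_noise(t)]
```

Keeping the rules in one small predicate makes it easy to add or loosen a rule once you see what it throws away.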
If you take my (very active) Twitter feed as an example, you’ll find that the majority of the content I post is unique – I’m the original Twitter source, and because my account is verified, people tend to trust the things I post. I do also have a smattering of likes/retweets each day, but more “new” stuff than not.
With that in mind, look for the original source of the tweets and you’ll find the influencers for whatever topic you are tracking; you can then tune your algorithms to reduce the noise around them.
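One simple way to surface those original sources is to count who posts the most original (non-retweet) tweets on a topic. This is only a sketch under assumptions: the tweets are plain dicts with hypothetical `author`, `text`, and `is_retweet` keys, and raw volume of original posts is a crude stand-in for influence that you would want to combine with other signals.

```python
from collections import Counter


def is_original(tweet):
    """True if the tweet is neither a native retweet nor a manual "RT" copy."""
    if tweet.get("is_retweet"):
        return False
    return not tweet["text"].lstrip().startswith("RT ")


def rank_original_sources(tweets, top_n=5):
    """Return (author, original_tweet_count) pairs, most prolific first."""
    counts = Counter(t["author"] for t in tweets if is_original(t))
    return counts.most_common(top_n)
```

Authors who only ever retweet never show up in the ranking, which is the point: they are the noise around the source.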
Originally Posted: https://www.quora.com/How-would-one-sanitize-the-signal-from-the-noise-in-the-data-from-Twitter-given-how-pervasive-fake-tweets-likes-retweets-comments-etc-are
Originally Posted On: 2017-03-22