I’m increasingly disillusioned with Google Reader for keeping up to date. It’s better than visiting sites manually, but I’d rather have a twitter-based model. I am thinking of setting up a twitter feed dedicated to the same blogs as the Otakusphere search engine (henceforth abbreviated as the OSE). Will work on this and report back when I’ve got something to show…
The Politico interviewed former chief of staff for President Bush (43), Andy Card, and he mentioned this surprising tidbit:
The nation has been so focused on the 44th president that we’ve nearly forgotten about the 43rd. What exactly has George W. Bush been up to since he left office on January 20? Is he just lounging around or has he been keeping busy?
Who better to ask than Andrew Card, a Bush confidant who served as W.’s chief of staff from 2001 to 2006. We caught up with Card at the National Press Club Thursday after a panel discussion sponsored by Politico and Georgetown University.
Card says that Bush has plenty on his plate and may even (gasp!) delve into the tech-savvy world of Twitter and Facebook.
Watch the video yourself:
One of my mantras is to rely on others to filter my data on the social web, because the key to improving your signal-to-noise ratio is not to try and filter the noise, but actually to reduce your signal. That’s a lot harder to do than it sounds. But it’s made a lot easier by genuinely smart filterers like Dave Winer, whose NewsJunk was an invaluable tool during the election season. Winer basically culled the best and most interesting news stories (by hand) and fed them to a dedicated RSS feed, which then fed into twitter. As a result I often briefed myself on the day’s politics by first checking @newsjunkies rather than wading cold into my mess of feeds in Google Reader. This is why I am genuinely sad to see that Winer is considering pulling the plug on NewsJunk now that the election has ended.
Twitter: over one billion tweets served. Actually, it’s probably more than that, since the count is from GigaTweet, an external service and not an official count. If we do the math, that comes out to:
140 chars per tweet x 1 byte per char x 10^9 tweets = 140 billion bytes = 130.4 GB worth of data
The 1 billion tweet mark took Twitter just over two years to achieve. Even assuming exponential growth, it’s hard to see Twitter’s raw tweet storage needs exceeding a terabyte ($109 and falling) in the next five years.
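To sanity-check that arithmetic, here’s a throwaway Python snippet; note the one-byte-per-character assumption ignores multi-byte characters, and most tweets run well under the full 140 characters anyway, so the real figure is smaller still:

```python
# Back-of-envelope estimate of Twitter's raw tweet storage,
# assuming every tweet uses all 140 characters at 1 byte each.
CHARS_PER_TWEET = 140
BYTES_PER_CHAR = 1
TWEETS = 10**9

total_bytes = CHARS_PER_TWEET * BYTES_PER_CHAR * TWEETS
print(total_bytes)                    # 140,000,000,000 bytes
print(round(total_bytes / 2**30, 1))  # ~130.4 GB (binary gigabytes)
```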
Of course, raw storage alone isn’t the whole story, since unlike the gigabytes of data on our home computers, the data on Twitter needs to be actively accessed and queried from the databases, which is a non-trivial task (as any regular user of Twitter over the past year can attest to). This is probably why Twitter has been enforcing a limit of 3200 tweets on users’ archives. The overhead on maintaining the archives beyond that is probably a threat to Twitter’s ability to maintain uptime and service reliability. The limit seems reasonable, since only the heaviest users will have reached that limit by now – I’ve been on twitter longer than most of the A-listers, and I tweet every blog entry I make from 5-6 different blogs, but I’m still only around 1200 tweets. Also, with far fewer followers (several hundred instead of thousands), I have only a handful of @replies compared to the firehose that folks like Darren (@problogger) or Scoble (@Scobleizer) see on their real-time clients. As a result, Twitter is more akin to an email/private messaging system for users like myself, rather than a real-time chatroom for the big users.
Still, even a casual Twitter user should be at least partially alarmed at the thought that their entire Twitter history is subject to arbitrary limits and no real guarantee of backup. As usual, it’s up to us to protect our own data, especially data in a walled garden (albeit one with handy RSS and API gates). Good user practices are the same whether we are using an online service or word processing at home, after all.
Here are just a few ways in which you can backup your tweets. I am sure there are more, so if you have any ideas I’ve not listed here, please share in comments!
Tweetake.com – This service lets you enter your username and download a CSV-format file of your followers, favorites, friends, and tweets. Unfortunately, @replies are not available for backup. It doesn’t save direct messages, either, but if you configure your twitter account to send you notification emails of direct messages, you can at least archive those separately. The CSV format is useful for archiving but not very user-friendly, though you could in principle import the data again into some other form.
Alex King’s Twitter Tools – This is a WordPress plugin that lets you manage your twitter account directly from your WordPress blog. The plugin lets you blog each tweet and/or tweet each blog post, and you can also generate a daily tweet digest as a blog post if you choose (and assign it to an arbitrary category). There’s no way to archive replies, DMs, or follower relationships.
Twitter itself supports RSS feeds, so you could slurp your own feed of replies and tweets using a feedreader and periodically back those up or even write them to disk. Users of third-party services like Socialthing, Friendfeed, or Ping.fm also have an alternate interface to Twitter that could potentially be used for backup. However, none of these provide comprehensive tweet archives either, only real-time mirroring.
Finally, Dave Winer has proposed a service/API that twitter clients can use to backup the RSS feed of a twitter account, but this is more of a technical solution of interest to twitter developers rather than end users.
UPDATE: Johann Burkard has written a great little tool in Java called (appropriately) TwitterBackup. It is a very simple piece of freeware that simply downloads all your tweets to an XML-format file saved locally. You specify whatever filename you like, and the tool is smart enough that if you give it the name of a file that already exists, it will only download newer tweets and append them to it rather than do a full download again. This incremental backup of tweets is ideal behavior – the only thing that this tool doesn’t do is preserve your follower/following relationships.
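That incremental behavior is simple at heart: remember the highest tweet ID you’ve already archived, and append only tweets above it. Here’s a minimal Python sketch of the idea; the tweet dicts and the `merge_tweets` helper are my own illustration, standing in for the real API calls and XML file handling an actual tool would need:

```python
def merge_tweets(archive, fetched):
    """Append to the archive only tweets newer than anything in it.

    Tweets are dicts with an integer 'id'; Twitter assigns IDs in
    increasing order, so the max archived ID marks where to resume.
    """
    newest = max((t["id"] for t in archive), default=0)
    new_tweets = [t for t in fetched if t["id"] > newest]
    archive.extend(sorted(new_tweets, key=lambda t: t["id"]))
    return archive

archive = [{"id": 1, "text": "first"}, {"id": 5, "text": "second"}]
fetched = [{"id": 5, "text": "second"}, {"id": 9, "text": "third"}]
merge_tweets(archive, fetched)
print([t["id"] for t in archive])  # [1, 5, 9]
```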
To be honest, none of these solutions are perfect, though Tweetake and TwitterBackup come closest. What would the ideal twitter backup tool look like? A few thoughts:
- It should be available as a desktop client or Adobe AIR application rather than yet another online service asking for your twitter password. ((Twitter’s implementation of OAuth or OpenID or some other authorization system is long overdue, by the way.))
- At first run, it should allow you to retrieve your entire (available) twitter history, including tweets, replies, and DMs.
- After the initial import, it should provide for periodic incremental backups of your tweets/replies/DMs, at an interval you specify (ideally, a five minute interval minimum).
- It should preserve your friend/follower relationships, and let you import everyone you follow onto any new twitter account or export all their RSS feeds as an OPML file.
What else? There’s definitely a niche out there for an enterprising developer to take Twitter’s API and create a tool focused on backup rather than yet another twitter client. Hopefully before I reach the 3200 tweet limit myself!
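As a taste of how little work the OPML piece would take, here’s a rough Python sketch that renders a list of followed usernames as an OPML outline of their tweet feeds. The per-user RSS URL pattern is illustrative rather than gospel, so treat it as an assumption:

```python
from xml.sax.saxutils import quoteattr

def follows_to_opml(usernames):
    """Render followed Twitter usernames as an OPML outline, pointing
    each entry at that user's public tweet RSS feed (URL pattern is
    illustrative of the per-user feeds Twitter exposes)."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<opml version="1.1">',
             '<head><title>Twitter follows</title></head>',
             '<body>']
    for name in usernames:
        url = "http://twitter.com/statuses/user_timeline/%s.rss" % name
        lines.append('<outline type="rss" text=%s xmlUrl=%s/>'
                     % (quoteattr(name), quoteattr(url)))
    lines.append('</body>')
    lines.append('</opml>')
    return "\n".join(lines)

print(follows_to_opml(["davewiner", "Scobleizer"]))
```

Any feedreader that imports OPML could then resubscribe to everyone you follow in one shot.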
Darren Rowse of ProBlogger fame has launched a new blog, TwiTip, aimed at introducing Twitter to new users. Darren always has an interesting and insightful take on blogging and so I think his insights will be worth reading even if you’re a veteran twitter user. Given how much I blog about twitter I can fully understand the appeal of starting a blog devoted to it!
At RWW, Bernard Lunn asks readers to suggest a revenue model for Twitter, that satisfies two criteria:
1. Do not irritate/interrupt the user and even occasionally add value to the user.
2. Provide a value proposition that is so compelling that even conservative buyers give it a try.
There’s actually a fairly simple solution that meets the criteria above, and it relies on a relatively new feature that Twitter introduced primarily for the 2008 presidential election: topics pages. The common topics pages are candidate-specific ones like “Obama” or “Palin”, but there are also new topical ones being generated such as “Muslim” or “Colin Powell”. Note that these topical pages, unlike the candidate pages, are dynamic and fade into and out of existence based on the real-time activity of twitter users, so these truly are a snapshot of current discussion rather than any kind of archive or comprehensive index. There’s even a “tag cloud” at the top of the main election page that shows what the current topics are, and the topics can be filtered by candidate (for example, “Obama and muslim”).
These topics and candidates pages are election-centric for obvious reasons, but there’s no reason that they can’t be expanded in scope, analogous to the breadth of various topics at alltop.com. The crucial difference here however is that the content is entirely user-generated tweets rather than RSS feeds of news and blogs, and is presented as a real-time “river” of information.
So, then, how to monetize? Simple: imitate Google, and sell ad space on the topics pages. Twitter could even partner with Google or Yahoo and share the revenue. Imagine a partnership with Google, for example: AdWords purchasers would buy ads for specific keywords, and if/when those keywords become Topics at Twitter, their ads would display. Likewise, contextual ads based on the real-time river of tweets for a given topic could also scroll by in the sidebar, or appear interspersed.
The point here is that Twitter has created instantaneous portals for the hottest topics of the day, and what makes it so useful as an end-point destination for websurfers is that the twitter users are generating the content, providing both links and commentary. So, the real estate created by these topics pages has real value for advertising, as long as it is contextual and targeted. But targeting is easy because instead of having to analyse the entire webpage (as Adsense does at present), the contextual algorithm has a head start because of the topic itself. Then the remaining contextualization can be done on the river of tweets for fine-tuning. This should ensure better relevancy and higher click-through overall.
Are you on Twitter? Share some cool people to follow.
I like Musab (@musabb) – he twitters exclusively in haiku 🙂 If you’re really into haiku on twitter, you’ll need to follow Haiku Twaiku (@haikutwaiku), but they updated a bit too often for my taste, so I just use the #haiku hashtag to get my fix as needed instead. There’s also Twitter Lit (@twitterlit), which only posts the first line of novels.
Twitter is also a handy source of news – you can get breaking news from CNN (@cnnbrk) or international news headlines from Al Jazeera English (@AJEnglish). I also rely on News Junkies (@newsjunkies) for politics headlines and Tech Junk (@techjunk) and Read/Write Web (@rww) for tech news. I’m also a fan of China Web 2.0 Review (@cwr) and Malaysia Matters (@malaysiamatters).
Of course, the punditocracy is well-represented as well. In politics, there’s The Politico (@ThePolitico), Joe Trippi (@JoeTrippi), Patrick Ruffini (@PatrickRuffini), Joshua Trevino (@jstrevino), and Marc Ambinder (@marcambinder). On the tech side, there’s Robert Scoble (@Scobleizer), Dave Winer (@davewiner), Mike Arrington (@techcrunch), and Michael Parekh (@MParekh). You can also follow Lawrence Lessig’s new organization, Change Congress (@change_congress).
There are also a lot of simply interesting people and celebrities on Twitter. For example, Muhammad Saleem (@msaleem), Om Malik (@om), Felicia Day (@feliciaday), and Wil Wheaton (@wilw) (yes, that Wil Wheaton). The Mars Phoenix Lander (@MarsPhoenix) also is quite talkative.
I intended to write a blog post on this topic, but ended up using PowerPoint to organize my thoughts, and then realized that the resulting slideshow made the post somewhat superfluous. It is a rumination on the problem with web2.0 today (information overload), some solutions, and speculation about where we go from here:
Twitter is down for “database replication catchup” (a phrase I assume has meaning to someone fluent in mysql-ese). I found the associated graphic rather amusing:
Maybe Twitter would be more reliable if all the birdies were pulling the whale in the same direction?
I was somewhat bemused by the new service that lets you put Twitter friends on “snooze”. The idea is that if someone is very prolific (ahem @Scobleizer, ahem) then you may want to take a break from their updates for a while. While it is certainly true that some users drown out others in your stream by sheer volume of tweets alone, I think that “snoozing” them is the wrong approach, because you are missing out on data. What would be more useful would be the ability to mark all tweets from a specific user as “read” analogous to how you mark emails in your inbox as read. This would have the effect of hiding all posts from that user dated prior to the time you marked them as read. That way the Twitterers you follow who do not update as often can be rescued from the stream, because as you mark the more prolific users read, they are left behind.
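To make the idea concrete, here’s a rough Python sketch of that mark-as-read filter. The data structures (a per-user timestamp recording when you last marked that user read) are my own invention for illustration, not anything Twitter or any client actually offers:

```python
from datetime import datetime

def unread_tweets(stream, read_marks):
    """Hide each user's tweets dated at or before the moment you
    marked that user 'read'; unmarked users stay fully visible."""
    epoch = datetime.min
    return [t for t in stream
            if t["time"] > read_marks.get(t["user"], epoch)]

stream = [
    {"user": "scobleizer", "time": datetime(2008, 6, 1, 9, 0),  "text": "a"},
    {"user": "scobleizer", "time": datetime(2008, 6, 1, 12, 0), "text": "b"},
    {"user": "quietuser",  "time": datetime(2008, 5, 30, 8, 0), "text": "c"},
]
marks = {"scobleizer": datetime(2008, 6, 1, 10, 0)}
print([t["text"] for t in unread_tweets(stream, marks)])  # ['b', 'c']
```

Notice that the quiet user’s older tweet survives the filter while the prolific user’s pre-mark tweet is hidden, which is exactly the rescue effect described above.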
This might actually be more appropriate as a Greasemonkey script. Regardless of how it is implemented, it would make following hundreds or thousands of other people much more manageable. Or even two Scobles.