Data Mining of iTunes Library XML Files

-- Interested in contributing? Upload your iTunes Library now. --

For better or worse, I appreciate a good argument. No doubt that is largely thanks to my dad, a natural Devil’s advocate who sparked many discussions ranging from the mundane to the insightful (often to the chagrin of my mom). Whatever the cause, I have developed a knack for taking a stance and justifying it, for conceding a losing battle and reveling in a victory. This is one of the (many) reasons I miss living with Reid Draper.

You see, Reid is a music fanatic, through and through. One of the few folks in this day and age to spend more money on music than on food, he lives for music discovery, and has found inner happiness at his sweet job in Boston at the EchoNest. It is no surprise, then, that our ongoing debate about the commercial viability of music discovery technologies is a lively one. While Reid has chosen his side, I’m not fully convinced there’s a market for it (NOTE: GMuE alum Kurt Jacobson has an interesting post related to this).

Truth is, this discussion has serious implications for a decent number of people. The current consensus in the music information retrieval community appears to be that its “killer app” is the combination of music recommendation and discovery. The risk is that the individuals fervently researching the problem tend to be the ones with a vested personal interest in the technology. Obviously, those who want something will work the hardest for it. But the million-dollar question is: how many people actually want it?

I want to try to get a straight answer to that question. Starting in the next week or so, I’ll be launching a web portal to collect iTunes Library XML files from everyone I can get, along with some demographic information. Is the data clean? No way. Will it provide a definitive result? Probably not. But will it yield insights into the usage habits of an (ideally) large sample of listeners? Oh heck yes. To get the thinking started, here are some of the questions I plan to explore:

  • How have people added music to their libraries?
  • How are play counts clustered with respect to album/artist?
  • Will anything shake out that might be that killer app?
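
To make the play-count question a little more concrete, here is a minimal sketch of how it might be approached in Python. It relies on the fact that an iTunes Library XML file is an Apple property list with a "Tracks" dictionary whose entries carry keys such as "Artist" and "Play Count". The function names (play_counts_by_artist, top_share) are hypothetical, chosen only for illustration, and this is not necessarily how the actual analysis will be done.

```python
import plistlib
from collections import Counter


def play_counts_by_artist(library_bytes):
    """Sum each track's "Play Count", grouped by its "Artist" key.

    library_bytes is the raw content of an iTunes Library XML file,
    which is a property list that plistlib can parse directly.
    """
    library = plistlib.loads(library_bytes)
    totals = Counter()
    for track in library.get("Tracks", {}).values():
        # Both keys are optional in practice: never-played tracks have no
        # "Play Count", and some tracks have no "Artist" at all.
        artist = track.get("Artist", "Unknown Artist")
        totals[artist] += track.get("Play Count", 0)
    return totals


def top_share(totals, n=10):
    """Fraction of all plays that go to the n most-played artists."""
    all_plays = sum(totals.values())
    if all_plays == 0:
        return 0.0
    top_plays = sum(count for _, count in totals.most_common(n))
    return top_plays / all_plays
```

Reading a library from disk would then be a matter of passing open(path, "rb").read() to play_counts_by_artist, where path points at the exported XML file. A value of top_share close to 1 for a small n would suggest that listening is concentrated on a handful of artists; grouping on "Album" instead of "Artist" gives the album-level view.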

That being said, there are plenty of other things to look for, and I’d love to hear comments and feedback from interested parties. As things develop, I’ll continue to post my progress here.
