Deep Learning in MIR, from Concept to Conversation

By | October 2, 2012

With ISMIR now less than a week away, it’s quickly become that time to start outlining goals and setting expectations for the conference. There is no shortage of exciting ideas that can be traced back to our annual pilgrimage, and this year looks to be no different.

Personally, I’m genuinely stoked for the opportunity to present a position piece at the MIRrors session on Friday, titled “Moving Beyond Feature Design: Deep Architectures and Automatic Feature Learning in Music Informatics.” For reasons I’ll hopefully be able to address in the talk, I strongly believe that the future of MIR resides in the tandem of deep information processing architectures and automatic feature learning. I’m wholly convinced that the potential to solve difficult problems in music informatics and, more broadly, artificial intelligence will come on the heels of advances in deep learning, and that we as a community both can and should be at the bleeding edge – rather than the lagging tail – of these breakthroughs.

Now more than ever, this is an especially critical moment to have this discussion at length. Last year, just before ISMIR, I was chatting with a colleague about the impending conference which, as you’ll recall, was in Miami. When I asked what they were looking forward to, as I couldn’t attend, the answer was sobering: “To be honest? The beach. I’m not sure I can sit through a bunch of presentations about someone’s fancy new feature extractor.” I found the insight particularly poignant, not because it was so perfectly candid, but because it succinctly captured an increasingly prevalent sentiment: content-based MIR is getting stale.

Since we’re being candid, I feel it’s hardly a risk to say that a variety of research areas simply feel stuck. I’m not alone when I doubt that genre classification on MFCCs would just work if only someone devised a more powerful classifier. Do we really believe human-comparable chord detection can be solved with short-time chroma features, and that the answer lies in more complicated post-filtering methods? Even more concerning is the fact that these are just the tasks we think we understand. How will these methods scale to artist recognition with more than a few dozen unique bands? Is this really the future of MIR?

The truth is, we have other options now. Methods that were once overhyped and disappointingly ineffective are finally practical and delivering on old promises. It is often the case, though, that discussions of “neural networks” have a polarizing effect, resulting in the two common camps of evangelists and skeptics; many have already chosen sides. One of my goals, therefore, going into the next week and beyond is to encourage this dialogue as a single group with a common objective, in hallways and hotels, around coffee carafes and bar stools, because I believe it is an incredibly important one to have. We need to consider our options with an open mind. We need to question our reasoning and challenge our assumptions. What’s more, we need to be honest.

If it wasn’t already apparent, I’m enthused by our place at a collective crossroads. In the interest of making the most of our shared time in Portugal, it is my hope that some mildly provocative text may catalyze a reaction we can transform into productive conversation. I encourage any and all responses here, and look forward to seeing everyone in Porto.

recovery mode

By | September 30, 2012

Ack. It seems my SQL database has been corrupted and a year’s worth of posts has vanished. I’m planning an attempt to restore them for posterity, but in the meantime this site will be a bit barren.


content explosion, and other rarities.

By | February 22, 2011

due to a course I’m currently taking at NYU – new media research studio taught by Jonah Brucker-Cohen – recent blog post productivity has shot through the ceiling. in case anything goes awfully awry over at blogspot, where we’re hosting the entire class’s output, I’m going to repost my own entries here.

also. blogspot’s HTML editor is atrocious… but that’s more of a personal grievance than anything.

populating my own blog here in reverse chronological order, I just added / updated “social software is a blessing (no disguises necessary)”. more to follow soonish.

word of the day.

By | January 7, 2011

next – genvy

[nekst jen-vee] – noun

The feeling one gets as a once-proud Apple device owner, namely iPhones and iPads, when the next generation is unveiled and crushes said individual’s soul for being infinitely better than the tech that person is now begrudgingly stuck with. Detailed illustration here.

so twitter4j and a java applet go on safari…

By | December 1, 2010

Alrighty, I’m going to go ahead and interrupt my workflow on the ever so slight, but still non-zero, chance that someone else in the world is struggling with incorporating twitter4j into a deployable Java applet. I’ve been slamming my head against the wall (figuratively… so far), and since I couldn’t (easily) find this information, I hope this will help… someone… eventually.

First, a little background: for my Java Music Systems course this semester, I’ve settled on a project with the goal of developing an applet that can interactively sonify Twitter data in real-time. How, exactly, is an aesthetic decision I have yet to make, but I’m probably going to need access to the streaming API. Regardless of how this ultimately makes sound, one thing is unavoidable: I need OAuth handshaking.

If you know anything about OAuth, your eyes are probably already wide with security concerns related to trying to pull this off in any kind of client-side software. I know. I’ll be the first to admit that an applet is probably not the best platform for the task at hand, so I’ll dodge that responsibility with a “that’s not the point, I’m supposed to use an applet.” However, the bottom line is this: it would appear that, at least for the twitter4j API, an unsigned applet cannot make the requests necessary for OAuth authentication. Womp.

make a couple twitter4j calls and your unsigned app gets the cold shoulder...

In hindsight, this makes sense. I’m new to Java, and to the bittersweet sentiment I’ve fostered toward applet development, but restricting an unsigned applet’s access to servers other than the one it was loaded from definitely feels like something I read somewhere. Figuring out for certain that this was in fact the issue, though, took far too long.
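For anyone who wants that rule spelled out, here is a minimal sketch of how I picture it, not the JVM’s actual implementation. It assumes the sandbox check boils down to comparing the host the applet was loaded from with the host it wants to reach, and it ignores details such as ports, DNS resolution, and cross-domain policy files. The class name, the method name, and the example.org code base are all hypothetical.

```java
import java.net.URI;

/**
 * Simplified model (an assumption, not the real JVM logic) of the
 * "same host only" network rule applied to unsigned applets.
 */
public class AppletSandboxSketch {

    /**
     * Returns true when the target URL lives on the same host as the
     * applet's code base, which is the only case this sketch treats
     * as allowed for an unsigned applet.
     */
    public static boolean wouldSandboxAllow(String codeBase, String target) {
        String originHost = URI.create(codeBase).getHost();
        String targetHost = URI.create(target).getHost();
        if (originHost == null || targetHost == null) {
            return false; // no host to compare, so treat it as blocked
        }
        return originHost.equalsIgnoreCase(targetHost);
    }

    public static void main(String[] args) {
        // Hypothetical location the applet was served from.
        String codeBase = "http://example.org/applets/";

        // Same host as the code base: allowed in this model.
        System.out.println(wouldSandboxAllow(codeBase, "http://example.org/feed.xml"));

        // Twitter's OAuth endpoint is a different host: blocked in this model.
        System.out.println(wouldSandboxAllow(codeBase, "https://api.twitter.com/oauth/request_token"));
    }
}
```

Under this model the OAuth handshake is the first thing to fail, since the request token has to come from Twitter’s servers rather than from wherever the applet is hosted.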

I was rather surprised to learn that neither Google Chrome nor Mozilla Firefox has an active, up-to-date Java Console plugin for developers. My initial reaction was, “geez, everyone must really be calling it quits on Java Applets” (although with JavaFX being effectively shipped with NetBeans, I may just be developing with the wrong tools). Safari, on the other hand, offers a native Java console for all of your live debugging purposes. This can be toggled through /Applications/Utilities/Java, where you’ll want to go ahead and set “Show Console” under the Advanced tab. Disclaimer: this is all with respect to OS X 10.6, so your mileage may vary on a different OS.

In summary:

  • The Twitter API is sweet, and I definitely recommend checking it out. I’m having far more fun with Twitter as a developer than as a user.
  • Twitter4J is also sweet. Major kudos to YY and his hard work…
  • But it looks like it won’t work in an unsigned applet.
  • Safari has a Java console (ftw). Chrome and Firefox, no dice.

And with that, it’s back to the grind.

new post, plus cognitive surplus

By | November 30, 2010

Given a recent immersion in some readings about copyright/left, I just rattled off a post about free culture confusion… I strongly hold that the whole discussion could benefit from a semantic facelift.

In other news, this visualization from David McCandless of a statistical approximation of Clay Shirky’s cognitive surplus (from his excellent book, Cognitive Surplus) is one of the more promising observations I’ve seen in some time.

Beware the collaborative creativity of crowds!

Dreaming in Beta

By | November 10, 2010

Welcome to the slightly overhauled site. In a concerted effort to grow the site a little bit and share some of the knowledge and insight I’m starting to accumulate, I’m aiming for consistent and regular updates.

As it goes, the times, they are a’changing.