Thursday, May 22, 2008

Interactive "installations" with openFrameworks

Found a fascinating C++ framework called openFrameworks, whose primary purpose is to allow artists/programmers to create interactive installations. Click on the video labelled "made with openFrameworks" - the video is done quite badly, unfortunately, but the stuff that it showcases is amazing. Drawing "graffiti" on a building using a laser pointer, "bugs" that crawl off an LCD screen and onto your arm when you touch the screen, popping "bubbles" on a massive video wall, interactive floors...

Here's the video:

made with openFrameworks from openFrameworks on Vimeo.

Most of the technology seems to be based on a combination of live video feeds with clever motion tracking and/or face recognition.

I think Quartz Composer could do some of this stuff using custom "Core Image Filter" patches, which use OpenGL shading language and Core Image extensions. Or not. I don't know. I'm not a programmer. Anyone?

Monday, May 12, 2008

PowerSet a "Google-killer"? Er, I don't think so...

If you read articles such as PCWorld's Powerset Unveils Test Version of Google-killer, you might be tempted to think that the so-called "semantic web" is here. (In fact, that's exactly what some articles attempted to suggest: Powerset brings the Semantic Web to Wikipedia.)

To save you some reading: Powerset is basically a "plain language" search tool, i.e. instead of being "keyword"-based, the search engine supposedly understands plain English questions and actually connects your question to meaning and facts. At present, it only searches Wikipedia, not the entire web, but you can imagine the appeal of a system that can find out exactly what you're looking for, from the world's largest encyclopedia.

Let's deal with the notion that this is a "Google-killer". Let's put it to the test.

I played around with Powerset myself, just to see what all the fuss is about. I tried a few example searches, stuff I figured must be more or less in line with what the system is designed to do. I actually did some pretty easy examples, e.g. "When was Bill Clinton inaugurated as president?" That worked perfectly. Then I tried something else easy: "Where were the 1936 Olympic Games held?" And straight away the system fell apart. You can see the results for yourself. While you could more or less "gather" from the results what the answer is, you can see that the system hasn't really understood what you were looking for. Clearly, it hasn't understood the meaning of the word "held" in the context, because the first result is the article "Popular Front (France)", and the sentence it thinks is most relevant is the following:

In this complex situation, Léo Lagrange held fast to an ethical conception of sports which rejected both fascist militarism and indoctrination, scientific racist theories as well as professionalisation of sports, which he opposed as an elitist conception which ignored the main, popular aspect of sport, which should aim, according to him, for the fulfilment of the personality of the individual. ... The 1936 Olympic Games

And the second result is the article "1936 Summer Olympics" - indeed the correct article, but once again it almost looks like it was keywords alone that helped it to appear at all, and certainly the meaning of the question was completely misunderstood:
Due to the quagmire, the teams could not dribble, thus the score was held to a minimum. ... – Adolf Hitler, commenting on the 1936 Berlin Olympic Games


Compare the same question typed directly into a Google search. Yes, despite what Powerset might want you to believe, you can actually type full, plain-English sentences into Google. I typed "where were the 1936 Olympic Games held" and here were the results. Shock! Google does not get thrown off by the "plain English" sentence, and in fact the top result it returns is spot on:
1936 Summer Olympics - Wikipedia, the free encyclopedia
The 1936 Summer Olympics, officially known as the Games of the XI Olympiad, were held in 1936 in Berlin, Germany. Berlin won the bid to host the games, ...

And the cherry on the top... The result is from Wikipedia, which is what Powerset is designed to search!

So in a simple example, Google shows itself not only capable of making meaningful sense of plain English search strings, but in fact does a better job of locating relevant information than Powerset, which is supposedly designed precisely for that purpose.

A "Google-killer"? Hardly. Powerset are trying to perpetuate some kind of myth that Google is a pure "keyword-based" search engine, as if it blindly lists articles by how many times the keywords appear in the text, without any concern for context. Which is total nonsense. Google's algorithms are far more advanced than that - which is precisely why it is the number one search engine on the Web, trusted by millions all over the world to help them find what they're looking for.
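
To make the "pure keyword counting" idea concrete, here is a minimal Python sketch of that kind of ranking. It is purely illustrative: the function names and the two sample sentences are made up for this example, and it is certainly not how Google or Powerset actually work.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, document):
    """Count how many words in the document also appear in the query."""
    terms = set(tokenize(query))
    return sum(1 for word in tokenize(document) if word in terms)


def rank(query, documents):
    """Sort documents by raw keyword count, highest first (no notion of meaning)."""
    return sorted(documents, key=lambda d: keyword_score(query, d), reverse=True)


# Hypothetical sample sentences, loosely modelled on the results above.
query = "where were the 1936 Olympic Games held"
off_topic = "The team held the score. The 1936 Olympic Games"
on_topic = "The 1936 Summer Olympics were held in Berlin"

best = rank(query, [on_topic, off_topic])[0]
```

With these two sentences the blind counter puts the off-topic one first, simply because it repeats more of the query words. That is the behaviour a context-free keyword engine would show.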

And the media are carrying on as if Powerset is some sort of "new thing" - also complete nonsense, as we've seen plenty of these "plain English" search engines before. AskJeeves was one example. None have ever lived up to their promise, and Google made them all redundant anyway because it became so good at finding relevant results.

And as for Powerset being the long-promised "semantic web"... Well, isn't the semantic web supposed to be about a lot more than just being able to type plain English search terms? Powerset is a gimmick, yet another piece of software trying to "trick" the user into seeing some sort of artificial intelligence where really there is none - just another search engine, and one not even up to Google's standard, at that.

Friday, May 09, 2008

James Cameron on 3D

Fascinating interview with James Cameron, discussing his latest movie, Avatar, shot in stereoscopic 3D: "James Cameron supercharges 3-D".

Cameron goes into some detail about the technical aspects of 3D film production (more succinct and correct than many supposedly "technical" articles, actually); but he spends more time talking about the creative aspects and the directorial issues that 3D presents. That's a discussion which I haven't come across often, yet. Too often the "gimmick" of 3D gets talked about ad nauseam - but Cameron gives us a sense of what the creative possibilities are and what it is actually like to shoot a feature film in 3D. Also, he gives some insight into why 3D matters at all.

As more or less a side-issue, he argues strongly for the adoption of 48P as the new standard for film acquisition and projection - specifically for 3D, but also for 2D films.

Tuesday, May 06, 2008

Confirmed: Apple iPhone in South Africa!

Finally, it looks as if we might be getting iPhones - and not the "hacked" variety - in South Africa. Articles such as PC World's "Vodafone Will Sell IPhone in Ten Countries" confirm that Vodafone, which owns local networks such as Vodacom, will be selling the iPhone in various countries.

Here's the relevant bit:

Customers in Australia, the Czech Republic, Egypt, Greece, Italy, India, Portugal, New Zealand, South Africa and Turkey will be able to purchase the iPhone for use on Vodafone's networks there later this year.


For all we know, the cost might be absurd, given the "cut" Apple insists on taking from every phone sold. But at least it's here. Almost.

Friday, April 25, 2008

More synchronised multi-screen playback


Just to make it even more clear than in my previous post, here's another photo of synchronised playback using two very simple Quartz Compositions. You can see three different clips are playing: they all happen to be exactly the same length (they were from the same series), but only the one playing on the MacBook Pro is using its "own" timebase to play back the clip.

The other two displays are on the same computer, a Mac Pro with dual monitors. I simply duplicated the same "playback" Quartz Composition, but with each set to use a different movie clip as the source. I then made each Composition go into Full Screen mode, just on different displays.

So in theory it is possible to synchronise any number of clips quite easily, across one or more displays on one or more Macs. A Mac Pro, for example, could theoretically drive up to 8 displays simultaneously (with four dual-output PCI-express graphics cards installed), and two of them synchronised over the network could drive 16 displays. And so on.

Synchronising playback in Quartz Composer 3


I've been playing with Quartz Composer 3's new "Network" patches. It's a very "QC" type of thing - almost absurdly simple, which makes it very easy to make it "just work", but as a direct consequence of its simplicity it is quite tricky to get it to do something specific.

In the photo above I've got a Mac Pro desktop and my own MacBook Pro - both running OS X 10.5.2 and Quartz Composer 3 - playing the same video clip through a Quartz Composition, in synch. The Mac Pro is basically running a Quartz Composition set up as a "controller", i.e. it plays the clip using "patch time" for the timebase. It then uses the "Network Broadcaster" patch to send the "Movie Position" (i.e. timecode of the clip). The Composition on the MBP then uses the "Network Receiver" patch to read the timecode and feed it straight into the "Movie Loader" as an external Timebase (QC doesn't mind that a string is being used in a number input, i.e. it does type conversions automatically). And that's how the two computers play the same clip in synch.
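
For anyone who thinks better in code than in patches, the same controller/receiver idea can be roughly sketched in Python. This is only an analogy, not what Quartz Composer does internally: the port number, the text encoding of the position and all the names below are assumptions made up for the sketch.

```python
import socket

PORT = 50505  # hypothetical port, chosen for this sketch only


def encode_position(seconds):
    """Controller side: send the movie position as a plain text string."""
    return f"{seconds:.3f}".encode("ascii")


def decode_position(payload):
    """Receiver side: convert the received string back into a number."""
    return float(payload.decode("ascii"))


def make_broadcast_socket():
    """Create a UDP socket that is allowed to send broadcast packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def broadcast_position(sock, seconds, host="255.255.255.255"):
    """Send the controller's current movie position to the local network."""
    sock.sendto(encode_position(seconds), (host, PORT))


class FollowerPlayer:
    """Plays a clip on an external timebase: it shows whatever position it is told."""

    def __init__(self, duration):
        self.duration = duration
        self.position = 0.0

    def on_packet(self, payload):
        # Clamp so a stray value cannot seek outside the clip.
        self.position = min(max(decode_position(payload), 0.0), self.duration)
```

The controller would call `broadcast_position` once per frame, and each follower would pass every received packet to `on_packet` and draw the frame at `position`. The follower never runs its own clock, which is the whole point of the external timebase.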

This is a fairly trivial example, but the possibilities here are quite exciting. For example, something like this could be used for synchronised multi-screen playback, either different clips or one very large clip "spanned" across multiple displays (i.e. each display computer is actually playing the same clip, just scaling it to play a particular part of the frame).
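
The "spanning" idea boils down to a bit of arithmetic: each display works out which rectangle of the full frame it is responsible for. A minimal Python sketch of that calculation follows, assuming a simple grid of identical displays; the function name and parameters are hypothetical, and in Quartz Composer itself this would be done with patches rather than code.

```python
def crop_for_display(col, row, cols, rows, frame_w, frame_h):
    """Return (x, y, width, height) of the part of the frame shown by one display.

    col/row are the zero-based grid position of the display,
    cols/rows the size of the display grid, frame_w/frame_h the clip size in pixels.
    """
    tile_w = frame_w // cols
    tile_h = frame_h // rows
    return (col * tile_w, row * tile_h, tile_w, tile_h)


# Two displays side by side sharing one 1920x1080 clip (illustrative numbers).
left = crop_for_display(0, 0, 2, 1, 1920, 1080)
right = crop_for_display(1, 0, 2, 1, 1920, 1080)
```

Each machine plays the whole clip but only scales its own rectangle up to full screen. A non-grid layout would just mean swapping this function for a per-display lookup of hand-picked rectangles.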

I know, you're probably saying: "but Quartz Visualizer can do this already", but you don't have a lot of control over exact layout of displays - anything other than a simple grid layout is not going to work in Quartz Visualizer. Also, there are quality issues to consider: QV streams the actual image over the network, which requires some nasty compression to keep up the framerate. With Quartz Composer running locally on each machine, but synchronised over the network, you can get full quality playback.

Also, Quartz Visualizer simply spans one Composition across multiple displays, but a custom solution using Network Broadcaster/Receiver patches would allow different compositions to be running on each display system, interacting with each other in some way.

I'm already planning to use this model to upgrade my Quartonian Mixer-based VJ app to allow synchronised playback. This could happen in all kinds of interesting ways. For example, just having the crossfader synchronised would be interesting - you could control two VJ systems with the same controller (e.g. the mouse). So two (or more) systems could be lining up different clips (perhaps manually) but you could time the fade to the next clip exactly, without having to grow more than two hands! Or maybe the actual clip selection could be synchronised, but each computer has different clips on the same "patch", so you choose related clips and you effectively load them together.
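
As a rough illustration of the shared crossfader, here is a small Python sketch. It is not code from the Quartonian Mixer: the class and function names are invented for the example, and a single brightness value stands in for a whole video frame.

```python
def crossfade(a, b, t):
    """Linear mix of two values: t=0 shows only a, t=1 shows only b."""
    t = min(max(t, 0.0), 1.0)  # keep the fader inside its 0..1 travel
    return a * (1.0 - t) + b * t


class VJSystem:
    """One playback machine with its own pair of clips but a shared fader."""

    def __init__(self, current_clip, next_clip):
        self.current_clip = current_clip
        self.next_clip = next_clip
        self.fader = 0.0

    def on_fader(self, t):
        """Called whenever a new fader value arrives from the controller."""
        self.fader = t

    def output(self):
        """Blend this system's own clips using the shared fader position."""
        return crossfade(self.current_clip, self.next_clip, self.fader)


# Two systems with different (stand-in) clips, driven by one fader value.
system_a = VJSystem(0.0, 1.0)
system_b = VJSystem(1.0, 0.0)
for system in (system_a, system_b):
    system.on_fader(0.25)
```

Both systems move through the fade at exactly the same rate, even though each is mixing completely different material.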

Sunday, April 20, 2008

Experiencing U23D

Went to see U23D at a digital 3D theatre in Rosebank here in Johannesburg. The only other 3D movie I've seen was Beowulf, which was a pretty awful movie, but as my first glimpse of digital 3D it was still an interesting experience. But U23D is just incredible.

I was a little under-awed by some of the opening sequences, where the cross-dissolves seemed a little heavy-handed (by nature I'm just anti-dissolves!). But very quickly this gave way to some absolutely spectacular imagery, stuff I've never seen before.

The first thing that grabs your attention is the sense of scale. In 3D, a huge stadium actually feels like a huge stadium. The massive LED array with concert visuals behind the stage (I could talk about that on its own for a whole blog entry) is absolutely towering and you know it within 10 seconds. Conversely, intimate moments with Bono singing right in front of you (helped by clever sound design) are also impressive.

From crowd-placed camera angles, you get hands waved right in front of you, and droplets of water splashed in the air. Soon you are noticing every little cellphone held in people's hands, the songlist taped to the floor in front of Bono's mic stand, the follow-spot guys sitting in the gap between the enormous LED array behind the stage... The amount of detail is absolutely overwhelming - but in a good way, the kind of overwhelming that made me break into uncontrollable grins because it was just so frikkin amazing.

As far as I can tell, the concert(s) were shot with High-Definition video cameras (modified for 3D, obviously) - and as far as I can tell it was probably 1080P @ 24fps, not 2K or anything like that. But the amount of detail is insane. Presumably a huge part of 3D lens/optics design is dedicated to getting enormous depth-of-field. Where "deep focus" might become "cluttered" on a flat image, in 3D this is no obstacle because depth perception helps your eyes to resolve the clutter more easily. So keeping just about everything in focus becomes part of the immersion experience, because you are able to choose what to look at - it could be Bono's sunglasses, or it could be the screen of a cellphone held up by one of the crowd members. As with traditional deep focus, I guess the trick for the cinematographer/editor then becomes finding methods other than selective focus to create strong composition.

To return to the cross-dissolves which irked me so much in the first two minutes or so... Very quickly this had developed into a full-blown style all of its own. In 2D, "dissolving" really just means mushing two shots together, which of course is usually only effective if the two shots have something worthwhile to juxtapose in the same frame. In 3D, however, you have this added element of depth. And boy did they use that in interesting ways. They would actually mask out members of the band, or parts of the screen visuals from the live show, and then create multi-layered (in 3D, this means literally multi-layered) compositions. Even more striking to me were moments when a totally defocussed shot (e.g. of the crowd) was layered over a razor sharp image (e.g. "The Edge" playing the keyboard) - the effect was to surround him (taking advantage of the depth effect) in a gently wafting smoke of hands, faces, etc.

And if you thought playing with text over video was getting a little hackneyed, you absolutely have to see "The Fly" performed in 3D, with millions of layers of multi-coloured words and letters cascading in multiple 3D layers in front of, next to, behind the band. It gets insane.

Oh, a tip: don't leave too early when the credits roll. Near the end you'll be treated to a display of 3D computer animation.

A selection of sites on U23D



Official U23D site

Olivier Wicki Interview
Great interview with the editor, Olivier Wicki. Goes into some detail about the technical challenges of editing a 3D movie. (As you might have expected, the cut is done in 2D first, but interestingly there seems to be a circular process where 3D experts watch rough cuts and make suggestions to improve the 3D experience, which presumably leads to re-cuts and new compositing ideas, etc.)

Deep Focus Movie Review: U23D
A review that starts off dry and then increasingly waxes hyperbolic about the experience. (I agree with the sentiment!) Here's an excerpt:
But something significant has happened behind the scenes, since the film’s editorial team can layer images and cut between angles with apparently reckless abandon. What’s more, the subtle differences between left-eye and right-eye points of view have been left intact, meaning that familiar cinematographic effects like lens flares, or light washing across the frame, get a dimensionality all their own — watching a concert filmed this way is like using binoculars instead of a telescope. The result is an enveloping kind of cinematic space that I’ve never seen before.