I'll bet $1000 that the winning team is going to use convolutional neural networks. Anyone willing to take the bet? (I can also bet a smaller amount if you prefer.)
The "state of the art" they reference is SVMs trained on color and texture features.
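For a sense of what a hand-built color feature looks like, here is a rough sketch of a per-channel color histogram, the sort of vector you'd hand to an SVM. The function name `color_histogram` and the bin count are made up for illustration, texture features are left out, and I don't know the exact features the referenced work used.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Concatenate per-channel histograms of an (H, W, 3) uint8 image.

    The result is normalized so that its entries sum to 1.
    """
    channels = []
    for c in range(3):
        counts, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        channels.append(counts)
    feature = np.concatenate(channels).astype(float)
    return feature / feature.sum()

# Random pixels stand in for a real photo here.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
feature = color_histogram(image)
print(feature.shape)
```

Each image becomes one fixed-length vector regardless of its size, which is what makes it easy to feed to a classifier like an SVM.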
Before deep belief networks, I'd have agreed with your guess of convolutional neural networks. However, now I'd guess you'd use a deep belief network to learn features better than those picked out "by hand" in the convolutional neural network. (See, for example, [1][2].)
So my money would be on some deep belief network.
[1] Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554.
[2] Le, Q. V., et al. Building high-level features using large scale unsupervised learning. arXiv:1112.6209.
When it comes to large datasets, unsupervised learning doesn't work! You're better off first training your network discriminatively on ImageNet and then switching to this cat vs. dog training, rather than doing unsupervised learning.
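To make the "train on ImageNet first, then switch" idea concrete, here is a toy NumPy sketch of the simplest version: keep the already-trained lower layers frozen and fit only a new output layer on the cat vs. dog labels. Everything in it is a stand-in: `pretrained_weights` is random rather than actually trained on ImageNet, the data is fake, and the names are hypothetical. A real run would likely also fine-tune the lower layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for layers already trained on ImageNet (random here, NOT real weights).
pretrained_weights = rng.normal(size=(64, 16)) / 8.0

def features(x):
    """Frozen lower layers: a linear map followed by ReLU."""
    return np.maximum(x @ pretrained_weights, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(f, y, w, b):
    p = np.clip(sigmoid(f @ w + b), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def train_output_layer(f, y, steps=500, lr=0.1):
    """Fit a new logistic output layer by gradient descent; lower layers stay fixed."""
    w = np.zeros(f.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(f @ w + b) - y
        w -= lr * (f.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Fake "cat vs dog" data: 200 inputs with 0/1 labels.
x = rng.normal(size=(200, 64))
y = (x[:, 0] > 0).astype(float)

f = features(x)
initial_loss = logistic_loss(f, y, np.zeros(f.shape[1]), 0.0)
w, b = train_output_layer(f, y)
final_loss = logistic_loss(f, y, w, b)
print(initial_loss, final_loss)
```

The point is only the division of labor: the expensive feature layers come from the big labeled dataset, and the small task-specific dataset trains the layer on top.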
Yes, I'd be surprised if the straightforward implementation from https://code.google.com/p/cuda-convnet/, run on a GPU with lots of transformations, wasn't the winning entry.
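I read "lots of transformations" as random crops and mirror images of each training picture, which is the usual augmentation for this kind of net; that reading is an assumption. A rough NumPy sketch, where the function name `augment` and the sizes are mine and not taken from cuda-convnet:

```python
import numpy as np

def augment(image, crop_size, rng):
    """Return a random square crop of an (H, W, C) image, mirrored half the time."""
    height, width, _ = image.shape
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    patch = image[top:top + crop_size, left:left + crop_size, :]
    if rng.random() < 0.5:
        patch = patch[:, ::-1, :]  # horizontal flip
    return patch

# Random pixels stand in for a real training image.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
batch = np.stack([augment(image, 24, rng) for _ in range(8)])
print(batch.shape)
```

Each pass over the data then sees slightly different versions of the same pictures, which helps when the labeled set is small relative to the network.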