
I've followed the developments in neural networks somewhat, but have never applied deep learning so far. This seems like a good place to ask a couple of questions I've been having for a while.

1. When does it make sense to apply deep learning? Could it potentially be applied successfully to any difficult problem given enough data? Could it also be good at the types of problems that Random Forests and Gradient Boosting Machines are traditionally good at, versus the problems that SVMs are traditionally good at (computer vision, NLP)? [1]

2. How much data is enough?

3. What degree of tuning is required to make it work? Are we at the point yet where deep learning works more or less out the box?

4. Is it fair to say that dropout and maxout always work better in practice? [2]

5. What is the computational effort? How long does it take, e.g., to classify an ImageNet image (on a CPU / GPU)? How long does it take to train a model like that?

6. How on earth does this fit into memory? Say in ImageNet you have (256 pixels * 256 pixels) * (10,000 classes) * 4 bytes ≈ 2.4 GB, for a NN without any hidden layers.

[1] I am overgeneralizing somewhat, I know. It's my way to avoid overfitting.

[2] My lunch today was free.
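For what it's worth, the arithmetic in question 6 checks out. A quick sketch (just restating the numbers from the question, nothing more):

```python
# Back-of-the-envelope estimate from question 6: one weight per
# (input pixel, class) pair, i.e. a single weight matrix mapping
# raw pixels directly to class scores, stored as float32.
pixels = 256 * 256        # input dimension
classes = 10_000          # ImageNet-scale label set
bytes_per_float = 4       # float32

weights_bytes = pixels * classes * bytes_per_float
print(f"{weights_bytes / 1024**3:.2f} GiB")  # ~2.44 GiB
```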



I don't have great answers to the other questions, though I too am interested in them.

#5) [1] has some Python code and timings mixed into the docs. One such example (stacked denoising autoencoders on MNIST):

    By default the code runs 15 pre-training epochs for each layer,
    with a batch size of 1. The corruption level for the first layer is
    0.1, for the second 0.2 and 0.3 for the third. The pretraining
    learning rate was 0.001 and the finetuning learning rate is
    0.1. Pre-training takes 585.01 minutes, with an average of 13
    minutes per epoch. Fine-tuning is completed after 36 epochs in
    444.2 minutes, with an average of 12.34 minutes per epoch. The
    final validation score is 1.39% with a testing score of
    1.3%. These results were obtained on a machine with an Intel Xeon
    E5430 @ 2.66GHz CPU, with a single-threaded GotoBLAS.
#6) The size of the NN is not typically num_features * num_classes, but rather on the order of num_features * num_layers, where num_layers is commonly 3-10 or so. If you want a (multi-class) classifier, you first feed your neural network a bunch of examples, unsupervised. Then, once you've got your NN built, you feed the outputs of the NN to a classifier like an SVM or SGD. The idea is that the net provides more meaningful features than you would get from hand-crafted features or the raw input data itself.

[1] http://deeplearning.net/tutorial/SdA.html#sda
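A minimal sketch of that "unsupervised features, then a classifier on top" pipeline, using scikit-learn's BernoulliRBM as a stand-in for a deep net (the dataset and every parameter value here are illustrative choices, not from the tutorial):

```python
# Unsupervised feature learning (RBM) followed by a supervised
# classifier (linear SVM), chained in a scikit-learn Pipeline.
from sklearn.datasets import load_digits
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values into [0, 1] for the RBM

model = Pipeline([
    # learns features from the inputs alone (labels ignored here)
    ("rbm", BernoulliRBM(n_components=100, learning_rate=0.06,
                         n_iter=10, random_state=0)),
    # classifies using the RBM's hidden-unit activations as features
    ("svm", LinearSVC()),
])
model.fit(X, y)
print(f"train accuracy: {model.score(X, y):.2f}")
```

A real deep pipeline would stack several such layers (as in the SdA tutorial) instead of a single RBM, but the division of labor is the same: the net produces the representation, a conventional classifier makes the decision.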


I understand that this unsupervised pre-training approach has already fallen out of fashion:

https://plus.google.com/+YannLeCunPhD/posts/UVT2fYTfoAC



