
This is great advice. A related issue I see is when you engineer a new feature that can't be 100% accurate because the source data is spotty, but you intuitively expect it to help the classifier anyway whenever it is present. If that feature's importance in the trained model then turns out really high, you think you've done something great. But in the end you've built a model that simply detects the presence of your new feature, which you already knew wasn't reliable because the source data it's derived from is spotty. So you've accomplished precisely nothing.
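A minimal sketch of that failure mode, if it helps make it concrete (assuming scikit-learn; the synthetic data and all names here are illustrative, not from any real project): the feature's value carries no signal, only its presence does, yet the model scores fine and ranks the feature highly.

    # Diagnostic: if the model scores the same when the engineered feature
    # is replaced by a bare "is it present?" flag, the model has learned
    # presence (a missingness pattern), not the feature's actual content.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 5000
    y = rng.integers(0, 2, n)

    # Spotty source data: the feature exists more often for positives,
    # and its value carries no information beyond that presence.
    present = rng.random(n) < np.where(y == 1, 0.8, 0.3)
    feature = np.where(present, rng.normal(size=n), np.nan)

    X_feature = np.nan_to_num(feature, nan=-999.0).reshape(-1, 1)
    X_flag = present.astype(float).reshape(-1, 1)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    print("engineered feature:", cross_val_score(clf, X_feature, y, cv=5).mean())
    print("presence flag only:", cross_val_score(clf, X_flag, y, cv=5).mean())
    # Near-identical scores => the "important" feature is just a presence mask.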


OP here, glad you liked it!

The thing you're talking about definitely happens heaps as well, because of a fundamental mental blind spot we have. I'd love to hear any more stories you've got along these lines. The psychology of what makes a machine learning project succeed really interests me, and I don't mean platitudes about openness and transparency.

I'm really tempted to write another post specifically about the sort of thing in your example: narrative fallacies in machine learning. Because we operate in the unknown, we tend to string the evidence we have together into a nice, appealing story.


It would be unusual not to check the model's performance at predicting the target variable, which would show whether the derived feature is actually useful. Roughly, something like the sketch below.
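A hedged sketch of that check as a cross-validated ablation (assuming scikit-learn; the function name and the column index argument are made up for illustration): score the model with the candidate column, then with it dropped, and see whether the lift is real.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    def score_with_and_without(X, y, col, cv=5):
        # Cross-validated accuracy with the candidate column, then with it
        # dropped. If the gap is ~0, the derived feature isn't adding value,
        # no matter how high its feature importance looks.
        clf = GradientBoostingClassifier(random_state=0)
        with_feat = cross_val_score(clf, X, y, cv=cv).mean()
        without_feat = cross_val_score(clf, np.delete(X, col, axis=1), y, cv=cv).mean()
        return with_feat, without_feat

Though note that if the feature's presence itself correlates with the label (as in the example upthread), the ablation can still show a "lift" that comes from the missingness pattern rather than the feature's content, so it's worth running the presence-flag comparison too.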



