Hi, we should be careful with the feature you are talking about. The results from any machine learning algorithm can be very misleading, and some models will probably overfit the data.
So, if you just throw some data at every machine learning model and then compare their performance, you will probably get misleading numbers, since different models require different tuning approaches. It's not as simple as you make it sound: you can't just feed data to the models (it also depends on the data) and expect to get the best model at the output.
One approach I can think of here is to integrate cross-validation and hyperparameter tuning into your suggestion. However, I can imagine this would be computationally expensive. I will take it into consideration as an enhancement for the tool. Thanks for your feedback!
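To make the idea concrete, here is a minimal sketch of what "tune each model fairly, then compare" could look like with scikit-learn. The candidate models, their hyperparameter grids, and the toy dataset are all illustrative assumptions on my part, not part of Igel's actual API:

```python
# Sketch: give each candidate model its own hyperparameter grid, tune it
# with cross-validation, then compare the best cross-validated scores.
# Models, grids, and data below are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data, just for the example
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100], "max_depth": [3, None]}),
}

results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5)  # 5-fold cross-validation
    search.fit(X, y)
    results[name] = search.best_score_        # mean CV accuracy of best params

best = max(results, key=results.get)
print(best, round(results[best], 3))
```

The key point is that each model is compared at its *tuned* best, under the same cross-validation splits, rather than at arbitrary defaults; this is also exactly where the computational cost comes from, since the grid search multiplies the number of fits.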
Thank you for explaining this in more depth. I should have been more specific in my original comment: I did intend cross-validation and hyperparameter tuning to be included in the automatic feature I was describing.
These operations certainly are computationally expensive; a recent hyperparameter tuning run locked up my laptop for three days, but this seems to be the case for any similar operation. The only approaches I've come across so far to mitigate it are things like reducing the data to a smaller size (which seems outside the scope of this tool) and some way of batching the data so that training can be "paused" and resumed as needed. Thank you again for creating Igel.
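For what it's worth, the "pause and resume" idea is roughly achievable today for models that support incremental learning. Here's a hedged sketch using scikit-learn's `partial_fit` with pickle checkpoints; the checkpoint file name, batch size, and synthetic data are my own assumptions, and this only works for estimators that implement `partial_fit`:

```python
# Sketch: train in mini-batches, checkpointing after each batch so a long
# run can be interrupted and resumed. Data and filenames are illustrative.
import pickle

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # simple separable labels

model = SGDClassifier(random_state=0)
classes = np.unique(y)  # partial_fit needs all classes up front

for start in range(0, len(X), 200):          # train in mini-batches
    batch = slice(start, start + 200)
    model.partial_fit(X[batch], y[batch], classes=classes)
    with open("checkpoint.pkl", "wb") as f:  # checkpoint after each batch
        pickle.dump(model, f)

# Later (or after an interruption), resume from the last checkpoint:
with open("checkpoint.pkl", "rb") as f:
    resumed = pickle.load(f)
resumed.partial_fit(X[:200], y[:200])        # continue training where we left off
```

This doesn't make tuning cheaper, but it does mean a three-day run can survive a reboot instead of starting over.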
Hey, I really appreciate your answer to this question. As I was reading it, red flags started popping up in my mind about the risk of overfitting with the ensemble approach, and I think your response was spot on for how an ML researcher would go about it! Most ML professionals I've talked to have been really against making a user-friendly ML suite because of how easy it is to misuse these algorithms.