Hi, thanks for your comment. I actually understood him to mean something like a hyperparameter search/tuning using cross validation (at least that's what came to my mind).
Parameter tuning and algorithm selection! I just don't want to manually start 5 different runs of algorithms that I believe could work well on the data and then compare the results by hand. And maybe I was too lazy to run the 6th algorithm, which now performs much better.
But to be sure, every test should be done with k-fold cross validation. How the training set gets split should not be left up to the user. It's crucial that this is mandatory!
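The workflow described above (score several candidate algorithms with the same k-fold cross validation instead of manual splits) could be sketched roughly like this with scikit-learn. The candidate list and the toy dataset are just placeholders, not a suggestion of which algorithms the tool should ship with:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical toy data standing in for the user's dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# A few candidate algorithms; in a real tool this list would be configurable.
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "svc": SVC(),
}

# Every candidate is scored with the same 5-fold cross validation,
# so no manual train/test splitting is left to the user.
scores = {name: cross_val_score(est, X, y, cv=5).mean()
          for name, est in candidates.items()}

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

This way the "6th algorithm" is just one more entry in the dictionary, and everything gets compared on equal footing.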
Cross validation would be good! I think if you build this in, you could automatically run a few heuristics to see whether the data can be partitioned, or maybe just prompt the user for another sample of the data with the same distribution.