I am getting started with machine learning on bulk RNA-seq data in R, using a Random Forest model. I would appreciate advice, recommendations, or a tutorial from experienced people on how to get the best performance and the fewest errors when training this model on my bulk RNA-seq data. Could you recommend a complete R workflow that includes the optimal preprocessing of the RNA-seq data, and indicate approximately how many genes could or should be included for purposes such as classifying the different cell types in the data? I would also like to know which models, besides Random Forest, work best for this kind of data.
Thank you very much.