Simple yet powerful methods are often undervalued when we try to solve complex problems. This article is intended to show how linear regression is still very relevant, how we can improve its performance, and how doing so makes us better machine learning and data science engineers.
As a newcomer to the field of machine learning, the first thing you learn is simple univariate linear regression. However, for the past decade or so, tree-based algorithms and neural networks have overshadowed linear regression on a commercial scale. …
In this article, we will explore the feature selection techniques you need to be familiar with in order to get the best performance out of your model.
SelectKBest is a method provided by sklearn to rank the features of a dataset by their “importance” with respect to the target variable. This “importance” is calculated using a score function, which can be one of the following:
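As a minimal sketch of how this looks in practice, here is SelectKBest applied to a synthetic regression dataset, using `f_regression` (one of sklearn's built-in score functions) to keep the top 3 features. The dataset and the choice of `k` are illustrative, not from the article's project:

```python
# Hedged sketch: ranking features with SelectKBest on a toy dataset.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

# 200 samples, 10 features, of which only 3 actually drive the target
X, y = make_regression(n_samples=200, n_features=10,
                       n_informative=3, random_state=0)

# Score each feature against y with f_regression, keep the 3 best
selector = SelectKBest(score_func=f_regression, k=3)
X_selected = selector.fit_transform(X, y)

print(X_selected.shape)        # only the top 3 columns remain
print(selector.get_support())  # boolean mask marking the chosen columns
```

Swapping in a different score function (for example for classification problems) only changes the `score_func` argument; the rest of the fit/transform workflow stays the same.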
To understand Random Forests, it is essential to understand what they are made of. Decision trees are the foundational building blocks of all tree-based algorithms: every other tree-based algorithm is a sophisticated ensemble of decision trees. Thus, understanding decision trees is a good place to start.
Since decision trees can grow to any depth (if a maximum depth hasn't been explicitly specified), they tend to overfit by memorising every data point. …
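This overfitting is easy to demonstrate. In the sketch below (a toy sine-wave dataset, not the article's project data), an unconstrained `DecisionTreeRegressor` fits the noisy training points exactly, while capping `max_depth` forces a coarser, more general fit:

```python
# Hedged sketch: depth-unlimited vs depth-limited decision trees.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)  # noisy target

# No depth limit: the tree keeps splitting until every leaf is pure,
# i.e. it memorises the training noise.
deep = DecisionTreeRegressor(random_state=0).fit(X, y)

# Capping the depth regularises the tree.
shallow = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

print(deep.score(X, y))     # training R^2 ~ 1.0: every point fit exactly
print(shallow.score(X, y))  # lower on the training set, but less overfit
```

The near-perfect training score of the unlimited tree is exactly the symptom described above; limiting `max_depth` (or `min_samples_leaf`) trades some training accuracy for generalisation.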
All the code and screenshots used in this article are from a personal project I worked on earlier this year. The GitHub repo is linked here, and the deployed model is linked here.
First, you need to install Streamlit on your system, or in the virtual environment where you're working on this project.
If you do not have Streamlit installed, open the command prompt and type:
```shell
pip install streamlit
```
Once you have Streamlit installed, you should check out the official Streamlit documentation to familiarize yourself with the wide range of widgets provided by the Python library. …
The question that every data science/machine learning aspirant comes across at least once, while they are relatively new to the field, is:
Is it too early to start my own project? What more do I need to learn before I start working on my own project?
The answer varies from person to person, but a general rule of thumb is that once you feel comfortable with your command over a few fundamental subtopics of machine learning, you're good to go! It's never too early. …