What should I do to be a data scientist

What is Data Science?

Data science is a field that covers data cleansing, preparation, and analysis. It is an umbrella term for the many scientific methods, such as mathematics and statistics, that are applied to data sets. Data scientists use these tools to extract knowledge from data.

It is also a way to tackle Big Data and extract information from it. First, a data scientist gathers data sets from multiple disciplines and compiles them. Next comes machine learning, predictive analysis, and sentiment analysis. The results are then refined to the point where conclusions can be drawn, and finally the useful information is extracted.

A data scientist understands data from a business point of view. The job is to give the most accurate predictions possible and to take responsibility for them. Good predictions can help a business avoid future losses.

Data scientists also have a big role to play in artificial intelligence and machine learning, so knowledge of machine learning is a must. Machine learning is one of the most impressive developments in the tech world. A data scientist needs to know which machine learning method will help with the problem at hand, and how to apply it. A deep understanding of how the method works internally is not strictly required.


What should I do to be a data scientist?

Step 1: Learn the basics of Python - Python is an easy language to start with, but mastering its idioms takes time, as with any other language. So as a novice, first make sure you understand all the basics of the language.

Step 2: Basic Statistics & Mathematics - I would highly recommend learning statistics with a heavy focus on coding up examples, preferably in Python or R.
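To give a feel for what "coding up examples" can look like, here is a minimal sketch that uses only Python's built-in statistics module. The daily_sales numbers are made up purely for illustration.

```python
import statistics

# Hypothetical sample: daily sales figures invented for this example.
daily_sales = [12, 15, 11, 19, 15, 22, 15, 18]

# Measures of central tendency.
mean_sales = statistics.mean(daily_sales)
median_sales = statistics.median(daily_sales)
mode_sales = statistics.mode(daily_sales)

# Measure of spread: sample standard deviation (divides by n - 1).
stdev_sales = statistics.stdev(daily_sales)

print("mean:", mean_sales)
print("median:", median_sales)
print("mode:", mode_sales)
print("sample standard deviation:", round(stdev_sales, 2))
```

Try changing one value to an extreme number and watch how the mean moves much more than the median. Small experiments like this tend to make the concepts stick.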

Step 3: Python for Data Analysis - Once you are done with Step 1 and Step 2, it's time to get your hands dirty with some real work. Learn to install Anaconda and Jupyter Notebook, along with Python packages like NumPy, pandas, Matplotlib, and Seaborn.
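As a first exercise once pandas is installed, something like the sketch below could be run in a Jupyter notebook cell. The table and its column names (city, units, price) are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sales table; the values are invented for illustration.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi", "Pune"],
    "units": [10, 7, 12, 9, 8],
    "price": [2.5, 3.0, 2.5, 3.0, 2.5],
})

# Derive a new column from existing ones.
df["revenue"] = df["units"] * df["price"]

# Summary statistics for the numeric columns.
print(df.describe())

# Total revenue per city.
revenue_by_city = df.groupby("city")["revenue"].sum()
print(revenue_by_city)
```

From here, a natural next step is to load a real CSV file with pd.read_csv and repeat the same kind of exploration on it.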

Step 4: Machine Learning - It is broadly classified into the following two categories:

(i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks).

(ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning).

Install the Python scikit-learn library to practice machine learning in Jupyter Notebook.
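To make the two categories concrete, here is a small sketch with scikit-learn, assuming it is already installed. The choice of the built-in Iris dataset, logistic regression, and k-means is mine, just to illustrate one supervised and one unsupervised method; any other dataset or algorithm would do.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Built-in toy dataset: 150 flowers, 4 features, 3 species.
X, y = load_iris(return_X_y=True)

# Supervised learning: the labels y are used during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions
print("test accuracy:", accuracy)

# Unsupervised learning: only X is used, the labels are ignored.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("cluster assigned to the first 10 samples:", cluster_labels[:10])
```

Notice that the classifier is scored against known answers, while k-means only groups similar rows together and has no notion of a "correct" label.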

Step 5: Practice - Enter as many Kaggle competitions as you can, start a blog, and put your projects on GitHub or Bitbucket.

This is only a rough pathway; you can change the sequence as per your needs.

Hope it helps.

