Python is a more general-purpose language than R and is also a leader in the data science world. It offers fewer analysis options than R, but it is better suited to analyses on a large scale (big data) or inside broader applications and websites. Python is also advancing rapidly in the field of artificial intelligence and machine learning. From a BI perspective, Python is very suitable for retrieving, cleaning and writing data. Pandas, a Python library, is built for data analysis and manipulation, and Matplotlib is commonly used to visualize pandas data frames. For production purposes (end users), however, Qlik or Power BI is the more likely choice. Jupyter Notebook helps keep the analysis clear and transparent.
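To make the "retrieving, cleaning and writing" workflow concrete, here is a minimal sketch using pandas. The order data, the column names and the cleaning rules are all made up for illustration; a real source would more likely be a file, database or API than an in-memory string.

```python
import io

import pandas as pd

# Hypothetical raw export, kept in memory so the example is self-contained.
raw_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1, Alice ,100.50\n"
    "2,Bob,\n"
    "2,Bob,\n"
    "3,carol,75\n"
)

# Retrieve: load the data into a DataFrame.
orders = pd.read_csv(raw_csv)

# Clean: trim whitespace, normalise capitalisation, fill missing amounts
# with 0 (an assumed business rule), and drop exact duplicate rows.
orders["customer"] = orders["customer"].str.strip().str.title()
orders["amount"] = orders["amount"].fillna(0.0)
orders = orders.drop_duplicates().reset_index(drop=True)

# Write: export the cleaned table as CSV text (pass a file path to save it).
cleaned_csv = orders.to_csv(index=False)
print(cleaned_csv)
```

From here the cleaned table could be plotted with Matplotlib or handed over to a BI tool such as Qlik or Power BI.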
R is a language for performing both specialized and general statistical analyses. Its enormous repository of packages (more than ten thousand) means you can use R for almost any analysis. The language was developed by academics and statisticians and has a very solid foundation. R also offers good options for reporting and visualizing results, although these are not nearly as complete or user-friendly as in Qlik or Power BI. In practice, these options are mostly used during exploratory work, after which the end result is shared with business users on other platforms such as Qlik and Power BI.
In addition to the above similarities, there are also differences. It is important to be aware of these differences before choosing one of the two languages.
• R is much more difficult to learn if you have no programming knowledge yet. Python code is intuitive, so even a layman in the field of programming can quickly understand what a Python script does.
• R is used by academics and within R&D departments, Python by developers / programmers
• Python is better with big data applications than R
• Python is a leader in machine learning and artificial intelligence
• The vast majority of data analysis can be done in Python with just a few packages (NumPy, pandas, scikit-learn, and seaborn)
• Python integrates better in applications or websites
• R is primarily for statistical analysis, while Python is more widely applicable. This makes R especially suitable for very specific analytical activities for which specialist packages have been developed.
• Python code is more robust and easier to maintain than R code.
• R comes with good built-in options for communicating the output of analyses; Python offers less here. However, Python has caught up considerably, so the differences have become smaller.
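As a rough illustration of the point that a few packages cover most analyses, the sketch below uses only pandas and NumPy to summarize a small table and fit a straight line. The data and column names are invented for the example; scikit-learn and seaborn are left out for brevity, but would typically take over the modelling and plotting steps.

```python
import numpy as np
import pandas as pd

# Hypothetical data: advertising spend and revenue per region.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "ad_spend": [1.0, 2.0, 3.0, 4.0],
    "revenue": [3.0, 5.0, 7.0, 9.0],
})

# pandas: total revenue per region.
revenue_by_region = sales.groupby("region")["revenue"].sum()

# NumPy: least-squares fit of revenue = slope * ad_spend + intercept.
slope, intercept = np.polyfit(sales["ad_spend"], sales["revenue"], deg=1)

print(revenue_by_region)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```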