Pandas Unleashed - Embarking on a data journey (Part III)
In Part II of the Pandas Unleashed series, you read about data cleaning with drop, fillna, duplicated and drop_duplicates, data selection with loc, iloc and basic selection methods, and data filtering with query and basic selection, among other things. Does pandas stop there? No. So far we have covered almost every aspect of data analysis with pandas, but data analysis isn't complete without data visualisation. Let's learn how to do it with pandas.
Data Visualisation
Data visualisation is one of the most important parts of data analysis. Data analysis and data science involve maths, computing and other complicated things, and not everyone can understand raw results as they are. Visualising the resulting data helps people understand it clearly and act on it. Pandas gives us good options for data visualisation through its plot method, which draws charts with Matplotlib by default.
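As a rough sketch of how plotting is reached from a DataFrame: the plot attribute can be called with a kind argument, or used through method-style accessors such as plot.line. The example assumes Matplotlib is installed and selects the non-interactive Agg backend (an assumption made here so the script runs without a display); the small frame is purely illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]})

# Form 1: pass the chart type through the kind argument
ax_kind = df.plot(x='X', y='Y', kind='line')

# Form 2: use the accessor method named after the chart type
ax_method = df.plot.line(x='X', y='Y')

# Both calls return a Matplotlib Axes object holding the same line data
same_data = (list(ax_kind.get_lines()[0].get_ydata())
             == list(ax_method.get_lines()[0].get_ydata()))
print(same_data)
```

Which form to use is mostly a matter of taste; the examples in this part stick to the kind argument.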
Line Plot:
import pandas as pd

# Small example frame: Y falls as X rises
data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Draw Y against X as a connected line
df.plot(x='X', y='Y', kind='line')
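In a notebook the chart usually appears on its own, but in a plain script it has to be shown or saved explicitly. Here is a minimal sketch of one way to do that, assuming Matplotlib is installed: the plot call returns a Matplotlib Axes, which can be given a title and labels and then written to a file. The title text and the file name line_plot.png are made up for illustration, and the Agg backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]})

# df.plot returns the Matplotlib Axes it drew on
ax = df.plot(x='X', y='Y', kind='line')

# Customise the chart through the Axes object
ax.set_title('Y against X')   # illustrative title
ax.set_xlabel('X')
ax.set_ylabel('Y')

# Save the figure that owns this Axes (hypothetical file name)
ax.get_figure().savefig('line_plot.png')
```

The same pattern should work for the other plot kinds below, since they also return an Axes.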
Bar Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# One vertical bar per row: X gives the labels, Y the bar heights
df.plot(x='X', y='Y', kind='bar')
Scatter Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# One marker per row at position (X, Y); scatter needs both x and y
df.plot(x='X', y='Y', kind='scatter')
Histplot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Histogram of a single column: counts how many values fall in each bin
df['X'].plot(kind='hist')
Boxplot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Box plot of a single column: shows the median, quartiles and whiskers
df['X'].plot(kind='box')
Area Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Like a line plot, but with the region under the line filled in
df.plot(x='X', y='Y', kind='area')
Pie Chart:
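import pandas as pd

data = {'Category': ['A', 'B', 'C'], 'Value': [30, 50, 20]}
# Use the category names as the index so they become the slice labels
df = pd.DataFrame(data, index=data['Category'])

# autopct prints each slice's share as a percentage with one decimal place
df['Value'].plot(kind='pie', autopct='%1.1f%%')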
Barh Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Horizontal bars: X gives the labels, Y the bar lengths
df.plot(x='X', y='Y', kind='barh')
Hexbin Plot:
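import numpy as np
import pandas as pd

# 1000 random points drawn from a standard normal distribution
data = pd.DataFrame({'X': np.random.randn(1000), 'Y': np.random.randn(1000)})

# Count points per hexagonal cell; gridsize sets the number of hexagons along the x axis
data.plot(x='X', y='Y', kind='hexbin', gridsize=20)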
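To get a feel for what gridsize does, here is a small sketch that draws the same points with a coarse and a fine grid side by side. It assumes Matplotlib is installed; the seed, the two gridsize values, the variable names and the file name hexbin_gridsize.png are all arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fixed seed (arbitrary) so the picture is the same on every run
rng = np.random.default_rng(0)
points = pd.DataFrame({'X': rng.standard_normal(1000),
                       'Y': rng.standard_normal(1000)})

fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))

# Fewer, larger hexagons: smoother picture but less detail
points.plot(x='X', y='Y', kind='hexbin', gridsize=10, ax=left, title='gridsize=10')

# More, smaller hexagons: more detail but noisier counts
points.plot(x='X', y='Y', kind='hexbin', gridsize=30, ax=right, title='gridsize=30')

fig.savefig('hexbin_gridsize.png')  # hypothetical file name
```

A larger gridsize splits the plane into more cells, so each hexagon holds fewer points.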
KDE Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Kernel density estimate of one column (requires SciPy to be installed)
df['X'].plot(kind='kde')
Density Plot:
import pandas as pd

data = {'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# 'density' is an alias for 'kde', so this draws the same curve as above
df['X'].plot(kind='density')
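The plots above each get their own figure. As a minimal sketch, several kinds can also be placed in one figure by creating a grid of Matplotlib axes and passing each one through the ax argument. This assumes Matplotlib is installed; the 2 by 2 layout, the titles and the file name plot_grid.png are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [5, 4, 3, 2, 1]})

# A 2x2 grid of empty axes to draw into
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# ax= tells pandas which panel to use instead of opening a new figure
df.plot(x='X', y='Y', kind='line', ax=axes[0, 0], title='line')
df.plot(x='X', y='Y', kind='bar', ax=axes[0, 1], title='bar')
df.plot(x='X', y='Y', kind='scatter', ax=axes[1, 0], title='scatter')
df['X'].plot(kind='hist', ax=axes[1, 1], title='hist')

fig.tight_layout()            # keep titles and tick labels from overlapping
fig.savefig('plot_grid.png')  # hypothetical file name
```

Seeing the same data in several forms at once can make it easier to pick the chart that tells the story best.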
Conclusion
These are some of the most commonly used plots in data analysis, and we can draw all of them using pandas. Pandas is an indispensable tool for working with structured data, offering a wide range of capabilities for loading, cleaning, transforming, and visualizing data.
We've covered a plethora of topics, starting from the very basics of installation and data structures to more advanced concepts like time series analysis, statistical testing, and performance optimization. Whether you're a beginner taking your first steps in data analysis or an experienced data scientist looking for a quick reference, this guide has something for everyone.
Pandas enables users to efficiently manage data, whether it's from CSV files, Excel spreadsheets, or even databases. You've learned how to inspect and filter data, handle missing values, and perform aggregations. We've also delved into data visualization, which is essential for gaining insights and conveying findings effectively.
To truly master Pandas, practice is key. Working on real-world datasets and tackling data analysis tasks will solidify your skills and deepen your understanding of this library. Whether you're a data analyst, data scientist, researcher, or simply someone who needs to manipulate data, Pandas is a versatile and indispensable tool for your data-related endeavors.
Remember that Pandas is an ever-evolving library, and new features and improvements are continuously being added by the open-source community. Staying up to date with the latest developments is crucial as you embark on your data analysis journey. As you continue to explore and apply Pandas in your projects, you'll find it to be a reliable companion, making your data analysis tasks more efficient and insightful.