How To Draw Graphs In Python
Plotting
Overview
Teaching: 15 min
Exercises: 15 min
Questions
How can I plot my data?
How can I save my plot for publishing?
Objectives
Create a time series plot showing a single data set.
Create a scatter plot showing the relationship between two data sets.
matplotlib is the most widely used scientific plotting library in Python.
- Commonly use a sub-library called matplotlib.pyplot.
- The Jupyter Notebook will render plots inline by default.

    import matplotlib.pyplot as plt

- Simple plots are then (fairly) simple to create.

    time = [0, 1, 2, 3]
    position = [0, 100, 200, 300]

    plt.plot(time, position)
    plt.xlabel('Time (hour)')
    plt.ylabel('Position (km)')
Display All Open Figures
In our Jupyter Notebook example, running the cell should generate the figure directly below the code. The figure is also included in the Notebook document for future viewing. However, other Python environments like an interactive Python session started from a terminal or a Python script executed via the command line require an additional command to display the figure.

Instruct matplotlib to show a figure:

    plt.show()

This command can also be used within a Notebook - for instance, to display multiple figures if several are created by a single cell.
Plot data directly from a Pandas dataframe.
- We can also plot Pandas dataframes.
- This implicitly uses matplotlib.pyplot.
- Before plotting, we convert the column headings from a string to integer data type, since they represent numerical values.

    import pandas as pd

    data = pd.read_csv('data/gapminder_gdp_oceania.csv', index_col='country')

    # Extract year from last 4 characters of each column name
    # The current column names are structured as 'gdpPercap_(year)',
    # so we want to keep the (year) part only for clarity when plotting GDP vs. years
    # To do this we use strip(), which removes from both ends of the string
    # any of the characters stated in the argument
    # This method works on strings, so we call str before strip()
    years = data.columns.str.strip('gdpPercap_')

    # Convert year values to integers, saving results back to dataframe
    data.columns = years.astype(int)

    data.loc['Australia'].plot()
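Because `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal prefix, it can help to check the result on a small table first. The sketch below is only an illustration: the table `toy` and its values are made-up placeholders (not Gapminder data), and `str.replace` is shown as one possible alternative that removes the exact prefix.

```python
import pandas as pd

# Placeholder values for illustration only; not real Gapminder figures.
toy = pd.DataFrame(
    {"gdpPercap_1952": [1.0, 2.0], "gdpPercap_1957": [3.0, 4.0]},
    index=["Country A", "Country B"],
)

# strip() removes any of the listed characters from both ends of each name.
# It works here because the digits of the year are not in that character set.
years = toy.columns.str.strip("gdpPercap_")

# Alternative (an assumption, not the lesson's method): drop the literal prefix.
years_alt = toy.columns.str.replace("gdpPercap_", "", regex=False)

# Convert the remaining year strings to integers and store them as column names.
toy.columns = years.astype(int)
print(toy.columns.tolist())
```

Both approaches leave the same year strings here; they would differ for a column name whose remaining part began or ended with one of the stripped characters.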
Select and transform data, then plot it.
- By default, DataFrame.plot plots with the rows as the X axis.
- We can transpose the data in order to plot multiple series.

    data.T.plot()
    plt.ylabel('GDP per capita')
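To see what the transpose changes, the following minimal sketch plots a tiny made-up table both ways and counts the lines drawn. The table `toy`, its values, and the headless `Agg` backend are assumptions made so the example runs on its own without the Gapminder files or a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; not real Gapminder figures.
toy = pd.DataFrame(
    [[10.0, 12.0, 15.0], [9.0, 11.0, 13.0]],
    index=["Country A", "Country B"],
    columns=[1952, 1957, 1962],
)

fig, (ax_rows, ax_transposed) = plt.subplots(1, 2)

# Rows (countries) on the x axis: one line per column, i.e. one per year.
toy.plot(ax=ax_rows)

# Transposed: years on the x axis, one line per country.
toy.T.plot(ax=ax_transposed)

n_year_lines = len(ax_rows.get_lines())
n_country_lines = len(ax_transposed.get_lines())
print(n_year_lines, n_country_lines)
plt.close(fig)
```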
Many styles of plot are available.
- For example, do a bar plot using a fancier style.

    plt.style.use('ggplot')
    data.T.plot(kind='bar')
    plt.ylabel('GDP per capita')
Data can also be plotted by calling the matplotlib plot function directly.
- The command is plt.plot(x, y)
- The color and format of markers can also be specified as an additional optional argument e.g., b- is a blue line, g-- is a green dashed line.
Get Australia data from dataframe
    years = data.columns
    gdp_australia = data.loc['Australia']

    plt.plot(years, gdp_australia, 'g--')
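As a rough illustration of how a format string such as `'g--'` is interpreted, the sketch below draws the same placeholder data twice, once with the short format string and once with explicit `color` and `linestyle` keyword arguments, then inspects the resulting line objects. The data values and the `Agg` backend are assumptions for a self-contained example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

years = [1952, 1957, 1962]
values = [10.0, 12.0, 15.0]  # placeholder values, not real GDP figures

# Short form: color 'g' (green) followed by line style '--' (dashed).
(line,) = plt.plot(years, values, "g--")

# The same styling spelled out with keyword arguments.
(line_kw,) = plt.plot(years, values, color="g", linestyle="--")

print(line.get_color(), line.get_linestyle())
plt.close("all")
```

The keyword form is longer but can be easier to read when several style options are combined.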
Can plot many sets of data together.
    # Select two countries' worth of data.
    gdp_australia = data.loc['Australia']
    gdp_nz = data.loc['New Zealand']

    # Plot with differently-colored markers.
    plt.plot(years, gdp_australia, 'b-', label='Australia')
    plt.plot(years, gdp_nz, 'g-', label='New Zealand')

    # Create legend.
    plt.legend(loc='upper left')
    plt.xlabel('Year')
    plt.ylabel('GDP per capita ($)')

Adding a Legend
Often when plotting multiple datasets on the same figure it is desirable to have a legend describing the data.
This can be done in matplotlib in two stages:
- Provide a label for each dataset in the figure:

    plt.plot(years, gdp_australia, label='Australia')
    plt.plot(years, gdp_nz, label='New Zealand')

- Instruct matplotlib to create the legend:

    plt.legend()

By default matplotlib will attempt to place the legend in a suitable position. If you would rather specify a position this can be done with the loc= argument, e.g. to place the legend in the upper left corner of the plot, specify loc='upper left'.
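The two stages of labelling and creating a legend can be sketched on their own as follows. This version uses an explicit Axes object instead of the `plt` functions so that the legend can be inspected afterwards; the series names and values are placeholders, and the `Agg` backend is an assumption so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

years = [1952, 1957, 1962]
series_a = [10.0, 12.0, 15.0]  # placeholder values
series_b = [9.0, 11.0, 13.0]   # placeholder values

fig, ax = plt.subplots()

# Stage 1: give each dataset a label when plotting it.
ax.plot(years, series_a, "b-", label="Country A")
ax.plot(years, series_b, "g--", label="Country B")

# Stage 2: build the legend from those labels, pinned to the upper left.
legend = ax.legend(loc="upper left")

legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
plt.close(fig)
```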
- Plot a scatter plot correlating the GDP of Australia and New Zealand
- Use either plt.scatter or DataFrame.plot.scatter

    plt.scatter(gdp_australia, gdp_nz)

    data.T.plot.scatter(x='Australia', y='New Zealand')
Minima and Maxima
Fill in the blanks below to plot the minimum GDP per capita over time for all the countries in Europe. Modify it again to plot the maximum GDP per capita over time for Europe.

    data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
    data_europe.____.plot(label='min')
    data_europe.____
    plt.legend(loc='best')
    plt.xticks(rotation=90)

Solution

    data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
    data_europe.min().plot(label='min')
    data_europe.max().plot(label='max')
    plt.legend(loc='best')
    plt.xticks(rotation=90)
Correlations
Modify the example in the notes to create a scatter plot showing the relationship between the minimum and maximum GDP per capita among the countries in Asia for each year in the data set. What relationship do you see (if any)?
Solution
    data_asia = pd.read_csv('data/gapminder_gdp_asia.csv', index_col='country')
    data_asia.describe().T.plot(kind='scatter', x='min', y='max')
No particular correlations can be seen between the minimum and maximum GDP values year on year. It seems the fortunes of Asian countries do not rise and fall together.
You might note that the variability in the maximum is much higher than that of the minimum. Take a look at the maximum and the max indexes:

    data_asia = pd.read_csv('data/gapminder_gdp_asia.csv', index_col='country')
    data_asia.max().plot()
    print(data_asia.idxmax())
    print(data_asia.idxmin())
It seems the variability in this value is due to a sharp drop after 1972. Some geopolitics at play perhaps? Given the dominance of oil producing countries, maybe the Brent crude index would make an interesting comparison? Whilst Myanmar consistently has the lowest GDP, the highest GDP nation has varied more notably.
More Correlations
This short program creates a plot showing the correlation between GDP and life expectancy for 2007, normalizing marker size by population:

    data_all = pd.read_csv('data/gapminder_all.csv', index_col='country')
    data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007',
                  s=data_all['pop_2007']/1e6)

Using online help and other resources, explain what each argument to plot does.

Solution
A good place to look is the documentation for the plot function - help(data_all.plot).
kind - As seen already this determines the kind of plot to be drawn.
x and y - A column name or index that determines what data will be placed on the x and y axes of the plot.
s - Details for this can be found in the documentation of plt.scatter. A single number or one value for each data point. Determines the size of the plotted points.
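A small sketch can make the role of `s` concrete. The table below is made up (the column names `gdp`, `life_exp` and `pop` and all values are hypothetical placeholders, not the Gapminder columns), and the `Agg` backend is an assumption so the code runs without a display. It passes one size per row and then reads the sizes back from the plotted collection.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values and hypothetical column names, for illustration only.
toy = pd.DataFrame(
    {
        "gdp": [1000.0, 5000.0, 20000.0],
        "life_exp": [50.0, 65.0, 80.0],
        "pop": [2e6, 50e6, 10e6],
    },
    index=["Country A", "Country B", "Country C"],
)

# kind picks the plot type, x and y name the columns to use,
# and s gives one marker area (in points squared) per row.
ax = toy.plot(kind="scatter", x="gdp", y="life_exp", s=toy["pop"] / 1e6)

# The scatter points are stored as a single collection on the Axes.
marker_sizes = ax.collections[0].get_sizes().tolist()
print(marker_sizes)
plt.close("all")
```

Dividing the population by one million keeps the markers at a readable size; any other scaling could be used.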
Saving your plot to a file
If you are satisfied with the plot you see you may want to save it to a file, perhaps to include it in a publication. There is a function in the matplotlib.pyplot module that accomplishes this: savefig. Calling this function, e.g. with

    plt.savefig('my_figure.png')

will save the current figure to the file my_figure.png. The file format will automatically be deduced from the file name extension (other formats are pdf, ps, eps and svg).

Note that functions in plt refer to a global figure variable and after a figure has been displayed to the screen (e.g. with plt.show) matplotlib will make this variable refer to a new empty figure. Therefore, make sure you call plt.savefig before the plot is displayed to the screen, otherwise you may find a file with an empty plot.

When using dataframes, data is often generated and plotted to screen in one line, and plt.savefig seems not to be a possible approach. One possibility to save the figure to file is then to
- save a reference to the current figure in a local variable (with plt.gcf)
- call the savefig method on that variable.

    data.plot(kind='bar')
    fig = plt.gcf()  # get the figure that data.plot just drew on
    fig.savefig('my_figure.png')
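Another way to get hold of the right figure, sketched here under stated assumptions, is to use the Axes object that `DataFrame.plot` returns and ask it for its figure. The table values are placeholders, the temporary output directory is a hypothetical location chosen so the example does not write into your working directory, and the `Agg` backend is assumed so no display is needed.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; not real Gapminder figures.
toy = pd.DataFrame(
    {"Country A": [10.0, 12.0], "Country B": [9.0, 11.0]},
    index=[1952, 1957],
)

# DataFrame.plot returns the Axes it drew on; that Axes knows its figure.
ax = toy.plot(kind="bar")
fig = ax.get_figure()

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "my_figure.png")
fig.savefig(out_path)

saved_bytes = os.path.getsize(out_path)
plt.close(fig)
```

Holding on to `fig` explicitly avoids depending on which figure pyplot currently treats as the current one.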
Making your plots accessible
Whenever you are generating plots to go into a paper or a presentation, there are a few things you can do to make sure that everyone can understand your plots.
- Always make sure your text is large enough to read. Use the fontsize parameter in xlabel, ylabel, title, and legend, and tick_params with labelsize to increase the text size of the numbers on your axes.
- Similarly, you should make your graph elements easy to see. Use s to increase the size of your scatterplot markers and linewidth to increase the sizes of your plot lines.
- Using color (and nothing else) to distinguish between different plot elements will make your plots unreadable to anyone who is colorblind, or who happens to have a black-and-white office printer. For lines, the linestyle parameter lets you use different types of lines. For scatterplots, marker lets you change the shape of your points. If you're unsure about your colors, you can use Coblis or Color Oracle to simulate what your plots would look like to those with colorblindness.
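The suggestions above can be combined in one plot, as in the following sketch. It uses the Axes methods `set_xlabel`, `set_ylabel` and `set_title`, which accept `fontsize` just like `plt.xlabel`, `plt.ylabel` and `plt.title`. The data values, the label text and the specific sizes are placeholder choices, and the `Agg` backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

years = [1952, 1957, 1962]
series_a = [10.0, 12.0, 15.0]  # placeholder values
series_b = [9.0, 11.0, 13.0]   # placeholder values

fig, ax = plt.subplots()

# Distinguish the series by line style and marker shape, not only by color,
# and use thicker lines so they stay visible when printed or projected.
ax.plot(years, series_a, color="b", linestyle="-", marker="o",
        linewidth=3, label="Country A")
ax.plot(years, series_b, color="g", linestyle="--", marker="s",
        linewidth=3, label="Country B")

# Larger text for axis labels, title, tick numbers and legend.
ax.set_xlabel("Year", fontsize=14)
ax.set_ylabel("GDP per capita ($)", fontsize=14)
ax.set_title("Placeholder data", fontsize=16)
ax.tick_params(labelsize=12)
ax.legend(fontsize=12)

styles = [(line.get_linestyle(), line.get_marker()) for line in ax.get_lines()]
xlabel_size = ax.xaxis.label.get_fontsize()
plt.close(fig)
```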
Key Points
matplotlib is the most widely used scientific plotting library in Python.
Plot data directly from a Pandas dataframe.
Select and transform data, then plot it.
Many styles of plot are available: see the Python Graph Gallery for more options.
Can plot many sets of data together.
Source: https://swcarpentry.github.io/python-novice-gapminder/09-plotting/index.html
Posted by: janusagelf2001.blogspot.com