How To Draw Graphs In Python

Plotting

Overview

Teaching: 15 min
Exercises: 15 min

Questions

  • How can I plot my data?

  • How can I save my plot for publishing?

Objectives

  • Create a time series plot showing a single data set.

  • Create a scatter plot showing the relationship between two data sets.

matplotlib is the most widely used scientific plotting library in Python.

  • Commonly use a sub-library called matplotlib.pyplot.
  • The Jupyter Notebook will render plots inline by default.

    import matplotlib.pyplot as plt

  • Simple plots are then (fairly) simple to create.

    time = [0, 1, 2, 3]
    position = [0, 100, 200, 300]

    plt.plot(time, position)
    plt.xlabel('Time (hour)')
    plt.ylabel('Position (km)')

Simple Position-Time Plot

Display All Open Figures

In our Jupyter Notebook example, running the cell should generate the figure directly below the code. The figure is also included in the Notebook document for future viewing. However, other Python environments, like an interactive Python session started from a terminal or a Python script executed via the command line, require an additional command to display the figure.

Instruct matplotlib to show a figure:

    plt.show()

This command can also be used inside a Notebook, for instance to display multiple figures if several are created by a single cell.
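As a rough illustration of the multiple-figure case, the sketch below builds two figures in one go and then calls plt.show() once. It reuses the time and position lists from the first example; the second figure, the open_figures name, and the non-interactive Agg backend (chosen so the snippet also runs where no display is available) are assumptions made for this example.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this line in a Notebook
import matplotlib.pyplot as plt

plt.close('all')  # start from a clean slate so only our figures are open

time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

# First figure: a plain line plot.
plt.figure()
plt.plot(time, position)
plt.xlabel('Time (hour)')
plt.ylabel('Position (km)')

# Second figure: the same data drawn as round markers only.
plt.figure()
plt.plot(time, position, 'o')
plt.xlabel('Time (hour)')
plt.ylabel('Position (km)')

# Both figures are open at this point; one plt.show() call displays them all.
open_figures = plt.get_fignums()
print(open_figures)
plt.show()
```

With an interactive backend, plt.show() would open one window per figure; under Agg it does not open any window.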

Plot data directly from a Pandas dataframe.

  • We can also plot Pandas dataframes.
  • This implicitly uses matplotlib.pyplot.
  • Before plotting, we convert the column headings from a string to integer data type, since they represent numerical values.

    import pandas as pd

    data = pd.read_csv('data/gapminder_gdp_oceania.csv', index_col='country')

    # Extract year from last 4 characters of each column name
    # The current column names are structured as 'gdpPercap_(year)',
    # so we want to keep the (year) part only for clarity when plotting GDP vs. years
    # To do this we use strip(), which removes from the string the characters stated in the argument
    # This method works on strings, so we call str before strip()
    years = data.columns.str.strip('gdpPercap_')

    # Convert year values to integers, saving results back to dataframe
    data.columns = years.astype(int)

    data.loc['Australia'].plot()
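One detail worth checking by hand: strip() treats its argument as a set of characters to remove from both ends of the string, not as a literal prefix. It works here because no year begins or ends with one of those characters. The sketch below uses a tiny hand-made dataframe (placeholder values, not the real Gapminder numbers) so it runs without the CSV file; the shuffled and replaced names are hypothetical, and str.replace is shown only as one possible alternative.

```python
import pandas as pd

# Placeholder values standing in for data/gapminder_gdp_oceania.csv.
data = pd.DataFrame(
    {'gdpPercap_1952': [10000.0, 10500.0],
     'gdpPercap_1957': [11000.0, 12000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

# strip() removes any of the listed characters from both ends of each name,
# so the order of the characters in the argument does not matter.
years = data.columns.str.strip('gdpPercap_')
shuffled = data.columns.str.strip('_pacrePdg')

# A possible alternative that removes the literal text instead.
replaced = data.columns.str.replace('gdpPercap_', '', regex=False)

# Convert the year strings to integers and store them as the column names.
data.columns = years.astype(int)
print(list(years), list(shuffled), list(replaced))
print(data.columns.tolist())
```

All three approaches give the same year strings for these column names; they would differ for a name whose remaining part starts or ends with one of the stripped characters.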

GDP plot for Australia

Select and transform data, then plot it.

  • By default, DataFrame.plot plots with the rows as the X axis.
  • We can transpose the data in order to plot multiple series.

    data.T.plot()
    plt.ylabel('GDP per capita')
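To see why the transpose matters, the following sketch compares the two orientations on a small hand-made dataframe (placeholder values; the Agg backend line is an assumption so that it runs without a display). DataFrame.plot draws one line per column, so the untransposed frame yields one line per year, while the transposed frame yields one line per country.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values with the same layout as the Oceania data after cleaning.
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0], 1962: [12000.0, 13000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

# Untransposed: countries on the X axis, one line per year column.
ax_rows = data.plot()
per_year = [line.get_label() for line in ax_rows.get_lines()]

# Transposed: years on the X axis, one line per country.
ax_t = data.T.plot()
plt.ylabel('GDP per capita')
per_country = [line.get_label() for line in ax_t.get_lines()]

print(per_year)
print(per_country)
```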

GDP plot for Australia and New Zealand

Many styles of plot are available.

  • For example, do a bar plot using a fancier style.

    plt.style.use('ggplot')
    data.T.plot(kind='bar')
    plt.ylabel('GDP per capita')
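Which styles exist depends on the installed matplotlib version, so it can help to list them first. The sketch below prints the available names and applies 'ggplot' only temporarily with plt.style.context, which is an alternative to the global plt.style.use call; the Agg backend line and the sample numbers are assumptions for the example.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt

# Names of all styles that this matplotlib installation knows about.
available = plt.style.available
print(available)

# Apply the style only inside the with-block; later plots keep the default look.
with plt.style.context('ggplot'):
    fig, ax = plt.subplots()
    ax.bar(['1952', '1957', '1962'], [10000.0, 11000.0, 12000.0])  # placeholder values
    ax.set_ylabel('GDP per capita')
```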

GDP barplot for Australia

Data can also be plotted by calling the matplotlib plot function directly.

  • The command is plt.plot(x, y)
  • The color and format of markers can also be specified as an additional optional argument, e.g., b- is a blue line, g-- is a green dashed line.

Get Australia data from dataframe

    years = data.columns
    gdp_australia = data.loc['Australia']

    plt.plot(years, gdp_australia, 'g--')

GDP formatted plot for Australia

Can plot many sets of data together.

    # Select two countries' worth of data.
    gdp_australia = data.loc['Australia']
    gdp_nz = data.loc['New Zealand']

    # Plot with differently-colored markers.
    plt.plot(years, gdp_australia, 'b-', label='Australia')
    plt.plot(years, gdp_nz, 'g-', label='New Zealand')

    # Create legend.
    plt.legend(loc='upper left')
    plt.xlabel('Year')
    plt.ylabel('GDP per capita ($)')

Adding a Legend

Often when plotting multiple datasets on the same figure it is desirable to have a legend describing the data.

This can be done in matplotlib in two stages:

  • Provide a label for each dataset in the figure:

    plt.plot(years, gdp_australia, label='Australia')
    plt.plot(years, gdp_nz, label='New Zealand')

  • Instruct matplotlib to create the legend:

    plt.legend()

By default matplotlib will attempt to place the legend in a suitable position. If you would rather specify a position, this can be done with the loc= argument, e.g. to place the legend in the upper left corner of the plot, specify loc='upper left'.

GDP formatted plot for Australia and New Zealand

  • Plot a scatter plot correlating the GDP of Australia and New Zealand
  • Use either plt.scatter or DataFrame.plot.scatter

    plt.scatter(gdp_australia, gdp_nz)

GDP correlation using plt.scatter

    data.T.plot.scatter(x='Australia', y='New Zealand')

GDP correlation using data.T.plot.scatter

Minima and Maxima

Fill in the blanks below to plot the minimum GDP per capita over time for all the countries in Europe. Modify it again to plot the maximum GDP per capita over time for Europe.

    data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
    data_europe.____.plot(label='min')
    data_europe.____
    plt.legend(loc='best')
    plt.xticks(rotation=90)

Solution

    data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
    data_europe.min().plot(label='min')
    data_europe.max().plot(label='max')
    plt.legend(loc='best')
    plt.xticks(rotation=90)

Minima Maxima Solution

Correlations

Modify the example in the notes to create a scatter plot showing the relationship between the minimum and maximum GDP per capita among the countries in Asia for each year in the data set. What relationship do you see (if any)?

Solution

    data_asia = pd.read_csv('data/gapminder_gdp_asia.csv', index_col='country')
    data_asia.describe().T.plot(kind='scatter', x='min', y='max')

Correlations Solution 1

No particular correlations can be seen between the minimum and maximum GDP values year on year. It seems the fortunes of Asian countries do not rise and fall together.

You might note that the variability in the maximum is much higher than that of the minimum. Take a look at the maximum and the max indexes:

    data_asia = pd.read_csv('data/gapminder_gdp_asia.csv', index_col='country')
    data_asia.max().plot()
    print(data_asia.idxmax())
    print(data_asia.idxmin())

Solution

Correlations Solution 2

The variability in this value seems to be due to a sharp drop after 1972. Some geopolitics at play perhaps? Given the dominance of oil-producing countries, maybe the Brent crude index would make an interesting comparison? Whilst Myanmar consistently has the lowest GDP, the highest GDP nation has varied more notably.

More Correlations

This short program creates a plot showing the correlation between GDP and life expectancy for 2007, normalizing marker size by population:

    data_all = pd.read_csv('data/gapminder_all.csv', index_col='country')
    data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007',
                  s=data_all['pop_2007']/1e6)

Using online help and other resources, explain what each argument to plot does.

Solution

More Correlations Solution

A good place to look is the documentation for the plot function - help(data_all.plot).

kind - As seen already this determines the kind of plot to be drawn.

x and y - A column name or index that determines what data will be placed on the x and y axes of the plot

s - Details for this can be found in the documentation of plt.scatter. A single number or one value for each data point. Determines the size of the plotted points.
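The effect of s can be checked on a toy table. The sketch below is only an illustration: the three rows are placeholder values with the same column names as the exercise (not real Gapminder data), the sizes and plotted_sizes names are hypothetical, and the Agg backend is an assumption so that it runs without a display. Dividing the population by 1e6 gives one marker size per row.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import pandas as pd

# Placeholder rows; column names follow the exercise above.
data_all = pd.DataFrame(
    {'gdpPercap_2007': [1000.0, 10000.0, 40000.0],
     'lifeExp_2007': [55.0, 70.0, 80.0],
     'pop_2007': [5e6, 20e6, 80e6]},
    index=pd.Index(['A', 'B', 'C'], name='country'),
)

# One size value per data point: population in millions.
sizes = data_all['pop_2007'] / 1e6

ax = data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007', s=sizes)

# The scatter collection stores the sizes it was given.
plotted_sizes = ax.collections[0].get_sizes()
print(sizes.tolist())
```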

Saving your plot to a file

If you are satisfied with the plot you see, you may want to save it to a file, perhaps to include it in a publication. There is a function in the matplotlib.pyplot module that accomplishes this: savefig. Calling this function, e.g. with

    plt.savefig('my_figure.png')

will save the current figure to the file my_figure.png. The file format will automatically be deduced from the file name extension (other formats are pdf, ps, eps and svg).

Note that functions in plt refer to a global figure variable, and after a figure has been displayed to the screen (e.g. with plt.show) matplotlib will make this variable refer to a new empty figure. Therefore, make sure you call plt.savefig before the plot is displayed to the screen, otherwise you may find a file with an empty plot.
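A minimal sketch of that ordering, saving first and showing afterwards, might look like the following. It also writes the same figure in three formats to show the extension deciding the file type. The temporary output directory, the saved list, and the Agg backend line are assumptions for this example; the data is the position-time example from the start of the lesson.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt

time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

plt.figure()
plt.plot(time, position)
plt.xlabel('Time (hour)')
plt.ylabel('Position (km)')

# Save BEFORE showing, once per format; the extension selects the file type.
out_dir = tempfile.mkdtemp()
saved = []
for extension in ['png', 'pdf', 'svg']:
    path = os.path.join(out_dir, 'my_figure.' + extension)
    plt.savefig(path)
    saved.append(path)

# Display only after the files have been written.
plt.show()
print(saved)
```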

When using dataframes, data is often generated and plotted to screen in one line, and plt.savefig seems not to be a possible approach. One possibility to save the figure to file is then to

  • save a reference to the current figure in a local variable (with plt.gcf)
  • call the savefig class method from that variable.

    data.plot(kind='bar')
    fig = plt.gcf()  # get current figure (after plotting, since DataFrame.plot creates its own figure)
    fig.savefig('my_figure.png')
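Another option, sketched here as an alternative rather than as the lesson's own method, relies on DataFrame.plot returning the matplotlib Axes it drew on; the Axes keeps a reference to its parent figure in its figure attribute. The dataframe values, the temporary output path, and the Agg backend line are assumptions for the example.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import pandas as pd

# Placeholder values with the same layout as the Oceania data after cleaning.
data = pd.DataFrame(
    {1952: [10000.0, 10500.0], 1957: [11000.0, 12000.0]},
    index=pd.Index(['Australia', 'New Zealand'], name='country'),
)

# DataFrame.plot returns the Axes object that holds the bars.
ax = data.plot(kind='bar')

# Every Axes knows which figure it belongs to.
fig = ax.figure

out_path = os.path.join(tempfile.mkdtemp(), 'my_figure.png')
fig.savefig(out_path)
```

Holding on to the returned Axes avoids depending on which figure happens to be current at the time of saving.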

Making your plots accessible

Whenever you are generating plots to go into a paper or a presentation, there are a few things you can do to make sure that everyone can understand your plots.

  • Always make sure your text is large enough to read. Use the fontsize parameter in xlabel, ylabel, title, and legend, and tick_params with labelsize to increase the text size of the numbers on your axes.
  • Similarly, you should make your graph elements easy to see. Use s to increase the size of your scatterplot markers and linewidth to increase the sizes of your plot lines.
  • Using color (and nothing else) to distinguish between different plot elements will make your plots unreadable to anyone who is colorblind, or who happens to have a black-and-white office printer. For lines, the linestyle parameter lets you use different types of lines. For scatterplots, marker lets you change the shape of your points. If you're unsure about your colors, you can use Coblis or Color Oracle to simulate what your plots would look like to those with colorblindness.
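The parameters named in the list above can be combined in one plot. The following is a sketch with placeholder numbers and hypothetical series names; the specific font sizes and line widths are arbitrary choices, and the Agg backend line is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt

years = [1952, 1957, 1962]
series_a = [10000.0, 11000.0, 12000.0]  # placeholder values
series_b = [10500.0, 12000.0, 13000.0]  # placeholder values

plt.figure()

# Distinguish the lines by line style and marker shape, not by color alone,
# and make them thick enough to see.
line_a, = plt.plot(years, series_a, linestyle='-', marker='o',
                   linewidth=3, label='Series A')
line_b, = plt.plot(years, series_b, linestyle='--', marker='s',
                   linewidth=3, label='Series B')

# Enlarge every piece of text: axis labels, title, legend, and tick numbers.
plt.xlabel('Year', fontsize=14)
plt.ylabel('GDP per capita ($)', fontsize=14)
plt.title('Placeholder data', fontsize=16)
plt.legend(fontsize=12)
plt.tick_params(labelsize=12)
```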

Key Points

  • matplotlib is the most widely used scientific plotting library in Python.

  • Plot data directly from a Pandas dataframe.

  • Select and transform data, then plot it.

  • Many styles of plot are available: see the Python Graph Gallery for more options.

  • Can plot many sets of data together.

Source: https://swcarpentry.github.io/python-novice-gapminder/09-plotting/index.html

Posted by: janusagelf2001.blogspot.com
