Practical work with matplotlib (2/2) - solutions

Any issue related to the proper execution of code on your machine must be solved during this session. Feel free to ask for help.

We'll use panel data: nominal GDP per year and per country. The dataset and its documentation are available here.

Objective: apply what you have learned during this session.

Since matplotlib enables us to make very flexible graphs, we can make them as elegant as possible. This is a good exercise to learn how to use matplotlib to its full potential.

Dataset Information

The dataset is the same as that used in the first practical work session: Practical work with matplotlib (1/2).

  • Source: World Bank GDP data
  • URL: https://raw.githubusercontent.com/datasets/gdp/master/data/gdp.csv
  • Key Columns: Country Name, Country Code, Year, Value (GDP in USD)
  • Time Range: 1960-2020

Important information

Some entities are not countries but rather regions, income groups, etc. In some cases, you should exclude them; in other cases, they can be very useful. Here is the list of these entities.

python
non_country_entities = [
    ['Africa Eastern and Southern', 'AFE'],
    ['Africa Western and Central', 'AFW'],
    ['Arab World', 'ARB'],
    ['Caribbean small states', 'CSS'],
    ['Central Europe and the Baltics', 'CEB'],
    ['Channel Islands', 'CHI'],
    ['Early-demographic dividend', 'EAR'],
    ['East Asia & Pacific', 'EAS'],
    ['East Asia & Pacific (IDA & IBRD countries)', 'TEA'],
    ['East Asia & Pacific (excluding high income)', 'EAP'],
    ['Euro area', 'EMU'],
    ['Europe & Central Asia', 'ECS'],
    ['Europe & Central Asia (IDA & IBRD countries)', 'TEC'],
    ['Europe & Central Asia (excluding high income)', 'ECA'],
    ['European Union', 'EUU'],
    ['Fragile and conflict affected situations', 'FCS'],
    ['Heavily indebted poor countries (HIPC)', 'HPC'],
    ['High income', 'HIC'],
    ['IBRD only', 'IBD'],
    ['IDA & IBRD total', 'IBT'],
    ['IDA blend', 'IDB'],
    ['IDA only', 'IDX'],
    ['IDA total', 'IDA'],
    ['Late-demographic dividend', 'LTE'],
    ['Latin America & Caribbean', 'LCN'],
    ['Latin America & Caribbean (excluding high income)', 'LAC'],
    ['Latin America & the Caribbean (IDA & IBRD countries)', 'TLA'],
    ['Least developed countries: UN classification', 'LDC'],
    ['Low & middle income', 'LMY'],
    ['Low income', 'LIC'],
    ['Lower middle income', 'LMC'],
    ['Middle East & North Africa', 'MEA'],
    ['Middle East & North Africa (IDA & IBRD countries)', 'TMN'],
    ['Middle East & North Africa (excluding high income)', 'MNA'],
    ['Middle income', 'MIC'],
    ['North America', 'NAC'],
    ['OECD members', 'OED'],
    ['Other small states', 'OSS'],
    ['Pacific island small states', 'PSS'],
    ['Post-demographic dividend', 'PST'],
    ['Pre-demographic dividend', 'PRE'],
    ['Small states', 'SST'],
    ['South Asia', 'SAS'],
    ['South Asia (IDA & IBRD)', 'TSA'],
    ['Sub-Saharan Africa', 'SSF'],
    ['Sub-Saharan Africa (IDA & IBRD countries)', 'TSS'],
    ['Sub-Saharan Africa (excluding high income)', 'SSA'],
    ['Upper middle income', 'UMC'],
    ['World', 'WLD']
]
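The list above stores `[name, code]` pairs, whereas filtering only needs the codes. As a minimal sketch, the helper below (`split_entities` is a hypothetical name, not part of the original material) derives the set of codes from such pairs and splits a DataFrame into countries and aggregates. It assumes the DataFrame has the `Country Code` column described in the dataset information.

```python
def split_entities(df, entities):
    """Split df into (countries, aggregates) given a list of [name, code] pairs.

    Hypothetical helper: assumes df has a 'Country Code' column.
    """
    # Keep only the codes; a set makes membership tests fast
    codes = {code for _name, code in entities}
    is_aggregate = df["Country Code"].isin(codes)
    return df[~is_aggregate], df[is_aggregate]
```

With the list defined above, this could be called as `df_countries, df_non_countries = split_entities(df, non_country_entities)`.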

Setup Code (Run First)

Using pyodide

python
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from pyodide.http import open_url

# Load data
url = "https://raw.githubusercontent.com/datasets/gdp/master/data/gdp.csv"
df = pd.read_csv(open_url(url))

# Exclude non-country entities (regions, income groups)
non_country_entities = {
    'AFE', 'AFW', 'ARB', 'CSS', 'CEB', 'CHI', 'EAR', 'EAS', 'TEA', 'EAP', 
    'EMU', 'ECS', 'TEC', 'ECA', 'EUU', 'FCS', 'HPC', 'HIC', 'IBD', 'IBT', 
    'IDB', 'IDX', 'IDA', 'LTE', 'LCN', 'LAC', 'TLA', 'LDC', 'LMY', 'LIC', 
    'LMC', 'MEA', 'TMN', 'MNA', 'MIC', 'NAC', 'OED', 'OSS', 'PSS', 'PST', 
    'PRE', 'SAS', 'TSA', 'SSF', 'TSS', 'SSA', 'SST', 'UMC', 'WLD'
}
df_countries = df[~df['Country Code'].isin(non_country_entities)]

df_non_countries = df[df['Country Code'].isin(non_country_entities)]

print(f"Dataset loaded: {df_countries.shape[0]} rows, {df_countries['Country Name'].nunique()} countries")

Local execution

python
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Load data (assumes gdp.csv has been downloaded from the URL above into the working directory)
df = pd.read_csv("gdp.csv")

# Exclude non-country entities (regions, income groups)
non_country_entities = {
    'AFE', 'AFW', 'ARB', 'CSS', 'CEB', 'CHI', 'EAR', 'EAS', 'TEA', 'EAP', 
    'EMU', 'ECS', 'TEC', 'ECA', 'EUU', 'FCS', 'HPC', 'HIC', 'IBD', 'IBT', 
    'IDB', 'IDX', 'IDA', 'LTE', 'LCN', 'LAC', 'TLA', 'LDC', 'LMY', 'LIC', 
    'LMC', 'MEA', 'TMN', 'MNA', 'MIC', 'NAC', 'OED', 'OSS', 'PSS', 'PST', 
    'PRE', 'SAS', 'TSA', 'SSF', 'TSS', 'SSA', 'SST', 'UMC', 'WLD'
}
df_countries = df[~df['Country Code'].isin(non_country_entities)]

df_non_countries = df[df['Country Code'].isin(non_country_entities)]

print(f"Dataset loaded: {df_countries.shape[0]} rows, {df_countries['Country Name'].nunique()} countries")

Exercises: Ranking

This exercise illustrates the Ranking section of the Visual Vocabulary - Financial Times Guide.

Ranking visualizations are essential for showing order and hierarchy in data. They help readers quickly identify leaders, laggards, and relative positions. In this exercise, you'll explore different ways to visualize rankings using GDP data.

Exercise 1.1: Ordered Bar Chart

Task: Create a horizontal bar chart showing the top 15 economies by GDP in 2019, ordered from highest to lowest (from bottom to top).

Requirements:

  • Sort countries by GDP value
  • Use a gradient colormap to emphasize ranking
  • Format GDP values in trillions
  • Add value labels at the end of each bar
  • Include gridlines for easier reading
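One possible way to meet these requirements is sketched below. It is not the only valid answer: the function name `plot_top_economies`, the `Blues` colormap and the label format are illustrative choices. The sketch assumes `df_countries` comes from the setup code and has the `Country Name`, `Year` and `Value` columns. It follows the task statement literally, so the largest economy sits on the lowest bar.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_top_economies(df_countries, year=2019, n=15):
    """Ordered horizontal bar chart of the n largest economies in `year`."""
    # nlargest returns rows from highest to lowest; barh draws the first row
    # at the bottom, so the largest economy ends up on the lowest bar.
    top = df_countries[df_countries["Year"] == year].nlargest(n, "Value")
    gdp_trillions = top["Value"].to_numpy() / 1e12

    # Gradient colormap: darkest shade for the largest economy
    colors = plt.cm.Blues(np.linspace(0.9, 0.3, len(top)))

    fig, ax = plt.subplots(figsize=(9, 7))
    ax.barh(top["Country Name"].to_numpy(), gdp_trillions, color=colors)

    # Value label just past the end of each bar
    for position, value in enumerate(gdp_trillions):
        offset = 0.01 * gdp_trillions.max()
        ax.text(value + offset, position, f"{value:.2f}T", va="center")

    ax.set_xlabel("GDP (trillions of current USD)")
    ax.set_title(f"Top {n} economies by nominal GDP, {year}")
    ax.grid(axis="x", linestyle="--", alpha=0.5)
    ax.set_axisbelow(True)  # keep gridlines behind the bars
    ax.margins(x=0.1)       # leave room for the value labels
    fig.tight_layout()
    return fig, ax
```

Calling `plot_top_economies(df_countries)` followed by `plt.show()` displays the chart. If you prefer the largest economy at the top, `ax.invert_yaxis()` flips the order.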

Exercise 1.2: Lollipop Chart

Task: Create a lollipop chart comparing the nominal GDP of the G7 countries in 2010 and in 2019.

Requirements:

  • Show both 2010 and 2019 values on the same chart
  • Use different markers for each year
  • Sort by 2019 values
  • Add a legend and appropriate labels
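The requirements above can be sketched as follows. The function name `plot_g7_lollipop` and the styling are illustrative assumptions. The G7 members are selected by ISO 3166 alpha-3 code, which is assumed to match the `Country Code` column. Each stem runs from zero to the larger of the two values, with one marker per year.

```python
import matplotlib.pyplot as plt
import numpy as np

# G7 members identified by ISO alpha-3 code
G7_CODES = ["CAN", "FRA", "DEU", "ITA", "JPN", "GBR", "USA"]


def plot_g7_lollipop(df_countries, start=2010, end=2019, codes=G7_CODES):
    """Lollipop chart of GDP in two years, sorted by the later year."""
    subset = df_countries[
        df_countries["Country Code"].isin(codes)
        & df_countries["Year"].isin([start, end])
    ]
    # One row per country, one column per year, values in trillions of USD
    wide = subset.pivot(index="Country Name", columns="Year", values="Value")
    wide = wide.sort_values(end) / 1e12
    y = np.arange(len(wide))

    fig, ax = plt.subplots(figsize=(9, 5))
    # Stem from zero to the larger of the two values
    ax.hlines(y, 0, wide[[start, end]].max(axis=1), color="grey", linewidth=2, zorder=1)
    # A different marker for each year
    ax.scatter(wide[start], y, marker="o", s=80, label=str(start), zorder=2)
    ax.scatter(wide[end], y, marker="D", s=80, label=str(end), zorder=2)

    ax.set_yticks(y)
    ax.set_yticklabels(wide.index)
    ax.set_xlabel("GDP (trillions of current USD)")
    ax.set_title(f"G7 nominal GDP, {start} vs {end}")
    ax.legend(title="Year")
    fig.tight_layout()
    return fig, ax
```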

Exercise 1.3: Slope Chart

Task: Create a slope chart showing how the ranking of the top 10 economies changed between 2000 and 2019.

Requirements:

  • Show rankings (not values) on the y-axis
  • Connect same countries with lines
  • Color code lines by change direction
  • Label countries on both sides
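A possible sketch of this slope chart is given below. Several choices are assumptions rather than part of the task: the "top 10" is taken to mean the ten largest economies in the end year, ranks are computed only among countries that have data in both years, and green/red/grey encode a rise, a fall or no change in rank. The name `plot_rank_slope` is hypothetical.

```python
import matplotlib.pyplot as plt


def plot_rank_slope(df_countries, start=2000, end=2019, n=10):
    """Slope chart of GDP rank changes for the top n economies of `end`."""
    wide = (
        df_countries[df_countries["Year"].isin([start, end])]
        .pivot(index="Country Name", columns="Year", values="Value")
        .dropna()  # keep countries with a value in both years
    )
    # Rank 1 = largest GDP; 'first' breaks ties so ranks stay unique
    ranks = wide.rank(ascending=False, method="first").astype(int)
    top = ranks[ranks[end] <= n].sort_values(end)

    fig, ax = plt.subplots(figsize=(7, 8))
    for name, row in top.iterrows():
        r0, r1 = row[start], row[end]
        # Colour by direction: a smaller rank number means a better position
        color = "tab:green" if r1 < r0 else "tab:red" if r1 > r0 else "grey"
        ax.plot([0, 1], [r0, r1], marker="o", color=color)
        ax.text(-0.05, r0, f"{r0}. {name}", ha="right", va="center")
        ax.text(1.05, r1, f"{r1}. {name}", ha="left", va="center")

    ax.invert_yaxis()  # rank 1 at the top
    ax.set_xticks([0, 1])
    ax.set_xticklabels([str(start), str(end)])
    ax.set_xlim(-0.7, 1.7)  # room for the labels on both sides
    ax.set_ylabel("GDP rank")
    ax.set_title(f"Ranking of the top {n} economies, {start} to {end}")
    fig.tight_layout()
    return fig, ax
```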

Exercise 1.4: Dot Strip Plot

Task: Create a dot strip plot showing nominal GDP ranges for different regions in 2019.

Requirements:

  • Group by regions
  • Show individual countries as dots
  • Highlight median values
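The dataset's key columns do not include a region, so grouping by region needs an external mapping. The sketch below therefore takes a `region_of` dictionary from country code to region name as a parameter. `EXAMPLE_REGIONS` is a deliberately small, incomplete and purely illustrative mapping, not something provided by the dataset. The logarithmic x-axis is also a design assumption, chosen because GDP values span several orders of magnitude.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative, incomplete mapping (an assumption, not part of the dataset)
EXAMPLE_REGIONS = {
    "USA": "North America", "CAN": "North America",
    "DEU": "Europe", "FRA": "Europe", "ITA": "Europe",
    "CHN": "Asia", "JPN": "Asia", "IND": "Asia",
    "BRA": "South America", "ARG": "South America",
    "NGA": "Africa", "ZAF": "Africa",
}


def plot_region_strips(df_countries, region_of, year=2019):
    """Dot strip plot: one row per region, one dot per country."""
    snap = df_countries[df_countries["Year"] == year].copy()
    snap["Region"] = snap["Country Code"].map(region_of)
    snap = snap.dropna(subset=["Region", "Value"])  # drop unmapped countries

    # Order the rows by median GDP, smallest at the bottom
    medians = snap.groupby("Region")["Value"].median().sort_values()

    fig, ax = plt.subplots(figsize=(9, 5))
    for i, region in enumerate(medians.index):
        values = snap.loc[snap["Region"] == region, "Value"]
        ax.scatter(values, np.full(len(values), i), s=40, alpha=0.6, color="tab:blue")
        # Highlight the median with a tall vertical tick
        ax.scatter(medians[region], i, marker="|", s=600, linewidths=3,
                   color="tab:red", zorder=3, label="Median" if i == 0 else None)

    ax.set_xscale("log")
    ax.set_yticks(range(len(medians)))
    ax.set_yticklabels(medians.index)
    ax.set_xlabel("GDP (current USD, log scale)")
    ax.set_title(f"Nominal GDP by region, {year}")
    ax.legend()
    fig.tight_layout()
    return fig, ax
```

A call such as `plot_region_strips(df_countries, EXAMPLE_REGIONS)` only plots the countries present in the mapping. A complete country-to-region table would have to come from another source.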

Exercise 1.5: Bump Chart

Task: Create a bump chart showing ranking evolution of selected economies from 2010 to 2019.

Requirements:

  • Track ranking changes year by year
  • Use smooth lines to connect rankings
  • Apply distinct colors for each country
  • Show all intermediate years
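A bump chart along these lines could be sketched as below. The "smooth lines" are obtained here with a simple smoothstep easing between consecutive years, which is one option among several (splines would be another). `smooth_path` and `plot_bump_chart` are hypothetical names, the `tab10` palette is an illustrative choice, and ranks are assumed to be computed among all countries in `df_countries` for each year.

```python
import matplotlib.pyplot as plt
import numpy as np


def smooth_path(x, y, points_per_step=20):
    """Interpolate between consecutive points with a smoothstep easing curve."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) < 2:
        return x, y
    t = np.linspace(0, 1, points_per_step, endpoint=False)
    ease = 3 * t**2 - 2 * t**3  # flat at both ends, steepest in the middle
    xs = [x0 + (x1 - x0) * t for x0, x1 in zip(x[:-1], x[1:])]
    ys = [y0 + (y1 - y0) * ease for y0, y1 in zip(y[:-1], y[1:])]
    return np.append(np.concatenate(xs), x[-1]), np.append(np.concatenate(ys), y[-1])


def plot_bump_chart(df_countries, codes, start=2010, end=2019):
    """Bump chart of yearly GDP rank for the countries listed in `codes`."""
    period = df_countries[df_countries["Year"].between(start, end)]
    period = period.dropna(subset=["Value"]).copy()
    # Rank within each year; rank 1 = largest GDP
    period["Rank"] = period.groupby("Year")["Value"].rank(ascending=False, method="first")
    selected = period[period["Country Code"].isin(codes)]

    fig, ax = plt.subplots(figsize=(10, 6))
    cmap = plt.get_cmap("tab10")
    for i, (name, grp) in enumerate(selected.groupby("Country Name")):
        grp = grp.sort_values("Year")
        color = cmap(i % 10)  # distinct colour per country (cycles after 10)
        xs, ys = smooth_path(grp["Year"], grp["Rank"])
        ax.plot(xs, ys, color=color, linewidth=2)
        ax.scatter(grp["Year"], grp["Rank"], color=color, s=40, zorder=3)
        ax.text(grp["Year"].iloc[-1] + 0.15, grp["Rank"].iloc[-1], name,
                va="center", color=color)

    ax.invert_yaxis()  # rank 1 at the top
    ax.set_xticks(range(start, end + 1))  # show every intermediate year
    ax.set_xlim(start - 0.3, end + 2)     # room for the country labels
    ax.set_ylabel("GDP rank")
    ax.set_title(f"GDP ranking, {start} to {end}")
    fig.tight_layout()
    return fig, ax
```

For example, `plot_bump_chart(df_countries, ["USA", "CHN", "JPN", "DEU", "IND"])` tracks five economies. That selection is only an example, since the task leaves the choice of countries open.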

Exercise 1.6: Ordered Proportional Symbol

Task: Create a proportional symbol chart showing GDP sizes with country positions based on GDP growth rate.

Requirements:

  • Calculate growth rate between 2010 and 2019
  • Filter for countries with significant GDP growth rate (top 30 in 2019)
  • Size circles by 2019 GDP
  • Position on x-axis by growth rate
  • Color by GDP size category
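One way to read these requirements is sketched below, under several assumptions that are not stated in the task: "top 30 in 2019" is taken to mean the 30 largest economies by 2019 GDP, growth is the total percentage change between the two years, the size categories use hypothetical thresholds of 1 and 5 trillion USD, and each category gets its own horizontal band to limit overlap. The name `plot_growth_bubbles` is illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical GDP size categories (thresholds in trillions of USD)
CATEGORY_COLORS = {
    "under 1T USD": "tab:blue",
    "1 to 5T USD": "tab:orange",
    "over 5T USD": "tab:red",
}


def plot_growth_bubbles(df_countries, start=2010, end=2019, n=30):
    """Proportional symbols: x = growth rate, area = GDP in `end`."""
    wide = (
        df_countries[df_countries["Year"].isin([start, end])]
        .pivot(index="Country Name", columns="Year", values="Value")
        .dropna()
        .nlargest(n, end)  # assumption: keep the n largest economies in `end`
    )
    growth = (wide[end] / wide[start] - 1) * 100  # total change in percent
    gdp_trn = wide[end] / 1e12
    category = pd.cut(gdp_trn, bins=[0, 1, 5, np.inf], labels=list(CATEGORY_COLORS))

    fig, ax = plt.subplots(figsize=(11, 5))
    for level, (label, color) in enumerate(CATEGORY_COLORS.items()):
        mask = (category == label).to_numpy()
        ax.scatter(growth[mask], np.full(mask.sum(), level), s=gdp_trn[mask] * 150,
                   color=color, alpha=0.6, edgecolor="black")

    ax.axvline(0, color="grey", linewidth=1)  # separates growth from decline
    ax.set_yticks(range(len(CATEGORY_COLORS)))
    ax.set_yticklabels(list(CATEGORY_COLORS))
    ax.set_ylim(-0.7, len(CATEGORY_COLORS) - 0.3)
    ax.set_xlabel(f"Nominal GDP growth, {start} to {end} (%)")
    ax.set_title(f"GDP growth of the {n} largest economies (circle area = {end} GDP)")
    fig.tight_layout()
    return fig, ax
```

Because the data are nominal GDP in USD, the growth rates mix real growth, inflation and exchange-rate movements, which is worth keeping in mind when reading the chart.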