Data Types Classification and Introduction to Matplotlib
Understanding data types and basic visualization with matplotlib
Data Types Classification
1. Qualitative (Categorical) Variables
1.1. Nominal Variables
- Definition: Categories without inherent order
- Examples:
- Country names (France, USA, Japan)
- Colors (red, blue, green)
- Gender (male, female, other)
- Operations: Equality (=, ≠) only
- Basic visualization: Bar charts, pie charts
1.2. Ordinal Variables
- Definition: Categories with meaningful order
- Examples:
- Education level (Primary < Secondary < University)
- Survey ratings (Poor < Fair < Good < Excellent)
- Size categories (Small < Medium < Large)
- Operations: Equality and comparison (<, >, ≤, ≥)
- Basic visualization: Bar charts with ordered categories
2. Quantitative (Numerical) Variables
2.1. Discrete Variables
- Definition: Countable values, often integers
- Examples:
- Number of employees: 25, 126, 512
- Year of construction: 2010, 2015, 2023
- Number of children: 0, 1, 2, 3
- Operations: All numerical operations, although some are not always meaningful (for example, summing years)
- Basic visualization: Bar charts, scatter plots
2.2. Continuous Variables
- Definition: Any value within a range
- Examples:
- Temperature: 23.5°C, 24.7°C
- GDP: 1.234 trillion dollars, 45.678 billion dollars
- Height: 1.75m, 1.823m
- Operations: All numerical operations
- Basic visualization: Line plots, histograms, scatter plots
Quick Reference Table
Type | Order | Math Operations | Example | Best Chart Types |
---|---|---|---|---|
Nominal | ❌ | Count only | Country names | Bar, Pie |
Ordinal | ✓ | Count, Compare | Education level | Ordered Bar |
Discrete | ✓ | All | Year, Count | Bar, Scatter |
Continuous | ✓ | All | GDP, Temperature | Line, Histogram |
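The difference between nominal and ordinal variables can be sketched in code. The example below assumes pandas is available and uses its categorical dtype; the variable names and sample values are illustrative only and are not part of the original material.

```python
import pandas as pd

# Nominal: categories with no order; only equality tests and counts make sense.
countries = pd.Series(["France", "USA", "Japan", "France"], dtype="category")
france_count = int((countries == "France").sum())

# An ordering comparison on an unordered categorical raises a TypeError.
try:
    countries < "USA"
    nominal_comparison_allowed = True
except TypeError:
    nominal_comparison_allowed = False

# Ordinal: the order is declared explicitly, from lowest to highest.
rating_levels = ["Poor", "Fair", "Good", "Excellent"]
ratings = pd.Series(
    pd.Categorical(
        ["Good", "Poor", "Excellent", "Fair"],
        categories=rating_levels,
        ordered=True,
    )
)
above_fair = (ratings > "Fair").tolist()       # element-wise comparison
best_rating = ratings.max()                    # uses the declared order
sorted_ratings = ratings.sort_values().tolist()

print(france_count, nominal_comparison_allowed)
print(above_fair, best_rating, sorted_ratings)
```

Declaring the order explicitly matters: without `ordered=True`, pandas treats the ratings as nominal and refuses `<` and `>` comparisons.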
Basics with Matplotlib
2.1. Getting Started with Simple Plots
Key principles
matplotlib is a powerful library for creating static, interactive, and animated visualizations in Python.
In particular, much of the complexity of building the plot layout is handled by matplotlib, without us having to worry about it.
However, if more control is needed, we can always use the matplotlib API to customize a plot.
Official documentation
Basic import and first plot
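A minimal sketch of an import and a first plot, assuming the conventional `plt` alias for `matplotlib.pyplot`. The data values and the title are illustrative.

```python
import matplotlib.pyplot as plt

# Illustrative values only (y is x squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# plt.plot returns a list of Line2D objects, one per plotted line.
lines = plt.plot(x, y)
plt.title("My first plot")
plt.show()
```

No figure or axes were created explicitly here; matplotlib creates them on the first plotting call.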
2.2. Line Plot (plt.plot)
Key parameters: x, y, color, linewidth, linestyle, marker, markersize, label, alpha
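The parameters listed above can be combined in a single call. The following is a sketch with made-up values; the series name and colors are arbitrary choices.

```python
import matplotlib.pyplot as plt

years = [2019, 2020, 2021, 2022, 2023]
values = [3.0, 2.1, 3.8, 4.2, 4.0]  # made-up values for illustration

(line,) = plt.plot(
    years,
    values,
    color="tab:blue",   # line color
    linewidth=2,        # line thickness in points
    linestyle="--",     # dashed line
    marker="o",         # circle at each data point
    markersize=8,       # marker size in points
    label="Series A",   # text shown in the legend
    alpha=0.8,          # transparency (0 = invisible, 1 = opaque)
)
plt.legend()
plt.show()
```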
2.3. Scatter Plot (plt.scatter)
Key parameters: x, y, s (size), c (color), alpha, edgecolors, linewidths, marker
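A short sketch of a scatter plot using these parameters, with illustrative data. Here `c` receives numbers, which matplotlib maps to colors through the default colormap; passing color names instead would also work.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2.0, 3.5, 1.5, 4.0, 3.0]
sizes = [40, 80, 120, 160, 200]     # marker areas, in points squared
values = [0.1, 0.3, 0.5, 0.7, 0.9]  # mapped to colors via the colormap

points = plt.scatter(
    x,
    y,
    s=sizes,
    c=values,
    alpha=0.7,
    edgecolors="black",
    linewidths=1,
    marker="o",
)
plt.colorbar(points, label="value")  # legend for the color scale
plt.show()
```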
2.4. Bar Chart (plt.bar)
Key parameters: x, height, width, color, edgecolor, linewidth, alpha, label
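As a sketch, a bar chart over four hypothetical categories could look like this; the category names and heights are placeholders.

```python
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]  # placeholder category names
heights = [5, 8, 3, 6]             # placeholder counts

bars = plt.bar(
    categories,
    heights,
    width=0.6,           # bar width as a fraction of the slot
    color="skyblue",
    edgecolor="navy",
    linewidth=1.5,       # width of the bar outline
    alpha=0.9,
    label="Count",
)
plt.legend()
plt.show()
```

`plt.bar` returns a container with one rectangle per bar, which is convenient for annotating or recoloring individual bars afterwards.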
2.5. Pie Chart (plt.pie)
Key parameters: x, labels, colors, autopct, startangle, explode, shadow
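A sketch of a pie chart with these parameters. The shares and labels are invented for illustration; `autopct` takes a format string that is applied to each wedge's percentage.

```python
import matplotlib.pyplot as plt

shares = [40, 30, 20, 10]                      # invented values
labels = ["North", "South", "East", "West"]    # invented labels
colors = ["tab:blue", "tab:orange", "tab:green", "tab:red"]

# With autopct set, plt.pie returns wedges, label texts, and percentage texts.
wedges, texts, autotexts = plt.pie(
    shares,
    labels=labels,
    colors=colors,
    autopct="%1.1f%%",       # show percentages with one decimal
    startangle=90,           # rotate so the first wedge starts at the top
    explode=(0.1, 0, 0, 0),  # pull the first wedge out slightly
    shadow=True,
)
plt.show()
```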
2.6. Subplots (plt.subplots)
Creating multiple plots: fig, axes = plt.subplots(nrows, ncols)
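The call above returns a figure and an array of axes. A minimal sketch of a 2 by 2 grid, with illustrative data, might look like this; each axes object has its own plotting methods.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]  # illustrative values

# With nrows=2 and ncols=2, axes is a 2x2 NumPy array of Axes objects.
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))

axes[0, 0].plot(x, y)
axes[0, 0].set_title("Line")

axes[0, 1].scatter(x, y)
axes[0, 1].set_title("Scatter")

axes[1, 0].bar(x, y)
axes[1, 0].set_title("Bar")

axes[1, 1].pie(y, labels=["a", "b", "c", "d"])
axes[1, 1].set_title("Pie")

fig.tight_layout()  # reduce overlap between titles and neighboring plots
plt.show()
```

Note that on an axes object the title is set with `set_title`, not `title`; the `plt.*` functions act on the current axes only.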
2.7. Figure and Axes Control
Key methods for customization:
- plt.figure(): Create a new figure with figsize, dpi, facecolor
- plt.xlabel(), plt.ylabel(): Set axis labels with fontsize, fontweight
- plt.title(): Set plot title with fontsize, fontweight, pad
- plt.xlim(), plt.ylim(): Set axis limits
- plt.xticks(), plt.yticks(): Customize tick positions and labels
- plt.legend(): Add legend with loc, fontsize, title
- plt.grid(): Add grid with axis, alpha, linestyle
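These customization methods can be sketched together on one plot. The data, label texts, and styling values below are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 3, 5, 4]  # illustrative values

# Figure size is in inches; dpi controls the resolution.
fig = plt.figure(figsize=(8, 5), dpi=100, facecolor="white")
plt.plot(x, y, marker="o", label="Sample")

plt.xlabel("Step", fontsize=12, fontweight="bold")
plt.ylabel("Value", fontsize=12, fontweight="bold")
plt.title("Customized plot", fontsize=14, fontweight="bold", pad=15)

plt.xlim(0, 6)
plt.ylim(0, 6)
# Tick positions first, then the text to display at each position.
plt.xticks([1, 2, 3, 4, 5], ["one", "two", "three", "four", "five"])

plt.legend(loc="upper left", fontsize=10, title="Series")
plt.grid(True, axis="y", alpha=0.3, linestyle="--")  # horizontal grid lines only

ax = plt.gca()  # keep a handle on the current axes for later inspection
plt.show()
```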
3. Real Data Examples with GDP Dataset
3.1. Loading External Data (pandas with pyodide as backend)
3.2. Line Plot with Real Time Series Data
3.3. Bar Chart with Country Comparison
3.4. Scatter Plot - Year-to-Year GDP Growth Variations
3.5. Complex Visualization with Subplots - Global GDP Analysis
4. Quick Reference Guide
Essential Matplotlib Methods
Function | Purpose | Common Parameters |
---|---|---|
plt.figure() | Create new figure | figsize=(width, height), dpi, facecolor |
plt.plot() | Line plot | x, y, color, linewidth, linestyle, marker |
plt.scatter() | Scatter plot | x, y, s (size), c (color), alpha, edgecolors |
plt.bar() | Bar chart | x, height, width, color, edgecolor |
plt.pie() | Pie chart | x, labels, colors, autopct, explode |
plt.subplot() | Add one subplot to the current figure | nrows, ncols, index |
plt.subplots() | Create figure and axes | nrows, ncols, figsize |
plt.xlabel() | Set x-axis label | xlabel, fontsize, fontweight |
plt.ylabel() | Set y-axis label | ylabel, fontsize, fontweight |
plt.title() | Set plot title | label, fontsize, fontweight, pad |
plt.legend() | Add legend | labels, loc, fontsize |
plt.grid() | Add grid | visible (True/False), axis, alpha, linestyle |
plt.xlim() | Set x-axis limits | left, right |
plt.ylim() | Set y-axis limits | bottom, top |
plt.xticks() | Set x-axis ticks | ticks, labels, rotation |
plt.tight_layout() | Adjust subplot params | pad, h_pad, w_pad |
plt.show() | Display plot | - |
Common Line Styles and Markers
Line Styles | Description | Markers | Description |
---|---|---|---|
'-' | Solid line | 'o' | Circle |
'--' | Dashed line | 's' | Square |
'-.' | Dash-dot line | '^' | Triangle up |
':' | Dotted line | 'v' | Triangle down |
 | | '*' | Star |
 | | 'd' | Thin diamond |
 | | '+' | Plus |
 | | 'x' | Cross |
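To see how these codes render, the four line styles can be paired with four of the markers in a short loop. This is only a sketch; the pairing and the vertical offsets are arbitrary.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
# Arbitrary pairing of each line style with one marker.
styles = [("-", "o"), ("--", "s"), ("-.", "^"), (":", "v")]

drawn = []
for offset, (linestyle, marker) in enumerate(styles):
    # Shift each line upwards so the lines do not overlap.
    y = [value + offset for value in x]
    (line,) = plt.plot(
        x,
        y,
        linestyle=linestyle,
        marker=marker,
        label=f"linestyle={linestyle!r}, marker={marker!r}",
    )
    drawn.append(line)

plt.legend()
plt.show()
```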