Reshaping Data with Melt and Pivot

In Pandas, reshaping data involves changing the structure of a DataFrame without altering the data itself. Two common methods for reshaping are melt() and pivot(). They are complementary functions: while melt() transforms wide data into long format, pivot() does the reverse, transforming long data back into wide format.

1. `melt()` Function

The melt() function is used to convert wide-format data into long-format data. Long format is useful when you have multiple columns that you want to stack into one, typically when you’re preparing data for analysis or visualization.

Syntax of `melt()`:

pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value')

frame: The DataFrame to melt.
id_vars: Column(s) to keep as identifier variables (they stay in their original form).
value_vars: Column(s) to melt (convert from wide to long).
var_name: Name of the new “variable” column (optional).
value_name: Name of the new “value” column (default is ‘value’).

Example:

Suppose you have the following DataFrame representing sales data across different years:

import pandas as pd

df = pd.DataFrame({
    'Product': ['A', 'B', 'C'],
    '2019': [200, 150, 100],
    '2020': [220, 180, 130],
    '2021': [250, 190, 160]
})

print(df)

Output:

  Product  2019  2020  2021
0       A   200   220   250
1       B   150   180   190
2       C   100   130   160

You can use melt() to convert this wide-format data into long-format:

df_melted = pd.melt(df, id_vars='Product', var_name='Year', value_name='Sales')
print(df_melted)

Output:

  Product  Year  Sales
0       A  2019    200
1       B  2019    150
2       C  2019    100
3       A  2020    220
4       B  2020    180
5       C  2020    130
6       A  2021    250
7       B  2021    190
8       C  2021    160

Here, the data has been “melted” into long format, with Product as the identifier variable (id_vars), and the year columns (2019, 2020, 2021) converted into rows under the Year column. The sales values are now stored in the Sales column.

2. pivot() Function

The pivot() function is the opposite of melt()—it converts long-format data back into wide-format data, where one column’s values become the new columns.

Syntax of `pivot()`:

DataFrame.pivot(index=None, columns=None, values=None)

index: Column to use as the new index (the rows).

columns: Column whose unique values will become the new columns.

values: Column to populate the new table with data.

Example:

Let’s take the previously melted DataFrame df_melted and pivot it back to the original wide format:

df_pivoted = df_melted.pivot(index='Product', columns='Year', values='Sales')
print(df_pivoted)

Output:

Year     2019  2020  2021
Product                    
A         200   220   250
B         150   180   190
C         100   130   160

sanodsolutions

Reshaping Data with Melt and Pivot

1. `melt()` Function

Syntax of `melt()`:

Example:

Syntax of `pivot()`:

Example:

Leave a Reply Cancel reply

1. melt() Function

Syntax of melt():

Example:

Syntax of pivot():

Example:

Related Posts

Machine Learning: Transformative Uses and Applications Shaping the Future

Supervised vs. Unsupervised Learning

Pivot Tables and Cross-Tabulation

Grouping Data With GroupBy

Mastering Time-Based Data Analysis in Pandas: Parsing Dates, Creating Time-Based Indices, and Time-Based Grouping

Leave a Reply Cancel reply

1. `melt()` Function

Syntax of `melt()`:

Syntax of `pivot()`: