
Histograms

Histograms are an essential tool for visualizing the distribution of a variable within your tabular data. In this guide, we'll demonstrate how to create a histogram using MicPy and Pandas by analyzing data from a MICRESS grain growth simulation.

Overview of the Data

The dataset we'll be using is stored in a tabular file named T10_01_GrainGrowth_2D.TabGD. This file contains data from a grain growth simulation, with columns that include:

  • Simulation time [s]: The elapsed time since the start of the simulation.
  • Gr. Nb.: A unique identifier for each grain.
  • Nb. of Neighbours: The number of neighboring grains for each grain.

Below is a sample structure of the dataset:

Simulation time [s]   Gr. Nb.   ...   Nb. of Neighbours
0.0                   1         ...   4
0.0                   2         ...   8
0.0                   3         ...   7
...                   ...       ...   ...
300.0                 152       ...   5
300.0                 154       ...   6
300.0                 157       ...   6

Step 1: Importing the Required Modules

We begin by importing the tab module from MicPy, which will allow us to read the tabular file and process the data.

from micpy import tab

Step 2: Loading the Data

Next, we load the data from T10_01_GrainGrowth_2D.TabGD into a Pandas DataFrame.

df = tab.read("T10_01_GrainGrowth_2D.TabGD")

Step 3: Filtering Data for the Initial Timestep

Since we are interested in analyzing the initial timestep, we filter the DataFrame to include only the rows where the simulation time is 0 seconds.

timestep = df[df["Simulation time [s]"] == 0]
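Exact equality works for the initial timestep because 0 is represented exactly as a floating-point number. For other timesteps, a tolerance-based comparison may be more robust. The following is a minimal sketch under stated assumptions: the DataFrame is a hand-built stand-in that reuses the sample rows shown above (not the output of tab.read()), and select_timestep and its tol parameter are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by tab.read(),
# built from the sample rows shown in the overview
df = pd.DataFrame({
    "Simulation time [s]": [0.0, 0.0, 0.0, 300.0, 300.0, 300.0],
    "Gr. Nb.": [1, 2, 3, 152, 154, 157],
    "Nb. of Neighbours": [4, 8, 7, 5, 6, 6],
})


def select_timestep(frame, time, tol=1e-9):
    """Return the rows whose simulation time lies within tol of `time`."""
    mask = (frame["Simulation time [s]"] - time).abs() <= tol
    return frame[mask]


initial = select_timestep(df, 0.0)
final = select_timestep(df, 300.0)
print(len(initial), len(final))  # 3 3
```

The boolean mask is the same mechanism used in the step above; only the comparison changes from an exact match to a small window around the requested time.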

Step 4: Extracting the Relevant Column

From the filtered DataFrame, we extract the Nb. of Neighbours column, which contains the number of neighbors for each grain at the initial timestep.

neighbors = timestep["Nb. of Neighbours"]

Step 5: Defining Histogram Bins

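To ensure that each bin in our histogram is centered on an integer value, we define the bin edges by shifting the integer values by -0.5. Because range() excludes its upper bound, we pass max() + 2 so that the last edge is max() + 0.5, which closes the bin for the largest value.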

bins = [x - 0.5 for x in range(neighbors.min(), neighbors.max() + 2)]
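To see what this expression produces, here is a small worked example in plain Python. The sample list is hypothetical and only stands in for the real column; the bin expression itself is the one used above.

```python
# Hypothetical neighbor counts standing in for the real column
sample = [4, 8, 7, 5, 6, 6]

lo, hi = min(sample), max(sample)

# Edges run from lo - 0.5 to hi + 0.5, giving one bin per integer
bins = [x - 0.5 for x in range(lo, hi + 2)]
print(bins)  # [3.5, 4.5, 5.5, 6.5, 7.5, 8.5]

# Bin centers are the midpoints of consecutive edges
centers = [(a + b) / 2 for a, b in zip(bins, bins[1:])]
print(centers)  # [4.0, 5.0, 6.0, 7.0, 8.0]
```

With values from 4 to 8 there are six edges and five bins, and every bin center is an integer.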

Step 6: Plotting the Histogram

Using the plot() method of the Pandas Series, we create a histogram to visualize the distribution of the number of neighbors. The bins defined in the previous step are passed through the bins argument.

ax = neighbors.plot(kind="hist", bins=bins)
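Since every bin covers exactly one integer, each bar height should equal the number of grains with that neighbor count. As a quick sanity check, the expected heights can be tallied by hand. This is a sketch in plain Python using a hypothetical sample, not values read from the simulation file.

```python
from collections import Counter

# Hypothetical neighbor counts; not taken from the simulation file
sample = [4, 8, 7, 5, 6, 6]

counts = Counter(sample)

# One line per integer bin, including bins that would be empty
for n in range(min(sample), max(sample) + 1):
    print(n, counts[n])
```

Comparing such a tally against the plotted bars is an easy way to confirm that the bin edges were set up as intended.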

Step 7: Customizing the X-Axis

To enhance the readability of the histogram, we adjust the x-axis ticks to display integer values that correspond to the number of neighbors.

ax.set_xticks(range(neighbors.min(), neighbors.max() + 1))

Step 8: Adding Titles and Labels

Finally, we add a title, labels for the axes, and a grid to the plot to make it more informative and easier to interpret.

ax.set_title("Histogram of the number of grain neighbors")
ax.set_xlabel("Number of grain neighbors")
ax.set_ylabel("Frequency")
ax.grid(axis="y", linestyle=":")

The resulting histogram shows the distribution of the number of neighbors for grains at the initial timestep.

Conclusion

In this guide, we loaded a MICRESS tabular file with MicPy, filtered it to the initial timestep, and plotted the distribution of the number of neighboring grains using bins centered on integer values.

Below is the complete code used to create the histogram:

from micpy import tab

# Load the data
df = tab.read("T10_01_GrainGrowth_2D.TabGD")

# Filter for the initial timestep
timestep = df[df["Simulation time [s]"] == 0]

# Extract the number of neighbors
neighbors = timestep["Nb. of Neighbours"]

# Define the bins
bins = [x - 0.5 for x in range(neighbors.min(), neighbors.max() + 2)]

# Plot the histogram
ax = neighbors.plot(kind="hist", bins=bins)

# Customize the x-axis
ax.set_xticks(range(neighbors.min(), neighbors.max() + 1))

# Add titles and labels
ax.set_title("Histogram of the number of grain neighbors")
ax.set_xlabel("Number of grain neighbors")
ax.set_ylabel("Frequency")
ax.grid(axis="y", linestyle=":")