Example usage

This page demonstrates how to use the snapedautility package in a data science project.

Imports

import snapedautility
from palmerpenguins import load_penguins
print(snapedautility.__version__)
0.1.0

Sample Data

We will use the Palmer penguins dataset as an example.

df = load_penguins()
df.head()
   species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0   Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1   Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2   Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
3   Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN  2007
4   Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007
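
The row with index 3 above contains missing values (NaN), so it can help to count the missing values per column before plotting. The following is a minimal sketch using plain pandas, not a snapedautility feature. It retypes the five rows shown above by hand so that it runs without palmerpenguins; the names sample, missing_per_column and complete_rows are illustrative only.

```python
import numpy as np
import pandas as pd

# The five rows printed by df.head() above, retyped by hand for illustration.
sample = pd.DataFrame(
    {
        "species": ["Adelie"] * 5,
        "island": ["Torgersen"] * 5,
        "bill_length_mm": [39.1, 39.5, 40.3, np.nan, 36.7],
        "bill_depth_mm": [18.7, 17.4, 18.0, np.nan, 19.3],
        "flipper_length_mm": [181.0, 186.0, 195.0, np.nan, 193.0],
        "body_mass_g": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
        "sex": ["male", "female", "female", np.nan, "female"],
        "year": [2007] * 5,
    }
)

# Count the missing values in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Keep only the rows that have no missing values at all.
complete_rows = sample.dropna()
print(len(complete_rows))
```

How snapedautility itself treats missing values is not stated here, so whether to drop or keep such rows before plotting is left to the reader.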

Plot Histograms

To generate histograms and bar plots for your data, import the plot_histograms function from the module snapedautility.plot_histograms.

from snapedautility.plot_histograms import plot_histograms
plot_histograms(df, ["species", "bill_length_mm", "island"], 100, 100)

Plot Correlation

To generate the correlation plots for your data, import the plot_corr function from the module snapedautility.plot_corr.

from snapedautility.plot_corr import plot_corr
plot_corr(df, ["bill_length_mm", "bill_depth_mm", "species"])
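
To check the numbers behind a correlation plot, the pairwise correlations can be computed directly with pandas. This is a rough sketch under assumptions: it uses the Pearson coefficient from DataFrame.corr, which may or may not be the measure plot_corr displays, and it covers only the two numeric columns because this page does not say how plot_corr handles a categorical column such as species. The four complete rows from the df.head() output are retyped by hand, and the names sample and corr_matrix are illustrative.

```python
import pandas as pd

# Four complete rows from the df.head() output, retyped for illustration.
sample = pd.DataFrame(
    {
        "bill_length_mm": [39.1, 39.5, 40.3, 36.7],
        "bill_depth_mm": [18.7, 17.4, 18.0, 19.3],
    }
)

# Pairwise Pearson correlation between the numeric columns.
corr_matrix = sample.corr()
print(corr_matrix.round(2))
```

Four rows are far too few to say anything about the full dataset; the sketch only shows the shape of the computation.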

Detect Outliers

To detect the outliers in your data, import the detect_outliers function from the module snapedautility.detect_outliers.

from snapedautility.detect_outliers import detect_outliers
[lq, hq], chart = detect_outliers(df["body_mass_g"], 250, 250)
print(f"The lower bound and the upper bound are {round(lq, 2)} and {round(hq, 2)}")
chart
The lower bound and the upper bound are 1750.0 and 6550.0
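
The bounds printed above are consistent with the common 1.5 × IQR rule (Tukey fences), although this page does not confirm that detect_outliers uses that rule. The sketch below shows the rule itself using only the standard library. The function names tukey_fences and bounds_from_values are hypothetical, and the quartiles 3550 and 4750 for body_mass_g are an assumption that is not stated anywhere in this document.

```python
from statistics import quantiles


def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) bounds lying k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def bounds_from_values(values, k=1.5):
    """Compute Tukey fences from raw values, skipping None entries."""
    clean = [v for v in values if v is not None]
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, _, q3 = quantiles(clean, n=4, method="inclusive")
    return tukey_fences(q1, q3, k)


# Assumed quartiles for body_mass_g (not taken from this document).
lower, upper = tukey_fences(3550.0, 4750.0)
print(lower, upper)  # 1750.0 6550.0

# The body masses from the df.head() output, with the missing value as None.
sample_lower, sample_upper = bounds_from_values(
    [3750.0, 3800.0, 3250.0, None, 3450.0]
)
print(sample_lower, sample_upper)
```

Values below the lower bound or above the upper bound would be flagged as outliers under this rule. The multiplier k = 1.5 is the conventional default, not a value confirmed for this package.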