Example usage

This example demonstrates how to use the snapedautility package in a data science project.

Imports

import snapedautility
from palmerpenguins import load_penguins
print(snapedautility.__version__)

0.1.0

We will use the Palmer Penguins dataset as an example.

df = load_penguins()
df.head()

	species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	male	2007
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	female	2007
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	female	2007
3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN	2007
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	female	2007

To generate histograms and bar plots for your data, import the plot_histograms function from the module snapedautility.plot_histograms.

from snapedautility.plot_histograms import plot_histograms
plot_histograms(df, ["species", "bill_length_mm", "island"], 100, 100)
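The feature list above mixes categorical columns (species, island) with a numeric one (bill_length_mm). As a minimal sketch, the columns can be split by dtype with plain pandas before choosing which ones to plot. The names sample, numeric_cols and categorical_cols are hypothetical and not part of snapedautility, and the frame is just the five rows shown by df.head() rebuilt by hand so the snippet runs without palmerpenguins installed.

```python
import pandas as pd

# Hand-built copy of the five rows shown by df.head() above (illustration only).
sample = pd.DataFrame({
    "species": ["Adelie"] * 5,
    "island": ["Torgersen"] * 5,
    "bill_length_mm": [39.1, 39.5, 40.3, None, 36.7],
    "bill_depth_mm": [18.7, 17.4, 18.0, None, 19.3],
    "flipper_length_mm": [181.0, 186.0, 195.0, None, 193.0],
    "body_mass_g": [3750.0, 3800.0, 3250.0, None, 3450.0],
    "sex": ["male", "female", "female", None, "female"],
    "year": [2007] * 5,
})

# Numeric columns (int and float) versus everything else.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
```

Note that year is stored as an integer and therefore lands in the numeric group, even though it may be more natural to treat it as a category. Either list, or a hand-picked mix of both, could then be passed as the feature list.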

To generate correlation plots for your data, import the plot_corr function from the module snapedautility.plot_corr.

from snapedautility.plot_corr import plot_corr
plot_corr(df, ["bill_length_mm", "bill_depth_mm", "species"])
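To sanity-check a correlation plot numerically, the Pearson correlation between the two numeric features can be computed directly with pandas. This is only a sketch: it is an assumption that plot_corr reports a comparable statistic, and how the package treats the categorical species column is not covered here. The frame sample is a hypothetical hand-built copy of the bill measurements from the five rows shown earlier.

```python
import pandas as pd

# Bill measurements from the five rows shown by df.head() (illustration only).
sample = pd.DataFrame({
    "bill_length_mm": [39.1, 39.5, 40.3, None, 36.7],
    "bill_depth_mm": [18.7, 17.4, 18.0, None, 19.3],
})

# DataFrame.corr drops missing values pairwise before computing Pearson's r.
corr_matrix = sample.corr(method="pearson")
r = corr_matrix.loc["bill_length_mm", "bill_depth_mm"]

print(corr_matrix)
```

Four complete rows are far too few for a meaningful estimate; on the real data, the same call on df[["bill_length_mm", "bill_depth_mm"]] would use every complete row.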

To detect outliers in your data, import the detect_outliers function from the module snapedautility.detect_outliers. As the call below shows, it returns the lower and upper bounds together with a chart.

from snapedautility.detect_outliers import detect_outliers
[lq, hq], chart = detect_outliers(df["body_mass_g"], 250, 250)

print(f"The lower bound and the upper bound are {round(lq, 2)} and {round(hq, 2)}")
chart

The lower bound and the upper bound are 1750.0 and 6550.0
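The printed bounds are consistent with the common 1.5 × IQR rule (for instance, quartiles of 3550 and 4750 would give exactly 1750 and 6550), but whether detect_outliers uses exactly this rule is an assumption. The sketch below shows that rule on a toy series; iqr_bounds and toy are hypothetical names, not part of snapedautility.

```python
import pandas as pd


def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1 = values.quantile(0.25)  # missing values are skipped by default
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return float(q1 - k * iqr), float(q3 + k * iqr)


# Toy data with one obvious outlier (illustration only).
toy = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], dtype="float64")
lower, upper = iqr_bounds(toy)

# Keep only the values that fall outside the fences.
outliers = toy[(toy < lower) | (toy > upper)]
print(lower, upper, outliers.tolist())
```

The same filter can be applied with the lq and hq values returned above, for example df[(df["body_mass_g"] < lq) | (df["body_mass_g"] > hq)], to list the rows flagged as outliers.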