Example usage
Here we demonstrate how to use the snapedautility package in a data science project.
Imports
import snapedautility
from palmerpenguins import load_penguins
print(snapedautility.__version__)
0.1.0
Sample Data
We will use the Palmer penguins dataset, loaded with the palmerpenguins package, as an example.
df = load_penguins()
df.head()
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
Plot Histograms
To generate histograms and bar plots for your data, import the plot_histograms function from the module snapedautility.plot_histograms.
from snapedautility.plot_histograms import plot_histograms
plot_histograms(df, ["species", "bill_length_mm", "island"], 100, 100)
Plot Correlation
To generate the correlation plots for your data, import the plot_corr function from the module snapedautility.plot_corr.
from snapedautility.plot_corr import plot_corr
plot_corr(df, ["bill_length_mm", "bill_depth_mm", "species"])
Detect Outliers
To detect the outliers in your data, import the detect_outliers function from the module snapedautility.detect_outliers.
from snapedautility.detect_outliers import detect_outliers
[lq, hq], chart = detect_outliers(df["body_mass_g"], 250, 250)
print(f"The lower bound and the upper bound are {round(lq, 2)} and {round(hq, 2)}")
chart
The lower bound and the upper bound are 1750.0 and 6550.0
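The bounds printed above are what the common 1.5 × IQR rule (Tukey fences) would give if the quartiles of body_mass_g are 3550 and 4750: 3550 - 1.5 × 1200 = 1750 and 4750 + 1.5 × 1200 = 6550. The sketch below illustrates that rule on a small made-up sample using only the standard library. It is an assumption about how such bounds can be computed, not the package's actual implementation, and the helper name iqr_bounds is hypothetical.

```python
import math
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) Tukey fences for numeric values, skipping NaN."""
    clean = [v for v in values if not math.isnan(v)]
    # method="inclusive" interpolates linearly between the sorted data points
    q1, _, q3 = quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up sample with one missing value and one extreme value
sample = [2.0, 4.0, 6.0, 8.0, 10.0, float("nan"), 50.0]
lower, upper = iqr_bounds(sample)

# Values outside the fences are flagged as outliers
outliers = [v for v in sample if not math.isnan(v) and (v < lower or v > upper)]
print(f"The lower bound and the upper bound are {lower} and {upper}")
print(outliers)
```

Values that fall outside the two fences are the ones treated as outliers under this rule; the multiplier k = 1.5 is the conventional choice and can be raised to flag fewer points.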