# General practice
# Wrap-up
# Feedback
Using the file "Salaries.csv" in the "data" folder, do the following:
# Load the data
import pandas as pd
data = pd.read_csv('/home/mcubero/dataSanJose19/data/Salaries.csv')
# Check the structure of the file
data.head()
# Check the type of variables in the file
data.info()
# Select the numeric variables into a separate DataFrame (hint: select by columns)
num = data.iloc[:, [3, 4, 6]]  # all rows, columns at positions 3, 4 and 6
num.head()
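Selecting by position with `iloc` breaks if the column order changes. A minimal sketch of an alternative is to select by dtype with `select_dtypes`. The toy DataFrame and its column names below are assumptions for illustration, not the actual contents of Salaries.csv.

```python
import pandas as pd

# Hypothetical toy data standing in for Salaries.csv (column names are assumptions)
toy = pd.DataFrame({
    "rank": ["Prof", "AsstProf", "Prof"],
    "yrs_service": [18, 3, 20],
    "salary": [139750.0, 79750.0, 115000.0],
})

# Keep only the numeric columns, regardless of their position
toy_num = toy.select_dtypes(include="number")
print(toy_num.columns.tolist())
```

Note that this keeps every numeric column, so an ID-like integer column would also be included and may need to be dropped by name.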
# Check if there are some missing values
num.isna().sum()
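If the check reports missing values, they have to be handled before or after rescaling. The sketch below uses a hypothetical toy DataFrame (not the real data) to show what `isna().sum()` returns, and one possible choice, dropping incomplete rows. Whether dropping or imputing is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value per column
toy = pd.DataFrame({
    "yrs_service": [18, np.nan, 20],
    "salary": [139750.0, 79750.0, np.nan],
})

# Number of missing values in each column
missing = toy.isna().sum()
print(missing)

# One possible way to handle them: keep only the complete rows
complete = toy.dropna()
print(complete)
```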
# Rescale the data using preprocessing.MinMaxScaler()
from sklearn import preprocessing
# Save the column names
names = num.columns
# Create the scaler
scaler = preprocessing.MinMaxScaler()  # alternatives: StandardScaler(), MaxAbsScaler()
# Transform the data frame (numeric variables); fit_transform returns a NumPy array
data1 = scaler.fit_transform(num)
# Convert the array back to a DataFrame with the original column names
data1 = pd.DataFrame(data1, columns=names)
print(data1.head())
print(num.head())
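To see what the scaler does, it can help to compute the min-max formula by hand, `(x - min) / (max - min)` per column, and compare it with the output of `MinMaxScaler` using its default range of 0 to 1. This is a sketch on hypothetical toy values, not on the salary data.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

# Hypothetical toy data (values chosen only for illustration)
toy = pd.DataFrame({
    "yrs_service": [0.0, 10.0, 20.0],
    "salary": [50000.0, 75000.0, 100000.0],
})

# Min-max rescaling by hand: each column is mapped onto [0, 1]
manual = (toy - toy.min()) / (toy.max() - toy.min())

# The same rescaling with scikit-learn
scaled = pd.DataFrame(
    preprocessing.MinMaxScaler().fit_transform(toy),
    columns=toy.columns,
)

# True if both approaches agree up to floating-point error
print(np.allclose(manual.values, scaled.values))
```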
The Python 3 documentation covers the core language and the standard library.
PyCon is the largest annual conference for the Python community.
SciPy is a rich collection of scientific utilities. It is also the name of a series of annual conferences.
The Jupyter website is the home of Project Jupyter.
The pandas website is the home of the pandas data library.
Stack Overflow’s general Python section can be helpful, as well as the sections on NumPy, SciPy, and Pandas.