😀AMETODL01: MACHINE LEARNING
What is Machine Learning (ML)?
Machine learning has become a buzzword in recent years; almost everybody talks about it.
ML has completely changed the computing ecosystem. It has also opened up rewarding careers such as Data Scientist and Data Analyst.
Learning is any ongoing process that improves performance through experience.
Machine learning is concerned with computer programs that automatically improve their performance through experience.
- ML is a form of AI (Artificial Intelligence)
- ML uses data sets (Past and Present) and algorithms to predict future outcomes.
- ML automates the decision support systems
Why Machine Learning?
- To mimic human beings in monotonous, hard, or tiresome work (e.g. driverless cars, prescribing for remote patients).
- To support decision making for better performance and prosperity (e.g. stock price prediction, churn prediction).
- To discover new knowledge/patterns for diagnosis and prognosis (Raw item -> Data -> Information -> Knowledge).
Put simply, without explicit programming, we let the machine learn from the given inputs (independent variables) and the outputs or labels (dependent variable).
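To make the contrast with explicit programming concrete, here is a minimal sketch. The area and price numbers are made up for illustration and are not taken from the Chennai data set, and `np.polyfit` is used only as a convenient way to fit a straight line; the name `learned_price` is hypothetical.

```python
import numpy as np

# Hypothetical examples: inputs (area in sq.ft) and labels (price in lakhs).
area = np.array([500.0, 1000.0, 1500.0, 2000.0])
price = np.array([20.0, 40.0, 60.0, 80.0])

# No pricing rule is written by hand; the slope and intercept are
# estimated from the examples by a least-squares fit of degree 1.
slope, intercept = np.polyfit(area, price, 1)

def learned_price(a):
    """Price predicted by the fitted line for an area `a`."""
    return slope * a + intercept

print(learned_price(1200.0))
```

The rule "price per sq.ft" never appears in the code; it is recovered from the examples, which is the sense in which the machine learns.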
Steps involved:
- import the NumPy and pandas libraries, then import the data set
- scale the inputs to smaller values, because gradient descent converges faster when the values are small and on a similar scale
- insert a dummy column of ones (for the intercept term)
- use the matplotlib library to plot the data set in a graph
- split the data into input (area) and output (price)
- create a theta matrix and compute the error function
- create a linear model and set hyperparameters such as the learning rate
- plot the model and the error against iterations using matplotlib
- calculate the accuracy and call the prediction function
- print the predicted value for the given input
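The scaling step in the list above is mean normalization: subtract the mean and divide by the range. Here is a small sketch of that step and its inverse, which is needed later to turn a prediction back into real units. The helper names `mean_normalize` and `denormalize` and the sample sizes are assumptions made for illustration.

```python
import numpy as np

def mean_normalize(values):
    """Return (scaled, mean, value_range) with scaled = (v - mean) / (max - min)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    value_range = values.max() - values.min()
    return (values - mean) / value_range, mean, value_range

def denormalize(scaled, mean, value_range):
    """Invert mean_normalize to get back the original units."""
    return np.asarray(scaled) * value_range + mean

# Hypothetical areas in sq.ft, for illustration only.
sizes = np.array([600.0, 900.0, 1200.0, 1500.0])
scaled, mean, value_range = mean_normalize(sizes)
restored = denormalize(scaled, mean, value_range)
print(scaled)
print(restored)
```

After scaling, the values are centred on zero and span a range of exactly one, whatever the original units were.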
ML is applied in many real-life applications like:
Image recognition, voice recognition, weather, rainfall prediction, fraud detection, stock price prediction, product recommendation, customer churn prediction, virtual personal assistant, email spam and malware detection, traffic prediction and accident prevention, Language translation, medical diagnosis in health care systems etc.
Learning = enhancing and improving performance (P) with experience (E) on some task (T)
T: Task
P: Performance
E: Experience
Let us dive into an example to see what ML actually is.
Suppose somebody wants to buy a house in Chennai. How can ML help to predict the price?
Let us assume we have two details for each house: area and price. Given my requirement (an area), the model should predict a suitable price and suggest it to me.
We have to prepare the groundwork for the model, then train, test, and predict.
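The code in this post fits and evaluates the model on the same file. One common way to carry out a separate "test" step is to hold back part of the rows, as in the sketch below. The function name `train_test_split_rows`, the 20% fraction, and the seed are assumptions; the row count of 97 matches the length shown in the output further down.

```python
import numpy as np

def train_test_split_rows(n_rows, test_fraction=0.2, seed=0):
    """Return shuffled (train_idx, test_idx) index arrays with no overlap."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return order[n_test:], order[:n_test]

train_idx, test_idx = train_test_split_rows(97)
print(len(train_idx), len(test_idx))
```

With a pandas DataFrame, the two index arrays could then be passed to `iloc` to select the training rows and the held-out rows.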
# IPython magic is commented out so the code also runs as plain Python.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# %matplotlib inline
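# Load the training data (columns: Size, Price).
raw_data = pd.read_csv('Chennai_house_univariate_train.csv')
raw_data.head(20)
# Mean-normalize every column: (value - mean) / (max - min).
data = (raw_data - raw_data.mean()) / (raw_data.max() - raw_data.min())
data.head(20)
# Add a column of ones so that theta[0] acts as the intercept.
data.insert(0, "ones", 1)
data.head(20)
raw_data.plot(kind="scatter", x="Size", y="Price", figsize=(10, 5))
data.shape
cols = data.shape[1]
print(cols)
# Inputs: all columns except the last; output: the last column (Price).
x = data.iloc[:, 0:cols-1]
y = data.iloc[:, cols-1:cols]
x = np.matrix(x)
y = np.matrix(y)
# Start with both parameters at zero.
theta = np.matrix(np.array([0, 0]))
x.shape, theta.shape, y.shape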
def computeError(x, y, theta):
    # Squared-error cost: sum of squared residuals divided by 2m.
    inner = np.power(((x * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(x))
computeError(x,y,theta)
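Written as a formula, computeError evaluates the usual squared-error cost. The notation here is introduced for this note and is not from the original code: m is the number of rows, x^{(i)} is the i-th row of x (including the ones column), y^{(i)} is the matching price, and theta is the row vector of parameters. The second line is the partial derivative that gradient descent uses to update each parameter theta_j.

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta^{T} - y^{(i)} \right)^{2}

\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} \theta^{T} - y^{(i)} \right) x^{(i)}_{j}
```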
learn_rate=1.2
iters=150
def gradientDecent(x, y, theta, learn_rate, iters):
    # temp holds the updated parameters for the current iteration.
    temp = np.matrix(np.zeros(theta.shape))
    parameters = theta.shape[1]
    cost = np.zeros(iters)
    for i in range(iters):
        # Residuals are computed once per iteration, before any parameter changes.
        error = (x * theta.T) - y
        for j in range(parameters):
            term = np.multiply(error, x[:, j])
            temp[0, j] = theta[0, j] - ((learn_rate / len(x)) * np.sum(term))
            print(temp)
        theta = temp
        cost[i] = computeError(x, y, theta)
    return theta, cost
new_theta,cost= gradientDecent(x, y, theta, learn_rate, iters)
print(new_theta, cost)
computeError(x,y,new_theta)
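The same update can be written without the inner loop over parameters. The sketch below is an alternative formulation on synthetic data (not the Chennai file): the function name `gradient_descent`, the generated data, and the comparison against `np.linalg.lstsq` as a sanity check are all my additions, not part of the original post.

```python
import numpy as np

def gradient_descent(X, y, learn_rate=1.2, iters=150):
    """Batch gradient descent for linear regression.

    X has shape (m, n) with a leading column of ones; y has shape (m,).
    Returns the fitted parameters and the cost after each iteration.
    """
    m, n = X.shape
    theta = np.zeros(n)
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y                              # residuals, shape (m,)
        theta = theta - (learn_rate / m) * (X.T @ error)   # update all parameters at once
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost

# Synthetic, mean-normalized data for illustration only.
rng = np.random.default_rng(0)
size = np.linspace(-0.5, 0.5, 50)
price = 0.8 * size + rng.normal(0.0, 0.01, size.shape)
X = np.column_stack([np.ones_like(size), size])

theta_gd, cost = gradient_descent(X, price)

# Closed-form least-squares solution, used only to check the result.
theta_exact, *_ = np.linalg.lstsq(X, price, rcond=None)
print(theta_gd, theta_exact)
```

If gradient descent has converged, the two parameter vectors should agree closely; a large gap usually points to too few iterations or an unsuitable learning rate.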
x=data.Size
print(x)
model_price=new_theta[0,0] +(new_theta[0,1]*x)
fig, ax=plt.subplots(figsize=(12,8))
ax.plot(x, model_price,'r', label= "Prediction")
ax.scatter(data.Size, data.Price, label="Training data")
ax.legend(loc=2)
ax.set_xlabel("Size")
ax.set_ylabel("Price")
ax.set_title("Predicted Price vs Size")
fig, ax= plt.subplots(figsize=(12,8))
ax.plot(np.arange(iters),cost,'r')
ax.set_xlabel('Iterations')
ax.set_ylabel("cost")
ax.set_title('Error vs Iterations')
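import math
# Squared difference between each predicted and actual (normalized) price.
erro_r = [np.power((b - a), 2) for (a, b) in zip(model_price, y)]
error0 = np.sum(erro_r)
error1 = math.sqrt(error0)
# Note: this is the mean squared error on the normalized data, scaled by 100.
# It is printed as "error %" but is not a percentage of the price.
error = (error0 / len(y)) * 100
print('error %={}'.format(error))
accuracy = 100 - error
print('accuracy%={}'.format(accuracy))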
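The "error %" printed above is the mean squared error on the normalized values multiplied by 100, so it is not a percentage in the usual sense. Two measures that are commonly reported for regression are RMSE and R squared. The sketch below shows how they could be computed; the helper names `rmse` and `r_squared` and the sample numbers are assumptions for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical values for illustration only.
actual = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.5, 2.0, 2.5, 4.0])
print(rmse(actual, predicted), r_squared(actual, predicted))
```

Both helpers flatten their inputs, so they should also accept the column-shaped `y` and the `model_price` series used in this post.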
def predict(new_theta, accuracy):
    size = float(input('Enter the size of the house in sqft'))
    # Normalize the input the same way the training data was normalized.
    size = (size - raw_data.Size.mean()) / (raw_data.Size.max() - raw_data.Size.min())
    price = new_theta[0, 0] + (new_theta[0, 1] * size)
    # Convert the normalized prediction back to the original price units.
    predicted_price = (price * (raw_data.Price.max() - raw_data.Price.min())) + raw_data.Price.mean()
    price_at_max_accuracy = predicted_price * (1 / accuracy) * 100
    price_range = price_at_max_accuracy - predicted_price
    return predicted_price, price_range

predicted_price, price_range = predict(new_theta, accuracy)
print('your house cost is ' + str(predicted_price) + ' lakhs' + " (+ or -) " + str(price_range) + ' lakhs')
Result:
The output will look like this:
3
[[2.33490203e-17 0.00000000e+00]]
[[2.33490203e-17 5.92859388e-02]]
....
[[1.92286050e-17 4.87172353e-01]]
.............
5.48026914e-05 5.47990073e-05 5.47957539e-05 5.47928810e-05
5.47903441e-05 5.47881038e-05 5.47861256e-05 5.47843786e-05
]
0 -0.182399
1 -0.181840
2 -0.180786
3 -0.180507
4 -0.179878
...
92 0.628734
93 0.705061
94 0.709195
95 0.763806
96 0.817601
Name: Size, Length: 97, dtype: float64
error %=0.010954275825215265
accuracy%=99.98904572417479
Enter the size of the house in sqft 2000
your house cost is 74.20234773124164 lakhs (+ or -) 0.008129220336471121 lakhs
Happy learning, and happy price hunting for the sq.ft you need 🏡 in Chennai! 👪