Plot Linear Regression with obesity as independent variable and diabetes as dependent variable (13 Sep 2023)

Below is the linear regression result. The estimated coefficients are:
b_0 = 2.055980432211423
b_1 = 0.27828827666358774
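
To make the coefficients concrete, here is a small sketch of how they turn an obesity value into a predicted diabetes value through the fitted line y = b_0 + b_1*x. The input of 30 is a hypothetical value chosen only for illustration (in whatever units the obesity column uses), and predict_diabetes is a helper name introduced here, not part of the program below.

```python
# Coefficients reported above
b_0 = 2.055980432211423
b_1 = 0.27828827666358774


def predict_diabetes(obesity):
    """Predicted diabetes value on the fitted line y = b_0 + b_1 * x."""
    return b_0 + b_1 * obesity


# Hypothetical obesity value, for illustration only
example_obesity = 30.0
example_prediction = predict_diabetes(example_obesity)
print("predicted diabetes = {:.4f}".format(example_prediction))
```

Each additional unit of obesity raises the predicted diabetes value by b_1, and b_0 is the prediction at an obesity value of zero.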

The Python program used to obtain the above regression is:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


def estimate_coefficients(x, y):
    # number of observations/points
    no_observation = np.size(x)
    # mean of x and y vector
    mean_x = np.mean(x)
    mean_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    S_xy = np.sum(y*x) - no_observation*mean_y*mean_x
    S_xx = np.sum(x*x) - no_observation*mean_x*mean_x
    # calculating regression coefficients
    b_1 = S_xy / S_xx
    b_0 = mean_y - b_1*mean_x
    return (b_0, b_1)


def plot_regression(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    diabetes_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, diabetes_pred, color="g")
    # putting labels
    plt.xlabel('obesity')
    plt.ylabel('diabetes')
    # function to show plot
    plt.show()


def main():
    # observations / data
    diabt = pd.read_excel('diabetes.xlsx')
    diabArray = np.array(diabt)
    diabetesList = list(diabArray)
    obses = pd.read_excel('obesity.xlsx')
    obsArray = np.array(obses)
    obesityList = list(obsArray)
    obesityArray = []
    diabetesArray = []
    # pair up rows from the two sheets: column 1 is the key shared by
    # both files and column 4 holds the value to regress on
    for fpsObesity in obesityList:
        for fpsDiabetes in diabetesList:
            if fpsDiabetes[1] == fpsObesity[1]:
                obesityArray.append(fpsObesity[4])
                diabetesArray.append(fpsDiabetes[4])
    obesityOnX = np.array(obesityArray, dtype=float)
    diabetesOnY = np.array(diabetesArray, dtype=float)
    # estimating coefficients
    coefficient = estimate_coefficients(obesityOnX, diabetesOnY)
    print("coefficients are:\nb_0 = {}\nb_1 = {}".format(
        coefficient[0], coefficient[1]))
    # plotting regression line
    plot_regression(obesityOnX, diabetesOnY, coefficient)


if __name__ == "__main__":
    main()

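
One way to sanity-check estimate_coefficients is to compare it with NumPy's built-in np.polyfit, which fits a degree-1 polynomial by least squares. The sketch below does that on synthetic data, because the Excel files are not reproduced here; the true intercept 2.0, slope 0.3 and noise level are assumptions made up for the test, not values taken from the dataset.

```python
import numpy as np


def estimate_coefficients(x, y):
    """Closed-form simple linear regression, same formulas as in the post."""
    n = np.size(x)
    mean_x, mean_y = np.mean(x), np.mean(y)
    s_xy = np.sum(y * x) - n * mean_y * mean_x
    s_xx = np.sum(x * x) - n * mean_x * mean_x
    b_1 = s_xy / s_xx
    b_0 = mean_y - b_1 * mean_x
    return b_0, b_1


# Synthetic stand-in data (assumed values, not the real obesity/diabetes data)
rng = np.random.default_rng(0)
x = rng.uniform(10, 40, size=200)
y = 2.0 + 0.3 * x + rng.normal(0, 1, size=200)

b_0, b_1 = estimate_coefficients(x, y)

# np.polyfit returns coefficients from the highest degree down: [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)

print("closed form: b_0 = {:.6f}, b_1 = {:.6f}".format(b_0, b_1))
print("np.polyfit : b_0 = {:.6f}, b_1 = {:.6f}".format(intercept, slope))
```

If the two lines printed agree, the hand-written formulas are doing the same job as the library routine. Running the same comparison on obesityOnX and diabetesOnY inside main() would check the reported coefficients directly.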
Next, I will read more about residuals and how the error is minimized with respect to the coefficients, i.e. by changing b_0 and b_1.
I will also check whether the error with the current coefficients is at its minimum, since the points in the resulting plot fan out.
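
As a starting point for that study, here is a rough sketch of how the residuals and the sum of squared errors (SSE) could be inspected. It uses synthetic data whose noise grows with x to imitate a fanning-out plot; the data, the nudge sizes and the helper name sse are all assumptions for illustration. The least-squares coefficients minimize the SSE by construction, so nudging either one should only make the SSE larger; a fan shape points to the spread of the residuals changing with x rather than to the coefficients missing the minimum.

```python
import numpy as np


def sse(x, y, b_0, b_1):
    """Sum of squared residuals for the line y = b_0 + b_1 * x."""
    residuals = y - (b_0 + b_1 * x)
    return np.sum(residuals ** 2)


# Synthetic data (assumed): the noise standard deviation grows with x,
# which produces a fan shape similar to the one described above
rng = np.random.default_rng(1)
x = rng.uniform(10, 40, size=300)
y = 2.0 + 0.3 * x + rng.normal(0, 0.05 * x)

# Least-squares fit; np.polyfit returns [slope, intercept] for degree 1
b_1, b_0 = np.polyfit(x, y, 1)
best_sse = sse(x, y, b_0, b_1)

# Nudge each coefficient up and down and record the resulting SSE
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.01), (0.0, -0.01)]
nudged_sse = [sse(x, y, b_0 + d0, b_1 + d1) for d0, d1 in nudges]
print("SSE at fitted coefficients:", best_sse)
print("SSE after nudging         :", nudged_sse)

# Compare the residual spread for small and large x to quantify the fan
residuals = y - (b_0 + b_1 * x)
median_x = np.median(x)
spread_low = residuals[x < median_x].std()
spread_high = residuals[x >= median_x].std()
print("residual std, low x :", spread_low)
print("residual std, high x:", spread_high)
```

On the real data, the same two checks can be run by passing obesityOnX and diabetesOnY in place of the synthetic x and y.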
