Code Documentation

config.config(filename='database.ini', section='postgresql')

This function reads the configuration from the given section of the provided file

Parameters
  • filename – The path of the file which contains the configurations

  • section – The section inside the file which contains the configuration needed to connect to the database

Returns

the database configuration parameters needed to connect to the database
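
As an illustration of the behaviour described above, here is a minimal sketch that assumes the file is in standard INI format and is parsed with Python's built-in configparser module. Returning a dict and raising an exception for a missing file or section are assumptions of this sketch, not documented behaviour.

```python
from configparser import ConfigParser


def config(filename="database.ini", section="postgresql"):
    """Return the key/value pairs of `section` in `filename` as a dict."""
    parser = ConfigParser()
    # ConfigParser.read skips unreadable files and returns the list of
    # files it actually parsed, so an empty list means nothing was read.
    if not parser.read(filename):
        raise FileNotFoundError(f"Could not read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"Section {section!r} not found in {filename}")
    # items() yields (name, value) pairs; values are always strings.
    return dict(parser.items(section))
```

With a file whose [postgresql] section holds, say, host and database keys (hypothetical names), the call config('database.ini') would return those keys and their string values.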

connect_database.connect_and_fetch_data(query, params)

This function connects to the database and fetches the rows returned by the given query, using the supplied connection parameters.

Parameters
  • query – The SELECT query used to fetch the data

  • params – The connection parameters for the database server

Returns

the list of rows fetched from the database by the query
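
The connect, execute, fetch pattern described above can be sketched with any Python DB-API driver. The example below uses the standard-library sqlite3 module as a stand-in so that it runs without a database server; the extra connect argument is hypothetical and not part of the documented signature. For PostgreSQL, a driver function such as psycopg2.connect could presumably be passed instead.

```python
import sqlite3


def connect_and_fetch_data(query, params, connect=sqlite3.connect):
    """Run `query` on a connection opened with `params`; return all rows."""
    # `params` is unpacked as keyword arguments for the driver's connect().
    conn = connect(**params)
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

With the sqlite3 stand-in, params would be a dict such as {'database': 'demo.db'}, where the file name is only an example.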

data_pre_processing.clean_data(X)

This function is used to clean the data in the dataset

Parameters

X – the dataframe containing the features of the datapoints

Returns

the dataframe with the clean, processed values of the features

data_pre_processing.clean_labels(Y)

This function is used to clean the labels in the dataset

Parameters

Y – the array containing the labels of the datapoints

Returns

the array of cleaned labels of the datapoints

data_pre_processing.split_train_test(X, Y, ratio=0.3)

This function is used to split the dataset into training and testing datasets

Parameters
  • X – the feature set of the datapoints for regression prediction

  • Y – the labels of the datapoints

  • ratio – the fraction of the dataset held out for testing (default 0.3)

Returns

the training and testing datasets
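
To make the role of ratio concrete, the following is a rough pure-Python sketch of such a split. It is a stand-in, not the module's implementation: the seed argument is hypothetical, and the return order (X_train, X_test, Y_train, Y_test) is an assumption that mirrors the convention of scikit-learn's train_test_split.

```python
import random


def split_train_test(X, Y, ratio=0.3, seed=None):
    """Shuffle (X, Y) pairs and hold out a `ratio` fraction for testing."""
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same length")
    # Shuffle indices rather than the data so X and Y stay aligned.
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(X) * ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    Y_train = [Y[i] for i in train_idx]
    Y_test = [Y[i] for i in test_idx]
    return X_train, X_test, Y_train, Y_test
```

With 10 datapoints and the default ratio of 0.3, this sketch yields 7 training and 3 testing datapoints.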

regression_model.regression_model_fit(X, Y, model=LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False))

This function is used to fit the provided model to the given training dataset

Parameters
  • X – the feature set of the datapoints for regression prediction

  • Y – the labels of the datapoints

  • model – The regression model to be used

Returns

the fitted regression model

regression_model.regression_model_predict(X, model)

This function is used to predict the labels for the test dataset provided

Parameters
  • X – The feature set for which the prediction is to be made

  • model – The fitted regression model used to make the predictions

Returns

The Y labels predicted by the model
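
The two regression_model functions can be read as thin wrappers around a model object that exposes scikit-learn-style fit and predict methods. The sketch below assumes exactly that and nothing more. MeanRegressor is a hypothetical stand-in used only to keep the example dependency-free, and the documented LinearRegression default is left out for the same reason.

```python
class MeanRegressor:
    """Hypothetical stand-in model: always predicts the mean training label."""

    def fit(self, X, Y):
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def regression_model_fit(X, Y, model):
    """Fit `model` on the training data and return it."""
    model.fit(X, Y)
    return model


def regression_model_predict(X, model):
    """Return the labels `model` predicts for the feature set X."""
    return model.predict(X)


# Fit on three datapoints, then predict labels for two unseen ones.
model = regression_model_fit([[1], [2], [3]], [2.0, 4.0, 6.0], MeanRegressor())
predictions = regression_model_predict([[4], [5]], model)  # [4.0, 4.0]
```

Any estimator with the same two methods, such as a scikit-learn LinearRegression instance, could be passed in place of the stand-in.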

error_analysis.get_error(Y_pred, Y_test)

This function gets the mean absolute error between the predicted labels and the actual labels

Parameters
  • Y_pred – The predicted labels from the regression model

  • Y_test – The actual labels from the dataset

Returns

The mean absolute error value
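
The mean absolute error is the average of the absolute differences between each predicted label and the corresponding actual label. A minimal pure-Python sketch of that calculation follows; the input checks are assumptions of this sketch, and the module itself may rely on a library routine instead.

```python
def get_error(Y_pred, Y_test):
    """Return the mean absolute error between predictions and true labels."""
    if len(Y_pred) != len(Y_test):
        raise ValueError("Y_pred and Y_test must have the same length")
    if len(Y_test) == 0:
        raise ValueError("at least one label is required")
    # Sum the absolute differences pairwise, then average over all labels.
    total = sum(abs(p - t) for p, t in zip(Y_pred, Y_test))
    return total / len(Y_test)


error = get_error([3, 5, 8], [2, 7, 8])  # (1 + 2 + 0) / 3 = 1.0
```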

plot_results.plot(X, Y, Y_predicted, Y_baseline, X_label, Y_label, title)

This function plots the graph, given the data values

Parameters
  • X – the feature value array to be plotted on X-axis

  • Y – the true label value array to be plotted on Y-axis

  • Y_predicted – the predicted label value array to be plotted on Y-axis

  • Y_baseline – the baseline label value array to be plotted on Y-axis

  • X_label – The label name to be used for X-axis

  • Y_label – The label name to be used for Y-axis

  • title – The title of the plot

Returns

None