对于学校的一个项目,我必须从具有四个变量(三个独立;一个目标)的CSV文件中读取传感器值,根据这些值创建一个简单的神经网络,允许用户输入他们自己的三个独立值,然后让程序给他们一个预测的目标变量.在这一切之后,my program must plot a scatter plot of the data, with a trend line of the independent variables against the predicted target value.

Please forgive my terrible formatting and description, I have barely any experience with this site 😭

Optional Boring Context Bit:
My project is focused on predicting mood 'scores' based on the sound level of a room. The sound values are collected by a separate program I have created which utilises sound sensors, then uploads them to a CSV file, which this program reads, and uses to build it's model. The program then asks the user for three predicted sound values, then asks three 'What-if Questions', which find which value is most important. It also gives the user a scatter plot of the sound values, which is where I'm having an issue.

The Important Context:
I have the program running pretty smoothly. It outputs predictions well, compares them, e.t.c. I then added matplotlib scatterplot functionality, to plot each of the three parameters, and a trend line on my plot, which uses a polynomial regression system, since I have a better understanding of sci-kit's polynomial regression function than neural networks. Now I want to take my previous code and, instead of plotting the polynomial regression trend line, plot a trend line which uses my tensorflow neural network.

如果看起来我真的不知道我在用这个模型做什么,那么你是对的.我对事物没有任何真正的经验.我知道这个平台可能会对此有点不宽容,但我真的别无 Select .如果你 Select 帮忙,谢谢你.

以下是前面的工作代码,它使用神经网络查找预测值,但使用多项式回归绘制趋势线:

from silence_tensorflow import silence_tensorflow
silence_tensorflow()

# Tensorflow has a bunch of functions available for moddeling and analysis
# Specifically, I'm using the layers module to do the hard work for me in constructing a nueral network.
import tensorflow as tf
from tensorflow.keras import layers

# Pandas is a little more flexible and has more options than the previously used 'CSV' module
# In my case, it's essential for constructing dataframes from my uh... data.
import pandas as pd

# train_test_split shuffles the data
# it then splits the aforementioned data into two samples, one for training the model, another for testing it's accuracy
from sklearn.model_selection import train_test_split

# StandardScaler is pretty self explanatory, it standardises the sample data, so that a clear model can be created
from sklearn.preprocessing import StandardScaler

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Plot that data, baby! (ᗒᗊᗕ)
import matplotlib.pyplot as plt
import numpy as np

# Load my CSV values from the serial program
data = pd.read_csv('your_dataset.csv')

# X = Independent variables
X = data[['mean_sound', 'max_sound', 'min_sound']]
# Y = Target variable
Y = data['average_mood']

# Split that data into train/test sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# Standardise (ze?) the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Yass queen! Model that neural network! Slay!
model = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])

# Compile! (optimize the model and esatblish the loss function, which finds the difference between the predicted and actual data)
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model. Queue 'Eye of The Tiger'!
# Huh, 'queue' is kind of a weird word, right?
model.fit(X_train_scaled, Y_train, epochs=410, validation_split=0.2)

# Evaluate how our new, beefed-up, trained model compares to the OG data
mse = model.evaluate(X_test_scaled, Y_test)
print(f"Test Mean Squared Error: {mse}")

# Function to figure out what the Mean Squared Error says about my program
def mse_rating(mse):
    if mse < 10:
        return "Amazing model accuracy! What a smarty-pants :)"
    elif mse < 20:
        return "Great model accuracy"
    elif mse < 30:
        return "Fine model accuracy."
    elif mse < 40:
        return "Less than average model accuracy."
    else:
        return "Poor model accuracy :("

# Print the model accuracy rating
print(mse_rating(mse))

# Function to plot multiple graphs of each independent variable against mood score
def scatter_2d_plot(X, Y):
    for column in X.columns:
        plt.scatter(X[column], Y, alpha=0.5)
        plt.title(f'Mood Score vs {column}')
        plt.xlabel(column)
        plt.ylabel('Mood Score')
        
        """
        scaler = StandardScaler()
        X_column_scaled = scaler.fit_transform(X[[column]])
        X_column_scaled = (scaler.transform(X_column_scaled)).reshape(1, -1)
        
        # Create a simple neural network with one neuron
        mini_model = tf.keras.Sequential([
            tf.keras.layers.Dense(units=1, input_shape=[1])
        ])
        
        mini_model.compile(optimizer='adam', loss='mean_squared_error')
        
        # Train the model
        mini_model.fit(X_column_scaled, Y, epochs=100, verbose=0)
        
        # Predict Y values based on the trained model
        Y_pred = mini_model.predict(X_column_scaled)
        
        # Plot the trendline
        plt.plot(X_column_scaled, Y_pred, color='red')
        
        # Set the plot limits to focus on the data points
        #plt.xlim(min(X[[column]]), max(X[[column]]))
        #plt.ylim(min(Y), max(Y))
        """
        # Create a PolynomialFeatures object with a specified degree, e.g., 2
        poly = PolynomialFeatures(degree=2)
        
        # Polynomial Transformation
        x_poly = poly.fit_transform(X[column].values.reshape(-1, 1))
        
        # Creating and fitting the model
        model = LinearRegression()
        model.fit(x_poly, Y)

        # Making predictions for the trendline
        x_range = np.linspace(X[column].min(), X[column].max(), 100).reshape(-1, 1)
        y_pred = model.predict(poly.transform(x_range))
        
        plt.plot(x_range, y_pred, color='red')
        
        #plt.savefig("NeuralNetworkOutputChart.png")
        plt.show()

# Have the model predict the mood score based on user input
def make_prediction(user_mean_sound, user_max_sound, user_min_sound):
    try:

        # Create a DataFrame for user input
        user_input_df = pd.DataFrame([[user_mean_sound, user_max_sound, user_min_sound]],
                                     columns=['mean_sound', 'max_sound', 'min_sound'])

        # Standardiseze user input so it compares to the rest of my data
        user_input_scaled = scaler.transform(user_input_df)

        # Predict mood score
        predicted_mood = model.predict(user_input_scaled)[0][0]
        print(f"Predicted Mood Score: {predicted_mood}")
        
        # Send our mood flying back out of my function
        return predicted_mood
        
    except ValueError:
        # It's not me, it's you ╭∩╮(-_-)╭∩╮
        print("Invalid input. Please enter numeric values.")

# Scatterplot of the OG data
scatter_2d_plot(X, Y)

# Ask user for input and make prediction
print("Enter some predicted values for the following")
user_mean_sound = float(input("Average (mean) Sound Level: "))
user_max_sound = float(input("Maximum Sound Level: "))
user_min_sound = float(input("Minimum Sound Level: "))

# Create a point of reference to see if my 'what-if' Qs actually make a positive impact.
# Otherwise, the program will conclude that it's better to leave the sound levels ualtered.
untampered_prediction = make_prediction(user_mean_sound, user_max_sound, user_min_sound)

#---------------------------------------------------

"What-if Questions"

print("\nWhat-if Q1:")
print("What if I double the mean noise level but keep the max and min the same?")
# Multiplies mean sound input by 2
what_if_1 = make_prediction(user_mean_sound*2, user_max_sound, user_min_sound)

print("\nWhat-if Q2:")
print("What if I double the min noise level but keep the mean and min the same?")
# Shockingly, this one multiplies MAXIMUM sound input by 2
what_if_2 = make_prediction(user_mean_sound, user_max_sound*2, user_min_sound)

print("\nWhat-if Q3:")
print("What if I double the min noise level but keep the max and mean the same?")
# In a strange turn of events, this line multiplies minimum sound input by 2!
what_if_3 = make_prediction(user_mean_sound, user_max_sound, user_min_sound*2)


if what_if_1 > untampered_prediction and what_if_1 > what_if_2 and what_if_1 > what_if_3:
    print("\nBased on my questions, it is clear that mean sound levels have the greatest impact on mood!")
    print("Therefore what-if Q1 gives the best result")
    
elif what_if_2 > untampered_prediction and what_if_2 > what_if_1 and what_if_2 > what_if_3:
    print("\nBased on my questions, it is clear that the maximum sound level reached has the greatest impact on mood!")
    print("Therefore what-if Q2 gives the best result")
    
elif what_if_3 > untampered_prediction and what_if_3 > what_if_1 and what_if_3 > what_if_2:
    print("\nBased on my questions, it is clear that the minimum sound level reached has the greatest impact on mood!")
    print("Therefore what-if Q3 gives the best result")
    
elif untampered_prediction > what_if_1 and untampered_prediction > what_if_2 and untampered_prediction > what_if_3:
    print("\nBased on my questions, it seems that none of my what-if questions have a postive impact on a person's mood score")
    print("Therefore the original inputted mean, max, and min sound levels provide the best result")

这是使用神经接口绘制趋势线的一个很好的垫脚石,它提供了以下输出:

Scatter plot SHOULD be here..

这是我将神经接口集成到我的绘图中的可怕try (请注意,我已经在ChatGPT上运行了难以理解的次数,所以它可能从那以后就不会做了):

from silence_tensorflow import silence_tensorflow
silence_tensorflow()

# Tensorflow has a bunch of functions available for moddeling and analysis
# Specifically, I'm using the layers module to do the hard work for me in constructing a nueral network.
import tensorflow as tf
from tensorflow.keras import layers

# Pandas is a little more flexible and has more options than the previously used 'CSV' module
# In my case, it's essential for constructing dataframes from my uh... data.
import pandas as pd

# train_test_split shuffles the data
# it then splits the aforementioned data into two samples, one for training the model, another for testing it's accuracy
from sklearn.model_selection import train_test_split

# StandardScaler is pretty self explanatory, it standardises the sample data, so that a clear model can be created
from sklearn.preprocessing import StandardScaler

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Plot that data, baby! (ᗒᗊᗕ)
import matplotlib.pyplot as plt
import numpy as np

# Load my CSV values from the serial program
data = pd.read_csv('your_dataset.csv')

# X = Independent variables
X = data[['mean_sound', 'max_sound', 'min_sound']]
# Y = Target variable
Y = data['average_mood']

# Split that data into train/test sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# Standardise (ze?) the data
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Yass queen! Model that neural network! Slay!
model = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])

# Compile! (optimize the model and esatblish the loss function, which finds the difference between the predicted and actual data)
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model. Queue 'Eye of The Tiger'!
# Huh, 'queue' is kind of a weird word, right?
model.fit(X_train_scaled, Y_train, epochs=410, validation_split=0.2)

# Evaluate how our new, beefed-up, trained model compares to the OG data
mse = model.evaluate(X_test_scaled, Y_test)
print(f"Test Mean Squared Error: {mse}")

# Function to figure out what the Mean Squared Error says about my program
def mse_rating(mse):
    if mse < 10:
        return "Amazing model accuracy! What a smarty-pants :)"
    elif mse < 20:
        return "Great model accuracy"
    elif mse < 30:
        return "Fine model accuracy."
    elif mse < 40:
        return "Less than average model accuracy."
    else:
        return "Poor model accuracy :("

# Print the model accuracy rating
print(mse_rating(mse))

# Function to plot multiple graphs of each independent variable against mood score
def scatter_2d_plot(X, y, x_title, y_title):
    plt.scatter(X[x_title], y, alpha=0.5)
    plt.title(f'Mood Score vs {x_title}')
    plt.xlabel(x_title)
    plt.ylabel('Mood Score')
    
    # Predict mood scores for the range of x values using the neural network model
    x_range = np.linspace(X[x_title].min(), X[x_title].max(), 100).reshape(-1, 1)
    
    # Ensure that the x_range is scaled using the same scaler that was fitted with
    x_range_scaled = scaler.transform(x_range)  

    # Predict mood scores for the scaled x_range
    y_pred = model.predict(x_range_scaled)

    # Plot the trend line
    plt.plot(x_range, y_pred, color='red')
        
    plt.show()



# Have the model predict the mood score based on user input
def make_prediction(user_mean_sound, user_max_sound, user_min_sound):
    try:

        # Create a DataFrame for user input
        user_input_df = pd.DataFrame([[user_mean_sound, user_max_sound, user_min_sound]],
                                     columns=['mean_sound', 'max_sound', 'min_sound'])

        # Standardize user input so it compares to the rest of my data
        user_input_scaled = scaler.transform(user_input_df)

        # Predict mood score
        predicted_mood = model.predict(user_input_scaled)[0][0]
        print(f"Predicted Mood Score: {predicted_mood}")
        
        # Pass the entire DataFrame X_train to scatter_2d_plot
        scatter_2d_plot(X_train, Y_train, 'mean_sound', 'Predicted Mood Score')
        
        # Send our mood flying back out of my function
        return predicted_mood
        
    except ValueError:
        # It's not me, it's you
        print("Invalid input. Please enter numeric values.")



# Ask user for input and make prediction
print("Enter some predicted values for the following")
user_mean_sound = float(input("Average (mean) Sound Level: "))
user_max_sound = float(input("Maximum Sound Level: "))
user_min_sound = float(input("Minimum Sound Level: "))

# Create a point of reference to see if my 'what-if' Qs actually make a positive impact.
# Otherwise, the program will conclude that it's better to leave the sound levels ualtered.
untampered_prediction = make_prediction(user_mean_sound, user_max_sound, user_min_sound)

#---------------------------------------------------

# What-if Questions

print("\nWhat-if Q1:")
print("What if I double the mean noise level but keep the max and min the same?")
# Multiplies mean sound input by 2
what_if_1 = make_prediction(user_mean_sound*2, user_max_sound, user_min_sound)

print("\nWhat-if Q2:")
print("What if I double the min noise level but keep the mean and min the same?")
# Shockingly, this one multiplies MAXIMUM sound input by 2
what_if_2 = make_prediction(user_mean_sound, user_max_sound*2, user_min_sound)

print("\nWhat-if Q3:")
print("What if I double the min noise level but keep the max and mean the same?")
# In a strange turn of events, this line multiplies minimum sound input by 2!
what_if_3 = make_prediction(user_mean_sound, user_max_sound, user_min_sound*2)


# Validate predictions before comparing them
if all(pred is not None for pred in [untampered_prediction, what_if_1, what_if_2, what_if_3]):
    if what_if_1 > untampered_prediction and what_if_1 > what_if_2 and what_if_1 > what_if_3:
        print("\nBased on my questions, it is clear that mean sound levels have the greatest impact on mood!")
        print("Therefore what-if Q1 gives the best result")
        
    elif what_if_2 > untampered_prediction and what_if_2 > what_if_1 and what_if_2 > what_if_3:
        print("\nBased on my questions, it is clear that the maximum sound level reached has the greatest impact on mood!")
        print("Therefore what-if Q2 gives the best result")
        
    elif what_if_3 > untampered_prediction and what_if_3 > what_if_1 and what_if_3 > what_if_2:
        print("\nBased on my questions, it is clear that the minimum sound level reached has the greatest impact on mood!")
        print("Therefore what-if Q3 gives the best result")
        
    elif untampered_prediction > what_if_1 and untampered_prediction > what_if_2 and untampered_prediction > what_if_3:
        print("\nBased on my questions, it seems that none of my what-if questions have a positive impact on a person's mood score")
        print("Therefore the original inputted mean, max, and min sound levels provide the best result")
else:
    print("At least one prediction failed. Please check your input values.")

第二个代码块给出以下错误:

C:\Users\Nitro\AppData\Roaming\Python\Python310\site-packages\sklearn\base.py:465:UserWarning:X没有有效的功能名称,但StandardScaler安装了功能名称 Warnings.warning.warn( 输入无效.请输入数值. 至少有一项预测失败.请判断您的输入值.

我知道这是一个很长的问题,我真的不知道人们通常在这些事情上给出多少细节.从字面上看,任何帮助都是感激的,如果你 Select 帮助像我这样的鱼离开水,出于你的善良,我提前感谢你!

推荐答案

在前面的工作代码中,添加下面定义的函数scatter_2d_plot_withTFmodel()将执行所需的绘图.我用一些模拟数据对其进行了测试:

enter image description here

该网络针对3个特性进行训练,因此,每当您使用predict()时,它都需要查看这3个特性中的每一个的值.绘制趋势线时,您一次只关心一列,因此需要为其他两列得出something,否则predict()会抱怨每一个输入没有有效的值.

在代码中,我用x_range表示要绘制的列,对其他2列使用它们的平均值填充它们的列.这给我们提供了predict()所需的3个输入.

在假设其他特征具有平均值的情况下,对x_range上的每个点都有一个预测.

def scatter_2d_plot_withTFmodel(X, Y):
    for column in X.columns:
        plt.scatter(X[column], Y, alpha=0.5)
        plt.title(f'Mood Score vs {column} (with TF model trendline)')
        plt.xlabel(column)
        plt.ylabel('Mood Score')

        #We only want the trendline for "column" here,
        # but the TF model needs all 3 inputs (mean_sound, max_sound, min_sound).
        #
        #To get round this, we need to provide values for the other 2 columns.
        #
        #1. Find out which columns aren't selected.
        #2. Then get their average.
        #3. Make a new dataframe that is populated with
        #   "x_range" for "column", and the average for the other two columns.
        #4. Transform using the previously-fitted scaler, and predict.
        #
        #The prediction being made for "column" is therefore
        # making the assumption that we have an average amount of the other columns.
        
        #1. Find the unselected columns
        other_columns = [col for col in X.columns if col != column]
        #2. Get their average
        other_columns_average = X[other_columns].mean().values
        
        #3. Make a new dataframe - populate it with the values to use.
        
        # x_range for this column
        x_range = np.linspace(X[column].min(), X[column].max(), 100)
        
        df_for_trendline = pd.DataFrame({
            column: x_range,
            other_columns[0]: [other_columns_average[0]] * len(x_range),
            other_columns[1]: [other_columns_average[1]] * len(x_range),
        })
        # Reorder the columns so they match the order when "scaler" was fitted
        df_for_trendline = df_for_trendline[['mean_sound', 'max_sound', 'min_sound']]
        
        #4. Transform using previous scaler, and predict.
        # Prediction is for x_range, assuming an average value
        # for the other variables
        df_for_trendline_scaled = scaler.transform(df_for_trendline)
        y_pred = model.predict(df_for_trendline_scaled)
        
        plt.plot(x_range, y_pred, color='red')
        
        #plt.savefig("NeuralNetworkOutputChart.png")
        plt.show()

Python相关问答推荐

提取两行之间的标题的常规表达

如何在类和classy-fastapi -fastapi- followup中使用FastAPI创建路由

从groupby执行计算后创建新的子框架

对象的`__call__`方法的setattr在Python中不起作用'

把一个pandas文件夹从juyter笔记本放到堆栈溢出问题中的最快方法?

Python列表不会在条件while循环中正确随机化'

如何使用Numpy. stracards重新编写滚动和?

如果初始groupby找不到满足掩码条件的第一行,我如何更改groupby列,以找到它?

lityter不让我输入左边的方括号,'

网格基于1.Y轴与2.x轴显示在matplotlib中

Pandas:计算中间时间条目的总时间增量

当条件满足时停止ODE集成?

在电影中向西北方向对齐""

如何强制向量中的特定元素在Gekko中处于优化解决方案中

当我定义一个继承的类时,我可以避免使用`metaclass=`吗?

如何在SQLAlchemy + Alembic中定义一个"Index()",在基表中的列上

比较两个有条件的数据帧并删除所有不合格的数据帧

正在try 让Python读取特定的CSV文件

时长超过24小时如何从Excel导入时长数据

如何在python tkinter中绑定键盘上的另一个回车?