A Detailed Look at TensorFlow's tf.nn.sigmoid: the sigmoid Activation Function


Purpose and Usage of tf.nn.sigmoid in TensorFlow

Purpose

tf.nn.sigmoid is an activation function used in neural networks. Its mathematical formula is:

$y = \frac{1}{1 + e^{-x}}$

The sigmoid function is an S-shaped function that maps its input to a value between 0 and 1. In neural networks, sigmoid is commonly used in the output layer for binary classification, because it converts raw outputs into probabilities. It can also be used in hidden layers, where it introduces nonlinearity and increases the network's expressive power.
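
To see this mapping concretely, here is a minimal sketch (assuming TensorFlow 1.x, matching the examples below) that evaluates tf.nn.sigmoid on a few sample inputs:

import tensorflow as tf

# Inputs spanning negative, zero, and positive values
x = tf.constant([-5.0, -1.0, 0.0, 1.0, 5.0])
y = tf.nn.sigmoid(x)

with tf.Session() as sess:
    print(sess.run(y))
# Approximately: [0.0067 0.2689 0.5 0.7311 0.9933]

Large negative inputs saturate near 0, large positive inputs saturate near 1, and sigmoid(0) is exactly 0.5.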

Usage

tf.nn.sigmoid is commonly used as the activation function of a neural network layer. Below is the code for a simple fully connected network that uses sigmoid (all examples in this article use the TensorFlow 1.x API):

import tensorflow as tf

# Build the network
input_size = 784
output_size = 10
hidden_size = 32

x = tf.placeholder(tf.float32, [None, input_size])
y = tf.placeholder(tf.float32, [None, output_size])

W1 = tf.Variable(tf.truncated_normal([input_size, hidden_size]))
b1 = tf.Variable(tf.zeros([hidden_size]))
W2 = tf.Variable(tf.truncated_normal([hidden_size, output_size]))
b2 = tf.Variable(tf.zeros([output_size]))

# Hidden layer: sigmoid introduces nonlinearity
h = tf.nn.sigmoid(tf.matmul(x, W1) + b1)
# Output layer: softmax turns logits into class probabilities
y_pred = tf.nn.softmax(tf.matmul(h, W2) + b2)

# Define the loss function and the optimizer
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_pred), axis=1))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Load the training data and labels
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Train
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})

# Evaluate accuracy on the test set
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))

In the code above, the sigmoid function serves as the activation of the first layer, introducing nonlinearity:

h = tf.nn.sigmoid(tf.matmul(x, W1) + b1)
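
tf.nn.sigmoid applies the formula from the beginning of the article element-wise. As a quick sanity check (a sketch, assuming the same TF 1.x setup), the built-in op agrees with the formula written out by hand:

import tensorflow as tf

z = tf.constant([-2.0, 0.0, 2.0])
builtin = tf.nn.sigmoid(z)
manual = 1.0 / (1.0 + tf.exp(-z))  # the formula y = 1 / (1 + e^(-x))

with tf.Session() as sess:
    print(sess.run(builtin))  # approximately [0.1192 0.5 0.8808]
    print(sess.run(manual))   # same values, up to floating-point rounding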

Examples

Example 1

Below is code that trains a handwritten-digit recognition model on the MNIST dataset using the sigmoid function:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Define hyperparameters
epochs = 50
batch_size = 128
learning_rate = 0.001
input_dim = 784 # each image is 28x28 = 784 pixels
output_classes = 10 # 10 digit classes

# Define the network
X = tf.placeholder(tf.float32, [None, input_dim])
y = tf.placeholder(tf.float32, [None, output_classes])

W1 = tf.Variable(tf.random_normal([input_dim, 256], stddev=0.01))
b1 = tf.Variable(tf.zeros([256]))

# Hidden layer with sigmoid activation
h1 = tf.nn.sigmoid(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, output_classes], stddev=0.01))
b2 = tf.Variable(tf.zeros([output_classes]))

y_pred = tf.nn.softmax(tf.matmul(h1, W2) + b2)

# Define the loss function and the optimizer
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_pred), axis=1))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)

# Define the evaluation op
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Train
with tf.Session() as sess:
    tf.global_variables_initializer().run()

    for epoch in range(epochs):
        for i in range(mnist.train.num_examples // batch_size):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={X: batch_xs, y: batch_ys})

        # After each epoch, check accuracy on the validation set
        acc = sess.run(accuracy, feed_dict={X: mnist.validation.images, y: mnist.validation.labels})
        print("Epoch: {}, accuracy: {}".format(epoch, acc))

    # Evaluate final accuracy on the test set
    test_acc = sess.run(accuracy, feed_dict={X: mnist.test.images, y: mnist.test.labels})
    print("Test accuracy: {}".format(test_acc))

In this network, sigmoid is applied in the hidden layer to increase the model's capacity for nonlinear fitting. After each epoch the validation accuracy is printed, and the final test accuracy is reported at the end. Running the code produces output like the following:

Epoch: 0, accuracy: 0.9158000340461731
Epoch: 1, accuracy: 0.931599974155426
...
Epoch: 48, accuracy: 0.9690000414848328
Epoch: 49, accuracy: 0.9694000487327576
Test accuracy: 0.96670001745224
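
This example uses sigmoid in a hidden layer. For the binary-classification use mentioned at the start of the article, sigmoid is applied to the output layer instead, so a single logit becomes a probability. Here is a minimal sketch with hypothetical synthetic data (not part of the original examples), again assuming TF 1.x:

import tensorflow as tf
import numpy as np

# Hypothetical toy data: 2 features, binary labels
X_data = np.random.randn(100, 2).astype(np.float32)
y_data = (X_data[:, 0] + X_data[:, 1] > 0).astype(np.float32).reshape(-1, 1)

X = tf.placeholder(tf.float32, [None, 2])
y = tf.placeholder(tf.float32, [None, 1])

W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))

logits = tf.matmul(X, W) + b
prob = tf.nn.sigmoid(logits)  # output in (0, 1), interpreted as P(y = 1)

# sigmoid_cross_entropy_with_logits applies sigmoid internally, which is
# more numerically stable than taking tf.log(prob) by hand
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_step, feed_dict={X: X_data, y: y_data})
    # Threshold the probability at 0.5 to get class predictions
    preds = sess.run(prob, feed_dict={X: X_data}) > 0.5
    print("train accuracy:", (preds == y_data.astype(bool)).mean())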

Example 2

Below is code that trains a multi-layer neural network on the Iris dataset using the sigmoid function:

import tensorflow as tf
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.33, random_state=42)

# Define hyperparameters
epochs = 200
batch_size = 10
learning_rate = 0.01
input_dim = 4 # the dataset has 4 features
hidden_size = 10
output_classes = 3 # the dataset has 3 classes

# One-hot encode the integer labels to match the placeholder shape
y_train_oh = np.eye(output_classes)[y_train]
y_test_oh = np.eye(output_classes)[y_test]

# Define the network
X = tf.placeholder(tf.float32, [None, input_dim])
y = tf.placeholder(tf.float32, [None, output_classes])

W1 = tf.Variable(tf.random_normal([input_dim, hidden_size], stddev=0.01))
b1 = tf.Variable(tf.zeros([hidden_size]))

# Hidden layer with sigmoid activation
h1 = tf.nn.sigmoid(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([hidden_size, output_classes], stddev=0.01))
b2 = tf.Variable(tf.zeros([output_classes]))

y_pred = tf.nn.softmax(tf.matmul(h1, W2) + b2)

# Define the loss function and the optimizer
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_pred), axis=1))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)

# Define the evaluation op
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Train
with tf.Session() as sess:
    tf.global_variables_initializer().run()

    for epoch in range(epochs):
        for i in range(len(X_train) // batch_size):
            batch_xs = X_train[i*batch_size:(i+1)*batch_size]
            batch_ys = y_train_oh[i*batch_size:(i+1)*batch_size]
            sess.run(optimizer, feed_dict={X: batch_xs, y: batch_ys})

        # After each epoch, check accuracy on the held-out split
        acc = sess.run(accuracy, feed_dict={X: X_test, y: y_test_oh})
        print("Epoch: {}, accuracy: {}".format(epoch, acc))

    # Final accuracy on the held-out split
    test_acc = sess.run(accuracy, feed_dict={X: X_test, y: y_test_oh})
    print("Test accuracy: {}".format(test_acc))

As before, sigmoid is applied in the hidden layer to increase the model's capacity for nonlinear fitting. After each epoch the accuracy on the held-out split is printed, and the final accuracy is reported at the end. Running the code produces output like the following:

Epoch: 0, accuracy: 0.28
Epoch: 1, accuracy: 0.3
...
Epoch: 198, accuracy: 0.94
Epoch: 199, accuracy: 0.94
Test accuracy: 0.96
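
One caveat for both examples: computing the loss as tf.log of softmax probabilities can produce NaN if a probability underflows to 0. A minimal hardening (a sketch under the same TF 1.x assumptions) clips the probabilities before taking the log:

# Clip probabilities away from 0 before taking the log to avoid NaN losses
y_pred_safe = tf.clip_by_value(y_pred, 1e-10, 1.0)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_pred_safe), axis=1))

Alternatively, tf.nn.softmax_cross_entropy_with_logits computes the softmax and the cross-entropy together in a numerically stable way.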

Through the examples above, we gain a deeper understanding of the sigmoid function's role and usage in neural networks, and of how to apply it in TensorFlow.