How can text data be embedded into dimensional vectors using Python?


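TensorFlow is a machine learning framework provided by Google. It is an open-source framework used together with Python to implement algorithms, deep learning applications, and more. It is used in both research and production.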

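Keras was developed as part of the research for the ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System) project. Keras is a deep learning API written in Python. It is a high-level API with a productive interface that helps solve machine learning problems. It runs on top of the TensorFlow framework and was built to enable fast experimentation. It provides the essential abstractions and building blocks needed to develop and package machine learning solutions.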

Keras is already included in the TensorFlow package. It can be accessed with the following lines of code.

import tensorflow
from tensorflow import keras

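Compared with models built with the sequential API, the Keras functional API helps create more flexible models. It can handle models with non-linear topology, shared layers, and multiple inputs and outputs. A deep learning model is usually a directed acyclic graph (DAG) of layers, and the functional API is a way to build such a graph of layers.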
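As a small illustration of layer sharing in the functional API, the sketch below reuses one Embedding layer for two text inputs. The input names, the vocabulary size of 1000, the 16-dimensional embedding, and the pooling layer are all assumptions chosen for brevity; they do not come from the example in this article.

```python
from tensorflow import keras
from tensorflow.keras import layers

# One Embedding layer instance, reused for both inputs (hypothetical sizes).
shared_embedding = layers.Embedding(input_dim=1000, output_dim=16)

# Two variable-length sequences of integer word ids (hypothetical names).
input_a = keras.Input(shape=(None,), dtype="int32", name="text_a")
input_b = keras.Input(shape=(None,), dtype="int32", name="text_b")

# Calling the same layer twice shares its weights between both branches.
encoded_a = layers.GlobalAveragePooling1D()(shared_embedding(input_a))
encoded_b = layers.GlobalAveragePooling1D()(shared_embedding(input_b))

# Merge both branches and attach a single output.
merged = layers.concatenate([encoded_a, encoded_b])
score = layers.Dense(1, name="score")(merged)

shared_model = keras.Model(inputs=[input_a, input_b], outputs=score)
```

Because the embedding weights are shared, they are counted only once in the model's parameters, which a sequential stack of layers cannot express.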

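We use Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser, requires no configuration, and provides free access to GPUs (Graphical Processing Units). Colaboratory is built on top of Jupyter Notebook. The following code snippet embeds every word in the title into a 64-dimensional vector: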

Example

from tensorflow import keras
from tensorflow.keras import layers

print("Number of unique issue tags")
num_tags = 12
print("Size of vocabulary while preprocessing text data")
num_words = 10000
print("Number of classes for predictions")
num_classes = 4
title_input = keras.Input(
    shape=(None,), name="title"
)
print("Variable length int sequence")
body_input = keras.Input(shape=(None,), name="body")
tags_input = keras.Input(
    shape=(num_tags,), name="tags"
)
print("Embed every word in the title to a 64-dimensional vector")
title_features = layers.Embedding(num_words, 64)(title_input)
print("Embed every word into a 64-dimensional vector")
body_features = layers.Embedding(num_words, 64)(body_input)
print("Reduce sequence of embedded words into single 128-dimensional vector")
title_features = layers.LSTM(128)(title_features)
print("Reduce sequence of embedded words into single 32-dimensional vector")
body_features = layers.LSTM(32)(body_features)
print("Merge available features into a single vector by concatenating it")
x = layers.concatenate([title_features, body_features, tags_input])
print("Use logistic regression to predict the features")
priority_pred = layers.Dense(1, name="priority")(x)
department_pred = layers.Dense(num_classes, name="class")(x)
print("Instantiate a model that predicts priority and class")
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)

Code credit: https://www.tensorflow.org/guide/keras/functional
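To make the embedding step more concrete, here is a minimal NumPy sketch of the idea behind an embedding lookup: each integer word id selects one row of a weight matrix. This is an illustration only, not Keras's implementation. The random matrix stands in for weights that would normally be learned, and the word ids in `title_ids` are made up.

```python
import numpy as np

num_words = 10000    # vocabulary size, as in the example
embedding_dim = 64   # width of each word vector, as in the example

# Stand-in for the learned weight matrix: one 64-dimensional row per word id.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(num_words, embedding_dim)).astype("float32")

# Hypothetical batch of 2 titles, each already converted to 5 integer word ids.
title_ids = np.array([[12, 7, 431, 9, 2],
                      [5, 5, 88, 1024, 0]])

# The lookup is plain row indexing: every id is replaced by its vector.
title_vectors = embedding_table[title_ids]
print(title_vectors.shape)  # (2, 5, 64)
```

The same word id always maps to the same vector, so the two occurrences of id 5 in the second title receive identical rows.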

Output

Number of unique issue tags
Size of vocabulary while preprocessing text data
Number of classes for predictions
Variable length int sequence
Embed every word in the title to a 64-dimensional vector
Embed every word into a 64-dimensional vector
Reduce sequence of embedded words into single 128-dimensional vector
Reduce sequence of embedded words into single 32-dimensional vector
Merge available features into a single vector by concatenating it
Use logistic regression to predict the features
Instantiate a model that predicts priority and class

Explanation

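The functional API can handle models with multiple inputs and outputs. In this example, the model takes three inputs (title, body, and tags) and produces two outputs (priority and class).
This cannot be achieved with the sequential API.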
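One way to check the multiple inputs and outputs is to feed the model random data and inspect the shapes it returns. The sketch below rebuilds the same architecture in compact form so that it runs on its own. The batch size of 8, the sequence lengths of 10 and 100, and the random arrays are assumptions made purely for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

num_tags, num_words, num_classes = 12, 10000, 4

# Three inputs: two variable-length id sequences and one fixed-size tag vector.
title_input = keras.Input(shape=(None,), name="title")
body_input = keras.Input(shape=(None,), name="body")
tags_input = keras.Input(shape=(num_tags,), name="tags")

# Embed each word, then reduce each sequence to a single vector.
title_features = layers.LSTM(128)(layers.Embedding(num_words, 64)(title_input))
body_features = layers.LSTM(32)(layers.Embedding(num_words, 64)(body_input))

# 128 + 32 + 12 = 172 features after concatenation.
x = layers.concatenate([title_features, body_features, tags_input])

# Two outputs computed from the same merged features.
priority_pred = layers.Dense(1, name="priority")(x)
department_pred = layers.Dense(num_classes, name="class")(x)
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)

# Random stand-in data (assumed sizes): 8 samples, titles of 10 ids, bodies of 100 ids.
batch = 8
title_data = np.random.randint(num_words, size=(batch, 10))
body_data = np.random.randint(num_words, size=(batch, 100))
tags_data = np.random.randint(2, size=(batch, num_tags)).astype("float32")

priority, department = model.predict([title_data, body_data, tags_data], verbose=0)
print(priority.shape, department.shape)  # (8, 1) (8, 4)
```

The model returns one array per output, in the order given to `keras.Model`. The values themselves are meaningless here because the model has not been trained.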
