[対訳] NeuralNetsInTesseract4.00

original (2019/05/14 付)	Google 翻訳 (2019/05/17 付)
# Overview of the new neural network system in Tesseract 4.00	#Tesseract 4.00の新しいニューラルネットワークシステムの概要
* Introduction	* はじめに
* Integration with Tesseract	* Tesseractとの統合
* Hardware and CPU requirements	* ハードウェアとCPUの要件
* For Open Source Contributors	* オープンソース貢献者のために
* Basics of the Implementation	* 実装の基本
* Adding a new Layer Type	* 新しいレイヤータイプを追加する
# Introduction	# 前書き
Tesseract 4.00 includes a new neural network subsystem configured as a textline	Tesseract 4.00には、テキスト行として構成された新しいニューラルネットワークサブシステムが含まれています
recognizer. It has its origins in OCRopus' Python-based LSTM implementation, but	レコグナイザその起源はOCRopusのPythonベースのLSTM実装にありますが、
has been totally redesigned for Tesseract in C++. The neural network system in	C ++でTesseract用に完全に再設計されました。ニューラルネットワークシステム
Tesseract pre-dates Tensor Flow, but is compatible with it, as there is a	TesseractはTensor Flowよりも前の日付ですが、互換性があります。
network description language called [Variable Graph Specification	[Variable Graph Specification]というネットワーク記述言語
Language](VGSLSpecs) (VGSL), that is also available for Tensor Flow. See	これはTensor Flowでも利用可能です。見る
https://github.com/tensorflow/models/tree/master/research/street	https://github.com/tensorflow/models/tree/master/research/street
The idea of VGSL is that it is possible to build a neural network and train it	VGSLのアイデアは、ニューラルネットワークを構築してそれを訓練することが可能であるということです
without having to learn a lot of anything. There is no need to learn Python,	何も学ぶ必要はありません。 Pythonを学ぶ必要はありません。
Tensor Flow, or even write any C++ code. It is merely required to understand the	Tensor Flow、あるいはどんなC ++コードでも書くことができます。理解することだけが必要です。
VGSL specification language well enough to build syntactically correct network	構文的に正しいネットワークを構築するのに十分なVGSL仕様言語
descriptions. Some basic knowledge of what the various neural network layer	説明さまざまなニューラルネットワーク層の基本知識
types are and how they are combined will go a very long way.	型がありそしてそれらがどのように組み合わされるかは非常に長い道のりを行くでしょう。
# Integration with Tesseract	#Tesseractとの統合
The Tesseract 4.00 neural network subsystem is integrated into Tesseract as a	Tesseract 4.00ニューラルネットワークサブシステムは、Tesseractに次のように統合されています。
line recognizer. It can be used with the existing layout analysis to recognize	ラインレコグナイザそれを認識するために既存のレイアウト分析と共に使用することができます。
text within a large document, or it can be used in conjunction with an external	大きな文書内のテキスト、または外部文書と組み合わせて使用できます。
text detector to recognize text from an image of a single textline.	単一のテキスト行の画像からテキストを認識するテキスト検出器。
The neural network engine is the default for 4.00. To recognize text from an	ニューラルネットワークエンジンは4.00のデフォルトです。のテキストを認識する
image of a single text line, use `SetPageSegMode(PSM_RAW_LINE)`. This can be	単一のテキスト行の画像は、 `SetPageSegMode(PSM_RAW_LINE)`を使います。これは
used from the command-line with `--psm 13`.	コマンドラインから `--psm 13`を付けて使用
The neural network engine has been integrated to enable the multi-language mode	ニューラルネットワークエンジンは、多言語モードを有効にするために統合されました
that worked with Tesseract 3.04, but this will be improved in a future release.	これはTesseract 3.04で動作しましたが、これは将来のリリースで改善されるでしょう。
Vertical text is now supported for Chinese, Japanese and Korean, and should be	垂直テキストは現在、中国語、日本語、韓国語でサポートされています。
detected automatically.	自動的に検出されます。
# Hardware and CPU Requirements	#ハードウェアとCPUの要件
The Tesseract 4.00 neural network subsystem is heavily compute-intensive, using	Tesseract 4.00ニューラルネットワークサブシステムは、次のものを使用して、非常に計算集約的です。
on the order of ten times the CPU resources of the base Tesseract, but the impact	基本TesseractのCPUリソースの10倍のオーダー
is mitigated, if your platform supports it, as follows:	プラットフォームがサポートしている場合は、次のように緩和されます。
* OpenMP allows use of four cores in parallel if your machine has them.	* OpenMPでは、あなたのマシンに4つのコアがあれば、それらを同時に使用することができます。
* Intel/AMD processors that support SSE and/or AVX benefit from SIMD	SSEやAVXをサポートするIntel / AMDプロセッサはSIMDの恩恵を受けます
parallelization of the core matrix multiplications.	コア行列の乗算の並列化
On a machine with multiple cores, and AVX, an easy English image may take twice	複数のコアとAVXを搭載したマシンでは、簡単な英語のイメージは2倍かかることがあります。
as much real time, and use 7 times the CPU as base Tesseract, whereas Hindi	ヒンディー語に対して、リアルタイムと同じくらい、そして基本Tesseractとして7倍のCPUを使う
takes more CPU than base Tesseract, but actually runs faster in terms of real	基本的なTesseractよりも多くのCPUを使用しますが、実際には実際の面で高速に実行されます
time.	時間。
If the above components are missing, there is a slower plain C++ implementation	上記のコンポーネントが欠けている場合は、遅いプレーンなC ++実装があります。
that enables the code to still work.	それはコードがまだ機能することを可能にします。
Little-endian and big-endian systems are both supported.	リトルエンディアンとビッグエンディアンの両方のシステムがサポートされています。
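
The relationship between the SIMD paths and the plain C++ fallback described above can be illustrated with a small sketch. This is not Tesseract's own code: the function names `DotProductPlain`, `DotProductAvx` and `DotProduct` are hypothetical, and the choice is made at compile time through the compiler-defined `__AVX__` macro, which is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Portable fallback (hypothetical name): one multiply-add per iteration.
double DotProductPlain(const double* u, const double* v, std::size_t n) {
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) total += u[i] * v[i];
  return total;
}

#if defined(__AVX__)
// AVX path (hypothetical name): four doubles per iteration, then a scalar
// loop for the remaining elements.
double DotProductAvx(const double* u, const double* v, std::size_t n) {
  __m256d acc = _mm256_setzero_pd();
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    const __m256d a = _mm256_loadu_pd(u + i);
    const __m256d b = _mm256_loadu_pd(v + i);
    acc = _mm256_add_pd(acc, _mm256_mul_pd(a, b));
  }
  double lanes[4];
  _mm256_storeu_pd(lanes, acc);
  double total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < n; ++i) total += u[i] * v[i];
  return total;
}
#endif

// Uses the AVX path when the compiler targets AVX, the plain loop otherwise.
double DotProduct(const std::vector<double>& u, const std::vector<double>& v) {
  assert(u.size() == v.size());
#if defined(__AVX__)
  return DotProductAvx(u.data(), v.data(), u.size());
#else
  return DotProductPlain(u.data(), v.data(), u.size());
#endif
}
```

Both paths compute the same result, so code that calls `DotProduct` does not need to know which one was compiled in. A matrix multiplication is a series of such dot products, which is why vectorizing this inner loop has a large effect.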
# For Open Source Contributors	#オープンソース貢献者のために
The initial implementation lacks the following:	初期の実装には以下が欠けています。
* There is a C++ implementation if the hardware does not have SSE and/or AVX,	*ハードウェアにSSEやAVXがない場合はC ++の実装があります。
but the code could benefit from SIMD implementations for other hardware,	しかし、コードは他のハードウェア用のSIMD実装から恩恵を受けることができます。
such as ARM. See the new `arch` directory for where to insert the code.	ARMなど。コードを挿入する場所については新しい `arch`ディレクトリを見てください。
# Basics of the Implementation	#実装の基本
All network layer types are derived from the `Network` base-class. The	すべてのネットワーク層のタイプは `Network`基本クラスから派生しています。の
`Plumbing` sub-class is a base-class for layers that manipulate other layers in	`Plumbing`サブクラスは他のレイヤーを操作するレイヤーの基本クラスです。
some way, e.g. by reshaping their input/output or organising a group of layers.	何らかの方法で、それらの入力/出力を作り直すか、層のグループを組織することによって。
The input/output data "Tensor" is `NetworkIO` and the weights are stored in a	入出力データ "Tensor"は `NetworkIO`であり、重みは
`WeightMatrix`, both of which contain a Tesseract `GENERIC_2D_ARRAY` to hold the	`WeightMatrix`は両方ともTesseractを保持するためのTesseract `GENERIC_2D_ARRAY`を含みます
data. `LSTMRecognizer` provides the higher-level abstraction of converting an	データ。 `LSTMRecognizer`は、より高度な抽象化を提供します。
image of a textline to a sequence of tesseract `WERD_RES` classes. `LSTMTrainer`	一連のtesseractの `WERD_RES`クラスへのテキスト行の画像。 `LSTMTrainer`
likewise handles the abstraction of training a network on an image of a textline	テキストラインの画像上でネットワークをトレーニングすることの抽象化も同様に処理します。
that has a UTF-8 string 'truth'. `NetworkBuilder` takes responsibility for	それはUTF-8文字列 '真実'を持ちます。 `NetworkBuilder`が責任を負います
converting the VGSL specification language to a graph of network elements.	VGSL仕様言語をネットワーク要素のグラフに変換する。
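
As a rough illustration of the first step in turning a specification string into network elements, the sketch below splits a bracketed series of layer tokens into separate strings. It is a toy and not how `NetworkBuilder` is implemented: `SplitSeriesSpec` is a hypothetical helper, it only handles a flat series with no nested brackets or parallel groups, and it does not interpret the tokens.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits "[tok1 tok2 ...]" into its whitespace-separated
// tokens. Nested brackets and parentheses are out of scope for this sketch.
std::vector<std::string> SplitSeriesSpec(const std::string& spec) {
  assert(spec.size() >= 2 && spec.front() == '[' && spec.back() == ']');
  std::istringstream body(spec.substr(1, spec.size() - 2));
  std::vector<std::string> tokens;
  std::string token;
  while (body >> token) tokens.push_back(token);
  return tokens;
}
```

Given a spec of the form `[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]`, the helper yields eight tokens: the input shape first, then one token per layer. A real builder would go on to construct a layer object from each token.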
# Adding a new Layer Type	#新しいレイヤータイプを追加する
A new layer class must derive from `Network` or `Plumbing` and implement at	新しいレイヤクラスは `Network`または`Plumbing`から派生して、で実装しなければなりません
least the following virtual methods:	少なくとも次の仮想メソッド
* `spec`, which returns a string corresponding to the string that generated	* `spec`、生成した文字列に対応する文字列を返します
the layer.	レイヤー
* `Serialize/DeSerialize` to save/restore the layer to/from a TFile.	TFileとの間でレイヤーを保存/復元するための `Serialize / DeSerialize`。
* `Forward` to run the layer in the forwards direction.	* `Forward`はレイヤーを順方向に動かします。
* `Backward` to run the layer in the backwards direction during training.	トレーニング中に逆方向にレイヤーを動かす* Backward。
Layers that have weights must also implement `Update` to update the weights with	重みを持つレイヤーは、重みを更新するための `Update`も実装しなければなりません。
a set of gradients. There are quite a few other methods that may need to be	グラデーションのセット他にもいくつかの方法があります。
implemented, depending on the specific requirements of the new layer. See	新しい層の特定の要件に応じて実装されています。見る
`network.h` for more information on the methods that may require implementing.	実装が必要なメソッドの詳細については、 `network.h`。
* `NetworkBuilder` must be modified to parse the specification of the new	* `NetworkBuilder`は新しい仕様をパースするように修正されなければなりません
type.	タイプ。
* The `NetworkType` enum must be extended to include the new type.	* `NetworkType`列挙は新しい型を含むように拡張されなければなりません。
* A corresponding entry in `Network::kTypeNames` must be added for the new	* `Network :: kTypeNames`内の対応するエントリを新しいものに追加する必要があります
type.	タイプ。
* `Network::CreateFromFile` must be modified to construct the appropriate type	* `Network :: CreateFromFile`は適切な型を構築するために修正する必要があります
on deserialization.	逆シリアル化
* As with any new code, `lstm/Makefile.am` needs to be updated with the new	*他の新しいコードと同様に、 `lstm / Makefile.am`も新しいコードで更新する必要があります。
filenames.	ファイル名
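
The shape of a layer class can be sketched with a self-contained toy. None of the types below are Tesseract's: `ToyTensor`, `ToyNetwork` and `ToyScale` are hypothetical stand-ins, and the method signatures are deliberately simplified. The real virtual methods in `network.h` operate on `NetworkIO` and take additional arguments, and serialization is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using ToyTensor = std::vector<float>;  // stand-in for NetworkIO (assumption)

// Simplified stand-in for the Network base class.
class ToyNetwork {
 public:
  virtual ~ToyNetwork() = default;
  // Returns the spec token that would have generated this layer.
  virtual std::string spec() const = 0;
  // Runs the layer in the forwards direction.
  virtual void Forward(const ToyTensor& input, ToyTensor* output) = 0;
  // Runs the layer backwards: takes dLoss/dOutput, produces dLoss/dInput.
  virtual void Backward(const ToyTensor& fwd_deltas, ToyTensor* back_deltas) = 0;
  // Layers without weights can keep this no-op default.
  virtual void Update(float /*learning_rate*/) {}
};

// A layer with a single weight w that computes output[i] = w * input[i].
class ToyScale : public ToyNetwork {
 public:
  explicit ToyScale(float weight) : weight_(weight) {}

  std::string spec() const override { return "Scale"; }  // hypothetical token

  void Forward(const ToyTensor& input, ToyTensor* output) override {
    last_input_ = input;  // kept for the gradient computation in Backward
    output->resize(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
      (*output)[i] = weight_ * input[i];
    }
  }

  void Backward(const ToyTensor& fwd_deltas, ToyTensor* back_deltas) override {
    assert(fwd_deltas.size() == last_input_.size());
    back_deltas->resize(fwd_deltas.size());
    gradient_ = 0.0f;
    for (std::size_t i = 0; i < fwd_deltas.size(); ++i) {
      gradient_ += fwd_deltas[i] * last_input_[i];   // dLoss/dWeight
      (*back_deltas)[i] = weight_ * fwd_deltas[i];   // dLoss/dInput
    }
  }

  // Plain gradient descent step using the gradient stored by Backward.
  void Update(float learning_rate) override {
    weight_ -= learning_rate * gradient_;
  }

  float weight() const { return weight_; }

 private:
  float weight_;
  float gradient_ = 0.0f;
  ToyTensor last_input_;
};
```

The split mirrors the description above: `Forward` and `Backward` are required of every layer, while only layers that own weights need a meaningful `Update`. The sign convention and the gradient descent step are choices made for this sketch and are not taken from Tesseract.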


最終更新：2019年06月12日 20:15
