TensorFlow Architecture
AnAn 2018/07/15 (under construction)
TensorFlow is designed for large-scale distributed training and inference, yet it is flexible enough to support experimentation with new machine learning models and system-level optimizations. The TensorFlow system architecture is what makes this combination of scale and flexibility possible.
The [0] document is intended for developers who want to extend TensorFlow in some way not supported by current APIs, hardware engineers who want to optimize for TensorFlow, and implementers of machine learning systems working on scaling and distribution.
(Figure to be added)
Fig. 1: The TensorFlow Architecture
Fig. 1 illustrates the general architecture. The TensorFlow runtime is a cross-language library; a C Application Programming Interface (C API) separates user-level code in different languages from the core runtime.
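As a minimal sketch, assuming the TensorFlow 1.x Python client, the snippet below shows user-level code building a dataflow graph and handing it to the core runtime that sits behind the C API:

```python
import tensorflow as tf  # assumes TensorFlow 1.x (graph/session API)

# The Python client only builds a dataflow graph; nothing is computed yet.
a = tf.constant(3.0, name="a")
b = tf.constant(4.0, name="b")
c = tf.multiply(a, b, name="c")

# Session.run() passes the graph through the C API to the runtime for execution.
with tf.Session() as sess:
    print(sess.run(c))  # 12.0
```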
The [0] document focuses on the following layers:
- Client
  - Defines the computation as a dataflow graph.
  - Initiates graph execution using a session.
- Distributed Master
  - Prunes a specific subgraph from the graph, as defined by the arguments to Session.run().
  - Partitions the subgraph into multiple pieces that run in different processes and devices.
  - Distributes the graph pieces to worker services.
  - Initiates graph piece execution by worker services (see the sketch after this list).
- Worker Services (one for each task)
  - Schedule the execution of graph operations using kernel implementations appropriate to the available hardware (CPUs, GPUs, etc.).
  - Send and receive operation results to and from other worker services.
- Kernel Implementations
  - Perform the computation for individual graph operations.
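To make the division of labor concrete, here is a minimal sketch of a distributed setup, assuming TensorFlow 1.x and two hypothetical worker tasks hosted in a single process on localhost ports 2222 and 2223 (ports chosen for illustration). Pinning ops to /job:worker/task:N devices is what lets the distributed master prune the subgraph named by Session.run(), partition it, and hand the pieces to the worker services:

```python
import tensorflow as tf  # assumes TensorFlow 1.x distributed runtime

# Hypothetical cluster: two worker tasks, both hosted in this one process.
cluster = tf.train.ClusterSpec({"worker": ["localhost:2222", "localhost:2223"]})
server0 = tf.train.Server(cluster, job_name="worker", task_index=0)
server1 = tf.train.Server(cluster, job_name="worker", task_index=1)

# Client: build one graph, but place ops on devices in different tasks.
with tf.device("/job:worker/task:0/cpu:0"):
    a = tf.constant([[1.0, 2.0]])
with tf.device("/job:worker/task:1/cpu:0"):
    b = tf.constant([[3.0], [4.0]])
    c = tf.matmul(a, b)  # the MatMul kernel runs in task 1's worker service

# The session connects to the master behind server0. The fetch passed to
# Session.run() defines the subgraph to prune, which the master partitions
# across the two worker services; they exchange the intermediate tensors.
with tf.Session(server0.target) as sess:
    print(sess.run(c))  # [[11.]]
```

In a real deployment each task would run in its own process or machine, but the pruning, partitioning, and worker-to-worker data transfer sketched here are the same.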