DAEBO Communication & Systems
Deep learning, which allows a computer to understand vast amounts of data (images, voice, text, and more), is the fastest-growing field in AI. Industry-leading companies are adopting deep learning to process exponentially growing data, combining machine learning algorithms with specialized hardware. By leveraging these enormous volumes of data to develop products, services, and processes, they are successfully gaining a competitive edge.
GPUs are turning new business ideas into reality across industries, including smart cities, the public sector, finance, manufacturing, retail, healthcare, and services. NVIDIA's enterprise GPUs serve as the brain of an AI-driven business.
NVIDIA DGX and Tesla bring the most powerful deep learning platform and performance for high-performance computing workloads to the data center, allowing data scientists to explore AI across the desk-side workstation, the data center, and the cloud.
DGX deep learning stack
Supercomputer system for the most powerful deep learning at scale
NVIDIA DGX-2, the first two-petaflops system, uses sixteen fully interconnected GPUs to deliver ten times the deep learning performance, and is designed to overcome the limits of AI speed and scale. Built on an AI-scale architecture that combines the NVIDIA® DGX™ software stack with NVIDIA NVSwitch technology, DGX-2 lets you take on even the world's most complex AI projects.
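The two-petaflops figure follows directly from per-GPU peak throughput. A minimal sketch of the arithmetic, assuming the 125 TFLOPS FP16 Tensor Core peak per Tesla V100 SXM listed in the Tesla comparison table later in this document:

```python
# Aggregate peak FP16 throughput of a DGX-2, using figures from this
# document: 125 TFLOPS FP16 per Tesla V100 SXM GPU, 16 GPUs per system.
per_gpu_fp16_tflops = 125
num_gpus = 16

total_tflops = per_gpu_fp16_tflops * num_gpus
total_petaflops = total_tflops / 1000  # 1 PFLOPS = 1,000 TFLOPS

print(f"{total_tflops:,} TFLOPS = {total_petaflops:.0f} PFLOPS")
# prints "2,000 TFLOPS = 2 PFLOPS"
```

This is a theoretical peak; sustained training throughput depends on interconnect and workload, which is why NVSwitch matters for scaling across all sixteen GPUs.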
Comparison of training time (days)
| Spec | Value | Spec | Value |
|---|---|---|---|
| GPU | 16x Tesla V100 SXM3 | System Memory | 1.5 TB |
| GPU Memory | 512 GB total (NVSwitch technology) | Network | 8x 100 Gb/sec InfiniBand / 100 GigE; 2x 10/25 Gb/sec Ethernet |
| TFLOPS (FP16) | 2,000 TFLOPS | Storage | OS: 2x 960 GB NVMe SSDs; Data: 30 TB (8x 3.84 TB) NVMe SSDs |
| NVIDIA CUDA Cores | 81,920 | Software | Ubuntu Linux OS, CUDA Toolkit |
| NVIDIA Tensor Cores | 10,240 | System Weight | 340 lbs / 154.2 kg |
| NVSwitches | 12 | System Dimensions | H 440 x W 482 x L 834 mm |
| Max. Power Consumption | 10,000 W | Operating Temp. | 5–35 ℃ |
| CPU | 2x Xeon Platinum 8168, 2.7 GHz, 24 cores | | |
Solution for enterprise-level AI R&D
DGX-1V, a must-have for AI R&D, accelerates the data center and simplifies the deep learning workflow, allowing researchers to run experiments faster and train bigger models. Powered by NVIDIA Volta™, NVIDIA DGX-1 offers industry-leading performance for AI and deep learning.
Comparison of training time (days)
| Spec | Value | Spec | Value |
|---|---|---|---|
| GPU | 8x Tesla V100 SXM2 | System Memory | 512 GB |
| GPU Memory | 256 GB total | Network | 2x 10 GbE; 4x InfiniBand EDR |
| TFLOPS (FP16) | 1,000 TFLOPS | Storage | 4x 1.92 TB SSD RAID 0 |
| NVIDIA CUDA Cores | 40,960 | Software | Ubuntu Linux OS, CUDA Toolkit |
| NVIDIA Tensor Cores | 5,120 | System Weight | 134 lbs / 60.8 kg |
| NVSwitches | 12 | System Dimensions | H 131 x W 444 x L 866 mm |
| Max. Power Consumption | 3,500 W | Operating Temp. | 10–35 ℃ |
| CPU | 2x Xeon E5-2698 v4, 2.2 GHz, 20 cores | | |
Noise-free desk-side super workstation for advanced AI development
NVIDIA DGX Station™ is the only office-ready supercomputer of its kind for advanced AI development. Designed for the office environment and running the same software stack as every DGX system, it lets you carry out R&D projects easily and effectively.
NVIDIA AI system selection guideline (On-Premises)
Deep learning training accelerated
| Spec | Value | Spec | Value |
|---|---|---|---|
| GPU | 4x Tesla V100 PCIe | System Memory | 256 GB |
| GPU Memory | 128 GB total | Network | 2x 10 GbE |
| TFLOPS (FP16) | 500 TFLOPS | Storage | OS: 1.92 TB SSD; Data: 3x 1.92 TB SSD |
| NVIDIA CUDA Cores | 20,480 | Software | Ubuntu Linux OS, CUDA Toolkit |
| NVIDIA Tensor Cores | 2,560 | Noise | < 35 dB (liquid-cooled, super quiet) |
| Max. Power Consumption | 1,500 W (suitable for a general office) | System Weight | 88 lbs / 40 kg |
| CPU | Xeon E5-2698 v4, 2.2 GHz, 20 cores | System Dimensions | H 639 x W 256 x L 518 mm |
| Operating Temp. | 10–30 ℃ | | |
The most advanced data center GPU
The NVIDIA® Tesla® accelerated computing platform gives the modern data center the performance to accelerate both AI and high-performance computing workloads.
[Charts: time to solution in hours (lower is better); performance normalized to CPU]
| TESLA | Tesla V100 SXM2 | Tesla V100 | Tesla P100 | Tesla P40 | Tesla P4 |
|---|---|---|---|---|---|
| Intended Use | DL Training / HPC | DL Training / HPC | DL Training / HPC | DL Inference | DL Inference |
| Double Precision | 7.8 TFLOPS | 7 TFLOPS | 4.7 TFLOPS | − | − |
| Single Precision | 15.7 TFLOPS | 14 TFLOPS | 9.3 TFLOPS | 12 TFLOPS | 5.5 TFLOPS |
| Half Precision | 125 TFLOPS | 112 TFLOPS | 18.7 TFLOPS | 47 TFLOPS | 22 TFLOPS |
| Memory Bandwidth | 900 GB/s | 900 GB/s | 732 GB/s / 549 GB/s | 346 GB/s | 192 GB/s |
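The half-precision row above is the basis for the training-oriented comparisons. A minimal sketch of how the table's peak FP16 figures translate into relative throughput, taking Tesla P100 as the baseline (a naive peak-FLOPS ratio only; real training speed also depends on memory bandwidth, interconnect, and software):

```python
# Peak FP16 throughput (TFLOPS) copied from the Tesla comparison table above.
fp16_tflops = {
    "Tesla V100 SXM2": 125,
    "Tesla V100": 112,
    "Tesla P100": 18.7,
    "Tesla P40": 47,
    "Tesla P4": 22,
}

# Express each GPU's peak as a multiple of the P100 baseline.
baseline = fp16_tflops["Tesla P100"]
for gpu, tflops in fp16_tflops.items():
    print(f"{gpu}: {tflops / baseline:.1f}x P100 peak FP16")
```

By this crude measure the Volta-generation V100 SXM2 offers roughly 6.7x the peak FP16 throughput of the Pascal-generation P100, which is consistent with the Tensor Cores introduced in Volta.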