With the rise of the AI wave, various applications are emerging one after another. However, the price of AI inference computing power centered around GPUs is soaring and monopolized by large companies, greatly hindering the equal development of various applications, especially creative start-up teams. At the same time, many GPU resources remain idle. NGPU, as a Web3 decentralized GPU computing power network, is dedicated to providing accessible GPU nodes without any entry barriers, offering cost-effective, user-friendly, and stable GPU computing power resources for various AI applications, thereby achieving enterprise-level AI inference services. Simultaneously, it provides idle GPU nodes with opportunities to earn money, fully utilizing every bit of resource.
Compared to other Depin projects that serve as just renting separate GPU computing nodes, NGPU's main innovative technologies include intelligent allocating, pay-per-task GPU usage billing, and efficient HARQ network transmission. These technologies enable real-time perception of AI application load pressure, dynamic adjustment of resource usage, and achieve auto-scaling, security, high reliability, and low latency on GPU nodes with unrestricted access.
During the development of NGPU, we encountered the following major issues:
Instability of Decentralized Computing Nodes: Compared to the reliable service quality of high-grade IDCs built by big companies, GPU nodes with permissionless access might be personal gaming computers, which are highly likely to go offline randomly. To address this, NGPU developed the Smart Allocation framework, which on one hand, monitors the status of each GPU node in real-time and configures redundant nodes besides the working nodes to switch when a working node goes offline; on the other hand, it designed an incentive and staking mechanism to encourage stable online presence.
Various Specifications of Computing Node Networks and Hardware: Facing various specifications of GPU computing nodes, NGPU measures the AI inference capability of the nodes and combines it with the measurement of storage and network bandwidth to form a unified computing power value. This standardizes the node computing power, providing a basis for Allocation and incentives. Additionally, NGPU utilizes HARQ (Hybrid Automatic Repeat reQuest) to optimize the efficiency of public network transmission, achieving over a 100-fold speed improvement under strong network noise, compensating for the network deficiencies of decentralized computing nodes.
Significant Daily Fluctuations in AI Application Loads: Various AI applications, especially in the Web3 field, face load peaks and valleys. Typically, GPU nodes are rented based on peak loads to ensure service stability. However, NGPU calculates costs based on the actual usage of (GPU computing power * duration) through smart allocation, ensuring that every penny spent by the AI application provider goes towards their actual business needs. This not only enhances usage efficiency on the relatively low-cost decentralized GPU power but also significantly reduces GPU computing power costs, achieving fair access to GPU computing power.
Tracks Applied (1)
Discussion