Huawei has launched CloudMatrix 384, an AI chip cluster designed for large-scale model training. The platform networks Ascend 910C processors through optical interconnects to improve energy efficiency and training speed.
While CloudMatrix 384 claims to outperform traditional GPU-based clusters, individual Ascend chips do not yet achieve the performance levels of leading GPUs from Western competitors. As Huawei aims to rival NVIDIA’s market supremacy, the company is expanding its ecosystem of tools to reduce reliance on foreign technologies, particularly under current sanctions.
This includes developing its own AI workflows and products that compete directly with established offerings. To leverage Huawei’s AI infrastructure, data engineers must adapt their workflows to tools built specifically for Ascend processors.
MindSpore, Huawei’s deep learning framework, is at the forefront of this transition. Although any shift to a new framework will introduce new tooling, engineers will find that essential workflows for data ingestion, transformation, and model iteration remain largely transferable.
Moving from PyTorch or TensorFlow to MindSpore involves adapting to differences in syntax, operator behavior, and model training pipelines. The conversion process is facilitated by MindConverter, but engineers should be prepared for manual adjustments, as feature parity is not guaranteed between the frameworks.
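To give a flavor of the renaming involved, the sketch below maps a few common PyTorch constructs to their MindSpore counterparts; this is the kind of substitution MindConverter automates and engineers then finish by hand. The mapping itself reflects MindSpore's public API (e.g. `nn.Dense` in place of `nn.Linear`, and a `construct` method in place of `forward`), while the `translate` helper is purely illustrative.

```python
# Illustrative PyTorch-to-MindSpore name mapping -- pure Python,
# no framework import required. Entries reflect MindSpore's public API.
TORCH_TO_MINDSPORE = {
    "torch.nn.Linear":  "mindspore.nn.Dense",   # fully connected layer
    "torch.nn.Conv2d":  "mindspore.nn.Conv2d",
    "torch.nn.ReLU":    "mindspore.nn.ReLU",
    "torch.nn.Module":  "mindspore.nn.Cell",    # base class for models
    "Module.forward":   "Cell.construct",       # renamed entry point
    "torch.optim.Adam": "mindspore.nn.Adam",    # optimizers live in nn
}

def translate(name: str) -> str:
    """Return the MindSpore name for a PyTorch construct, or the input
    unchanged when no direct equivalent is listed (manual porting needed)."""
    return TORCH_TO_MINDSPORE.get(name, name)
```

Names that fall outside the table are exactly the cases where feature parity breaks down and manual adjustment is required.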
MindSpore exports models in MindIR, an intermediate representation optimized for Ascend NPUs that streamlines cross-platform deployment. Inference is straightforward: load the MindIR model and let the framework’s APIs handle deserialization and memory management.
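A minimal sketch of that export-and-reload round trip, using MindSpore's `export`, `load`, and `nn.GraphCell` APIs on a toy network (the `mindir_path` and `export_and_reload` helpers are my own; the import is guarded so the sketch degrades gracefully where MindSpore is not installed):

```python
def mindir_path(name: str) -> str:
    """MindIR files carry the .mindir suffix, which export() appends."""
    return name if name.endswith(".mindir") else name + ".mindir"

def export_and_reload(name: str = "toy"):
    """Export a toy network to MindIR and reload it for inference.
    Returns the inference output shape, or None when MindSpore is
    not installed (the sketch then serves as documentation only)."""
    try:
        import numpy as np
        import mindspore as ms
        from mindspore import nn, Tensor
    except ImportError:
        return None
    net = nn.Dense(4, 2)                              # toy trained network
    dummy = Tensor(np.zeros((1, 4), np.float32))      # sample input
    ms.export(net, dummy, file_name=name, file_format="MINDIR")
    graph = ms.load(mindir_path(name))                # deserialization handled
    return nn.GraphCell(graph)(dummy).shape           # wrap graph, run inference
```

Because the serialized graph is self-describing, the loading side needs no Python definition of the original network class, which is what makes MindIR convenient for deployment.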
Huawei’s CANN (Compute Architecture for Neural Networks) is the foundational layer comparable to NVIDIA’s CUDA, providing libraries and development tools for performance optimization. MindSpore supports two execution modes: GRAPH_MODE, which compiles the network ahead of time for better hardware utilization, and PYNATIVE_MODE, which executes operators immediately and suits prototyping and debugging.
For deployment, ModelArts, Huawei’s cloud-native AI platform, offers a complete AI pipeline, enabling tasks from data preparation to automated model monitoring. Transitioning to Huawei’s ecosystem may require reskilling, particularly for those accustomed to CUDA-based tools.
Despite potential gaps in documentation and library compatibility, particularly outside Huawei’s key markets, the performance benefits can be significant for teams operating within the company’s infrastructure.