
CloudFabric Builds the Next-Generation DCN for the AI Era


Contents

1 Data Center Network Overview

2 Huawei CloudFabric Solution

3 CE Product Introduction

4 How to Beat
What Is a Data Center?

A data center is the core service-oriented infrastructure that supports an organization's business operations and development. A data center is composed of the following elements:
• Secure network architecture
• Reliable supporting facilities (equipment rooms, generators, UPSs, air conditioners, etc.)
• Consolidated servers/application platforms
• Centralized storage and backup devices
• Unified system management platform
• O&M organization and process for customer services
Enterprise Data Centers Have Undergone Virtualization and Are Moving from the Cloud Era to the AI Era
Data source: IDC report, excluding the US

Virtualization: resource pool-based sharing, improving utilization
• High-density ports and large-buffer switches
• Pool-based management through the SDN controller

Cloud computing (Internet Plus): cloud-based services, optimizing provisioning efficiency
• Association with computing resources
• Interconnection with the cloud platform to implement L2-L7 E2E service provisioning

AI (AI/big data): data value mining, realizing business monetization
• Accelerated distributed storage and AI high-speed computing
• Integration of computing, storage, and data networks

Customer segments in each era: finance, Internet, and government and large enterprise

Reference customers: Sberbank (Russia), Yandex (Russia), BTK (Turkey), China Merchants Bank, BPM (Italy), SEA (Singapore), Royal Thai Police (Thailand), Baidu, Hyundai (South Korea), DBS (Singapore), NBP (South Korea), Siemens (Germany), Bank Mandiri (Indonesia), SB Cloud (Japan), Volkswagen DC (Germany), Ping An (China), Tencent, LG (South Korea)
Customer Requirements on DCNs: Embrace AI for Efficient Deployment, Zero Network Faults, and Low-Cost Evolution

Drivers: digital transformation, service cloudification, AI evolution

Challenges:
• Traditional DCN: Manual configuration and slow service rollout.
• Cloud DCN: Independent deployment but poor deployment efficiency.
Requirement: Deploy network services efficiently and seize business opportunities.

Challenges:
• Traditional O&M: after-the-fact processing and passive response.
• Expertise-reliant manual analysis and lengthy fault location process.
Requirement: Transform from reactive O&M to proactive O&M in order to achieve zero faults.

Challenges:
• Servers are upgraded every three years and the network is upgraded frequently, causing high CAPEX.
• Multi-vendor devices need to be quickly integrated and managed by a unified management system.
Requirement: Smoothly upgrade the system to avoid vendor lock-in and achieve low-cost evolution.

Challenges:
• Ethernet has high latency and packet loss. As a result, AI training duration is long.
Requirement: Build a low-cost network with zero packet loss in the AI era.
Intent-Driven CloudFabric: Application-Centric, Automatic Execution, and Continuous Intent Guarantee

Intent-driven, automatic deployment
• Intelligent identification of intent, improving service rollout efficiency 10-fold
• Intelligent configuration verification, eliminating configuration errors
• Service provisioning speed: hours -> seconds (full-process automation)

Closed-loop verification and intelligent O&M
• Proactive risk prediction and fault detection in seconds
• Application and network quality detection within seconds, and fault location within minutes
• Network fault rate: 68%
• Troubleshooting time shortened from hours to minutes

Open collaboration and quick integration
• Open architecture, and interconnection with more than 20 cloud platforms and VAS devices
• Open APIs and multiple interfaces such as Ansible
• System integration: months -> days (open APIs)

Intelligent and lossless data center network
• Zero packet loss, low latency, and high throughput
• Ultra-broadband, intelligent, lossless network
• AI training time: 40%, compared with traditional Ethernet

Architecture (diagram labels): public cloud, telco cloud, private cloud, Microsoft, NSX; NCE with Intent Engine (intent model), Automation Engine (configuration delivery), Intelligence Engine (big data), and Analytics Engine (data collection); FabricInsight; resource pool; VMs
Intelligent and Lossless Network · Automatic Deployment · Intelligent O&M · Open Collaboration

Ethernet: Congestion Leads to Frequent Packet Loss, Impeding AI Running Efficiency

Traditional networks become a performance bottleneck for AI
• AI applications: data mining, autonomous driving, machine learning, life science
• AI adoption rate: 16% (2015) -> 86% (2025)
• CPU -> GPU -> AI chip: computing speed 100x
• HDD -> SSD -> SCM: storage speed 100x
• Bottleneck encountered in network communications: Network latency is high, CPU capability usage is inefficient, and AI training efficiency is low.

Customer profile service piloted by CMB reveals that InfiniBand and CEE networks do not meet requirements
• Service: big data-based customer profile
• Pilot network solution: InfiniBand and CEE networks
• AI cluster pilot: 540 GPU servers

Summary

Item         InfiniBand                                CEE
Throughput   High                                      Low, failing to meet AI requirements
O&M          Too difficult for existing personnel      Easy
Price        High, double the price of CEE             Low
Other        Exclusive use for the dedicated network   Incorporation into the cloud-network integration solution

Conclusion: The InfiniBand and CEE networks are not suitable for CMB's big data services.

AI Fabric Objectives
Build next-generation lossless Ethernet with high throughput and low latency to meet AI service requirements

No packet loss
• When the N:1 traffic model is used in HPC and distributed storage scenarios, network congestion causes severe packet loss and seriously reduces service efficiency.
• In the same scenario, AI Fabric reduces congestion-induced packet loss and ensures efficient and stable transmission.

Low latency
• The parallel processing mechanism is used in HPC and distributed storage. Each node performs computing and data access simultaneously.
• AI Fabric provides extremely low latency to reduce the FCT of HPC and tail latency of distributed storage, improving I/O throughput.

High throughput
• 25GE/100GE high-performance servers are widely used as computing and storage nodes. The higher the server capability, the more intuitively it reflects the requirement for bandwidth on the network.
• AI Fabric requires high bandwidth to ensure the throughput for large data transmission of lossless applications.

Topology (diagram labels): Spine-1 ... Spine-m; AI Fabric; TOR-1, TOR-2, ... TOR-n
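
The N:1 effect described on this slide can be illustrated with a toy queue model. This is a minimal sketch, not the AI Fabric mechanism: the function name `simulate_incast`, the buffer size, and the per-tick rates are all illustrative assumptions. It only shows why a finite buffer overflows once several senders converge on one egress port.

```python
def simulate_incast(senders, pkts_per_sender_per_tick, drain_per_tick,
                    buffer_pkts, ticks):
    """Toy model of N senders converging on one egress port.

    All parameters are hypothetical. Returns (delivered, dropped) packets.
    """
    queue = 0
    delivered = 0
    dropped = 0
    for _ in range(ticks):
        arrivals = senders * pkts_per_sender_per_tick
        free = buffer_pkts - queue
        accepted = min(arrivals, free)
        dropped += arrivals - accepted      # tail drop once the buffer is full
        queue += accepted
        sent = min(queue, drain_per_tick)   # the egress port drains at a fixed rate
        queue -= sent
        delivered += sent
    return delivered, dropped


if __name__ == "__main__":
    # 1:1 traffic fits the port; 8:1 traffic overflows the same buffer.
    print(simulate_incast(1, 10, 10, 100, 50))
    print(simulate_incast(8, 10, 10, 100, 50))
```

In this model the delivered count is capped by the drain rate in both runs; only the 8:1 run loses packets, which is the congestion behavior that a lossless design has to prevent.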

AI Fabric: Using Huawei Proprietary Congestion Control Algorithms to Build a Large-Scale Lossless Network

• VIQ scheduling algorithm (packet loss and fairness; who backpressure signals are sent to): Packet loss is eliminated in the switching matrix, and the tail latency is controlled.
• Flowlet (traffic balancing): The NIC generates a flowlet based on the congestion status. The switch selects the packet forwarding path based on the interface's buffer size and bandwidth usage, and creates a flow table to ensure that packets are sent in the correct order.
• Optimized congestion control mechanism, dynamic ECN (when backpressure signals are sent): The ECN threshold is dynamically configured based on traffic characteristics while considering the throughput and latency.
• Fast CNP (who sends backpressure signals): An intermediate device generates a CNP packet according to the destination information of the congested packet and sends the packet to the transmit end through the original ingress of the packet.

Diagram labels: Input_1 ... Input_m, Output_1 ... Output_n, internal feedback; source end, NIC, leaf node; forwarding chip, CPU, FPGA/dedicated CPU; sender (S), receiver (R), normal CNP versus fast CNP
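
The flowlet idea on this slide (pick a path per burst, keep a flow table so packets stay in order) can be sketched generically. The code below is not Huawei's algorithm: it uses the common inactivity-gap definition of a flowlet as a stand-in for the NIC-generated flowlets the slide mentions, and cumulative bytes per path as a stand-in for buffer and bandwidth usage. The class name `FlowletBalancer` and its parameters are hypothetical.

```python
class FlowletBalancer:
    """Generic flowlet load-balancing sketch (illustrative assumptions only)."""

    def __init__(self, num_paths, gap):
        self.load = [0] * num_paths   # bytes sent on each path so far
        self.gap = gap                # idle time that starts a new flowlet
        self.table = {}               # flow_id -> (path, last_seen_time)

    def route(self, flow_id, now, size):
        entry = self.table.get(flow_id)
        if entry is not None and now - entry[1] <= self.gap:
            # Same flowlet: reuse the path so packets cannot be reordered.
            path = entry[0]
        else:
            # New flowlet: choose the currently least-loaded path.
            path = min(range(len(self.load)), key=self.load.__getitem__)
        self.table[flow_id] = (path, now)
        self.load[path] += size
        return path


if __name__ == "__main__":
    lb = FlowletBalancer(num_paths=2, gap=5)
    print(lb.route("A", now=0, size=1000))   # first packet of flow A
    print(lb.route("A", now=1, size=1000))   # same flowlet, same path
    print(lb.route("B", now=2, size=500))    # new flow, other path
    print(lb.route("A", now=10, size=1000))  # idle gap exceeded, may move
```

Re-balancing only at flowlet boundaries is the design point: a flow can move to a less loaded path without its in-flight packets overtaking each other.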

AI Fabric Receives Interop Award and EANTC Certification

• Zero packet loss, reducing latency by up to 44.3%
• Winner of Best of Show Award Grand Prize
• EANTC certification

Baidu: Huawei AI Fabric Realizes Dedicated Network Performance at Ethernet Prices and Improves AI Training Efficiency by 40%

Background
AI is at the core of Baidu's current business. In 2018, Baidu implemented large-scale global deployment of its distributed storage and AI training services. AI applications: facial recognition, autonomous driving, life science, data mining.

Network latency becomes the key bottleneck that affects the training time
• 10 EB video collected in one day
• 500 GPUs and 7 days required for processing

Challenges
• AI training for autonomous driving is slow, with networks as the bottleneck, hampering the L4 GTM plan for 2021.
• SSD replacement does not markedly improve the performance of the distributed storage system, and storage efficiency remains low.

Why Huawei: performance of InfiniBand at Ethernet prices
• Innovative algorithm + dedicated chip: VIQ, dynamic ECN, fast CNP, and other innovative algorithms
• Based on open Ethernet: Lower price compared to InfiniBand and no dedicated technicians required
• Devices (diagram labels): CE12800, CE6865; VIQ1, VIQ2, waterline

Benefits
• Autonomous driving training: 40% training efficiency, 53% TCO
• Distributed storage: 25% IOPS

Manual Configuration Is Too Slow for Provisioning Cloud DC Resources

Traditional DC: Manual configuration is error-prone and takes several months to deploy the network.
With manual configuration, provisioning a single service takes 30+ days:
• Requirement output: one week
• Network configuration: two weeks
• Service commissioning: one week

Cloud DC: IT resources have been virtualized and can be brought online quickly. Separate network deployment restricts the service rollout speed.
Cloud platform (diagram labels):
• Computing virtualization resource delivery platform
• Virtualized storage resource delivery platform
• Virtualized network resource delivery platform?
• VMs, storage, physical network

• Bradesco, one of Brazil's biggest banks, operates nearly 1000 switches in its DC. Hundreds of network changes are made weekly through manual configuration, which is error-prone, inefficient, and requires one to three months to deploy each service.
• UnionPay deployed its cloud platform in 2014. On this platform, service rollout takes more than 10 days, with computing and storage resources taking 4 hours to deploy. Network deployment is the bottleneck, as configurations are carried out manually.

CloudFabric Horizontal Solution Overview: Four Scenario-based Solutions

• Scenario 1: Underlay
• Scenario 2: Connecting to a Third-Party Controller
• Scenario 3: Computing Association (Virtualization)
• Scenario 4: Cloud-Network Integration

Diagram labels: service administrator, network administrator, computing administrator; third-party configuration tool such as Ansible; Microsoft and Huawei Hybrid Cloud Solution; VMware NSX, third-party controller, vRNI interconnection; System Center/vCenter; FusionSphere, OpenStack; SecoManager; CloudEngine Layer 2 VTEP; underlay, network overlay, hybrid overlay

Note: The network overlay provides two modes: centralized and distributed. The distributed mode is recommended, and the centralized mode is no longer being evolved. The hybrid overlay supports only the distributed mode.

Intent-Driven DCN Automation: One-Network Multi-Cloud, One-Click Deployment, and Intent-Driven Loop Closure

Clouds: private cloud, public cloud, telco cloud
Workflow: Intent -> Design -> Conversion -> Pre-event check -> Automatic delivery -> Service verification

One network for multiple clouds, with 10x higher management capacity than the industry average
• Open interconnection with 20+ cloud platforms, and flexible collaboration
• 4200 devices can be managed, achieving smooth evolution

GUI-based drag-and-drop deployment, achieving service rollout in minutes
• Underlay: one-click deployment and rapid network service delivery
• Overlay: intent orchestration and service provisioning in minutes

Closed-loop verification, ensuring error-free service configuration
• Pre-event resource check, preventing delivery failures due to insufficient resources
• Post-event service verification, ensuring that services are correctly delivered
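
The closed loop on this slide (check resources first, deliver, then verify) can be sketched as a small control function. This is a minimal sketch under stated assumptions, not the controller's API: `deliver_with_closed_loop`, the resource dictionaries, and the two callbacks are hypothetical names introduced only for illustration.

```python
def deliver_with_closed_loop(required, available, apply_config, read_back):
    """Pre-check resources, deliver, then verify what was delivered.

    required / available: dicts of resource name -> count (hypothetical).
    apply_config: callable that pushes the config and returns the intended state.
    read_back: callable that returns the state actually present afterwards.
    """
    # Pre-event check: refuse delivery if any resource would be exhausted.
    short = {name: need for name, need in required.items()
             if available.get(name, 0) < need}
    if short:
        return {"status": "rejected", "insufficient": sorted(short)}

    intended = apply_config()

    # Post-event verification: compare the delivered state with the intent.
    if read_back() != intended:
        return {"status": "verification_failed"}
    return {"status": "ok"}


if __name__ == "__main__":
    device = {}

    def apply_config():
        device["vni"] = 5010
        return {"vni": 5010}

    def read_back():
        return dict(device)

    print(deliver_with_closed_loop({"acl": 10}, {"acl": 5}, apply_config, read_back))
    print(deliver_with_closed_loop({"acl": 10}, {"acl": 50}, apply_config, read_back))
```

Rejecting before any change is pushed is what keeps a resource shortage from turning into a half-delivered service.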

Continuously Improve Automatic Deployment Capabilities of IPv6, Multicast, and Microsegmentation Based on the Agile Controller-DCN

Multicast overlay automation
• Automation: Unicast -> Multicast
• Control plane uses the NG-MVPN protocol to transmit multicast routing information
• Devices: CE6880/CE8861
• Customer benefit: The distributed network overlay supports automatic deployment of multicast overlay, conserving bandwidth.
• Typical case: SSE INFONET CO.,LTD (POC test)

Service-centered IPv6 evolution
• Automation: IPv4 -> IPv6
• Cloud network connected to third-party OpenStack
• Customer benefit: Reuse of network devices and O&M experience, smooth evolution, and minute-level IPv6 service deployment
• Typical case: PICC, Bank of China (to be launched)

Microsegmentation implements fine-grained east-west isolation
• Internal zero trust: Security policy -> Microsegmentation
• Virtualization: GBP model on the cloud network based on subnets or discrete IP addresses
• Devices: CE5880/CE6880/CE6857/CE6865
• Customer benefit: East-west security isolation is achieved using IT language instead of network language. Isolation granularity is finer and the dimension is wider.
• Typical case: China UnionPay
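
Grouping endpoints by subnet or by discrete IP address, then applying policy between groups, can be sketched as follows. This is an illustrative model only, not the GBP implementation on the controller or switches: the class `Microsegmentation`, the group names, and the choice to permit traffic inside a group are assumptions made for the example.

```python
import ipaddress


class Microsegmentation:
    """Toy group-based policy: classify by subnet or single IP, then check pairs."""

    def __init__(self):
        self.members = []      # (network, group); a discrete IP is stored as a /32
        self.allowed = set()   # (src_group, dst_group) pairs that may communicate

    def add_member(self, cidr_or_ip, group):
        self.members.append((ipaddress.ip_network(cidr_or_ip), group))

    def allow(self, src_group, dst_group):
        self.allowed.add((src_group, dst_group))

    def group_of(self, ip):
        addr = ipaddress.ip_address(ip)
        matches = [(net.prefixlen, grp) for net, grp in self.members if addr in net]
        # The most specific match wins, so a discrete IP overrides its subnet.
        return max(matches)[1] if matches else None

    def permitted(self, src_ip, dst_ip):
        src, dst = self.group_of(src_ip), self.group_of(dst_ip)
        if src is None or dst is None:
            return False       # assumption: default deny for unclassified endpoints
        # Assumption: traffic inside one group is permitted.
        return src == dst or (src, dst) in self.allowed


if __name__ == "__main__":
    ms = Microsegmentation()
    ms.add_member("10.0.1.0/24", "web")
    ms.add_member("10.0.2.0/24", "db")
    ms.add_member("10.0.1.99", "quarantine")   # discrete IP inside the web subnet
    ms.allow("web", "db")
    print(ms.permitted("10.0.1.5", "10.0.2.7"))
    print(ms.permitted("10.0.1.99", "10.0.2.7"))
```

Policies are written against group names rather than addresses, which is the sense in which the slide contrasts IT language with network language.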

Pre-Event Check and Post-Event Verification Ensure Error-Free Service Configuration

Workflow: Intent -> Design -> Conversion -> Pre-event check -> Automatic delivery -> Post-event verification

Pre-event check, evaluating the impact of service delivery
• Obtain live network resources and evaluate resource status
• Generate the configuration model and verify the configuration logic

Post-event verification, ensuring that services are correctly delivered and operated
• Obtain configurations and ensure data consistency
• Obtain the forwarding status and verify the mutual access relationship

Resource verification (resource availability)
• Resources: ACL, routing, VRF, VXLAN, VNI, BD, Eth-Trunk, ...
• Live network resources: Leaf 1 ACL 99%..., Leaf 2 ACL 80%..., ...

Configuration verification (configuration impact)
• Inputs: the change, the live network configuration, and the network topology
• Configuration modeling of the change:
  [Create vRouter]
  >Create VRF on spine switch
  [Create Subnet]
  >Create VNI and add BDIF
  >Configure BDIF IP
  >Bind BDIF to VRF
  >Configure BDIF as DHCP relay
  ...

Control plane verification: live network configuration (the same [Create vRouter] and [Create Subnet] steps)
Data plane verification: forwarding entries (forwarding status)

Network verification model (diagram labels): VPC, EPG, Subnet, VM, IPAddr, MAC, VNI/BD, linked by 1:n relationships
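
The conversion step on this slide, where each intent expands into device-level configuration steps that can later be checked, can be sketched as a lookup table. The step names are copied from the slide; the table `INTENT_TO_STEPS` and the functions `convert` and `missing_steps` are hypothetical and say nothing about how the controller models this internally.

```python
# Step names come from the slide; the data structure itself is an assumption.
INTENT_TO_STEPS = {
    "Create vRouter": ["Create VRF on spine switch"],
    "Create Subnet": [
        "Create VNI and add BDIF",
        "Configure BDIF IP",
        "Bind BDIF to VRF",
        "Configure BDIF as DHCP relay",
    ],
}


def convert(intents):
    """Expand a list of intents into an ordered list of configuration steps."""
    steps = []
    for intent in intents:
        if intent not in INTENT_TO_STEPS:
            raise ValueError(f"unsupported intent: {intent}")
        steps.extend(INTENT_TO_STEPS[intent])
    return steps


def missing_steps(intents, delivered_steps):
    """Post-event check: expected steps that are absent from the delivered set."""
    delivered = set(delivered_steps)
    return [step for step in convert(intents) if step not in delivered]


if __name__ == "__main__":
    plan = convert(["Create vRouter", "Create Subnet"])
    print(plan)
    print(missing_steps(["Create Subnet"], ["Create VNI and add BDIF"]))
```

Keeping the expected steps as data means the same table drives both delivery and the later consistency check.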

Siemens: GUI-based Drag-and-Drop SDN Solution Shortens the TTM 10-Fold

Background
Siemens MT data center, the Mass Transit test department of Siemens transportation service, is where test tasks of internal and external customers are performed.
• Test data center: 6000 VMs

Challenges
The test tasks change frequently, the workload is heavy, and manual configuration is error-prone (loops, etc.).

Solution comparison (figures shown on the slide: 5 minutes; deployment completed in 18s)
• Competitor's solution: Menu-based UI, complex configuration
• GUI-based drag-and-drop SDN solution: drag-and-drop delivery, WYSIWYG

Why Huawei
• Drag-and-drop service deployment: The GUI greatly simplifies network configuration. Even personnel with no networking expertise can use the SDN controller to configure services.
• Automatic loop detection: SDN-based automatic Layer 2 loop detection and prevention improves network reliability.

Benefits
• TTM improved 10-fold: SDN-based network automation shortens the TTM 10-fold (hours -> minutes).
• Simplified service deployment: The Agile Controller-DCN provides a GUI, making it easy to deploy the new SDN solution.
• Reliable network: Automatic Layer 2 loop detection and prevention eliminate the possibility of network loops interrupting system operation.



CloudFabric SDN Solution Proven for Reliable Commercial Use, with 650 Deployments Worldwide

Regions shown on the map: Western Europe, Russia, Middle East, China, Japan and South Korea, South Pacific

Western Europe
• Finance: Volkswagen Financial Services (Germany)
• ISPs: Aruba (Italy)
• Large enterprises: Volkswagen (Germany), Skoda (Czech Republic)

Russia
• Finance: Sberbank of Russia, Central Bank of Russia, NSPK
• ISPs: Mail.Ru
• Enterprise: Russia Post

Middle East
• Government: Ministry of Interior (Saudi Arabia)

Japan and South Korea
• ISPs: Naver (South Korea) and SB Cloud (Japan)
• M&E: CJ E&M (South Korea)
• Finance: Tong Yang Life (South Korea)

South Pacific
• Large enterprises: Cement plant (Indonesia)
• Finance: Bank Mandiri (Indonesia)

O&M Pain Points: After Production System Is Migrated to Cloud, Post-Event Troubleshooting Unable to Handle Ongoing Service Interruptions

Migration of the production system to the cloud results in intolerance to faults
• Migrate the production system to the cloud for 24/7, secure, and convenient services: cash/investment/wealth management, online banking service, cloud hosting service
• When a DCN fault occurs, the entire network is affected.
• Downtime loss per hour (US$ million): Media 0.09, Healthcare 0.63, Retail 1.1, Manufacturing 1.6, Telecom 2.0, Energy 2.8, Finance 6.48
Source: Network Computing, the Meta Group and Contingency Planning Research

Automation eliminates the need for network black box and traditional O&M methods
• O&M object: physical device -> logical NE
• O&M difficulty: 50x
• 30% of faults can be identified by traditional O&M; 70% of faults cannot be identified by traditional O&M.
• Proportion of abnormal flows in network-wide flows: 3.65%
• Imperceptible abnormal flows: 274,046 per day, 0.3%

Intelligent O&M: Traditional O&M Cannot Identify 70% of Problems on Cloud Data Center Networks

Huawei Dongguan Enterprise Data Center provides IT services for 400 nodes and 180,000 people around the world. There is an average of 96,545,774 flows per POD per day, of which 3,543,230 (3.67%) are abnormal flows.
Xili Data Center of China Merchants Bank (CMB) has 1447 computing nodes and an average of 87,402,813 flows per day, of which 274,046 (0.3%) are link setup failure flows.

Problem Category   Description
Connectivity       Locating unplanned service interruptions
Quality            Network jitter caused by microbursts
Hardware           Hardware fault prediction for optical modules of devices
Policy             Non-compliant policy check
Resource           Device/queue/port anomaly detection

Traditional O&M methods become ineffective in the cloud-based network era: the 'bottom-up' network perspective needs to change to a 'top-down' business perspective.
FabricInsight: Understanding the network status from the application perspective, proactively identifying three types of problems that cannot be solved by traditional O&M, and locating faults in minutes.

Intelligent O&M: Top-down, business perspective (cloud-based network)
O&M method: Correlation between network applications, network paths, and network devices is analyzed, achieving intelligent analysis and location of five types of problems based on Huawei's IT live-network operation practices.

Traditional O&M: Bottom-up, network perspective (static network)
O&M method: Topology management; alarm management; performance management

Diagram labels: control plane (status statistical model), application flow/path (real-time behavior model), database, telemetry (devices, links, ports, and chips)

Intelligent O&M: FabricInsight Solution Architecture

1. Real-time service awareness based on Telemetry: Term Based License (TBL) mode for quasi-real-time service flow awareness.
2. Big data-based network analysis: Tens of billions of data records can be searched in seconds.
3. AI-based analysis of correlation between network applications, network paths, and network devices: various applications such as those related to network connectivity, network performance, network policies, and network resources.
4. Full-flow in-depth analysis capability based on distributed intelligence: Perform on-demand full-flow analysis to implement fault mode matching and root cause analysis.

Architecture (diagram labels): deep fault analysis (connectivity issues, performance issues, policy issues, ...); FabricInsight; query; big data; aggregation; load balancer; filter; collector (SNMP, ERSPAN, gRPC); switch-based collector

FabricInsight Detects Faults in Seconds, Locates Faults in Minutes, and Provides Predictive Maintenance

Obtaining service flows and network KPIs in seconds based on telemetry
Application-based O&M

Visibility of applications and networks, and fault detection in seconds
• Intelligent drawing of application maps, presenting network policy evaluation at the port level
• Association between exceptions and faulty links, identifying faults in seconds

Intelligent edge analysis and fault location in minutes
• On-demand full-flow analysis based on the intelligent chip in switches
• Association of applications, paths, and devices, locating the position of packet loss in minutes

AI-based predictive maintenance, reducing the fault rate by 68%
• Dynamic baselines established through machine learning to identify exceptions
• Proactive prediction of optical module faults

Diagram labels: telemetry from spine, leaf, and server (WEB, APP1, APP2, DB) to the collector and analyzer
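
The dynamic-baseline idea on this slide can be illustrated with a rolling statistical band. This is a deliberately simple stand-in, assuming a rolling mean and standard deviation in place of the machine learning the slide refers to; the class `DynamicBaseline`, the window size, and the threshold `k` are hypothetical choices.

```python
from collections import deque
from statistics import mean, pstdev


class DynamicBaseline:
    """Flag KPI samples that fall outside a rolling mean +/- k*sigma band."""

    def __init__(self, window=20, k=3.0, warmup=5):
        self.history = deque(maxlen=window)   # recent samples judged normal
        self.k = k
        self.warmup = warmup                  # samples needed before judging

    def observe(self, value):
        """Return True if the sample is an exception; otherwise learn from it."""
        anomalous = False
        if len(self.history) >= self.warmup:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            anomalous = abs(value - mu) > self.k * max(sigma, 1e-9)
        if not anomalous:
            # Only normal samples move the baseline, so an outage does not
            # become the new normal.
            self.history.append(value)
        return anomalous


if __name__ == "__main__":
    baseline = DynamicBaseline()
    for latency_ms in [10, 11, 9, 10, 10, 11, 9, 10, 50, 10]:
        print(latency_ms, baseline.observe(latency_ms))
```

Because the band follows recent behavior, the same detector can be reused across links with very different normal levels without hand-tuned static thresholds.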

Locating 75 typical faults in 15 minutes

Scenario remarks

(1) Networking adaptation: Only IPv4 TCP unicast is supported.
(a) Network overlay scenario: Distributed and centralized modes are supported. The hybrid overlay scenario is not supported.
(b) Host overlay and underlay traditional DCN scenarios: CE switches can be used on the Layer 3 underlay but not Layer 2 underlay. Standard VXLAN encapsulation is supported. Other encapsulation formats, such as NVGRE and STT, are not supported.
(c) CE switches in Layer 2 networking (for example, STP networking) are not supported.
(d) Only IP and VXLAN packet encapsulation is supported. Other packet encapsulation formats such as MPLS, PWE3, and TRILL are not supported.
(e) IP address overlapping is not supported, and VXLAN mapping is not supported.
(2) The CE6865 (TD3 chip model) used as the server leaf node supports congestion and packet loss detection. Other models do not support this feature.
(3) ERSPAN and VXLAN cannot be enabled together on the CE12800 equipped with E series cards (Arad chip series).
(4) Version restriction: The CE switch must run V200R003C00 or later.

Physical server management scale:
• Management scale: The initial three analyzers manage 8,000 flows per second. One analyzer needs to be added for each increase of 5,000 flows per second.
• Each collector supports management of 100,000 flows per second. The collector needs to be expanded based on the traffic calculated by analyzers after capacity expansion.

VM management scale:
• Management scale: The initial three analyzers manage 3,000 flows per second. One analyzer needs to be added for each increase of 1,000 flows per second.
• Each collector supports management of 100,000 flows per second. The collector needs to be expanded based on the traffic calculated by analyzers after capacity expansion.

To obtain the capacity planning manual, visit
http://support.huawei.com/enterprise/en/doc/EDOC1100042285?idPath=7919710|21782036|21782103|22620781.
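
The management-scale rules above reduce to simple arithmetic, sketched below. The figures (three initial analyzers, 8,000 or 3,000 flows per second, one more analyzer per 5,000 or 1,000, and 100,000 flows per second per collector) come from the slide; rounding a partial increment up to a whole analyzer is an assumption, and the function names are hypothetical. The capacity planning manual remains the authoritative source.

```python
import math

# Figures from the slide: capacity of the initial three analyzers, and the
# flow increase that requires one additional analyzer.
PHYSICAL = {"base_capacity": 8_000, "step": 5_000}
VM = {"base_capacity": 3_000, "step": 1_000}


def analyzers_needed(flows_per_sec, base_capacity, step):
    """Three analyzers cover base_capacity; add one per `step` above that.

    Assumption: a partial increment still needs a whole extra analyzer.
    """
    extra = max(0, flows_per_sec - base_capacity)
    return 3 + math.ceil(extra / step)


def collectors_needed(flows_per_sec, per_collector=100_000):
    """Each collector handles 100,000 flows per second (figure from the slide)."""
    return max(1, math.ceil(flows_per_sec / per_collector))


if __name__ == "__main__":
    print(analyzers_needed(18_000, **PHYSICAL))  # physical server scenario
    print(analyzers_needed(5_000, **VM))         # VM scenario
    print(collectors_needed(250_000))
```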

China Merchants Bank: Full Path Monitoring, Proactive Fault Detection, and Quick Fault Location

Background
• Top retail bank in Asia Pacific
• 130 million retail customers
• 70 million active app users
• Online requests received 24/7
• Xili Cloud Data Center: 6000 VMs

Challenges
Process: customer complaint -> manual fault detection -> manual packet obtaining and location -> manual isolation
• The network is a black box, passively responding to faults instead of proactively detecting them.
• 10x more NEs, taking hours to locate faults

FabricInsight proactively detects connection exception risks in July 2018
• Proactively identifies problems and finds root cause in 18 minutes
• Slow response to SYN and ACK packets by servers
• 300k abnormal retransmissions of data in one hour
• Clusters involved (diagram labels): channel big data cluster, Kafka cluster

Benefits
• 100% visibility: network-wide full-flow detection
• Zero service interruption: AI-based predictive maintenance, proactively identifying risks
• Fault location in minutes: intelligent association of application flows, paths, and devices

Why Huawei
"FabricInsight helps us manage networks from the service perspective. Each network device is a probe that can perform full-path monitoring on each service flow to proactively detect service problems and quickly locate and demarcate faults."
Mr. Li Yunlong, Network Manager of Information Technology (IT) Department, CMB

Open Collaboration: CloudFabric Is Compatible with Various Types of Resources and Builds an Open Cloud Data Center Ecosystem

Cloud DCN architecture (diagram labels): cloud platform, third-party VAS, KVM, IT resources or VMs

Full-Layer Openness and Compatibility

1. The Agile Controller-DCN can interconnect with mainstream cloud platforms:
• Cloud platforms: OpenStack, FusionSphere, Red Hat, and Mirantis
• Standard Neutron model, providing interconnection with 20+ mainstream OpenStack platforms

2. The Agile Controller-DCN uses an open architecture to achieve compatibility with third-party VAS devices:
• F5 LB
• Check Point, Palo Alto, and Fortinet firewalls

3. CE switches are compatible with third-party controllers:
• Eight fixed CE switches have passed the VMware NSX certification.
• Certified NSX versions: 6.3.6, 6.4.1, 6.4.2, and 6.4.3

4. Full series of CE switches support Ansible:
• Standard API modules and over 60 automation modules
• Ansible script supports Microsoft Azure Stack deployment.

ICBC: Open Platform Provides Dynamic and Elastic Scheduling of VMware and OpenStack Cloud Resources

The world's largest bank by market value

Business requirements
Closed:
• Vendor lock-in, leading to high costs
• Lack of customization capabilities, failing to implement fast innovation
Open:
• Elimination of vendor lock-in
• Customization of bare metal servers to maintain compatibility with legacy systems, improving competitiveness

Benefits of CloudFabric to ICBC
• Resource usage: 15% -> 60%
• Service provisioning: minute-level
• Use the hybrid overlay to build a uniform resource pool, flexibly scheduling computing resources and improving the resource usage
• Open interconnection: The Agile Controller-DCN uses open APIs to interconnect with VMware and OpenStack, achieving flexible service migration and uniform provisioning.

Architecture on the live network
70% of IT systems are not deployed on the cloud, with most cloud-based IT systems deployed on the VMware platform.
1. In the siloed system, resources in different partitions cannot be shared.
2. A closed VMware architecture is used in some cloudified zones.
Diagram labels: Partition 1 ... Partition n (counter system, core system, operation and management); network: manual configuration

Next-generation cloud architecture
• Challenge 1: How can ICBC improve resource usage in independent partitions on the live network?
• Challenge 2: How can ICBC connect to the VMware and OpenStack platforms when they coexist?
Diagram labels: computing and storage resource scheduling, network resource scheduling, VMs


Intelligent and Lossless Network · Automatic Deployment · Intelligent O&M · Open Collaboration · Ultra-broadband Network

DCs Undergo Rapid Growth, and High Capacity and Smooth Evolution Become Key Network Capabilities

DC scale increases rapidly, and new and old servers coexist for a long time.
• Figure: global DCN traffic trend (unit: EB), 2015 to 2020, cloud DC traffic versus traditional DC traffic. DCN traffic triples over five years.
• Figure: server high-speed migration (total market), percent of yearly server shipments by port rate (10, 25, 40, 50, 100, and 200 Gbit/s), 2014 to 2022. Due to fast upgrade of server network adapters, multiple generations of DC servers coexist.

Tencent has stringent requirements on network capacity and interface rates.
• The network scale has increased fivefold over the past three years, and the DCI bandwidth has doubled year-on-year.
• Figure: Tencent IDC cluster plan, 2013 to 2018. Network generations: traditional, virtualized, cloud, AI. Cluster sizes: 5k*GE servers, 20k*10GE servers, 20k*25GE/40GE servers, 20k*100G servers.
• AI drives server port upgrade to 100G, requiring uplink interfaces with a higher rate.
Product Overview: CloudEngine Series Data Center Switches

Core switches
• CE12800: CE12816, CE12808, CE12804
• CE12800S: CE12808S, CE12804S

Virtual switch
• CE1800V

Access switches
• TOR switch with flexible cards: CE8860-4C-EI/CE8861-4C-EI
• 100GE switch: CE8850-64CQ-EI, CE8850-32CQ-EI
• 40GE switch: CE7855-32Q-EI
• 25GE TOR switch: CE6865-48S8CQ-EI, CE6860-48S8CQ-EI
• 10GE TOR switch: CE6857-48S6CQ-EI (new model), CE6855/CE6856-48S6Q-HI, CE6855/CE6856-48T6Q-HI, CE6851-48S6Q-HI, CE6810-48S4Q-LI, CE6810-32T16S4Q-LI, CE6810-24S2Q-LI, CE6880-24S4Q2CQ-EI
• 10GE large-buffer TOR switch: CE6870-48S6CQ-EI, CE6870-24S6CQ-EI, CE6870-48T6CQ-EI, CE6875-48S4CQ-EI (new model)
• GE TOR switch: CE5855-48T4S2Q-EI, CE5855-24T4S2Q-EI, CE5880-48T6Q-EI
• FC/FCoE switch: CE6850U-48S6Q-HI
CE8861: Flexible Cards

Parameter                                CE8861-4C-EI
Interface type                           Flexible cards
Maximum number of devices in a stack     9
Switching capacity                       6.4 Tbit/s
Forwarding performance                   2030 Mpps; line-rate forwarding for 246 bytes or more
Buffer                                   32 MB
Performance specifications               FIB (v4/v6): 380k/256k; MAC: 288k; ARP: 168k

Interface side view (2U). Flexible cards: five types of line cards with different rates
• 16*40GE
• 8*100GE
• 24 x 10GE electrical interfaces + 2 x 100GE
• 24 x 25GE optical interfaces + 2 x 100GE
• 24*25GE/16GE FC+2*100GE

Front panel view: double fan trays (two fan modules in each tray), 1+1 power redundancy

• 25GE access switch with flexible cards, supporting VXLAN and BGP EVPN
• 4 card slots, 5 types of cards, flexible combination, building flexible and high-density access and aggregation layers (Cisco does not provide models with flexible cards)
• Hardware BFD, minimum 3.3-ms packet sending interval (test feature)
• Telemetry, INT (IOAM), and ERSPAN enhancement
• Microsegmentation
• AI Fabric (dynamic ECN, fast CNP, VIQ, and DLB)

Replaces CE8860
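The 3.3 ms hardware BFD sending interval quoted for this switch only becomes a failure detection time once a detect multiplier is chosen. The sketch below assumes the standard BFD asynchronous-mode rule (detection time = detect multiplier x transmit interval, per RFC 5880) and an illustrative multiplier of 3; the slide states neither, and the function name is hypothetical.

```python
def bfd_detection_time_ms(tx_interval_ms, detect_multiplier=3):
    """Approximate BFD failure detection time in milliseconds.

    Assumption: asynchronous mode, where a session is declared down after
    `detect_multiplier` consecutive packets are missed. The default of 3
    is illustrative and is not taken from the slide.
    """
    return tx_interval_ms * detect_multiplier


# 3.3 ms is the minimum sending interval quoted on the slide.
print(round(bfd_detection_time_ms(3.3), 1))
```

With these assumptions, a 3.3 ms interval gives a detection time of roughly 10 ms.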
CE6857: TOR Switch with 10GE Downlink and 100GE Uplink

Parameter                                CE6857-48S6CQ-EI
Interface type                           Downlink: 48*10GE SFP+ optical interface
                                         Uplink: 6*100GE QSFP28 or 6*40GE QSFP+
Maximum number of devices in a stack     16
Switching capacity                       2.16 Tbit/s
Forwarding performance                   2030 Mpps; line-rate forwarding for 115 bytes or more
Buffer                                   32 MB
Performance specifications               FIB (v4/v6): 380k/256k; MAC: 288k; ARP: 168k

Interface side view (1U): 48*10GE SFP+, 6*100GE QSFP28
Front panel view: four fan trays (one fan module in each tray), 1+1 power redundancy

• TOR switch with 10GE downlink and 100GE uplink, supporting VXLAN and BGP EVPN
• Hardware BFD, minimum 3.3-ms packet sending interval
• Microsegmentation
• Telemetry and ERSPAN enhancement

For 100GE uplink scenarios in which a large buffer is not required, the CE6870 is used.
CE5880: GE VXLAN TOR Switch

Parameter                                CE5880-48T6Q-EI
Interface type                           Downlink: 44*GE RJ45, 4*10GE RJ45
                                         Uplink: 6*40GE QSFP+ (Of the six 40GE interfaces, only the first two can be split into 10GE interfaces.)
Maximum number of devices in a stack     16
Switching capacity                       648 Gbit/s
Forwarding performance                   406 Mpps; line-rate forwarding for 131 bytes or more
Buffer                                   16.5 MB
Performance specifications               FIB (v4/v6): 128k/64k; MAC: 176k; ARP: 128k

Interface side view (1U): 44*GE RJ45+4*10GE RJ45, 6*40GE QSFP+
Front panel view: double fan trays (two fan modules in each tray), 1+1 power redundancy

• TOR switch with GE downlink and 40GE uplink, supporting VXLAN and BGP EVPN
• Hardware BFD, minimum 3.3-ms packet sending interval (test feature)
• Accurate time synchronization: 1588v2
• Microsegmentation
• Telemetry and ERSPAN enhancement

Sold outside China only.
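The switching capacity figures on the product slides can be cross-checked from the port counts. This is a minimal sketch, assuming the common vendor convention of counting capacity in both directions (twice the sum of all port rates); the slides do not state the convention, and the function and variable names are hypothetical. For the CE8861, four 8*100GE cards are assumed.

```python
def switching_capacity_gbps(ports):
    """Return bidirectional switching capacity in Gbit/s.

    `ports` maps a port rate in Gbit/s to the number of ports at that rate.
    Assumption: capacity is counted in both directions (2 x sum of rates).
    """
    one_way = sum(rate * count for rate, count in ports.items())
    return 2 * one_way


# Port layouts taken from the interface descriptions on the slides.
ce6857 = {10: 48, 100: 6}        # 48*10GE + 6*100GE
ce5880 = {1: 44, 10: 4, 40: 6}   # 44*GE + 4*10GE + 6*40GE
ce8861 = {100: 4 * 8}            # assumption: four 8*100GE flexible cards

for name, ports in [("CE6857", ce6857), ("CE5880", ce5880), ("CE8861", ce8861)]:
    print(name, switching_capacity_gbps(ports), "Gbit/s")
```

Under this convention the results are 2160, 648, and 6400 Gbit/s, which match the 2.16 Tbit/s, 648 Gbit/s, and 6.4 Tbit/s quoted on the slides.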


The CloudFabric Solution Provides Stable and Reliable Operation on
Customers' Production Networks
[Figure: Bank Data Center network architecture: headquarters with service partitions 1 to n, management area, intranet zone, switching core, firewalls (FW), extranet access zone, extranet core zone, Internet zone, and backbone]

ICBC: In October 2016, Huawei won the three-year framework for ICBC. Huawei exclusively built data center cloud networks of ICBC, carrying production services such as quick payment, personal online banking, mobile banking, enterprise online banking, MPP big data, converged e-commerce, and e-purchase.

China Construction Bank:
• October 2017: Daoxianghu DC SDN project
• May 2016: Next-generation phase 3.2 project
• June 2015: Next-generation phase 2 A+ service (online banking and online transaction) and access to midrange computer databases
• December 2013: Wuhan Nanhu DC and desktop cloud
Bank of China:
• November 2018: Xi'an center cloud computing SDN phase II project
• April 2018: Xi'an center cloud computing SDN project, carrying big data and other production services
• August 2017: Beijing intra-city DC, carrying mainframe services
• November 2015: Core production area of the headquarters
• January 2015: Anhui and Xi'an customer center

[Customer name not preserved in the source]:
• October 2017: Huawei was made exclusive supplier for three-year framework and responsible for constructing three DCNs
• May 2016: Production area and operation & management area in Shanghai Center
• November 2015: Core production and big data analysis service areas in Beijing Center
• December 2014: Management area in Shanghai Center
• January 2013: National level-1 bank data center network project


• In the global financial industry, Huawei's CloudFabric solution provides secure and reliable networks for bank customers such as Sberbank, DBS,
Bradesco, Mandiri, and BPM.
• In China, the CloudFabric solution is used in China Merchants Bank, China UnionPay, Haitong Securities, PICC, CPIC, and rural credit cooperatives.
• In the ISP industry, the CloudFabric solution safeguards Alibaba, Tencent, Baidu, SoftBank, Volkswagen private cloud, Yandex, SEA, and NAVER.
CloudFabric Serves 6400+ Global Enterprise DCs
• The market share is No.1 in China and No.3 in the world.
• No.1 in global market share growth rate for four consecutive years.
• Over 20,000 CE12800 switches have been sold around the world, serving 6400+ DCs in 120+ countries.
• Over 650 sets of SDN solutions have been sold around the world.

Gartner's Magic Quadrant
• 2018: Approaching the Leaders Quadrant
• 2017: Challenger

DC SDN: SDN hardware platform leader
AI Fabric obtains the Best of Show Award at Interop
4 How to Beat
Mapping Between Huawei and Cisco DC Switches

Category                      Huawei                                                                 Cisco
Core/Aggregation Switches     CE12800, CE12800S                                                      N9500, N7700, N7000
100GE Aggregation Switches    CE8861/68-EI, CE8850-64CQ-EI, CE8860-EI, CE8850-EI                     N9236C, N9364C, N3232C
40GE Aggregation Switches     CE7855-EI, CE8861/68, CE8860-EI                                        N9300, N9200, N5600, N3200, N3100-V, N3100
25GE TOR Switches             CE6865-EI, CE6860-EI, CE8860-EI                                        N9300, N9200, N36180YC
10GE TOR Switches             CE6880-EI, CE6857-EI, CE6870/75-EI, CE6855/56-HI, CE6810-LI, CE6850U-HI    N9300, N6001, N5600, N5500, N3500, N3100, N3000, N2300, N2200
GE TOR Switches               CE5855-EI, CE5880-EI                                                   N9348G, N3048, N2200

Note: Deep red models support the SDN solution.
Note: Light green models support the ACI solution.
Cisco and Huawei Protocol Mapping

Cisco                                Huawei
MAC Address Table Notification       MAC Trap
UDLD                                 DLDP
EtherChannel                         ETH-Trunk
PVST/PVST+/RPVST+                    MSTP
Private Hosts                        MFF
UDE (Unidirectional Ethernet)        single-fiber
Flex links                           Smart Link
IGRP                                 IBGP/OSPF/ISIS
SVI                                  VLANIF
EIGRP                                EBGP
VTP                                  GVRP
HSRP/HSRPv2                          VRRP
Layer 2 Protocol tunneling           l2protocol-tunnel
CGMP                                 HGMP
REP                                  SEP/RRPP
RGMP                                 PIM Snooping
MC LAG                               E-Trunk
GLBP                                 VRRP
dying gasp                           dying gasp
PVLAN                                MUX-VLAN
vPC                                  E-Trunk
PAGP                                 LACP
VSS                                  CSS
CDP                                  LLDP
NetFlow                              NetStream
CDPv2                                LLDP-MED
MVR                                  MVLAN
TACACS+                              HWTACACS
Auto Install/Smart Install           Auto Config/Easy Operation
TDR                                  VCT (virtual-cable-test)
EnergyWise                           SPM
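When translating a Cisco design into Huawei terms, the mapping above can be kept as a small lookup table. The following is only an illustrative sketch: it carries a subset of the pairs exactly as listed on the slide, and the names `CISCO_TO_HUAWEI` and `huawei_equivalent` are hypothetical.

```python
# Subset of the Cisco -> Huawei pairs listed on the slide.
CISCO_TO_HUAWEI = {
    "UDLD": "DLDP",
    "EtherChannel": "ETH-Trunk",
    "SVI": "VLANIF",
    "HSRP": "VRRP",
    "GLBP": "VRRP",
    "PAGP": "LACP",
    "CDP": "LLDP",
    "NetFlow": "NetStream",
    "TACACS+": "HWTACACS",
    "VSS": "CSS",
    "vPC": "E-Trunk",
}

# Case-insensitive index so "hsrp" and "HSRP" resolve to the same entry.
_INDEX = {name.lower(): target for name, target in CISCO_TO_HUAWEI.items()}


def huawei_equivalent(cisco_feature):
    """Return the Huawei counterpart listed on the slide, or None if unlisted."""
    return _INDEX.get(cisco_feature.strip().lower())


print(huawei_equivalent("HSRP"))
print(huawei_equivalent("netflow"))
```

A feature missing from the table returns `None`, which makes gaps in a migration checklist easy to spot.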
Cisco: Two Incompatible Architectures Hinder Smooth Evolution to Cloud DCs

Evolution phases: Traditional DC > vDC > Cloud DC
Key points: fabric infrastructure construction, automatic service deployment, intelligent network O&M

Cisco: two incompatible architectures
• N2K - N7K > N77 > N9K
• N9K ACI: oriented towards vDCs
• VTS: oriented towards cloud DCs, immature
• Tetration: oriented towards application analysis and policy migration

Huawei: one architecture for smooth evolution
• CloudFabric: focusing on smooth evolution from traditional DCs to cloud DCs
• CE6800–CE12800+Agile-Controller+FabricInsight

Cisco ACI lacks system-level reliability and has limited scalability.
Cisco ACI is designed for the virtualization phase and is applicable for a single DC network.
• Controller: A single cluster node is deployed across DCs, and it becomes unavailable in the event of a heartbeat fault.
• ACI scale: The controller can manage only 300 leaf nodes, limiting the network-wide SDN deployment capability.

Huawei CloudFabric supports system-level service DR and has high scalability.
CloudFabric leads virtualization to multi-cloud evolution and is applicable for multi-DC services.
• Controller: Primary and secondary clusters plus one independent arbitration are deployed across DCs, and VMs can smoothly migrate across DCs.
• CloudFabric scale: The controller can manage 3000 leaf nodes, meeting the expansion requirements over the next 10 years.

Cisco ACI is a private and closed solution, complicating evolution to clouds.
Cisco ACI is oriented towards non-cloudified enterprise virtualization scenarios.
• Closed architecture: Supports only a few cloud platforms, NSX not included.
• Proprietary interfaces: Unified orchestration is not supported for open-source VASs.
  - Private service model and GUI
  - Incapability of Cisco service chaining with open-source VAS plug-ins
• Poor compatibility with the live network: Does not support centralized gateways, integrated border leaf and spine, or automatic provisioning of BM servers.

CloudFabric offers open-source and open architecture for smooth evolution.
CloudFabric leads smooth evolution from traditional DCs to open clouds.
• Open architecture: Supports interconnection with 20+ cloud platforms and NSX.
• Open-source standard APIs: Supports automatic and unified orchestration of VAS network services.
  - Standard OpenStack model APIs
  - Cooperation with third-party VAS (such as F5) plug-ins
• Live network compatibility: Supports centralized and distributed deployment and automatic provisioning of BM servers.

Huawei Enterprise Networking Marketing Support Resources
1. Visit http://e.huawei.com/en, and log in with a partner account.
2. Choose Partners > Marketing Materials Download.
3. Enter Networking Marketing Materials Bookshelf, and start your search.
4. Find and download Huawei Enterprise Networking Marketing Materials Bookshelf.

• Enterprise Networking
• Visio Stencil & Icon
• Hardware Query Tool
• iStack Tool
• Info Query Tool
• PCC&PDA Tool
HUAWEI ENTERPRISE ICT SOLUTIONS A BETTER WAY

Copyright © 2019 Huawei Technologies Co., Ltd. All Rights Reserved.


The information in this document may contain predictive statements including, without limitation, statements regarding the future financial and operating results, future product
portfolio, new technology, etc. There are a number of factors that could cause actual results and developments to differ materially from those expressed or implied in the predictive
statements. Therefore, such information is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any time
without notice.
