RAN14.1 Capacity Monitoring Guide
Issue: Draft A
Date: 2012-07-23
HUAWEI TECHNOLOGIES CO., LTD.
Copyright Huawei Technologies Co., Ltd. 2012. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied. The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: support@huawei.com
Draft A (2012-07-23)
Huawei Proprietary and Confidential Copyright Huawei Technologies Co., Ltd.
RAN14.1 Capacity Monitoring Guide
About This Document

Purpose
Traffic on a mobile telecommunications network, especially a newly deployed network, increases over time. To support the ever-increasing traffic, more and more network resources are required, such as signaling processing resources, transmission resources, and air interface resources. If any type of network resource is insufficient, the key performance indicators (KPIs) decline (for example, the call drop rate increases), and user experience deteriorates. Therefore, real-time resource monitoring, resource bottleneck detection, and timely network expansion are critical to good user experience on a mobile telecommunications network.
This document is intended to help network maintenance personnel monitor UTRAN system capacity and detect network resource bottlenecks.
This document applies to RAN14.1 UTRAN products, including BSC6910 and 3900 series base stations. For RNCs and NodeBs of earlier versions, use this document for reference only.
Organization
This document provides guidelines for preventing resource congestion. When following these guidelines, you must consider intuitiveness and operability based on site requirements. This document consists of the following chapters.
Chapter: Description
1 Network Resource Monitoring Methods: Describes basic concepts associated with network resources, including definitions and monitoring activities.
2 Network Resource Performance Counters: Describes the counters for monitoring various network resources.
3 HSPA-related Resource Monitoring: Describes how to monitor network resources when High Speed Packet Access (HSPA) is enabled.
4 Diagnosis of Problems Related to Network Resources: Describes fault analysis and diagnosis methods that experienced WCDMA network maintenance personnel can use to handle network congestion or overload efficiently.
5 Counter Definitions: Lists all performance counters mentioned in the preceding chapters and provides the definitions of these counters.
Change History
Issue: 1
Date: 2012-07-23
Version: Draft A
Description: This is the Draft A release of RAN14.1.
Prepared By: Inventory Solutions Department
Contents
About This Document
1 Network Resource Monitoring Methods
  1.1 Introduction to Network Resources
    1.1.1 RNC Resources
    1.1.2 NodeB Resources
    1.1.3 Air Interface Resources
  1.2 Resource Monitoring Procedure
2 Network Resource Performance Counters
  2.1 RNC Resources
    2.1.1 Control-Plane Load and User-Plane Load
    2.1.2 Interface Board Load
    2.1.3 RNC Inter-Subrack Bandwidth
  2.2 NodeB Resources
    2.2.1 NodeB CPU Usage
    2.2.2 WMPT/UMPT CNBAP Usage
    2.2.3 CE Usage
    2.2.4 Iub Bandwidth
  2.3 Air Interface Resources
    2.3.1 Uplink Load
    2.3.2 Downlink Load
    2.3.3 Downlink OVSF Code Usage
    2.3.4 Common Channels
3 HSPA-related Resource Monitoring
  3.1 HSDPA
    3.1.1 Power Resources
    3.1.2 OVSF Code Resources
  3.2 HSUPA
    3.2.1 CE Resources
    3.2.2 RTWP
4 Diagnosis of Problems Related to Network Resources
  4.1 Call Congestion in the Basic Call Flow
  4.2 Call Congestion Counters
    4.2.1 Counters Related to Paging Loss
    4.2.2 Counters Related to RRC Congestion
    4.2.3 Counters Related to RAB Congestion
  4.3 Signaling Storms and Solutions
  4.4 Resource Usage Analysis
    4.4.1 RNC Resource Usage Analysis
    4.4.2 NodeB Resource Usage Analysis
    4.4.3 Air Resource Usage Analysis
5 Counter Definitions
1 Network Resource Monitoring Methods
Two methods are available for monitoring network resources and detecting resource bottlenecks:
Prediction-based monitoring
This method monitors various network resources simultaneously. You can monitor the usage of a network resource (for example, the downlink transmit power of a cell), compare the detected resource usage with a preset upper threshold, predict the resource usage trend and impact, and determine whether to perform network expansion. If the resource usage remains above the preset upper threshold for a long time (for example, a cell remains overloaded during busy hours for several consecutive days), you can split the cell or add carriers for network expansion. This method is used to identify high-load cells and RNCs. It is a conventional means of monitoring resource usage and is easy to implement. This chapter describes how to use this method to monitor network resources.
NOTE
For details about the counters for monitoring network resources, see chapter 2 "Network Resource Performance Counters." For details about HSPA-related resources, see chapter 3 "HSPA-related Resource Monitoring."
Problem-driven analysis
This method is used to analyze decreased network performance counters after network congestion has occurred. This method is more complex than prediction-based monitoring because it requires more analysis instruments and skills. However, this method can maximize how the system utilizes existing resources and eliminate the need for immediate network expansion. For details about this method, see chapter 4 "Diagnosis of Problems Related to Network Resources."
NOTE
Experienced network maintenance engineers can also use other methods to analyze network resource problems.
1.1 Introduction to Network Resources

Network resources that need to be monitored consist of RNC resources, NodeB resources, and air interface resources. Figure 1-1 shows the distribution of these network resources.
Figure 1-1 Distribution of radio resources to be monitored
1.1.1 RNC Resources

The RNC resources that need to be monitored are as follows:
Control-plane and user-plane resources
BSC6910 boards do not differentiate between the control plane and the user plane. A new Evolved General Processing Unit REV:a (EGPUa) board was introduced into the BSC6910 to process both control-plane data and user-plane data. As the traffic volume grows, the control-plane load and user-plane load of the EGPUa board may exceed its planned processing capability and become a system bottleneck.
Interface board resources
Interface boards on the RNC provide various transmission ports and resources to process transport network messages and to enable interaction between RNC internal data and external data. On the interface boards, control-plane overload affects service connection, and user-plane overload causes packets to be discarded.
Inter-subrack bandwidth
The RNC provides the bandwidth for inter-subrack information exchange.
1.1.2 NodeB Resources

The NodeB resources that need to be monitored are as follows:
Channel elements (CEs)
CEs are baseband processing resources managed at the NodeB level. For a newly deployed network, the number of configured CEs is initially small to save capital expenditure (CAPEX). Generally, CE congestion is the type of congestion most likely to occur on the network.
NodeB boards
NodeB boards are classified into the WMPT/UMPT, WCDMA Baseband Processing Unit (WBBP), and Universal Transmission Processing unit (UTRP). The WCDMA Main Processing and Transmission unit (WMPT) or Universal Main Processing and Transmission unit (UMPT) performs signaling processing and resource management. Central processing unit (CPU) overload of the WMPT or UMPT decreases system processing capabilities, which affects NodeB-related KPIs.
Iub transmission resources
Based on the transmission medium, there are two types of transmission: asynchronous transfer mode (ATM) transmission and Internet Protocol (IP) transmission. On an IP transport network, the NodeB and RNC can dynamically adjust uplink and downlink Iub transmission bandwidth. Generally, transmission resource bottlenecks do not result from insufficient capacities of interface boards but from low bandwidth available on the IP transport network.
1.1.3 Air Interface Resources

The air interface resources that need to be monitored are as follows:
Received total wideband power (RTWP)
RTWP is the total power received by a NodeB within a bandwidth, including the receiver noise, external radio interference, and uplink power generated due to traffic. RTWP, which is similar to the received signal strength indicator (RSSI) in the CDMA system, measures uplink load. RSSI measures the total channel power received by a UE in the downlink.
Transmitted carrier power (TCP)
TCP is the full-carrier transmit power in a cell and monitors downlink load. The TCP value is limited by the maximum transmission capability of the power amplifier on a NodeB.
Orthogonal variable spreading factor (OVSF)
OVSF is a downlink code resource. In the downlink, only one OVSF code tree is available for each cell.
Paging channel (PCH)
PCH usage is directly related to the location area code (LAC) plan and PCH state transition. PCH overload decreases the paging success rate.
Random access channel (RACH) and forward access channel (FACH)
The RACH and FACH carry signaling and a small amount of user-plane data. RACH/FACH overload may decrease the access success rate, affecting user experience.
1.2 Resource Monitoring Procedure

This section describes the resource monitoring procedure. This procedure is easy to implement and applicable in most scenarios. For a newly deployed network, you can monitor only one resource. Once you detect that this resource exceeds its upper threshold, check whether other resources exceed their upper thresholds.

If yes, the cell or NodeB is overloaded. Then, perform network expansion.
If no, the cell or NodeB is not necessarily overloaded. In this case, network expansion is not mandatory because the problem might be solved by adjusting parameter configurations.
For example, the CE usage is above 70% but the usage of other resources such as RTWP, TCP, and OVSF codes is within the allowed range. In this case, CE resources are insufficient but the cell is not overloaded. To solve the problem, configure licenses allowing more CEs or add baseband processing boards instead of performing network expansion immediately.
For the RNC, after the load is basically balanced between the control plane and the user plane:
If the CPU usage of the control plane is above 50%, the RNC is overloaded.
If the CPU usage of the user plane is above 60%, the RNC is overloaded.
Other resource usages are not used to determine whether the RNC is overloaded.
Figure 1-2 shows the details.
Figure 1-2 Resource monitoring flowchart
This flowchart applies to most resource monitoring scenarios, except when the system overload is caused by an unexpected event rather than a service increase. To simplify the procedure, unexpected events are not considered in this flowchart. The causes of unexpected events might be located through a comprehensive analysis of various resource bottlenecks. For details about how to locate a resource-related problem, see chapter 4 "Diagnosis of Problems Related to Network Resources."
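The decision logic of this procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the resource labels, and the returned action strings are hypothetical, and any thresholds supplied for resources other than CE usage (70%), control-plane CPU (50%), and user-plane CPU (60%) are placeholders chosen by the caller, not values taken from this section.

```python
# Minimal sketch of the monitoring procedure (hypothetical names throughout).

CP_CPU_THRESHOLD = 50.0  # control-plane CPU usage threshold (%), from this section
UP_CPU_THRESHOLD = 60.0  # user-plane CPU usage threshold (%), from this section


def assess_cell(usage, thresholds, monitored):
    """Suggest an action for one cell or NodeB.

    usage and thresholds map a resource label to a percentage.
    monitored is the label of the single resource being watched.
    """
    if usage[monitored] <= thresholds[monitored]:
        return "within_threshold"
    # The monitored resource is over its threshold: check all other resources.
    others_over = [
        name for name, value in usage.items()
        if name != monitored and value > thresholds[name]
    ]
    if others_over:
        return "expand_network"
    return "adjust_configuration_or_add_resource"


def rnc_overloaded(cp_cpu, up_cpu):
    """Apply the RNC rule once load is balanced between the two planes."""
    return cp_cpu > CP_CPU_THRESHOLD or up_cpu > UP_CPU_THRESHOLD
```

With CE usage at 75% and every other resource below its threshold, `assess_cell` returns the local-fix action rather than network expansion, which mirrors the CE example given in the text.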
2 Network Resource Performance Counters

This chapter provides the performance counters for monitoring the network resources described in chapter 1 and their upper thresholds. These counters indicate the resource usage or load on a UTRAN. Identifying the busy hour is crucial to accurate counter monitoring. There are various methods for identifying the busy hour. The simplest method is recommended, that is, determining the hour during which the most resources are consumed. If the value of a counter during the busy hour exceeds the upper threshold for three consecutive days, consider capacity expansion.
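As an illustration of the busy-hour rule, the following Python sketch picks the hour with the highest usage on each day and checks whether the busy-hour value exceeds a threshold on three consecutive days. The function names and the input layout (one list of hourly usage values per day) are assumptions made for this example only.

```python
# Sketch of busy-hour identification and the three-consecutive-day rule.
# Input layout and function names are illustrative assumptions.


def busy_hour(hourly_usage):
    """Return the index of the hour with the highest resource usage."""
    return max(range(len(hourly_usage)), key=lambda hour: hourly_usage[hour])


def should_consider_expansion(daily_hourly_usage, threshold, days=3):
    """Return True if the busy-hour value exceeds the threshold
    on the given number of consecutive days."""
    consecutive = 0
    for hourly_usage in daily_hourly_usage:
        peak = hourly_usage[busy_hour(hourly_usage)]
        consecutive = consecutive + 1 if peak > threshold else 0
        if consecutive >= days:
            return True
    return False
```

A day on which the busy-hour value falls back below the threshold resets the count, so isolated peaks do not trigger an expansion recommendation.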
2.1 RNC Resources

Table 2-1 lists the counters for monitoring RNC resources.
Table 2-1 Counters for monitoring RNC resources
Resource Type                             Counter                        Upper Threshold
Control-plane load                        VS.SUBSYS.CPULOAD.MEAN         50%
User-plane load                           VS.SUBSYS.CPULOAD.MEAN         60%
CPU usage on the interface board          VS.INT.CPULOAD.MEAN            50%
Forwarding load on the interface board    VS.INT.TRANSLOAD.RATIO.MEAN    70%
2.1.1 Control-Plane Load and User-Plane Load

The control plane processes air interface signaling and transport layer signaling. The control plane may be overloaded due to signaling storms. If the control plane is overloaded, new messages are discarded and new call requests are rejected. This affects end-user experience. The user plane processes and distributes service data.
The BSC6910 uses the EGPUa board to process user-plane data and control-plane data simultaneously. Different types of resources on all EGPUa boards form different resource pools. Control-plane resources form a control-plane resource pool, and user-plane resources form a user-plane resource pool. The number of resources allocated to the control plane and user plane can be adjusted based on service requirements so that load is balanced between the two planes. Consider the control-plane load and user-plane load comprehensively before performing capacity expansion.
Configuration of Control-Plane and User-Plane Sharing

You can run the SET UCPUPFLEXCFG command to set resource sharing policies for the control and user planes. In this command, the FlexCfgMode parameter specifies the dynamic adjustment mode.
Parameter ID: FlexCfgMode
Value Range: FrozenMode, ManulMode, AutoMode
Default Value: FrozenMode
For details about other parameters in the SET UCPUPFLEXCFG command, see BSC6910 UMTS MML Command Reference.
Counters for Monitoring Average CPU Usage on the Control and User Planes
The VS.SUBSYS.CPULOAD.MEAN counter measures the average CPU usage of a subsystem during a measurement period and reflects the CPU load and quality of the subsystem during the measurement period.
The average CPU usage on the control plane is calculated by the following formula:
Average CPU usage on the control plane = Average (VS.SUBSYS.CPULOAD.MEAN measured for all CP subsystems)
The average CPU usage on the user plane is calculated by the following formula:
Average CPU usage on the user plane = Average (VS.SUBSYS.CPULOAD.MEAN measured for all UP subsystems)
The recommended upper thresholds for monitoring the average CPU usage on the control and user planes are 50% and 60%, respectively.
When FlexCfgMode is set to ManulMode or FrozenMode, you are advised to enable dynamic sharing or to adjust the ratio of resources split between processing user plane data and control plane data if the control plane or user plane is overloaded. If the control-plane or user-plane resources cannot meet your traffic model's requirements, perform capacity expansion immediately.
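The two averages can be computed as in the following Python sketch. It assumes that VS.SUBSYS.CPULOAD.MEAN samples have already been exported and tagged with the plane ("CP" or "UP") of the subsystem they belong to. The tagging scheme and the function names are illustrative and are not part of any Huawei tool.

```python
# Sketch: average CPU usage per plane from VS.SUBSYS.CPULOAD.MEAN samples.
# The (plane, load) tuple layout is an assumption for this example.
from statistics import mean

THRESHOLDS = {"CP": 50.0, "UP": 60.0}  # recommended upper thresholds (%)


def plane_average(samples, plane):
    """Average the CPU load (%) of all subsystems that belong to one plane."""
    loads = [load for sample_plane, load in samples if sample_plane == plane]
    if not loads:
        raise ValueError(f"no samples for plane {plane!r}")
    return mean(loads)


def plane_overloaded(samples, plane):
    """Compare the per-plane average with its recommended upper threshold."""
    return plane_average(samples, plane) > THRESHOLDS[plane]
```

For example, two control-plane subsystems at 40% and 70% average to 55%, which is above the 50% control-plane threshold even though one subsystem is lightly loaded.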
2.1.2 Interface Board Load

CPU Usage on the Interface Board
The control-plane load on an interface board is indicated by the board CPU usage. The load includes forwarding load and session load. An RNC can house multiple interface boards. If an interface board is overloaded, adjust the transmission bandwidth or add an interface board to expand system capabilities.
The counters for monitoring the CPU usage on the interface board are as follows:
VS.INT.CPULOAD.MEAN
This counter measures the average CPU usage of an interface board as a percentage.
Average CPU usage for session load
This value is calculated by the following formula:
Average CPU usage for session load = VS.INT.CFG.INTERWORKING.NUM/(Number of established or released sessions per second x 60 x SP) x 100%
where
VS.INT.CFG.INTERWORKING.NUM is the number of call setup attempts on an interface board.
SP is the measurement period (in minutes).
The number of established or released sessions per second is subject to system specifications: 5000 for the GOUc and FG2c boards and 50,000 for the EXOUa board.
It is recommended that an interface board be added when the CPU usage or session load of an interface board exceeds 50%.
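The session load calculation can be sketched as follows. The sketch reads the denominator of the formula as the number of sessions the board can handle during the whole measurement period (sessions per second x 60 x SP, with SP in minutes), which is an interpretation of the formula above. The function and dictionary names are hypothetical.

```python
# Sketch: average CPU usage for session load on an interface board.
# Assumes the denominator is the board's session capacity over the period.

SESSIONS_PER_SECOND = {  # system specifications quoted in this section
    "GOUc": 5000,
    "FG2c": 5000,
    "EXOUa": 50000,
}


def session_load_percent(interworking_num, board_type, sp_minutes):
    """Return session load (%) from VS.INT.CFG.INTERWORKING.NUM."""
    capacity = SESSIONS_PER_SECOND[board_type] * 60 * sp_minutes
    return interworking_num / capacity * 100.0


def board_needs_expansion(cpu_load_percent, session_load):
    """Apply the 50% rule to either the CPU usage or the session load."""
    return cpu_load_percent > 50.0 or session_load > 50.0
```

For a GOUc board and a 60-minute measurement period, 9,000,000 call setup attempts correspond to a session load of 50%.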
Forwarding Load on the Interface Board

The VS.INT.TRANSLOAD.RATIO.MEAN counter measures the average forwarding load on the user plane of the interface board as a percentage. It is recommended that an interface board be added when the CPU usage for forwarding load exceeds 70%.
2.1.3 RNC Inter-Subrack Bandwidth

Introduction
Ports on the SCU boards form a trunk group, which connects the Main Process Subrack (MPS) and Extension Process Subrack (EPS). If the inter-subrack bandwidth approaches the overload threshold, voice and data service quality deteriorates, network KPIs decline, and the system becomes unstable.
If active and standby SCUa boards are used, the inter-subrack bandwidth is 4 Gbit/s. If active and standby SCUb boards are used, the inter-subrack bandwidth is 40 Gbit/s. If the active or standby SCUa/SCUb board is faulty, the inter-subrack bandwidth decreases by half.
Counters for Monitoring Inter-Subrack Traffic

The counters for monitoring inter-subrack traffic are as follows:
Frame Peak Utility Ratio
This counter measures the peak usage of inter-subrack traffic and is calculated by the following formula:
Frame Peak Utility Ratio = VS.Frame.Flux.Peak.TxRate/Inter-subrack bandwidth x 100%
where VS.Frame.Flux.Peak.TxRate is the peak traffic volume transmitted between subracks.
If the value of Frame Peak Utility Ratio is greater than 60%, a prewarning is required.
Frame Mean Utility Ratio
This counter measures the average usage of inter-subrack traffic and is calculated by the following formula:
Frame Mean Utility Ratio = VS.Frame.Flux.Mean.TxRate/Inter-subrack bandwidth x 100%
where VS.Frame.Flux.Mean.TxRate is the average traffic volume transmitted between subracks.
If the value of Frame Mean Utility Ratio is greater than 40%, a prewarning is required.
Frame DropPackets Ratio
This counter measures the packet loss rate between subracks and is calculated by the following formula:
Frame DropPackets Ratio = VS.Frame.Flux.DropPackets/VS.Frame.Flux.TxPackets x 100%
where
VS.Frame.Flux.DropPackets is the number of packets discarded during packet transmission between subracks.
VS.Frame.Flux.TxPackets is the number of packets transmitted between subracks.
If the value of Frame DropPackets Ratio is greater than 0.01%, a prewarning is required.
If the prewarning threshold is reached or an inter-subrack bandwidth congestion alarm is reported, contact Huawei engineers for problem handling.
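The three inter-subrack ratios and their prewarning thresholds can be evaluated together, as in the following Python sketch. It assumes that the traffic rates and the inter-subrack bandwidth are expressed in the same unit. The function names and the returned labels are illustrative.

```python
# Sketch: inter-subrack prewarning checks (hypothetical helper names).
# Rates and bandwidth must use the same unit, for example Gbit/s.

PEAK_LIMIT = 60.0   # Frame Peak Utility Ratio prewarning threshold (%)
MEAN_LIMIT = 40.0   # Frame Mean Utility Ratio prewarning threshold (%)
DROP_LIMIT = 0.01   # Frame DropPackets Ratio prewarning threshold (%)


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return numerator / denominator * 100.0


def frame_prewarnings(peak_tx_rate, mean_tx_rate, drop_packets, tx_packets, bandwidth):
    """Return the names of the ratios that exceed their prewarning threshold."""
    warnings = []
    if percent(peak_tx_rate, bandwidth) > PEAK_LIMIT:
        warnings.append("Frame Peak Utility Ratio")
    if percent(mean_tx_rate, bandwidth) > MEAN_LIMIT:
        warnings.append("Frame Mean Utility Ratio")
    if tx_packets > 0 and percent(drop_packets, tx_packets) > DROP_LIMIT:
        warnings.append("Frame DropPackets Ratio")
    return warnings
```

Because the bandwidth halves when an active or standby SCU board is faulty, the same traffic can cross a prewarning threshold after a board failure. Pass the reduced bandwidth in that case.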
2.2 NodeB Resources

2.2.1 NodeB CPU Usage
On a network with a large number of smartphones, the main control board, baseband processing board, and extended transmission board may be overloaded due to signaling storms. If the CPU of a NodeB board is overloaded, more signaling messages are discarded, which may result in no response to new call requests. The VS.BRD.CPULOAD.MEAN counter measures the average CPU usage of a NodeB board as a percentage. If the NodeB CPU usage is higher than 60%, you are advised to perform capacity expansion by adding a board, splitting the NodeB, or deploying a new NodeB.
2.2.2 WMPT/UMPT CNBAP Usage

The WMPT or UMPT board processes signaling messages and manages resources for other boards. If the WMPT/UMPT board is overloaded, a radio link (RL) fails to be set up, or no response to an RL setup request is received. This significantly decreases KPIs such as the radio resource control (RRC) connection setup success rate and the radio access bearer (RAB) setup success rate. For a Huawei NodeB, Common NodeB Application Protocol (CNBAP) is used to assess the WMPT/UMPT processing capability. The CNBAP usage is calculated by the following formula:
CNBAP usage
where

VS.IUB.AttRLSetup is the number of Iub RL setup requests in a cell.
VS.IUB.AttRLAdd is the number of Iub RL addition requests in a cell.
VS.IUB.AttRLRecfg is the number of Iub RL reconfiguration attempts in a cell.
SP is the measurement period (in seconds).
CNBAP capacity of the NodeB depends on the configuration of the WMPT/UMPT, WBBP, and UTRP boards.
If the CNBAP usage of the WMPT/UMPT board is above 60% on a Huawei NodeB, you are advised to perform capacity expansion by adding a WBBP/UTRP board or splitting the NodeB.
2.2.3 CE Usage
Channel elements (CEs) are baseband resources provided by NodeBs. A 12.2 kbit/s voice call consumes one CE resource. If available CE resources are insufficient, a new call request is rejected. The total available CE resources are limited by the hardware and configured licenses. If the WBBP boards provide sufficient CE resources, the available CE resources are limited only by licenses. In this case, expand the licensed capacity. CE resources are managed and shared in a resource group of the NodeB. Separate baseband processing units are used in the uplink and downlink directions of a NodeB. Therefore, uplink and downlink CE resources are managed and used independently of each other.
The number of licensed CE resources is distributed by the M2000. The number of CE resources provided by boards on a NodeB and the number of CE resources provided by boards in an uplink resource group are calculated based on NodeB board configuration and specifications. You can run MML commands to query the NodeB board configuration.
Uplink CE Usage
Since RAN14.0, the CE Overbooking feature has been introduced for uplink CE resources. The counters for monitoring uplink CE usage vary depending on whether CE Overbooking is enabled. If CE Overbooking is enabled, the following counters are used to monitor uplink CE usage:
VS.NodeB.ULCreditUsed.Mean
This counter measures the average uplink NodeB credit resource usage when CE Overbooking is enabled.
NodeB_UL_CE_MEAN_RATIO
This counter measures the uplink CE usage on a NodeB and is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB (VS.NodeB.ULCreditUsed.Mean/2)/UL CE Cfg Number
where
VS.NodeB.ULCreditUsed.Mean is the average uplink NodeB credit resource usage when CE Overbooking is enabled.
UL CE Cfg Number is the number of available uplink CE resources on a NodeB. The number is the smaller of the values for two counters: NodeB License UL CE Number and NodeB Physical UL CE Capacity. NodeB License UL CE Number measures the number of licensed uplink CE resources on a NodeB. NodeB Physical UL CE Capacity measures the number of uplink CE resources provided by boards on a NodeB.
If CE Overbooking is disabled, the following counters are used to monitor uplink CE usage:
NodeB_UL_GRP_CE_MEAN_RATIO
This counter measures the CE usage in an uplink resource group of a NodeB. Each NodeB allows for multiple uplink resource groups. This counter is calculated by the following formula:
NodeB_UL_GRP_CE_MEAN_RATIO = Sum_AllCells_of_ResourceGroup (VS.LC.ULCreditUsed.Mean/2)/UL GRP CE Cfg Number
where
VS.LC.ULCreditUsed.Mean is the average uplink NodeB credit resource usage in a cell.
UL GRP CE Cfg Number is the number of available CE resources in an uplink resource group. The number is the smaller of the values for two counters: NodeB License UL CE Number and NodeB Physical UL group CE Capacity. NodeB Physical UL group CE Capacity measures the number of CE resources provided by boards in an uplink resource group.
NodeB_UL_CE_MEAN_RATIO
This counter measures the uplink CE usage on a NodeB and is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB (VS.LC.ULCreditUsed.Mean/2)/UL CE Cfg Number
where
VS.LC.ULCreditUsed.Mean is the average uplink NodeB credit resource usage in a cell.
UL CE Cfg Number is the number of available uplink CE resources on a NodeB.
The upper threshold for uplink CE usage is 70%. If CE usage is above 70% in an uplink resource group of a NodeB, add a WBBP board to the uplink resource group even if the CE usage is low on the NodeB.
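The uplink CE usage formulas share one pattern: sum the per-cell uplink credit counters, divide each by 2 as the formulas do, and divide the total by the smaller of the licensed and physical CE numbers. The Python sketch below follows that pattern. The function names are hypothetical, and the per-cell values are assumed to have been exported from the relevant counters beforehand.

```python
# Sketch: uplink CE usage for a NodeB or for one uplink resource group.
# Function names and the input layout are assumptions for this example.

UL_CE_THRESHOLD = 70.0  # upper threshold for uplink CE usage (%)


def ul_ce_usage_percent(ul_credits_per_cell, licensed_ce, physical_ce):
    """Return uplink CE usage (%).

    ul_credits_per_cell holds one mean uplink credit value per cell.
    Each value is divided by 2, as in the formulas of this section.
    """
    cfg_number = min(licensed_ce, physical_ce)  # available uplink CEs
    used_ce = sum(credit / 2 for credit in ul_credits_per_cell)
    return used_ce / cfg_number * 100.0


def group_needs_wbbp(group_usage_percent):
    """A resource group above 70% needs a WBBP board,
    regardless of the NodeB-level CE usage."""
    return group_usage_percent > UL_CE_THRESHOLD
```

With three cells reporting 100, 60, and 40 credits, 128 licensed CEs, and 192 physical CEs, the usage is 100/128, that is, 78.125%, which is above the 70% threshold.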
Downlink CE Usage
The NodeB_DL_CE_MEAN_RATIO counter measures the downlink CE usage and is calculated by the following formula:
NodeB_DL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean)/DL CE Cfg Number
where
VS.LC.DLCreditUsed.Mean is the average downlink NodeB credit resource usage in a cell.
DL CE Cfg Number is the number of available downlink CE resources on a NodeB. The number is the smaller of the values for two counters: NodeB License DL CE Number and NodeB Physical DL CE Capacity. NodeB License DL CE Number measures the number of licensed downlink CE resources on a NodeB. NodeB Physical DL CE Capacity measures the number of downlink CE resources provided by boards on a NodeB.
If the downlink CE usage is above 70%, perform capacity expansion by adding a WBBP board.
2.2.4 Iub Bandwidth

Generally, Iub bandwidth needs to be monitored. Based on the transmission medium, there are two types of transmission: ATM transmission and IP transmission. On either an ATM or IP transport network, Huawei RNCs and NodeBs can monitor the average traffic in the uplink and downlink, compare the average traffic with the total Iub bandwidth, and obtain the Iub bandwidth usage. The average Iub bandwidth usage is indicated by RNC counters related to transmission resources. If calls are frequently rejected due to a large number of users on the network, the Iub bandwidth may be insufficient. If this occurs, increase the Iub bandwidth as required. Iub bandwidth is higher than air interface bandwidth. For an IP transport network, you are not advised to monitor the Iub bandwidth when the prediction-based monitoring method is implemented.
The BSC6910 does not support logical ports. The EGPUa board does not support virtual ports (VPs).
2.3 Air Interface Resources

2.3.1 Uplink Load
In a CDMA system, the radio performance of a cell is limited by the received noise. Therefore, the total received interference in a cell is used to measure the uplink cell capability. In a WCDMA system, received total wideband power (RTWP) is used to measure the uplink cell capability. As is the case with the CDMA system, the RTWP value minus the cell background noise is the interference increase that results from a service increase. The interference increase indicates the uplink service increase and is expressed as a percentage. For example, a 3 dB interference increase corresponds to 50% uplink load, and a 6 dB interference increase corresponds to 75% uplink load. Generally, the total uplink received bandwidth is 5 MHz and the background noise is -106 dBm in a WCDMA system. Figure 2-1 shows the relationship between RTWP, interference increase, and uplink load.
Figure 2-1 Relationship between RTWP, interference increase, and uplink load
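The mapping between interference increase and uplink load quoted above (3 dB for 50%, 6 dB for 75%) follows the standard noise-rise relation: load = 1 - 10^(-rise/10). The sketch below applies that relation; the relation itself is not spelled out in this guide, and the function name and the default background noise of -106 dBm are illustrative assumptions.

```python
BACKGROUND_NOISE_DBM = -106.0  # typical WCDMA background noise quoted in the text


def uplink_load_from_rtwp(rtwp_dbm, background_noise_dbm=BACKGROUND_NOISE_DBM):
    """Estimate the uplink load (0.0 to 1.0) from an RTWP measurement."""
    # Interference increase (noise rise) in dB.
    rise_db = rtwp_dbm - background_noise_dbm
    if rise_db <= 0:
        return 0.0  # RTWP at or below the background noise: no measurable load
    # Standard noise-rise relation: rise = 1 / (1 - load) in linear terms.
    return 1.0 - 10.0 ** (-rise_db / 10.0)


for rtwp in (-103.0, -100.0):
    print(f"RTWP {rtwp} dBm -> load {uplink_load_from_rtwp(rtwp):.0%}")
```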
The recommended uplink load threshold is 75%, and the corresponding RTWP value is smaller than -100 dBm. If the RNC uses algorithm 2 during admission control, the value of the UL ENU Ratio counter should be smaller than 75%. UL ENU Ratio measures the ratio of uplink equivalent users to users supported in the cell. Algorithm 2 is the Equivalent Number of Users (ENU) algorithm. If the RTWP value is greater than -100 dBm, the cell is overloaded in the uplink. Generally, if a cell is overloaded or the RTWP value is too large, the cell coverage shrinks, the quality of ongoing services deteriorates, and new service requests may be rejected. The following counters related to RTWP and ENU are used on Huawei RNCs:

VS.MeanRTWP: average RTWP (in dBm) in a cell
VS.MinRTWP: minimum RTWP (in dBm) in a cell
VS.RAC.UL.EqvUserNum: number of uplink equivalent users on all dedicated channels in a cell
UL ENU Ratio: this counter is calculated by the following formula:
UL ENU Ratio = VS.RAC.UL.EqvUserNum/UlTotalEqUserNum
where UlTotalEqUserNum is the maximum number of equivalent users in a cell, which can be set by running the ADD UCELLCAC command.
In some areas, the background noise increases to far higher than -106 dBm due to other interference or hardware problems, such as antennas or feeder connectors of poor quality. In this case, the VS.MinRTWP counter, which measures the RTWP when there is no traffic in a cell, reflects the background noise in the cell. If the VS.MinRTWP value during the idle hour is larger than -100 dBm or smaller than -110 dBm for three consecutive days in one week, there are hardware problems or external interference. Find the cause, and troubleshoot the problems or eliminate the external interference. The recommended uplink load threshold is indicated by the VS.MeanRTWP counter. A cell is considered heavily loaded in the uplink when either of the following is true for two or three days in one week:
The VS.MeanRTWP value during the busy hour remains above -100 dBm, which corresponds to a 6 dB interference increase or 75% load.
The UL ENU Ratio value during the busy hour remains above 75%.
When the cell is heavily loaded, perform capacity expansion by adding a carrier or increasing the UlTotalEqUserNum value.
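The heavy-load rule can be expressed as a simple check over one week of busy-hour samples. This is a sketch under stated assumptions: "two or three days in one week" is read as "at least two days" (adjustable through the hypothetical `min_days` parameter), each condition is counted on its own, and the input layout is invented for illustration.

```python
def is_uplink_heavily_loaded(busy_hour_samples, min_days=2,
                             rtwp_threshold_dbm=-100.0, enu_threshold=0.75):
    """busy_hour_samples: one (mean_rtwp_dbm, ul_enu_ratio) tuple per day of a week."""
    # Count the days on which each condition holds during the busy hour.
    rtwp_days = sum(1 for rtwp, _ in busy_hour_samples if rtwp > rtwp_threshold_dbm)
    enu_days = sum(1 for _, enu in busy_hour_samples if enu > enu_threshold)
    # Either condition holding on enough days marks the cell as heavily loaded.
    return rtwp_days >= min_days or enu_days >= min_days


week = [(-101.0, 0.60), (-99.5, 0.70), (-98.0, 0.80),
        (-102.0, 0.50), (-103.0, 0.40), (-104.0, 0.30), (-105.0, 0.20)]
print(is_uplink_heavily_loaded(week))  # RTWP exceeds -100 dBm on two days
```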
2.3.2 Downlink Load

The downlink capacity of a cell is limited by its total available transmit power, which is determined by the base station power amplifier (PA) and by software settings. When the transmit power in a cell is exhausted, the following occur:

The cell coverage is insufficient.
The data throughput is limited.
The service quality deteriorates.
New call requests may be rejected.
The downlink power consumption in a cell is related to the cell load, UE location, and cell coverage. Any one of the following scenarios leads to more power consumption:

Large cell coverage
Long distance between the UE and the cell center
Heavy cell load
In a WCDMA system, TCP is used to measure the downlink total transmit power in a cell. Four TCP-related counters are used on Huawei RNCs:

VS.MeanTCP: average downlink carrier transmit power in a cell
VS.MaxTCP: maximum downlink carrier transmit power in a cell
VS.MinTCP: minimum downlink carrier transmit power in a cell
VS.MeanTCP.NonHS: average downlink carrier transmit power in a non-HSDPA cell
The downlink cell load is indicated by the TCP usage in the cell. If the TCP usage in a cell remains above 85% of the maximum transmit power, the cell is overloaded in the downlink. The TCP usage in a cell is calculated by the following formula:
TCP usage in a cell = VS.MeanTCP/MaxTxPower x 100%
where MaxTxPower is set by running the ADD UCELLSETUP command, and both power values are expressed in watts.
The following provides two examples of determining whether a cell is overloaded based on the formula.
Downlink Maximum Transmit Power | Upper Threshold for TCP Usage | Upper TCP Threshold That Triggers Cell Overload
20 W (43 dBm) | 85% | 17 W (42.3 dBm)
40 W (46 dBm) | 85% | 34 W (45.3 dBm)
If the TCP usage during the busy hour remains above 85% for three consecutive days in one week, the cell is heavily loaded in the downlink. Then, perform capacity expansion by adding a carrier. Some live networks use hierarchical cell structures with multiple carriers. The cell power settings and the corresponding upper TCP thresholds vary depending on the networking policy and cell service priority.
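The threshold examples above can be reproduced with a short dBm-to-watt conversion. This is a sketch only: it assumes that the TCP usage is the ratio of the mean TCP to MaxTxPower, that both values are available in dBm, and that the ratio is taken in watts. The function names are hypothetical.

```python
import math


def dbm_to_watts(dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** (dbm / 10.0) / 1000.0


def watts_to_dbm(watts):
    """Convert a power level from watts to dBm."""
    return 10.0 * math.log10(watts * 1000.0)


def tcp_usage(mean_tcp_dbm, max_tx_power_dbm):
    """TCP usage as a fraction; the ratio is computed in watts, not in dBm."""
    return dbm_to_watts(mean_tcp_dbm) / dbm_to_watts(max_tx_power_dbm)


def overload_threshold_dbm(max_tx_power_dbm, usage_threshold=0.85):
    """Power level (dBm) at which the usage reaches the given threshold."""
    return watts_to_dbm(dbm_to_watts(max_tx_power_dbm) * usage_threshold)


for max_dbm in (43.0, 46.0):
    print(f"{max_dbm} dBm -> overload above {overload_threshold_dbm(max_dbm):.1f} dBm")
```

Working in watts is the important design point: 85% of 43 dBm is not 36.6 dBm, because dBm is a logarithmic unit.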
2.3.3 Downlink OVSF Code Usage

In a WCDMA system, channels are distinguished by code. For a channel, two types of codes are available: scrambling code and orthogonal variable spreading factor (OVSF) code. In the uplink, each user is allocated a unique scrambling code. In the downlink, each cell is allocated a unique scrambling code, that is, all users in a cell use the same scrambling code. Each user in a cell is allocated a unique OVSF code. In a WCDMA cell, data from different users is distinguished based on the CDMA technique, and all user data is transmitted over the same center frequency at almost the same time. OVSF codes provide perfect orthogonality, minimizing interference between different user data. Figure 2-2 shows an OVSF code tree.
Figure 2-2 OVSF code tree
In the downlink, the maximum spreading factor (SF) is 256. In a cell, only one OVSF code tree is available. In the OVSF code tree, sibling codes are orthogonal to each other, but a code is not orthogonal to its parent or child codes. As a result, once a code is allocated to a user, neither its parent codes nor its child codes can be allocated to any other user. OVSF code resources are limited. If available OVSF codes are insufficient to achieve the desired quality of service (QoS), a new call request may be rejected.
An OVSF code tree can be divided into 4 SF4 codes, 8 SF8 codes, 16 SF16 codes, and so on, up to 256 SF256 codes. Codes with various SFs can be considered as equivalent codes with SF 256. For example, a code with SF 8 is equivalent to 32 codes with SF 256. Based on this equivalence mapping, the OVSF code usage can be calculated for a user or in a cell. Huawei RNCs monitor the average code usage of an OVSF code tree based on the number of equivalent codes with SF 256. The VS.RAB.SFOccupy counter measures the average code usage of an OVSF code tree. The counters for monitoring the OVSF code usage are as follows:
OVSF_Utilization, which is calculated by the following formula:
OVSF_Utilization = VS.RAB.SFOccupy/256
DCH_OVSF_Utilization, which is calculated by the following formula:
DCH_OVSF_Utilization = DCH_OVSF_CODE/256
where DCH_OVSF_CODE is calculated as follows:
DCH_OVSF_CODE = (<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64
+ (<VS.SingleRAB.SF8> + <VS.MultRAB.SF8>) x 32
+ (<VS.SingleRAB.SF16> + <VS.MultRAB.SF16>) x 16
+ (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8
+ (<VS.SingleRAB.SF64> + <VS.MultRAB.SF64>) x 4
+ (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2
+ (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>)
If the DCH_OVSF_Utilization value during the busy hour remains above 70% for three consecutive days in one week, the cell is running short of OVSF codes. Perform capacity expansion by splitting the cell or adding a carrier.
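The SF 256 equivalence mapping makes the DCH_OVSF_Utilization formula easy to compute. The following is a minimal sketch; the dictionary layout, the function name, and the sample counter values are hypothetical.

```python
# Number of SF256-equivalent codes occupied by one code of each SF (256 / SF).
SF256_EQUIVALENT = {4: 64, 8: 32, 16: 16, 32: 8, 64: 4, 128: 2, 256: 1}


def dch_ovsf_utilization(single_rab, mult_rab):
    """single_rab, mult_rab: dicts mapping SF to the VS.SingleRAB.SFn and
    VS.MultRAB.SFn counter values. Returns the utilization as a fraction."""
    dch_ovsf_code = sum(
        (single_rab.get(sf, 0) + mult_rab.get(sf, 0)) * weight
        for sf, weight in SF256_EQUIVALENT.items()
    )
    return dch_ovsf_code / 256


# Hypothetical busy-hour averages for one cell.
single = {128: 20, 32: 4, 8: 1}
mult = {128: 5, 16: 2}
print(f"DCH OVSF utilization: {dch_ovsf_utilization(single, mult):.1%}")
```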
2.3.4 Common Channels

Common channels include paging channels (PCHs) and forward access channels (FACHs). Both the PCH and the FACH are downlink transport channels. The PCH is used to transmit paging messages. The FACH is used to transmit user signaling and a small amount of user data to UEs in the CELL_FACH state. The capacity of PCHs and FACHs is configurable. If the PCH or FACH capacity is insufficient, signaling messages or user data is lost. Common channel resource analysis involves the monitoring of both PCHs and FACHs. If the PCH usage is too high, paging messages may be lost. Paging messages are broadcast across an area identified by location area code (LAC), which is referred to as a LAC area for short. Therefore, improper LAC planning leads to high PCH usage. High FACH usage is mainly caused by two factors:

State transition of a large number of UEs performing packet switched (PS) services
RRC signaling storms
Based on the default parameter settings for Huawei RNCs, the PCH and FACH usages are calculated as follows:
PCH usage
The PCH Physical Channel Utility Ratio counter measures the PCH usage and is calculated by the following formula:
PCH Physical Channel Utility Ratio = VS.UTRAN.AttPaging1/(<SP> x 60 x 5/0.01)
where
VS.UTRAN.AttPaging1 is the number of PAGING TYPE1 messages transmitted in a cell.
SP indicates the measurement period (in minutes).
The basic principles for evaluating PCHs are as follows:
If paging messages are not retransmitted, 5% of them will be lost when the PCH usage reaches 60%.
If paging messages are retransmitted once or twice, 1% of them will be lost when the PCH usage reaches 70%.
Based on the basic principles, you are advised to perform fault diagnosis or replan LAC areas.
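The PCH usage formula can be sketched as follows. The sketch takes SP in minutes, which is what the factor of 60 in the formula implies; treat that unit, the function name, and the sample values as assumptions.

```python
def pch_utility_ratio(att_paging1, sp_minutes):
    """PCH Physical Channel Utility Ratio as a fraction.

    att_paging1: VS.UTRAN.AttPaging1 for the measurement period.
    sp_minutes:  measurement period SP, assumed to be in minutes.
    """
    # Denominator from the formula: SP x 60 x 5 / 0.01.
    capacity = sp_minutes * 60 * 5 / 0.01
    return att_paging1 / capacity


# Hypothetical sample: 900,000 PAGING TYPE1 messages in a 60-minute period.
ratio = pch_utility_ratio(900_000, sp_minutes=60)
print(f"PCH usage: {ratio:.0%}")
```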
FACH usage
The FACH usage is indicated by the FACH Utility Ratio counter. The method for calculating the value of this counter varies depending on whether a standalone secondary common control physical channel (SCCPCH) is used.
If a FACH is carried on a non-standalone SCCPCH, the FACH usage is calculated by the following formula:
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/((60 x <SP> x 168 x 1/0.01) x VS.PCH.Bandwidth.UsageRate x 6/7 + (60 x <SP> x 360 x 1/0.01) x (1 - VS.PCH.Bandwidth.UsageRate x 6/7))
where
VS.CRNCIubBytesFACH.Tx is the traffic transmitted on the FACH on the Iub interface.
VS.PCH.Bandwidth.UsageRate is the PCH bandwidth usage and is calculated as follows:
VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> x SP x 60.0)
In this formula, VS.CRNCIubBytesPCH.Tx measures the traffic transmitted on the PCH on the Iub interface; VS.CRNC.IUB.PCH.Bandwidth measures the PCH bandwidth for the CRNC in a cell.
SP is the measurement period (in minutes).
If a FACH is carried on a standalone SCCPCH, the FACH usage is calculated by the following formula:
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/(60 x <SP> x 360 x 1/0.01)
where
VS.CRNCIubBytesFACH.Tx is the traffic transmitted on the FACH on the Iub interface.
SP is the measurement period (in minutes).
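Both FACH usage formulas can be covered by one function. This is a sketch under stated assumptions: SP is taken in minutes (implied by the factor of 60), the weighting term for the non-standalone case is read as (1 - VS.PCH.Bandwidth.UsageRate x 6/7), and the function and parameter names are hypothetical.

```python
def fach_utility_ratio(fach_tx_bytes, sp_minutes, pch_usage_rate=None):
    """FACH Utility Ratio as a fraction.

    fach_tx_bytes:  VS.CRNCIubBytesFACH.Tx for the measurement period.
    sp_minutes:     measurement period SP, assumed to be in minutes.
    pch_usage_rate: VS.PCH.Bandwidth.UsageRate (0.0 to 1.0) when the FACH
                    shares an SCCPCH with the PCH; None for a standalone SCCPCH.
    """
    bits_sent = fach_tx_bytes * 8
    # 60 x SP x 1/0.01, the common factor in both formulas.
    periods = 60 * sp_minutes / 0.01
    if pch_usage_rate is None:
        capacity = periods * 360
    else:
        pch_share = pch_usage_rate * 6 / 7
        capacity = periods * 168 * pch_share + periods * 360 * (1 - pch_share)
    return bits_sent / capacity


print(f"standalone: {fach_utility_ratio(2_025_000, 15):.0%}")
print(f"shared:     {fach_utility_ratio(2_025_000, 15, pch_usage_rate=0.7):.1%}")
```

The shared case always yields a higher ratio for the same traffic, because part of the channel capacity is consumed by paging.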
Ratio of users on the FACH to users allowed to use the FACH
This ratio is indicated by the FACH UserNum Utility Ratio counter. The method for calculating the value of this counter varies depending on the value of the FACH_60_USER_SWITCH parameter in the SET UCACALGOSWITCH command.
If FACH_60_USER_SWITCH is set to OFF, this counter is calculated by the following formula:
FACH UserNum Utility Ratio = VS.FACHUEs/30 x 100%
where VS.FACHUEs is the number of users on the FACH.
If FACH_60_USER_SWITCH is set to ON, this counter is calculated by the following formula:
FACH UserNum Utility Ratio = VS.FACHUEs/60 x 100%
where VS.FACHUEs is the number of users on the FACH.
If the FACH usage or the ratio of users on the FACH to users allowed to use the FACH reaches 70%, you are advised to modify the parameter settings.
3 HSPA-related Resource Monitoring
HSPA includes High Speed Downlink Packet Access (HSDPA) and High Speed Uplink Packet Access (HSUPA). HSDPA and HSUPA protocols are part of the WCDMA standard. HSPA includes techniques such as fast scheduling, adaptive modulation, and hybrid automatic repeat request (HARQ) to achieve high speed data transmission. HSPA carries PS services. Conversational services are prioritized over PS services, and therefore HSPA uses network resources only after conversational services are served. This chapter describes how to monitor network resources when HSPA is enabled.
3.1 HSDPA
3.1.1 Power Resources

Figure 3-1 illustrates how the downlink transmit power in a cell is allocated. The dashed line indicates the total downlink transmit power in a cell.
Figure 3-1 Dynamic power resource allocation
The downlink transmit power in a cell is divided into four portions:

Power for CCH: This portion of power is allocated to common transport channels (CCHs) in the cell, such as the broadcast channel, pilot channel, and paging channel.
Power margin: This portion of power cannot be allocated. The power margin is reserved to ensure that the system can remain stable even if the UE position or environment changes.
Power for DPCH: This portion of power is allocated to real-time services (voice and video calls) and PS R99 services, and it varies with the number and location of users. The RNC and UE can adjust power for DPCH based on the power control algorithm.
Power for HSPA: This portion of power is allocated to HSDPA and is calculated by the following formula:
HSDPA User Power = Max Cell Transmit Power - (Power for CCH + Power Margin + Power for DPCH)
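The HSDPA power formula can be illustrated with a few lines of arithmetic. The sketch assumes that all values are in watts and clamps the result at zero when DPCH traffic leaves nothing over; the function name and the sample values are hypothetical.

```python
def hsdpa_user_power_w(max_cell_power_w, cch_power_w, power_margin_w, dpch_power_w):
    """Power left for HSDPA, in watts, after the other three portions."""
    remaining = max_cell_power_w - (cch_power_w + power_margin_w + dpch_power_w)
    # The remainder cannot be negative; zero means HSDPA gets no power.
    return max(remaining, 0.0)


# Hypothetical 20 W cell: 4 W for CCH, 2 W margin, 6 W for DPCH.
print(hsdpa_user_power_w(20.0, 4.0, 2.0, 6.0))
```

Because the DPCH portion varies with the number and location of users, the HSDPA portion shrinks as R99 traffic grows.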
HSPA power schedulers are designed primarily to maximize power efficiency. For an HSDPA-enabled cell, you still need to monitor the TCP to check whether the cell is overloaded in the downlink. TCP thresholds for this cell are the same as those for a cell without HSDPA. With HSDPA, downlink power overload affects HSDPA performance before it affects conversational services.
3.1.2 OVSF Code Resources

OVSF code resources can be shared between HSDPA services and real-time services. The system can dynamically reallocate OVSF codes to HSDPA services and real-time services based on parameter settings, such as the number of codes reserved only for HSDPA and the number of codes that can be shared. The parameter settings can be modified online based on the network plan. When HSDPA is enabled, OVSF code resources are monitored in the same way as when HSDPA is not enabled.
You can reduce high OVSF code usage by modifying the parameter settings.
Figure 3-2 OVSF code sharing
3.2 HSUPA
3.2.1 CE Resources
HSUPA channels are dedicated channels, and resource consumption of HSUPA services is measured by CEs. When HSUPA is enabled, uplink CE resources are shared between R99 services and HSUPA services. HSUPA increases uplink throughput and improves user experience. However, HSUPA consumes additional uplink CE resources for HARQ and soft handovers. Therefore, uplink CE resources may become a system bottleneck. Uplink CE usage needs to be monitored when HSUPA is enabled. Huawei NodeBs support dynamic HSUPA CE resource management, which uses CE resources efficiently.
3.2.2 RTWP
Just as HSDPA makes the most of downlink power resources, HSUPA maximizes uplink capacity. HSUPA data can always be sent in authorized mode unless the RTWP rises 6 dB above the background noise. HSUPA increases uplink cell throughput but causes a large increase in the uplink RTWP. The uplink RTWP is monitored in the same way regardless of whether HSUPA is enabled. If RTWP overload occurs, HSUPA service rates must be lowered to ensure the QoS of conversational services.
4 Diagnosis of Problems Related to Network Resources

The preceding chapters describe the basic methods for monitoring network resources. These methods can be used to resolve most problems caused by high resource usage. In certain scenarios, further problem diagnosis is required to determine whether high resource usage is caused by a traffic increase or exceptions. This chapter describes how to diagnose problems related to network resources, including the following topics:

4.1 Call Congestion in the Basic Call Flow
4.2 Call Congestion Counters
4.3 Signaling Storms and Solutions
4.4 Resource Usage Analysis
This chapter is intended for experts who have a deep understanding of WCDMA networks.
4.1 Call Congestion in the Basic Call Flow

When network resources are insufficient, KPIs related to system accessibility are likely to be affected first. Figure 4-1 shows the basic call flow in which possible block points and failure points are marked. This flow uses a mobile-terminated call as an example. For details about the call flow, see 3GPP TS 25.931.
Figure 4-1 Call flowchart showing possible block and failure points
The call flow in Figure 4-1 works as follows:
1. The CN sends a paging message to the RNC.
2. Upon receiving the paging message, the RNC broadcasts the message on the PCH. If the PCH is congested, the RNC may discard the message, as shown by block point 1.
3. The UE probably cannot receive the paging message or connect to the network, as shown by failure point 2.
4. If the UE receives the paging message, it sends an RRC connection request message to the RNC.
5. Upon receiving the RRC connection request message, the RNC may discard the message if the RNC is congested, as shown by block point 3.
6. If the RNC receives the RRC connection request message and does not discard the message, the RNC determines whether to accept or reject the request. The request may be rejected due to insufficient resources, as shown by block point 4.
7. If the RNC accepts the request, the RNC instructs the UE to set up an RRC connection. However, the RRC connection setup may fail, the UE probably cannot receive the instruction, or the UE receives the message but finds the configuration incorrect, as shown by failure points 5 and 6.
8. If the RRC connection is set up, the UE sends non-access stratum (NAS) messages to negotiate with the CN about service setup. If the CN determines to set up a service, the CN sends an RAB assignment request to the RNC.
9. Upon receiving the RAB assignment request, the RNC accepts or rejects the request based on the resource usage on the RAN side. If the RNC rejects the request, the failure is shown by block point 7.
10. If the RNC accepts the RAB assignment request, the RNC initiates a radio bearer (RB) setup procedure. During the procedure, the RNC allocates resources as follows:
The RNC allocates transmission resources over the Iub interface by setting up an RL to the NodeB.
The RNC allocates channel resources over the Uu interface by sending an RB setup message to the UE.
A failure may occur in the RL or RB setup process, as shown by failure points 8 and 9.
4.2 Call Congestion Counters

As shown in Figure 4-1, call congestion may affect paging, RRC connection setup, or RAB setup. The following describes the performance counters and KPIs related to call congestion rate. For details about call congestion counters, see chapter 5 "Counter Definitions." You can also refer to the following documents:

BSC6910 UMTS Performance Counter Reference in the BSC6910 UMTS Product Documentation
NodeB Performance Counter Reference in the 3900 Series WCDMA NodeB Product Documentation
4.2.1 Counters Related to Paging Loss

Counters related to paging loss are monitored at the RNC and cell levels. The RNC-level counters related to paging loss are as follows:

VS.RANAP.CsPaging.Loss: measures the number of CS paging messages discarded due to Iu-interface flow control, CPU overload, or PCH congestion.
VS.RANAP.PsPaging.Loss: measures the number of PS paging messages discarded due to Iu-interface flow control, CPU overload, or PCH congestion.
IU Paging Congestion Ratio (RNC): measures the ratio of discarded paging messages to total paging messages at the RNC level and is calculated by the following formula:
IU Paging Congestion Ratio (RNC) = [(VS.RANAP.CsPaging.Loss + VS.RANAP.PsPaging.Loss)/(VS.RANAP.CsPaging.Att + VS.RANAP.PsPaging.Att)] x 100%
The cell-level counters related to paging loss are as follows:

VS.RRC.Paging1.Loss.PCHCong.Cell: measures the number of paging messages discarded due to PCH congestion in a cell.
IU Paging Congestion Ratio (Cell): measures the ratio of discarded paging messages to total paging messages at the cell level and is calculated by the following formula:
IU Paging Congestion Ratio (Cell) = (VS.RRC.Paging1.Loss.PCHCong.Cell/VS.UTRAN.AttPaging1) x 100%
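The two paging congestion ratios can be computed as follows. This is a minimal sketch: the function names and sample values are hypothetical, and returning 0.0 when no paging attempts were recorded is a convention chosen here, not something the guide specifies.

```python
def iu_paging_congestion_ratio_rnc(cs_loss, ps_loss, cs_att, ps_att):
    """RNC-level ratio in percent, from the VS.RANAP.*Paging.Loss/.Att counters."""
    attempts = cs_att + ps_att
    if attempts == 0:
        return 0.0  # no paging attempts in the period
    return (cs_loss + ps_loss) / attempts * 100.0


def iu_paging_congestion_ratio_cell(pch_cong_loss, att_paging1):
    """Cell-level ratio in percent, from VS.RRC.Paging1.Loss.PCHCong.Cell
    and VS.UTRAN.AttPaging1."""
    if att_paging1 == 0:
        return 0.0
    return pch_cong_loss / att_paging1 * 100.0


print(iu_paging_congestion_ratio_rnc(5, 15, 1000, 3000))
print(iu_paging_congestion_ratio_cell(12, 2400))
```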
4.2.2 Counters Related to RRC Congestion

The counters related to RRC congestion are as follows:
Counters for measuring the number of rejected RRC connection setup requests, including:
VS.RRC.Rej.ULPower.Cong (due to uplink power congestion)
VS.RRC.Rej.DLPower.Cong (due to downlink power congestion)
VS.RRC.Rej.UL.CE.Cong (due to uplink CE congestion)
VS.RRC.Rej.DL.CE.Cong (due to downlink CE congestion)
VS.RRC.Rej.ULIUBBand.Cong (due to uplink Iub bandwidth congestion)
VS.RRC.Rej.DLIUBBand.Cong (due to downlink Iub bandwidth congestion)
VS.RRC.Rej.Code.Cong (due to downlink code resource congestion)
VS.RRC.AttConnEstab.Sum
This counter measures the total number of RRC connection setup requests in a cell.
Vs.RRC.Block.Rate
This counter measures the call congestion rate and is calculated by the following formula:
Vs.RRC.Block.Rate = Total RRC Rej/VS.RRC.AttConnEstab.Sum x 100%
where Total RRC Rej is calculated as follows:
Total RRC Rej = <VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.UL.CE.Cong> + <VS.RRC.Rej.DL.CE.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.Code.Cong>
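The RRC block rate can be derived from a dictionary of exported counter values, as sketched below. The dictionary layout and the function name are hypothetical, and treating missing counters as zero is an assumption made for the sketch.

```python
RRC_REJECT_COUNTERS = (
    "VS.RRC.Rej.ULPower.Cong",
    "VS.RRC.Rej.DLPower.Cong",
    "VS.RRC.Rej.UL.CE.Cong",
    "VS.RRC.Rej.DL.CE.Cong",
    "VS.RRC.Rej.ULIUBBand.Cong",
    "VS.RRC.Rej.DLIUBBand.Cong",
    "VS.RRC.Rej.Code.Cong",
)


def rrc_block_rate(counters):
    """Return the RRC block rate in percent for one cell."""
    attempts = counters.get("VS.RRC.AttConnEstab.Sum", 0)
    if attempts == 0:
        return 0.0  # no setup requests in the period
    # Total RRC Rej: sum of all congestion-related rejection counters.
    total_rej = sum(counters.get(name, 0) for name in RRC_REJECT_COUNTERS)
    return total_rej / attempts * 100.0


sample = {
    "VS.RRC.AttConnEstab.Sum": 2000,
    "VS.RRC.Rej.ULPower.Cong": 10,
    "VS.RRC.Rej.Code.Cong": 30,
}
print(f"RRC block rate: {rrc_block_rate(sample):.1f}%")
```

Keeping the per-cause counters separate before summing also shows which resource is the bottleneck, which is the point of the analysis in this chapter.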
4.2.3 Counters Related to RAB Congestion

The counters related to RAB congestion are as follows:
Counters for measuring the number of failed RAB establishments due to power congestion, including:
VS.RAB.FailEstabCS.ULPower.Cong
VS.RAB.FailEstabCS.DLPower.Cong
VS.RAB.FailEstabPS.ULPower.Cong
VS.RAB.FailEstabPS.DLPower.Cong
Counters for measuring the number of failed RAB establishments due to uplink CE congestion, including:
VS.RAB.FailEstabCS.ULCE.Cong
VS.RAB.FailEstabPS.ULCE.Cong
Counters for measuring the number of failed RAB establishments due to downlink CE congestion, including:
VS.RAB.FailEstabCs.DLCE.Cong
VS.RAB.FailEstabPs.DLCE.Cong
Counters for measuring the number of failed RAB establishments due to downlink code resource congestion, including:
VS.RAB.FailEstabCs.Code.Cong
VS.RAB.FailEstabPs.Code.Cong
Counters for measuring the number of failed RAB establishments due to Iub bandwidth congestion, including:
VS.RAB.FailEstabCS.DLIUBBand.Cong
VS.RAB.FailEstabCS.ULIUBBand.Cong
VS.RAB.FailEstabPS.DLIUBBand.Cong
VS.RAB.FailEstabPS.ULIUBBand.Cong
VS.RAB.AttEstab.Cell
This counter measures the total number of RAB establishment requests in a cell.
VS.RAB.Block.Rate
This counter measures the RAB congestion rate and is calculated by the following formula:
VS.RAB.Block.Rate = Total number of failed RAB establishments regardless of the cause of failure/VS.RAB.AttEstab.Cell
4.3 Signaling Storms and Solutions

In most cases, a smartphone makes more than 10 PS RAB setup attempts during a busy hour. This leads to more signaling exchanges on a smartphone than on a common UE during the service process. The additional signaling exchanges consume a large amount of signaling processing resources on the control plane of the RNC and NodeB. Figure 4-2 shows the process for analyzing signaling storms.
Figure 4-2 Process for analyzing signaling storms
Table 4-1 describes solutions to signaling storms. These solutions aim to reduce signaling load so that the network capacity does not need to be expanded immediately.
Table 4-1 Signaling storm causes and solutions
UE Behavior | UE Type | Solution
The UE does not send signaling connection release indication (SCRI) messages. | Common Nokia/Samsung/Motorola UEs | Enable the CELL_PCH function to reduce signaling traffic because these common UEs do not send SCRI messages.
The UE sends SCRI messages that do not contain the cause value. | iPhone (R6) | Enable the enhanced fast dormancy (EFD) function on the RNC, and add the international mobile equipment identities (IMEIs) of the iPhones to the whitelist.
The R8 UE sends SCRI messages containing the cause value. | iPhone4 (after R6) | Enable the R8 FD function on the RNC, and add the IMEIs of the iPhones to the whitelist.
4.4 Resource Usage Analysis

Figure 4-3 illustrates the general troubleshooting process for resource usage issues.
Figure 4-3 General troubleshooting process
In most cases, an abnormal KPI triggers the troubleshooting process. Determining the possible top N problem cells facilitates follow-up troubleshooting. You are advised to analyze accessibility KPIs to identify the system bottleneck that causes access congestion.
Figure 4-4 Key points for bottleneck analysis
4.4.1 RNC Resource Usage Analysis

Analysis of High CPU Usage of the Control Plane
Smartphones cause signaling storms on live networks. Therefore, signaling processing capability on the control plane is most likely to become a system bottleneck. If the control-plane load exceeds a preset alarm threshold, the RNC starts flow control and discards some RRC connection request messages or paging request messages. From the perspective of maintenance, the CPU usage of the control plane must be kept strictly below the alarm threshold to ensure system security.
Figure 4-5 Process for analyzing the CPU usage of the control plane
As shown in Figure 4-5, if the CPU usage of the control plane is above 50%, identify the causes to prevent a continuous increase in the CPU usage. If the high CPU usage is caused by signaling storms, determine whether the parameter settings are correct. If the high CPU usage is caused by a traffic increase, add an EGPUa board. Control-plane and user-plane sharing adjusts the load on EGPUa boards to balance the average load on the control and user planes. NodeB Management (NBM) load cannot be shared between the control and user planes. Therefore, you must perform dynamic cell reallocation to balance the NBM load among EGPUa boards in the control-plane resource pool. After the traffic model changes, reduce the number of cells under the RNC or add an EGPUa board when both of the following conditions are met:

Control-plane and user-plane sharing is enabled.
The load is imbalanced between the control and user planes. For example, the control-plane load is lighter than the user-plane load, but the available control-plane resources are insufficient to add a NodeB or cell.
Analysis of High CPU Usage of the User Plane

If the user-plane load on an EGPUa board or interface board is heavy, the RNC discards some user-plane data. Therefore, user-plane resources must be strictly monitored.
Figure 4-6 Process for analyzing the CPU usage of the user plane
If the CPU usage of the user plane is above 60%, identify the causes to prevent a continuous increase in the CPU usage. If the high CPU usage is caused by a traffic increase, add an EGPUa board.
4.4.2 NodeB Resource Usage Analysis

CE Usage Analysis
CE resources form a resource pool, which is shared among services on the NodeB. Common channels do not consume extra CE resources because the NodeB has reserved CE resources for these channels. Signaling, which is carried on an associated channel of the DCH, does not consume extra CE resources. When CE Overbooking is enabled, the NodeB calculates the CE usage of all admitted UEs at the cell group and NodeB levels, and reports the CE usage to the RNC every measurement period. The RNC then performs admission control based on the reported CE usage.
Table 4-2 Number of CE resources consumed by different services
Service Type | Number of Consumed CE Resources in the Uplink | Number of Consumed CE Resources in the Downlink
AMR 12.2 kbit/s | 1 | 1
CS 64 kbit/s | 3 | 2
PS 64 kbit/s | 3 | 2
PS 128 kbit/s | 5 | 4
PS 144 kbit/s | 5 | 4
PS 384 kbit/s | 10 | 8
SF32 (HSUPA) | 1 | N/A
SF16 (HSUPA) | 2 | N/A
SF8 (HSUPA) | 4 | N/A
SF4 (HSUPA) | 8 | N/A
2 x SF4 (HSUPA) | 16 | N/A
2 x SF2 (HSUPA) | 32 | N/A
2 x SF2+2 x SF4 (HSUPA) | 48 | N/A
2 x M2+2 x M4 | 64 | N/A
NOTE
The preceding table assumes that signaling radio bearers (SRBs) are carried over HSUPA. If an SRB is carried on an R99 DCH separately, an extra CE is consumed by the SRB. In this case, add 1 to each number listed in the preceding table.
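The per-service CE figures in Table 4-2 can be used to estimate the CE demand of a given service mix. The sketch below is illustrative only: the dictionary keys reuse the table labels, the function name and the sample mix are hypothetical, and services marked N/A in the downlink are counted as consuming no downlink CEs. Only a subset of the HSUPA rows is included to keep the example short.

```python
# Uplink and downlink CE consumption per service, taken from Table 4-2.
UL_CE = {
    "AMR 12.2 kbit/s": 1, "CS 64 kbit/s": 3, "PS 64 kbit/s": 3,
    "PS 128 kbit/s": 5, "PS 144 kbit/s": 5, "PS 384 kbit/s": 10,
    "SF32 (HSUPA)": 1, "SF16 (HSUPA)": 2, "SF8 (HSUPA)": 4, "SF4 (HSUPA)": 8,
}
DL_CE = {
    "AMR 12.2 kbit/s": 1, "CS 64 kbit/s": 2, "PS 64 kbit/s": 2,
    "PS 128 kbit/s": 4, "PS 144 kbit/s": 4, "PS 384 kbit/s": 8,
}


def ce_demand(service_counts, table):
    """Total CEs for a mix given as {service label: number of connections}."""
    # Services absent from the table (N/A) are assumed to consume no CEs.
    return sum(table.get(service, 0) * n for service, n in service_counts.items())


mix = {"AMR 12.2 kbit/s": 30, "PS 384 kbit/s": 2, "SF4 (HSUPA)": 1}
print("UL CEs:", ce_demand(mix, UL_CE), "DL CEs:", ce_demand(mix, DL_CE))
```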
HSDPA services do not consume downlink R99 CE resources. HSUPA services and R99 services share uplink CE resources. CE congestion or routine CE usage monitoring may trigger CE resource analysis. If the CE usage remains above a preset capacity expansion threshold or CE congestion occurs, the CE resources are insufficient and must be increased to ensure system stability.
Figure 4-7 Process for analyzing CE congestion
CE resources can be shared within a resource group. Therefore, CE usage on the NodeB must be calculated to determine whether CE congestion occurs in a resource group or on the NodeB. If CE congestion occurs in a resource group, reallocate CE resources between resource groups. If CE congestion occurs on the site, perform site capacity expansion and modify parameter settings.
Iub Bandwidth Usage Analysis

NOTE
After IP RAN is introduced, Iub resources no longer need to be monitored. This section is retained to provide a complete solution so that operators can compare solutions provided by different vendors.
If insufficient Iub bandwidth causes congestion, check the Iub bandwidth usage. If the Iub bandwidth usage remains above 80% for a period of time, the Iub bandwidth is insufficient. If no more Iub bandwidth is available or the issue is not urgent, decrease the activity factor for PS services to admit more users. The activity factor, which is the ratio of actual bandwidth occupied by a UE to the allocated bandwidth, is used to estimate the real bandwidth usage. The activity factor can be set on a per-NodeB basis. The default activity factor is 0.7 for voice services and 0.4 for PS best effort (BE) services.
Figure 4-8 Process for analyzing Iub bandwidth congestion
4.4.3 Air Resource Usage Analysis

Power Resource Analysis
If the RTWP or TCP value is greater than its preset threshold, power congestion occurs. If the power congestion occurs in the downlink, enable the load reshuffling (LDR) and overload control (OLC) functions. If the power congestion occurs in the uplink, check whether any interference exists. This is because in most cases, interference rather than a traffic increase causes uplink power congestion. If the RTWP value remains above -97 dBm for a period of time, identify root causes and troubleshoot the problem. If the high RTWP is caused by heavy traffic instead of signaling storms, implement the following:

Enable LDR and OLC for temporary troubleshooting.
Add carriers or split cells for a long-term solution.
Figure 4-9 Process for analyzing power resource usage
NOTE
RRC-related counters are updated based on the cause values in the RRC connection requests. This facilitates analysis of risks related to signaling storms.
Generally, adding a carrier is the most effective means for relieving uplink power congestion. If no additional carrier is available, consider deploying a new site or splitting the cell by reducing the tilt angle of the antenna.
Code Resource Usage Analysis

For Huawei RNCs, a certain number of codes can be reserved for HSDPA services by setting parameters. However, if five SF16 codes are reserved for HSDPA services, code congestion may occur under high traffic. The only solution to code congestion is to add carriers or split sectors because code resources are strongly correlated to hardware. In some scenarios, massive signaling exchanges on the network occupy a large number of codes, which results in code congestion, power congestion, or CPU overload. In these scenarios, analyze and pinpoint the exact cause, and rectify the fault rather than expand capacity immediately. If code congestion occurs, perform the following operations to reduce system load before expanding capacity:

Reduce the maximum number of PS RABs.
Enable code-based LDR.
Reduce the minimum number of codes reserved for HSDPA services.
Activate the license for dynamic code allocation on the NodeB.
The thresholds for parameters involved in the preceding operations must be set based on the operator's requirements for service quality.
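The impact of reserving codes for HSDPA can be illustrated in SF256 equivalents, the unit used by the DCH OVSF Ratio formula in Table 5-1 (one code with spreading factor n occupies 256/n SF256 codes). The following is a minimal sketch; the function names are illustrative, and codes occupied by common channels are ignored for simplicity.

```python
from typing import Dict

SF256_CODES_PER_TREE = 256  # one OVSF code tree in SF256 equivalents
VALID_SF = (4, 8, 16, 32, 64, 128, 256)


def sf256_equivalents(sf: int, count: int = 1) -> int:
    """Number of SF256 codes occupied by `count` codes of spreading factor sf."""
    if sf not in VALID_SF:
        raise ValueError(f"unsupported spreading factor: {sf}")
    return count * (256 // sf)


def dch_ovsf_utilization(codes_in_use: Dict[int, int]) -> float:
    """DCH OVSF ratio: weighted code usage divided by 256.

    codes_in_use maps a spreading factor to the number of codes in use
    (single-RAB plus multi-RAB counts for that spreading factor).
    """
    used = sum(sf256_equivalents(sf, n) for sf, n in codes_in_use.items())
    return used / SF256_CODES_PER_TREE


def hsdpa_reserved_share(reserved_sf16_codes: int) -> float:
    """Share of the code tree taken by SF16 codes reserved for HSDPA."""
    return sf256_equivalents(16, reserved_sf16_codes) / SF256_CODES_PER_TREE
```

With five SF16 codes reserved, hsdpa_reserved_share(5) gives 0.3125, so under this simplified model just under a third of the code tree is unavailable to DCH services before any traffic is carried.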
PCH Usage Analysis

In most cases, PCHs are overloaded because improper LAC planning results in excess paging messages. A common measure for PCH overload is to replan LAC areas.
Figure 4-10 Process for analyzing PCH usage
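PCH load can be quantified with the PCH Physical Channel Utility Ratio from Table 5-1, VS.UTRAN.AttPaging1/(60 x 60 x 5/0.01). The sketch below assumes that the denominator is the number of paging records the PCH can carry in one hour (3600 seconds, one 10 ms TTI every 0.01 s, five records per TTI) and that the counter value covers a one-hour measurement period. Table 5-1 gives only the formula, so this reading is an assumption, and the function name is illustrative.

```python
SECONDS_PER_HOUR = 60 * 60
TTIS_PER_SECOND = 100   # 1 / 0.01 s, taken from the Table 5-1 formula
RECORDS_PER_TTI = 5     # factor 5 in the Table 5-1 formula

# Denominator of the Table 5-1 formula: 60 x 60 x 5 / 0.01
PCH_HOURLY_CAPACITY = SECONDS_PER_HOUR * TTIS_PER_SECOND * RECORDS_PER_TTI


def pch_utility_ratio(att_paging1: int) -> float:
    """PCH utility ratio for a one-hour value of VS.UTRAN.AttPaging1."""
    if att_paging1 < 0:
        raise ValueError("counter value must not be negative")
    return att_paging1 / PCH_HOURLY_CAPACITY
```

Under these assumptions, 900,000 paging attempts in the busy hour correspond to a ratio of 0.5.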
FACH Usage Analysis

FACH congestion is less likely to occur when UE state transition is disabled. However, the RNC usually enables UE state transition to transfer low-traffic services to FACHs. This saves radio resources, but increases traffic on FACHs. Two methods are available for relieving FACH congestion:
Reduce the period during which PS services are carried on FACHs to enable fast UE state transition to the CELL_PCH state or idle mode. In addition, set up RRC connections on DCHs if DCH resources are sufficient.
Add an SCCPCH to carry FACHs.
Figure 4-11 Process for analyzing FACH usage
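FACH load can be estimated with the FACH Utility Ratio and FACH UserNum Utility Ratio formulas in Table 5-1. The sketch below is a minimal illustration under three assumptions that the table does not state explicitly: SP is the measurement period in minutes, the second weight in the non-standalone formula is 1 - VS.PCH.Bandwidth.UsageRate x 6/7 (so that the two weights sum to 1), and the constants 168 and 360 are bits per 10 ms TTI. Function and parameter names are illustrative.

```python
from typing import Optional

TTIS_PER_SECOND = 100  # 1 / 0.01 s, from the Table 5-1 formula


def pch_bandwidth_usage_rate(pch_bytes_tx: float, pch_bandwidth: float,
                             sp_minutes: float) -> float:
    """VS.PCH.Bandwidth.UsageRate as defined in Table 5-1."""
    return pch_bytes_tx / (pch_bandwidth * sp_minutes * 60.0)


def fach_utility_ratio(fach_bytes_tx: float, sp_minutes: float,
                       pch_usage_rate: Optional[float] = None) -> float:
    """FACH utility ratio.

    Pass pch_usage_rate for a non-standalone SCCPCH (FACH shares the
    channel with PCH); leave it as None for a standalone SCCPCH.
    """
    bits_sent = fach_bytes_tx * 8
    ttis = 60 * sp_minutes * TTIS_PER_SECOND
    if pch_usage_rate is None:
        return bits_sent / (ttis * 360)
    weight = pch_usage_rate * 6 / 7
    capacity = ttis * 168 * weight + ttis * 360 * (1 - weight)
    return bits_sent / capacity


def fach_user_utility_ratio(fach_ues: float, sixty_user_switch: bool) -> float:
    """VS.FACHUEs divided by 60 if FACH_60_USER_SWITCH is ON, else by 30."""
    return fach_ues / (60 if sixty_user_switch else 30)
```

Both ratios are returned as fractions; multiply by 100 to express them as percentages as in Table 5-1.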
5 Counter Definitions
Table 5-1 defines all performance counters mentioned in the preceding chapters.
Table 5-1 Counters

>>Blocking Metrics

Counter Name: Call block rate
Associated Counter: Vs.Call.Block.Rate (custom)
Calculation Formula: Vs.RRC.Block.Rate + (<RRC.SuccConnEstab.sum>/(<VS.RRC.AttConnEstab.CellDCH> + <VS.RRC.AttConnEstab.CellFACH>)) x Vs.RAB.Block.Rate

Counter Name: RRC block rate
Associated Counter: Vs.RRC.Block.Rate (custom)
Calculation Formula: (<VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.ULCE.Cong> + <VS.RRC.Rej.DLCE.Cong> + <VS.RRC.Rej.Code.Cong>)/<VS.RRC.AttConnEstab.Sum>

Counter Name: RAB block rate
Associated Counter: Vs.RAB.Block.Rate (custom)
Calculation Formula: (<VS.RAB.FailEstabCS.ULPower.Cong> + <VS.RAB.FailEstabCS.DLPower.Cong> + <VS.RAB.FailEstabPS.ULPower.Cong> + <VS.RAB.FailEstabPS.DLPower.Cong> + <VS.RAB.FailEstabCS.ULCE.Cong> + <VS.RAB.FailEstabPS.ULCE.Cong> + <VS.RAB.FailEstabCs.DLCE.Cong> + <VS.RAB.FailEstabPs.DLCE.Cong> + <VS.RAB.FailEstabCs.Code.Cong> + <VS.RAB.FailEstabPs.Code.Cong> + <VS.RAB.FailEstabCS.DLIUBBand.Cong> + <VS.RAB.FailEstabCS.ULIUBBand.Cong> + <VS.RAB.FailEstabPS.DLIUBBand.Cong> + <VS.RAB.FailEstabPS.ULIUBBand.Cong>)/VS.RAB.AttEstab.Cell
Counter Name: Call Attempt Times
Associated Counter: VS.RAB.AttEstab.Cell (custom)
Calculation Formula: (<VS.RAB.AttEstCS.Conv.64> + <VS.RAB.AttEstab.AMR> + <VS.RAB.AttEstabPS.Conv> + <VS.RAB.AttEstabPS.Str> + <VS.RAB.AttEstabPS.Inter> + <VS.RAB.AttEstabPS.Bkg>)

>>Power metrics

Counter Name: R99_TCP_Utilization_Ratio
Associated Counter: VS.MeanTCP.NonHS
Calculation Formula: VS.MeanTCP.NonHS/Configured_Total_Cell_TCP (43 dBm or 46 dBm)

Counter Name: Total_TCP_Utilization_Ratio
Associated Counter: VS.MeanTCP
Calculation Formula: VS.MeanTCP/Configured_Total_Cell_TCP

Counter Name: Max UL RTWP
Associated Counter: VS.MaxRTWP
Calculation Formula: VS.MaxRTWP

Counter Name: Mean UL RTWP
Associated Counter: VS.MeanRTWP
Calculation Formula: VS.MeanRTWP

Counter Name: Min UL RTWP
Associated Counter: VS.MinRTWP
Calculation Formula: VS.MinRTWP

Counter Name: UL ENU Ratio
Associated Counter: VS.RAC.UL.EqvUserNum
Calculation Formula: VS.RAC.UL.EqvUserNum/UlTotalEqUserNum

>>IUB Metrics

Counter Name: IUB BW Utilization ratio
Associated Counter: NODEB_Throughput (custom), NODEB_Trans_Cap (custom)
Calculation Formula: NODEB_Throughput/NODEB_Trans_Cap

Counter Name: NODEB_Trans_Cap
Associated Counter: VS.IPDLTotal.1, VS.IPDLTotal.2, VS.IPDLTotal.3, VS.IPDLTotal.4
Calculation Formula: (VS.IPDLTotal.1 + VS.IPDLTotal.2 + VS.IPDLTotal.3 + VS.IPDLTotal.4)

Counter Name: NODEB_Throughput
Associated Counter: NODEB_Throughput_DL (custom), NODEB_Throughput_UL (custom)
Calculation Formula: MAX(NODEB_Throughput_DL, NODEB_Throughput_UL)

Counter Name: NODEB_Throughput_DL
Associated Counter: VS.IPDLAvgUsed.1, VS.IPDLAvgUsed.2, VS.IPDLAvgUsed.3, VS.IPDLAvgUsed.4
Calculation Formula: (VS.IPDLAvgUsed.1 + VS.IPDLAvgUsed.2 + VS.IPDLAvgUsed.3 + VS.IPDLAvgUsed.4)

Counter Name: NODEB_Throughput_UL
Associated Counter: VS.IPULAvgUsed.1, VS.IPULAvgUsed.2, VS.IPULAvgUsed.3, VS.IPULAvgUsed.4
Calculation Formula: (VS.IPULAvgUsed.1 + VS.IPULAvgUsed.2 + VS.IPULAvgUsed.3 + VS.IPULAvgUsed.4)

>>PCH&FACH Utilization Metrics

Counter Name: PCH Physical Channel Utility Ratio
Associated Counter: VS.UTRAN.AttPaging1
Calculation Formula: VS.UTRAN.AttPaging1/(60 x 60 x 5/0.01)
Counter Name: FACH Utility Ratio
Associated Counter: VS.CRNCIubBytesFACH.Tx, VS.PCH.Bandwidth.UsageRate
Calculation Formula:
(1) Utilization of FACH carried on non-standalone SCCPCH
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/((60 x <SP> x 168 x 1/0.01) x VS.PCH.Bandwidth.UsageRate x 6/7 + (60 x <SP> x 360 x 1/0.01) x (1 - VS.PCH.Bandwidth.UsageRate x 6/7))
where VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> x <SP> x 60.0)
(2) Utilization of FACH carried on standalone SCCPCH
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/(60 x <SP> x 360 x 1/0.01)

Counter Name: FACH UserNum Utility Ratio
Associated Counter: VS.FACHUEs
Calculation Formula:
(1) FACH_60_USER_SWITCH = OFF
FACH UserNum Utility Ratio = VS.FACHUEs/30 x 100%
(2) FACH_60_USER_SWITCH = ON
FACH UserNum Utility Ratio = VS.FACHUEs/60 x 100%

>>OVSF Utilization Metrics

Counter Name: OVSF occupancy
Associated Counter: VS.RAB.SFOccupy
Calculation Formula: VS.RAB.SFOccupy

Counter Name: OVSF Usability Ratio
Associated Counter: VS.RAB.SFOccupy.Ratio
Calculation Formula: VS.RAB.SFOccupy/256

Counter Name: DCH OVSF Ratio
Associated Counter: DCH_OVSF_Utilization
Calculation Formula: ((<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64 + (<VS.MultRAB.SF8> + <VS.SingleRAB.SF8>) x 32 + (<VS.MultRAB.SF16> + <VS.SingleRAB.SF16>) x 16 + (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8 + (<VS.MultRAB.SF64> + <VS.SingleRAB.SF64>) x 4 + (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2 + (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>))/256

>>CPU Usage Metrics

Counter Name: CP Utilization Rate
Associated Counter: VS.SUBSYS.CPULOAD.MEAN
Calculation Formula: VS.SUBSYS.CPULOAD.MEAN
Counter Name: UP Utilization Rate
Associated Counter: VS.SUBSYS.CPULOAD.MEAN
Calculation Formula: VS.SUBSYS.CPULOAD.MEAN

Counter Name: INT Utilization Rate
Associated Counter: VS.INT.CPULOAD.MEAN, VS.INT.TRANSLOAD.RATIO.MEAN
Calculation Formula: VS.INT.CPULOAD.MEAN, VS.INT.TRANSLOAD.RATIO.MEAN

Counter Name: NodeB CPU Utilization Rate
Associated Counter: VS.BRD.CPULOAD.MEAN
Calculation Formula: VS.BRD.CPULOAD.MEAN

>>Credit Utilization Metrics

Counter Name: NODEB UL CE Utility Ratio
Associated Counter: VS.NodeB.ULCreditUsed.Mean; VS.LC.ULCreditUsed.Mean
Calculation Formula:
When CE Overbooking is enabled, NodeB_UL_CE_MEAN_RATIO is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.NodeB.ULCreditUsed.Mean/2)/UL CE Cfg Number
When CE Overbooking is disabled, NodeB_UL_CE_MEAN_RATIO is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.LC.ULCreditUsed.Mean/2)/UL CE Cfg Number

Counter Name: NodeB_UL_GRP_CE_MEAN_RATIO
Calculation Formula: NodeB_UL_GRP_CE_MEAN_RATIO = Sum_AllCells_of_ResourceGroup(VS.LC.ULCreditUsed.Mean/2)/UL GRP CE Cfg Number

Counter Name: NodeB_DL_CE_MEAN_RATIO
Associated Counter: VS.LC.DLCreditUsed.Mean
Calculation Formula: Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean)/MIN(NodeB License DL CE Number, NodeB Physical DL CE Capacity)
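The blocking metrics at the start of Table 5-1 combine as follows: the call block rate is the RRC block rate plus the RAB block rate weighted by the share of RRC connection attempts that succeed. The sketch below illustrates that arithmetic. The function names are illustrative, the congestion counters are passed in as plain lists of values, and returning 0.0 when there are no attempts is an assumption made here to avoid division by zero.

```python
from typing import Iterable


def block_rate(congestion_failures: Iterable[int], attempts: int) -> float:
    """Sum of congestion-related failures divided by attempts.

    Used for both the RRC block rate (VS.RRC.Rej.*.Cong counters) and
    the RAB block rate (VS.RAB.FailEstab*.Cong counters).
    """
    if attempts <= 0:
        return 0.0  # assumption: no attempts means nothing was blocked
    return sum(congestion_failures) / attempts


def call_block_rate(rrc_block: float, rab_block: float, rrc_success: int,
                    rrc_att_cell_dch: int, rrc_att_cell_fach: int) -> float:
    """Call block rate = RRC block rate + RRC success ratio x RAB block rate."""
    rrc_attempts = rrc_att_cell_dch + rrc_att_cell_fach
    success_ratio = rrc_success / rrc_attempts if rrc_attempts > 0 else 0.0
    return rrc_block + success_ratio * rab_block
```

For example, an RRC block rate of 1%, a RAB block rate of 2%, and 950 successful RRC setups out of 1000 attempts give a call block rate of 0.01 + 0.95 x 0.02 = 2.9%.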
