Issue Date
Draft A 2012-07-23
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied. The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Draft A (2012-07-23)
Organization
This document provides guidelines for preventing resource congestion. When following these guidelines, you must consider intuitiveness and operability based on site requirements. This document consists of the following chapters.
Chapter | Description
1 Network Resource Monitoring Methods | Describes basic concepts associated with network resources, including definitions and monitoring activities.
2 Network Resource Performance Counters | Describes the counters for monitoring various network resources.
3 HSPA-related Resource Monitoring | Describes how to monitor network resources when High Speed Packet Access (HSPA) is enabled.
4 Diagnosis of Problems Related to Network Resources | Describes fault analysis and diagnosis methods that experienced WCDMA network maintenance personnel can use to handle network congestion or overload efficiently.
5 Counter Definitions | Lists all performance counters mentioned in the preceding chapters and provides the definitions of these counters.
Change History
Issue: 1
Date: 2012-07-23
Description: This is the Draft A release of RAN14.1.
Version: Draft A
Prepared By: Inventory Solutions Department
Contents
About This Document
1 Network Resource Monitoring Methods
1.1 Introduction to Network Resources
1.1.1 RNC Resources
1.1.2 NodeB Resources
1.1.3 Air Interface Resources
1.2 Resource Monitoring Procedure
4.2 Call Congestion Counters
4.2.1 Counters Related to Paging Loss
4.2.2 Counters Related to RRC Congestion
4.2.3 Counters Related to RAB Congestion
4.3 Signaling Storms and Solutions
4.4 Resource Usage Analysis
4.4.1 RNC Resource Usage Analysis
4.4.2 NodeB Resource Usage Analysis
4.4.3 Air Resource Usage Analysis
Two methods are available for monitoring network resources and detecting resource bottlenecks:
Prediction-based monitoring
This method monitors various network resources simultaneously. You can monitor the usage of a network resource (for example, the downlink transmit power of a cell), compare the detected resource usage with a preset upper threshold, predict the resource usage trend and impact, and determine whether to perform network expansion. If the resource usage remains above the preset upper threshold for a long time (for example, a cell remains overloaded during busy hours for several consecutive days), you can split the cell or add carriers for network expansion. This method is used to identify high-load cells and RNCs. It is a conventional means of monitoring resource usage and is easy to implement. This chapter describes how to use this method to monitor network resources.
NOTE
For details about the counters for monitoring network resources, see chapter 2 "Network Resource Performance Counters." For details about HSPA-related resources, see chapter 3 "HSPA-related Resource Monitoring."
Problem-driven analysis
This method is used to analyze decreased network performance counters after network congestion has occurred. This method is more complex than prediction-based monitoring because it requires more analysis instruments and skills. However, this method can maximize how the system utilizes existing resources and eliminate the need for immediate network expansion. For details about this method, see chapter 4 "Diagnosis of Problems Related to Network Resources."
NOTE
Experienced network maintenance engineers can also use other methods to analyze network resource problems.
Control-plane and user-plane resources BSC6910 boards do not differentiate between the control plane and the user plane. A new Evolved General Processing Unit REV:a (EGPUa) board was introduced into the BSC6910 to process both control-plane data and user-plane data. As the traffic volume grows, the control-plane load and user-plane load of the EGPUa board may exceed its planned processing capability and become a system bottleneck.
Interface board resources Interface boards on the RNC provide various transmission ports and resources to process transport network messages and to enable interaction between RNC internal data and external data. On the interface boards, control-plane overload affects service connection, and user-plane overload causes packets to be discarded.
Inter-subrack bandwidth The RNC provides the bandwidth for inter-subrack information exchange.
Channel elements (CEs) CEs are baseband processing resources managed at the NodeB level. For a newly deployed network, CE usage is initially set to a small value to save capital expenditure (CAPEX). Generally, CE congestion is most likely to occur on the network.
NodeB boards NodeB boards are classified into the WMPT/UMPT, WCDMA Baseband Processing Unit (WBBP), and Universal Transmission Processing unit (UTRP). The WCDMA Main Processing and Transmission unit (WMPT) or Universal Main Processing and Transmission unit (UMPT) performs signaling processing and resource management. Central processing unit (CPU) overload of the WMPT or UMPT decreases system processing capabilities, which affects NodeB-related KPIs.
Iub transmission resources Based on the transmission medium, there are two types of transmission: asynchronous transfer mode (ATM) transmission and Internet Protocol (IP) transmission. On an IP transport network, the NodeB and RNC can dynamically adjust uplink and downlink Iub transmission bandwidth. Generally, transmission resource bottlenecks do not result from insufficient capacities of interface boards but from low bandwidth available on the IP transport network.
Received total wideband power (RTWP) RTWP is the total power received by a NodeB within a bandwidth, including the receiver noise, external radio interference, and uplink power generated due to traffic. RTWP, which is similar to the received signal strength indicator (RSSI) in the CDMA system, measures uplink load. RSSI measures the total channel power received by a UE in the downlink.
Transmitted carrier power (TCP) TCP is the full-carrier transmit power in a cell and monitors downlink load. The TCP value is limited by the maximum transmission capability of the power amplifier on a NodeB.
Orthogonal variable spreading factor (OVSF) OVSF is a downlink code resource. In the downlink, only one OVSF code tree is available for each cell.
Paging channel (PCH) PCH usage is directly related to the location area code (LAC) plan and PCH state transition. PCH overload decreases the paging success rate.
Random access channel (RACH) and forward access channel (FACH) The RACH and FACH carry signaling and a small amount of user-plane data. RACH/FACH overload may decrease the access success rate, affecting user experience.
If yes, the cell or NodeB is overloaded. Then, perform network expansion. If no, the cell or NodeB is not necessarily overloaded. In this case, network expansion is not mandatory because the problem might be solved by adjusting parameter configurations.
For example, the CE usage is above 70% but the usages of other resources such as RTWP, TCP, and OVSF codes are within their allowed ranges. In this case, CE resources are insufficient but the cell is not overloaded. To solve the problem, configure licenses allowing more CEs or add baseband processing boards instead of performing network expansion immediately.
After the load is basically balanced between the control plane and the user plane:
If the CPU usage of the control plane is above 50%, the RNC is overloaded.
If the CPU usage of the user plane is above 60%, the RNC is overloaded.
Figure 1-2 shows the details.
Figure 1-2 Resource monitoring flowchart
Other resource usages are not used to determine whether the RNC is overloaded. This flowchart applies to most resource monitoring scenarios, except when the system overload is caused by an unexpected event rather than a service increase. To simplify the procedure, unexpected events are not considered in this flowchart. The causes of unexpected events might be located through a comprehensive analysis of various resource bottlenecks. For details about how to locate a resource-related problem, see chapter 4 "Diagnosis of Problems Related to Network Resources."
pools. Control-plane resources form a control-plane resource pool, and user-plane resources form a user-plane resource pool. The number of resources allocated to the control plane and user plane can be adjusted based on service requirements so that load is balanced between the two planes. Consider the control-plane load and user-plane load comprehensively before performing capacity expansion.
For details about other parameters in the SET UCPUPFLEXCFG command, see BSC6910 UMTS MML Command Reference.
Counters for Monitoring Average CPU Usage on the Control and User Planes
The VS.SUBSYS.CPULOAD.MEAN counter measures the average CPU usage of a subsystem during a measurement period and reflects the CPU load and quality of the subsystem during the measurement period.
The average CPU usage on the control plane is calculated by the following formula:
Average CPU usage on the control plane = Average (VS.SUBSYS.CPULOAD.MEAN measured for all CP subsystems)
The average CPU usage on the user plane is calculated by the following formula:
Average CPU usage on the user plane = Average (VS.SUBSYS.CPULOAD.MEAN measured for all UP subsystems)
The recommended upper thresholds for monitoring the average CPU usage on the control and user planes are 50% and 60%, respectively. When FlexCfgMode is set to ManulMode or FrozenMode, you are advised to enable dynamic sharing or to adjust the ratio of resources split between processing user plane data and control plane data if the control plane or user plane is overloaded. If the control-plane or user-plane resources cannot meet your traffic model's requirements, perform capacity expansion immediately.
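The two averages and their thresholds can be sketched as follows. This is a minimal illustration, assuming the per-subsystem VS.SUBSYS.CPULOAD.MEAN values have already been exported as percentages; the function names and the sample values are hypothetical and are not part of any Huawei tool.

```python
from statistics import mean

CP_THRESHOLD = 50.0  # recommended upper threshold for the control plane (%)
UP_THRESHOLD = 60.0  # recommended upper threshold for the user plane (%)


def plane_cpu_usage(subsystem_loads):
    """Average VS.SUBSYS.CPULOAD.MEAN over all subsystems of one plane."""
    return mean(subsystem_loads)


def overloaded_planes(cp_loads, up_loads):
    """Return the planes whose average CPU usage exceeds its threshold."""
    result = []
    if plane_cpu_usage(cp_loads) > CP_THRESHOLD:
        result.append("control")
    if plane_cpu_usage(up_loads) > UP_THRESHOLD:
        result.append("user")
    return result


# Hypothetical sample values (percent), one entry per subsystem.
cp_samples = [48.0, 55.0, 53.0]
up_samples = [40.0, 45.0, 50.0]
print(overloaded_planes(cp_samples, up_samples))
```

With these sample values only the control plane exceeds its threshold, which is the situation in which the text advises rebalancing resources between the planes before expanding capacity.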
The counters for monitoring the CPU usage on the interface board are as follows:
VS.INT.CPULOAD.MEAN
This counter measures the average CPU usage of an interface board as a percentage.
Average CPU usage for session load
This counter is calculated by the following formula:
Average CPU usage for session load = VS.INT.CFG.INTERWORKING.NUM/(Number of established or released sessions per second x 60 x SP) x 100%
where
VS.INT.CFG.INTERWORKING.NUM is the number of call setup attempts on an interface board.
SP is the measurement period (in minutes).
The number of established or released sessions per second is subject to system specifications: 5000 for the GOUc and FG2c boards and 50,000 for the EXOUa board.
It is recommended that an interface board be added when the CPU usage or session load of an interface board exceeds 50%.
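The session load calculation can be sketched as follows. This is a minimal illustration, assuming the denominator of the formula is the number of sessions a board can handle over the whole measurement period (sessions per second x 60 x SP, with SP in minutes); the function names and the dictionary are hypothetical.

```python
# Session handling specification per board type (sessions per second),
# taken from the text above.
SESSIONS_PER_SECOND = {"GOUc": 5000, "FG2c": 5000, "EXOUa": 50000}


def session_load_percent(interworking_num, board_type, sp_minutes):
    """Average CPU usage for session load, as a percentage.

    interworking_num: value of VS.INT.CFG.INTERWORKING.NUM for the period.
    sp_minutes: measurement period SP in minutes.
    """
    capacity = SESSIONS_PER_SECOND[board_type] * 60 * sp_minutes
    return interworking_num / capacity * 100.0


def needs_new_board(cpu_mean_percent, session_percent, threshold=50.0):
    """True when CPU usage or session load exceeds the 50% recommendation."""
    return cpu_mean_percent > threshold or session_percent > threshold


load = session_load_percent(9_000_000, "GOUc", 60)
print(load, needs_new_board(30.0, load))
```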
Introduction
If active and standby SCUa boards are used, the inter-subrack bandwidth is 4 Gbit/s. If active and standby SCUb boards are used, the inter-subrack bandwidth is 40 Gbit/s. If the active or standby SCUa/SCUb board is faulty, the inter-subrack bandwidth decreases by half.
Frame Peak Utility Ratio This counter measures the peak usage of inter-subrack traffic and is calculated by the following formula: Frame Peak Utility Ratio = VS.Frame.Flux.Peak.TxRate/Inter-subrack bandwidth x 100% where VS.Frame.Flux.Peak.TxRate is the peak traffic volume transmitted between subracks.
If the value of Frame Peak Utility Ratio is greater than 60%, a prewarning is required.
Frame Mean Utility Ratio This counter measures the average usage of inter-subrack traffic and is calculated by the following formula: Frame Mean Utility Ratio = VS.Frame.Flux.Mean.TxRate/Inter-subrack bandwidth x 100% where VS.Frame.Flux.Mean.TxRate is the average traffic volume transmitted between subracks. If the value of Frame Mean Utility Ratio is greater than 40%, a prewarning is required.
Frame DropPackets Ratio This counter measures the packet loss rate between subracks and is calculated by the following formula: Frame DropPackets Ratio = VS.Frame.Flux.DropPackets/VS.Frame.Flux.TxPackets x 100% where
VS.Frame.Flux.DropPackets is the number of packets discarded during packet transmission between subracks. VS.Frame.Flux.TxPackets is the number of packets transmitted between subracks.
If the value of Frame DropPackets Ratio is greater than 0.01%, a prewarning is required. If the prewarning threshold is reached or an inter-subrack bandwidth congestion alarm is reported, contact Huawei engineers for problem handling.
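The three inter-subrack checks can be combined in a short sketch. This is a minimal illustration, assuming the peak and mean rates are expressed in the same unit as the inter-subrack bandwidth; the function names and sample values are hypothetical.

```python
def ratio_percent(numerator, denominator):
    """Generic 'x / y x 100%' helper used by all three ratios."""
    return numerator / denominator * 100.0


def inter_subrack_prewarnings(peak_tx_rate, mean_tx_rate,
                              dropped_packets, tx_packets, bandwidth):
    """Return the names of the ratios that exceed their prewarning threshold.

    peak_tx_rate:    VS.Frame.Flux.Peak.TxRate
    mean_tx_rate:    VS.Frame.Flux.Mean.TxRate
    dropped_packets: VS.Frame.Flux.DropPackets
    tx_packets:      VS.Frame.Flux.TxPackets
    bandwidth:       inter-subrack bandwidth (same unit as the rates)
    """
    warnings = []
    if ratio_percent(peak_tx_rate, bandwidth) > 60.0:
        warnings.append("peak")
    if ratio_percent(mean_tx_rate, bandwidth) > 40.0:
        warnings.append("mean")
    # Skip the drop ratio when nothing was transmitted (avoids dividing by zero).
    if tx_packets > 0 and ratio_percent(dropped_packets, tx_packets) > 0.01:
        warnings.append("drop")
    return warnings


# Hypothetical values for a pair of SCUa boards (4 Gbit/s), rates in Gbit/s.
print(inter_subrack_prewarnings(2.8, 1.2, 5, 1_000_000, 4.0))
```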
CNBAP usage
where
VS.IUB.AttRLSetup is the number of Iub RL setup requests in a cell. VS.IUB.AttRLAdd is the number of Iub RL addition requests in a cell. VS.IUB.AttRLRecfg is the number of Iub RL reconfiguration attempts in a cell. SP is the measurement period (in seconds). CNBAP capacity of the NodeB depends on the configuration of the WMPT/UMPT, WBBP, and UTRP boards.
If the CNBAP usage of the WMPT/UMPT board is above 60% on a Huawei NodeB, you are advised to perform capacity expansion by adding a WBBP/UTRP board or splitting the NodeB.
2.2.3 CE Usage
Channel elements (CEs) are baseband resources provided by NodeBs. One CE resource can be consumed by a 12.2 kbit/s voice call. If available CE resources are insufficient, a new call request is rejected. The total available CE resources are limited by the hardware and configured licenses. If CE resources on the WBBP board are sufficient, the CE resources are limited by only licenses. In this case, expand the licensed capacity. CE resources are managed and shared in a resource group of the NodeB. Separate baseband processing units are used in the uplink and downlink directions of a NodeB. Therefore, uplink and downlink CE resources are managed and used independently of each other.
The number of licensed CE resources is distributed by the M2000. The number of CE resources provided by boards on a NodeB and the number of CE resources provided by boards in an uplink resource group are calculated based on NodeB board configuration and specifications. You can run MML commands to query the NodeB board configuration.
Uplink CE Usage
Since RAN14.0, the CE Overbooking feature has been introduced for uplink CE resources. The counters for monitoring uplink CE usage vary depending on whether CE Overbooking is enabled. If CE Overbooking is enabled, the following counters are used to monitor uplink CE usage:
VS.NodeB.ULCreditUsed.Mean This counter measures the average uplink NodeB credit resource usage when CE Overbooking is enabled.
NodeB_UL_CE_MEAN_RATIO This counter measures the uplink CE usage on a NodeB and is calculated by the following formula: NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB (VS.NodeB.ULCreditUsed.Mean/2)/UL CE Cfg Number
where
VS.NodeB.ULCreditUsed.Mean is the average uplink NodeB credit resource usage when CE Overbooking is enabled. UL CE Cfg Number is the number of available uplink CE resources on a NodeB. The number is the smaller of the values for two counters: NodeB License UL CE Number and NodeB Physical UL CE Capacity. NodeB License UL CE Number measures the number of licensed uplink CE resources on a NodeB. NodeB Physical UL CE Capacity measures the number of uplink CE resources provided by boards on a NodeB.
If CE Overbooking is disabled, the following counters are used to monitor uplink CE usage:
NodeB_UL_GRP_CE_MEAN_RATIO This counter measures the CE usage in an uplink resource group of a NodeB. Each NodeB allows for multiple uplink resource groups. This counter is calculated by the following formula: NodeB_UL_GRP_CE_MEAN_RATIO = Sum_AllCells_of_ResourceGroup (VS.LC.ULCreditUsed.Mean/2)/UL GRP CE Cfg Number where
VS.LC.ULCreditUsed.Mean is the average uplink NodeB credit resource usage in a cell. UL GRP CE Cfg Number is the number of available CE resources in an uplink resource group. The number is the smaller of the values for two counters: NodeB License UL CE Number and NodeB Physical UL group CE Capacity. NodeB Physical UL group CE Capacity measures the number of CE resources provided by boards in an uplink resource group.
NodeB_UL_CE_MEAN_RATIO This counter measures the uplink CE usage on a NodeB and is calculated by the following formula: NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB (VS.LC.ULCreditUsed.Mean/2)/UL CE Cfg Number where
VS.LC.ULCreditUsed.Mean is the average uplink NodeB credit resource usage in a cell. UL CE Cfg Number is the number of available uplink CE resources on a NodeB.
The upper threshold for uplink CE usage is 70%. If CE usage is above 70% in an uplink resource group of a NodeB, add a WBBP board to the uplink resource group even if the CE usage is low on the NodeB.
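The uplink CE usage formulas share one shape: sum the per-cell credit counters, divide by 2, and divide by the smaller of the licensed and physical CE numbers. The following is a minimal sketch under that reading; the function name and sample values are hypothetical, and the same function applies to a whole NodeB or to one uplink resource group depending on which cells and capacities are passed in.

```python
UL_CE_THRESHOLD = 0.70  # upper threshold for uplink CE usage


def ul_ce_usage(credits_per_cell, licensed_ul_ce, physical_ul_ce):
    """Uplink CE usage as a fraction (0.0 to 1.0 and above).

    credits_per_cell: per-cell VS.LC.ULCreditUsed.Mean values, or
        VS.NodeB.ULCreditUsed.Mean values when CE Overbooking is enabled.
    licensed_ul_ce: NodeB License UL CE Number.
    physical_ul_ce: physical UL CE capacity of the NodeB or resource group.
    """
    available = min(licensed_ul_ce, physical_ul_ce)  # "UL CE Cfg Number"
    used_ce = sum(credit / 2 for credit in credits_per_cell)
    return used_ce / available


usage = ul_ce_usage([80, 60, 84], licensed_ul_ce=128, physical_ul_ce=192)
print(usage, usage > UL_CE_THRESHOLD)
```

In this example the license (128 CEs) is the limiting factor rather than the boards, so expanding the licensed capacity would be the first step to consider.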
Downlink CE Usage
The NodeB_DL_CE_MEAN_RATIO counter measures the downlink CE usage and is calculated by the following formula: NodeB_DL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean)/DL CE Cfg Number where
DL CE Cfg Number is the number of available downlink CE resources on a NodeB. The number is the smaller of the values for two counters: NodeB License DL CE Number and NodeB Physical DL CE Capacity. NodeB License DL CE Number measures the number of licensed downlink CE resources on a NodeB. NodeB Physical DL CE Capacity measures the number of downlink CE resources provided by boards on a NodeB.
If the downlink CE usage is above 70%, perform capacity expansion by adding a WBBP board.
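The downlink check follows the same pattern, except that the credit counter is not divided by 2. A minimal sketch, with a hypothetical function name and sample values:

```python
def dl_ce_overloaded(dl_credits_per_cell, licensed_dl_ce, physical_dl_ce,
                     threshold=0.70):
    """True when NodeB_DL_CE_MEAN_RATIO exceeds the 70% threshold.

    dl_credits_per_cell: per-cell VS.LC.DLCreditUsed.Mean values.
    """
    available = min(licensed_dl_ce, physical_dl_ce)  # "DL CE Cfg Number"
    return sum(dl_credits_per_cell) / available > threshold


print(dl_ce_overloaded([30, 25, 35], licensed_dl_ce=256, physical_dl_ce=128))
```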
Figure 2-1 Relationship between RTWP, interference increase, and uplink load
The recommended uplink load threshold is 75%, and the corresponding RTWP value is smaller than -100 dBm. If the RNC uses algorithm 2 during admission control, the value of the UL ENU Ratio counter should be smaller than 75%. UL ENU Ratio measures the ratio of uplink equivalent users to users supported in the cell. Algorithm 2 is the Equivalent Number of Users (ENU) algorithm. If the RTWP value is greater than -100 dBm, the cell is overloaded in the uplink. Generally, if a cell is overloaded or the RTWP value is too large, the cell coverage shrinks, the quality of ongoing services deteriorates, and new service requests may be rejected. The following counters related to RTWP and ENU are used on Huawei RNCs:
VS.MeanRTWP: average RTWP (in dBm) in a cell
VS.MinRTWP: minimum RTWP (in dBm) in a cell
VS.RAC.UL.EqvUserNum: number of uplink equivalent users on all dedicated channels in a cell
UL ENU Ratio: this counter is calculated by the following formula:
UL ENU Ratio = VS.RAC.UL.EqvUserNum/UlTotalEqUserNum
where UlTotalEqUserNum is the maximum number of equivalent users in a cell, which can be set by running the ADD UCELLCAC command.
In some areas, the background noise increases to far higher than -106 dBm due to other interference or hardware problems, such as antennas or feeder connectors of poor quality. In this case, the VS.MinRTWP counter, which measures the RTWP when there is no traffic in a cell, reflects the background noise in the cell. If the VS.MinRTWP value during the idle hour is larger than -100 dBm or smaller than -110 dBm for three consecutive days in one week, there are hardware problems or external interference. Find the cause, and troubleshoot the problems or eliminate the external interference. The recommended uplink load threshold is indicated by the VS.MeanRTWP counter. A cell is considered heavily loaded in the uplink when either of the following is true for two or three days in one week:
The VS.MeanRTWP value during the busy hour remains above -100 dBm, which corresponds to a 6 dB interference increase or 75% load. The UL ENU Ratio value during the busy hour remains above 75%.
When the cell is heavily loaded, perform capacity expansion by adding a carrier or increasing the UlTotalEqUserNum value.
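The link between the 75% load, the 6 dB interference increase, and the -100 dBm threshold follows from the standard WCDMA uplink relation: noise rise (dB) = -10 x log10(1 - load). The sketch below is a minimal illustration, assuming a background noise of -106 dBm as mentioned in the text; the function names are hypothetical, and the check evaluates a single busy hour, so the "two or three days in one week" condition must be applied on top of it.

```python
import math


def noise_rise_db(load):
    """Interference increase over the noise floor for an uplink load in [0, 1)."""
    return -10.0 * math.log10(1.0 - load)


def expected_rtwp_dbm(load, noise_floor_dbm=-106.0):
    """RTWP expected at a given uplink load (noise floor is an assumption)."""
    return noise_floor_dbm + noise_rise_db(load)


def busy_hour_heavily_loaded(mean_rtwp_dbm, ul_enu_ratio):
    """True when either busy-hour criterion from the text is met."""
    return mean_rtwp_dbm > -100.0 or ul_enu_ratio > 0.75


print(round(noise_rise_db(0.75), 2))      # about 6 dB
print(round(expected_rtwp_dbm(0.75), 1))  # about -100 dBm
```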
The cell coverage is insufficient. The data throughput is limited. The service quality deteriorates. New call requests may be rejected.
The downlink power consumption in a cell is related to the cell load, UE location, and cell coverage. Any one of the following scenarios leads to more power consumption:
Large cell coverage
Long distance between the UE and the cell center
Heavy cell load
In a WCDMA system, TCP is used to measure the downlink total transmit power in a cell. Four TCP-related counters are used on Huawei RNCs:
VS.MeanTCP: average downlink carrier transmit power in a cell
VS.MaxTCP: maximum downlink carrier transmit power in a cell
VS.MinTCP: minimum downlink carrier transmit power in a cell
VS.MeanTCP.NonHS: average downlink carrier transmit power in a non-HSDPA cell
The downlink cell load is indicated by the TCP usage in the cell. If the TCP usage in a cell remains above 85% of the VS.MaxTCP value, the cell is overloaded in the downlink. The TCP usage in a cell is calculated by the following formula:
TCP usage in a cell = VS.MeanTCP/MaxTxPower x 100%
where MaxTxPower is set by running the ADD UCELLSETUP command. Both values are converted to watts before the division.
The following provides two examples of determining whether a cell is overloaded based on the formula.
Downlink Maximum Transmit Power | Upper Threshold for TCP Usage | Upper TCP Threshold That Triggers Cell Overload
20 W (43 dBm) | 85% | 17 W (42.3 dBm)
40 W (46 dBm) | 85% | 34 W (45.3 dBm)
If the TCP usage during the busy hour remains above 85% for three consecutive days in one week, the cell is heavily loaded in the downlink. Then, perform capacity expansion by adding a carrier. Some live networks use hierarchical cell structures with multiple carriers. The cell power settings and the corresponding upper TCP thresholds vary depending on the networking policy and cell service priority.
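The dBm values in the examples above can be reproduced with a short conversion sketch. This is a minimal illustration: the function names are hypothetical, and using VS.MeanTCP as the measured power in the usage ratio is an assumption.

```python
import math


def dbm_to_watts(dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((dbm - 30.0) / 10.0)


def watts_to_dbm(watts):
    """Convert a power level from watts to dBm."""
    return 10.0 * math.log10(watts) + 30.0


def overload_threshold_dbm(max_tx_power_dbm, usage_threshold=0.85):
    """TCP level (dBm) at which usage reaches the given share of MaxTxPower."""
    return watts_to_dbm(dbm_to_watts(max_tx_power_dbm) * usage_threshold)


def tcp_usage(mean_tcp_dbm, max_tx_power_dbm):
    """TCP usage as a fraction; the ratio is taken in watts, not in dBm.

    Assumption: mean_tcp_dbm is the VS.MeanTCP value of the cell.
    """
    return dbm_to_watts(mean_tcp_dbm) / dbm_to_watts(max_tx_power_dbm)


print(round(overload_threshold_dbm(43.0), 1))  # 20 W cell
print(round(overload_threshold_dbm(46.0), 1))  # 40 W cell
```

Because the ratio is linear in watts, an 85% threshold always sits about 0.7 dB below the maximum transmit power, whatever the cell power setting.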
In the downlink, the maximum spreading factor (SF) that can be used is 256. In a cell, only one OVSF code tree is available. In the OVSF code tree, codes with the same SF are orthogonal to each other, but a code is not orthogonal to the codes above it (its parent and ancestors) or below it (its children and descendants) on the same branch. As a result, once a code is allocated to a user, none of the codes above or below it on that branch can be allocated to any other user. OVSF code resources are limited. If available OVSF codes are insufficient to achieve the desired quality of service (QoS), a new call request may be rejected.
An OVSF code tree can be divided into 4 SF4 codes, 8 SF8 codes, 16 SF16 codes, or 256 SF256 codes. Codes with various SFs can be considered as equivalent codes with SF 256. For example, a code with SF 8 is equivalent to 32 codes with SF 256. Based on this equivalence mapping, the OVSF code usage can be calculated for a user or in a cell. Huawei RNCs monitor the average code usage of an OVSF code tree based on the number of equivalent codes with SF 256. The VS.RAB.SFOccupy counter measures the average code usage of an OVSF code tree. The counters for monitoring the OVSF code usage are as follows:
OVSF_Utilization, which is calculated by the following formula:
OVSF_Utilization = VS.RAB.SFOccupy/256
DCH_OVSF_Utilization, which is calculated by the following formula:
DCH_OVSF_Utilization = DCH_OVSF_CODE/256
where DCH_OVSF_CODE is calculated as follows:
DCH_OVSF_CODE = ((<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64
+ (<VS.SingleRAB.SF8> + <VS.MultRAB.SF8>) x 32
+ (<VS.SingleRAB.SF16> + <VS.MultRAB.SF16>) x 16
+ (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8
+ (<VS.SingleRAB.SF64> + <VS.MultRAB.SF64>) x 4
+ (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2
+ (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>))
If the DCH_OVSF_Utilization value during the busy hour remains above 70% for three consecutive days in one week, the cell runs out of OVSF codes. Perform capacity expansion by splitting the cell or adding a carrier.
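Each weight in the DCH_OVSF_CODE formula equals 256 divided by the spreading factor, which is the equivalence mapping described earlier (for example, one SF 8 code counts as 32 SF 256 codes). A minimal sketch of the calculation, with hypothetical function names and sample counter values keyed by spreading factor:

```python
SF_LEVELS = (4, 8, 16, 32, 64, 128, 256)


def dch_ovsf_code(single_rab, mult_rab):
    """Equivalent number of SF 256 codes occupied by DCH users.

    single_rab / mult_rab: dicts mapping a spreading factor to the value of
    VS.SingleRAB.SF<n> / VS.MultRAB.SF<n>; missing entries count as 0.
    """
    return sum(
        (single_rab.get(sf, 0) + mult_rab.get(sf, 0)) * (256 // sf)
        for sf in SF_LEVELS
    )


def dch_ovsf_utilization(single_rab, mult_rab):
    """DCH_OVSF_Utilization as a fraction of the code tree."""
    return dch_ovsf_code(single_rab, mult_rab) / 256


single = {8: 1, 128: 20}  # hypothetical busy-hour averages
mult = {16: 2, 256: 4}
print(dch_ovsf_code(single, mult), dch_ovsf_utilization(single, mult))
```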
State transition of a large number of UEs performing packet switched (PS) services RRC signaling storms
Based on the default parameter settings for Huawei RNCs, the PCH and FACH usages are calculated as follows:
PCH usage The PCH Physical Channel Utility Ratio counter measures the PCH usage and is calculated by the following formula:
VS.UTRAN.AttPaging1 is the number of PAGING TYPE1 messages transmitted in a cell. SP indicates the measurement period (in seconds). If paging messages are not retransmitted, 5% of them will be lost when the PCH usage reaches 60%. If paging messages are retransmitted once or twice, 1% of them will be lost when the PCH usage reaches 70%.
Based on the basic principles, you are advised to perform fault diagnosis or replan LAC areas.
FACH usage
The FACH usage is indicated by the FACH Utility Ratio counter. The method for calculating the value of this counter varies depending on whether a standalone secondary common control physical channel (SCCPCH) is used.
If a FACH is carried on a non-standalone SCCPCH, the FACH usage is calculated by the following formula:
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/((60 x <SP> x 168 x 1/0.01) x VS.PCH.Bandwidth.UsageRate x 6/7 + (60 x <SP> x 360 x 1/0.01) x (1 - VS.PCH.Bandwidth.UsageRate x 6/7))
where
VS.CRNCIubBytesFACH.Tx is the traffic transmitted on the FACH on the Iub interface. VS.PCH.Bandwidth.UsageRate is the PCH bandwidth usage and is calculated as follows: VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> x SP x 60.0) In this formula, VS.CRNCIubBytesPCH.Tx measures the traffic transmitted on the PCH on the Iub interface; VS.CRNC.IUB.PCH.Bandwidth measures the PCH bandwidth for the CRNC in a cell.
If a FACH is carried on a standalone SCCPCH, the FACH usage is calculated by the following formula: FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/(60 x <SP> x 360 x 1/0.01) where
VS.CRNCIubBytesFACH.Tx is the traffic transmitted on the FACH on the Iub interface. SP is the measurement period (in minutes, because the formula multiplies SP by 60 to obtain seconds).
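Both FACH Utility Ratio variants can be sketched in one function. This is a minimal illustration under several assumptions: SP is taken in minutes (it is multiplied by 60 in the formulas), the factor 1/0.01 is read as 100 transmission intervals per second, 168 and 360 are read as the bits available per interval, and the non-standalone formula is read with the term (1 - VS.PCH.Bandwidth.UsageRate x 6/7). The function name and sample values are hypothetical.

```python
INTERVALS_PER_SECOND = 100  # the "1/0.01" factor in the formulas


def fach_utility_ratio(fach_tx_bytes, sp_minutes, pch_usage_rate=None):
    """FACH Utility Ratio as a fraction.

    fach_tx_bytes:  VS.CRNCIubBytesFACH.Tx for the measurement period.
    sp_minutes:     measurement period SP in minutes (assumption).
    pch_usage_rate: VS.PCH.Bandwidth.UsageRate as a fraction when the FACH
                    shares an SCCPCH with the PCH; None for a standalone SCCPCH.
    """
    intervals = 60 * sp_minutes * INTERVALS_PER_SECOND
    if pch_usage_rate is None:
        capacity_bits = intervals * 360
    else:
        pch_share = pch_usage_rate * 6 / 7
        capacity_bits = (intervals * 168 * pch_share
                         + intervals * 360 * (1 - pch_share))
    return fach_tx_bytes * 8 / capacity_bits


print(fach_utility_ratio(8_100_000, 60))        # standalone SCCPCH
print(fach_utility_ratio(8_100_000, 60, 0.5))   # shared with a half-used PCH
```

The same traffic volume yields a higher ratio on a shared SCCPCH, because part of each interval is taken by paging.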
Ratio of users on the FACH to users allowed to use the FACH This ratio is indicated by the FACH UserNum Utility Ratio counter. The method for calculating the value of this counter varies depending on the value of the FACH_60_USER_SWITCH parameter in the SET UCACALGOSWITCH command.
If FACH_60_USER_SWITCH is set to OFF, this counter is calculated by the following formula: FACH UserNum Utility Ratio = VS.FACHUEs/30 x 100%
If FACH_60_USER_SWITCH is set to ON, this counter is calculated by the following formula: FACH UserNum Utility Ratio = VS.FACHUEs/60 x 100% where VS.FACHUEs is the number of users on the FACH.
If the FACH usage or the ratio of users on the FACH to users allowed to use the FACH reaches 70%, you are advised to modify the parameter settings.
3 HSPA-related Resource Monitoring
3.1 HSDPA
HSPA includes High Speed Downlink Packet Access (HSDPA) and High Speed Uplink Packet Access (HSUPA). HSDPA and HSUPA protocols are part of the WCDMA standard. HSPA includes techniques such as fast scheduling, adaptive modulation, and hybrid automatic repeat request (HARQ) to achieve high speed data transmission. HSPA carries PS services. Conversational services are prioritized over PS services, and therefore HSPA uses network resources only after conversational services are served. This chapter describes how to monitor network resources when HSPA is enabled.
Power for CCH: This portion of power is allocated to common transport channels (CCHs) in the cell, such as the broadcast channel, pilot channel, and paging channel.
Power margin: This portion of power cannot be allocated. The power margin is reserved to ensure that the system can remain stable even if the UE position or environment changes.
Power for DPCH: This portion of power is allocated to real-time services (voice and video calls) and PS R99 services, and it varies with the number and location of users. The RNC and UE can adjust power for DPCH based on the power control algorithm.
Power for HSPA: This portion of power is allocated to HSDPA and is calculated by the following formula:
HSDPA User Power = Max Cell Transmit Power - (Power for CCH + Power Margin + Power for DPCH)
HSPA power schedulers are designed primarily to maximize power efficiency. For an HSDPA-enabled cell, you still need to monitor the TCP to check whether the cell is overloaded in the downlink. TCP thresholds for this cell are the same as those for a cell without HSDPA. With HSDPA, downlink power overload affects HSDPA performance before it affects conversational services.
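The power split can be illustrated with a short sketch. It is a minimal example, assuming all power values are expressed in watts (the subtraction is not valid in dBm); the function name and the sample figures are hypothetical, and clamping the result at zero is an added safeguard rather than part of the formula.

```python
def hsdpa_user_power_w(max_cell_tx_power_w, cch_w, margin_w, dpch_w):
    """Power left for HSDPA after CCH, margin, and DPCH power are reserved."""
    remaining = max_cell_tx_power_w - (cch_w + margin_w + dpch_w)
    return max(0.0, remaining)  # never report negative power


# Hypothetical 20 W cell: as DPCH power grows, HSDPA power shrinks.
print(hsdpa_user_power_w(20.0, 4.0, 2.0, 6.0))
print(hsdpa_user_power_w(20.0, 4.0, 2.0, 12.0))
```

The second call shows why downlink power overload affects HSDPA performance first: every additional watt used by DPCH services is taken from the HSDPA share.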
3.2 HSUPA
3.2.1 CE Resources
HSUPA channels are dedicated channels, and resource consumption of HSUPA services is measured by CEs. When HSUPA is enabled, uplink CE resources are shared between R99 services and HSUPA services. HSUPA increases uplink throughput and improves user experience. However, HSUPA consumes more uplink CE overhead for HARQ and soft handovers. Therefore, uplink CE resources may become a system bottleneck. Uplink CE usage needs to be monitored when HSUPA is enabled. Huawei NodeBs support dynamic HSUPA CE resource management, which uses CE resources efficiently.
3.2.2 RTWP
Just as HSDPA makes the most of downlink power resources, HSUPA maximizes uplink capacity. HSUPA data can always be sent in authorized mode unless the RTWP increase reaches 6 dB. HSUPA increases uplink cell throughput but consumes a large amount of uplink RTWP. The uplink RTWP is monitored in the same way regardless of whether HSUPA is enabled. If RTWP overload occurs, HSUPA service rates must be lowered to ensure the QoS of conversational services.
4.1 Call Congestion in the Basic Call Flow
4.2 Call Congestion Counters
4.3 Signaling Storms and Solutions
4.4 Resource Usage Analysis
This chapter is intended for experts who have a deep understanding of WCDMA networks.
Figure 4-1 Call flowchart showing possible block and failure points
The call flow in Figure 4-1 works as follows:
1. The CN sends a paging message to the RNC.
2. Upon receiving the paging message, the RNC broadcasts the message on the PCH. If the PCH is congested, the RNC may discard the message, as shown by block point 1.
3. The UE may fail to receive the paging message or to access the network, as shown by failure point 2.
4. If the UE receives the paging message, it sends an RRC connection request message to the RNC.
5. Upon receiving the RRC connection request message, the RNC may discard the message if the RNC is congested, as shown by block point 3.
6. If the RNC receives the RRC connection request message and does not discard the message, the RNC determines whether to accept or reject the request. The request may be rejected due to insufficient resources, as shown by block point 4.
7. If the RNC accepts the request, the RNC instructs the UE to set up an RRC connection. However, the RRC connection setup may fail: the UE may not receive the instruction, or the UE may receive the message but find the configuration incorrect, as shown by failure points 5 and 6.
8. If the RRC connection is set up, the UE sends non-access stratum (NAS) messages to negotiate with the CN about service setup. If the CN determines to set up a service, the CN sends an RAB assignment request to the RNC.
9. Upon receiving the RAB assignment request, the RNC accepts or rejects the request based on the resource usage on the RAN side. If the RNC rejects the request, the failure is shown by block point 7.
10. If the RNC accepts the RAB assignment request, the RNC initiates a radio bearer (RB) setup procedure. During the procedure, the RNC allocates resources as follows:
    - The RNC allocates transmission resources over the Iub interface by setting up an RL to the NodeB.
    - The RNC allocates channel resources over the Uu interface by sending an RB setup message to the UE.
    A failure may occur in the RL or RB setup process, as shown by failure points 8 and 9.
BSC6910 UMTS Performance Counter Reference in the BSC6910 UMTS Product Documentation
NodeB Performance Counter Reference in the 3900 Series WCDMA NodeB Product Documentation
VS.RANAP.CsPaging.Loss: measures the number of CS paging messages discarded due to Iu-interface flow control, CPU overload, or PCH congestion.
VS.RANAP.PsPaging.Loss: measures the number of PS paging messages discarded due to Iu-interface flow control, CPU overload, or PCH congestion.
IU Paging Congestion Ratio (RNC): measures the ratio of discarded paging messages to total paging messages at the RNC level and is calculated by the following formula:
IU Paging Congestion Ratio (RNC) = [(VS.RANAP.CsPaging.Loss + VS.RANAP.PsPaging.Loss)/(VS.RANAP.CsPaging.Att + VS.RANAP.PsPaging.Att)] x 100%
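As an illustration of the RNC-level ratio, the following minimal sketch evaluates the formula from four counter values. The function name and the sample values are hypothetical, and returning 0 when no paging attempt was recorded is an assumption of this sketch.

```python
def iu_paging_congestion_ratio(cs_loss: int, ps_loss: int,
                               cs_att: int, ps_att: int) -> float:
    """Return the IU Paging Congestion Ratio (RNC) as a percentage.

    cs_loss / ps_loss: VS.RANAP.CsPaging.Loss / VS.RANAP.PsPaging.Loss
    cs_att  / ps_att : VS.RANAP.CsPaging.Att  / VS.RANAP.PsPaging.Att
    """
    total_att = cs_att + ps_att
    if total_att == 0:
        # Assumption: report 0% when there were no paging attempts.
        return 0.0
    return (cs_loss + ps_loss) / total_att * 100.0


# Hypothetical counter values for one measurement period.
ratio = iu_paging_congestion_ratio(cs_loss=5, ps_loss=15, cs_att=1000, ps_att=1000)
print(f"IU Paging Congestion Ratio (RNC): {ratio:.2f}%")
```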
VS.RRC.Paging1.Loss.PCHCong.Cell: measures the number of paging messages discarded due to PCH congestion in a cell.
IU Paging Congestion Ratio (Cell): measures the ratio of discarded paging messages to total paging messages at the cell level and is calculated by the following formula:
Huawei Proprietary and Confidential Copyright Huawei Technologies Co., Ltd. 23
Counters for measuring the number of rejected RRC connection setup requests, including:
VS.RRC.Rej.ULPower.Cong (due to uplink power congestion)
VS.RRC.Rej.DLPower.Cong (due to downlink power congestion)
VS.RRC.Rej.UL.CE.Cong (due to uplink CE congestion)
VS.RRC.Rej.DL.CE.Cong (due to downlink CE congestion)
VS.RRC.Rej.ULIUBBand.Cong (due to uplink Iub bandwidth congestion)
VS.RRC.Rej.DLIUBBand.Cong (due to downlink Iub bandwidth congestion)
VS.RRC.Rej.Code.Cong (due to downlink code resource congestion)
VS.RRC.AttConnEstab.Sum
This counter measures the total number of RRC connection setup requests in a cell.
Vs.RRC.Block.Rate
This counter measures the call congestion rate and is calculated by the following formula:
Vs.RRC.Block.Rate = Total RRC Rej/VS.RRC.AttConnEstab.Sum x 100%
where Total RRC Rej is calculated as follows:
Total RRC Rej = <VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.UL.CE.Cong> + <VS.RRC.Rej.DL.CE.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.Code.Cong>
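The RRC block rate can be sketched as a sum over the rejection counters divided by the attempt counter. The dictionary-based input format and the sample values below are assumptions made for illustration; the counter names are copied from the list above.

```python
# Rejection counters that make up "Total RRC Rej".
RRC_REJ_COUNTERS = (
    "VS.RRC.Rej.ULPower.Cong",
    "VS.RRC.Rej.DLPower.Cong",
    "VS.RRC.Rej.UL.CE.Cong",
    "VS.RRC.Rej.DL.CE.Cong",
    "VS.RRC.Rej.ULIUBBand.Cong",
    "VS.RRC.Rej.DLIUBBand.Cong",
    "VS.RRC.Rej.Code.Cong",
)


def rrc_block_rate(counters: dict) -> float:
    """Return Vs.RRC.Block.Rate as a percentage for one cell.

    `counters` maps counter names to values (hypothetical input format).
    Missing rejection counters are treated as 0.
    """
    attempts = counters.get("VS.RRC.AttConnEstab.Sum", 0)
    if attempts == 0:
        return 0.0  # Assumption: no attempts means no blocking to report.
    total_rej = sum(counters.get(name, 0) for name in RRC_REJ_COUNTERS)
    return total_rej / attempts * 100.0


# Hypothetical sample for one cell and one measurement period.
sample = {
    "VS.RRC.AttConnEstab.Sum": 500,
    "VS.RRC.Rej.ULPower.Cong": 3,
    "VS.RRC.Rej.DLPower.Cong": 2,
    "VS.RRC.Rej.UL.CE.Cong": 1,
    "VS.RRC.Rej.Code.Cong": 4,
}
print(f"Vs.RRC.Block.Rate = {rrc_block_rate(sample):.2f}%")
```

Keeping the per-cause counters separate in the input also makes it easy to see which resource dominates the rejections.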
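Counters for measuring the number of failed RAB establishments due to power congestion, including:
VS.RAB.FailEstabCS.ULPower.Cong
VS.RAB.FailEstabCS.DLPower.Cong
VS.RAB.FailEstabPS.ULPower.Cong
VS.RAB.FailEstabPS.DLPower.Cong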
Counters for measuring the number of failed RAB establishments due to uplink CE congestion, including:
VS.RAB.FailEstabCS.ULCE.Cong
VS.RAB.FailEstabPS.ULCE.Cong
Counters for measuring the number of failed RAB establishments due to downlink CE congestion, including:
VS.RAB.FailEstabCs.DLCE.Cong
VS.RAB.FailEstabPs.DLCE.Cong
Counters for measuring the number of failed RAB establishments due to downlink code resource congestion, including:
VS.RAB.FailEstabCs.Code.Cong
VS.RAB.FailEstabPs.Code.Cong
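Counters for measuring the number of failed RAB establishments due to Iub bandwidth congestion, including:
VS.RAB.FailEstabCS.DLIUBBand.Cong
VS.RAB.FailEstabCS.ULIUBBand.Cong
VS.RAB.FailEstabPS.DLIUBBand.Cong
VS.RAB.FailEstabPS.ULIUBBand.Cong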
VS.RAB.AttEstab.Cell
This counter measures the total number of RAB establishment requests in a cell.
VS.RAB.Block.Rate
This counter measures the RAB congestion rate and is calculated by the following formula:
VS.RAB.Block.Rate = Total number of failed RAB establishments regardless of the cause of failure/VS.RAB.AttEstab.Cell
Table 4-1 describes solutions to signaling storms. These solutions aim to reduce signaling load so that the network capacity does not need to be expanded immediately.
Table 4-1 Signaling storm causes and solutions
UE Behavior | UE Type | Solution
The UE does not send signaling connection release indication (SCRI) messages. | Common Nokia/Samsung/Motorola UEs | Enable the CELL_PCH function to reduce signaling traffic because these common UEs do not send SCRI messages.
The UE sends SCRI messages that do not contain the cause value. | iPhone (R6) | Enable the enhanced fast dormancy (EFD) function on the RNC, and add the international mobile equipment identities (IMEIs) of the iPhones to the whitelist.
The R8 UE sends SCRI messages containing the cause value. | | Enable the R8 FD function on the RNC, and add the IMEIs of the iPhones to the whitelist.
In most cases, an abnormal KPI triggers the troubleshooting process. Determining the possible top N problem cells facilitates follow-up troubleshooting. You are advised to analyze accessibility KPIs to identify the system bottleneck that causes access congestion.
Figure 4-5 Process for analyzing the CPU usage of the control plane
As shown in Figure 4-5, if the CPU usage of the control plane is above 50%, identify the causes to prevent a continuous increase in the CPU usage. If the high CPU usage is caused by signaling storms, determine whether the parameter settings are correct. If the high CPU usage is caused by a traffic increase, add an EGPUa board.
Control-plane and user-plane sharing adjusts the load on EGPUa boards to balance the average load on the control and user planes. NodeB Management (NBM) load cannot be shared between the control and user planes. Therefore, you must perform dynamic cell reallocation to balance the NBM load among EGPUa boards in the control-plane resource pool.
After the traffic model changes, reduce the number of cells under the RNC or add an EGPUa board when both of the following conditions are met:
Control-plane and user-plane sharing is enabled.
The load is imbalanced between the control and user planes. For example, the control-plane load is lighter than the user-plane load, but the available control-plane resources are insufficient to add a NodeB or cell.
If the CPU usage of the user plane is above 60%, identify the causes to prevent a continuous increase in the CPU usage. If the high CPU usage is caused by a traffic increase, add an EGPUa board.
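The CPU thresholds quoted in this section (50% for the control plane, 60% for the user plane) lend themselves to a simple routine check. The sketch below is illustrative only; the function name and the input format are hypothetical and do not correspond to any RNC command or counter.

```python
CONTROL_PLANE_THRESHOLD = 50.0  # percent, from the control-plane guideline
USER_PLANE_THRESHOLD = 60.0     # percent, from the user-plane guideline


def planes_needing_analysis(cp_usage: float, up_usage: float) -> list:
    """Return the planes whose CPU usage exceeds the guideline threshold.

    cp_usage / up_usage are CPU usage percentages (hypothetical inputs,
    for example averaged over the busy hour).
    """
    flagged = []
    if cp_usage > CONTROL_PLANE_THRESHOLD:
        flagged.append("control plane")
    if up_usage > USER_PLANE_THRESHOLD:
        flagged.append("user plane")
    return flagged


print(planes_needing_analysis(cp_usage=55.0, up_usage=40.0))
```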
AMR 12.2 kbit/s
CS 64 kbit/s
PS 64 kbit/s
PS 128 kbit/s
PS 144 kbit/s
PS 384 kbit/s
SF32 (HSUPA)
SF16 (HSUPA)
SF8 (HSUPA)
SF4 (HSUPA)
2 x SF4 (HSUPA)
2 x SF2 (HSUPA)
2 x SF2+2 x SF4 (HSUPA)
2 x M2+2 x M4
NOTE
The preceding table assumes that signaling radio bearers (SRBs) are carried over HSUPA. If an SRB is carried on an R99 DCH separately, an extra CE is consumed by the SRB. In this case, add 1 to each number listed in the preceding table.
HSDPA services do not consume downlink R99 CE resources. HSUPA services and R99 services share uplink CE resources. CE congestion or routine CE usage monitoring may trigger CE resource analysis. If the CE usage remains above a preset capacity expansion threshold or CE congestion occurs, the CE resources are insufficient and must be increased to ensure system stability.
Figure 4-7 Process for analyzing CE congestion
CE resources can be shared within a resource group. Therefore, CE usage on the NodeB must be calculated to determine where CE congestion occurs in a resource group or on the NodeB. If CE congestion occurs in a resource group, reallocate CE resources between resource groups. If CE congestion occurs on the site, perform site capacity expansion and modify parameter settings.
After IP RAN is introduced, Iub resources no longer need to be monitored. This section is retained to provide a complete solution so that operators can compare solutions provided by different vendors.
If insufficient Iub bandwidth causes congestion, check the Iub bandwidth usage. If the Iub bandwidth usage remains above 80% for a period of time, the Iub bandwidth is insufficient. If no more Iub bandwidth is available or the issue is not urgent, decrease the activity factor for PS services to admit more users. The activity factor, which is the ratio of actual bandwidth occupied by a UE to the allocated bandwidth, is used to estimate the real bandwidth usage. The activity factor can be set on a per-NodeB basis. The default activity factor is 0.7 for voice services and 0.4 for PS best effort (BE) services.
Figure 4-8 Process for analyzing Iub bandwidth congestion
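To show why a lower PS activity factor admits more users, the following sketch estimates the bandwidth that admission control would count against the Iub link. It assumes that the estimate is simply the allocated bandwidth multiplied by the activity factor, summed over all bearers; the service labels, bearer rates, and function name are hypothetical.

```python
# Default activity factors quoted in the text.
DEFAULT_ACTIVITY_FACTOR = {"voice": 0.7, "ps_be": 0.4}


def estimated_iub_usage_kbps(allocations, activity_factor=None):
    """Estimate real Iub usage as sum(allocated bandwidth x activity factor).

    allocations: iterable of (service_type, allocated_kbps) pairs
                 (hypothetical input format).
    """
    factors = DEFAULT_ACTIVITY_FACTOR if activity_factor is None else activity_factor
    return sum(kbps * factors[service] for service, kbps in allocations)


# Hypothetical load: 10 voice bearers and 5 PS BE bearers at 384 kbit/s.
load = [("voice", 12.2)] * 10 + [("ps_be", 384.0)] * 5

default_usage = estimated_iub_usage_kbps(load)
reduced_usage = estimated_iub_usage_kbps(load, {"voice": 0.7, "ps_be": 0.3})
print(f"Estimated usage with PS factor 0.4: {default_usage:.1f} kbit/s")
print(f"Estimated usage with PS factor 0.3: {reduced_usage:.1f} kbit/s")
```

With the smaller PS factor, the same set of bearers is counted as using less bandwidth, which leaves headroom for additional admissions at the cost of a higher risk of real congestion.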
Enable LDR and OLC for temporary troubleshooting. Add carriers or split cells for a long-term solution.
NOTE
RRC-related counters are updated based on the cause values in the RRC connection requests. This facilitates analysis of risks related to signaling storms.
Generally, adding a carrier is the most effective means for relieving uplink power congestion. If no additional carrier is available, consider deploying a new site or splitting the cell by reducing the tilt angle of the antenna.
Reduce the maximum number of PS RABs. Enable code-based LDR. Reduce the minimum number of codes reserved for HSDPA services. Activate the license for dynamic code allocation on the NodeB.
The thresholds for parameters involved in the preceding operations must be set based on the operator's requirements for service quality.
Reduce the period during which PS services are carried on FACHs to enable fast UE state transition to the CELL_PCH state or idle mode. In addition, set up RRC connections on DCHs if DCH resources are sufficient. Add an SCCPCH to carry FACHs.
5 Counter Definitions
Table 5-1 defines all performance counters mentioned in the preceding chapters.
Table 5-1 Counters
Counter Name | Associated Counter | Calculation Formula
>>Blocking Metrics
Call block rate | Vs.Call.Block.Rate (custom) | Vs.RRC.Block.Rate + (<RRC.SuccConnEstab.sum>/(<VS.RRC.AttConnEstab.CellDCH> + <VS.RRC.AttConnEstab.CellFACH>)) x Vs.RAB.Block.Rate
 | Vs.RRC.Block.Rate (custom) | (<VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.ULCE.Cong> + <VS.RRC.Rej.DLCE.Cong> + <VS.RRC.Rej.Code.Cong>)/<VS.RRC.AttConnEstab.Sum>
 | Vs.RAB.Block.Rate (custom) | (<VS.RAB.FailEstabCS.ULPower.Cong> + <VS.RAB.FailEstabCS.DLPower.Cong> + <VS.RAB.FailEstabPS.ULPower.Cong> + <VS.RAB.FailEstabPS.DLPower.Cong> + <VS.RAB.FailEstabCS.ULCE.Cong> + <VS.RAB.FailEstabPS.ULCE.Cong> + <VS.RAB.FailEstabCs.DLCE.Cong> + <VS.RAB.FailEstabPs.DLCE.Cong> + <VS.RAB.FailEstabCs.Code.Cong> + <VS.RAB.FailEstabPs.Code.Cong> + <VS.RAB.FailEstabCS.DLIUBBand.Cong> + <VS.RAB.FailEstabCS.ULIUBBand.Cong> + <VS.RAB.FailEstabPS.DLIUBBand.Cong> + <VS.RAB.FailEstabPS.ULIUBBand.Cong>)/VS.RAB.AttEstab.Cell
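The call block rate combines the two stage-level rates: a call is blocked either at RRC setup or, if the RRC connection succeeds, at RAB setup. The sketch below evaluates that formula with both rates expressed as fractions (0 to 1); the function name and the sample values are hypothetical.

```python
def call_block_rate(rrc_block_rate: float, rab_block_rate: float,
                    rrc_succ: int, rrc_att_dch: int, rrc_att_fach: int) -> float:
    """Vs.Call.Block.Rate = Vs.RRC.Block.Rate
    + (RRC.SuccConnEstab.sum / (VS.RRC.AttConnEstab.CellDCH
                                + VS.RRC.AttConnEstab.CellFACH))
    x Vs.RAB.Block.Rate, with all rates as fractions."""
    attempts = rrc_att_dch + rrc_att_fach
    # Assumption: with no RRC attempts, the RAB term contributes nothing.
    rrc_success_ratio = rrc_succ / attempts if attempts else 0.0
    return rrc_block_rate + rrc_success_ratio * rab_block_rate


# Hypothetical values: 2% RRC blocking, 1% RAB blocking, 95% RRC success.
rate = call_block_rate(0.02, 0.01, rrc_succ=950, rrc_att_dch=600, rrc_att_fach=400)
print(f"Vs.Call.Block.Rate = {rate * 100:.2f}%")
```

Weighting the RAB block rate by the RRC success ratio avoids counting RAB blocking for calls that never reached the RAB setup stage.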
>>Power metrics
R99_TCP_Utilization_Ratio | VS.MeanTCP.NonHS | VS.MeanTCP.NonHS/Configured_Total_Cell_TCP (43 dBm or 46 dBm)
Total_TCP_Utilization_Ratio | VS.MeanTCP | VS.MeanTCP/Configured_Total_Cell_TCP
Max UL RTWP | VS.MaxRTWP | VS.MaxRTWP
Mean UL RTWP | VS.MeanRTWP | VS.MeanRTWP
Min UL RTWP | VS.MinRTWP | VS.MinRTWP
UL ENU Ratio | VS.RAC.UL.EqvUserNum | VS.RAC.UL.EqvUserNum/UlTotalEqUserNum
>>IUB Metrics
IUB BW Utilization ratio | NODEB_Throughput (custom), NODEB_Trans_Cap (custom) | NODEB_Throughput/NODEB_Trans_Cap
NODEB_Trans_Cap | VS.IPDLTotal.1, VS.IPDLTotal.2, VS.IPDLTotal.3, VS.IPDLTotal.4 | (VS.IPDLTotal.1 + VS.IPDLTotal.2 + VS.IPDLTotal.3 + VS.IPDLTotal.4)
NODEB_Throughput | NODEB_Throughput_DL (custom), NODEB_Throughput_UL (custom) | MAX(NODEB_Throughput_DL, NODEB_Throughput_UL)
NODEB_Throughput_DL | VS.IPDLAvgUsed.1, VS.IPDLAvgUsed.2, VS.IPDLAvgUsed.3, VS.IPDLAvgUsed.4 | (VS.IPDLAvgUsed.1 + VS.IPDLAvgUsed.2 + VS.IPDLAvgUsed.3 + VS.IPDLAvgUsed.4)
NODEB_Throughput_UL | VS.IPULAvgUsed.1, VS.IPULAvgUsed.2, VS.IPULAvgUsed.3, VS.IPULAvgUsed.4 | (VS.IPULAvgUsed.1 + VS.IPULAvgUsed.2 + VS.IPULAvgUsed.3 + VS.IPULAvgUsed.4)
>>PCH&FACH Utilization Metrics
PCH Physical Channel Utility Ratio | VS.UTRAN.AttPaging1 | VS.UTRAN.AttPaging1/(60 x 60 x 5/0.01)
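The Iub metrics chain several custom quantities: the NodeB throughput is the larger of the summed downlink and uplink averages, and the utilization ratio divides it by the summed VS.IPDLTotal values. A minimal sketch of that chain follows; the list-based inputs, their units, and the sample values are hypothetical.

```python
def iub_bw_utilization(dl_avg_used, ul_avg_used, dl_total) -> float:
    """IUB BW Utilization ratio = NODEB_Throughput / NODEB_Trans_Cap.

    dl_avg_used: VS.IPDLAvgUsed.1..4 values
    ul_avg_used: VS.IPULAvgUsed.1..4 values
    dl_total   : VS.IPDLTotal.1..4 values
    All values must share the same unit (hypothetical inputs).
    """
    throughput_dl = sum(dl_avg_used)              # NODEB_Throughput_DL
    throughput_ul = sum(ul_avg_used)              # NODEB_Throughput_UL
    throughput = max(throughput_dl, throughput_ul)  # NODEB_Throughput
    capacity = sum(dl_total)                      # NODEB_Trans_Cap
    # Assumption: report 0 when no transmission capacity is configured.
    return throughput / capacity if capacity else 0.0


# Hypothetical values for a NodeB with two active IP paths.
ratio = iub_bw_utilization([10, 20, 0, 0], [5, 5, 0, 0], [50, 50, 0, 0])
print(f"IUB BW Utilization ratio = {ratio:.0%}")
```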
Calculation Formula
(1) Utilization of FACH carried on non-standalone SCCPCH
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/((60 x <SP> x 168 x 1/0.01) x VS.PCH.Bandwidth.UsageRate x 6/7 + (60 x <SP> x 360 x 1/0.01) x (1 - VS.PCH.Bandwidth.UsageRate x 6/7))
where
VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> x SP x 60.0)
(2) Utilization of FACH carried on standalone SCCPCH
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx x 8/(60 x <SP> x 360 x 1/0.01)
Associated counter: VS.FACHUEs
(1) FACH_60_USER_SWITCH = OFF
FACH UserNum Utility Ratio = VS.FACHUEs/30 x 100%
(2) FACH_60_USER_SWITCH = ON
FACH UserNum Utility Ratio = VS.FACHUEs/60 x 100%
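The FACH formulas are easier to check when written as code. The sketch below makes two assumptions that are not stated in the text: SP is the measurement period in minutes, and the second capacity term of the non-standalone formula is weighted by (1 - VS.PCH.Bandwidth.UsageRate x 6/7), so that the two capacities are weighted by complementary shares. Function names and sample values are hypothetical.

```python
def fach_utility_ratio(fach_bytes_tx: float, sp_minutes: float,
                       pch_usage_rate=None) -> float:
    """FACH Utility Ratio as a fraction.

    fach_bytes_tx : VS.CRNCIubBytesFACH.Tx
    sp_minutes    : measurement period SP (assumed to be in minutes)
    pch_usage_rate: VS.PCH.Bandwidth.UsageRate for a non-standalone SCCPCH,
                    or None for a standalone SCCPCH
    """
    ttis = 60 * sp_minutes / 0.01   # 60 x SP x 1/0.01
    fach_bits = fach_bytes_tx * 8
    if pch_usage_rate is None:
        capacity = ttis * 360       # standalone SCCPCH
    else:
        pch_share = pch_usage_rate * 6 / 7
        capacity = ttis * 168 * pch_share + ttis * 360 * (1 - pch_share)
    return fach_bits / capacity


def fach_usernum_utility_ratio(fach_ues: float, switch_on: bool) -> float:
    """FACH UserNum Utility Ratio in percent (VS.FACHUEs / 30 or / 60)."""
    limit = 60 if switch_on else 30  # FACH_60_USER_SWITCH = ON or OFF
    return fach_ues / limit * 100.0


# Hypothetical 60-minute period on a standalone SCCPCH.
standalone = fach_utility_ratio(1_620_000, sp_minutes=60)
print(f"FACH Utility Ratio (standalone) = {standalone:.1%}")
print(f"FACH UserNum Utility Ratio = {fach_usernum_utility_ratio(15, False):.0f}%")
```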
>>OVSF Utilization Metrics
OVSF occupancy | VS.RAB.SFOccupy | VS.RAB.SFOccupy
OVSF Usability Ratio | VS.RAB.SFOccupy.Ratio | VS.RAB.SFOccupy/256
DCH OVSF Ratio | DCH_OVSF_Utilization | ((<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64 + (<VS.MultRAB.SF8> + <VS.SingleRAB.SF8>) x 32 + (<VS.MultRAB.SF16> + <VS.SingleRAB.SF16>) x 16 + (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8 + (<VS.MultRAB.SF64> + <VS.SingleRAB.SF64>) x 4 + (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2 + (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>))/256
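In the DCH OVSF formula, each code is weighted by the number of SF256-equivalent codes it occupies (256 divided by its spreading factor). The sketch below applies those weights; the dictionary inputs keyed by spreading factor and the sample values are hypothetical.

```python
# SF256-equivalent weight of one code at each spreading factor (256 / SF).
SF_WEIGHTS = {4: 64, 8: 32, 16: 16, 32: 8, 64: 4, 128: 2, 256: 1}


def dch_ovsf_utilization(single_rab: dict, multi_rab: dict) -> float:
    """DCH_OVSF_Utilization as a fraction of the 256 SF256 codes.

    single_rab: {SF: VS.SingleRAB.SF<SF> value}
    multi_rab : {SF: VS.MultRAB.SF<SF> value}
    Spreading factors missing from a dictionary count as 0.
    """
    occupied = sum(
        weight * (single_rab.get(sf, 0) + multi_rab.get(sf, 0))
        for sf, weight in SF_WEIGHTS.items()
    )
    return occupied / 256


# Hypothetical cell: 10 single-RAB users at SF128, 2 at SF32, 1 multi-RAB at SF64.
utilization = dch_ovsf_utilization({128: 10, 32: 2}, {64: 1})
print(f"DCH OVSF utilization = {utilization:.1%}")
```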
When CE Overbooking is enabled, NodeB_UL_CE_MEAN_RATIO is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.NodeB.ULCreditUsed.Mean/2)/UL CE Cfg Number
When CE Overbooking is disabled, NodeB_UL_CE_MEAN_RATIO is calculated by the following formula:
NodeB_UL_CE_MEAN_RATIO = Sum_AllCells_of_NodeB(VS.LC.ULCreditUsed.Mean/2)/UL CE Cfg Number
NodeB_UL_GRP_CE_MEAN_RATIO = Sum_AllCells_of_ResourceGroup(VS.LC.ULCreditUsed.Mean/2)/UL GRP CE Cfg Number
Associated counter: VS.LC.DLCreditUsed.Mean
Calculation formula: Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean)/MIN(NodeB License DL CE Number, NodeB Physical DL CE Capacity)
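The CE ratios can be sketched as follows. The uplink formula halves each per-cell credit value before summing, whereas the downlink formula sums the credit values directly and divides by the smaller of the licensed and physical capacities. The function names, per-cell lists, and configured CE numbers below are hypothetical.

```python
def nodeb_ul_ce_mean_ratio(ul_credit_used_mean_per_cell, ul_ce_cfg_number: int) -> float:
    """NodeB_UL_CE_MEAN_RATIO = Sum_AllCells(ULCreditUsed.Mean / 2) / UL CE Cfg Number.

    Pass VS.NodeB.ULCreditUsed.Mean values when CE Overbooking is enabled,
    or VS.LC.ULCreditUsed.Mean values when it is disabled.
    """
    used_ce = sum(credit / 2 for credit in ul_credit_used_mean_per_cell)
    return used_ce / ul_ce_cfg_number


def nodeb_dl_ce_mean_ratio(dl_credit_used_mean_per_cell,
                           license_dl_ce: int, physical_dl_ce: int) -> float:
    """Sum_AllCells(VS.LC.DLCreditUsed.Mean) / MIN(license, physical capacity)."""
    return sum(dl_credit_used_mean_per_cell) / min(license_dl_ce, physical_dl_ce)


# Hypothetical three-cell NodeB with 128 configured UL CEs.
ul_ratio = nodeb_ul_ce_mean_ratio([40, 60, 20], ul_ce_cfg_number=128)
dl_ratio = nodeb_dl_ce_mean_ratio([30, 50], license_dl_ce=128, physical_dl_ce=192)
print(f"UL CE mean ratio = {ul_ratio:.1%}, DL CE mean ratio = {dl_ratio:.1%}")
```

The same uplink function can be applied to the cells of one resource group, with UL GRP CE Cfg Number as the denominator, to obtain NodeB_UL_GRP_CE_MEAN_RATIO.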