|Year : 2009 | Volume
| Issue : 5 | Page : 212-217
Ambience-Based Voice Over Internet Protocol Quality Testing Model
Karthikeyan Vaiapury1, Malmurugan Nagarajan1, Sunil Kumar Jain2
1 Open Design House - Mobile Technologies, Satyam Computer Services Limited, Singapore
2 Satyam Computer Services Limited, Bangalore, India
Date of Web Publication: 5-Nov-2009
Open Design House - Mobile Technologies, Satyam Computer Services Limited
Abstract
In this paper, we explore a new voice quality management model for ambience environments, suitable for voice over internet protocol (VoIP) calls in a wireless LAN (IEEE 802.11) Linux environment. The system is based on a setup that captures the environment noise level using a noise detector, and an adaptive audio manager that tunes the audio level accordingly, thereby ensuring quality. Existing models such as the E-model, the PESQ model and the adaptive model address QoS from a VoIP networking perspective, but they do not address voice quality management in a real-time environment to improve voice quality. We propose to use background noise and its associated source information in an adaptive environment to boost user perception and audio level, thereby ensuring quality. This issue is important because, in real time, consumers might be interested in mobiles that account for the ambience effect and determine the human audible effect based on environment conditions. In our experiments, the proposed method outperforms the existing method in terms of the QoS factor metric; we present extended results with a larger number of calls.
Keywords: Linux mobile, QoS in Voice over internet protocol, SIP, Voice quality management, Wireless mobile multimedia.
How to cite this article:
Vaiapury K, Nagarajan M, Jain SK. Ambience-Based Voice Over Internet Protocol Quality Testing Model. IETE J Res 2009;55:212-7
1. Introduction
Due to the significant decrease in the cost of mobile devices and internet services, the usage of VoIP is growing tremendously. In this paper we concentrate on quality of service with regard to VoIP. In general, quality of service includes ensuring clarity, avoiding delay and alleviating echo, etc.
In addition to the above, the factors that influence VoIP quality of service, as stated in  , are as follows:
- throughput - amount of data (inbound/outbound) that goes through the network
- availability of service - continuous connection for data/voice
- delay - time taken to transfer packets over the internet, particularly when routes are congested
- delay variation - packets are routed differently and take more time to arrive
- loss of packets - under major congestion, packets are sometimes dropped and not forwarded
It is also noted that significantly more research has been done on QoS of VoIP from a network perspective than on voice quality management.
If somebody attends a call amidst the background noise of diverse environments, it is difficult to perceive the speech from the other end with clarity. The system should be intelligent enough to sense the environment and tune the audio level accordingly. The set-up requires a sensor, a background noise detector and an AAF (adaptive audibility factor) manager to identify the environment noise level and tune the audio level.
This work includes a) forming a QoS framework that addresses ambience in VoIP networks, b) sensing environment noise and burst level under a particular context, c) identifying when to raise the audio signal quality and d) validation.
We have already tested the performance of our model in  . In this work, we provide extended results with a larger number of calls. VoIP faces the challenge of quantifying voice quality. Chandler has addressed the need for and significance of voice quality management. He has also explained why standard QoS is not sufficient for voice. The QoS of the network alone is not enough to judge VoIP call quality. A high-quality customer experience needs to be considered when monitoring and managing the quality of VoIP calls. Kreiman et al. found that forming valid measures of vocal quality with accuracy and replicability is a significant challenge. Psychophysical research methods would be of added value in estimating such measures. The E-model is widely adopted to calculate the QoS score of a network under normal conditions. Further, it is stated in  that it is not a true psychophysical model and cannot be used to predict the absolute opinion of an individual user. The existing QoS models (E-model, PESQ model and adaptive model) address QoS for VoIP calls. However, they do not address voice quality management in a real-time environment to improve voice quality. QoS is required to manage a multitude of applications such as VoIP, streaming, etc.
As stated in  , VQmon is used for non-intrusive (passive) monitoring of RTP streams and produces an R-factor from its observations. The quality of VoIP calls is usually given by the R-score, with respect to delay and loss of packets (refer to equation 1). PSQM is used to analyze the distortion of voice signals through a VoIP network and to produce an estimated MOS score (refer to equation 2).
It is to be noted that the existing models do not include a quality of service metric from the perspective of user satisfaction. Based on our survey of the literature, we found no work so far that quantifies the quality of information using the user's environment information at a particular instant of time. There is a significant amount of research on QoS of VoIP from a network perspective rather than on voice quality management. This paper provides an extended study and proposes to measure background noise during a conversation and selectively boost the voice levels depending on the level of the background noise.
The E-model and other psychological models are available for measuring QoS of VoIP in networks. Following is a brief introduction to the R-value used in the E-model. There exists a non-linear mapping between R and the MOS score, and a comparison table between the two, with the meanings of the R and MOS values, is given below. A MOS score ranges from 1 for the worst rated call to 5 for the best rated call. A typical range for voice over IP is from 3.5 to 4.2. The R-value is calculated as shown in equation 1.
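In the standard E-model formulation (ITU-T Recommendation G.107), the rating combines its terms additively. Written with the symbols used in this paper, it reads:

```latex
R = R_0 - I_s - I_d - I_e + A \tag{1}
```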
where R_0 is the signal-to-noise ratio, with a default value of 100. I_s represents the combination of voice signal impairments such as packet loss and echo, I_d represents impairments associated with end-to-end delay and delay jitter, I_e represents impairments caused by low-rate codecs and encoding artifacts, and A represents the user convenience factor. Delay, delay jitter and packet loss are thus combined into a single parameter R. Further, the non-linear mapping between R and MOS is shown in equation 2. The MOS value of a VoIP call, expressed as a non-linear function of R, is as follows:
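The conversion from R to MOS given in ITU-T Recommendation G.107 is a cubic polynomial clamped at both ends, and can be written as:

```latex
\mathrm{MOS} =
\begin{cases}
1, & R \le 0 \\
1 + 0.035\,R + 7 \times 10^{-6}\, R\,(R - 60)(100 - R), & 0 < R < 100 \\
4.5, & R \ge 100
\end{cases}
\tag{2}
```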
One can refer to [Figure 1] to understand how the threshold level is fixed in our model. Let us analyze the R value and MOS in terms of the environment sound sources shown in [Table 1] and [Table 2]. Under ideal conditions, users are satisfied with R values of 91 and 80 for noise levels of 50 and 60 dB (A), respectively [Table 2].
However, when the room noise is around 70 dB (A), the R value is around 59, which is poor. The corresponding MOS value is 3.06, where 4.5 is perceived to be the best. This scenario illustrates the need for voice quality management. User satisfaction analysis is done in the E-model using R ranges.
The MOS value ranges from 1 to 5, and the score can be interpreted as follows:
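The R values quoted above can be turned into MOS estimates with a few lines of code. The following is a minimal sketch, assuming the additive form of equation 1 and the standard ITU-T G.107 mapping for equation 2; the function names `r_value` and `r_to_mos` are illustrative and not from the paper.

```python
def r_value(r0, i_s, i_d, i_e, a):
    """Equation 1 (assumed additive form): rating R from the impairment terms."""
    return r0 - i_s - i_d - i_e + a


def r_to_mos(r):
    """Equation 2 (standard G.107 mapping): estimated MOS for a rating R."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


if __name__ == "__main__":
    # Room noise levels and R values as quoted in the text.
    for noise_db, r in [(50, 91), (60, 80), (70, 59)]:
        print(f"room noise {noise_db} dB(A): R = {r}, MOS = {r_to_mos(r):.2f}")
```

For R = 59 this mapping gives a MOS of about 3.05, in line with the value of 3.06 quoted for an R value of around 59.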
5 - perfect. (Example: Like face-to-face conversation or radio reception)
4 - fair. Imperfections can be perceived, but sound still clear. This is (supposedly) the range for cell phones.
3 - annoying.
2 - very annoying. Nearly impossible to communicate.
1 - impossible to communicate
Room noise at both the sender and receiver ends is considered in the E-model for calculating QoS in VoIP networks. From the perspective of psychophysical research, we provide some sample sound sources and their respective dB ranges in [Table 1]. As stated in  , the E-model is based on combinations of impairments, using stored information on the effects of individual impairments such as delay, jitter, packet loss and codec performance. The E-model is therefore useful for network planners in designing networks. For example, equations 1 and 2 are used for calculating the R-value and MOS score, respectively.
On the other hand, Perceptual Evaluation of Speech Quality (PESQ) predicts MOS (Mean Opinion Score) based upon comparison of a voice file that has been processed by the network under test against a clean reference file. Tests using these models are intrusive because they require a dedicated test call, rather than actual conversations.
2. Proposed Methodology
We propose a model that estimates QoI (Quality of Information), which is based on the parameters addressed in existing QoS models as well as on an adaptive audibility factor under ambience environment. A voice quality management model is proposed for ambience environment because the R/MOS score in the current models does not consider voice quality adjustment that depends on the environment at a particular instant.
In the proposed framework, shown in [Figure 2], the audio level is adaptively increased based on feedback from the environment noise detector at the receiver end. The idea is to measure background noise during a conversation and selectively boost the voice levels depending on the level of the background noise. In [Figure 2], VAD, CNG and EC denote Voice Activity Detector, Comfort Noise Generator and Echo Canceller, respectively. The algorithm of the proposed framework is given below:
For each VoIP call session,
  Step 1: Start AAF Decision Manager
  For each call instance,
    Step 2: Calculate environment background noise
    Step 3: Determine AAF (refer equation 3)
    Step 4: If AAF = 1, increase audibility level by 2 units
            Else if AAF = 0.5, increase audibility level by 1 unit
            Else null
  End
  Step 5: Calculate QoI score (refer equation 5)
End
As seen in the above algorithm, the adaptive audibility factor is introduced to improve the perceived quality by adjusting the audio level at the receiver (hearing) end at a particular time instant. This is achieved by the AAF decision manager with the aid of the AAF (refer equation 3).
AAF (Adaptive Audibility Factor) is defined as in equation 3.
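From the thresholds described in this section (70 dB (A) and above, and 60 to 69 dB (A)), the definition can be written as a piecewise function. Treating the middle band as 60 ≤ BGN < 70 for non-integer readings is an assumption:

```latex
\mathrm{AAF} =
\begin{cases}
1, & \mathrm{BGN} \ge 70 \\
0.5, & 60 \le \mathrm{BGN} < 70 \\
0, & \text{otherwise}
\end{cases}
\tag{3}
```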
where BGN is Background Noise level in decibels dB (A).
The above equation is formed with reference to [Table 1]. It uses experience as the inference rule for setting an adaptive audible level at any instance in a call. The higher the noise, the higher the AAF value. The volume of the incoming speech signal is increased by up to two units, where one unit is one dB. When BGN is greater than or equal to 70 dB, the audio level is increased by two units, whereas when BGN is between 60 and 69 dB, the audio level is increased by one unit. The audio level is not increased at all when AAF is zero. The AAF decision manager adaptively changes the hearing level at the receiver end at any particular instance. The thresholds are fixed based on [Figure 1]. They are the transition points: 60 dB, above which the noise exceeds office level, and 70 dB, at which loud conversation takes place [Table 1].
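The threshold rule and the volume adjustment of Step 4 can be sketched in a few lines. This is a minimal illustration under the thresholds stated above; the function names are hypothetical, and treating the middle band as 60 ≤ BGN < 70 for non-integer readings is an assumption.

```python
def adaptive_audibility_factor(bgn_db):
    """Return the AAF for a background noise level given in dB(A)."""
    if bgn_db >= 70:
        return 1.0  # loud-conversation level and above
    if bgn_db >= 60:
        return 0.5  # above office level (ASSUMPTION: band is 60 <= BGN < 70)
    return 0.0      # no adjustment needed


def boost_units(aaf):
    """Step 4 of the algorithm: units to add, where one unit is one dB."""
    if aaf == 1.0:
        return 2
    if aaf == 0.5:
        return 1
    return 0


def adjust_audio_level(current_level_db, bgn_db):
    """Return the receiver audio level after the AAF-based adjustment."""
    return current_level_db + boost_units(adaptive_audibility_factor(bgn_db))


if __name__ == "__main__":
    for bgn in (35, 60, 69, 70, 85):
        aaf = adaptive_audibility_factor(bgn)
        print(f"BGN = {bgn} dB(A): AAF = {aaf}, boost = {boost_units(aaf)} dB")
```

Keeping the AAF decision separate from the boost lookup mirrors Steps 3 and 4 of the algorithm, so either threshold or either boost size can be changed without touching the other.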
As said earlier, overall QoI according to the proposed model is calculated from both network and voice quality management.
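The weights described next suggest a weighted combination of the E-model score and VPAF. One plausible form, stated here as an assumption (including how R is scaled relative to VPAF), is:

```latex
\mathrm{QoI} = \alpha \, R + \beta \, \mathrm{VPAF}, \qquad \alpha + \beta = 1
```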
where α and β represent the weightage given to the E-model and VPAF, respectively, such that α + β = 1. As seen in equation 4, we introduce VPAF (voice perceivance accuracy factor) as a measure of the voice quality accuracy of a call. VPAF models the accuracy of the audio adjustment performed to increase the user's audio perceivance. It ensures that the user has a good audible level even in the presence of noise. More weightage is given to VPAF since the model is oriented towards voice quality management, while the R score is still taken into account.
More precisely, VPAF represents the probability of correctly adjusted audio instances in a call. VPAF can be calculated using equation 6.
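Read as a ratio of counts for a single call j, which is how the symbols are defined next, this probability can be written as:

```latex
\mathrm{VPAF}_j = \frac{I_{C,j}}{I_{T,j}}
```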
where I_C,j represents the number of correctly adjusted audio instances in a call j and I_T,j represents the total number of audio instances in call j.
For all instance sessions in a call j, the probability of correctly adjusted audio instances needs to be calculated. One instance is a period during which a person talks at one end.
where m is the number of instances in a call j. VPAF takes a value between 0 and 1. Likewise, for all instances of a call, the receiver's hearing level is dynamically changed by the AAF manager based on the environment at the corresponding end.
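To make the roles of VPAF and the weights concrete, the sketch below computes VPAF as a ratio of counts and combines it with an E-model score in a weighted sum. The weighted-sum form, the weights 0.4 and 0.6, the rescaling of R to the range 0 to 1, and the count of 20 instances are all assumptions made for illustration, not values from the paper.

```python
def vpaf(correct_instances, total_instances):
    """Fraction of audio instances in a call that were adjusted correctly."""
    if total_instances <= 0:
        raise ValueError("a call needs at least one audio instance")
    if not 0 <= correct_instances <= total_instances:
        raise ValueError("correct_instances must lie in [0, total_instances]")
    return correct_instances / total_instances


def qoi(r_score, vpaf_value, alpha=0.4, beta=0.6):
    """Weighted combination of the E-model term and VPAF.

    ASSUMPTION: a plain weighted sum. alpha and beta are hypothetical values,
    chosen only so that beta > alpha and alpha + beta = 1.
    """
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    return alpha * r_score + beta * vpaf_value


if __name__ == "__main__":
    r_normalised = 59 / 100  # ASSUMPTION: R rescaled to [0, 1]
    for correct in (0, 10, 20):  # hypothetical counts out of 20 instances
        v = vpaf(correct, 20)
        print(f"VPAF = {v:.2f}, QoI = {qoi(r_normalised, v):.3f}")
```

With the E-model term held fixed, QoI in this sketch grows linearly with VPAF, which is the qualitative behaviour the model is designed to capture.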
3. Testbed Implementation
Our test bed comprises PC workstations connected over a wireless LAN using IEEE 802.11g network interface cards (NICs) at nodes in our mobile open design house laboratory. The wireless cards used in our PCs are of type Planex GW-US54GZL. A Linksys WRT54G Wireless-G 2.4 GHz broadband router is used, which can support up to 54 Mbps. The ZD1211 driver is configured to work under Linux kernel 2.6.9.
4. Results and Discussion
In this work, we extend the simulation results to 20 calls in a setup where the background room noise at the receiver is equal to 70 dB (A). We have already compared the performance of the E-model with that of our proposed model.
As an illustration, we show the behavior of MOS with PLR (packet loss rate) for the work  in [Figure 3]. MOS is highly sensitive to PLR: as can be seen, the MOS score deteriorates whenever the packet loss rate is relatively high.
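One way to reproduce this qualitative trend is through the E-model's packet-loss term for random loss, which follows the form used in ITU-T G.107. The sketch below is an illustration only: the baseline rating of 93.2 and the codec parameters Ie = 0 and Bpl = 4.3 (G.711-like defaults) are assumptions, not values taken from this paper's experiments.

```python
def r_to_mos(r):
    """Standard G.107 mapping from rating R to estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def loss_impairment(plr_percent, ie=0.0, bpl=4.3):
    """Effective equipment impairment under random loss.

    Uses Ie + (95 - Ie) * Ppl / (Ppl + Bpl), with Ppl in percent.
    ASSUMPTION: ie and bpl are illustrative G.711-like defaults.
    """
    if plr_percent < 0:
        raise ValueError("packet loss rate cannot be negative")
    return ie + (95.0 - ie) * plr_percent / (plr_percent + bpl)


def mos_for_loss(plr_percent, r_no_loss=93.2):
    """Estimated MOS at a given packet loss rate (baseline R is an assumption)."""
    return r_to_mos(r_no_loss - loss_impairment(plr_percent))


if __name__ == "__main__":
    for plr in (0, 1, 2, 5, 10, 20):
        print(f"PLR = {plr:>2}%: MOS = {mos_for_loss(plr):.2f}")
```

Under these assumptions the estimated MOS falls steadily as the loss rate rises, with the steepest drop in the first few percent of loss.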
Also, as one can see from [Figure 4], when the number of correctly adjusted instances of a call is high, the QoI value is higher. In other words, the more instances are adjusted successfully, the higher the VPAF. Further, one can notice the increase in QoI value with VPAF under room noise of 35 dB and 70 dB, respectively [Figure 5].
For experimentation, we included loud conversation, with the room noise at the receiver equal to 70 dB (A). QoI is then calculated with and without VPAF based on equation 5.
Unlike in  , where VPAF is equal to 0.9, in this work we consider VPAF equal to one and VPAF equal to zero with a larger number of calls (20) for determining QoI. [Figure 6] clearly shows the utility of VPAF in boosting QoI (quality of information).
5. Conclusion and Future Work
We propose a framework to ensure that voice quality is boosted to the maximum performance level. It is based on audio perceivance accuracy and the adaptive audibility factor. We have analyzed the framework with a larger number of calls (with VPAF equal to one and VPAF equal to zero) conducted under loud conversation at 70 dB. As future work, we intend to model this framework using the experiential sampling technique to choose the QoS method that is appropriate at a particular time and context based on the current environment.
We proposed a model for measuring and improving voice quality under ambience environment for VoIP calls, and determined the QoS factor for this set-up. According to our model, when the dB level increases, the audio level at the receiver side is tuned adaptively at that point of time by the decision manager, thereby providing a better QoI metric under ambience environment.
6. Acknowledgment
The authors would like to express their thanks to the management of Satyam Computer Services Limited for their support through the ODH initiative at Changi Business Park, Singapore. We would also like to thank Prof. Mohan S. Kankanhalli for useful suggestions.
Authors
Karthikeyan Vaiapury received his B. Tech. in 2003 from the School of Engineering and Technology, Bharathidasan University (currently Anna University, Trichy). He received his M.S. (Computer Science) in 2007 from the National University of Singapore. Presently he is working as a Software Engineer (R&D) at Mobile Design House, Satyam Computer Services Limited, Singapore. He worked as a Lecturer at the School of Engineering and Technology, Bharathidasan University from Oct 2003 to Dec 2004. His current research interests include multimedia information retrieval, visual attention, mobile multimedia and wireless networks.
Malmurugan Nagarajan is presently an editor of the Journal of Multimedia and the Journal of Simulation and Modeling. He has around 21 years of teaching and industrial experience. He has worked extensively in audio and video codec development for mobile platforms and is well versed in audio and video standards. He is strong in developing tools and system applications, and in modeling and simulation. His doctoral thesis is on multimedia compression techniques using wavelet variants. Currently he is developing new algorithms and IPs for the components in the wireless broadband physical layer at Satyam Computer Services Limited, Singapore. He is also an educationist and has held various positions, ranging from lecturer to principal, in various educational organizations. His areas of interest include wavelet-variant-based signal and image processing, multimedia compression and watermarking, and signal processing algorithm development in the telecom domain.
Sunil Kumar Jain has 12 years of experience in SIP/H.323 based applications and VoIP. He has worked extensively on routing protocols such as OSPF for IPv4 and IPv6, and has vast knowledge and experience of NAT and firewall devices and their issues with respect to SIP/VoIP at the boundaries of the network. His areas of interest include the overall networking domain, covering wireline as well as wireless networking, with a major focus on SIP/VoIP applications based on various standards such as IETF, PacketCable, True2Way, and SAF for high availability.
References
1. Effective Bandwidth Management: QoS for VoIP, http://www.voiplobby.com/voip-articles/qos-voip.htm
2. V Karthikeyan, N Malmurugan, and S Jain. A Novel Real time Voice Quality Testing Model for VOIP ambience environment in Wireless LAN, NGIntw workshop, COMSNETS 2009.
3. N Chandler. Beyond QoS: Voice Quality Management for VoIP networks. http://www.commsdesign.com/showArticle.jhtml?articleID=201001377, July 16, 2007.
4. J Kreiman, D Sidtis, and B Gerratt. Defining and measuring Voice Quality, From Sound to Sense Conference, MIT, 2004.
5. C Hoene, H Karl, and A Wolisz. A perceptual Quality model for Adaptive VoIP Applications, In the Proceedings of International Symposium of Computer and Telecommunication systems, 2004.
6. A Lima, LSG Carvalho, J Souza, and E Mota. A framework for network quality monitoring in the VoIP environment, International Journal of Network Management, Vol 17, Issue 4, 2007.
7. http://www.itu.int/ITU-T/studygroups/com12/emodelv1/introduction.htm
8. http://www.cisco.com/../technologies_white_paper09186a00800a3e2f.html
9. VQmon/EP, http://telchemy.com/vqmonep.html
10. PSQM, http://www.opticom.de/technology/psqm.html
11. http://www.itu.int/ITU-T/studygroups/com12/emodelv1/audihelp_r.htm#COMPUTE
12. http://voip.about.com/od/voipbasics/a/MOS.htm
13. Room noise detector, http://www.redcircuits.com/Page16.htm
14. M S Kankanhalli, J Wang, and R Jain. Experiential Sampling in Multimedia Systems, IEEE Transactions on Multimedia, 2006.
[Figure 1], [Figure 2], [Figure 3], [Figure 4], [Figure 5], [Figure 6]
[Table 1], [Table 2]