
An application of GIS modelling in assessing potential habitat areas for wild boar, Sus scrofa (Linnaeus 1758)

by Andrei Verdeanu

Abstract
This paper attempts to demonstrate a simple application of GIS modelling in the field of biology: establishing potential habitat areas for a given species. I selected the wild boar because of data availability and because it can be considered a dominant species across the area where I applied the modelling, which is my bachelor thesis area. The input data consisted of geospatial layers representing the factors responsible for the species distribution (digital elevation model, Corine Land Cover, hydrography, road and railway network). These layers were considered to describe the ecological requirements of the species. I constructed 5 models/scenarios, each comprising a unique combination of the respective factors with different influence percentages. Using the suitability indexes obtained from the models, I devised a grading scale for the suitability of the areas. Once the models were established, their efficiency and accuracy had to be validated. To do so, I used two sets of point data: personal observations and randomly generated points. For each model, I counted the points falling in each suitability class and expressed the counts as percentages. To evaluate the models, the percentages were compared and the best model was selected according to certain criteria.

Keywords: GIS modelling, potential habitat, suitability, land cover, wild boar.

Acknowledgements
The approach presented in this paper is inspired by the work of A. Belda, B. Zaragozí, J. E. Martínez-Pérez, V. Peiró, A. Ramón, E. Seva & J. Arques (2011): Use of GIS to predict potential distribution areas for wild boar (Sus scrofa Linnaeus 1758) in Mediterranean regions (SE Spain), Italian Journal of Zoology, DOI:10.1080/11250003.2011.631944. The article was consulted mainly as a reference for the ecological requirements of the species, since it contains specific data on that topic. It was used as an example and the copyright of the authors was fully respected. The present paper does not attempt to replicate or copy any of the methods or results of the article; any similarities which may have arisen are purely coincidental or dictated by the standard GIS methodology applied in the field of biology.
I would also like to express my gratitude to the following, for their help and input on the project: my project supervisor, Peder Klith Bøcher, Senior Scientist, PhD, GIS Coordinator, Ecoinformatics and Biodiversity Group, Dept. of Biological Sciences, Aarhus University; Jens-Christian Svenning, Professor, PhD, Dept. of Biological Sciences, Aarhus University; Mihai Niculita, Teaching assistant, PhD, Faculty of Geography and Geology, Dept. of Geography, University Al. I. Cuza, Iasi, Romania.

Introduction
Choosing such a topic for a biological project was closely related to the fact that my bachelor thesis makes intensive use of GIS techniques and methods. This project was a good opportunity to use the data I had already been working on as a basis for applying certain methodological procedures and deriving even more useful information from the existing digital layers. The fact that all the available data was at a very detailed level made it even better suited for such an application. I chose the wild boar for this project for several reasons:

across my study area, it can be considered a quite dominant and widespread species;

during the last 15 years or so, I hiked actively across the whole extent of my study area and had numerous encounters with the species, many more than with other ungulates inhabiting the area. I kept a good record of the areas of encounter and of the areas where I was able to identify occurrence by specific signs (hoof tracks, feces, trampling and rooting of the soil litter);

at the time, it was the only species for which I had found specific data regarding the ecological requirements (see Acknowledgements);

since it is a game species, the project may also have an outcome regarding possible management and conservation strategies for the wild boar.

The study area
The area (figure 01) is located in the Eastern Carpathian Mountains, Romania. The extent (373 km²) delineates the valley of the Bistrita river in the section between the locality of Poiana Teiului (north-west) and the city of Piatra Neamț (south-east), which is the largest city contained in the study area, along with the city of Bicaz (south). The area lies at the interface between the low-altitude hills and valley depressions in the east and extreme south-east and the medium-high mountains rising on the west side. There are a few notable reservoirs on the river, which were primarily constructed for hydropower. The biggest of them all, Izvorul Muntelui lake, constructed in the 1960s, has brought with it new land characteristics and, because of its impressive size (length 34 km, area 33 km², mean depth 36 m, max. depth 97 m, volume 1,250 mil. m³), a very specific topoclimate in the surroundings.

Fig. 01 Geographical location of the study area

The valley perimeter was extracted by automatically generating watersheds in the area and manually filtering them by certain size criteria. After the main drainage area was established, a buffer area of 1000m was generated around it, this way obtaining the river valley in the respective sector. The altitude in the area ranges from 291m in the south-east up to 1273m in the extreme north-west. Since it encompasses a good variety of landscape types and relief, the area is well suited as a background for applying species habitat related methodology. Also, the land cover is diverse and well distributed in the area, both altitudinally and longitudinally (figures 02 and 03; land cover percentages of the area as calculated from Corine Land Cover 2006).

Fig. 02 Land cover percentages of the study area

Fig. 03 Land cover of the study area

Materials and methods


The main goals of this paper are to identify, weigh and combine the factors (variables) which dictate the habitat range and distribution of the wild boar in the study area. The end result of this technique is a map emphasizing the potential habitat areas and their suitability index. A basic workflow for such an approach can be seen in figure 04, an HSI (Habitat Suitability Index) model from the United States Environmental Protection Agency.
Fig. 04 Basic Habitat Suitability Index model workflow (HSI)
Source: US Environmental Protection Agency - http://www.epa.gov/

The model I developed in this paper largely follows this type of structure. These are the steps I followed in my approach:

a) The purpose of the model: to determine potential habitat areas and assess their suitability;
b) Informational input: literature and internet resources;
c) Determining and choosing the variables: elevation, proximity to water resources, proximity to road and railway network, land cover, topographic wetness index;
d) Introducing the variables into a GIS environment: ArcGIS 10, TNT Mips 6.9;
e) Weighing of the variables: reclassification method;
f) Combining the variables: weighted sum and weighted overlay;
g) Generating multiple scenarios: 5 models/scenarios;
h) Validating and evaluating the models: randomly generated points and personal occurrence data;
i) Choosing the best model: the one which best emphasizes the potential habitat areas according to certain criteria.

Further on I will discuss each of the above steps in detail.

Determining the purpose of the model
Since I worked at such a detailed scale and all the occurrence data available on the web is at a much coarser resolution, I did not follow the classic approach, where the model is constructed starting from occurrence data. Instead, I preferred a model which predicts habitat areas by referring only to the environmental characteristics that dictate the species habitat areas. I did, however, use the few occurrence data I had for the validation of the models.

Informational input
For determining the species requirements and the environmental characteristics I made use of various literature and web resources. For the ecological requirements in particular, I used as a reference and starting point the data series I found in the article A. Belda, B. Zaragozí, J. E. Martínez-Pérez, V. Peiró, A. Ramón, E. Seva & J. Arques (2011): Use of GIS to predict potential distribution areas for wild boar (Sus scrofa Linnaeus 1758) in Mediterranean regions (SE Spain). Since that study refers to a Mediterranean area, I adapted the values found in the article according to the literature and with regard to my study area, which is temperate.

Determining and choosing the variables
Considering the area of choice and the characteristic requirements of the species (literature consulted), I settled on five factors/variables:
1) Elevation: the altitude range in the area is 291-1273m and, combined with the slope and the other aspects of the terrain, it has a significant impact on the species.
2) Proximity to water resources: the hydrographic network is well developed across the study area, and the fragmentation it induces in the terrain could have significant importance for the species distribution. Since it is the single water resource available for the species, its presence in the model is mandatory, as it dictates many of the species' behavioral characteristics.
3) Proximity to road and railway network: as with the hydrographic network, the transportation network is an important factor in the species distribution, mainly because it acts as a physical barrier and divergence mechanism (since it determines the overall movement pattern of the species). However, in the present paper I did not deal with the movement barrier approach; I used this variable taking into account its anthropic nature and its repellent effect on the species.
4) Land cover: probably the most important factor of all. The species distribution is directly related to the nature of the topographical surface, and most importantly to the type of land cover. It affects most aspects of the species' life, since it is the basal layer over which all the processes within the species' life regime take place. The nature of the land cover dictates movement, feeding, resting, mating, etc. Corine Land Cover is more than a land use type of layer: it also contains the anthropically transformed land types, which have a great impact on the species habitat.
5) Topographic wetness index: although it may have the same output as the hydrography factor, it takes a sensu lato approach by its nature, linking the slope, soil characteristics, drainage capacity and so on. It could offer a more detailed picture than the hydrographic network alone. Not only does it predict the areas more prone to higher levels of water abundance, but it does so by linking them with the terrain, which is a good aspect considering that the terrain itself is sometimes a limiting or advantageous factor for the species. By combining the terrain with the water availability the model becomes more realistic, since the water resources and the terrain counterbalance each other and the final availability output for the species differs from that of the hydrography alone.

Introducing the variables into a GIS environment
The GIS software used in this paper was mostly ArcMap 10 from ESRI and, in a few isolated instances, TNT Mips 6.9 from MicroImages (mainly for the manual vector extraction). Next I will briefly present the digital layers I used to represent each of the variables in the model:
1) Elevation: I used a digital elevation model which I had previously constructed manually by extracting contours from a topographical map of the area (1:25000). The resolution of the raster cell was 4m. (Note that in the final map layouts I overlaid both an SRTM DEM with a 90m resolution and the detailed DEM, for display purposes only.)
2) Proximity to water resources: for this layer I used the hydrography of the area, which I had also previously extracted manually as a vector file from the topographical map. The drainage network was not extracted.
3) Proximity to road and railway network: as with the hydrography, I used the vector files manually extracted from the topographical map.

4) Land cover: for this layer I used the Corine Land Cover seamless vector data, version 15 (08/2011), downloaded from the European Environment Agency site.
5) Topographic wetness index: from the DEM I constructed manually, I derived, with the help of my supervisor, a TWI raster layer. The TWI was calculated with regard to the slope, as the logarithm of the slope/aspect ratio.

Weighing of the variables
Before going any further with the explanations, I must state that from this point on all the GIS procedures applied in this paper will be exemplified on a detailed portion of the study area, at the lower end of the extent (see figure 01). This area was chosen since it contains almost all the characteristics found across the whole extent of the study area, compressed into a small patch: a fair altitudinal range, good development of the hydrographic network (including a reservoir) as well as of the transportation network, and a great variety of land cover. Another reason for choosing this detailed patch is its relative proximity to the biggest city in the study area, which is located just east of the reservoir. This could have an interesting outcome in the final model.

As stated before, in order to create a map that emphasizes the potential habitat areas and their respective suitability index, all the layers representing the variables need to be combined, with different influence percentages. But before this final step, each of the variables needs to be reclassified according to the species' ecological requirements. Figure 05 shows each variable layer before and after the reclassification, together with the old and new ranges of values. The layers are, from top to bottom: a) Digital elevation model, b) Hydrography, c) Transportation network, d) Corine Land Cover and e) Topographic wetness index. All the new suitability values assigned to the variables are integer values from 0 to 100 (percentage range). This way I avoided the need for further data standardization. Each of the variables was reclassified using the Reclassify tool from ArcMap 10, based on the Value field. Further on I will explain the reclassification process for each of the variables:

1) Digital elevation model: since the altitudinal range in the area is 291-1273m, I established a median habitat niche (band). The altitude at which the urban structure begins to disperse is ~400m, and the altitude at which the forest density decreases and the vegetation is replaced by shrubs and pastures is ~900m. This gives an optimum altitudinal range of 500m (between 400-900m), which took the maximal suitability value of 100, while the less probable range, either below 400m or above 900m, took the value of 20.¹
2) Hydrography: for this layer I used the vector file and constructed buffer areas around the hydrographic network in two ranges, 0-50m and 50-200m. The suitability values were: 0-50m > 90, 50-200m > 60, over 200m > 30.²
3) Transportation network: as with the hydrography layer, I constructed buffer areas around the road and railway network in two ranges, 0-50m and 50-200m. The suitability values were inverted in this case (since there is a negative correlation between proximity to the transportation network and the species abundance): 0-50m > 30, 50-200m > 60, over 200m > 90.
4) Corine Land Cover: this is one of the most important layers, if not the most important one, because of the inherent impact of the land characteristics on the species distribution. The land cover types (with their CLC code equivalents and surface areas) and the new suitability values assigned are shown in Table 1.² All the values assigned were adapted to my temperate area according to the literature.
5) Topographic wetness index: the resulting TWI raster layer derived from the DEM had the range of 1-31. This was reclassified as follows: 1-7 > 10, 7-14 > 40, 14-21 > 70, 21-32 > 100.² Although presented here as an independent variable, the TWI was used in only one scenario out of 5, for reasons I will detail later on.

Since all the variables were reclassified using the same value scale, there is no need for further data standardization, all values being integers.

Fig. 05 The geospatial layers - a) Digital elevation model, b) Hydrography, c) Transportation network, d) Corine Land Cover, e) Topographic wetness index, before and after reclassification (left to right), exemplified on the detailed study area patch

Table 1 Reclassification of the Corine Land Cover geospatial layer

¹ ² A. Belda, B. Zaragozí, J. E. Martínez-Pérez, V. Peiró, A. Ramón, E. Seva & J. Arques (2011): Use of GIS to predict potential distribution areas for wild boar (Sus scrofa Linnaeus 1758) in Mediterranean regions (SE Spain).
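The reclassification rules for elevation, the two distance buffers and the TWI can be written down as small lookup functions. The following is only a minimal sketch in plain Python, not the ArcMap Reclassify tool used in the paper. The function names are hypothetical, and treating each upper class bound as inclusive is an assumption, since the text does not say to which class a boundary value belongs. The Corine Land Cover scores are omitted because Table 1 is not reproduced here.

```python
from bisect import bisect_left


def reclassify(value, upper_bounds, scores):
    """Return the score of the first class whose upper bound is >= value.

    `scores` needs one more entry than `upper_bounds`; the last score
    applies to every value above the final bound. Inclusive upper bounds
    are an assumption, not something stated in the paper.
    """
    if len(scores) != len(upper_bounds) + 1:
        raise ValueError("need exactly one more score than upper bounds")
    return scores[bisect_left(upper_bounds, value)]


def elevation_score(altitude_m):
    # 400-900 m is the optimum band (100); everything outside it scores 20.
    return 100 if 400 <= altitude_m <= 900 else 20


def water_score(distance_m):
    # Closer to water is better: 0-50 m, 50-200 m, over 200 m.
    return reclassify(distance_m, [50, 200], [90, 60, 30])


def transport_score(distance_m):
    # Same buffers as for water, with inverted suitability values.
    return reclassify(distance_m, [50, 200], [30, 60, 90])


def twi_score(twi):
    # TWI classes 1-7, 7-14, 14-21 and above 21.
    return reclassify(twi, [7, 14, 21], [10, 40, 70, 100])


if __name__ == "__main__":
    print(elevation_score(650), water_score(120), transport_score(30), twi_score(25))
```

Keeping the class bounds and scores as plain lists makes it easy to try other thresholds, for instance when adapting the values to a different climate zone.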

Combining the variables
After all the layers were reclassified, the next step was to combine them in different ways to obtain the final potential habitat map. All the combinations were done in ArcMap 10 and, as a check, each combination was computed using three different methods in order to verify that the results agree. First, the variables were combined in the Raster Calculator by means of a simple weighted sum. Then the combination was repeated using the dedicated Weighted Sum tool and, finally, once more using the Weighted Overlay tool. On comparison, all three methods gave exactly the same results. The Weighted Overlay tool was chosen for all of the scenarios that were to be generated.

Generating multiple scenarios
I decided to construct five different scenarios, in which I tried to use unique combinations of the variables (chosen randomly), this way ensuring that more real-life situations were covered. Also, by modifying the weight percentages of the variables, their respective counterbalancing effect on the other variables was changed, thus revealing possible singular effects which could be quite relevant for the species distribution. The first four models do not incorporate the TWI. I included the topographic wetness index only in the fifth model, since it induced great scatter in the final output of the models: although displayed as classified and not stretched, its very dispersed nature caused the final output of the model to be somewhat diffuse in some areas. Therefore I used it only as a fail-safe test in the last model, in order to have it accounted for in at least one model. In the previous tests I made, the main effect of adding the TWI to the models, besides the diffuse display, was to emphasize the river valleys as positive areas, which is quite redundant, since it overlaps the hydrography buffer areas, which show the same thing. The variable combinations and their respective influence percentages for each model are:

Model 1: Corine Land Cover 60%, Hydrography 15%, Transportation network 15%, Elevation 10%
Model 2: Corine Land Cover 50%, Hydrography 10%, Transportation network 10%, Elevation 30%
Model 3: Corine Land Cover 80%, Hydrography 10%, Transportation network 5%, Elevation 5%
Model 4: Corine Land Cover 30%, Hydrography 50%, Transportation network 10%, Elevation 10%
Model 5: Corine Land Cover 40%, Hydrography 15%, Transportation network 10%, Elevation 10%, Topographic wetness index 25%
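The five scenarios amount to a per-cell weighted sum of the reclassified layers. The sketch below illustrates that arithmetic in plain Python on tiny made-up rasters; it is not the ArcMap Weighted Overlay tool the paper used, and the layer keys, the `combine` helper and the sample cell values are hypothetical. Only the influence percentages are taken from the text.

```python
# Influence percentages per model, as listed in the text.
MODELS = {
    "Model 1": {"land_cover": 60, "hydrography": 15, "transport": 15, "elevation": 10},
    "Model 2": {"land_cover": 50, "hydrography": 10, "transport": 10, "elevation": 30},
    "Model 3": {"land_cover": 80, "hydrography": 10, "transport": 5, "elevation": 5},
    "Model 4": {"land_cover": 30, "hydrography": 50, "transport": 10, "elevation": 10},
    "Model 5": {"land_cover": 40, "hydrography": 15, "transport": 10, "elevation": 10, "twi": 25},
}


def combine(layers, weights):
    """Weighted sum of reclassified rasters stored as equally sized 2D lists."""
    if sum(weights.values()) != 100:
        raise ValueError("influence percentages must add up to 100")
    names = list(weights)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [
        [sum(layers[name][r][c] * weights[name] for name in names) / 100
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


if __name__ == "__main__":
    # Hypothetical 1 x 2 rasters of already reclassified values (0-100).
    layers = {
        "land_cover": [[100, 20]],
        "hydrography": [[90, 30]],
        "transport": [[30, 90]],
        "elevation": [[100, 20]],
        "twi": [[70, 10]],
    }
    for model_name, weights in MODELS.items():
        print(model_name, combine(layers, weights))
```

Because every input layer is on the same 0-100 scale and the percentages add up to 100, the combined index also stays within 0-100, which is why each model ends up with its own numerical range inside that interval.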

Results
Each of the five resulting models had a different numerical range of the suitability index. I classified each model into 6 suitability index classes, keeping, however, the different numerical values for each of the ranges. The different ranges appeared because of the different variable combinations (e.g. the first model ranged 11-97, model 3 ranged 0-99, etc.). Each range is an expression of its model, and standardizing all the classes would not do justice to the realism of the models. It was better to keep the numerical accuracy, at least for comparing the efficiency of the models.

Validating and evaluating the models
In order to choose the best model, each of the models needed to be validated and evaluated. To do so, I made use of two sets of point data, one made up of randomly generated points and one of personal occurrence observations. "To further explore the results, we calculated a series of metrics that define the distances between sites, and the area occupied, in both environmental and geographic space. [...] We used the 10 000 random points and the presences in the evaluation data set and calculated the median of the minimum distances between any one random point and all the presence points."³ Since my model is not based on presence-absence data but, on the contrary, tries to derive the habitat areas starting from the environmental factors, I could not use such an approach to validate the model. However, I adapted the random points and occurrence observations approach to a simpler design, as suggested here: "Most habitat-association studies use a very restricted set of error measures, of which percentage overall accuracy is the most common. (e.g., Brennan et al. 1986; Capen et al. 1986; Verbyla & Litvaitis 1989; Donázar et al. 1993)"⁴

For each of the suitability classes, I devised an equivalent grading scale to use in assessing the efficiency of each of the models:
1st class: Very low probability
2nd class: Low probability
3rd class: Medium probability
4th class: Good probability
5th class: High probability
6th class: Very high probability

³ Elith, Jane, Graham, Catherine H., Anderson, [...], (2006) Novel methods improve prediction of species' distributions from occurrence data. Ecography, 29 (2). pp. 129-151. ISSN 1600-0587

The two point datasets were obtained as follows: 300 random points were generated automatically using the Create Random Points tool in ArcMap 10, across the whole extent of the study area, with a minimum distance of 400m between points. The personal observations (200 points) were entered manually, using the topographical map as a reference.

⁴ Alan H. Fielding, John F. Bell, (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models, Environmental Conservation 24 (1): 38-49
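Generating random points under a minimum-distance constraint can be illustrated with simple rejection sampling. This is a sketch under stated assumptions, not the algorithm of the Create Random Points tool: the function name is hypothetical, and a 20 km x 18 km rectangle stands in for the real study area polygon, which is not available here. Only the point count (300) and the spacing (400m) come from the text.

```python
import math
import random


def random_points_min_distance(n, min_dist, extent, seed=None, max_attempts=200_000):
    """Draw n points inside extent = (xmin, ymin, xmax, ymax), at least min_dist apart.

    A candidate is rejected if it falls closer than min_dist to any point
    that was already accepted.
    """
    xmin, ymin, xmax, ymax = extent
    rng = random.Random(seed)
    points = []
    attempts = 0
    while len(points) < n:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not place all points; relax n or min_dist")
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if all(math.hypot(x - px, y - py) >= min_dist for px, py in points):
            points.append((x, y))
    return points


if __name__ == "__main__":
    # Hypothetical rectangular extent in metres (not the real valley outline).
    pts = random_points_min_distance(300, 400.0, (0.0, 0.0, 20_000.0, 18_000.0), seed=42)
    print(len(pts), "points generated")
```

For a real polygon, each candidate would additionally have to pass a point-in-polygon test before being accepted.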

For the validation of the models, I counted, for each one, the number of points overlapping each suitability class, using the Extract Multi Values to Points tool in ArcMap 10, then derived percentages for each class and compared the models. Knowing the percentage of the total number of points overlapping each class, I could evaluate the efficiency and accuracy of the models. Figure 06 shows the percentages for each model, for both the personal observations and the random points.

Choosing the best model
For evaluating the models I divided the grading scale into two ends: the positive end, which includes the Very high probability, High probability and Good probability classes, and the negative end, which includes the remaining three lower classes (Medium probability, Low probability and Very low probability). The model with the best representation of the positive end was designated the best. Since we are interested in a positive correlation between the points and the suitability areas, only the top three classes give us an assessment of that correlation. Figure 07 plots the models for both data sets and both ends.
Fig. 06 Percentages of the points data set overlapping each suitability class
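The evaluation described above reduces to counting points per suitability class and summing the three upper classes. A minimal sketch of that bookkeeping follows, assuming each point has already been assigned the class number (1-6) of the cell it falls in. The function names and the sample list of classes are hypothetical and are not the paper's data.

```python
from collections import Counter

# Grading scale from the text, keyed by suitability class number.
CLASS_NAMES = {
    1: "Very low probability",
    2: "Low probability",
    3: "Medium probability",
    4: "Good probability",
    5: "High probability",
    6: "Very high probability",
}
POSITIVE_END = (4, 5, 6)  # Good, High and Very high probability


def class_percentages(point_classes):
    """Percentage of points falling in each of the six suitability classes."""
    total = len(point_classes)
    if total == 0:
        raise ValueError("no points to evaluate")
    counts = Counter(point_classes)
    return {cls: 100.0 * counts.get(cls, 0) / total for cls in CLASS_NAMES}


def positive_end_share(point_classes):
    """Summed percentage of points in the three upper classes."""
    percentages = class_percentages(point_classes)
    return sum(percentages[cls] for cls in POSITIVE_END)


if __name__ == "__main__":
    # Made-up class values for ten points, for illustration only.
    sample = [6, 6, 5, 4, 3, 1, 2, 5, 6, 4]
    for cls, pct in class_percentages(sample).items():
        print(f"{CLASS_NAMES[cls]}: {pct:.0f}%")
    print(f"Positive end: {positive_end_share(sample):.0f}%")
```

Running the same two functions once per model and per point dataset gives the numbers needed to compare the positive and negative ends of the models.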

Fig. 07 Expression of the models on the two suitability ends, both personal observations and random points
(Four chart panels comparing Models 1-5. Personal observations - positive end and Random points - positive end: 0-100% axis; series Very high probability, High probability, Good probability and Total. Personal observations - negative end and Random points - negative end: 0-50% axis; series Medium probability, Low probability, Very low probability and Total.)

To identify a good model, we need to see a high overall expression of the positive end and the lowest possible overall expression of the negative end. Analyzing the graphs above, we can observe the following:

For the personal observations dataset, Model 1 is the best model, because it has the highest expression of the positive end and the lowest expression of the negative end.

For the random points dataset, Model 2 is the best model, since it has the highest expression of the positive end and the lowest expression of the negative end.

The final maps for the two models, as well as a detailed view of the study area patch, are shown in figure 08. Although their suitability index ranges are slightly different, their graphic expression is somewhat similar. All the maps are projected in Stereo70/Dealul Piscului 1970, with a 10km grid for the large maps and a 1km grid for the detailed patch.

Discussion
As expected, the resulting models follow the characteristics of the land in the study area quite well. Nevertheless, this is just a potential habitat map, and in certain locations the criteria used in determining the suitability of those areas remain hypothetical. Take for instance the inland marshes (wetlands) represented as having near-optimal habitat suitability (center of the detailed patch, figure 08). Those areas were given such a high suitability index based on the nature of the land cover, and indeed there have been numerous wild boar occurrences in those areas. However, the area is enclosed by progressively denser urban surface and water bodies. The circulation in and out of that area is impaired, but the species is still present there. This means that in order to develop the model further, we need to take into account movement patterns, anthropic barriers or transit corridors, and perform cost-distance analyses.

Fig. 08 Model 1 and 2 maps, large and detailed patches

Since the present paper is meant to be, at least at this stage, a general theoretical example of applying such modelling techniques, there is room for improvements and refinements. For instance:

the influence percentages for each of the models were chosen randomly, with the purpose of covering as many aspects as possible. In a real-life application, these percentages need to be scientifically supported by clearly defined reasons. Otherwise, the random approach would require generating many more models, covering almost all possible combinations, which would take a considerable amount of resources and time;

in the TWI reclassification, there was a need to add a mask in the process, since the water surface appears as having maximum suitability, which it inherits from the neighboring areas;

for the evaluation of the models I used a very simple method to assess their accuracy and efficiency, whereas a ROC curve or a confusion matrix is more often used in such cases; for now, my limited expertise did not allow me to apply such techniques.

The models can be enhanced and perfected in a future, more developed attempt. Overall, the goal set at the beginning of the paper was achieved, producing the desired maps using the materials imposed along the way.

All the maps for each of the models are available in high resolution as supplementary information to the paper.

References
A. Belda, B. Zaragozí, J. E. Martínez-Pérez, V. Peiró, A. Ramón, E. Seva & J. Arques (2011): Use of GIS to predict potential distribution areas for wild boar (Sus scrofa Linnaeus 1758) in Mediterranean regions (SE Spain), Italian Journal of Zoology, DOI:10.1080/11250003.2011.631944

Bolstad Paul, (2007), GIS Fundamentals: A First Text on Geographic Information Systems, Third Ed.

Chengzhi Qin, A-xing Zhu, Lin Yang, Baolin Li, Tao Pei, (2010), Topographic Wetness Index Computed Using Multiple Flow Direction Algorithm and Local Maximum Downslope Gradient.

Elith, J., Graham, C. H., Anderson, R. P., Dudík, M., Ferrier, S., Guisan, A., Hijmans, R. J., Huettmann, F., Leathwick, J. R., Lehmann, A., Li, J., Lohmann, L. G., Loiselle, B. A., Manion, G., Moritz, C., Nakamura, M., Nakazawa, Y., Overton, J. McC., Peterson, A. T., Phillips, S. J., Richardson, K. S., Scachetti-Pereira, R., Schapire, R. E., Soberon, J., Williams, S., Wisz, M. S. and Zimmermann, N. E. (2006) Novel methods improve prediction of species' distributions from occurrence data, Ecography 29: 129-151

Fielding Alan H., Bell John F., (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models, Environmental Conservation 24 (1): 38-49

R. Sørensen, U. Zinko, J. Seibert, (2005), On the calculation of the topographic wetness index: evaluation of different methods based on field observations.

Internet resources
Animal Diversity Web - http://animaldiversity.ummz.umich.edu/site/accounts/information/Sus_scrofa.html
CORINE Land Cover (2006) - http://www.eea.europa.eu/data-and-maps/data#c12=corine+land+cover+version+13
Encyclopedia of Life - http://eol.org/pages/328663/details
The IUCN Red List of Threatened Species - http://www.iucnredlist.org/technicaldocuments/classification-schemes/habitats-classificationscheme-ver3
ZipCodeZoo - http://zipcodezoo.com/Animals/S/Sus_scrofa/
US Environmental Protection Agency - http://www.epa.gov/
