Mapping Electricity Infrastructure with Deep Learning

Introduction

One of the main goals of any state is to provide affordable, reliable, and sustainable electricity. The main obstacle is that utility companies do not always have a comprehensive map of their overhead power lines, and the maps that do exist are usually outdated and incomplete. Without a centralized map, governments and other organizations lack the knowledge needed to make informed decisions about investing in maintenance or expansion of the grid. This lack of information also complicates decisions about installing alternative energy sources: without knowing where the conventional grid runs, it is hard to deploy alternatives such as solar or wind power sensibly. Beyond the map itself, governments and organizations need a fast and cost-effective tool. The high-voltage grid is constantly expanding, so it is important to be able to produce accurate snapshots at regular intervals. The cGIS platform includes an algorithm that can efficiently map overhead power line infrastructure.

This feature provides an automatic, machine-learning-based solution to the problem of power line detection. The algorithm takes remote sensing data as input. A neural network processes the visible-spectrum images and returns the geospatial locations that are highly likely to contain elements of high-voltage infrastructure, namely power line poles. This output still requires verification by an operator. Even in raw form, however, it consists of geographic positions that correspond with high probability to the locations of real objects on the ground.

Power lines in the satellite images.


Classification or detection?

The first step in building such a solution was to define the machine learning task itself. We were initially inspired by the way DevSeed solved a similar problem. Their solution classifies vector tiles on a map at a scale of 1:2000, indicating whether a given tile contains power line poles. An operator then has to trace the power lines manually on the classified tiles, using a web interface with an interactive map.

An example of DevSeed's vector tile classification. Tiles that contain power line poles are selected for further manual tracing.

This method of recognizing overhead power lines proved insufficiently effective. We decided to look for a better solution: detecting the power line poles themselves directly on the map.
 
We chose detection over classification because it offers the advantages listed below while sharing the same disadvantages as the classification approach.

Advantages:

  1. A high zoom level (1:2000) is not required to detect power lines. Publicly available maps at scales larger than 1:4000 require a special paid plan, whereas for detection the tile size is not essential; it only matters that the power line pole itself is clearly distinguishable on the map.
  2. The manual tracing process is not required.
  3. The detection output immediately gives the location of the power line pole in geographic coordinates, allowing the power lines to be traced automatically.

Disadvantages:

  1. False positives: trees can be mistaken for power line poles.
  2. Misses: some objects on the map may be overlooked because of errors in the detection algorithm.
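
The third advantage above, reading geographic coordinates straight from the detection output, can be sketched as follows. This is a minimal illustration, not the platform's actual code: it assumes the imagery comes as standard XYZ Web Mercator tiles, and the names `pixel_to_lonlat`, `box_center`, and the `tile_size` parameter are hypothetical.

```python
import math

def box_center(x_min, y_min, x_max, y_max):
    """Center of a detected bounding box, in pixel coordinates."""
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0

def pixel_to_lonlat(tile_x, tile_y, zoom, px, py, tile_size=512):
    """Convert a pixel inside an XYZ Web Mercator tile to (lon, lat) in degrees.

    Assumes the standard "slippy map" tiling scheme; tile_size=512 mirrors
    the 512x512 images described in this article.
    """
    n = 2 ** zoom                    # number of tiles along each axis at this zoom
    x = tile_x + px / tile_size      # fractional tile column
    y = tile_y + py / tile_size      # fractional tile row (0 at the northern edge)
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat

# Usage: a box detected inside a hypothetical tile (x=9400, y=5300, zoom=14).
cx, cy = box_center(200, 220, 260, 280)
print(pixel_to_lonlat(9400, 5300, 14, cx, cy))
```

With one such point per detected pole, consecutive poles can then be connected to trace a line automatically.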


Raw data for machine learning

The input data for training are the geographic coordinates of power line poles, categorized by the following attributes: pole type, material, voltage, and others. The full dataset contains 211725 objects in total. Further work was carried out with the data grouped by pole type, the most representative attribute. The table below summarizes the data by this criterion.

| Type | Number of objects |
|---|---|
| Reinforced concrete transmission line poles up to 20 kV | 119383 |
| Wooden transmission line supports up to 20 kV | 83453 |
| Intermediate poles of 110 kV transmission lines | 4938 |
| 110 kV anchored poles | 1176 |
| Intermediate poles up to 330 kV | 808 |
| Metal poles up to 20 kV | 604 |
| Tiebolts up to 330 kV | 265 |
| Intermediate poles of power transmission lines 35 kV | 175 |
| 220 kV intermediate towers | 96 |
| Substation and overhead line gantries of 110-330 kV | 39 |
| 35 kV transmission line anchored poles | 31 |
| Mast poles, road poles | 12 |
| 110 kV OL and Substation gantries | 9 |
| 220 kV transmission line anchor poles | 8 |
| Spans | 2 |
| 35 kV OL and PL portals | 2 |
| Aerial bundled cables | 1 |
| TOTAL | 211725 |

Table 1. All types of energy infrastructure


Because not all of the listed object types are distinguishable in the images, only the following types were selected for further work with the machine learning algorithm:

| Type | Number of objects |
|---|---|
| Intermediate poles up to 330 kV | 808 |
| 220 kV intermediate towers | 96 |
| Intermediate poles of 110 kV transmission lines | 4938 |
| Tiebolts up to 330 kV | 265 |
| 220 kV transmission line anchor poles | 8 |
| 110 kV anchored poles | 1176 |
| TOTAL | 7291 |

Table 2. Selected types of objects for recognition


The training data consisted of 512x512-pixel raster images with the areas containing high-voltage power line towers marked on them. Each annotation was a rectangular region in the image, corresponding on average to a 70x70-meter area on the ground.
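
As a rough illustration of how a 70x70-meter ground area translates into a pixel rectangle, here is a minimal sketch. It assumes Web Mercator imagery, for which the ground resolution depends on latitude and zoom level; the function names, the zoom level, and the latitude used in the example are hypothetical and not taken from the project.

```python
import math

EARTH_CIRCUMFERENCE_M = 40075016.686  # equatorial circumference used by Web Mercator

def meters_per_pixel(lat_deg, zoom, tile_size=256):
    """Approximate ground resolution of Web Mercator imagery at a given latitude."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (tile_size * 2 ** zoom)

def pole_box(cx, cy, mpp, box_m=70.0, image_size=512):
    """Square annotation of box_m meters around pixel (cx, cy), clipped to the image."""
    half = box_m / mpp / 2.0  # half of the box side, in pixels
    x_min = max(0.0, cx - half)
    y_min = max(0.0, cy - half)
    x_max = min(float(image_size), cx + half)
    y_max = min(float(image_size), cy + half)
    return x_min, y_min, x_max, y_max

# Usage with hypothetical values: latitude 53 degrees, zoom level 17.
mpp = meters_per_pixel(53.0, 17)
print(mpp, pole_box(256, 256, mpp))
```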

The original raster images were split into two parts: 80% of the data for training and 20% for testing.
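
A split of this kind can be sketched in a few lines. The helper name `split_train_test` and the fixed random seed are assumptions made for this illustration; the article does not say how the images were actually shuffled or assigned.

```python
import random

def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Usage: image_ids stands in for the list of 512x512 raster images.
image_ids = [f"image_{i:04d}" for i in range(100)]
train, test = split_train_test(image_ids)
print(len(train), len(test))
```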


Choice of solution architecture

When the target objects occupy only a small part of the image, the regions without a target object contribute too much to training, which eventually leads to many missed objects on the test set. To solve this problem, we used the Focal Loss function, which reduces the influence of the frequent background and increases the importance of rare objects during training.

We therefore chose RetinaNet as the neural network architecture for detection, since it uses Focal Loss as its loss function.

During RetinaNet training, the loss function is computed over all candidate regions (anchors) under consideration, at all levels of image scaling. In total there are about 100 thousand such regions per image. The Focal Loss value is the sum of the per-anchor losses, normalized by the number of anchors that contain target objects. Normalization uses only those anchors rather than the total number, because the vast majority of anchors are easily classified background that contributes little to the total loss.
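
The normalization described above can be illustrated with a minimal pure-Python sketch of binary Focal Loss for a single class. The defaults alpha = 0.25 and gamma = 2 are the values reported in the original RetinaNet paper and are an assumption here, since the article does not state which values were used; the function name is hypothetical.

```python
import math

def focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Sum of per-anchor focal losses, normalized by the number of positive anchors.

    probs   -- predicted object probability for each anchor
    targets -- 1 for anchors containing an object, 0 for background
    """
    total = 0.0
    num_pos = 0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # keep log() finite
        p_t = p if t == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified (easy) anchors
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
        num_pos += 1 if t == 1 else 0
    return total / max(1, num_pos)

# Usage: 1000 easy background anchors and one hard positive anchor.
probs = [0.01] * 1000 + [0.1]
targets = [0] * 1000 + [1]
print(focal_loss(probs, targets))
```

In this toy example the single hard positive dominates the result, while the thousand easy background anchors together add very little.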

Structurally, RetinaNet consists of a backbone and two additional subnetworks: one for classification (Classification Subnet) and one for determining object boundaries (Box Regression Subnet).

RetinaNet neural network architecture

The backbone is a so-called Feature Pyramid Network (FPN) running on top of one of the commonly used convolutional neural networks (for example, ResNet-50). FPN has additional lateral outputs from the hidden layers of the convolutional network, which form pyramid levels at different scales. Each level is supplemented with "knowledge from above", i.e. information from higher levels, which are smaller in size but carry information about larger areas. In practice, the more "condensed" feature map is artificially enlarged (for example, by simply repeating elements) to the size of the current map, the two are summed element-wise, and the result is passed both to the lower pyramid levels and to the inputs of the other subnets (i.e. the Classification Subnet and the Box Regression Subnet). This makes it possible to extract from the source image a pyramid of features at different scales, on which both large and small objects can be detected. FPN is used in many architectures, including RPN, DeepMask, Fast R-CNN, Mask R-CNN, and others, where it improves the detection of objects at different scales.
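
The top-down step described above (enlarge the coarser map by repeating elements, then sum element-wise) can be sketched on tiny single-channel maps. Plain nested lists are used only for readability, and the function names are hypothetical. The 1x1 and 3x3 convolutions that the original FPN design applies around this step are deliberately left out.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling: every element is repeated in both directions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat along the width
        out.append(wide)
        out.append(list(wide))                     # repeat along the height
    return out

def merge_top_down(lateral, coarser):
    """Add the upsampled coarser map to the lateral map of the current level."""
    up = upsample2x(coarser)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lateral, up)]

# Usage: a 4x4 lateral map merged with a 2x2 map from the level above.
lateral = [[1, 1, 1, 1] for _ in range(4)]
coarser = [[10, 20], [30, 40]]
for row in merge_top_down(lateral, coarser):
    print(row)
```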

Our network, like the original one, uses an FPN with 5 levels, numbered P3 through P7. Level P_l has a resolution 2^l times lower than that of the input image. All pyramid levels have the same number of channels, C = 256, and the same number of anchors.

The anchor areas were chosen as follows: from [16 x 16] to [256 x 256] for pyramid levels P3 through P7 respectively, with strides of [8 - 128] pixels. This size makes it possible to analyze small objects together with some of their surroundings. In our case, these are power line poles with their adjacent shadows.
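
To make the pyramid geometry concrete, the sketch below lists the stride, feature-map side, and anchor side for each level of a 512x512 input. It assumes that the anchor side doubles from level to level (which reproduces the stated endpoints of 16 and 256) and that each location carries 9 anchors, as in the original RetinaNet paper; neither detail is confirmed by the article, and the function names are hypothetical.

```python
import math

def pyramid_levels(image_size=512, first_level=3, last_level=7, base_anchor=16):
    """Stride, feature-map side and anchor side for each pyramid level."""
    levels = []
    for i, level in enumerate(range(first_level, last_level + 1)):
        stride = 2 ** level  # level P_l is 2^l times smaller than the input
        levels.append({
            "name": f"P{level}",
            "stride": stride,
            "feature_side": math.ceil(image_size / stride),
            "anchor_side": base_anchor * 2 ** i,  # assumed to double per level
        })
    return levels

def total_anchors(levels, anchors_per_location=9):
    """Anchor count over all levels; 9 per location is an assumption."""
    return sum(lv["feature_side"] ** 2 for lv in levels) * anchors_per_location

levels = pyramid_levels()
for lv in levels:
    print(lv)
print(total_anchors(levels))
```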

Training and Results

Machine learning results were evaluated in the following ways:

  1. Testing on a held-out sample of 20% of the original data, comparing the predicted rectangular boxes with the ground-truth boxes marked manually by an expert. 70% of the rectangular areas were identified correctly, bearing in mind that the amount of data is small.
  2. Calculating the distance between the actual and predicted coordinates over the entire dataset. When the machine learning algorithm located poles automatically, up to 60% of the locations corresponded to actual objects of the electrical network (poles), lying on average no more than 70 meters from the predicted positions. There were also cases where the reference data contained no pole but the algorithm found one in the image. Such cases distort the quality metric, because the object is present in the image but absent from the reference database. This suggests that the algorithm can be used successfully to update databases with information about power lines.
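
The second evaluation method can be sketched as a distance check between reference and predicted coordinates. This is one possible reading of the procedure, not the exact metric used: the haversine distance on a spherical Earth, the brute-force comparison, and the function names are all assumptions made for illustration.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371000.0):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def matched_fraction(reference, predicted, max_dist_m=70.0):
    """Share of reference poles with at least one prediction within max_dist_m."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference
        if any(haversine_m(*r, *p) <= max_dist_m for p in predicted)
    )
    return hits / len(reference)

def unmatched_predictions(reference, predicted, max_dist_m=70.0):
    """Predictions with no reference pole nearby: candidates for a database update."""
    return [
        p for p in predicted
        if all(haversine_m(*p, *r) > max_dist_m for r in reference)
    ]

# Usage with made-up (lon, lat) coordinates.
reference = [(27.0, 53.0), (27.1, 53.0)]
predicted = [(27.0, 53.0005), (28.0, 54.0)]
print(matched_fraction(reference, predicted))
print(unmatched_predictions(reference, predicted))
```

Predictions returned by `unmatched_predictions` are either false positives or poles missing from the reference database, so they are the ones worth passing to an operator for review.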