Applying ML and AI to Image Processing
When you look at a satellite image, it’s not always easy to know if you are looking at trees or grass… or roads vs buildings. So imagine how hard it would be for a computer to know.
Support Vector Machine (SVM) is a supervised machine learning technique that takes labeled training data and looks at the extremes of each class. Next, it draws a decision boundary, called a “hyperplane”, that separates the classes with the widest possible margin. The data points that the margin pushes up against are the “support vectors”.
The “support vectors” are what matter because they are the data points closest to the opposing class. Because only these points define the boundary, all other training points can be dropped without changing the model. Essentially, you feed SVM training samples of trees and grass. Based on this training data, it builds a model and generates a decision boundary of its own.
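To make the idea concrete, here is a minimal sketch of a linear SVM trained on a handful of made-up "trees" and "grass" samples. Everything in it is an assumption for illustration: the two feature values per pixel, the penalty `lam`, the step size `lr`, and the use of plain batch subgradient descent on the hinge loss, which is a simple stand-in for the solvers a real GIS or ML package would use.

```python
# Hypothetical training samples: two made-up feature values per pixel
# (for example, two band reflectances). Label +1 = trees, -1 = grass.
samples = [
    ((0.80, 0.70), 1), ((0.90, 0.60), 1), ((0.70, 0.90), 1), ((0.60, 0.65), 1),
    ((0.20, 0.30), -1), ((0.10, 0.20), -1), ((0.30, 0.10), -1), ((0.40, 0.35), -1),
]


def decision(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return w[0] * x[0] + w[1] * x[1] + b


def train_linear_svm(data, lam=0.01, lr=0.05, epochs=20000):
    """Minimise an L2 penalty plus the mean hinge loss by subgradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for x, y in data:
            # Only samples on the wrong side of the margin contribute.
            if y * decision(w, b, x) < 1:
                grad_w[0] -= y * x[0] / n
                grad_w[1] -= y * x[1] / n
                grad_b -= y / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


w, b = train_linear_svm(samples)


def predict(x):
    """Return +1 (trees) or -1 (grass) for a new pixel."""
    return 1 if decision(w, b, x) >= 0 else -1


# Samples on or inside the margin (with a small numerical tolerance) act as
# the support vectors; the remaining samples do not shape the boundary.
support_vectors = [x for x, y in samples if y * decision(w, b, x) <= 1.05]

print("weights:", w, "bias:", b)
print("support vectors:", support_vectors)
print("pixel (0.85, 0.75):", "trees" if predict((0.85, 0.75)) == 1 else "grass")
```

Only the samples nearest the opposing class end up in `support_vectors`. Removing any of the other samples and retraining should leave the boundary essentially unchanged.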
Now, the results of this supervised classification aren’t perfect, and algorithms still have a lot more learning to do. We still need work on features like roads, wetlands and buildings. As algorithms get more training data, they should improve and eventually be able to classify imagery from anywhere.
Prediction Using Empirical Bayesian Kriging (EBK):
As you may know, kriging interpolation predicts unknown values based on the spatial pattern of the known ones. It estimates weights from the semi-variogram, and the quality of the estimated surface reflects the quality of those weights. More specifically, you want weights that give an unbiased prediction with the smallest variance.
Unlike traditional kriging, which fits one model to the entire data set, EBK simulates many local models (on the order of a hundred) by sub-setting the data. Because the model can morph itself locally to fit each individual semi-variogram, it overcomes the challenge of stationarity. EBK predicts over and over again using these simulations, and each semi-variogram differs from the others. In the end, it mixes all of the semi-variograms to produce a final surface. You can’t customize the model as you can with traditional kriging. Instead, EBK outputs what it estimates is the best solution.
This works like a Monte Carlo analysis that runs repeatedly for you in the background. With a random process, you let the process run a thousand times or more, look at the trends in the resulting data and use them to justify your selection. This is why EBK almost always predicts better than straight kriging.
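The weighting step that all kriging variants share can be sketched as follows. This is ordinary kriging with a single model, not EBK itself, and it rests on assumptions: the sample coordinates and values are made up, and the exponential semi-variogram with its `sill` and `range_` parameters is simply chosen rather than fitted to data. EBK would repeat a step like this over many locally simulated semi-variograms and combine the results.

```python
import math


def semivariogram(h, sill=1.0, range_=3.0):
    """Exponential semi-variogram model (assumed here, not fitted)."""
    return sill * (1.0 - math.exp(-h / range_))


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (prediction, kriging variance, weights) at the target location."""
    n = len(points)
    # Semi-variances between samples, plus a Lagrange row and column that
    # force the weights to sum to 1 (the unbiasedness condition).
    a = [[semivariogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    rhs = [semivariogram(distance(p, target)) for p in points] + [1.0]
    solution = solve(a, rhs)
    weights, lagrange = solution[:n], solution[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, rhs[:n])) + lagrange
    return prediction, variance, weights


# Hypothetical sample locations and measured values.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 1.0)]
values = [10.0, 12.0, 11.0, 15.0, 14.0]

prediction, variance, weights = ordinary_kriging(points, values, (1.0, 1.0))
print("prediction:", prediction)
print("kriging variance:", variance)
print("weights:", weights, "sum:", sum(weights))
```

The weights sum to 1, which is what makes the prediction unbiased, and solving the system is what minimises the prediction variance for the chosen semi-variogram. A different semi-variogram gives different weights, which is why the quality of the surface depends on the quality of that model.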
Image Segmentation and Clustering with K-means:
The K-means algorithm is by far one of the most popular methods of clustering data. K-means segmentation groups unlabeled data into K groups, where K is a number you choose in advance. This unsupervised learning approach iteratively assigns each data point to one of the K groups based on similarity of features. For example, similarity can be based on spectral characteristics and location. In an unsupervised classification, the K-means algorithm first segments the image for further analysis. Next, each cluster is assigned a land cover class.
However, GIS can use clustering in other ways. For example, data points could represent crime incidents, and you may want to cluster them into high-crime and low-crime areas. Alternatively, you may want to segment based on socioeconomic, health or environmental (like pollution) characteristics.
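The assign-then-update loop behind K-means can be sketched in a few lines. This is a bare-bones version under stated assumptions: the pixel values are made up, the first K points seed the centroids (real tools usually pick seeds more carefully), and the class names attached to each cluster at the end are hypothetical labels an analyst would supply.

```python
# Hypothetical pixels, each described by two made-up band values.
pixels = [
    (0.10, 0.80), (0.90, 0.20), (0.15, 0.75),
    (0.85, 0.25), (0.12, 0.82), (0.88, 0.18),
]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Basic K-means loop; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(pixels, k=2)

# Naming the clusters is a separate, manual step (hypothetical class names).
cluster_names = {0: "vegetation", 1: "bare ground"}
land_cover = [cluster_names[label] for label in labels]

print(labels)
print(land_cover)
```

The algorithm itself only returns cluster numbers. Deciding that cluster 0 means vegetation is the analyst's call, which is the "assign a land cover class" step described above. The same loop applies to crime points or socioeconomic attributes if you swap in different features.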